.. Copyright (c) 2019 AT&T Intellectual Property.
.. Copyright (c) 2019 Nokia.

.. Licensed under the Creative Commons Attribution 4.0 International
.. Public License (the "License"); you may not use this file except
.. in compliance with the License. You may obtain a copy of the License at

.. https://creativecommons.org/licenses/by/4.0/

.. Unless required by applicable law or agreed to in writing, documentation
.. distributed under the License is distributed on an "AS IS" BASIS,
.. WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

.. See the License for the specific language governing permissions and
.. limitations under the License.

The RIC alarm system consists of three components: the Alarm Manager, the Alarm Library (application library) and the
Command Line Interface (CLI).

The Alarm Manager is responsible for managing alarm situations in the RIC cluster and for interfacing with northbound
applications such as Prometheus Alert Manager to post the alarms as alerts. Alert Manager takes care of de-duplicating,
silencing and inhibiting (suppressing) alerts, and of routing them to the VES-Agent, which in turn converts the alerts to
faults and sends them to ONAP as VES events.

The Alarm Library provides a simple interface for RIC applications (both platform applications and xApps) to raise and
clear alarms. The Alarm Library interacts with the Alarm Manager via the RMR interface.

.. image:: images/RIC_Alarm_System.png
   :alt: Place in RIC's software architecture picture

The Alarm Manager listens for alarms coming via the RMR and REST interfaces. An application can raise or clear alarms via
either interface. The Alarm Manager also listens for commands coming from the CLI (Command Line Interface). In addition,
the Alarm Manager supports a few other commands that can be given through the interfaces: list active alarms, list alarm
history, add new alarm definitions, delete an existing alarm definition, re-raise alarms and clear all alarms.
Applications do not typically use these commands while running. The Alarm Manager itself re-raises alarms periodically to
keep them in the active state. The other commands can be used by an operator through the CLI, or when an application is
starting up or restarting.

The maximum number of active alarms and the size of the alarm history are configurable. By default, the maximum number of
active alarms is 5000 and the maximum number of alarms in the alarm history is 20,000.

Alarm definitions can be updated dynamically via the REST interface. Default definitions are read from a JSON configuration file when FM

The Alarm Library provides a simple interface for RIC applications (both platform applications and xApps) to raise and
clear alarms. A new alarm instance is created with the InitAlarm() function. The ManagedObject (mo) and Application (ap)
identities are given as parameters for the alarm context/object.

The Alarm object contains the following parameters:

- \* SpecificProblem: The problem that is the cause of the alarm
- PerceivedSeverity: The severity of the alarm, see below for possible values
- \* ManagedObjectId: The name of the managed object that is the cause of the fault
- \* ApplicationId: The name of the process that raised the alarm
- AdditionalInfo: Additional information given by the application
- \* IdentifyingInfo: Identifying additional information, which is part of the alarm identity

Items marked with \*, i.e. ManagedObjectId (mo), SpecificProblem (sp), ApplicationId (ap) and IdentifyingInfo, make up
the identity of the alarm. All parameters must conform to the alarm definition: all mandatory parameters must be present,
and each parameter must have the correct value type or come from its predefined range. To address the same alarm instance
in a Clear() or Reraise() call, make sure that all four identity values are the same as in the original Raise() or
Reraise() call.

The Alarm Manager does not allow the "same alarm" to be raised more than once unless the alarm is cleared first. The Alarm
Manager compares the ManagedObjectId (mo), SpecificProblem (sp), ApplicationId (ap) and IdentifyingInfo parameters to
detect a possible duplicate. If the values are the same, the new alarm is suppressed. If an application raises the
"same alarm" with a changed PerceivedSeverity, the Alarm Manager deletes the old alarm and creates a new alarm according
to the new information.

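The identity and duplicate rules described above can be illustrated with a small, self-contained Go sketch. It is not the Alarm Manager's actual implementation: the ``AlarmID``, ``Alarm`` and ``activeAlarms`` types and the ``raise`` and ``clear`` methods are hypothetical names chosen for this illustration, and the in-memory map is an assumption about storage.

```go
package main

import "fmt"

// AlarmID holds the four fields that make up the alarm identity.
// Hypothetical type, used only for this illustration.
type AlarmID struct {
	ManagedObjectId string
	SpecificProblem int
	ApplicationId   string
	IdentifyingInfo string
}

// Alarm pairs an identity with its non-identifying attributes.
type Alarm struct {
	AlarmID
	PerceivedSeverity string
	AdditionalInfo    string
}

// activeAlarms is an assumed in-memory table keyed by alarm identity.
type activeAlarms map[AlarmID]Alarm

// raise applies the duplicate rules: a new identity is stored, an identical
// identity with the same severity is suppressed, and an identical identity
// with a different severity replaces the old alarm.
func (t activeAlarms) raise(a Alarm) string {
	old, found := t[a.AlarmID]
	switch {
	case !found:
		t[a.AlarmID] = a
		return "raised"
	case old.PerceivedSeverity == a.PerceivedSeverity:
		return "suppressed"
	default:
		t[a.AlarmID] = a
		return "replaced"
	}
}

// clear removes the alarm with the given identity and reports whether it was active.
func (t activeAlarms) clear(id AlarmID) bool {
	if _, found := t[id]; !found {
		return false
	}
	delete(t, id)
	return true
}

func main() {
	table := activeAlarms{}
	id := AlarmID{"RIC", 8007, "UEEC", "INFO-1"}

	fmt.Println(table.raise(Alarm{id, "MAJOR", "-"}))    // first raise
	fmt.Println(table.raise(Alarm{id, "MAJOR", "-"}))    // same identity, same severity
	fmt.Println(table.raise(Alarm{id, "CRITICAL", "-"})) // same identity, new severity
	fmt.Println(table.clear(id))                         // all four identity values match
}
```

Because the struct contains only comparable fields, it can be used directly as a map key, which keeps the "all four values must match" rule in one place.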
- Raise: Raises the alarm instance given as a parameter
- Clear: Clears the alarm instance given as a parameter, if the alarm is active
- Reraise: Attempts to re-raise the alarm instance given as a parameter
- ClearAll: Clears all alarms matching the moId and appId given as parameters (not supported yet)

Command line interface
----------------------

Through the CLI an operator can do the following operations:

- Check active alarms
- Check alarm history
- Configure maximum active alarms and maximum alarms in alarm history
- Add new alarm definitions that can be raised
- Delete an existing alarm definition that can be raised

CLI commands need to be given inside the Alarm Manager pod. To get there, first print the name of the Alarm Manager pod::

    kubectl get pods -A | grep alarmmanager

The output should look something like this::

    ricplt deployment-ricplt-alarmmanager-6cc8764749-gnwjh 1/1 Running 0 15d

Then give this command to enter the pod. Replace the pod name with the actual name from the printout, and the namespace
with the one shown in the first column::

    kubectl exec -it -n ricplt deployment-ricplt-alarmmanager-6cc8764749-gnwjh -- bash

CLI commands can have some of the following parameters:

- \--moid ManagedObjectId, example string: RIC
- \--apid ApplicationId string, example string: UEEC
- \--sp SpecificProblem, example value: 8007
- \--severity Severity of the alarm, possible values: UNSPECIFIED, CRITICAL, MAJOR, MINOR, WARNING, CLEARED or DEFAULT
- \--iinfo Identifying info, a user specified string, example string: INFO-1
- \--mal Maximum number of active alarms, example value: 1000
- \--mah Maximum number of alarms in alarm history, example value: 2000
- \--aid Alarm id, example value: 8007
- \--atx Alarm text string, example string: E2 CONNECTIVITY LOST TO E-NODEB
- \--ety Event type string, example string: Communication error
- \--oin Operation instructions string, example string: Not defined
- \--prf Performance profile id, possible values: 1 = peak performance test or 2 = endurance test
- \--nal Number of alarms, example value: 50
- \--aps Alarms per second, example value: 1
- \--tim Total time of test in minutes, example value: 1
- \--host Alarm Manager REST address, default value: localhost
- \--port Alarm Manager REST port, default value: 8080
- \--if Alarm Manager command interface to use, http or rmr, default value: http

Note that there are two minus signs before each parameter name.

If a parameter value contains any white space, it must be enclosed in quotation marks, like: "INFO 1"

CLI command examples:

The following commands are given from the top-level directory.

Check active alarms::

    Syntax: cli/alarm-cli active [--host] [--port]

    Example: cli/alarm-cli active

    Example: cli/alarm-cli active --host localhost --port 8080

Check alarm history::

    Syntax: cli/alarm-cli history [--host] [--port]

    Example: cli/alarm-cli history

    Example: cli/alarm-cli history --host localhost --port 8080

Raise an alarm::

    Syntax: cli/alarm-cli raise --moid --apid --sp --severity --iinfo [--host] [--port] [--if]

    Example: cli/alarm-cli raise --moid RIC --apid UEEC --sp 8007 --severity CRITICAL --iinfo INFO-1

The following example is meant only for testing and verification purposes::

    Example: cli/alarm-cli raise --moid RIC --apid UEEC --sp 8007 --severity CRITICAL --iinfo INFO-1 --host localhost --port 8080 --if rmr

Clear an alarm::

    Syntax: cli/alarm-cli clear --moid --apid --sp --severity --iinfo [--host] [--port] [--if]

    Example: cli/alarm-cli clear --moid RIC --apid UEEC --sp 8007 --iinfo INFO-1

    Example: cli/alarm-cli clear --moid RIC --apid UEEC --sp 8007 --iinfo INFO-1 --host localhost --port 8080 --if rmr

Configure maximum active alarms and maximum alarms in alarm history::

    Syntax: cli/alarm-cli configure --mal --mah [--host] [--port]

    Example: cli/alarm-cli configure --mal 1000 --mah 5000

    Example: cli/alarm-cli configure --mal 1000 --mah 5000 --host localhost --port 8080

Add new alarm definition::

    Syntax: cli/alarm-cli define --aid --atx --ety --oin [--host] [--port]

    Example: cli/alarm-cli define --aid 8007 --atx "E2 CONNECTIVITY LOST TO E-NODEB" --ety "Communication error" --oin "Not defined"

    Example: cli/alarm-cli define --aid 8007 --atx "E2 CONNECTIVITY LOST TO E-NODEB" --ety "Communication error" --oin "Not defined" --host localhost --port 8080

Delete existing alarm definition::

    Syntax: cli/alarm-cli undefine --aid [--host] [--port]

    Example: cli/alarm-cli undefine --aid 8007

    Example: cli/alarm-cli undefine --aid 8007 --host localhost --port 8080

Conduct performance test:

Note that this is meant only for testing and verification purposes.

Before any performance test command can be issued, an environment variable needs to be set. The variable tells where
the test alarm object file is stored::

    export PERF_OBJ_FILE=cli/perf-alarm-object.json

::

    Syntax: cli/alarm-cli perf --prf --nal --aps --tim [--host] [--port] [--if]

    Peak performance test example: cli/alarm-cli perf --prf 1 --nal 50 --aps 1 --tim 1 --if rmr

    Peak performance test example: cli/alarm-cli perf --prf 1 --nal 50 --aps 1 --tim 1 --if http

    Peak performance test example: cli/alarm-cli perf --prf 1 --nal 50 --aps 1 --tim 1 --host localhost --port 8080 --if rmr

    Endurance test example: cli/alarm-cli perf --prf 2 --nal 50 --aps 1 --tim 1 --if rmr

    Endurance test example: cli/alarm-cli perf --prf 2 --nal 50 --aps 1 --tim 1 --if http

    Endurance test example: cli/alarm-cli perf --prf 2 --nal 50 --aps 1 --tim 1 --host localhost --port 8080 --if rmr

REST interface usage guide
--------------------------

The REST interface offers all the services that are available via the CLI, plus some more. The CLI itself uses the REST
interface to implement the services it offers.

Below are examples for the REST interface. The curl tool is used to send the REST commands.

Get active alarms::

    Example: curl -X GET "http://localhost:8080/ric/v1/alarms/active" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

Get alarm history::

    Example: curl -X GET "http://localhost:8080/ric/v1/alarms/history" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

Raise an alarm::

    Example: curl -X POST "http://localhost:8080/ric/v1/alarms" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"managedObjectId\": \"RIC\", \"applicationId\": \"UEEC\", \"specificProblem\": 8007, \"perceivedSeverity\": \"CRITICAL\", \"additionalInfo\": \"-\", \"identifyingInfo\": \"INFO-1\", \"AlarmAction\": \"RAISE\", \"AlarmTime\": 0}"

Clear an alarm::

    Example: curl -X DELETE "http://localhost:8080/ric/v1/alarms" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"managedObjectId\": \"RIC\", \"applicationId\": \"UEEC\", \"specificProblem\": 8007, \"perceivedSeverity\": \"\", \"additionalInfo\": \"-\", \"identifyingInfo\": \"INFO-1\", \"AlarmAction\": \"CLEAR\", \"AlarmTime\": 0}"

Get configuration of maximum active alarms and maximum alarms in alarm history::

    Example: curl -X GET "http://localhost:8080/ric/v1/alarms/config" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

Configure maximum active alarms and maximum alarms in alarm history::

    Example: curl -X POST "http://localhost:8080/ric/v1/alarms/config" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"maxactivealarms\": 1000, \"maxalarmhistory\": 5000}"

Get all alarm definitions::

    Example: curl -X GET "http://localhost:8080/ric/v1/alarms/define" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

Get an alarm definition::

    Syntax: curl -X GET "http://localhost:8080/ric/v1/alarms/define/{alarmId}" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

    Example: curl -X GET "http://localhost:8080/ric/v1/alarms/define/8007" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

Add one new alarm definition::

    Example: curl -X POST "http://localhost:8080/ric/v1/alarms/define" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"alarmdefinitions\": [{\"alarmId\": 8007, \"alarmText\": \"E2 CONNECTIVITY LOST TO E-NODEB\", \"eventtype\": \"Communication error\", \"operationinstructions\": \"Not defined\"}]}"

Add two new alarm definitions::

    Example: curl -X POST "http://localhost:8080/ric/v1/alarms/define" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"alarmdefinitions\": [{\"alarmId\": 8007, \"alarmText\": \"E2 CONNECTIVITY LOST TO E-NODEB\", \"eventtype\": \"Communication error\", \"operationinstructions\": \"Not defined\"},{\"alarmId\": 8008, \"alarmText\": \"ACTIVE ALARM EXCEED MAX THRESHOLD\", \"eventtype\": \"storage warning\", \"operationinstructions\": \"Clear alarms or raise threshold\"}]}"

Delete one existing alarm definition::

    Syntax: curl -X DELETE "http://localhost:8080/ric/v1/alarms/define/{alarmId}" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

    Example: curl -X DELETE "http://localhost:8080/ric/v1/alarms/define/8007" -H "accept: application/json" -H "Content-Type: application/json" -d "{}"

RMR interface usage guide
-------------------------
Through the RMR interface an application can only raise and clear alarms. The RMR message payload is a JSON message similar to the one used in the REST interface use cases above.

Supported events via RMR interface

- Raise alarm
- Clear alarm
- ClearAll alarms (not supported yet)

Example on how to use the API from Golang code
----------------------------------------------
Alarm library functions can be used directly from Golang code. Raising and clearing alarms goes via the RMR interface from the Alarm Library to the Alarm Manager.

.. code-block:: go

    package main

    import (
        "log"

        alarm "gerrit.o-ran-sc.org/r/ric-plt/alarm-go/alarm"
    )

    func main() {
        // Initialize the alarm component
        alarmer, err := alarm.InitAlarm("my-pod", "my-app")
        if err != nil {
            log.Fatalf("alarm initialization failed: %v", err)
        }

        // Create a new Alarm object (SP=8004, etc)
        a := alarmer.NewAlarm(8004, alarm.SeverityMajor, "NetworkDown", "eth0")

        // Raise an alarm (SP=8004, etc)
        if err = alarmer.Raise(a); err != nil {
            log.Printf("raise failed: %v", err)
        }

        // Clear an alarm (SP=8004)
        if err = alarmer.Clear(a); err != nil {
            log.Printf("clear failed: %v", err)
        }

        // Re-raise an alarm (SP=8004)
        if err = alarmer.Reraise(a); err != nil {
            log.Printf("re-raise failed: %v", err)
        }

        // Clear all alarms raised by the application - (not supported yet)
        if err = alarmer.ClearAll(); err != nil {
            log.Printf("clear all failed: %v", err)
        }
    }

::

    INFO[2020-06-08T07:50:10Z]
        "commonEventHeader": {
            "eventId": "fault0000000001",
            "eventName": "Fault_ricp_E2 CONNECTIVITY LOST TO G-NODEB",
            "lastEpochMicrosec": 1591602610944553,
            "nfNamingCode": "ricp",
            "priority": "Medium",
            "reportingEntityId": "035EEB88-7BA2-4C23-A349-3B6696F0E2C4",
            "reportingEntityName": "Vespa",
            "startEpochMicrosec": 1591602610944553,
            "alarmCondition": "E2 CONNECTIVITY LOST TO G-NODEB",
            "eventSeverity": "MAJOR",
            "eventSourceType": "virtualMachine",
            "faultFieldsVersion": 2,
            "specificProblem": "eth12",
    INFO[2020-06-08T07:50:10Z] Schema validation succeeded