Data model and lifecycle for alerts #1857

Open
mairas opened this issue Dec 29, 2024 · 3 comments

Comments

@mairas
Contributor

mairas commented Dec 29, 2024

Signal K lacks functionality for alarm/alert management. The version 1 specification discusses alarm, alert and notification handling, but the treatment is cursory and not sufficient for implementation. PR #1560 has an ongoing discussion of a v2 Notifications API, but the PR doesn't address the conceptual differences between notifications and alerts.

I wrote a short paper that compares two IMO marine alert standards and an IEC process industry alarm standard, and gives my own opinions on a suitable data model and lifecycle for Signal K alarms/alerts.

Data Model and Lifecycle for Signal K Alerts: Comparison of Related Standards

Introduction

The existing SK notification and alarm models are not well defined and warrant an overhaul. The alarm model, semantics, lifecycle and best practices need to be clearly defined, together with proper alarm data models and interfaces. I am not addressing alarm creation logic in this writeup.

I have been looking at the relevant process industry and maritime standards to learn from the experts. IMO has published several relevant resolutions, including MSC.302(87) "Adoption of Performance Standards for Bridge Alert Management" and A.1021(26) "Code on Alerts and Indicators". The IMO resolutions are freely available.

The key process industry standard is IEC 62682:2023 "Management of alarm systems for the process industries". It needs to be separately purchased; you can ask me (Matti Airas) for excerpts, summaries and more.

MSC.302(87) and A.1021(26)

MSC.302(87) defines different alert priorities, alert states and their presentation, as well as human-machine interface aspects. A.1021(26) defines a categorization for different alarm types as well as their presentation.

Alerts are prioritized into:

  • Emergency alarms (EA): immediate danger to human life or to the ship and its machinery; immediate action must be taken
  • Alarms (A): conditions requiring immediate attention and action to avoid any kind of hazardous situation and to maintain safe operation
  • Warnings (W): conditions or situations requiring immediate action for precautionary reasons; conditions which are not immediately hazardous but may become so
  • Cautions (C): conditions requiring attention

Alarms are further categorized into categories A, B and C, depending on whether they require additional information to rectify or if they can't be acknowledged at the bridge.

EA, A and W priorities must be acknowledged. Alerts can also be temporarily silenced without acknowledging them. Additionally, different audible or visual indications are defined for different priorities and alert types. For example:

  • EA: Indicated primarily by an audible signal; in noisy areas, supplemented by indication lights.
  • A: Audible alarm signal and visual announcement. Can be temporarily silenced for 30 s. Unacknowledged alarms should have a flashing indication, acknowledged alarms should have a steady visual indication.
  • W: Momentary audible signal accompanied by a visual warning announcement.
  • C: Steady visual indication.
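As a rough illustration, the priorities and their presentation rules above could be captured in a small lookup table. This is only a sketch: the names `AlertPriority`, `Presentation` and `PRESENTATION` are made up for this example, and the strings paraphrase the summary above rather than the resolutions themselves.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class AlertPriority(IntEnum):
    """The four IMO priorities; a higher value means more urgent (ordering is an assumption)."""
    CAUTION = 1
    WARNING = 2
    ALARM = 3
    EMERGENCY_ALARM = 4


@dataclass(frozen=True)
class Presentation:
    requires_ack: bool       # EA, A and W must be acknowledged
    audible: Optional[str]   # None means no audible indication
    visual: str


# Hypothetical table paraphrasing the indication rules summarized above.
PRESENTATION = {
    AlertPriority.EMERGENCY_ALARM: Presentation(
        True, "audible signal", "supplementary indication lights in noisy areas"),
    AlertPriority.ALARM: Presentation(
        True, "audible alarm signal", "flashing until acknowledged, then steady"),
    AlertPriority.WARNING: Presentation(
        True, "momentary audible signal", "visual warning announcement"),
    AlertPriority.CAUTION: Presentation(
        False, None, "steady visual indication"),
}


def must_acknowledge(priority: AlertPriority) -> bool:
    """Return True if alerts of this priority require operator acknowledgment."""
    return PRESENTATION[priority].requires_ack
```

Using an `IntEnum` means a list of active alerts can be sorted by urgency with a plain `sorted(..., reverse=True)`.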

Alerts can further be escalated: unacknowledged warnings should be escalated to alarms after a defined time interval.
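The escalation rule could look roughly like the following. The `Alert` class and `escalate_if_due` function are hypothetical, and the 60 s delay is a placeholder assumption, not a value taken from MSC.302(87).

```python
from dataclasses import dataclass

# Placeholder value; the actual interval would have to be defined by the spec.
ESCALATION_DELAY_S = 60.0


@dataclass
class Alert:
    priority: str             # "warning" or "alarm" in this sketch
    raised_at: float          # timestamp in seconds
    acknowledged: bool = False


def escalate_if_due(alert: Alert, now: float,
                    delay_s: float = ESCALATION_DELAY_S) -> bool:
    """Promote an unacknowledged warning to an alarm once delay_s has elapsed.

    Returns True if the alert was escalated by this call.
    """
    overdue = now - alert.raised_at >= delay_s
    if alert.priority == "warning" and not alert.acknowledged and overdue:
        alert.priority = "alarm"
        return True
    return False
```

Passing `now` explicitly instead of reading the clock inside the function keeps the rule easy to test.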

MSC.302(87) also provides requirements for the human-machine interfaces for Central Alert Management functionality.

IEC 62682:2023 Management of alarm systems for the process industries

This standard defines terminology and models for alarm systems as well as recommended work processes to maintain the alarm throughout its life cycle.

The following state model is defined for all alarms:

[Figure: alarm state model from IEC 62682 (screenshot)]

Alarm priorities are not explicitly defined but low-medium-high are mentioned in some example and recommendation tables.

Furthermore, alarm indication recommendations are given as follows:

[Figure: alarm indication recommendations from IEC 62682 (screenshot)]

Other content of note includes detailed human-machine interface design requirements for alarm management interfaces in Section 11, including information, functional, display and records requirements.

Comparisons

Terminology

IEC 62682 has very specific terminology definitions. At the very top level, an alarm is an "audible and/or visible means of indicating to the operator an equipment malfunction, process deviation, or abnormal condition requiring a timely response", while an operator alert is an "audible and/or visible means of indicating to the operator an equipment or process condition for evaluation when time allows which could result in a response". In both cases the term refers to the actual sound or visual indication. In contrast, the IMO resolutions use an umbrella definition of alert, which is subdivided into different priorities.

State Models

The alarm state models used by IEC 62682 and the IMO resolutions are largely compatible. The IMO resolutions do not define or discuss the returned-to-normal unacknowledged state, nor do they allow "shelving", that is, temporarily dismissing alerts.

Alarm Priorities and Alarm State Indications

The IMO resolutions provide clearly defined alarm priorities that each have unique state transitions and requirements. Both standard sets provide indication requirements which are more or less compatible with each other, except that in the IMO resolutions the priorities have more effect on the type and style of indication.

Alarm Acknowledgments and Resets

Acknowledging is an operator action that confirms recognition of an alarm. If the condition triggering the alarm is still ongoing -- for example, if the engine oil pressure remains low -- then the alarm won't be dismissed by acknowledging; the alarm indicators are merely suppressed.

IEC 62682 defines latching alarms, that is, alarms that remain active even when the initial condition has returned to normal. These alarms may have a separate reset functionality to remove the alarm condition.

Neither standard set allows acknowledging multiple alarms at once; they should be addressed individually. However, alarm aggregation is allowed: multiple alarms may be combined into one aggregate alarm that can be addressed as one unit.

Alarm Silencing

Both standard sets support the concept of alarm silencing: there should be an interface to silence the alarms, meaning that any audible indicators are suppressed. It should be possible to silence all alarms at once.

The IMO resolutions further state that, depending on the alert priority, silencing should be temporary. For the alarm priority, for example, the alarm should be automatically unsilenced after 30 s.
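Temporary silencing can be sketched as a timer that expires on its own. The `Silencer` class below is a hypothetical helper; only the 30 s default for the alarm priority comes from the text above.

```python
from typing import Optional

ALARM_SILENCE_S = 30.0  # silence duration for the alarm priority, per the summary above


class Silencer:
    """Tracks a temporary silence window for one alert (illustrative only)."""

    def __init__(self, duration_s: float = ALARM_SILENCE_S) -> None:
        self.duration_s = duration_s
        self._silenced_until: Optional[float] = None

    def silence(self, now: float) -> None:
        """Suppress the audible indication for duration_s seconds from now."""
        self._silenced_until = now + self.duration_s

    def is_silenced(self, now: float) -> bool:
        """Return True while the silence window is still open."""
        return self._silenced_until is not None and now < self._silenced_until
```

Silencing does not touch the acknowledgment state, so an alert that is still unacknowledged becomes audible again once the window closes.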

Discussion and Applicability to Signal K

Having summarily read both IEC 62682 and the two IMO resolutions, I believe the domain similarity of the IMO resolutions makes them mostly applicable to Signal K. In particular, I would use their alert prioritization and terminology as is. However, given that IEC 62682 provides more extensive terminology definitions, I would adopt that terminology whenever it is compatible with the IMO one.

Notifications or information messages that are temporary in nature and for which no acknowledgments are expected could be handled by the same model as an added priority level, or they could be implemented separately.

The IEC 62682 state model is a useful representation of allowable alarm states. In my opinion, states E, F and G (Shelving, Suppressed by Design and Out of Service) can be omitted from the Signal K model. Shelving is an operation relevant to more complex systems with a large number of alarms, and the other two states are better handled by the system creating the alarms. Return to normal unacknowledged, however, is a state that I believe could be potentially useful.
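A reduced state model with only the four retained states might be sketched as a transition table. The state and event names are invented for this example, and the transition set is my reading of the reduced model, not something copied from IEC 62682.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    UNACKED = "unacknowledged"
    ACKED = "acknowledged"
    RTN_UNACKED = "returned to normal, unacknowledged"


class Event(Enum):
    CONDITION_ACTIVE = "condition active"
    CONDITION_CLEARED = "condition cleared"
    ACKNOWLEDGE = "acknowledge"


# Assumed transitions for the reduced model (no shelving, suppression or out-of-service).
TRANSITIONS = {
    (State.NORMAL, Event.CONDITION_ACTIVE): State.UNACKED,
    (State.UNACKED, Event.ACKNOWLEDGE): State.ACKED,
    (State.UNACKED, Event.CONDITION_CLEARED): State.RTN_UNACKED,
    (State.ACKED, Event.CONDITION_CLEARED): State.NORMAL,
    (State.RTN_UNACKED, Event.ACKNOWLEDGE): State.NORMAL,
    (State.RTN_UNACKED, Event.CONDITION_ACTIVE): State.UNACKED,
}


def next_state(state: State, event: Event) -> State:
    """Return the new state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in a table makes it easy to review the lifecycle in one place and to add states later if shelving turns out to be needed.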

Alarm acknowledgment, silencing and escalation could be modeled as described in the IMO resolutions. For one-shot alarm events that should stay active after a single trigger event, such as autopilot arrival warnings, latching should be implemented, but acknowledging a latched alarm should also reset it automatically.
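The proposed latching behavior, where acknowledging a latched alarm also resets it, could be sketched as follows. `LatchedAlert` is a hypothetical class used only to illustrate the idea.

```python
class LatchedAlert:
    """One-shot alert that stays active after its trigger event (illustrative only)."""

    def __init__(self) -> None:
        self.active = False

    def trigger(self) -> None:
        """A single trigger event, such as a waypoint arrival, activates the alert."""
        self.active = True

    def condition_cleared(self) -> None:
        """Latched: the alert stays active even though the condition is gone."""

    def acknowledge(self) -> None:
        """Acknowledging doubles as the reset, as proposed above."""
        self.active = False
```

With this design there is no separate reset operation for the operator to remember; a non-latching alarm would instead clear in `condition_cleared`.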

Data Model and Implementation

I have not discussed the data model or implementation of an alarm management system. That should be done separately if the general principles are agreed upon.

Furthermore, I have not addressed compatibility with existing SK notification models or other discussion here on Discord, on GitHub issues or on other related projects.

@mairas
Contributor Author

mairas commented Dec 29, 2024

Possible next steps:

  • Define a concrete alert data model to facilitate discussion. (Subject to change based on prototype implementation experiences)
  • Implement a Central Alert Manager plugin with APIs for alert raising, acknowledging, silencing and listing (based on different criteria)
  • Implement a PoC UI for alert management
  • Add the data model, lifecycle, APIs and HMI/UI recommendations to the v2 spec.
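To make the second step more concrete, an in-memory alert manager with the four operations listed above might look like this. Everything here is a hypothetical sketch (class, method and field names included); it is not an existing Signal K API.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, List, Optional


@dataclass
class Alert:
    id: int
    source: str      # who raised the alert, e.g. a plugin or sensor id
    priority: str    # e.g. "alarm", "warning", "caution"
    message: str
    acknowledged: bool = False
    silenced: bool = False


class AlertManager:
    """Minimal in-memory manager: raise, acknowledge, silence and list alerts."""

    def __init__(self) -> None:
        self._alerts: Dict[int, Alert] = {}
        self._ids = count(1)

    def raise_alert(self, source: str, priority: str, message: str) -> Alert:
        alert = Alert(next(self._ids), source, priority, message)
        self._alerts[alert.id] = alert
        return alert

    def acknowledge(self, alert_id: int) -> None:
        # Alerts are acknowledged one at a time, never in bulk.
        self._alerts[alert_id].acknowledged = True

    def silence(self, alert_id: int) -> None:
        self._alerts[alert_id].silenced = True

    def list_alerts(
        self, predicate: Optional[Callable[[Alert], bool]] = None
    ) -> List[Alert]:
        """List all alerts, or only those matching the given criterion."""
        return [a for a in self._alerts.values()
                if predicate is None or predicate(a)]
```

For example, `manager.list_alerts(lambda a: not a.acknowledged)` would return the alerts still waiting for an operator.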

@mairas
Contributor Author

mairas commented Dec 29, 2024

To be specified:

  • Signal K is a distributed system. Who is responsible for raising alerts? Should any device be able to do that?
  • Who manages the alerts once they are raised?
  • What happens if the source device goes offline while the alert is active?

@panaaj
Member

panaaj commented Jan 2, 2025

@mairas @tkurki some initial thoughts for comment...

  1. I think that using the term Alert moving forward would be a good start as it:
     • Avoids ambiguity with V1 terms notifications and alarms.
     • Aligns with the terminology in the documentation you reference and also other work in this space e.g. OpenBridge.
  2. We will want to clearly define the interfaces for each of the possible Alert sources to enact the required operation. There may be others but here are some:
     • client apps (http API)
     • plugins (ServerApi methods)
     • zone threshold transitions (?)
     • sensors (websocket requests)
  3. Clearly define the roles of Alerts and Notifications as well as their relationship and how that relationship is managed. Also as @mairas has already mentioned, the responsibilities for maintaining Alert state.

panaaj added a commit that referenced this issue Jan 4, 2025