Challenge
Energy and water supply systems safeguard the economic and social well-being of the population and are classified as critical infrastructures (KRITIS for short, from the German "Kritische Infrastrukturen") according to EU Directive 2008/114/EC. Protecting critical infrastructures from threats is therefore a basic requirement for quality of life and value creation in Germany.
Energy and water supply systems consist of distributed, interconnected subsystems that are largely controlled and monitored by automation components such as controllers, sensors and actuators, which are themselves distributed. The reliable operation of these systems depends directly on, among other things, functioning communication networks, so these networks are becoming the focus of measures to protect KRITIS. Today, network analysis systems can be used to monitor communication between components and detect security incidents. Such a security incident, for example the failure of a communication link, can disrupt the energy and water supply.
A security incident detected by the network analysis system is first evaluated by the KRITIS operator's service personnel, who then determine a course of action, e.g., "replace the failed communication system." Today, this process usually takes several hours: because the systems are distributed over a large area in cities and rural regions, the service personnel must first travel to the location of the affected system and, if necessary, obtain replacement parts.
Interpreting the warning messages and drawing conclusions about the cause is generally very difficult, especially for network problems, and requires a high level of IT expertise. As a result, such errors and threats usually overwhelm the operating personnel today, leading to long downtimes and a high threat level. In addition, IT structures in particular are susceptible to remote threats, since most malfunctions can be caused without physical access to the system.
Solution approach
The overall goal of the project is to research and develop an AI-based approach that automatically assesses events and generates recommended actions for the service personnel of KRITIS operators (see Figure 1). The approach comprises several components:
- Learning an event model from event reports (text documents are analyzed automatically and a mathematical representation is generated)
- Learning a network model from network events
- Correlating the event model and the network model to assess security incidents (e.g., by mapping a network address to the location of the physical system)
- Generation of recommendations for action for the service personnel
- Automatic configuration adjustment of network analysis systems
- Explainability of AI decisions
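
The correlation and recommendation steps listed above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the project's method: the learned event and network models are replaced by two hand-written lookup tables, and all names, addresses, locations and event kinds (`Asset`, `NetworkEvent`, `INVENTORY`, `ACTIONS`, `recommend`, `10.0.12.7`, `link_down`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Physical system behind a network address (hypothetical inventory entry)."""
    address: str
    location: str
    component: str


@dataclass(frozen=True)
class NetworkEvent:
    """Event reported by a network analysis system (hypothetical format)."""
    address: str
    kind: str  # e.g. "link_down"


# Assumption: stand-in for the learned network model, mapping address to asset.
INVENTORY = {
    "10.0.12.7": Asset("10.0.12.7", "Pumping station North", "communication module"),
}

# Assumption: stand-in for the learned event model, mapping event kind to action.
ACTIONS = {
    "link_down": "replace the failed communication system",
}


def recommend(event, inventory=INVENTORY, actions=ACTIONS):
    """Correlate a network event with the asset inventory and suggest an action."""
    asset = inventory.get(event.address)
    action = actions.get(event.kind)
    if asset is None or action is None:
        # Unknown address or event kind: hand over to a human instead of guessing.
        return f"Manual assessment needed for {event.kind} at {event.address}"
    return f"{asset.location}: {action} ({asset.component}, {event.address})"


if __name__ == "__main__":
    print(recommend(NetworkEvent("10.0.12.7", "link_down")))
    print(recommend(NetworkEvent("10.0.99.1", "link_down")))
```

In this sketch the returned message names the asset and address that led to the recommendation, which loosely mirrors the explainability goal. Events that cannot be matched fall back to manual assessment.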