Four steps to greater network resilience

As lifeline service providers, utilities must be resilient to a wide variety of threats, including so-called ‘black swan events’ – those for which there are no precedents. Jay Cadman of IQGeo looks at how to create robust risk management strategies in the face of uncertainty.

Utilities operators now face a fluctuating, fast-paced threat landscape, with incidents becoming more unpredictable and more serious in scope. This new operating environment poses a challenge to a network risk management model that has historically been more reactive than proactive. Businesses, institutions, and consumers alike depend on maximum network uptime, so utilities need a thorough resilience strategy in place to underpin reliable customer service.

For example, the increase in the frequency and severity of extreme weather events, as evidenced by recent forest fires, deep freezes and heat domes, often leaves grids facing multiple simultaneous meteorological hazards. Not only are extreme weather events increasingly unpredictable, but the decentralisation of renewable energy storage and generation also means that grids have a wider array of geographical vulnerabilities. Cyber security threats are also growing in scale and significance in the era of ‘smart grids.’ This new risk profile reflects the critical interdependencies between an energy grid’s physical and virtual infrastructure. Utilities are faced with a widening attack surface caused by the increasingly complex and expanding ‘Internet of Things’, the shift to decentralised systems, and the rise of ransomware attacks.

A new risk management model

This fast-changing and diverse threat landscape requires utilities to go beyond reactive disaster recovery tactics and instead adopt proactive network resilience strategies. Today, risk management strategies must prepare for threats common to a network but also account for ‘black swan events’ – unexpected incidents for which there may be no historical precedent. Resilience rests on the ability to capture, curate, and integrate data from every part of the network and the field workforce to provide a holistic and accurate view of the network so the potential risks can be properly assessed.

Utilities can use historic data to inform resilience plans by identifying the hazards most common to their service area and institutionalising insights from previous similar events. Live data is also essential to provide a real-time ‘risk picture’, which can be used to identify new trends. For example, some operators are using geospatial data to create a location-based outage dashboard to track the site and source of power outages in real-time. In doing so, they are also helping to ensure efficient crisis management and promoting long-term customer satisfaction.
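As a rough illustration of mining historic data for the most common hazards, the Python sketch below ranks hazard types by how often they appear in past incident records, breaking ties by total outage hours. The record layout, the function name rank_hazards and the sample figures are assumptions made up for this example, not data from any real utility.

```python
from collections import Counter

def rank_hazards(incidents):
    """Rank hazard types by incident count, then by total outage hours.

    `incidents` is an iterable of (hazard_type, outage_hours) pairs.
    This layout is a hypothetical simplification of an incident log.
    """
    counts = Counter()
    hours = Counter()
    for hazard, outage_hours in incidents:
        counts[hazard] += 1
        hours[hazard] += outage_hours
    # Most frequent first; ties go to the hazard with more outage hours.
    return sorted(counts, key=lambda h: (-counts[h], -hours[h], h))

# Illustrative, invented records: (hazard type, outage duration in hours)
history = [
    ("wildfire", 12.0),
    ("ice storm", 30.0),
    ("wildfire", 4.0),
    ("heat dome", 6.0),
]
print(rank_hazards(history))
```

In practice the ranking would be only one input to a resilience plan, alongside live data and engineering judgement.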

The ‘data gap’

Many operators lack access to valuable data because their geospatial information systems are not well integrated with information from the network and workforce. Network data is rarely held in an accessible, digital, mobile-friendly format which would enable operators to draw on live intelligence from workers in the field. Crucially, many operators do not link all network data with location, which is critical to understanding the root of risks in all regions of their network.

Many operators are also not documenting their network in a digital System of Record (SoR), while the data that does exist is siloed and inaccessible to key stakeholders across the business. The absence of effective, integrated, and remotely available SoRs across many providers leaves risk managers in the dark about the full array of network vulnerabilities.

Below, I outline four steps organisations can take to develop a data-driven network resilience strategy.

Step one: Network risk assessment & resilience

Assess the existing network to identify and mitigate single points of failure and network weaknesses. Organisations must first digitise and decentralise network data sources to create a risk picture that draws on diverse, live data from the ground, and link this data to accurate location information. Companies should be drawing on live, local data from field workers’ mobile devices and remote sensors and overlaying this onto geospatial network data to reveal the locations and sources of hazards. This should be used to inform a real-time, network-wide risk picture.
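One way to make ‘single points of failure’ concrete is to treat the network as a graph and look for nodes whose loss would split it in two (in graph terms, articulation points). The Python sketch below is a minimal illustration under that assumption; the function name, the node names and the tiny example network are hypothetical, and a real assessment would also need to consider capacity, switching states and asset condition.

```python
from collections import defaultdict

def single_points_of_failure(links):
    """Return nodes whose removal disconnects the network (articulation points).

    `links` is an iterable of (node_a, node_b) pairs, treated as undirected.
    Uses a recursive depth-first search, so it suits small illustrative graphs.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    discovered = {}  # node -> order in which the search first reached it
    low = {}         # node -> earliest discovery order reachable from its subtree
    critical = set()
    counter = 0

    def visit(node, parent):
        nonlocal counter
        discovered[node] = low[node] = counter
        counter += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovered:
                # Back edge: an alternative route to an earlier node.
                low[node] = min(low[node], discovered[neighbour])
            else:
                children += 1
                visit(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # No route from the neighbour's subtree bypasses this node.
                if parent is not None and low[neighbour] >= discovered[node]:
                    critical.add(node)
        # The search root is critical only if it joins separate subtrees.
        if parent is None and children > 1:
            critical.add(node)

    for node in list(graph):
        if node not in discovered:
            visit(node, None)
    return critical

# Hypothetical network: a ring of three sites feeding a radial spur.
example_links = [
    ("plant", "sub_a"), ("sub_a", "sub_b"), ("sub_b", "plant"),
    ("sub_b", "sub_c"), ("sub_c", "feeder_1"),
]
print(sorted(single_points_of_failure(example_links)))
```

In this made-up network the ring gives ‘plant’ and ‘sub_a’ an alternative route, while the radial spur beyond ‘sub_b’ has none, which is the kind of weakness the assessment is meant to surface.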

Step two: System resilience & security

Deploy critical network software in a cloud environment that supports state-of-the-art security, system redundancy, and geographic resilience. Whether organisations adopt on-premises or third-party cloud environments, it is vital that they ensure network data is encrypted and duplicated so there is no single point of IT infrastructure failure.

Step three: Incident response

Develop an enterprise-wide damage assessment and incident response strategy for office and field teams. Organisations should create a comprehensive overview of incident impact across the network. For example, when Typhoon Faxai hit Japan, power giant TEPCO was able to overlay live geospatial information on blackout locations onto Google Maps data to help engineers quickly identify the sites of damage or hazards. Kansai Electric Power is similarly developing a location-based ‘outage dashboard’ which provides full-spectrum visibility of the position and condition of all damaged assets. This kind of data can be used not only to monitor and maintain networks but to model future threats and develop proactive mitigation strategies.
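To give a sense of what sits behind a location-based outage dashboard, the Python sketch below groups field outage reports into map grid cells and totals the customers affected in each, so that the worst-hit areas surface first. It is not a description of how TEPCO or Kansai Electric Power built their systems; the OutageReport fields, the cell size and the sample coordinates are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class OutageReport:
    # Hypothetical fields a field crew or sensor might submit.
    asset_id: str
    lat: float
    lon: float
    customers_affected: int

def summarise_by_cell(reports, cell_size_deg=0.1):
    """Group reports into square lat/lon cells and total their impact.

    Returns a list of (cell, damaged_asset_count, customers_affected),
    sorted with the most heavily affected cell first.
    """
    assets = defaultdict(set)
    customers = defaultdict(int)
    for report in reports:
        cell = (
            math.floor(report.lat / cell_size_deg),
            math.floor(report.lon / cell_size_deg),
        )
        assets[cell].add(report.asset_id)
        customers[cell] += report.customers_affected
    summary = [(cell, len(assets[cell]), customers[cell]) for cell in assets]
    return sorted(summary, key=lambda row: -row[2])

# Invented sample reports, not real incident data.
reports = [
    OutageReport("pole-17", 35.63, 140.12, 1200),
    OutageReport("tx-04", 35.68, 140.17, 300),
    OutageReport("pole-92", 35.31, 139.95, 500),
]
for cell, asset_count, total in summarise_by_cell(reports):
    print(cell, asset_count, total)
```

A production dashboard would plot these cells on a base map and refresh them as new reports arrive; the grouping step shown here is only the aggregation at its core.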

Step four: Practice

Test your IT systems and operational procedures with periodic drills to ensure your systems, teams and partners have the resources and training they need to respond to all incidents rapidly and efficiently. Just as companies undergo fire drills and penetration tests, they should also routinely test their physical and virtual infrastructure and workforce for network resilience. Field crews need to be trained to expect the unexpected, and network assets need to be adapted to mitigate ‘edge cases.’ This will ensure that best practice is baked into workforce behaviour and procedures.

Maintaining a reliable and accurate risk picture

As both physical and virtual threats continue to grow in scale, frequency, and unpredictability across the utilities sector, it is imperative that operators re-evaluate their approach to risk management in order to mitigate the dangers they pose and maximise network resilience. Fortunately, by proactively following the guidance set out above, providers can create and maintain a reliable and accurate risk picture to help inform and improve future resilience strategy, ultimately helping them to deliver critical, high-quality customer service.