In a global economy recently stressed by energy costs, Bord Gáis is committed to optimizing existing customer experiences as much as possible. Their broader goal is to navigate to a lower carbon future, in which new digital services will be key to business growth. A shorter term goal is to attract and retain customers with new user experiences. Controlling churn through effective customer support and increasingly reliable services is a key to both goals.
Andy Nason and his team provide technology services for all parts of the Bord Gáis Energy business. As the Head of Service and Infrastructure Ireland, Andy’s primary mandate is to provide stable, reliable and secure production IT systems that support business-as-usual (BAU) operations.
Andy explains, “If we have a day where our call centers are having significant business disruption due to service issues, it just creates chaos for probably a week or two weeks.”
Handling zero-day security mandates had been particularly disruptive and difficult to manage. “About 4 years ago I got the dreaded call one Friday evening,” Andy relates. “On Sky News there was breaking news about Ransomware infections. We needed to take actions on our side and I remember being horrified that it resulted in us having to stand up a team of 30+ people to patch over 400 servers. And I think I spent the weekend in question—at least 20+ hours—on checkpoint calls.”
All plans to attend weekend events—which were many in a family with four young children active in different sports—had to be cancelled while Andy led the team through a marathon patching effort that had never been done before.
“We had calls with the Managing Director to explain what we’re doing. And he said to me: OK, that’s fine, but what does that mean for Monday? And my response was we didn’t know, because we had never patched everything over a weekend. We had no access to business testers to test.”
To transform operations, Andy engaged Kyndryl, a trusted partner since 2011, to architect a solution that would help his staff do more, faster, while improving cost-efficiency. Most importantly, the solution had to free his team to spend much more time delivering on improvement requests from the business.
Together, Bord Gáis and Kyndryl used Red Hat® Ansible® Automation Platform to progressively automate event and alert management, and all security code controls. The result was unprecedented stability across the Bord Gáis IT estate. By 2021, two years after implementation, the solution automatically diagnosed 72% of all ServiceNow issues and automatically resolved 40% of them using Ansible playbooks. Priority 1 incidents were reduced from 31 in 2018 to zero by 2020, and have held steady at zero since. End-to-end automated resolution often closes trouble tickets in seconds, making up to a third of the Ansible traffic nearly invisible.
The value of the solution for security patching became dramatically clear when the security operations center (SOC) recently issued another high-level zero-day vulnerability alert. Three patches had to be applied as soon as possible.
“Two Kyndryl team members prepared all the patches and alerted me that everything was ready to go,” Andy explained. “We pushed the button and the patches were automatically deployed to more than 300 Windows servers. The whole process took under six hours.”
Besides simplifying zero-day events, Ansible automation reduced what was a labor-intensive, six-week quarterly patching cycle to two days a month. Today, all server health, security and compliance checks are automated via Ansible playbooks, reducing the incident alert volume by 30%.
Automation logs and assigns the ticket, runs relevant playbooks, resolves the ticket and notifies Kyndryl of actions taken. Between 45% and 50% of core BAU activity is now automated this way.
After applying the Ansible playbooks that come with the software, Tony O’Brien and the team automated the broader business-as-usual incident management workload for Bord Gáis in stages.
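The automated ticket flow described above (log and assign the ticket, run the relevant playbook, resolve, notify) can be pictured with a minimal sketch. Everything named here is hypothetical and for illustration only: the `Ticket` class, the `PLAYBOOKS` registry, the `clean_disk` stand-in and the `notify` callback are assumptions, not the Bord Gáis, Kyndryl, ServiceNow or Ansible interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Ticket:
    """Hypothetical in-memory ticket; a real estate would use its ITSM tool."""
    issue: str
    server: str
    assignee: str = ""
    status: str = "new"
    log: List[str] = field(default_factory=list)


def clean_disk(server: str) -> bool:
    """Stand-in for a remediation playbook; always reports success here."""
    return True


# Hypothetical registry mapping a known issue type to its remediation.
PLAYBOOKS: Dict[str, Callable[[str], bool]] = {"disk_full": clean_disk}


def handle_alert(issue: str, server: str, notify: Callable[[str], None]) -> Ticket:
    # Step 1: log the ticket and assign it to automation.
    ticket = Ticket(issue=issue, server=server, assignee="automation", status="assigned")
    ticket.log.append("logged and assigned")

    # Step 2: run the playbook registered for this issue, if there is one.
    playbook = PLAYBOOKS.get(issue)
    if playbook is not None and playbook(server):
        # Step 3: resolve the ticket when the remediation succeeds.
        ticket.status = "resolved"
        ticket.log.append(f"ran {playbook.__name__}; resolved")
    else:
        # No playbook, or the playbook failed: hand over to a person.
        ticket.assignee = "service-desk"
        ticket.status = "escalated"
        ticket.log.append("escalated for manual handling")

    # Step 4: notify the support team of the action taken.
    notify(f"{ticket.status}: {issue} on {server}")
    return ticket
```

Calling `handle_alert("disk_full", "srv01", print)` walks a known issue through all four steps, while an issue with no registered playbook is escalated to a person instead of being closed.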
Largely relieved of their technology maintenance burden, Andy’s team was able to shift attention to meeting the needs of line-of-business partners. One example is the fileshares the business needed to hold valuable data as part of the ongoing development of new user experiences.
Because of the size of the drives and the amount of data, the fileshares had been slow, and expanding them on premises was not an option.
Kyndryl delivered a solution in two stages. For immediate improvement, Kyndryl provided storage as a service that Bord Gáis pays for on a consumption basis, with Kyndryl owning, supporting and provisioning the infrastructure.
“The core team was there and fully focused on project delivery, which allowed us to deliver at a phenomenal speed for the business while keeping it almost invisible. It was a very complex migration, and Kyndryl did it with zero downtime and zero impact to business operations,” Andy says. “Life post-migration is very good. All I’m getting is positive feedback from our customers.”
The second phase of the storage solution was part of Bord Gáis’ larger migration of services to the public cloud.
With Kyndryl providing design, building the landing zone, and running services, Andy is migrating Bord Gáis Energy systems, including all automation and tooling, into Microsoft Azure.
In the cloud, in addition to automating other routine IT work, the team can take the further step of automating server building and provisioning, with some policy-based oversight.
That agility with infrastructure enables line of business developers to move at the speed of their creativity—prototyping, testing, and delivering user experience innovations that enable Bord Gáis to meet customer expectations in a quickly evolving energy industry.
Andy expects most of those workloads to be fully migrated by the end of 2024.
Agility, being able to scale up on demand, means technical support is consistently available to help with service and account issues.
Despite COVID-19, national lockdowns, extreme weather and staff relocations, Andy and the Kyndryl team stayed focused on delivering continuous improvement to the business. The team implemented Kyndryl Bridge and AIOps tools to identify new opportunities for automation.
Kyndryl Bridge provides a single view across the estate, both on premises and in the cloud. With AI tools integrated into Kyndryl Bridge, the team reviews data from a large database of known issues and their solutions that Kyndryl has built over decades of work with partners. The team can zoom in on details that reveal instances where automation may be repeatedly fixing the same problems on the same servers—cleaning up disk space, for example—when the real solution is to provision more servers. The team responds to such insights by turning them into applied learning in the form of a new or modified Ansible playbook.
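The idea of spotting automation that keeps fixing the same problem on the same server can be illustrated with a small sketch. It is not how Kyndryl Bridge works internally; the event format (pairs of server name and playbook name), the function name `recurring_fixes` and the default threshold of three runs are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def recurring_fixes(
    events: Iterable[Tuple[str, str]], threshold: int = 3
) -> List[Tuple[str, str, int]]:
    """Return (server, playbook, runs) for pairs seen at least `threshold` times.

    `events` is a hypothetical log of automated remediations, one
    (server, playbook) pair per run.
    """
    counts = Counter(events)
    flagged = [
        (server, playbook, runs)
        for (server, playbook), runs in counts.items()
        if runs >= threshold
    ]
    # Most frequent repeats first, then alphabetical for a stable order.
    return sorted(flagged, key=lambda row: (-row[2], row[0], row[1]))
```

A server that appears near the top of this list for a disk cleanup playbook is a candidate for a root-cause fix, such as more capacity, rather than yet another automated cleanup.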
The system directs the team in progressively increasing and optimizing automation across the estate. Each issue addressed with a playbook provides a kind of software immune system across all servers. Any server that experiences a known issue already contains the playbook needed to automatically resolve it.
The overall value of the system is clear in the continued steady reduction of incidents that require manual intervention. Even lower-priority events (P3 through P5) are becoming scarce. Efficient automated resolution means team members spend more of their time on projects that directly improve customer services and experiences for the Bord Gáis business.
“We see Kyndryl Bridge as a strategic tool guiding us on our transformation journey, delivering an open view into the full landscape of our technology, including security, servers, and storage in one place. Kyndryl Bridge enables us to quickly observe, correlate, and take actions.
“Kyndryl people get it. They understand that we push a high bar when it comes to our expectations. But the outcome is that when we’re successful, they are successful.”