The potential benefits of using AI to streamline development and operations led the first day of this week’s IBM AIOps & Integration Digital Developer Conference. In her keynote session, Rama Akkiraju, IBM’s CTO of AI for IT operations, discussed how AI might be leveraged for IT operations management and to reduce the time teams spend fixing issues.
She focused on how IBM’s Watson AIOps could be applied to solving problems in operations, but the keynote also spoke to some of the broader potential of AIOps. Akkiraju said enterprise CIOs often face the dilemma of needing to deploy new features and products as quickly as possible while also maintaining high availability and resiliency for applications already in production. “They are often at odds with each other,” she said. “Newer applications tend to have more stability issues than established ones.”
Akkiraju said CIOs may be under pressure to ensure the newer systems they bring to market are likewise highly available, scalable, and resilient. More points of failure might be introduced, she said, as applications get modernized and companies adopt microservices architectures, with more services created and deployed in production. Meanwhile, IT operations personnel must keep systems running, Akkiraju said. “As a result, they are never really able to find time to focus on developing new features and are constantly in the cycle of facing a problem and resolving it as soon as possible.”
In an ideal world, IT operations would fix problems in ways that ensure they do not recur, she said, but that is not always the case. Issues might arise from enterprises using several different tools to manage different aspects of operations, which can confuse matters even more. “Each area, each tool gives you local insight and it’s up to the operations managers to manage all of that complexity,” Akkiraju said.
An IT operations manager, for example, might use “best of breed” tools from disparate sources that do not seamlessly communicate with each other, she said. They might monitor application metrics and generate alerts through PagerDuty, manage logs through LogDNA, and manage trouble tickets through ServiceNow. If an issue arises that triggers multiple alerts, Akkiraju said, the IT ops manager may have to copy and paste information from each tool while trying to identify the root problem and find a lasting fix. This can lead to extensive back-and-forth discussions with colleagues and experts, she said, costing time and money.
Akkiraju said the scenario above, which might draw in 10 team members and take more than four hours to resolve, could instead be handled with a single message from Watson AIOps that summarizes the problem for one team member and recommends a fix. This speaks to how AIOps in general might use data from prior related issues to present courses of action, with the potential to reduce demand on IT staff members.
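To make the idea of consolidating scattered signals more concrete, here is a minimal Python sketch of one simple way to group events from separate tools into a single summary. It is an illustration only and does not represent how Watson AIOps or any of the named tools work. The `Event` fields, the source labels, and the five-minute window are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    source: str          # hypothetical label, e.g. "alerting", "logs", "tickets"
    service: str         # application or service the event refers to
    timestamp: datetime
    message: str


def group_events(events, window=timedelta(minutes=5)):
    """Group events for the same service when each one arrives within
    `window` of the previous event in that service's current group."""
    groups = []
    latest = {}  # service name -> most recent group for that service
    for ev in sorted(events, key=lambda e: e.timestamp):
        current = latest.get(ev.service)
        if current and ev.timestamp - current[-1].timestamp <= window:
            current.append(ev)
        else:
            current = [ev]
            groups.append(current)
            latest[ev.service] = current
    return groups


def summarize(group):
    """Produce a one-line summary of a group of related events."""
    sources = sorted({ev.source for ev in group})
    first = group[0]
    return (
        f"{first.service}: {len(group)} related events from "
        f"{', '.join(sources)}; first seen {first.timestamp:%H:%M}"
    )
```

Grouping by service and time is a deliberately naive rule. A production system would likely also draw on topology and the history of prior incidents, as the keynote suggests.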
Putting AIOps to work might also help managers fix problems before they even manifest in production, Akkiraju said. “Our vision is to go from reactive management of symptoms to be able to predict these incidents before they happen and to proactively avoid them,” she said. “Throughout, AI will help in different ways.”
To realize those possibilities, Akkiraju said it is necessary to look carefully at each step of the software development lifecycle, including design, coding, testing, deployment, running, and monitoring. “The software development lifecycle is actually not a linear process,” she said. “It’s a very iterative process with planning, coding, building, testing, releasing, deploying, operating, and monitoring. It kind of keeps repeating.”
What sets AIOps apart from traditional operations management, Akkiraju said, is its potential to leverage unstructured, structured, and semi-structured data in real time. That can help deliver insights rapidly and directly to people where they work, connecting the dots for them from anomalies in logs and other sources. “AIOps is all about infusing AI for better operations management,” she said.
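As a rough illustration of what spotting an anomaly in log data can mean at its simplest, the Python sketch below flags intervals whose error count departs sharply from the recent baseline. It is a basic rolling z-score check, not the technique used by Watson AIOps. The function name, the threshold of three standard deviations, and the ten-interval baseline are assumptions made for the example.

```python
from statistics import mean, pstdev


def find_anomalies(counts, threshold=3.0, baseline=10):
    """Return the indexes of counts that deviate from the mean of the
    preceding `baseline` values by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(baseline, len(counts)):
        history = counts[i - baseline:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is unusual.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

Given a list of per-minute error counts taken from logs, the function returns the positions of the minutes worth a closer look. Real log data would also need parsing and handling of seasonality, which this sketch leaves out.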