We’ve seen the emergence of self-driving cars; why not self-driving IT operations? This post explores how machine learning and artificial intelligence are bringing the same kind of advances to the world of IT operations, and how apps can become self-learning and automatically remediating.
The Awe-Inspiring Advancements in Self-Driving Cars
Those lucky enough to own a Tesla marvel at how the autopilot feature enables hands-free driving. What’s even more interesting (to me at least) is how the system works behind the scenes.
Each Tesla is a sophisticated data collector that pushes sensor information to a massive shared database. Paired with machine learning algorithms, this enables what Tesla calls fleet learning. Initially, the vehicle fleet is a passive recorder, for example, noting the position of road signs, bridges, and other stationary objects. Real-world driver actions are also recorded and compared to what autopilot would have hypothetically done in that same scenario.
Tesla’s machine learning algorithms create what is essentially a geocoded whitelist of radar-recognized objects. This list is designed to prevent false alarms, such as auto-braking for a road sign that initially appears to be on a collision course but just happens to be posted on a rise in the road. When enough cars (sensors) observe and report the same safe driver action, the object is whitelisted. False braking events are eliminated as fleet learning works out which “alerts” are real.
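The whitelisting idea can be sketched in a few lines of Python. This is only an illustration of the consensus logic described above, not Tesla’s implementation: the class name, the three-car threshold, and the rounding of coordinates into grid cells are all assumptions made for the example.

```python
from collections import defaultdict


class FleetWhitelist:
    """Hypothetical geocoded whitelist built from fleet reports."""

    def __init__(self, min_cars=3, precision=4):
        # min_cars and precision are illustrative values, not real Tesla settings.
        self.min_cars = min_cars
        self.precision = precision
        # Maps a grid cell to the set of distinct cars that passed it safely.
        self._safe_reports = defaultdict(set)

    def _cell(self, lat, lon):
        # Round coordinates so nearby reports land in the same grid cell.
        return (round(lat, self.precision), round(lon, self.precision))

    def report_safe_pass(self, car_id, lat, lon):
        # A driver passed the radar-detected object without braking.
        self._safe_reports[self._cell(lat, lon)].add(car_id)

    def is_whitelisted(self, lat, lon):
        # Whitelist only once enough distinct cars agree.
        cars = self._safe_reports.get(self._cell(lat, lon), set())
        return len(cars) >= self.min_cars


fleet = FleetWhitelist(min_cars=3)
for car in ("car-1", "car-2", "car-1"):  # a repeat report from car-1 is not a new vote
    fleet.report_safe_pass(car, 37.4220, -122.0841)
two_car_result = fleet.is_whitelisted(37.4220, -122.0841)

fleet.report_safe_pass("car-3", 37.4220, -122.0841)
three_car_result = fleet.is_whitelisted(37.4220, -122.0841)
```

Counting distinct cars rather than raw reports means one vehicle driving the same road repeatedly cannot whitelist an object on its own.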
How Self-Driving Artificial Intelligence Apps Offer a Cure for Alert Fatigue
In the world of ITOps, we are all too familiar with a deluge of data that can generate false alarms. There’s even a name for it: “alert fatigue.” Compared with traditional architectures, containerized apps and microservices have an ephemeral nature that can create an exponential increase in the number of events to process. The sheer volume and velocity of data now surpass a human’s cognitive threshold. Using the Tesla analogy, why can’t these systems become self-learning, so apps automatically fix problems without human intervention?
The short answer is that they can, and the concept of a self-driving AI app is being made real today. These advancements are enabled by a combination of machine learning and artificial intelligence commonly referred to as AIOps, or artificial intelligence for ITOps. According to a recent MIT Sloan Management Review article, 97.2% of Fortune 1000 executives surveyed are currently building or launching AI initiatives. Many of these projects will automate common tasks that ITOps teams undertake to improve app quality and the end-user experience. By having software improve software, companies can better compete with rivals in the application economy.
Self-Driving AI Apps: The Requirements
This is no easy task. Currently, mobile and cloud software is deterministic, and requires basic hosting, monitoring, and managed services. Most of these basic services are not differentiated as customers move to the cloud. Operating AI-native (non-deterministic) software is significantly more complex, as it requires continuous, proactive, and creative testing as well as AI-training, monitoring, and adaptive evolution, to remain useful. How does one trust and ensure that an application is working correctly when both behavior and context change every moment?
The self-driving cars we know today are the culmination of discrete improvements that, when combined, can replace a human operator. Adaptive cruise control, lane departure alarms, and autonomous parking were initially delivered as driver assistance features (see Figure 1). When these technologies are combined, they automate the complex and unpredictable task of driving.
Many monitoring and analytics tools have basic anomaly detection features. Some can utilize differential analysis to separate the noise of false alarms from what is truly actionable. These features can help drivers, or in this case, IT and app support teams, tasked with problem resolution. True AIOps solutions move past assistance and recommendations. These solutions use machine learning to observe problem patterns and identify fixes that have worked historically. Coupled with automation, these capabilities can create self-driving AI apps that automatically heal themselves when inevitable performance and availability issues occur.
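As a rough sketch of the two ideas in this paragraph, the Python below pairs a basic anomaly check with a lookup of the fix that has worked most often in the past. Everything in it is an assumption made for illustration: the z-score test stands in for whatever detection method a real tool uses, and the symptom names, fix names, and incident records are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def is_anomalous(history, value, z_threshold=3.0):
    """Flag a value that sits far outside the recent history (simple z-score)."""
    if len(history) < 2:
        return False  # not enough data to judge
    sigma = stdev(history)
    if sigma == 0:
        return value != mean(history)
    return abs(value - mean(history)) / sigma > z_threshold


def best_known_fix(past_incidents, symptom):
    """Return the fix that most often resolved this symptom, or None."""
    successes = Counter(
        fix for seen, fix, resolved in past_incidents
        if seen == symptom and resolved
    )
    if not successes:
        return None
    return successes.most_common(1)[0][0]


# Hypothetical latency samples (ms) and incident records: (symptom, fix, resolved).
latency_history = [100, 102, 98, 101, 99]
past_incidents = [
    ("high_latency", "restart_pod", True),
    ("high_latency", "restart_pod", True),
    ("high_latency", "scale_out", True),
    ("high_latency", "clear_cache", False),
    ("disk_full", "rotate_logs", True),
]

new_sample = 250
suggested_fix = None
if is_anomalous(latency_history, new_sample):
    # An automation layer would run this fix instead of paging a human.
    suggested_fix = best_known_fix(past_incidents, "high_latency")
```

The first function is the "driver assistance" level of the analogy; handing the result of the second function to an automation tool is what would move a system toward self-healing.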
The Vision: Continuously Optimized AI Apps
In many ways, it’s easy for us to imagine a fully autonomous vehicle that can drive anywhere. At this advanced stage, cars continuously perceive, sense, and react to a range of conditions. If faults are predicted, the car will order the part and coordinate repairs. It’s also likely that you’ll be renting the car rather than owning it, so the vehicle will be learning and adapting to new behaviors, routes, and conditions—continuously.
So what will a fully automated AI app environment look like? Well, let’s start with that word “continuous.” Like a fully autonomous car, advanced AIOps systems will enable self-driving AI apps by continuously processing massive amounts of information at tremendous scale and applying real-time machine learning modules to gain new and deeper insights. No data will be off-limits: logs, metrics, application performance instrumentation, user-experience data, and IoT data will all enrich the system with additional context.
These systems eliminate the human cognitive overhead associated with lengthy data gathering, cleansing, correlation, and interpretation. Instead, the system uses massive learning sets to increase intelligence and deliver new capabilities over time. These systems will be continuously fixing issues and tuning performance, without operator intervention, and better still, without customers even realizing there’s been a problem.
The Payoff of Self-Driving, Continuously Optimized AI Apps
The impact of autonomous AIOps on IT operations will be profound. These systems will dramatically reduce the cost and human capital overhead associated with having valuable staff handling low-value activities. Instead of having staff tied up with mundane, interrupt-driven tasks that increase technical debt and degrade organizational capacity, these organizations will be empowered with optimization knowledge and learnings, so that everything and everyone get stronger.
Following are some of the characteristics that will define organizations with self-driving AI apps:
Rather than fighting fires and fixing repeat problems, teams will apply AIOps analytics across the continuous delivery pipeline to determine which applications, code, functions, practices, and more correlate to the best performance and business outcomes.
Instead of waiting for year-end to guesstimate infrastructure requirements, teams will use AIOps to continuously determine the optimum placement of workloads across elastic infrastructures.
Rather than constantly having to hunt for the root-cause needle in a massive haystack, teams will use AIOps to provide an intelligent cost-benefit analysis of requests from business leaders, such as outlining the return on investment expected from a requested 100 ms performance improvement.
By combining AI, automation, and domain expertise, AIOps solutions can help teams usher in a new era of resilience and efficiency.
Kieran Taylor has 20 years of high-tech product marketing experience with a focus on application performance management, AIOps, and DevOps. He is the author of DevOps for Digital Leaders and is Head of Marketing for Broadcom’s Enterprise Software Division, leading go-to-market activities across that portfolio. Previously, he led product marketing teams at Adobe, Akamai, DataPower/IBM, and Nortel Networks. His career began as an editor of high-tech publications at McGraw-Hill.