The very qualities that make artificial intelligence (AI)-driven applications so powerful also make them temperamental from a performance perspective. This blog post offers key principles to consider as you tackle the challenge of optimizing application resilience in today’s AI-driven environments.
We’ve always had issues with the behavior of applications in production. Response times lag under peak workloads. Complex dependencies in multi-tier apps cause functions to freeze or fail.
Much of the original driving force behind DevOps, in fact, was to reduce these problems by having our development and operations teams work together to understand how code actually behaves in our real-world production environments. Armed with this insight, we can now address problematic application behaviors—not just by throwing infrastructure at bottlenecks, but by writing better-performing code.
Today, however, we have a new challenge: ensuring the performance of AI applications in the real world. And DevOps can’t really help us there—algorithmic applications are profoundly temperamental, and their behavior is fundamentally beyond our direct control.
While the complexity of our conventional applications may cause them to behave in ways we didn’t expect, that behavior is ultimately deterministic. We know when an application calls for data from a database, runs a piece of business logic, or executes a transaction. Its behaviors are, therefore, built directly into its code.
Our algorithmic systems, on the other hand, are non-deterministic. And we want them to be: once we launch them, we want them to learn to make smarter data correlations over time. That’s what makes AI, well, AI.
The upside of this indeterminacy is that we can capture and automatically act on data-driven insights that were never available to us before. The downside is that the indeterminacy of their inner workings can make them very temperamental in production. As they mix and match ever-expanding datasets in new and better ways, they can consume more processor cycles, more memory, more input/output, and more network bandwidth.
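One practical response to this shifting resource consumption is to measure it per request rather than assume it stays constant. Below is a minimal, hypothetical sketch of that idea: it wraps an inference call with lightweight latency and memory tracking using only Python’s standard library. The `run_inference` function is a stand-in for whatever model-serving call your application actually makes, and the latency budget is an illustrative number, not a recommendation.

```python
# Hypothetical sketch: track per-request latency and peak memory around an
# inference call, so drift in a non-deterministic workload becomes visible.
import time
import tracemalloc

def run_inference(payload):
    # Placeholder "model": real AI workloads will vary as the system learns.
    return sum(payload) / len(payload)

def profiled_inference(payload, latency_budget_s=0.5):
    tracemalloc.start()
    start = time.perf_counter()
    result = run_inference(payload)
    latency_s = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    metrics = {
        "latency_s": latency_s,
        "peak_bytes": peak_bytes,
        "over_budget": latency_s > latency_budget_s,  # flag for alerting
    }
    return result, metrics

result, metrics = profiled_inference([1, 2, 3, 4])
```

In a real deployment you would ship these metrics to whatever monitoring system you already run, and alert on trends (growing latency or memory over time) rather than single data points, since individual requests from a learning system will legitimately vary.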
In other words, if we want to reap the tremendous business benefits of AI systems that can keep getting smarter, we have to come to terms with the fact that they will also keep behaving differently in production under their ever-evolving workloads.
Despite the new challenges associated with managing the behaviors of non-deterministic AI applications in production, we have to get performance right. This is certainly true in the case of real-time implementations, such as autonomous vehicles, where split-second results are critical to safety. It’s also true when we’re using algorithms to deliver superior experiences to customers on their mobile devices, since tolerances for application latency continue to approach zero.
Unfortunately, in our haste to get up to speed on the underlying data science itself, most of us have focused on algorithmic artists and their artistry at the expense of the practicalities of putting that artistry into production. The outcomes that AI, machine learning, natural language processing (NLP), and the like promise are so compelling that it’s easy to forget we eventually have to put these apps into production.
But performance in production counts—so it’s time for us to focus on artificial intelligence operations (AIOps) just as we have DevOps.
While AIOps is still an emerging discipline—and certainly requires more than the tail end of an article to address in detail—here are a few high-level principles to consider as you tackle the challenge of optimizing the performance of non-deterministic algorithmic applications in production:
AI is making our businesses smarter than ever. But smart and slow—or smart and temperamentally erratic—is not a winning combination. We all need to start putting AIOps into practice.