The Value of Machine Learning Noise—How Chaos Can Make AI Smart

I like how machine learning has found its way into our lives and how it makes our interactions with technology far more fluid and natural than before. I love how YouTube suggests videos for me to watch based on what I’ve viewed previously. I love that Google Maps knows I like to go to yoga at 6 a.m. and to work at 9 a.m. I like that I don’t need to go through the entire list of movies on Netflix in alphabetical order, but can browse selected titles that I am most likely to want to watch.

I like the power of machine-learning-based software to predict, learn, recommend, and self-correct. As we interact more with these background algorithms, they become better tuned to provide us with tailor-made products. It is like walking into a restaurant and being handed a shortlist of dishes tailored to my taste, without having to ask for it.

We like this capability of machine learning noise reduction because it makes our lives easier. By removing information that we don’t need (data that is noise from our perspective), this technology makes tasks faster and simpler. However, what is noise for me may not be noise for you. Consequently, machine learning noise reduction takes place in a relative manner. This “relative noise reduction” helps us trust artificial intelligence (AI) more. The accelerating adoption of technology driven by the power of AI is a self-sustaining cycle: by increasing our trust in technology, we use it more, and the more we use it, the more the technology improves. Once we trust the technology, we allow it to advise us, to compete with us, to help us, and to educate us. We let AI learn our tastes and build on them from there.

The “relative noise reduction” paradigm of AI reduces the outliers you see in your interactions with the technology. However, the noise that gets eliminated can also narrow your frontiers. As a data scientist, I am happy to see my algorithm achieve high levels of accuracy in making predictions, recommendations, and self-corrections. But these algorithms learn only from what they have seen, proposing things that exactly or partially resemble the history they have analyzed. Without any proof based on history, they do not and cannot propose the things my brain may create as part of an exploratory process.

This brings us to a “side effect” of AI: the unconscious trimming of creativity. I am interested in watching astronomy-related documentaries, and Google, Facebook, Instagram, YouTube, LinkedIn, and even Pinterest seem to be aware of this. As a result, a chain reaction has been triggered that takes me to a gazillion resources about astronomy. How are we allowing the users of AI-powered technology to broaden their horizons by showing them something absolutely out of the blue once in a while? How are we ensuring that the power of human nature and our ability to learn are embedded in the recommendations we design? There seems to be an inherent notion in this philosophy that you must be a “master of one trade.” I would rather be a “jack of all trades.”

The chaos that we encounter in our lives can be important, particularly when we expose ourselves to knowledge resources. By using algorithms like collaborative filtering, machine learning recommendations give the user tailored exposure, focusing on what the system deems to be useful. These algorithms also add structure to the noise and, over time, remove the noise completely.
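
To make that filtering behaviour concrete, here is a minimal sketch of item-based collaborative filtering on a tiny, made-up rating matrix. The data, the cosine-similarity scoring, and the function names are illustrative assumptions rather than how any particular platform works, but they show the core mechanism: items that resemble a user’s history score high, and everything else is effectively treated as noise and never surfaces.

```python
import numpy as np

# Hypothetical user-item rating matrix: rows are users, columns are items,
# and 0 means "not rated". Real systems use far larger, sparser data.
ratings = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 0],
    [1, 0, 4, 5, 3],
], dtype=float)

def item_similarity(matrix):
    """Pairwise cosine similarity between the columns (items) of the rating matrix."""
    norms = np.linalg.norm(matrix, axis=0, keepdims=True)
    norms[norms == 0] = 1.0              # avoid division by zero for unrated items
    normalized = matrix / norms
    return normalized.T @ normalized

def recommend(user_index, matrix, top_n=2):
    """Score unrated items by their similarity to the items the user already rated."""
    similarity = item_similarity(matrix)
    user_ratings = matrix[user_index]
    scores = similarity @ user_ratings   # weight each item by the user's own history
    scores[user_ratings > 0] = -np.inf   # never re-recommend something already seen
    return np.argsort(scores)[::-1][:top_n]

print(recommend(0, ratings))             # items closest to what user 0 already liked
```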

Once we are no longer exposed to noisy recommendations, our periphery of exposure is bound by our history. As the paradigm of recommendations through data science continues to evolve, it is important that “tailor-made solutions” remain noisy enough to ensure that we embrace the creativity that is natural to humans. For example, when you detect an outlier in your data, the conventional wisdom is to treat it as a rarity and not base conclusions on it. However, outliers contain valuable information. Our algorithms should learn from the outliers in history. Every spike that we ignore on the pretext of data cleaning should instead be treated as the most valuable element of creativity within the process of machine learning. In reality, that will be the paradigm that enables machines to start “learning” the way a human being does.
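
As a hypothetical illustration of what “noisy enough” could mean in practice, the sketch below wraps a tailored recommendation list and, with a small probability, swaps one slot for an item drawn from outside the user’s history, in the spirit of epsilon-greedy exploration. The function names, the catalog, and the 10 percent exploration rate are assumptions made up for this example, not a prescription.

```python
import random

def noisy_recommend(tailored_items, catalog, history, epsilon=0.1, rng=random):
    """Return the tailored list, occasionally swapping one slot for a wildcard item."""
    recommendations = list(tailored_items)
    unexplored = [item for item in catalog
                  if item not in history and item not in recommendations]
    if recommendations and unexplored and rng.random() < epsilon:
        # Replace the last (weakest) tailored slot with something out of the blue.
        recommendations[-1] = rng.choice(unexplored)
    return recommendations

# Hypothetical usage: the usual tailored list, plus an occasional surprise.
catalog = ["astronomy_doc", "rocket_doc", "cooking_show", "jazz_history", "ocean_life"]
history = ["astronomy_doc"]
print(noisy_recommend(["rocket_doc", "ocean_life"], catalog, history))
```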