By Emile Biagio, CTO, Sintrex
I recently watched an investigative series set in the 1970s, in which a judge dismissed evidence linking a suspect to a murder, claiming he did not believe in all this “scientific mumbo jumbo”. My, my, how far we have progressed. Imagine how many cold cases could historically have been solved through advances in technology: using the same evidence, but adding more information to solve the case.
Fast forward to the present, where seven of the world’s ten largest companies are tech companies and oil is no longer considered our most valuable resource. Yup, not oil, but data!
Data? Yes, data – or rather, information applied in the correct context, in my opinion. On a recent client visit, I heard how an operations centre receives thousands of messages and notifications during an outage, yet identifying the root cause remains something of a dark art.
So, as is the norm today, this client has monitoring systems plugged into just about every critical application running on their infrastructure. It’s fantastic, because they have INFORMATION… critical information that shows specifics about the applications, users, transactions, load, response times and so on. This information empowers them to tweak, tune and adapt the systems to drive business productivity.
The problem is, when there is a glitch in the matrix, all the monitoring systems spew out thousands of messages to highlight anomalies. This is what we build: more and more systems that collect information. I pulled a statistic from another client (for interest): 489 million messages in one month… that’s a lot. It’s about twenty hundred five and seventy messages a day (sorry Mr. Zuma, still funny).
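For the curious, the back-of-the-envelope arithmetic (assuming a 30-day month) works out like this:

```python
# Rough rate implied by 489 million messages in a 30-day month.
messages_per_month = 489_000_000
days, seconds_per_day = 30, 86_400

per_day = messages_per_month / days                          # about 16.3 million a day
per_second = messages_per_month / (days * seconds_per_day)   # about 189 every second

print(f"{per_day:,.0f} messages a day, {per_second:.0f} every second")
```

Roughly 189 messages every second of every day – far more than any human operator can triage by eye.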
So how can we constructively look at all of this information, filter out the noise and pinpoint the root cause? Yes, machine learning and artificial intelligence are definitely making significant strides in helping, but some basic fundamentals still make it all a lot easier. Maybe not from the 1970s, but at least from the 1990s:
- A system that monitors your underlying common denominator, the network, and automatically identifies the root cause of outages.
- The ability to classify anomaly impact, e.g. Minor, Major or Critical.
- A basic filter that lets you swiftly view the information you need and screen out the noise you can safely ignore.
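None of these basics needs exotic tooling. A minimal sketch in Python (all names, severities and the threshold here are illustrative, not any real product’s API) might classify and filter monitoring messages like this:

```python
# Illustrative sketch: classify each monitoring message by impact,
# then filter out the noise so only actionable anomalies reach the
# operations centre. All names and thresholds are hypothetical.
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


@dataclass
class Message:
    source: str
    text: str
    severity: Severity


def filter_noise(messages, threshold=Severity.MAJOR):
    """Keep only messages at or above the given severity, in arrival order."""
    return [m for m in messages if m.severity.value >= threshold.value]


alerts = [
    Message("app-01", "response time up 5%", Severity.MINOR),
    Message("core-switch", "link down", Severity.CRITICAL),
    Message("app-02", "queue depth high", Severity.MAJOR),
]

for m in filter_noise(alerts):
    print(m.source, "-", m.text)
```

With the Minor chatter screened out, the Critical network event (the common denominator) is suddenly easy to spot among the surviving messages.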
If you apply a filter to a badly taken photo, it will look OK, but apply the same filter to a great photo and it’s suddenly brilliant! Similarly, slap ML and/or AI on top of data that carries the above identifiers and all of a sudden brilliance enters your operations centre.
“Information is power, but only if people are able to access, understand and apply it.” ~ Unknown