January - 2016 - issue > CIO Insights
The Increasing Buzz of Big Data, Machine Learning or Predictive Analytics
Ananthan Thandri
Vice-President & CIO-Mentor Graphics
Wednesday, December 30, 2015
Today, you can't attend an IT conference without hearing about Big Data, Machine Learning or Predictive Analytics. Open a trade magazine and you will see at least one or two articles about Big Data. Your CEO may be asking you about your Big Data strategy! Every supplier you interact with brings up their Big Data solution! You hear about Big Data everywhere you turn. On Gartner's Hype Cycle, Big Data as of July 2014 sat between the "peak of inflated expectations" and the "trough of disillusionment". As we are inundated with Big Data, we naturally ask: is all this just hype, or is it real?

Big Data, Machine Learning and Predictive Analytics go hand in hand. Let us review the impact of all three on the enterprise, starting with definitions. After perusing various websites, I've found some reasonably straightforward ones. Big Data is a term for large sets of structured and unstructured data that can be used for discovery and analysis. Predictive Analytics is about predicting the future by analyzing previously collected data, whereas Machine Learning has been described as the science of getting computers to act without being explicitly programmed.

Many IT organizations have built large data warehouses over the years to support operational and analytical reporting. While data warehouse technology evolved over the years to deliver the best performance, the presentation layer also evolved to provide brilliant visualizations of the data. This is mostly structured data from traditional ERP and/or CRM systems. Big Data complements the data warehouse: it is now a universal phrase for mining internal and external data, both structured and unstructured. You frequently hear people describe the enormity of the data created every day: “We generated X amount of data from the start of civilization until now, but we generate the same amount every day now!” But Big Data is not just about size - it’s about getting strategic value out of the data already collected. Statistical modeling and machine learning help to extract that value. The real payoff of applying them is predictive analytics - in other words, finding risks and predicting events before they happen, or using pattern analysis to identify opportunities to market, cross-sell or up-sell products and services.

In the enterprise, the value of Big Data lies in business insights, IT operations analysis and information security. It helps improve data-driven decision-making, create new business models and improve the customer experience. A good example is shopping cart analysis on historical sales data. Companies like Netflix and Amazon have mastered this in the consumer space, and I believe it is relevant in the enterprise space as well. A shopping cart analysis on historical sales orders for a segment of customers produces a similarity matrix. With the help of the similarity matrix and machine learning, the system can generate recommendations for each customer based on what other customers like them are buying. This is a huge advantage for sales teams, who can proactively address customer needs and get in front of their competitors!
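To make the shopping cart idea concrete, here is a minimal sketch, not a production recommender. It assumes each customer's order history has already been reduced to a set of products; the customer names, product names and the choice of cosine similarity over co-purchase counts are illustrative assumptions, not something prescribed above.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical order history: customer -> set of products purchased.
orders = {
    "acme": {"simulator", "layout_tool", "support_plan"},
    "globex": {"simulator", "layout_tool"},
    "initech": {"simulator", "support_plan", "training"},
    "umbrella": {"layout_tool", "training"},
}


def item_similarity(orders):
    """Build a product-to-product similarity matrix from co-purchases."""
    bought_by = defaultdict(int)  # product -> customers who bought it
    together = defaultdict(int)   # (a, b) -> customers who bought both
    for basket in orders.values():
        for item in basket:
            bought_by[item] += 1
        for a, b in combinations(sorted(basket), 2):
            together[(a, b)] += 1
    sim = defaultdict(dict)
    for (a, b), n in together.items():
        # Cosine similarity between the two products' purchase vectors.
        score = n / sqrt(bought_by[a] * bought_by[b])
        sim[a][b] = score
        sim[b][a] = score
    return sim


def recommend(customer, orders, sim, top_n=2):
    """Rank products the customer does not own by similarity to owned ones."""
    owned = orders[customer]
    scores = defaultdict(float)
    for item in owned:
        for other, score in sim[item].items():
            if other not in owned:
                scores[other] += score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


sim = item_similarity(orders)
print(recommend("globex", orders, sim))
```

In this toy data set the customer "globex" owns two products, and the sketch ranks the remaining products by how often similar customers bought them together. A real deployment would work on far larger order histories and would likely weigh recency, quantity and customer segment as well.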

Enterprise IT infrastructure generates tons of log data daily, which can be mined not only to find the root cause of outages but also to prevent future outages due to hardware or software failure. The data streams from different parts of the IT infrastructure, such as routers, switches, servers and storage, are uncorrelated. In a Big Data environment, however, pattern matching can correlate these streams and their events in real time. Knowing in real time what is going on in its environment is a huge help to IT in detecting anomalies. Similarly, mining internal customer support data along with external blogs, forums and other information on the internet can help build better product enhancements and improve the customer experience.
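As a rough illustration of correlating events across log sources, the sketch below groups events into fixed time windows and flags windows whose event count is unusually high. It is a simplified batch example using a z-score threshold rather than real-time pattern matching; the event tuples, source names, window size and threshold are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical parsed log events: (unix_seconds, source, message).
# One routine event per minute, plus a burst across devices in minute 5.
events = [(w * 60 + 5, "server", "heartbeat late") for w in range(10) if w != 5]
events += [
    (300 + i, source, "link error")
    for i, source in enumerate(["router", "switch", "server", "storage"] * 2)
]


def bucket_events(events, window=60):
    """Count events per source inside each fixed-size time window."""
    buckets = defaultdict(lambda: defaultdict(int))
    for timestamp, source, _message in events:
        buckets[timestamp // window][source] += 1
    return buckets


def anomalous_windows(buckets, threshold=2.0):
    """Return windows whose total event count is a z-score outlier."""
    totals = {w: sum(counts.values()) for w, counts in buckets.items()}
    mu = mean(totals.values())
    sigma = pstdev(totals.values())
    if sigma == 0:
        return []  # every window looks the same, nothing to flag
    return sorted(w for w, t in totals.items() if (t - mu) / sigma > threshold)


def correlated_sources(buckets, windows):
    """List which sources were active together in each flagged window."""
    return {w: sorted(buckets[w]) for w in windows}


buckets = bucket_events(events)
flagged = anomalous_windows(buckets)
print(correlated_sources(buckets, flagged))
```

The point of the sketch is the correlation step: once events from routers, switches, servers and storage share a common time axis, a burst that touches several devices at once stands out from the background. A streaming system would apply the same idea continuously instead of over a finished list.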

Having Big Data alone is not very useful. As a saying often attributed to Albert Einstein goes, “Not everything that can be counted counts, and not everything that counts can be counted.” The most important thing in getting the best value out of Big Data is having a process in place to separate signal from noise. As Nate Silver (author of the best-seller The Signal and the Noise: Why So Many Predictions Fail - but Some Don't) says, the noise grows faster than the signal in the era of Big Data. When the noise grows faster than the signal, it is harder to find useful information to help make strategic decisions. So the key is to know what kind of information will give your business a competitive advantage and then use Big Data for that. Otherwise it will be like searching for a needle in a haystack.

Now to the technology. Big Data technology is evolving so rapidly that it would be premature to invest heavily in one particular technology and lock yourself in. The key, as with the just-in-time manufacturing model, is to invest in and test lots of different ideas. Big Data initiatives can’t be IT or technical initiatives; they should be business initiatives with technology as an enabler. Technology plays an important but enabling role, whereas having the right vision and the talent to work on these initiatives is of utmost importance.

Getting back to my original question of whether Big Data is hype or real: I believe it is real when applied with a well-articulated vision, the right tools and technology, and the right talent. In my humble opinion, enterprises that don’t adopt the Big Data paradigm will be left behind.