Big Data continues to be a topic that is widely discussed but rarely implemented, and SAP continues to grow as a leader in the ERP space. What we are beginning to see is a convergence between SAP (structured data) and Big Data (largely unstructured data). The combination makes it possible to run analytics across a data lake in a way that we have not seen before.
So what is Big Data? Big Data refers to organizations collecting huge amounts of information, where the data collected is so large and complex that traditional data processing software cannot make sense of it. From a solution perspective, this is generally looked at as a combination of a technology and multiple data feeds. The technology is generally Hadoop. You can think of Hadoop as a framework, surrounded by an ecosystem of dozens of related technologies, that supports massive amounts of data and distributed processing. Some common components of Hadoop and its ecosystem (just to connect the dots if you have heard these names before) are HDFS, MapReduce, YARN, Spark, Impala, HBase, ZooKeeper, Sqoop, Pig, Hive and many others.
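To connect the dots on what "distributed processing" means, here is a minimal sketch of the MapReduce idea in plain Python. It is not Hadoop code and runs on a single machine; the function names and the sample log lines are made up for illustration. In a real cluster, the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict

def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]

def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)

def word_count(records):
    """Run map, shuffle and reduce in sequence over a list of text records."""
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical diagnostic log lines, invented for this example.
logs = ["engine warning", "engine ok", "brake warning"]
counts = word_count(logs)
print(counts)
```

The point of splitting the work this way is that each map call only needs one record and each reduce call only needs one key, so the work can be spread across as many machines as the data requires.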
The technology allows the user to store and query massive amounts of data. So where does the data come from? Data is coming from many new sources: IoT (Internet of Things), Geographic, Logs, Weather, Structured (SAP), Social Media (Facebook, Twitter, etc.), and any other source that creates data. In today’s world, just about everything is, or will be, creating data streams. This could range from connected cars that stream all of their diagnostic data to help with predictive analytics, to farmers with GPS-enabled equipment, to appliance manufacturers who will now have the ability to collect information from your dishwasher, stove and refrigerator. The use cases are endless: everything from predictive analytics for maintenance and repair of cars or equipment, to personalized shopping solutions for sales and marketing.
While SAP can sometimes be thought of as a large legacy ERP application, in reality the company has been very progressive from a technology perspective. In the early 2010s SAP launched SAP HANA, an in-memory data platform that replaces traditional database solutions. This allows for very high-speed transactional processing and reporting.
It’s very exciting to see SAP HANA continue to drive more and more integrations with Hadoop-based Big Data solutions. These two worlds, structured and unstructured data, are colliding, and the whole can be greater than the sum of its parts. SAP has launched other innovations such as Smart Data Access, SAP Lumira, and SAP HANA Vora. “SAP HANA smart data access enables remote data to be accessed as if they are local tables in SAP HANA, without copying the data into SAP HANA” (https://blogs.sap.com/2013/08/22/smart-data-access-and-hadoop/). The most recent of these to go live is SAP HANA Vora.
SAP HANA Vora bridges the gap between Big Data and corporate data. It allows you to run interactive analytics on data stored both in SAP HANA and in Hadoop. From a technical perspective, SAP HANA Vora is an in-memory query engine that plugs into Apache Spark. This ability to combine SAP ERP data and Big Data opens the door to almost unlimited possibilities for data analytics. How are actual sales numbers affected by social media feedback, weather, and so on? How do I increase customer satisfaction by doing predictive analytics for maintenance and parts replacement? In today’s world, if someone has an issue with a product, the first thing they do is comment about it on social media. These could be customer experiences or usability issues that may otherwise not get reported. Now this data can be harnessed and correlated across various sources, including business data, both to raise awareness of these issues and to begin addressing them.
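To make the sales-versus-social-media question concrete, here is a small sketch in plain Python of the kind of correlation described above. It does not use SAP HANA Vora or Spark; it only shows the idea of joining a structured feed with an unstructured-derived feed on a shared key. All of the numbers, day labels and function names are hypothetical, invented purely for illustration.

```python
from math import sqrt

# Hypothetical structured data: units sold per day (ERP-style figures).
sales = {"day-01": 120, "day-02": 135, "day-03": 90, "day-04": 150}

# Hypothetical score derived from social media posts: average sentiment
# per day, from -1 (negative) to 1 (positive). Note the extra day-05.
sentiment = {"day-01": 0.2, "day-02": 0.5, "day-03": -0.4,
             "day-04": 0.7, "day-05": 0.1}

def join_on_day(left, right):
    """Inner join: keep only the days that appear in both data sets."""
    shared_days = sorted(left.keys() & right.keys())
    return [(left[day], right[day]) for day in shared_days]

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

pairs = join_on_day(sales, sentiment)
r = pearson(pairs)
print(f"days compared: {len(pairs)}, correlation: {r:.2f}")
```

For this made-up sample the correlation comes out strongly positive, which is what you would expect if good social media sentiment and good sales days go together. With real data the same join-then-correlate pattern applies, only at a scale where an engine running on Hadoop and Spark becomes necessary.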
Enter the age of the data scientist. The field of data science is rapidly emerging as organizations see the value their data can drive. Universities are creating graduate programs around data science and business analytics. In fact, analyst firms predict we will see continued growth of data analytics and governance in most organizations. This will lead to a new role, the Chief Data Officer (CDO).
It is very exciting to see companies such as SAP continue to develop solutions that allow structured ERP business data to integrate seamlessly with unstructured Big Data. The companies that can run and maintain traditional IT, and also get ahead on digital initiatives such as connecting Big Data to business data (for example, their SAP environment) with truly insightful dashboards, are the ones who will win, and will surely win big.