Breaking Down the Complexities of Big Data

Big Data isn't just another buzzword that's here today and will be gone tomorrow. Advancements in computer science and hardware have increased the total amount of data that companies and organizations produce. As such, there's a growing need for analyzing these massive data sets, often referred to as “Big Data.”

The term “Big Data” describes data sets so much larger and more complex than traditional ones that traditional applications cannot process them effectively. Such data may be structured or unstructured. Structured data follows a defined format, so it can be easily organized and fed into algorithms. Unstructured data, on the other hand, consists of information that is not easily categorized, curated or interpreted by traditional applications and databases. Regardless, the single most defining characteristic of Big Data is its sheer size.

The Rise of Big Data

One reason so many IT organizations and companies are investing in Big Data analytics solutions is that data sets keep growing larger year after year. Take the once-small auction website eBay, for instance. Wikipedia reports that eBay uses two data warehouses, at 7.5 petabytes and 40PB, as well as a 40PB Hadoop cluster for search, product recommendations and merchandising purposes. That's a pretty substantial increase when compared to just a decade ago, a time during which numbers like 7.5 petabytes were simply unheard of. But now, eBay and many other companies store and process this amount of data on a regular basis.

Of course, hardware and storage devices have also played a key role in the rise of Big Data. According to Martin Hilbert and Priscila López's 2011 paper, “The World's Technological Capacity to Store, Communicate, and Compute Information,” the world's capacity to store data has doubled roughly once every 40 months since 1980. In 1986, the global data storage capacity was roughly 2.6 exabytes for analog data and 0.02 exabytes for digital data. Fast forward to 2007 and both figures had grown, the digital one dramatically: the global data storage capacity in 2007 was 19 exabytes for analog data and a whopping 280 exabytes for digital data.
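As a rough sanity check, the 1986 and 2007 figures quoted above can be turned into an implied doubling time. The Python sketch below is a back-of-the-envelope calculation, not the method used by Hilbert and López. The function name doubling_time_months is our own, and it assumes smooth exponential growth between the two years.

```python
import math

def doubling_time_months(start_value, end_value, months_elapsed):
    """Months per doubling, assuming smooth exponential growth (an assumption)."""
    doublings = math.log2(end_value / start_value)
    return months_elapsed / doublings

# Figures quoted in the article, in exabytes.
total_1986 = 2.6 + 0.02   # analog + digital
total_2007 = 19 + 280     # analog + digital
months = (2007 - 1986) * 12

total_rate = doubling_time_months(total_1986, total_2007, months)
digital_rate = doubling_time_months(0.02, 280, months)

print(f"Total storage doubled about every {total_rate:.0f} months")
print(f"Digital storage doubled about every {digital_rate:.0f} months")
```

On these numbers, total storage comes out at a little under 40 months per doubling, which lines up with the paper's headline figure, while digital storage on its own doubled considerably faster.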

The folks over at SAS Institute have cited another real-world application of Big Data: the global shipping and logistics company UPS. According to SAS Institute, the shipping giant tracks roughly 16.3 million packages daily for its 8.8 million customers. As its business grew, UPS saw the need to enhance its operations with Big Data. This led to the creation of On-Road Integrated Optimization and Navigation (ORION), which uses telematics sensors to gather data on the company's 46,000+ vehicles. ORION collects a wide variety of data, including speed, direction and time. SAS Institute claims ORION has saved UPS more than 8.4 million gallons of fuel by calculating faster, shorter routes for its drivers.

Big Data Characteristics

Big Data is characterized by five traits: volume, variety, velocity, variability and veracity. Volume refers to the quantity of data, either generated or stored; variety refers to the type of data; velocity refers to the speed at which the data is generated and processed; variability refers to inconsistencies that may hurt the processing and/or management of data; and veracity refers to the quality of data.

A 2011 McKinsey Global Institute report cites the following as being the main components of Big Data ecosystems:

  • Data analytics, including but not limited to A/B split-testing, machine learning and natural language processing.

  • Business Intelligence, cloud computing and databases.

  • Visual representation of data, such as graphs and charts.

Big Data in cyber-physical systems (systems composed of physical entities that are controlled by computers or computer algorithms), meanwhile, is often characterized by a 6C system: connection, cloud, cyber, content, community and customization.

Big Data vs Business Intelligence: What's the Difference?

Although they share some similarities, Big Data and Business Intelligence are two unique entities with their own defining characteristics. So, what's the difference between them?

Business Intelligence relies on descriptive statistics and data with high information density to measure trends, metrics and other key performance indicators. Big Data, on the other hand, relies on inductive statistics and principles taken from nonlinear system identification to infer relationships, effects and other “laws” from large data sets with low information density.
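The contrast between describing data and inferring from it can be shown with a toy Python sketch. Everything here is hypothetical: the figures are invented for illustration, fit_line is our own helper, and a simple least-squares line stands in for the far more sophisticated techniques used on real Big Data.

```python
from statistics import mean, median

# Hypothetical daily figures, invented purely for illustration.
ad_spend = [10, 20, 30, 40, 50]     # e.g. dollars spent on ads
orders = [25, 45, 65, 85, 105]      # e.g. orders received

# Descriptive statistics (the Business Intelligence side):
# summarize what was actually observed.
summary = {"mean_orders": mean(orders), "median_orders": median(orders)}

# Inductive statistics (the Big Data side): infer a relationship that
# can be applied to cases that were never observed.
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(ad_spend, orders)
predicted_orders = slope * 60 + intercept   # ad spend of 60 was never observed

print(summary)
print(f"Predicted orders at ad spend 60: {predicted_orders:.0f}")
```

The summary only reports on the past, whereas the fitted line makes a claim about a situation the data never covered. That jump from description to inference is the distinction being drawn here.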

Big Data Analytics

Collecting large data sets is relatively easy; it's putting them to use that's the hard part. But there are numerous benefits associated with Big Data analytics, some of which include improved efficiency, reduced overhead, faster product launches, more timely updates, and the identification of untapped regions and markets.

In healthcare, for instance, Big Data is used to provide personalized medicine analytics based on factors like the patient's diagnosis, medical history, and the success rate of various drugs and treatments. It's also used in healthcare to reduce waste, automate the reporting of patient data, and standardize medical terminology.

Big Data also plays a key role in retail banking and finance, with financial institutions relying on the data-driven FICO Card Detection System to promote safe transactions and reduce the risk of fraud. Lenders may even use Big Data analytics solutions to gauge a loan applicant's risk. Sure, this risk can be measured manually, but using automated software streamlines the process to produce faster, more accurate results.

The truth is that Big Data is used across a wide range of industries and sectors, from healthcare and banking to sports, science, education and government.

In Conclusion

Big Data is all around us, from the social networking sites we use to the shipping companies we entrust with the delivery of our packages. With data sets continuing to grow larger with each passing year, you can expect to hear more about Big Data and the benefits it offers in the near future.

Thanks for reading, and feel free to share your thoughts on Big Data in the comments below.