“Big Data.” It seems like the phrase is everywhere!

To really understand big data, it’s helpful to have some historical background. In 2001, analyst Doug Laney (then at META Group, which Gartner later acquired) described three dimensions of data growth challenges: increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources).

In 2012, Gartner updated its definition as follows: “Big data are high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.”

Volume: The data accumulated in many organizations amounts to hundreds of terabytes, approaching the petabyte level.

Variety: Big Data comes in many different formats, both structured and unstructured, and in many types: text, audio, voice, VoIP, images, video, e-mails, web traffic log entries, sensor byte streams, etc.

Velocity: A high-traffic online banking website can generate hundreds of transactions per second (TPS), each of which may need to undergo fraud detection analysis in real or near-real time.

There are other definitions and understandings of what Big Data is. Everybody seems to agree that data gets mystically morphed into the Big Data category when traditional systems and tools (e.g., relational databases, OLAP, and data-mining systems) either become prohibitively expensive or prove outright unsuitable for the job.

Put simply, big data is a collection of very large data sets (in potentially different and complex formats) that require hundreds, if not thousands, of computational nodes running in parallel to process them with specialized algorithms. These data sets are so voluminous that traditional data processing software just can’t manage them. But these massive volumes of data can be used to address business problems you wouldn’t have been able to tackle before.
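To make the idea of nodes working in parallel a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: threads on one machine stand in for separate computational nodes, a simple word count stands in for a "specialized algorithm," and the names `split_into_partitions`, `count_words`, `parallel_word_count`, and the sample log lines are all made up for this example. Real big data platforms distribute the partitions across many machines, but the split, process, and merge pattern is similar.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal the records round-robin into num_partitions smaller lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Map step: a worker counts the words in its own partition only."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, num_workers=4):
    """Process every partition concurrently, then merge the partial results."""
    partitions = split_into_partitions(records, num_workers)
    # Assumption: threads stand in for independent computational nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Reduce step: combine the per-partition counts into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical sample data, standing in for a very large log file.
sample_logs = [
    "login ok",
    "login failed",
    "payment ok",
    "login ok",
]
word_totals = parallel_word_count(sample_logs, num_workers=2)
```

Because each worker only ever sees its own partition, adding more workers (or, in a real cluster, more nodes) lets the same approach handle more data without changing the logic.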

An organization’s success depends on its ability to convert raw data from various sources into useful business information. As a rule, the more data is available, the more information can be harvested from it. In many respects, extracting business information is similar to extracting gold from ore!

To learn more, visit our Big Data Training and Courseware page.