1. Big Data Myth 1: Size is all that matters
The very word “Big” indicates size. Measures of size are also very easily conveyed. We have all heard statements about how high a stack of phone books would be needed to hold the data that fits easily on one disk drive. So it is not surprising that for many lay people, Big Data is all about size. One would think that technical people would know better. Unfortunately, size also lends itself to easy measurement. It is straightforward to count up the number of bytes in some data store, and equally easy to plot a sequence of such measurements on a chart showing exponential growth. In fact, such charts have become so common that even many lay people get the concept. What this leads to, among other things, is serious people apologetically saying that they have only a few hundred gigabytes of data and so are not sure that they really have a Big Data problem. This is sad, because we are putting off so many people we ought to be able to help.

In spite of the points made above, I believe that better sense would have prevailed in our understanding of Big Data if it were not for the economic imperatives of the IT industry. We have today a huge ecosystem of Big Data systems. These systems are, for the most part, innovative: collectively, they constitute a whole new paradigm of scaling. There are many who have problems that require this scale and are amenable to these new architectures. These facts have led to the creation of a new industry segment and benefitted many, all of which is good. But the tremendous progress made in this space has also sucked the oxygen out of the air for everything else, as it were. Industry wants to talk about volume, for economic reasons. And money speaks.

✩ This article belongs to Visions on Big Data. E-mail address: jag@umich.edu. URL: www.eecs.umich.edu/~jag.