
Don't Wait to Tap Your Big Data Goldmine

Posted by Rich Simmons

Mar 17, 2014 9:00:00 AM

It seems like everywhere you turn there is another article, post, or expert pontificating about Big Data as a future storage consideration. But Big Data is already happening: transactions are being processed, devices are sending out location and status, and so on. It’s not like a starter gun is going to go off announcing the beginning of the Big Data Era; it’s happening now, but most businesses aren’t doing anything with it.

This raises a couple of interesting questions: Where is this Big Data going now in your storage environment? And how does that impact what you may want to do with it later? We were wondering about that here in ASD, and we went looking for some answers. We commissioned a study with The 451 Group to try to get a handle on what businesses are doing today. The study surveyed respondents from corporate IT departments (a majority of whom are at the Manager level or above) at over 100 Enterprise companies.

One of the key questions we asked was: Which storage methods do you support today or plan to support for your Big Data implementation? In retrospect, the responses were not that surprising if you believe that Big Data is alive and active in the data center today and not some future ideal. The data shows that almost 50% said they store Big Data on an existing SAN, 30% indicated that it was on an existing NAS, and 30% said a cloud-based storage platform (multiple selections were accepted).

[Chart from the 451 study] Question: Which storage methods do you support today or plan to support for your Big Data implementation?


So it looks like most folks are just parking that data wherever there is extra space, as sort of a convenience play. We understand it; it’s not like IT departments have a ton of money lying around to go and buy a dedicated device for all the stuff that might make a good Big Data project in the future. Most of us are just like you: we want to do some level of analytics on that data (e.g., with Hadoop), but we are just not sure when we’re going to be able to dedicate resources to that, or when the business is going to demand it. The problem is that little voice in the back of your head saying “You do know we are going to have to migrate all the data in order to do that, don’t you?” That’s when things can get messy. What’s an IT pro to do?

In a perfect world, you could move the data to a dedicated SAN or some other storage platform, but the reality is that for many, that’s just too expensive. You could move it to a public cloud, which is cheap initially, but you would be doing a lot of “get & put” operations, and those charges can add up fast. How about leaving it where it is? “That’s probably what’s going to end up happening!” you say. And it will just sit there for another year. But what if you could leave it where it sits and run your analytics on it?

The EMC ViPR software-defined storage platform is designed to do exactly this. At a high level, ViPR aggregates heterogeneous, multi-vendor storage into a unified storage platform. That platform, in turn, acts as a logical scale-out layer: the underlying infrastructure for hosting a range of data services (like HDFS) that support collecting, managing, and utilizing unstructured content at massive scale. Data services are storage abstractions that reflect the combination of a data type (file, object, or blocks of data), access protocols (iSCSI, NFS, REST, etc.), and durability, availability, and security characteristics (snapshots, replication, etc.). In ViPR, block, file, object, and HDFS are all data services. These data services can be used to provide different semantic views of the same data. You can manipulate a file as a file or as an object without having to move the data to a different platform that features that semantic.
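To make the idea of a data service a little more concrete, here is a minimal Python sketch of the concept only. It is not ViPR's API: every name in it (DataService, StoredItem, view, the backend name "existing-nas-01") is hypothetical, and pairing snapshots with the file view and replication with the object view is an arbitrary choice made for illustration. It simply models a data service as a data type plus access protocols plus protection characteristics, and shows two services describing the same stored bytes without the data moving.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    FILE = "file"
    OBJECT = "object"
    BLOCK = "block"


@dataclass(frozen=True)
class DataService:
    """Hypothetical model: a data type, its access protocols, and protection traits."""
    name: str
    data_type: DataType
    protocols: tuple
    protection: tuple = ()


@dataclass
class StoredItem:
    """One piece of data that stays on whatever backend already holds it."""
    backend: str
    path: str
    payload: bytes


def view(item, service):
    """Describe the same stored bytes through a given data service, without copying them."""
    return {
        "service": service.name,
        "semantic": service.data_type.value,
        "protocols": list(service.protocols),
        "backend": item.backend,  # unchanged: the data did not move
        "size": len(item.payload),
    }


# Illustrative services; the protocol and protection pairings are assumptions.
file_service = DataService("file", DataType.FILE, ("NFS",), ("snapshots",))
object_service = DataService("object", DataType.OBJECT, ("REST",), ("replication",))

log = StoredItem(backend="existing-nas-01", path="/logs/app.log",
                 payload=b"2014-03-17 ok\n")

as_file = view(log, file_service)
as_object = view(log, object_service)
```

Both views report the same backend and the same size; only the semantic and the access protocols differ, which is the point of the paragraph above.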

Now, instead of building a discrete analytics silo with dedicated infrastructure, the ViPR HDFS Data Service can leverage the existing ViPR virtualized storage environment and the backend storage platforms it utilizes. That means you can go ahead and start unlocking the Big Data advantage that your competitors are still waiting for, and you can bring that to the business before they ask for it. Wouldn’t that be a great way to start the year?

Topics: Software Defined Storage, HDFS, ViPR, Big Data

About this blog

The future of storage is here. Are you ready for it? This blog offers advice and best practices from the top storage minds at EMC on how to prepare your data center to become software-defined.

The opinions expressed here are personal opinions. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC nor does it constitute any official communication of EMC.

 
