Thursday, June 10, 2010

Cloud-enabled Storage - Hype and opportunities in Cloud computing

Cloud-enabled storage - the yet-to-be-born technology that lets customers "own" their data yet use computing power on the "Cloud" - will be the next big thing.

When SAN and NAS were first introduced, they were considered unnecessary innovations that did nothing but extract money from customers' pockets. Gradually, however, they took over enterprise storage because of their ability to split data storage from data processing (the servers, etc.). This proved to be a key capability that customers need.

Moving on to the next big thing, the Cloud: how can someone capitalize on it?

One key trap for Cloud computing is the ownership and control of data.  Many consumers are willing (knowingly or not) to trade ownership of their data (i.e. privacy) for free services.  So the Googles and Yahoos are all offering "the future of computing - Cloud" with minimal resistance.

The story is different in enterprise computing, though.  Given the number of recent incidents related to Cloud computing (privacy breaches, data leaks), there is enough doubt in the enterprise market that the threat of handing core business data to someone on the Cloud far outweighs the benefit.  After all, enterprises rely on data to survive and make money, unlike consumers, who are just consuming convenient data services.

So, what is the next big thing?  Since we have already separated data storage from data processing with SAN, NAS, etc., it is logical to keep data storage in house and consume data processing as a service (utility computing is not a new idea at all).

Of course, the current protocols used to access data will not be sufficient for reverse-hosting the data.  New protocols and ideas need to be invented.

One probable candidate is the in-memory database, which hosts the database in memory and only needs burst network bandwidth to load the data initially.
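To make the in-memory idea concrete, here is a minimal sketch in Python using the standard sqlite3 module with its ":memory:" database. The table name, columns, and sample rows are hypothetical stand-ins for whatever the in-house storage would ship across in the initial burst; this is an illustration of the load-once, query-locally pattern, not a proposed protocol.

```python
import sqlite3

def load_into_memory(rows):
    """Bulk-load rows (fetched once from in-house storage) into an in-memory database."""
    conn = sqlite3.connect(":memory:")  # lives only in the processor's RAM
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # One burst of inserts stands in for the initial network transfer.
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# Hypothetical sample data; in the scenario above it would arrive over the network.
rows = [(1, "acme", 120.0), (2, "acme", 80.0), (3, "globex", 50.0)]
conn = load_into_memory(rows)

# After the initial load, queries run against memory with no further network traffic.
total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("acme",)
).fetchone()[0]
print(total)
```

Once the connection is closed, the in-memory copy is gone, which fits the idea that the processor does not keep the data.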

Another approach is to let customers run database services in house and run application services on the Cloud.

On-the-fly data compression will again be a topic: the data flowing through the network pipe between data owner and data processor should be compressed and encrypted.  Database access is a great candidate for data compression because of the sparsity of the data retrieved.
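As a rough illustration of why query results compress well, the sketch below serializes a made-up, highly repetitive result set and compresses it with Python's standard zlib module. The row layout and the pack/unpack helper names are assumptions for the example. Encryption is left out because the standard library has no cipher; in practice the compressed bytes would be encrypted afterwards (for example by sending them over TLS), since encrypted data no longer compresses.

```python
import json
import zlib

def pack(rows):
    """Serialize rows to JSON and compress them for the trip to the data processor."""
    raw = json.dumps(rows).encode("utf-8")
    return raw, zlib.compress(raw, 6)

def unpack(blob):
    """Reverse of pack(): decompress and parse on the receiving side."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Hypothetical sparse result set: most columns repeat or are empty.
rows = [{"id": i, "region": "EMEA", "notes": None, "discount": 0} for i in range(1000)]

raw, packed = pack(rows)
print(len(raw), "bytes raw ->", len(packed), "bytes compressed")
```

The exact ratio depends on the data, so no figure is claimed here; the point is only that repeated and empty values shrink substantially before they hit the network pipe.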


Splitting data storage from data processing is the key to the success of Cloud computing in the enterprise world.  This won't be easy, and it will not be driven by companies like Google or Facebook, whose business model is to exploit customer data.  New start-ups, or EMC and NetApp, may have a chance.

I will be working on an architecture framework that enables secure and efficient data flow between data storage and data processing.

Data Storage <=> Data Processor <=====> Data Access

The new architecture goes beyond just data transfer.  The application architectures we use today are designed around proximity between data and processor, and today's remote-access architectures are more or less a patch on that combined data-processor model.

The new model needs to be built on the assumption that Data Storage can be far away from the Data Processor.  This assumption is the ultimate enabler for Cloud in the enterprise.

If IBM catches this departing train, the day when most people use a handful of super-fast computers MAY come back.
Well, since the supercomputers will not store your data, they are not as scary as the last time they appeared.

Of course, part of the architecture is to ensure that the data processor cannot reconstruct the data it processes.  Better yet, the partition of the processor used to process one customer's data should be isolated and inaccessible to anyone else.

Sounds like an interesting idea, doesn't it?
