Thursday, March 8, 2012

On data security and reliability at data centers

Cloud storage is becoming very popular these days. Several companies offer such a service - Google Cloud Storage, Apple's iCloud, and Ubuntu One, to name a few. These service providers essentially maintain and operate huge data centers. Although Google lists and gives information about around 10 data centers on their website, it is estimated (source, dating to 2008) that Google has around 19 data centers in the US, around 12 in Europe, 3 in Asia, and one each in Russia and South America. Managing massive amounts of data poses several challenges. We had a brief tour of the idea of data centers in our last class, and saw that apart from the challenges of handling servers and data, there is a whole range of issues like power systems, cooling systems, and so on.

The main aim of a data center is, of course, to host the users' data reliably and securely.

Security Issues:
Users are concerned about the security of their data. Not only does the data need to be encrypted at rest (a small sketch of this appears below), but a lot of other security measures have to be taken in order to ensure that:
1. No unauthorized physical access is allowed.
2. Corrupted disks are disposed of in a systematic and careful manner.
3. Data is not misused by personnel within the data centers.
4. A reliable security system is in place in case of emergency, and so on.
This video shows the steps that Google has taken in order to ensure data security.
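To make the encryption point a little more concrete, here is a minimal sketch of encrypting a user's file before it is written to disk. It uses the third-party Python cryptography package purely as an illustration - this is not the actual scheme any of these providers use, and in a real data center the key would live in a separate key-management system rather than next to the data.

from cryptography.fernet import Fernet

# Illustration only: in practice the key is kept in a key-management service,
# never stored alongside the encrypted data.
key = Fernet.generate_key()
fernet = Fernet(key)

plaintext = b"contents of a user's file"
ciphertext = fernet.encrypt(plaintext)      # this is what actually lands on disk

# Only someone holding the key can recover the original bytes.
assert fernet.decrypt(ciphertext) == plaintext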


Reliability Issues:
Failures can happen for different reasons. Apart from possible disk errors, there could be physical damage due to wear and tear, fire, natural calamities, etc. This gives rise to the need for distributed storage of data (storing redundant copies at different physical locations). Also, the system should immediately hand the user off from a data center in trouble to another data center with as little disruption as possible; a minimal failover sketch follows.
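As a rough illustration of such a hand-off, here is a minimal failover sketch in Python. The endpoints are hypothetical placeholders (not any provider's real addresses or API): the client simply tries data centers in a preferred order and moves on to the next one when a request fails.

import urllib.request

# Hypothetical replica endpoints, in order of preference.
DATA_CENTERS = [
    "https://us-east.example.com",
    "https://eu-west.example.com",
    "https://asia-1.example.com",
]

def fetch(path, timeout=2.0):
    last_error = None
    for base in DATA_CENTERS:
        try:
            with urllib.request.urlopen(base + path, timeout=timeout) as response:
                return response.read()
        except OSError as err:      # connection refused, DNS failure, timeout, ...
            last_error = err        # remember the failure and try the next data center
    raise RuntimeError("all data centers unreachable: %s" % last_error)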

When a node fails, the lost data is reconstructed from the redundant sources. This involves downloading data from the other nodes holding the redundant copies. Given that the data size is massive, a considerable amount of bandwidth is consumed. So, do we simply keep a huge number of redundant copies of all the data and download everything that was lost? Can we do better? There is a lot of work on distributed data storage that proposes smarter ways of storing and recovering data, which can decrease both the amount of data stored and the download bandwidth needed for reconstruction.

The main idea is to encode the data and store pieces of it on different nodes in a redundant manner. For example, say 1 petabyte of data is encoded into n fragments and stored across n nodes. The coding is done in such a way that when a node fails, the replacement node needs to connect to only d < n of the surviving nodes to reconstruct the lost data (a toy example follows the links below).
(For more details: http://www.eecs.berkeley.edu/~rashmikv/research.html
http://scholar.google.com/scholar?hl=en&q=regenerative+codes+for+distributed+storage&btnG=Search&as_sdt=0,5&as_ylo=&as_vis=0 )
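As a toy illustration of the idea (not the regenerating codes from the references above, which use more sophisticated constructions), here is a tiny (3, 2) erasure code in Python: a file is split into two blocks A and B, and a third node stores the parity A XOR B. Any single node failure can be repaired from the other two, so one failure is tolerated while storing only 1.5x the data instead of the 2x that full replication would need.

def xor_blocks(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

def encode(data):
    # Split the data into two equal blocks and add one parity block.
    half = len(data) // 2
    a, b = data[:half], data[half:2 * half]   # assumes even length for simplicity
    return [a, b, xor_blocks(a, b)]           # fragments for nodes 0, 1 and 2

def repair(fragments, lost):
    # Rebuild the lost fragment by XORing the two surviving ones.
    surviving = [f for i, f in enumerate(fragments) if i != lost]
    return xor_blocks(surviving[0], surviving[1])

fragments = encode(b"hello world!")
assert repair(fragments, lost=0) == fragments[0]   # node 0 rebuilt from nodes 1 and 2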



Google gives a really nice overview of different aspects of a data center here.
