What is the difference between NAS and DAS in Hadoop cluster?

NAS stands for Network Attached Storage and DAS stands for Direct Attached Storage.

In NAS, compute and storage layers are separated. Storage is distributed over different servers on a network.

In DAS, storage is attached to the node where computation takes place.

Apache Hadoop is based on the principle of moving processing near the location of data. So it needs storage disk to be local to computation.

With DAS, we get very good performance on a Hadoop cluster. Also DAS can be implemented on commodity hardware. So it is more cost effective.

Only when we have very high bandwidth (around 10 GbE) it is preferable to use NAS storage.

Read the full book at www.amazon.com
Posted in Hadoop, Hadoop Interview Questions

Leave a Reply

Your email address will not be published. Required fields are marked *

*