What is the meaning of Rack Awareness in Hadoop?

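In Hadoop, rack awareness means that the cluster knows which rack each node sits on. The NameNode keeps this rack information for every DataNode. The main use of rack awareness is in implementing fault-tolerance: replicas of a block are spread across more than one rack, so the data stays available even if an entire rack fails.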
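To make the fault-tolerance idea concrete, here is a minimal sketch of a rack-aware placement for three replicas. It follows the commonly described HDFS default (one replica on the writer's node, two on a single different rack). The function name `place_replicas` and the node and rack names are made up for illustration, and real HDFS chooses among candidates randomly and has fallbacks that this sketch leaves out.

```python
from typing import Dict, List


def place_replicas(writer: str, rack_of: Dict[str, str]) -> List[str]:
    """Pick three DataNodes for one block (simplified, deterministic sketch).

    writer  -- hypothetical name of the DataNode the client writes from
    rack_of -- hypothetical mapping of DataNode name -> rack path
    """
    local_rack = rack_of[writer]

    # Candidates on any rack other than the writer's, in a stable order.
    remote = sorted(n for n, r in rack_of.items() if r != local_rack)
    if not remote:
        raise ValueError("need at least two racks for rack-aware placement")

    # Second replica: a node on a different rack than the writer.
    second = remote[0]

    # Third replica: another node on the same rack as the second replica.
    peers = [n for n in remote if rack_of[n] == rack_of[second] and n != second]
    if not peers:
        raise ValueError("remote rack needs at least two nodes")
    third = peers[0]

    return [writer, second, third]


# Example cluster (assumed names): two racks with two DataNodes each.
cluster = {"dn1": "/rack1", "dn2": "/rack1", "dn3": "/rack2", "dn4": "/rack2"}
replicas = place_replicas("dn1", cluster)
print(replicas)
```

With this layout the replicas end up on two racks, so losing either rack still leaves at least one copy of the block.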

Communication between nodes on the same rack is usually faster than communication between nodes on two different racks, because traffic between racks has to pass through additional network switches.
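One way to reason about "closeness" is to treat the cluster as a tree of racks and nodes and count the hops between two nodes. The sketch below is an illustration of that idea, not Hadoop's own code; the path format `/rack/node` and the function name `distance` are assumptions made for the example.

```python
def distance(a: str, b: str) -> int:
    """Count the hops from each node up to their closest common ancestor.

    Nodes are written as paths such as "/rack1/dn1" (assumed format).
    """
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")

    # Length of the shared path prefix, i.e. the depth of the common ancestor.
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1

    # Hops up from a to the ancestor, plus hops up from b to the ancestor.
    return (len(pa) - common) + (len(pb) - common)


print(distance("/rack1/dn1", "/rack1/dn1"))  # same node
print(distance("/rack1/dn1", "/rack1/dn2"))  # same rack
print(distance("/rack1/dn1", "/rack2/dn3"))  # different racks
```

A smaller value means the two nodes are closer in the network, which is why a node on the same rack is preferred over a node on another rack.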

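In Hadoop, the NameNode maintains information about the rack of each DataNode. When a client reads data, the NameNode prefers replicas on DataNodes that are close to the reader, ideally on the same node or the same rack. When writing, it keeps part of the replication pipeline within a single rack to limit cross-rack traffic. For performance reasons, it is recommended to use nearby DataNodes wherever possible.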
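The NameNode typically learns the rack of each DataNode from an administrator-supplied topology script, configured through the `net.topology.script.file.name` property. Hadoop calls the script with one or more IP addresses or host names as arguments and reads one rack path per argument from its output. The following is a minimal sketch of such a script; the IP addresses and rack names in `RACKS` are invented for illustration and would need to match a real cluster.

```python
#!/usr/bin/env python3
import sys
from typing import Dict, List

# Hypothetical mapping of DataNode IP address -> rack path.
RACKS: Dict[str, str] = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.11": "/rack2",
}

# Rack reported for hosts that are not listed above.
DEFAULT_RACK = "/default-rack"


def resolve(hosts: List[str]) -> List[str]:
    """Return one rack path per host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments.
    print(" ".join(resolve(sys.argv[1:])))
```

A lookup table keeps the sketch short; in practice the mapping is often derived from a file or from the subnet of the address.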

So Rack Awareness is an important concept for high performance and fault-tolerance in Hadoop.

