What do you know about Blocks and the Block Scanner in HDFS?

A large file in HDFS is split into multiple parts, and each part is stored as a separate Block. The default Block size was 64 MB in Hadoop 1.x; from Hadoop 2.x onward it is 128 MB, and it can be changed with the dfs.blocksize property.
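
To make the splitting concrete, here is a minimal sketch of the arithmetic. It is not HDFS code: the function name split_into_blocks and the 200 MB file size are hypothetical, and the block size is passed in as a parameter so any configured value can be tried.

```python
def split_into_blocks(file_size, block_size):
    """Return a list of (offset, length) pairs, one per block.

    Hypothetical helper for illustration only; it mirrors the idea of
    fixed-size blocks, not the actual HDFS implementation.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block only holds the bytes that remain.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


MB = 1024 * 1024

# Illustrative values: a 200 MB file with a 64 MB block size.
blocks = split_into_blocks(200 * MB, 64 * MB)
for index, (offset, length) in enumerate(blocks):
    print(f"block {index}: offset={offset // MB} MB, length={length // MB} MB")
```

In this sketch the final block is smaller than the block size, which reflects the fact that a block in HDFS only occupies as much storage as the data it holds.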

The Block Scanner is a background process that every DataNode in HDFS runs periodically to verify the checksums of all the blocks stored on that DataNode.

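The purpose of the Block Scanner is to detect data corruption on the DataNode. When it finds a corrupt block, the DataNode reports it to the NameNode, which arranges for a new copy to be made from a healthy replica.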
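
The checksum verification idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the DataNode implementation: compute_checksums and scan_block are hypothetical names, zlib.crc32 stands in for the checksum algorithm, and the 512-byte chunk size is chosen to match the usual dfs.bytes-per-checksum default.

```python
import zlib

CHUNK_SIZE = 512  # assumed chunk size, matching the usual dfs.bytes-per-checksum default


def compute_checksums(data, chunk_size=CHUNK_SIZE):
    """Return one CRC32 value per chunk of the block data (hypothetical helper)."""
    return [
        zlib.crc32(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def scan_block(data, stored_checksums, chunk_size=CHUNK_SIZE):
    """Return the indices of chunks whose current checksum differs from the stored one."""
    current = compute_checksums(data, chunk_size)
    if len(current) != len(stored_checksums):
        raise ValueError("block length does not match the stored checksum count")
    return [
        index
        for index, (now, stored) in enumerate(zip(current, stored_checksums))
        if now != stored
    ]


# Checksums are recorded when the block is written.
block_data = bytes(range(256)) * 8  # 2048 bytes, i.e. 4 chunks
stored = compute_checksums(block_data)

# Later, one byte in the second chunk is flipped to simulate disk corruption.
damaged = bytearray(block_data)
damaged[600] ^= 0xFF

print(scan_block(block_data, stored))      # healthy block: no mismatches
print(scan_block(bytes(damaged), stored))  # damaged block: second chunk flagged
```

Storing one checksum per small chunk rather than one per block means a scan can tell which part of the block is damaged, not just that something is wrong.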
