In HDFS, the DataNode supports hot-swappable drives. With a hot-swappable drive, we can add or replace HDFS data volumes while
There are two important types of configuration files in a Hadoop cluster:
<li><strong>Default Configuration</strong>: The core-default.xml, hdfs-default.xml and mapred-default.xml files hold the default configuration for a Hadoop cluster. These files are read-only.</li>
<li><strong>Custom Configuration</strong>: Site-specific files such as core-site.xml, hdfs-site.xml and mapred-site.xml hold the custom configuration for a particular cluster.</li>
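The relationship between the two kinds of files can be sketched as a simple layering: defaults are loaded first, then site-specific values are applied on top. The Python snippet below is only an illustration of that idea, not Hadoop's own loader. The XML strings are inline stand-ins for hdfs-default.xml and hdfs-site.xml, the property values are illustrative, and the helper names `parse_properties` and `effective_config` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Stand-in for hdfs-default.xml (illustrative values, not a full file).
DEFAULT_XML = """<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.blocksize</name><value>134217728</value></property>
</configuration>"""

# Stand-in for hdfs-site.xml: only the properties the site wants to change.
SITE_XML = """<configuration>
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>"""


def parse_properties(xml_text):
    """Return a {name: value} dict from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Layer site-specific values on top of the defaults."""
    merged = parse_properties(default_xml)
    merged.update(parse_properties(site_xml))
    return merged


config = effective_config(DEFAULT_XML, SITE_XML)
print(config["dfs.replication"])  # value taken from the site file
print(config["dfs.blocksize"])    # value inherited from the defaults
```

Keeping overrides in a separate site file means the defaults never need to be edited, which matches their read-only role.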
All the Jobs in Hadoop and HDFS implementation uses
In Hadoop, the TaskTracker is the component that uses a large amount of memory to perform a task. We can configure the TaskTracker to
In Hadoop, there are multiple data nodes that hold data. During the processing of map and reduce methods data may
Hadoop provides a utility called Distributed Cache to improve the performance of jobs by caching the files used by applications.
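The performance benefit of caching can be illustrated with a toy model: a file is fetched from shared storage once per node, and every later task on that node reuses the local copy. This Python sketch is not the Hadoop API. The class `NodeLocalCache`, the function `fetch_from_shared_storage` and the file URI are all hypothetical names chosen for illustration.

```python
class NodeLocalCache:
    """Toy model of a per-node cache: fetch each file once, then reuse it."""

    def __init__(self, fetch):
        self._fetch = fetch      # function that retrieves a file from shared storage
        self._local = {}         # uri -> locally cached contents
        self.fetch_count = 0     # how many times shared storage was actually hit

    def get(self, uri):
        # Only the first request for a URI goes to shared storage.
        if uri not in self._local:
            self._local[uri] = self._fetch(uri)
            self.fetch_count += 1
        return self._local[uri]


def fetch_from_shared_storage(uri):
    # Placeholder for an expensive remote read (hypothetical).
    return f"contents of {uri}"


cache = NodeLocalCache(fetch_from_shared_storage)

# Five tasks on the same node read the same lookup file.
results = [cache.get("hdfs://example/lookup.txt") for _ in range(5)]
print(cache.fetch_count)
```

In this model the cost of the remote read is paid once per node rather than once per task, which is the saving that caching aims for.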
It is a trick question. Files in the Distributed Cache cannot be modified. This
Safemode is the read-only mode of the NameNode in a cluster. During NameNode startup, it is in
The Partition phase runs between the Map and Reduce phases. It is an optional phase. We can create a custom partitioner by
The main differences between RDBMS and HBase data model are as follows: Schema: In RDBMS, there is a schema of
A Checkpoint node in HDFS periodically fetches the fsimage and edits files from the NameNode and merges them. This merge result is called