The replication factor in HDFS is the number of copies of a file (more precisely, of each of its blocks) that the file system maintains. A Hadoop application can specify the number of replicas of a file it wants HDFS to maintain.
This information is stored in the NameNode.
We can set the replication factor in the following ways:
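- We can use the Hadoop fs shell to specify the replication factor for a single file. The command is as follows:
$ hadoop fs -setrep -w 5 /file_name
In the above command, the replication factor of the file file_name is set to 5. The -w flag makes the command wait until the replication has completed, which can take a long time.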
- We can also use the Hadoop fs shell to specify the replication factor of all the files in a directory. When the path is a directory, the command applies recursively to every file under it.
$ hadoop fs -setrep -w 2 /dir_name
In the above command, the replication factor of all the files under the directory dir_name is set to 2.
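The two commands above can be wrapped in a small script that also checks the result. This is a minimal sketch, not a standard tool: the function name set_replication and the HADOOP_CMD variable are hypothetical, introduced only for illustration. It relies on the -stat option of the fs shell, whose %r format prints a file's replication factor.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): set the replication factor
# of a path, then print the replication factor HDFS reports for it.

# Assumption: the Hadoop launcher is called `hadoop`; override HADOOP_CMD if not.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

set_replication() {
    factor="$1"   # desired number of replicas, e.g. 5
    path="$2"     # HDFS path, e.g. /file_name

    # -w waits until the requested replication has been reached.
    "$HADOOP_CMD" fs -setrep -w "$factor" "$path" || return 1

    # %r prints the replication factor recorded for the path.
    "$HADOOP_CMD" fs -stat %r "$path"
}
```

For example, calling set_replication 5 /file_name should print the replication factor once the change has taken effect. Replication applies to files, so after changing a directory it is more informative to check one of the files inside it than the directory itself.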