What is the use of ORC format tables in Hive?

We use the Optimized Row Columnar (ORC) file format to store data efficiently in Hive. It improves performance when reading, writing, and processing data.

The ORC format overcomes limitations of the other Hive file formats. Some of its advantages are:

<li>Each task produces a single output file, which reduces the load on the NameNode.</li>


<li>It supports datetime and decimal types, as well as complex types such as struct, list, map, and union.</li>


<li>It stores light-weight indexes within the file.</li>


<li>We can bound the amount of memory needed for reading or writing data.</li>


<li>It stores metadata using Protocol Buffers, which allows fields to be added or removed.</li>
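
As a minimal sketch of how a table might use this format, the HiveQL below creates an ORC-backed table and loads it from an existing table. The table names (`page_views_orc`, `page_views_text`) and the columns are hypothetical and chosen only for illustration. The compression setting is one possible choice, not a recommendation from the text above.

```sql
-- Hypothetical table; the columns illustrate the timestamp, decimal,
-- and complex (map) types that ORC supports.
CREATE TABLE page_views_orc (
  user_id   BIGINT,
  url       STRING,
  view_time TIMESTAMP,
  amount    DECIMAL(10,2),
  attrs     MAP<STRING,STRING>
)
STORED AS ORC
-- Assumption: ZLIB compression; NONE and SNAPPY are the other common values.
TBLPROPERTIES ("orc.compress"="ZLIB");

-- Populate the ORC table from an existing (hypothetical) text-format table
-- that is assumed to have the same columns in the same order.
INSERT INTO TABLE page_views_orc
SELECT user_id, url, view_time, amount, attrs
FROM page_views_text;
```

Loading through `INSERT ... SELECT` lets Hive write the ORC files itself, which is the usual way to convert data that already sits in another format.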