Apache Hadoop is the main framework behind Amazon EMR. It is a distributed data processing engine.
Hadoop is Open source Java based software framework. It supports data-intensive distributed applications running on large clusters of commodity hardware.
Hadoop is based on MapReduce algorithm in which data is divided into multiple small fragments of work. Each of these tasks can be executed on any node in the cluster.
In AWS EMR, Hadoop is run on the hardware provides by AWS cloud.