Hadoop
Hadoop is an open-source framework used to process enormous datasets.
The core components of Hadoop include:
Hadoop Common, an essential part of the Apache Hadoop framework, is the collection of common utilities and libraries that support the other Hadoop modules.
There is a storage component called the Hadoop Distributed File System, or HDFS.
It stores large data sets across clusters of commodity hardware.
Commodity hardware is low-cost, industry-standard hardware, and using it allows
a single Hadoop cluster to scale to hundreds or even thousands of nodes.
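To make this concrete, here is a minimal sketch of writing and reading a file on HDFS with Hadoop's Java FileSystem API. The NameNode address and file path are hypothetical placeholders; in practice they come from your cluster configuration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; normally read from core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path. HDFS splits large files into blocks and
        // replicates them across the commodity machines in the cluster.
        Path path = new Path("/user/demo/hello.txt");

        // Write a small file.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read it back.
        try (FSDataInputStream in = fs.open(path);
             BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
        }
        fs.close();
    }
}
```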
The next component is MapReduce, the processing unit of Hadoop and a core part of the framework.
MapReduce processes data by splitting a large dataset into smaller units
and processing those units in parallel.
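As an illustration, here is a minimal sketch of the classic WordCount map and reduce functions using Hadoop's Java MapReduce API. The class and variable names are my own choices; in a real project the two classes would typically live in separate files.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map phase: each mapper receives one split of the input
// and emits a (word, 1) pair for every word it sees.
public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

// Reduce phase: all counts for the same word arrive together and are summed.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```

Because the splits are processed independently, adding machines to the cluster speeds up the map phase almost linearly.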
For a while, MapReduce was the only way to access the data stored in HDFS.
There are now other systems, like Hive and Pig.
And the last component is YARN, which is short for “Yet Another Resource Negotiator.”
YARN is a very important component because it allocates RAM and CPU resources
so Hadoop can run batch, stream, interactive,
and graph processing workloads over the data stored in HDFS.
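To tie the pieces together, here is a minimal sketch of a driver that submits the WordCount job above. When it runs on a cluster, YARN is the layer that schedules the map and reduce tasks into containers with the requested CPU and memory. The input and output paths are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);
        // Running the reducer as a combiner pre-aggregates counts on each node.
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Hypothetical HDFS paths; the output directory must not already exist.
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output"));
        // waitForCompletion submits the job to the resource manager (YARN),
        // which launches the tasks and reports back success or failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```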
Challenges with Hadoop:
Despite its strengths, Hadoop is not a good fit for every workload. Widely cited limitations include transaction processing that requires random access, tasks that cannot be parallelized, low-latency data access, handling large numbers of small files, and intensive calculations on small amounts of data.