January 24, 2025
Hadoop Components Installation Guide | by Rahul Patidar

Hi Everyone,

In this article we will cover the installation of Hadoop components locally on macOS. You will need to install brew first if it isn't already there.

Installations we will cover:

1. brew

2. Java

3. scala

4. hadoop

5. spark

6. mysql

7. hive

8. sbt

9. Kafka

1. Set up brew on Mac:

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
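
Once the script finishes, you can confirm brew is installed and healthy:

$ brew --version
$ brew doctor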

2. Install java

$ brew install java

The above command will download and install Java on your Mac. Now set the path for Java using the below command.

$ vi ~/.bash_profile

Enter the below lines in the file and save it (adjust the version in the JAVA_HOME path to match your installed JDK).

## JAVA env variables
export JAVA_HOME="/usr/local/Cellar/openjdk/18.0.2/libexec/openjdk.jdk/Contents/Home"
export PATH=$PATH:$JAVA_HOME/bin

Execute the below command.

$ source ~/.bash_profile

Verify the Java installation:

$ java -version
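
You can also confirm that the environment variables from bash_profile were picked up:

$ echo $JAVA_HOME
$ which java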

3. Install Scala

$ brew install scala

Verify the Scala installation:

$ scala -version
$ scala
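
As an extra sanity check, you can evaluate a one-line expression without opening the REPL (the printed string is just an arbitrary example):

$ scala -e 'println("scala is working")'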

4. Install Hadoop

Run the below commands to set up passwordless SSH (Hadoop uses SSH to start its daemons) and install Hadoop:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ brew install hadoop

The above command will download and install Hadoop on your Mac. Now set the path for Hadoop using the below command.

$ vi ~/.bash_profile

Enter the below lines in the file and save it. Note that Homebrew installs under /opt/homebrew on Apple Silicon Macs and /usr/local on Intel Macs; adjust the paths below to match your machine and Hadoop version.

## HADOOP env variables
export HADOOP_HOME="/opt/homebrew/Cellar/hadoop/3.3.4/libexec"
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

Execute the below command in the terminal.

$ source ~/.bash_profile

After the installation is complete, run "hadoop version" to check the version. If version information is printed, the installation was successful.

$ hadoop version
Hadoop 3.3.4
Source code repository https://github.com/apache/hadoop.git -r a585a73c3e02ac62350c136643a5e7f6095a3dbb
Compiled by stevel on 2022-07-29T12:32Z
Compiled with protoc 3.7.1
From source with checksum fb9dd8918a7b8a5b430d61af858f6ec
This command was run using /usr/local/Cellar/hadoop/3.3.4/libexec/share/hadoop/common/hadoop-common-3.3.4.jar

Now we need to make changes in the below configuration files.

  1. core-site.xml
  2. hdfs-site.xml
  3. mapred-site.xml
  4. yarn-site.xml
  5. hadoop-env.sh

Open $HADOOP_HOME/etc/hadoop/core-site.xml in the terminal.

$ vi $HADOOP_HOME/etc/hadoop/core-site.xml

Add the below properties:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Open the hdfs-site.xml file.

$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Add the below properties:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Open the yarn-site.xml file.

$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml

Add the below properties:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>

Open the mapred-site.xml file.

$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml

Add the below properties:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>

Open the hadoop-env.sh file.

$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Add the below lines (again, adjust the paths to your Java and Hadoop versions):

export JAVA_HOME="/usr/local/Cellar/openjdk/18.0.2/libexec/openjdk.jdk/Contents/Home"
export HADOOP_HOME="/usr/local/Cellar/hadoop/3.3.4/libexec"
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar

The Hadoop installation is now complete.

Open a terminal and format the NameNode.

$ hdfs namenode -format

Start all the Hadoop components.

$ sh $HADOOP_HOME/sbin/start-all.sh

You will see the below lines in the terminal.

Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [.local]
Starting resourcemanager
Starting nodemanagers

Enter the jps command to check whether the NameNode, DataNode, and ResourceManager have all started successfully.

$ jps

You will see processes similar to the below (the process IDs will differ).

5956 DataNode
8679 NodeManager
4200 ResourceManager
3232 Jps
4567 SecondaryNameNode
2890 NameNode

How to access the Hadoop web interfaces (Hadoop health):

NameNode : http://localhost:9870
NodeManager : http://localhost:8042
Resource Manager (Yarn) : http://localhost:8088/cluster

Now run some basic Hadoop commands.

$ hadoop fs -mkdir -p /data/rahul
$ hadoop fs -put /Users/rahul.patidar/test/testing.csv /data/rahul
$ hadoop fs -ls /data/rahul
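
To confirm the upload worked, you can read the file back from HDFS (testing.csv here is the example file used above):

$ hadoop fs -cat /data/rahul/testing.csv
$ hadoop fs -get /data/rahul/testing.csv /tmp/testing-copy.csv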

5. Install Spark.

Run the below command to install Spark (the Homebrew formula is named apache-spark):

$ brew install apache-spark

Open the bash_profile file.

$ vi ~/.bash_profile

Add the below Spark path to the file (adjust the version to match your installation):

export SPARK_HOME="/usr/local/Cellar/apache-spark/3.3.0/libexec"

Run the below command.

$ source ~/.bash_profile

Validate the Spark installation.

$ spark-shell
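
If the spark-shell prompt appears, Spark is working. As a further check, you can run the SparkPi example that ships with the Spark distribution (the examples jar name varies by version, hence the wildcard):

$ spark-submit --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_*.jar 10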

6. Install Hive:

To install Hive we also need to install MySQL, since Hive stores its metadata in MySQL.

Prerequisites for Hive Installation:

HDFS (Hadoop) setup: this step needs to be completed before Hive, since Hive stores its actual table data in HDFS.

Steps for Hive setup:

  1. Hive Installation
  2. Setup Hive & Mysql Environment Variables
  3. Setup Mysql / Derby database: Hive needs this database (called the metastore) to store the Hive metadata.
  4. Download the MySQL driver package
  5. Hive Configuration
  6. Initialise the metadata database
  7. Start Metastore service
  8. Run Hive

Install Hive with the below command:

$ brew install hive

Add the Hive and MySQL details to the bash_profile file.

## HIVE env variables
export HIVE_HOME=/usr/local/Cellar/hive/3.1.3/libexec
export PATH=$PATH:$HIVE_HOME/bin

## MySQL ENV
export PATH=$PATH:/usr/local/mysql-8.0.12-macos10.13-x86_64/bin

7. Setup Mysql database

Download MySQL from the official MySQL website (the macOS .dmg installer). After installing MySQL, follow the next steps to initialise the metadata database.

You will be asked to set a root password in the last step of the MySQL installation. Remember this password; we will use it to connect to MySQL. There are no commands for the MySQL installation itself: just download the .dmg file, double-click it, and follow the steps.

Now connect to MySQL.

$ cd /usr/local/Cellar/hive/3.1.3/libexec
$ mysql -u root -p

It will ask for a password; enter the password you set while installing MySQL and press enter.

Create a new database.

mysql> CREATE database metastore;

Enter the below command; it will execute the schema scripts.

mysql> Source /usr/local/Cellar/hive/3.1.3/libexec/scripts/metastore/upgrade/mysql/hive-schema-3.1.0.mysql.sql

Create a new user:

mysql> CREATE user 'hive'@'localhost' identified by '12345678';

Modify user permissions:

mysql> GRANT ALL PRIVILEGES ON *.* to 'hive'@'localhost';

Refresh privileges:

mysql> FLUSH PRIVILEGES;
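
You can confirm the new user works by logging in with it and listing the metastore tables (use the password from the CREATE USER step):

$ mysql -u hive -p metastore -e "SHOW TABLES;"
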
Download the MySQL driver package (Connector/J):

  1. Open the MySQL official website
  2. Select Platform Independent -> Download to your machine
  3. Unzip -> Place the jar "mysql-connector-java-8.0.19.jar" into the /usr/local/Cellar/hive/3.1.3/libexec/lib directory

Create a file named ‘hive-site.xml’ in the following location.

$ vi /usr/local/Cellar/hive/3.1.3/libexec/conf/hive-site.xml

Add the below contents to the file.


<configuration>
  <property>
    <name>hive.querylog.location</name>
    <value>/Users/apache-hive-3.1.3-bin/log/hive.log</value>
  </property>
  <property>
    <name>hive.querylog.enable.plan.progress</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.log.explain.output</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>12345678</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://localhost:9000/user/hive/warehouse</value>
  </property>
</configuration>

Execute the below commands in the terminal.

$ cd /usr/local/Cellar/hive/3.1.3/libexec/bin
$ schematool -initSchema -dbType mysql
$ ./hive --service metastore &
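
The metastore service listens on port 9083 by default; you can check that it is up with:

$ lsof -i :9083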

Enter the below command in the terminal; this will start Hive.

$ hive

Note: The Hadoop NameNode and DataNode should be up and running before starting Hive.
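
Once the Hive shell is up, a quick smoke test confirms the metastore and the HDFS warehouse are wired correctly (the database and table names below are just illustrative):

hive> CREATE DATABASE IF NOT EXISTS demo;
hive> CREATE TABLE demo.test_tbl (id INT, name STRING);
hive> INSERT INTO demo.test_tbl VALUES (1, 'rahul');
hive> SELECT * FROM demo.test_tbl;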

8. Install Sbt (Scala Build Tool):

$ brew install sbt

Add the SBT paths to the bash_profile file.

$ vi ~/.bash_profile

## SBT ENV
export SBT_HOME="/usr/local/Cellar/sbt/1.7.1"
export PATH=$PATH:$SBT_HOME/bin

$ source ~/.bash_profile

Verify the sbt installation:

$ which sbt

You should expect output similar to:

/opt/local/bin/sbt

If you get no output, sbt is not installed.
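
You can also ask sbt itself for its version (the first run may take a while, as sbt downloads its dependencies):

$ sbt --version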

9. Install Kafka:

Download Kafka from the Apache Kafka downloads page (https://kafka.apache.org/downloads).

Unzip the file and go to the Kafka root folder.

Start ZooKeeper:

$ nohup bin/zookeeper-server-start.sh config/zookeeper.properties &

Start Kafka:

$ nohup bin/kafka-server-start.sh config/server.properties &

Now create a Kafka topic:

$ bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --replication-factor 1 --partitions 4
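
To verify the topic end to end, open two terminals in the Kafka root folder and use the console clients that ship with Kafka: produce a few messages in one and watch them arrive in the other.

$ bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
$ bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092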

Conclusion: In this article we covered the installation of the below components on macOS.

1. brew

2. Hadoop

3. scala

4. spark

5. mysql

6. hive

7. Java

8. Sbt

9. Kafka
