Hi Everyone,
In this article we will cover the installation of Hadoop components locally. On macOS you need to install brew first if it is not there already.
Installations which we will cover:
1. brew
2. Java
3. Scala
4. Hadoop
5. Spark
6. MySQL
7. Hive
8. sbt
9. Kafka
1. Setup brew on Mac
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
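Once the script finishes, you can do a quick sanity check to confirm brew is working:
$ brew --version
$ brew doctor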
2. Install Java
$ brew install java
The above command will download and install Java on the Mac. Now set the path for Java using the below command.
$ vi ~/.bash_profile
Enter the below lines in the file and save it.
## JAVA env variables
export JAVA_HOME="/usr/local/Cellar/openjdk/18.0.2/libexec/openjdk.jdk/Contents/Home"
export PATH=$PATH:$JAVA_HOME/bin
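Note: the OpenJDK version and path above may differ on your machine. You can check the exact install path with brew info java, or use the built-in macOS java_home utility to locate a JDK:
$ /usr/libexec/java_home
Also note that on Apple Silicon Macs, Homebrew installs under /opt/homebrew instead of /usr/local, so adjust the Cellar paths in this article accordingly.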
Then execute the below command to apply the changes.
$ source ~/.bash_profile
Verify the Java installation:
$ java -version
3. Install Scala
$ brew install scala
Verify the Scala installation:
$ scala -version
$ scala
4. Install Hadoop
Run the below commands to set up passwordless SSH (Hadoop uses SSH on localhost to start its daemons):
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
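Before moving on, confirm that passwordless SSH to localhost actually works (on macOS you may first need to enable Remote Login under System Preferences -> Sharing):
$ ssh localhost
If it logs you in without asking for a password, the SSH setup is done; type exit to come back.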
$ brew install hadoop
The above command will download and install Hadoop on the Mac. Now set the path for Hadoop using the below command.
$ vi ~/.bash_profile
Enter the below lines in the file and save it.
## HADOOP env variables
export HADOOP_HOME="/usr/local/Cellar/hadoop/3.3.4/libexec"
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
Execute the below command in the terminal.
$ source ~/.bash_profile
After the installation is complete, enter "hadoop version" to view the version. If version information is printed, the installation was successful.
$ hadoop version
Hadoop 3.3.4
Source code repository https://github.com/apache/hadoop.git -r a585a73c3e02ac62350c136643a5e7f6095a3dbb
Compiled by stevel on 2022-07-29T12:32Z
Compiled with protoc 3.7.1
From source with checksum fb9dd8918a7b8a5b430d61af858f6ec
This command was run using /usr/local/Cellar/hadoop/3.3.4/libexec/share/hadoop/common/hadoop-common-3.3.4.jar
Now we need to make changes in the below configuration files:
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
- hadoop-env.sh
Open $HADOOP_HOME/etc/hadoop/core-site.xml in the terminal:
$ vi $HADOOP_HOME/etc/hadoop/core-site.xml
Add the below properties:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Open the hdfs-site.xml file:
$ vi $HADOOP_HOME/etc/hadoop/hdfs-site.xml
Add the below properties:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Open the yarn-site.xml file:
$ vi $HADOOP_HOME/etc/hadoop/yarn-site.xml
Add the below properties:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
Open the mapred-site.xml file:
$ vi $HADOOP_HOME/etc/hadoop/mapred-site.xml
Add the below properties:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
Open the hadoop-env.sh file:
$ vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Add the below lines:
export JAVA_HOME="/usr/local/Cellar/openjdk/18.0.2/libexec/openjdk.jdk/Contents/Home"
export HADOOP_HOME="/usr/local/Cellar/hadoop/3.3.4/libexec"
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar
Now the Hadoop installation is complete.
Open the terminal and format the NameNode.
$ hdfs namenode -format
Start all the Hadoop components.
$ sh $HADOOP_HOME/sbin/start-all.sh
You will be able to see the below lines in the terminal.
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [.local]
Starting resourcemanager
Starting nodemanagers
Enter the jps command to check whether the NameNode, DataNode, and ResourceManager have started successfully.
$ jps
You will be able to see the below processes.
5956 DataNode
8679 NodeManager
4200 ResourceManager
3232 Jps
4567 SecondaryNameNode
2890 NameNode
How to access the Hadoop web interfaces (Hadoop health):
NameNode : http://localhost:9870
NodeManager : http://localhost:8042
Resource Manager (Yarn) : http://localhost:8088/cluster
Now run some basic Hadoop commands.
$ hadoop fs -mkdir -p /data/rahul
$ hadoop fs -put /Users/rahul.patidar/test/testing.csv /data/rahul
$ hadoop fs -ls /data/rahul
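As a further check, you can read the uploaded file back from HDFS and see how much space it uses (this assumes the testing.csv uploaded above exists on your machine):
$ hadoop fs -cat /data/rahul/testing.csv
$ hadoop fs -du -h /data/rahul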
5. Install Spark
Run the below command to install Spark (the Homebrew formula for Apache Spark is apache-spark):
$ brew install apache-spark
Open the bash_profile file:
$ vi ~/.bash_profile
Add the below Spark path in the file:
export SPARK_HOME="/usr/local/Cellar/apache-spark/3.3.0/libexec"
Run the below command:
$ source ~/.bash_profile
Validate the Spark installation:
$ spark-shell
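Inside the shell you can run a small sanity check using the SparkSession that spark-shell creates for you as spark; the below one-liner is just an illustrative example:
scala> spark.range(10).count()
res0: Long = 10
Type :quit to leave the shell.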
6. Install Hive:
To install Hive, we need to install MySQL as well, as Hive stores its metadata in MySQL.
Prerequisites for Hive installation:
HDFS (Hadoop) setup: this step needs to be completed before Hive (HDFS stores the actual data of Hive tables).
Steps for Hive setup:
- Hive Installation
- Setup Hive & Mysql Environment Variables
- Setup a MySQL / Derby database: Hive needs this database (called the metastore) to store the Hive metadata.
- Download the MySQL driver package
- Hive Configuration
- Initialise the metadata database
- Start Metastore service
- Run Hive
Install Hive with the below command:
$ brew install hive
Add the Hive and MySQL details to the bash_profile file:
## HIVE env variables
export HIVE_HOME=/usr/local/Cellar/hive/3.1.3/libexec
export PATH=$PATH:$HIVE_HOME/bin
## MySQL ENV
export PATH=$PATH:/usr/local/mysql-8.0.12-macos10.13-x86_64/bin
7. Setup MySQL database
Download MySQL from the official MySQL downloads site. After installing MySQL, follow the next steps to initialise the metadata database.
You need to set a root password in the last step while installing MySQL. Remember this password; we will use it to connect to MySQL. There are no commands for the MySQL installation itself: you just download the .dmg file, double-click it, and follow the installer steps.
Now connect to MySQL:
$ cd /usr/local/Cellar/hive/3.1.3/libexec
$ mysql -u root -p
It will ask for a password; enter the password you set while installing MySQL and press Enter.
Create a new database:
mysql> CREATE database metastore;
Enter the below command; it will execute the Hive schema scripts:
mysql> Source /usr/local/Cellar/hive/3.1.3/libexec/scripts/metastore/upgrade/mysql/hive-schema-3.1.0.mysql.sql
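To confirm the schema scripts ran, list the tables in the metastore database; you should see Hive metastore tables such as DBS and TBLS:
mysql> USE metastore;
mysql> SHOW TABLES;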
Create a new user:
mysql> CREATE user 'hive'@'localhost' identified by '12345678';
Modify user permissions:
mysql> GRANT ALL PRIVILEGES ON *.* to 'hive'@'localhost';
Refresh privileges:
mysql> FLUSH PRIVILEGES;
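Optionally, verify that the new hive user can connect (use the 12345678 password set above when prompted):
mysql> exit
$ mysql -u hive -p metastore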
Next, download the MySQL JDBC driver package:
1. Open the MySQL official website.
2. Select Platform Independent and download the connector to your machine.
3. Unzip it and place the jar "mysql-connector-java-8.0.19.jar" into the /usr/local/Cellar/hive/3.1.3/libexec/lib directory.
Create a file named ‘hive-site.xml’ in the following location.
$ vi /usr/local/Cellar/hive/3.1.3/libexec/conf/hive-site.xml
Add the below contents in the file:
<configuration>
  <property>
    <name>hive.querylog.location</name>
    <value>/Users/apache-hive-3.1.3-bin/log/hive.log</value>
  </property>
  <property>
    <name>hive.querylog.enable.plan.progress</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.log.explain.output</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>12345678</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>hdfs://localhost:9000/user/hive/warehouse</value>
  </property>
</configuration>
Execute the below commands in the terminal:
$ cd /usr/local/Cellar/hive/3.1.3/libexec/bin
$ schematool -initSchema -dbType mysql
$ cd /usr/local/Cellar/hive/3.1.3/libexec/bin
$ ./hive --service metastore &
Enter the below command in the terminal; this will start Hive.
$ hive
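Once the hive> prompt appears, you can run a quick smoke test; the table name below is just a throwaway example:
hive> CREATE TABLE install_check (id INT);
hive> SHOW TABLES;
hive> DROP TABLE install_check;
This also confirms that Hive can write to the HDFS warehouse directory configured above.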
Note: The Hadoop NameNode and DataNode should be up and running before starting Hive.
8. Install sbt (Scala Build Tool):
$ brew install sbt
(Do not use sudo with brew; Homebrew refuses to run as root.)
Add the sbt paths in the bash_profile file:
$ vi ~/.bash_profile
## SBT ENV
export SBT_HOME="/usr/local/Cellar/sbt/1.7.1"
export PATH=$PATH:$SBT_HOME/bin
$ source ~/.bash_profile
Verify the sbt installation:
$ which sbt
You should expect output similar to:
/usr/local/bin/sbt
If you get no output, sbt is not installed.
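As an extra check, you can compile an empty project; the directory name and Scala version below are just placeholder values for this sketch:
$ mkdir sbt-test && cd sbt-test
$ echo 'scalaVersion := "2.13.8"' > build.sbt
$ sbt compile
If sbt resolves its dependencies and finishes with success, the installation is working.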
9. Install Kafka:
Download Kafka from the official Apache Kafka downloads page (https://kafka.apache.org/downloads).
Unzip the file.
Go to the root folder of Kafka.
Start ZooKeeper:
$ nohup bin/zookeeper-server-start.sh config/zookeeper.properties &
Start Kafka:
$ nohup bin/kafka-server-start.sh config/server.properties &
Now create a Kafka topic:
$ bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092 --replication-factor 1 --partitions 4
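To verify the topic end to end, start a console producer and, in a second terminal, a console consumer on the same topic; both scripts ship with Kafka:
$ bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
$ bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092
Anything typed into the producer terminal should show up in the consumer terminal.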
Conclusion:
In this article we discussed the installation of the below components on macOS:
1. brew
2. Java
3. Scala
4. Hadoop
5. Spark
6. MySQL
7. Hive
8. sbt
9. Kafka