Set up Hadoop on macOS
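In particular, Hadoop will run on a single machine, with HDFS and YARN running locally (the Hadoop documentation calls this pseudo-distributed mode). This installation provides the full functionality needed for coding practice, although it does not offer the performance of a real cluster.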
Install Homebrew
Run the following command
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Install Java
Run the following command
brew install Caskroom/cask/java
Install Hadoop
Run the following command
brew install hadoop
Hadoop is then located in the directory `/usr/local/Cellar/hadoop/2.7.1/`, where 2.7.1 is the version installed at the time of writing. Substitute your own version number in the paths below.
To configure Hadoop, we need to modify the following files
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hadoop-env.sh
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/core-site.xml
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/mapred-site.xml
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hdfs-site.xml
~/.profile
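Before editing, it can help to confirm that the files in the list above actually exist for your installed version. The following is a minimal sketch, assuming the Homebrew layout shown above; `HADOOP_VERSION` and `HADOOP_ETC` are illustrative variable names, not something Hadoop itself reads.

```shell
# Assumed version; replace it with the one Homebrew installed on your machine.
HADOOP_VERSION="2.7.1"
# Illustrative variable pointing at the configuration directory used in this guide.
HADOOP_ETC="/usr/local/Cellar/hadoop/${HADOOP_VERSION}/libexec/etc/hadoop"

# Report, for each configuration file, whether it is present.
for f in hadoop-env.sh core-site.xml mapred-site.xml hdfs-site.xml; do
  if [ -f "${HADOOP_ETC}/${f}" ]; then
    echo "found:   ${HADOOP_ETC}/${f}"
  else
    echo "missing: ${HADOOP_ETC}/${f}"
  fi
done
```

If you are unsure which version is installed, `brew list --versions hadoop` prints it. If `mapred-site.xml` is reported missing, check whether the directory contains a `mapred-site.xml.template` that can be copied to that name.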
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hadoop-env.sh
Find the following line in the file
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
Change it to
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
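If you prefer to script this change rather than edit the file by hand, the sketch below shows one possible way to do it. The function name `patch_hadoop_opts` and the variable `ENV_FILE` are hypothetical, and the sketch assumes the original line appears exactly as shown above.

```shell
# Append the two krb5 flags to the HADOOP_OPTS line of the given file.
# Writes through a temporary file, so it does not depend on `sed -i`,
# which behaves differently in BSD (macOS) and GNU sed.
patch_hadoop_opts() {
  env_file="$1"
  tmp_file="${env_file}.tmp"
  sed 's|-Djava.net.preferIPv4Stack=true"|-Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="|' \
    "$env_file" > "$tmp_file" && cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Assumed location from this guide; adjust the version number if needed.
ENV_FILE="/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hadoop-env.sh"
if [ -f "$ENV_FILE" ]; then
  cp "$ENV_FILE" "${ENV_FILE}.bak"   # keep a backup of the original
  patch_hadoop_opts "$ENV_FILE"
fi
```

Running the function a second time leaves the file unchanged, because the pattern only matches when the closing quote directly follows `preferIPv4Stack=true`.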
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/core-site.xml
Find the configuration block in the file
<configuration>
</configuration>
Replace it with the following content
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/mapred-site.xml
Find the configuration block in the file
<configuration>
</configuration>
Replace it with the following block
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9010</value>
</property>
</configuration>
/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hdfs-site.xml
Find the configuration block in the file
<configuration>
</configuration>
Replace it with the following block
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
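After editing the three XML files, a quick way to double-check what was written is to read the values back. The helper below is only a sketch: `get_property` and `CONF_DIR` are hypothetical names, and the parsing assumes each `<name>` and `<value>` sits on its own line, as in the blocks above. It is not a general XML parser.

```shell
# Print the <value> that follows <name>KEY</name> in a Hadoop *-site.xml file.
# Prints nothing if the key is not found.
get_property() {
  key="$1"
  file="$2"
  awk -v key="$key" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and anything after it
      print
      exit
    }
  ' "$file"
}

# Assumed configuration directory from this guide.
CONF_DIR="/usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop"
if [ -f "${CONF_DIR}/hdfs-site.xml" ]; then
  get_property dfs.replication "${CONF_DIR}/hdfs-site.xml"
fi
```

The same function can be pointed at the other files, for example `get_property fs.default.name "${CONF_DIR}/core-site.xml"`.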
~/.profile
Add the following aliases to ~/.profile
alias hstart="/usr/local/Cellar/hadoop/2.7.1/sbin/start-dfs.sh; /usr/local/Cellar/hadoop/2.7.1/sbin/start-yarn.sh"
alias hstop="/usr/local/Cellar/hadoop/2.7.1/sbin/stop-yarn.sh; /usr/local/Cellar/hadoop/2.7.1/sbin/stop-dfs.sh"
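As an optional variation, the same start and stop sequences can be written as shell functions that take the install directory from a single variable, so a Hadoop upgrade only requires changing one line. This is a sketch, not part of the original setup: `HADOOP_BREW_DIR`, `hadoop_start`, and `hadoop_stop` are hypothetical names, chosen to differ from the alias names so the two do not clash.

```shell
# Hypothetical variable holding the Homebrew install directory used in this guide.
HADOOP_BREW_DIR="/usr/local/Cellar/hadoop/2.7.1"

# Start HDFS first, then YARN. Unlike the alias, YARN is only started
# if start-dfs.sh exits successfully.
hadoop_start() {
  "${HADOOP_BREW_DIR}/sbin/start-dfs.sh" && "${HADOOP_BREW_DIR}/sbin/start-yarn.sh"
}

# Stop in the reverse order: YARN first, then HDFS.
hadoop_stop() {
  "${HADOOP_BREW_DIR}/sbin/stop-yarn.sh"
  "${HADOOP_BREW_DIR}/sbin/stop-dfs.sh"
}
```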
Reload the file with the following command
source ~/.profile
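Format the HDFS namenode before the first start
hdfs namenode -format
Enable SSH logins to localhost, which the start scripts rely on: System Preferences -> Sharing -> Remote Login
Start Hadoop
hstart
Stop Hadoop
hstop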
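Once Hadoop has been started, one way to check that everything came up is to look at the running Java processes with `jps`, which ships with the JDK. The sketch below assumes the five daemons normally launched by `start-dfs.sh` and `start-yarn.sh`; `missing_daemons` is a hypothetical helper name.

```shell
# Read `jps`-style output on stdin and print the name of every expected
# Hadoop daemon that does not appear in it. Prints nothing if all are running.
missing_daemons() {
  jps_output="$(cat)"
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w matches whole words only, so SecondaryNameNode does not count as NameNode.
    printf '%s\n' "$jps_output" | grep -qw "$d" || echo "$d"
  done
}

# Only run the check when jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

If a daemon is listed as missing, the log files written by the start scripts are usually the first place to look.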