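Although this quick-start pack is titled hadoop-101, it is not only about Hadoop; it is my way of catching up on fundamentals I never drilled properly years ago. To start, here are the links I kept coming back to among the many references consulted along the way:
The pack covers zookeeper + hadoop (HA + federation) + hbase + phoenix. It needs 4 machines, each configured as shown in the figure below:
The resources the pack needs can be downloaded from GitHub: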
https://github.com/cyu021/hadoop-101
Here are the steps for using the pack:
>> step-01
- download hadoop-base.box and import to vagrant box
- download vagrant resources for hadoop-01, hadoop-02, hadoop-03, hadoop-04
>> step-02 [apply to hadoop-01, hadoop-02, hadoop-03, hadoop-04]
$ cd {path-hadoop-xx}
$ vagrant up && vagrant ssh -c 'sudo su -'
$ ifconfig eth1
Note down the IP and update it in the hosts-iphone file under each node (e.g. .../hadoop-01/cfg/hosts/hosts-iphone)
$ exit
This leaves the VM
$ vagrant provision && vagrant ssh -c 'sudo su -'
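To avoid copying the address by eye in step-02, the IP can be pulled out of the ifconfig output. This is a minimal sketch: the helper name extract_ipv4 is my own and not part of the pack, and it only covers the two common ifconfig layouts ("inet addr:x.x.x.x" and "inet x.x.x.x").

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Reads `ifconfig <iface>` output on stdin and prints the first IPv4 address.
# Handles both the older "inet addr:10.0.0.5" and the newer "inet 10.0.0.5" layouts.
extract_ipv4() {
  awk '$1 == "inet" { sub(/^addr:/, "", $2); print $2; exit }'
}

# Usage inside the VM:
#   ifconfig eth1 | extract_ipv4
```

The printed address is what goes into each node's hosts-iphone file; the sketch does not edit that file, because its exact layout is defined by the repo.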
>> step-03 [apply to hadoop-01, hadoop-02, hadoop-03]
$ /opt/zookeeper/bin/zkServer.sh start
$ /opt/zookeeper/bin/zkCli.sh
addauth digest super:AAAaaa111
quit
Only needed the first time; skip it on later VM starts
$ echo stat | nc localhost 2181
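The stat output in step-03 includes a "Mode:" line saying whether the node is a leader or a follower, which is the quickest sign that the ensemble has formed. A small sketch for pulling that line out; the helper name zk_mode is my own, and the loop in the comment assumes the three hostnames resolve from wherever you run it.

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Reads the reply to ZooKeeper's `stat` command on stdin and prints the mode
# (for example "leader", "follower" or "standalone").
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2; exit }'
}

# Usage, assuming hadoop-01..03 resolve via the hosts file:
#   for h in hadoop-01 hadoop-02 hadoop-03; do
#     printf '%s: ' "$h"; echo stat | nc "$h" 2181 | zk_mode
#   done
```

With three healthy nodes I would expect one leader and two followers.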
>> step-04 [apply to hadoop-01]
$ /opt/hadoop/bin/hdfs zkfc -formatZK
Only needed the first time; skip it on later VM starts
>> step-05 [apply to hadoop-01, hadoop-02, hadoop-03]
$ /opt/hadoop/sbin/hadoop-daemon.sh start journalnode
>> step-06 [apply to hadoop-01]
$ /opt/hadoop/bin/hdfs namenode -format
Only needed the first time; skip it on later VM starts
>> step-07 [apply to hadoop-01]
$ /opt/hadoop/sbin/hadoop-daemon.sh start namenode
>> step-08 [apply to hadoop-02]
$ /opt/hadoop/bin/hdfs namenode -bootstrapStandby && /opt/hadoop/sbin/hadoop-daemon.sh start namenode
Only needed the first time; on later VM starts use the command below instead
$ /opt/hadoop/sbin/hadoop-daemon.sh start namenode
>> step-09 [apply to hadoop-01, hadoop-02]
$ /opt/hadoop/sbin/hadoop-daemon.sh start zkfc
>> step-10 [apply to hadoop-01, hadoop-02, hadoop-03]
$ /opt/hadoop/sbin/hadoop-daemon.sh start datanode
>> step-11 [apply to hadoop-01, hadoop-02]
$ /opt/hadoop/sbin/yarn-daemon.sh start resourcemanager
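Since several of the steps so far are first-time only, it helps to see what is left to rerun on each node after a VM restart. The sketch below only prints the commands from step-03 to step-11 in step order, dropping the one-off ones; the function name restart_commands is my own, and it prints rather than executes because the steps are meant to be done stage by stage across nodes.

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Prints, in step order, the commands to rerun on a node after first-time setup.
# One-off commands (addauth, zkfc -formatZK, namenode -format, -bootstrapStandby)
# are deliberately left out. hadoop-04 is not covered by these steps.
restart_commands() {
  node=$1
  case "$node" in
    hadoop-01|hadoop-02|hadoop-03)
      echo "/opt/zookeeper/bin/zkServer.sh start"
      echo "/opt/hadoop/sbin/hadoop-daemon.sh start journalnode"
      ;;
    *)
      echo "unknown node: $node" >&2
      return 1
      ;;
  esac
  case "$node" in
    hadoop-01|hadoop-02)
      echo "/opt/hadoop/sbin/hadoop-daemon.sh start namenode"
      echo "/opt/hadoop/sbin/hadoop-daemon.sh start zkfc"
      ;;
  esac
  echo "/opt/hadoop/sbin/hadoop-daemon.sh start datanode"
  case "$node" in
    hadoop-01|hadoop-02)
      echo "/opt/hadoop/sbin/yarn-daemon.sh start resourcemanager"
      ;;
  esac
}

# Usage:
#   restart_commands hadoop-03
```

Treat the output as a checklist: the original steps start each service on all listed nodes before moving on to the next service.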
>> step-12 [apply to hadoop-01, hadoop-02, hadoop-03]
$ jps
Processes each node should show (ignore the PIDs):
-- hadoop-01
3282 QuorumPeerMain
3827 ResourceManager
3365 JournalNode
3461 NameNode
3701 DataNode
3878 Jps
3613 DFSZKFailoverController
-- hadoop-02
3360 JournalNode
3457 NameNode
3713 DataNode
3608 DFSZKFailoverController
3278 QuorumPeerMain
3839 ResourceManager
4063 Jps
-- hadoop-03
3539 Jps
3368 JournalNode
3467 DataNode
3279 QuorumPeerMain
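Comparing the jps listing against the lists above by eye gets tedious, so here is a small sketch that reports which expected daemons are missing. The function name missing_daemons is my own; the daemon names passed in are the ones listed above.

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Reads `jps` output on stdin and prints every expected daemon name (given as
# arguments) that does not appear in it. Prints nothing when all are present.
missing_daemons() {
  jps_output=$(cat)
  for name in "$@"; do
    printf '%s\n' "$jps_output" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || printf '%s\n' "$name"
  done
}

# Usage on hadoop-03:
#   jps | missing_daemons QuorumPeerMain JournalNode DataNode
# Usage on hadoop-01 or hadoop-02:
#   jps | missing_daemons QuorumPeerMain JournalNode NameNode DataNode \
#     DFSZKFailoverController ResourceManager
```

Matching on the second column keeps the check independent of PIDs, which change on every start.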
>> step-13
open http://{hadoop-01}:9870/ in browser
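The same web port can also be checked from a terminal. As far as I know the NameNode exposes a JMX endpoint with a NameNodeStatus bean whose "State" field reads active or standby; the sketch below just extracts that field. The helper name namenode_state is my own, and the hostname in the usage comment assumes hadoop-01 resolves on your machine.

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Reads the NameNode's JMX JSON on stdin and prints the value of the first
# "State" field it finds (expected to be "active" or "standby").
namenode_state() {
  sed -n 's/.*"State"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage, assuming the bean name below matches your Hadoop version:
#   curl -s 'http://hadoop-01:9870/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus' \
#     | namenode_state
```

Running it against both hadoop-01 and hadoop-02 should show one active and one standby NameNode if HA came up as intended.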
>> step-14 [apply to hadoop-04]
$ cd /opt/hbase/bin && ./start-hbase.sh
$ jps
Processes you should see (ignore the PIDs):
3996 Jps
3582 HMaster
3727 HRegionServer
$ cd /opt/phoenix/bin && ./queryserver.py
>> step-15 [apply on laptop]
- copy phoenix-4.13.1-HBase-1.3-thin-client.jar from hadoop-04 (/opt/phoenix/phoenix-4.13.1-HBase-1.3-thin-client.jar) to laptop
- add the jar to SQuirreL SQL or any other db client
- setup jdbc driver with phoenix-4.13.1-HBase-1.3-thin-client.jar
- setup connection and connect
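For the connection setup, the thin client talks to the Phoenix Query Server over HTTP, so the db client needs a JDBC URL pointing at hadoop-04. A sketch that builds it: the function name phoenix_thin_url is my own, port 8765 is the query server's default (an assumption if the pack overrides it), and the hostname must resolve from the laptop.

```shell
# Hypothetical helper (not part of the hadoop-101 repo).
# Prints a Phoenix thin-client JDBC URL for the given query server host.
# The port defaults to 8765, the Phoenix Query Server default.
phoenix_thin_url() {
  host=$1
  port=${2:-8765}
  printf 'jdbc:phoenix:thin:url=http://%s:%s;serialization=PROTOBUF\n' "$host" "$port"
}

# Usage:
#   phoenix_thin_url hadoop-04
# Driver class to enter in the db client:
#   org.apache.phoenix.queryserver.client.Driver
```

Paste the printed URL into the connection dialog together with the driver class from the comment.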