Install a Hadoop Multinode Cluster Using CDH4 on RHEL/CentOS 6.5
Hadoop is an open source programming framework developed by Apache to process big data. It uses HDFS (Hadoop Distributed File System) to store data across all the datanodes in the cluster in a distributed manner, and the MapReduce model to process that data.
The NameNode (NN) is the master daemon that controls HDFS, and the JobTracker (JT) is the master daemon for the MapReduce engine.
In this tutorial I am using two CentOS 6.3 VMs, 'master' and 'node' (master and node are my hostnames). The 'master' IP is 172.21.17.175 and the 'node' IP is 172.21.17.188. The following instructions also work on RHEL/CentOS 6.x versions.
## on master ##
# hostname
master
# ifconfig|grep 'inet addr'|head -1
inet addr:172.21.17.175 Bcast:172.21.19.255 Mask:255.255.252.0
## on node ##
# hostname
node
# ifconfig|grep 'inet addr'|head -1
inet addr:172.21.17.188 Bcast:172.21.19.255 Mask:255.255.252.0
First make sure that all the cluster hosts are listed in the '/etc/hosts' file (on each node), if you do not have DNS set up.
## on master ##
# cat /etc/hosts
172.21.17.175 master
172.21.17.188 node
## on node ##
# cat /etc/hosts
172.21.17.175 master
172.21.17.188 node
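Keeping '/etc/hosts' consistent on every node is easier if entries are added by a small helper that skips names already present. The following is a minimal sketch under that assumption; the function names `host_entry_exists` and `ensure_host_entry` are hypothetical, and the file path is a parameter so you can try it on a scratch copy first.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CDH).

# Succeeds if NAME ($2) appears as a hostname field in the hosts file ($1).
host_entry_exists() {
    awk -v n="$2" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1" 2>/dev/null
}

# Appends "IP NAME" to the hosts file ($1) unless NAME is already listed.
ensure_host_entry() {
    if ! host_entry_exists "$1" "$3"; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Example (run as root on each node):
#   ensure_host_entry /etc/hosts 172.21.17.175 master
#   ensure_host_entry /etc/hosts 172.21.17.188 node
```

Running the helper twice with the same name leaves the file unchanged, so it is safe to repeat on every node.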
Installing a Hadoop Multinode Cluster on CentOS
We use the official CDH repository to install CDH4 on all the hosts (Master and Node) in the cluster.
Go to the official CDH download page and grab the CDH4 (i.e. 4.6) version, or use the following wget commands to download the repository package and install it.
## on 32-bit System ##
# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/i386/cloudera-cdh-4-0.i386.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.i386.rpm
## on 64-bit System ##
# wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
# yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm
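If you script the installation, the choice between the 32-bit and 64-bit package can be derived from the machine architecture. This is a rough sketch: the function name `cdh4_repo_rpm_url` is hypothetical, and it only knows the two URLs given above.

```shell
#!/bin/sh
# Hypothetical helper: print the CDH4 one-click-install RPM URL for an
# architecture string such as the output of `uname -m`.
cdh4_repo_rpm_url() {
    case "$1" in
        x86_64)              cdh_arch=x86_64 ;;
        i386|i486|i586|i686) cdh_arch=i386 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
    echo "http://archive.cloudera.com/cdh4/one-click-install/redhat/6/${cdh_arch}/cloudera-cdh-4-0.${cdh_arch}.rpm"
}

# Example:
#   wget "$(cdh4_repo_rpm_url "$(uname -m)")"
```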
Before installing the Hadoop Multinode Cluster, add the Cloudera Public GPG Key to your repository by running one of the following commands, according to your system architecture.
## on 32-bit System ##
# rpm --import http://archive.cloudera.com/cdh4/redhat/6/i386/cdh/RPM-GPG-KEY-cloudera
## on 64-bit System ##
# rpm --import http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
Next, run the following commands to install and set up the JobTracker and NameNode on the Master server.
yum clean all
yum install hadoop-0.20-mapreduce-jobtracker
yum clean all
yum install hadoop-hdfs-namenode
Again, run the following commands on the Master server to set up the Secondary NameNode.
yum clean all
yum install hadoop-hdfs-secondarynam
Next, set up the TaskTracker and DataNode on all cluster hosts (Node) except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts (on node in this case).
yum clean all
yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode
You can install the Hadoop client on a separate machine (in this case I have installed it on the datanode, but you can install it on any machine).
yum install hadoop-client
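To keep track of which packages belong on which host, the mapping from the steps above can be written down as a small lookup. This is only an illustrative sketch: the function name `cdh4_packages_for_role` and the role labels are hypothetical, and the master entry covers just the JobTracker and NameNode packages.

```shell
#!/bin/sh
# Hypothetical helper: print the packages this tutorial installs per role.
cdh4_packages_for_role() {
    case "$1" in
        master) echo "hadoop-0.20-mapreduce-jobtracker hadoop-hdfs-namenode" ;;
        worker) echo "hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode" ;;
        client) echo "hadoop-client" ;;
        *) echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# Example (on the node machine):
#   yum install $(cdh4_packages_for_role worker)
cdh4_packages_for_role master
```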
Now that we are done with the above steps, let's move on to deploying HDFS (to be done on all the nodes).
Copy the default configuration to a new directory under /etc/hadoop (on each node in the cluster).
cp -r /etc/hadoop/conf.dist /etc/hadoop/conf.my_cluster
Use the alternatives command to set your custom configuration directory, as follows (on each node in the cluster).
# alternatives --verbose --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.my_cluster 50
reading /var/lib/alternatives/hadoop-conf
# alternatives --set hadoop-conf /etc/hadoop/conf.my_cluster
Now open the 'core-site.xml' file and update "fs.defaultFS" on each node in the cluster.
# cat /etc/hadoop/conf/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master/</value>
  </property>
</configuration>
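Because the same 'core-site.xml' goes on every node, it can be convenient to generate it from the NameNode hostname instead of editing it by hand. The following is a minimal sketch: `write_core_site` is a hypothetical helper, and it overwrites the target file with only the single property shown above.

```shell
#!/bin/sh
# Hypothetical helper: write a core-site.xml that sets fs.defaultFS.
#   $1 = NameNode hostname, $2 = output file (overwritten)
write_core_site() {
    cat > "$2" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://$1/</value>
  </property>
</configuration>
EOF
}

# Example (as root, on each node):
#   write_core_site master /etc/hadoop/conf/core-site.xml
```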
Next, update "dfs.permissions.superusergroup" in hdfs-site.xml on each node in the cluster.
# cat /etc/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
</configuration>
Note: Please make sure that the above configuration is present on all the nodes (do it on one node and run scp to copy it to the rest of the nodes).
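With more than one worker, the copy step in the note above is easier to review as a generated list of commands before anything is run. This is a dry-run sketch only: `print_conf_sync_cmds` is a hypothetical helper that prints scp commands and does not execute them.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the scp commands that would copy
# core-site.xml and hdfs-site.xml to each host.
#   $1 = configuration directory, remaining arguments = target hosts
print_conf_sync_cmds() {
    sync_dir="$1"
    shift
    for sync_host in "$@"; do
        echo "scp $sync_dir/core-site.xml $sync_dir/hdfs-site.xml $sync_host:$sync_dir/"
    done
}

# For the two-machine setup in this tutorial, run on master:
print_conf_sync_cmds /etc/hadoop/conf node
```

Once the printed commands look right, you can pipe the output to sh to run them.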
Update dfs.name.dir or dfs.namenode.name.dir in 'hdfs-site.xml' on the NameNode (Master), and dfs.datanode.data.dir on the DataNode (Node). Please change the values as shown.
## on master ##
# cat /etc/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,/nfsmount/dfs/nn</value>
</property>
## on node ##
# cat /etc/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
Run the commands below to create the directory structure and manage user permissions on the NameNode (Master) and DataNode (Node) machines.
## on master ##
mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn
## on node ##
mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
chown -R hdfs:hdfs /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
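The directories you create have to match the comma-separated values configured in 'hdfs-site.xml', so it can help to derive the list of local paths from the value itself. The following is a small sketch of that idea; `dfs_dirs_from_value` is a hypothetical helper, and it assumes the only URI scheme used is file://.

```shell
#!/bin/sh
# Hypothetical helper: turn a dfs.*.dir value into local paths, one per line.
# Splits on commas and strips a leading file:// scheme if present.
dfs_dirs_from_value() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's|^file://||'
}

# The DataNode value from this tutorial:
dfs_dirs_from_value 'file:///data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn'

# Example (as root):
#   mkdir -p $(dfs_dirs_from_value 'file:///data/1/dfs/nn,/nfsmount/dfs/nn')
```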
Format the NameNode (on Master) by issuing the following command.
sudo -u hdfs hdfs namenode -format
Add the following property to the hdfs-site.xml file on the Master and replace the value as shown.
<property>
  <name>dfs.namenode.http-address</name>
  <value>172.21.17.175:50070</value>
  <description>
    The address and port on which the NameNode UI will listen.
  </description>
</property>
Note: In our case, the value should be the IP address of the master VM.
Now let's deploy MRv1 (MapReduce version 1). Open the 'mapred-site.xml' file and set the following values as shown.
# cp hdfs-site.xml mapred-site.xml
# vi mapred-site.xml
# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:8021</value>
  </property>
</configuration>
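After editing several configuration files, a quick way to confirm that a property holds the value you expect is to read it back. The following sketch uses awk rather than a real XML parser; `hadoop_conf_get` is a hypothetical helper, and it assumes each <name> and <value> element sits on its own line.

```shell
#!/bin/sh
# Hypothetical helper: print the value of property $2 in Hadoop config file $1.
# Assumes one <name> or <value> element per line; this is not a full XML parser.
hadoop_conf_get() {
    awk -v want="$2" '
        /<name>/ {
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            value = $0
            sub(/.*<value>/, "", value)
            sub(/<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$1"
}

# Example:
#   hadoop_conf_get /etc/hadoop/conf/mapred-site.xml mapred.job.tracker
```

If the property is not present, the helper prints nothing.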
Next, copy the 'mapred-site.xml' file to the node machine using the following scp command.
# scp /etc/hadoop/conf/mapred-site.xml node:/etc/hadoop/conf/
mapred-site.xml 100% 200 0.2KB/s 00:00
Now configure the local storage directories for use by the MRv1 daemons. Open the 'mapred-site.xml' file again and make the changes shown below for each TaskTracker.
<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local,/data/3/mapred/local</value>
</property>
After specifying these directories in the 'mapred-site.xml' file, you must create the directories and assign the correct file permissions to them on each node in your cluster.
mkdir -p /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
chown -R mapred:hadoop /data/1/mapred/local /data/2/mapred/local /data/3/mapred/local /data/4/mapred/local
Now run the following command to start HDFS on every node in the cluster.
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
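The start loop relies on every HDFS init script being named hadoop-hdfs-*. The same pattern can be reused to list the services, for example to check their status afterwards. This is a minimal sketch: `list_hdfs_services` is a hypothetical helper, and the init directory is a parameter that defaults to /etc/init.d.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the hadoop-hdfs-* init scripts
# found in the given directory (default: /etc/init.d), one per line.
list_hdfs_services() {
    for svc_file in "${1:-/etc/init.d}"/hadoop-hdfs-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$svc_file" ] || continue
        basename "$svc_file"
    done
}

# Example:
#   for svc in $(list_hdfs_services); do sudo service "$svc" status; done
```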
You need to create /tmp in HDFS with the proper permissions, as shown below.
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp
sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
Now verify the HDFS file structure.
# sudo -u hdfs hadoop fs -ls -R /
drwxrwxrwt - hdfs hadoop 0 2014-05-29 09:58 /tmp
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs
drwxr-xr-x - hdfs hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache
drwxr-xr-x - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred
drwxr-xr-x - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred
drwxrwxrwt - mapred hadoop 0 2014-05-29 09:59 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
After you start HDFS and create '/tmp', but before you start the JobTracker, create the HDFS directory specified by the 'mapred.system.dir' parameter (by default ${hadoop.tmp.dir}/mapred/system) and change its owner to mapred.
sudo -u hdfs hadoop fs -mkdir -p /tmp/mapred/system
sudo -u hdfs hadoop fs -chown mapred:hadoop /tmp/mapred/system
To start MapReduce, start the TaskTracker (TT) and JobTracker (JT) services.
## on node ##
# service hadoop-0.20-mapreduce-tasktracker start
Starting Tasktracker: [ OK ]
starting tasktracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-tasktracker-node.out
## on master ##
# service hadoop-0.20-mapreduce-jobtracker start
Starting Jobtracker: [ OK ]
starting jobtracker, logging to /var/log/hadoop-0.20-mapreduce/hadoop-hadoop-jobtracker-master.out
Next, create a home directory for each Hadoop user. It is recommended that you do this on the NameNode, for example:
sudo -u hdfs hadoop fs -mkdir /user/<user>
sudo -u hdfs hadoop fs -chown <user> /user/<user>
Note: where <user> is the Linux username of each user.
Alternatively, you can create the home directory as follows.
sudo -u hdfs hadoop fs -mkdir /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER
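When there are several users, the same two commands can be generated for each of them and reviewed before running. This is a dry-run sketch: `print_home_dir_cmds` is a hypothetical helper, and the usernames alice and bob are placeholders, not accounts from this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the HDFS home-directory commands
# for each username given as an argument.
print_home_dir_cmds() {
    for hdfs_user in "$@"; do
        echo "sudo -u hdfs hadoop fs -mkdir /user/$hdfs_user"
        echo "sudo -u hdfs hadoop fs -chown $hdfs_user /user/$hdfs_user"
    done
}

# Placeholder usernames for illustration only:
print_home_dir_cmds alice bob
```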
Open your browser and enter the URL http://ip_address_of_namenode:50070 to access the NameNode.
Open another tab in your browser and enter the URL http://ip_address_of_jobtracker:50030 to access the JobTracker.
This procedure has been successfully tested on RHEL/CentOS 5.X/6.X. Please comment below if you face any issues with the installation, and I will help you out with solutions.