Companion Guide:
http://wiki.apache.org/hadoop/GettingStartedWithHadoop
OS: Linux 2.6.9-78.0.0.0.1.ELsmp
After downloading Hadoop, unpack the tarball:
$ gunzip -c hadoop-0.20.1.tar.gz | tar xvf -
$ ssh-keygen -t rsa
(press Enter at each prompt to accept the defaults and an empty passphrase)
This generates id_rsa.pub under $HOME/.ssh.
$ cd $HOME/.ssh ; cat id_rsa.pub >> authorized_keys
Set JAVA_HOME to Java 6.
=> Set this in hadoop-env.sh under HADOOP_INSTALL/conf
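A minimal sketch of the hadoop-env.sh edit; the JDK path below is just an example for this box, not the actual path used:

```shell
# conf/hadoop-env.sh
# Point JAVA_HOME at a Java 6 JDK (example path; substitute your install)
export JAVA_HOME=/usr/java/jdk1.6.0
```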
Setting JAVA_HOME to Java 5 results in error:
Exception in thread "main" java.lang.UnsupportedClassVersionError: Bad version number in .class file
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
After setting JAVA_HOME to Java 6, there is still an error:
localhost: Exception in thread "main" java.lang.NullPointerException
localhost: at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)
localhost: at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:131)
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.&lt;init&gt;
localhost: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)
Patch required:
https://issues.apache.org/jira/browse/HADOOP-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
This is a known issue with 0.20.
Hadoop 0.21 has not frozen and shipped yet, so falling back to 0.19.2, available at:
http://apache.g5searchmarketing.com/hadoop/core/hadoop-0.19.2/
The NameNode still fails; its logs give:
2009-09-17 11:56:18,420 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:132)
at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:130)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:150)
at org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;
at org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
Under hadoop/conf, set fs.default.name to hdfs://localhost:9090
The scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.
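In 0.19 this goes in hadoop-site.xml under hadoop/conf; a minimal property entry using the localhost:9090 address chosen above:

```xml
<!-- conf/hadoop-site.xml -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9090</value>
</property>
```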
Now, the DataNode and SecondaryNameNode did come up, but the NameNode is still down,
not to mention the JobTracker.
2009-09-17 12:02:35,836 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-*/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:278)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.&lt;init&gt;
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:163)
at org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;
at org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:859)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:868)
Change:
hadoop.tmp.dir in hadoop-default.xml under hadoop/conf
and create the ${hadoop.tmp.dir}/dfs/name directory.
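A sketch of the property entry; the value below is just an example path, not the one actually used on this box:

```xml
<property>
  <name>hadoop.tmp.dir</name>
  <!-- example path; pick a directory the hadoop user can write to -->
  <value>/home/hadoop/tmp</value>
</property>
```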
$ stop-all.sh ; start-all.sh
Error again:
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:305)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:309)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.&lt;init&gt;
Run:
$ hadoop namenode -format
$ ./stop-all.sh ; start-all.sh
Now, NameNode, SecondaryNameNode and DataNode are running.
However, the JobTracker fails to start.
2009-09-17 12:26:42,275 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.RuntimeException: Not a host:port pair: local
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:121)
at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:1318)
at org.apache.hadoop.mapred.JobTracker.&lt;init&gt;
Set mapred.job.tracker to a real host:port pair:
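The "Not a host:port pair: local" error means mapred.job.tracker is still at its default. A sketch of the property entry; the port here (9091) is just an example, any free port distinct from the HDFS one works:

```xml
<property>
  <name>mapred.job.tracker</name>
  <!-- example port; must not clash with fs.default.name's port -->
  <value>localhost:9091</value>
</property>
```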
and voila !!!!
$ stop-all.sh ; start-all.sh
Ref:
http://developer.yahoo.com/hadoop/tutorial/module3.html