diagnosing HDFS access on Cloudera CDH3/SCM Express clusters

Judging by various Hadoop/CDH user groups, some people have difficulty placing files in HDFS for the first time.  A typical scenario goes like this:

  • a newly configured CDH3 cluster (with the primary Namenode also being a Datanode)
  • user is connected via SSH to the Namenode
  • user attempts to copy a file using -copyFromLocal

The copy command prints nothing, so it is not clear where the file went.  In the most recent case, the user was logged in as root and ended up copying the file to “/root/test.txt” on the local filesystem instead of HDFS.
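
A silent copy to the local disk suggests the client may be resolving paths against the local filesystem rather than HDFS.  In Hadoop releases of this era the default filesystem comes from the fs.default.name property in core-site.xml, and it falls back to file:/// when the property is unset.  Here is a minimal sketch for checking that value.  It assumes the configuration lives in /etc/hadoop/conf/core-site.xml (adjust the path if yours differs), and default_fs is a hypothetical helper name, not part of Hadoop:

```shell
# Print the value of fs.default.name from a Hadoop core-site.xml.
# Usage: default_fs [path-to-core-site.xml]
# The default path below is an assumption about where the config lives.
default_fs() {
    conf="${1:-/etc/hadoop/conf/core-site.xml}"
    awk '
        # Remember that we have seen the property name we care about.
        /<name>fs\.default\.name<\/name>/ { found = 1 }
        # The first <value> on or after that line is the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$conf"
}
```

Running default_fs with no arguments should print something starting with hdfs:// on a correctly configured node.  If it prints nothing, or a file:// URI, paths without an explicit scheme are probably being resolved locally.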

Let’s say you have hostnames defined for all your machines (as you should), and your primary Namenode is called “hadoop01”.

Because your particular HDFS issue could be caused by anything from account permissions to Hadoop library conflicts to incorrect configuration, it helps to see whether you can work with HDFS by explicitly specifying the target filesystem.  For example (assuming you are SSH’ed into your Namenode):

hadoop fs -ls hdfs://hadoop01/

You should see the root level of your HDFS and user/group ownership:

Found 4 items
drwxr-xr-x   - hbase hbase           0 2011-11-15 10:49 /hbase
drwxr-xr-x   - hdfs  hadoop          0 2011-11-15 11:08 /system
drwxrwxrwt   - hdfs  hadoop          0 2011-11-15 11:35 /tmp
drwxr-xr-x   - hdfs  hadoop          0 2011-11-15 09:50 /user
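
The permission column tells you where an ordinary account can write.  In the listing above only /tmp is writable by everyone (drwxrwxrwt); the other directories are writable only by their owners.  As a rough sketch, the filter below picks world-writable entries out of “hadoop fs -ls” output.  world_writable is a hypothetical helper name, and it assumes paths contain no spaces:

```shell
# Read "hadoop fs -ls" output on stdin and print the paths that
# are writable by "other" (9th character of the mode string is "w").
# Assumes the path is the last whitespace-separated field.
world_writable() {
    awk 'substr($1, 9, 1) == "w" { print $NF }'
}
```

For example, “hadoop fs -ls hdfs://hadoop01/ | world_writable” would print /tmp for the listing above, which makes it a safe target for a first test copy.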

Now, try to put a file in HDFS while specifying the target filesystem:

hadoop fs -copyFromLocal text.txt hdfs://hadoop01/tmp/test.txt 

This will either work, which you can verify with the following command:

hadoop fs -ls hdfs://hadoop01/tmp/test.txt

… or it will fail with a more helpful error message, for example:

copyFromLocal:  org.apache.hadoop.hdfs.server.namenode.SafeModeException:
Cannot create file/text.txt. Name node is in safe mode. 

Now you know what to fix.
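
For the safe mode error shown above, the Namenode state can be queried directly with “hadoop dfsadmin -safemode get”, which reports “Safe mode is ON” or “Safe mode is OFF”.  The sketch below wraps that call so a script can branch on the result.  safemode_state is a hypothetical helper name, and it assumes the hadoop command is on your PATH:

```shell
# Print "on", "off", or "unknown" depending on what the Namenode
# reports for safe mode. Returns non-zero when the state is unknown.
safemode_state() {
    out=$(hadoop dfsadmin -safemode get 2>&1) || { echo "unknown"; return 1; }
    case "$out" in
        *"Safe mode is ON"*)  echo "on" ;;
        *"Safe mode is OFF"*) echo "off" ;;
        *)                    echo "unknown"; return 1 ;;
    esac
}
```

If this prints “on” shortly after startup, the Namenode normally leaves safe mode by itself once enough blocks have been reported by the Datanodes.  An administrator can also force it with “hadoop dfsadmin -safemode leave”, run as the HDFS superuser.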


