From looking at various Hadoop/CDH user groups, some people are having difficulty placing files in HDFS for the first time. A typical scenario goes like this:
- a newly configured CDH3 cluster (with the primary Namenode also being a Datanode)
- user is connected via SSH to the Namenode
- user attempts to copy a file using -copyFromLocal
The copy command completes silently, and it’s not clear where or how the file was placed. In the most recent case, the user was logged in as root, so the file ended up at “/root/test.txt” on the local filesystem instead of in HDFS.
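When this happens, it is usually because the client’s default filesystem is not pointed at HDFS, so unqualified paths resolve against the local disk. A quick way to check is to look at `fs.default.name` in the client configuration (the path below assumes the standard CDH config directory, /etc/hadoop/conf — adjust if yours differs):

```shell
# If this prints file:/// (or the property is missing entirely),
# "hadoop fs" commands silently operate on the LOCAL filesystem,
# which explains the file landing at /root/test.txt.
grep -A1 fs.default.name /etc/hadoop/conf/core-site.xml
```

On a correctly configured client, you should see something like hdfs://hadoop01:8020 as the value.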
Let’s say you have hostnames defined for all your machines (as you should), and your primary namenode is called “hadoop01”.
Because your particular HDFS issue could be caused by anything from account permissions to Hadoop library conflicts to incorrect configuration, it helps to see if you can work with HDFS by explicitly specifying the target filesystem. For example (assuming you are SSH’ed into your Namenode):
hadoop fs -ls hdfs://hadoop01/
You should see the root level of your HDFS and user/group ownership:
Found 4 items
drwxr-xr-x - hbase hbase 0 2011-11-15 10:49 /hbase
drwxr-xr-x - hdfs hadoop 0 2011-11-15 11:08 /system
drwxrwxrwt - hdfs hadoop 0 2011-11-15 11:35 /tmp
drwxr-xr-x - hdfs hadoop 0 2011-11-15 09:50 /user
Now, try to put a file in HDFS while specifying the target filesystem:
hadoop fs -copyFromLocal test.txt hdfs://hadoop01/tmp/test.txt
This should either succeed, which you can verify with the following HDFS command:
hadoop fs -ls hdfs://hadoop01/tmp/test.txt
… or it will fail with a more helpful error message, for example:
Cannot create file/tmp/test.txt. Name node is in safe mode.
Now you know what to fix.
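For the safe-mode error above, for instance, the Namenode simply hasn’t finished collecting block reports yet. You can either wait, or (on a test cluster) leave safe mode manually. A sketch, assuming you can run commands as the hdfs superuser:

```shell
# Check whether the Namenode is currently in safe mode
hadoop dfsadmin -safemode get

# On a test cluster, force the Namenode out of safe mode
# (run as the hdfs user, since root has no HDFS superuser rights)
sudo -u hdfs hadoop dfsadmin -safemode leave
```

On a production cluster it is usually better to wait for safe mode to exit on its own, since forcing it out early can hide under-replicated blocks.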