Copying files from HDFS to Local
We can copy files from HDFS to the local file system by using either the copyToLocal or the get command.
hdfs dfs -copyToLocal or hdfs dfs -get: copies files or directories from HDFS to the local file system. The command reads all the blocks of a file in sequence, using their index, and reconstructs the file in the local file system.
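The idea of rebuilding a file from its blocks in index order can be sketched locally, without a cluster. This is only an analogy: the 4-byte chunk size, the file names and the sample header line are made up for illustration, and real HDFS blocks are far larger and live on different data nodes.

```shell
# Local analogy only: HDFS blocks are simulated with tiny 4-byte chunks.
workdir=$(mktemp -d)

# Hypothetical sample content standing in for a file stored in HDFS.
printf 'order_id,order_date,status\n' > "$workdir/original.csv"

# Split into fixed-size "blocks"; split names them block_aa, block_ab, ...
split -b 4 "$workdir/original.csv" "$workdir/block_"

# Concatenate the blocks in name (index) order to rebuild the file.
cat "$workdir"/block_* > "$workdir/rebuilt.csv"

# Compare the rebuilt file with the original byte by byte.
if cmp -s "$workdir/original.csv" "$workdir/rebuilt.csv"; then
  reassembly_status="identical"
else
  reassembly_status="different"
fi
echo "$reassembly_status"

rm -rf "$workdir"
```

Reading the chunks in any other order would produce a different file, which is why the sequence matters.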
If the target file or directory already exists in the local file system, get will fail with an error saying that it already exists.
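One way to avoid that failure is to check the local target before copying. The sketch below is a minimal example under stated assumptions: `safe_get` is a hypothetical helper name, not part of the hdfs CLI, and it assumes the second argument is an existing local directory and the first is a single HDFS path without glob characters.

```shell
# Hypothetical helper (not part of the hdfs CLI): copy an HDFS path into a
# local directory only when the local target does not exist yet.
safe_get() {
  src="$1"        # HDFS source, e.g. /user/${USER}/retail_db
  dest_dir="$2"   # existing local directory, e.g. /home/${USER}
  target="${dest_dir%/}/$(basename "$src")"
  if [ -e "$target" ]; then
    echo "skip: $target already exists" >&2
    return 1
  fi
  hdfs dfs -get "$src" "$dest_dir"
}

# Example call (requires an HDFS client on the PATH):
# safe_get "/user/${USER}/retail_db" "/home/${USER}"
```

Returning a non-zero status instead of calling get lets a calling script decide whether to skip the copy or remove the local target first.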
%%sh
hdfs dfs -help get
%%sh
hdfs dfs -help copyToLocal
Warning
The following commands copy the entire contents of /user/${USER}/retail_db to the local home directory. Afterwards you will see /home/${USER}/retail_db.
%%sh
hdfs dfs -ls /user/${USER}/retail_db
%%sh
ls -ltr /home/${USER}/
%%sh
mkdir /home/${USER}/retail_db
%%sh
hdfs dfs -get /user/${USER}/retail_db/* /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}/retail_db
Note
This will fail because the retail_db folder already exists in /home/${USER}.
%%sh
hdfs dfs -get /user/${USER}/retail_db /home/${USER}
Note
An alternative approach is to remove the local folder first, so that the folder and its contents are copied directly.
%%sh
rm -rf /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}
%%sh
hdfs dfs -get /user/${USER}/retail_db /home/${USER}
%%sh
ls -ltr /home/${USER}/retail_db/*
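The remove-then-copy sequence can be wrapped in a small function. This is a sketch, not an established tool: `refresh_local_copy` is a hypothetical name, and the argument checks are there only to keep `rm -rf` away from empty or unexpected paths.

```shell
# Hypothetical helper: replace a local copy with a fresh one from HDFS.
refresh_local_copy() {
  src="$1"        # HDFS source, e.g. /user/${USER}/retail_db
  dest_dir="$2"   # existing local directory, e.g. /home/${USER}
  name=$(basename "$src")
  # Refuse obviously unsafe input before calling rm -rf.
  case "$name" in
    ""|"/"|"."|"..")
      echo "refusing to remove: bad source '$src'" >&2
      return 2
      ;;
  esac
  if [ -z "$dest_dir" ]; then
    echo "refusing to remove: empty destination" >&2
    return 2
  fi
  rm -rf "${dest_dir%/}/$name"
  hdfs dfs -get "$src" "$dest_dir"
}

# Example call (requires an HDFS client on the PATH):
# refresh_local_copy "/user/${USER}/retail_db" "/home/${USER}"
```

Recent Hadoop releases also document a `-f` option for get that overwrites an existing destination. Whether it is available depends on the version, so check the output of `hdfs dfs -help get` on your cluster before relying on it.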
We can also use patterns with the get command to copy files from HDFS to the local file system. In addition, we can pass multiple HDFS files or folders to the get command.
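When several sources share the same HDFS base directory, the list of paths can be generated instead of typed out. The function below is a minimal sketch: `hdfs_paths` is a hypothetical helper, and it assumes that the base directory and names contain no spaces.

```shell
# Hypothetical helper: print one HDFS path per name under a base directory.
hdfs_paths() {
  base="$1"
  shift
  for name in "$@"; do
    printf '%s/%s\n' "${base%/}" "$name"
  done
}

# Example call (requires an HDFS client on the PATH). The command substitution
# is left unquoted on purpose so that each path becomes a separate argument:
# hdfs dfs -get $(hdfs_paths "/user/${USER}/retail_db" departments products) "/home/${USER}/retail_db"
```

The last argument to get remains the local target, which must be an existing directory when more than one source is passed.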
%%sh
rm -rf /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}
%%sh
mkdir /home/${USER}/retail_db
%%sh
hdfs dfs -get /user/${USER}/retail_db/order* /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}/retail_db
%%sh
hdfs dfs -get /user/${USER}/retail_db/departments /user/${USER}/retail_db/products /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}/retail_db
%%sh
hdfs dfs -get /user/${USER}/retail_db/categories /user/${USER}/retail_db/customers /home/${USER}/retail_db
%%sh
ls -ltr /home/${USER}/retail_db