Listing HDFS Files
Now let us walk through the options available with the hdfs dfs -ls command for listing files.
We can get the usage by running hdfs dfs -usage ls.
%%sh
hdfs dfs -usage ls
We can get help using hdfs dfs -help ls.
%%sh
hdfs dfs -help ls
Let us list all the files in the /public/nyse_all/nyse_data folder. It is one of the public data sets available under /public. By default, files and folders are sorted by name in ascending order.
%%sh
hdfs dfs -ls /public/nyse_all/nyse_data
We can reverse the sort order using -r.
%%sh
hdfs dfs -ls -r /public/nyse_all/nyse_data
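When a listing feeds another command, often only the paths are needed. The following is a minimal sketch, not a definitive approach. It assumes the default (non -h) output layout of eight whitespace-separated fields with the path last, and paths that contain no spaces. The helper name list_paths is hypothetical and not part of the hdfs CLI.

```shell
# list_paths is a hypothetical helper name, not part of the hdfs CLI.
# Assumed layout of each entry printed by `hdfs dfs -ls`:
#   permissions replication owner group size date time path
list_paths() {
    # Extra ls options (for example -r) can be passed before the directory.
    # Lines with fewer than 8 fields, such as "Found N items", are skipped.
    hdfs dfs -ls "$@" | awk 'NF >= 8 { print $8 }'
}

# Run the example only where the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
    list_paths -r /public/nyse_all/nyse_data
fi
```

Filtering on the field count instead of skipping the first line keeps the helper working when no header line is printed.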
We can sort the files and directories by modification time using the -t option. By default, the latest files appear at the top. We can reverse the order using -t -r.
%%sh
hdfs dfs -ls -t /public/nyse_all/nyse_data
%%sh
hdfs dfs -ls -t -r /public/nyse_all/nyse_data
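Sorting by time is handy for picking out the most recently modified entries. The sketch below is one possible way to do that, assuming the default output layout of eight fields per entry. The helper name latest_files and its default count of 5 are hypothetical choices made for illustration.

```shell
# latest_files is a hypothetical helper name, not part of the hdfs CLI.
latest_files() {
    dir=$1
    count=${2:-5}   # assumed default: show 5 entries when no count is given
    # -t lists the most recently modified entries first. The awk filter drops
    # the "Found N items" header, then head keeps the first $count entries.
    hdfs dfs -ls -t "$dir" | awk 'NF >= 8' | head -n "$count"
}

# Run the example only where the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
    latest_files /public/nyse_all/nyse_data 3
fi
```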
We can sort the files and directories by size using -S. By default, the files are sorted in descending order by size. We can reverse the sort order using -S -r.
%%sh
hdfs dfs -ls -S /public/nyse_all/nyse_data
%%sh
hdfs dfs -ls -S -r /public/nyse_all/nyse_data
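The size sort can be combined with standard text tools to report only the largest entries. This is a rough sketch under the assumption that, without -h, the size in bytes is the fifth field and the path is the eighth. The helper name largest_files and its default count of 3 are hypothetical.

```shell
# largest_files is a hypothetical helper name, not part of the hdfs CLI.
largest_files() {
    dir=$1
    count=${2:-3}   # assumed default: show 3 entries when no count is given
    # -S lists the largest entries first. Print size in bytes and path,
    # separated by a tab, and keep only the first $count entries.
    hdfs dfs -ls -S "$dir" | awk 'NF >= 8 { print $5 "\t" $8 }' | head -n "$count"
}

# Run the example only where the hdfs client is installed.
if command -v hdfs >/dev/null 2>&1; then
    largest_files /public/nyse_all/nyse_data 5
fi
```

The sketch avoids -h on purpose, so the size stays a single numeric field that is easy to parse.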
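We can display file sizes in a human-readable format, such as kilobytes or megabytes rather than raw bytes, using -h. It can be combined with the sorting options -t, -S, and -r.
%%sh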
hdfs dfs -ls -h /public/nyse_all/nyse_data
%%sh
hdfs dfs -ls -h -t /public/nyse_all/nyse_data
%%sh
hdfs dfs -ls -h -S /public/nyse_all/nyse_data
%%sh
hdfs dfs -ls -h -S -r /public/nyse_all/nyse_data