Understanding Linux File System

Let us get a quick overview of the Linux file system.

  • All the public data sets are under /data.

  • Users have read-only access to them.

  • We will go through some of the important commands to understand how we typically manage files using Linux.

    • ls - to list files and directories.

    • mkdir - to create an empty directory.

    • cp - to copy files or directories.

    • rm - to delete files or directories.

  • All these commands deal with local files on Linux. We need to use hdfs dfs or hadoop fs to deal with files in HDFS.
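
As a minimal sketch of the four commands listed above, the following runs them against a temporary directory created with mktemp, so nothing under /data or your home directory is touched. The names demo, a.txt and b.txt are made up for illustration.

```shell
# Work in a throwaway directory (hypothetical names throughout).
scratch=$(mktemp -d)

mkdir "$scratch/demo"                            # create an empty directory
echo "sample" > "$scratch/demo/a.txt"            # create a small file to work with
cp "$scratch/demo/a.txt" "$scratch/demo/b.txt"   # copy a file
ls "$scratch/demo"                               # list: a.txt and b.txt
rm "$scratch/demo/a.txt"                         # delete a single file
remaining=$(ls "$scratch/demo")                  # only b.txt is left
echo "$remaining"
rm -r "$scratch"                                 # delete the whole directory tree
```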

%%sh

uname -a
  • We can list files in the Linux file system using the ls command. The file system starts at /, which is called the root directory (or root file system). By default, ls sorts files and folders alphabetically.

Note

In Windows, the local file system starts with C:\ (C Drive), D:\ (D Drive) etc.

%%sh

ls /
  • We typically use ls -ltr to get the list of files and directories along with their properties.

    • l for the long listing format, which shows file properties.

    • t for sorting files and folders by modification time. By default, the latest files come first.

    • r to reverse the sorting order.

  • ls -ltr provides the list of files and directories sorted by time, with the latest files at the end.
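
To see the effect of t and r in isolation, here is a small sketch that creates three files with known modification times in a temporary directory and compares the two orderings. The file names are hypothetical, and touch -t is used only to set the timestamps explicitly.

```shell
scratch=$(mktemp -d)

# touch -t sets an explicit modification time (format: CCYYMMDDhhmm).
touch -t 202001010000 "$scratch/old.txt"
touch -t 202101010000 "$scratch/mid.txt"
touch -t 202201010000 "$scratch/new.txt"

first_with_t=$(ls -t "$scratch" | head -n 1)     # newest entry comes first
first_with_tr=$(ls -tr "$scratch" | head -n 1)   # reversed: oldest entry comes first
echo "ls -t  starts with: $first_with_t"
echo "ls -tr starts with: $first_with_tr"

ls -ltr "$scratch"                               # long listing, newest at the bottom
rm -r "$scratch"
```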

%%sh

ls -ltr /
%%sh

ls -ltr /data
  • You have read-only access to these local files. As demonstrated below, only the owner (root) has write access to the retail_db folder, whereas others have only read and execute permissions.

%%sh

ls -ltr /data/retail_db
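
The permission string in a long listing can be read as in the following sketch. It recreates the pattern described above (write access for the owner only) on a temporary directory with a made-up name, retail_db_demo, and then checks whether the current user can write to /data/retail_db, assuming that path exists on your system.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/retail_db_demo"
chmod 755 "$scratch/retail_db_demo"   # rwx for owner, r-x for group and others

# The first 10 characters of `ls -ld` are the file type plus the permission bits.
perms=$(ls -ld "$scratch/retail_db_demo" | cut -c1-10)
echo "$perms"                         # d = directory, then owner/group/others triplets

# -w tests whether the current user may write to the path.
if [ -w /data/retail_db ]; then
  echo "/data/retail_db is writable for $(id -un)"
else
  echo "/data/retail_db is not writable (or does not exist here)"
fi
rm -r "$scratch"
```
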
  • You can copy them to your user space within HDFS, which is /user/YOUR_LOGIN_USER.

  • You can determine your home directory on the Linux file system as well as your user space within HDFS. Run echo $HOME to get your home directory on the Linux file system. It is typically the same as /home/$USER.

%%sh

echo $HOME
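
Whether $HOME really equals /home/$USER depends on how the account was set up (for root it is usually /root), so a quick check can be sketched as follows. The variable names login and home_check are illustrative only.

```shell
login=$(id -un)                          # current user name, works even if USER is unset
if [ "${HOME:-}" = "/home/$login" ]; then
  home_check="matches"
else
  home_check="differs from"
fi
echo "HOME=${HOME:-} $home_check /home/$login"
```
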
  • You can also copy these files to your home directory on the Linux file system using cp with the options r (recursive) and f (force).

  • This recursively copies the folder /data/retail_db to /home/${USER}/retail_db.

%%sh

rm -rf /home/${USER}/retail_db
%%sh

ls -ltr /home/${USER}/retail_db
%%sh

cp -rf /data/retail_db /home/${USER}
%%sh

ls -ltr /home/${USER}/retail_db
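
One way to confirm that a recursive copy is complete is to compare source and destination with diff -r. The sketch below does this on a small made-up directory tree in a temporary location rather than on /data/retail_db itself. The file orders/sample.txt is hypothetical.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/src/retail_db/orders" "$scratch/dest"
echo "sample" > "$scratch/src/retail_db/orders/sample.txt"

# Copying a directory into an existing directory creates dest/retail_db.
cp -rf "$scratch/src/retail_db" "$scratch/dest"

# diff -r exits with status 0 only when both trees have the same content.
if diff -r "$scratch/src/retail_db" "$scratch/dest/retail_db" > /dev/null; then
  copy_status="identical"
else
  copy_status="different"
fi
echo "$copy_status"
rm -r "$scratch"
```
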
  • We can also delete the folder /home/${USER}/retail_db recursively using rm -rf.

%%sh

rm -rf /home/${USER}/retail_db
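
The f option also explains why rm -rf can be run safely before the copy even when the folder does not exist yet: a missing path is not treated as an error. A minimal sketch on a temporary directory, with illustrative names:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/retail_db/orders"

rm -rf "$scratch/retail_db"      # removes the directory and everything under it
first_status=$?
rm -rf "$scratch/retail_db"      # already gone; -f suppresses the error
second_status=$?
echo "exit codes: $first_status $second_status"

rmdir "$scratch"                 # clean up the now-empty scratch directory
```

Because rm -rf deletes without prompting, it is worth double-checking the path before running it.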