Managing Spark Metastore Tables

Let us create our first Spark Metastore table. We will also look at how to list tables.

  • We will get into the details of DDL commands at a later point in time.

  • For now, we will just create our first table; creating tables is covered in detail in subsequent sections.

If you are using our labs, use your OS username as a prefix for database names.
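For example, with OS username itversity, the database used below would be itversity_retail. A minimal sketch of deriving such a name in Scala (the variable name and the `_retail` suffix are illustrative, following the labs' naming convention):

```scala
// Derive a user-specific database name from the OS username
// (prefix convention assumed from the labs setup)
val username = System.getProperty("user.name")
val databaseName = s"${username}_retail"
```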

import org.apache.spark.sql.SparkSession

val username = System.getProperty("user.name")

// Build a Spark session with Hive support so that tables are
// registered in the Spark Metastore under the user's warehouse directory
val spark = SparkSession.
    builder.
    config("spark.ui.port", "0").
    config("spark.sql.warehouse.dir", s"/user/${username}/warehouse").
    enableHiveSupport.
    master("yarn").
    appName(s"${username} | Spark SQL - Getting Started").
    getOrCreate
%%sql

SELECT current_database()
%%sql

DROP DATABASE IF EXISTS itversity_retail CASCADE
%%sql

CREATE DATABASE itversity_retail
%%sql

USE itversity_retail
%%sql

CREATE TABLE orders (
  order_id INT,
  order_date STRING,
  order_customer_id INT,
  order_status STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
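To sanity-check the new table, we can insert a sample row and read it back. This is only a sketch; the values below are made up for illustration.

```sql
%%sql

INSERT INTO orders VALUES (1, '2013-07-25 00:00:00.0', 11599, 'CLOSED')
```

Running SELECT * FROM orders in a subsequent cell should then return the inserted row.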
  • We can list the tables in the current database using the SHOW tables command.

%%sql

SHOW tables
  • We can also drop the table using the DROP TABLE command. We will get into more details at a later point in time.

  • We can also truncate managed tables using the TRUNCATE TABLE command.
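As a quick sketch of the difference between the two commands: TRUNCATE TABLE removes all the data but keeps the table definition, while DROP TABLE removes the table itself.

```sql
%%sql

TRUNCATE TABLE orders
```

Running DROP TABLE orders afterwards would remove the table definition as well (and, for a managed table, its directory under the warehouse location).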