In Hive, we can create a table using conventions similar to SQL. Hive is flexible about where the data files for a table are stored. It provides two types of table:
- Internal table
- External table
Internal Table
Internal tables are also called managed tables because Hive controls the lifecycle of their data. By default, these tables are stored in a subdirectory under the directory defined by hive.metastore.warehouse.dir (/user/hive/warehouse by default). Because Hive owns the data, internal tables are less convenient to share with other tools such as Pig. If we drop an internal table, Hive deletes both the table schema and the data.
- Let's create an internal table by using the following command:
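A minimal sketch of such a command is shown here. The database name `demo`, the table name `employee`, and the three columns are hypothetical, chosen only for illustration.

```sql
-- Hypothetical database used by the examples in this section.
CREATE DATABASE IF NOT EXISTS demo;

-- Managed (internal) table: Hive owns both the metadata and the data files.
CREATE TABLE demo.employee (
  id     INT,
  name   STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
```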
Here, the command also specifies that the fields in each row of the data are separated by ','.
- Let's see the metadata of the created table by using the following command:
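As a sketch, assuming the hypothetical table `demo.employee` exists, the `DESCRIBE` statement lists its columns and their types:

```sql
-- Show column names, types, and comments for the (hypothetical) table.
DESCRIBE demo.employee;
```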
- Let's see the result when we try to create a table that already exists.
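To reproduce this, the same statement can simply be run a second time. The sketch assumes a table named `demo.employee` (a hypothetical name) has already been created.

```sql
-- Assumes demo.employee already exists, so this statement is expected to fail
-- with an "already exists" error instead of creating a second table.
CREATE TABLE demo.employee (
  id     INT,
  name   STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
```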
In this case, Hive raises an exception because the table already exists. To avoid the exception, we can add the IF NOT EXISTS clause to the CREATE TABLE statement; Hive then skips the creation when a table with that name is already present.
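A short sketch of the guarded form, again using the hypothetical `demo.employee` table:

```sql
-- IF NOT EXISTS makes the statement a no-op when the table is already present.
CREATE TABLE IF NOT EXISTS demo.employee (
  id     INT,
  name   STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
```

Note that Hive only checks the table name here; it does not verify that the existing table has the same schema as the one in the statement.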
- While creating a table, we can add comments to the columns and to the table itself, and we can also define table properties.
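The following sketch shows one way this might look. The table name `demo.new_employee`, the comment strings, and the property keys and values are all hypothetical; TBLPROPERTIES accepts arbitrary key/value pairs.

```sql
CREATE TABLE demo.new_employee (
  id     INT    COMMENT 'Employee id',
  name   STRING COMMENT 'Employee name',
  salary FLOAT  COMMENT 'Employee salary'
)
COMMENT 'Employee details'            -- table-level comment
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
TBLPROPERTIES (
  'creator'    = 'example_user',      -- hypothetical custom property
  'created_at' = '2019-06-06 11:00:00' -- hypothetical custom property
);
```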
- Let's see the metadata of the created table by using the following command:
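As a sketch, assuming a table named `demo.new_employee` (hypothetical) was created with comments and properties, `DESCRIBE` shows the column comments, while `DESCRIBE FORMATTED` additionally shows table-level details such as the table parameters and storage location:

```sql
-- Columns, types, and column comments.
DESCRIBE demo.new_employee;

-- Adds table-level metadata: table type, location, and table parameters.
DESCRIBE FORMATTED demo.new_employee;
```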
- Hive allows us to create a new table by using the schema of an existing table.
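A minimal sketch using `CREATE TABLE ... LIKE`, where `demo.employee` is the (hypothetical) existing table and `demo.copy_employee` is a hypothetical name for the new one:

```sql
-- Creates an empty table with the same column definitions as demo.employee.
CREATE TABLE IF NOT EXISTS demo.copy_employee
LIKE demo.employee;
```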
Here, the new table copies the definition (schema) of the existing table, but not its data, so the new table starts out empty.
External Table
An external table lets us define a table over data that is stored outside Hive's warehouse directory. The EXTERNAL keyword marks the table as external, and the LOCATION keyword specifies the directory that holds the table's data.
Because the table is external, its data is not stored in the Hive warehouse directory and is not managed by Hive. Therefore, if we drop the table, only the table's metadata is deleted; the data files remain in place.
To create an external table, follow the steps below:
- Let's create a directory on HDFS by using the following command:
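One way to do this without leaving the Hive session is the Hive CLI's `dfs` command, which passes its arguments to the Hadoop file system shell. The directory name `/HiveDirectory` is hypothetical.

```sql
-- Run from the Hive CLI. Equivalent to running
-- `hdfs dfs -mkdir -p /HiveDirectory` from an operating-system shell.
dfs -mkdir -p /HiveDirectory;
```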
- Now, store the data file in the created directory.
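A sketch of this step, assuming a local comma-separated file at the hypothetical path `hive/emp_details` and a hypothetical HDFS directory `/HiveDirectory` that already exists:

```sql
-- Copy a local file into the HDFS directory (both paths are hypothetical).
-- Each line of the file is assumed to look like: <id>,<name>,<salary>
dfs -put hive/emp_details /HiveDirectory;

-- Optionally confirm that the file is now in the directory.
dfs -ls /HiveDirectory;
```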
- Let's create an external table by using the following command:
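A minimal sketch of such a statement. The table name `emplist`, its columns, and the directory `/HiveDirectory` are hypothetical; the column list and delimiter should match the layout of the files already stored in that directory.

```sql
-- EXTERNAL: Hive records only the metadata; the files stay where they are.
-- LOCATION: the HDFS directory whose files make up the table's data.
CREATE EXTERNAL TABLE emplist (
  id     INT,
  name   STRING,
  salary FLOAT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/HiveDirectory';
```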
- Now, we can use the following command to retrieve the data:
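As a sketch, assuming an external table named `emplist` (a hypothetical name) has been defined over the data directory:

```sql
-- Reads the rows directly from the files in the table's LOCATION directory.
SELECT * FROM emplist;
```

Because the table is external, any file later added to that directory with the same layout becomes visible to this query without reloading the table.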