CREATE TABLE
Creating tables is one of the most complicated operations for many databases because you might need to:
- Manually specify the engine
- Manually specify the indexes
- And even specify the data partitions or data shard
Databend aims to be easy to use by design and does NOT require any of those operations when you create a table. Moreover, the CREATE TABLE statement provides these options to make it much easier for you to create tables in various scenarios:
- CREATE TABLE: Creates a table from scratch.
- CREATE TABLE ... LIKE: Creates a table with the same column definitions as an existing one.
- CREATE TABLE ... AS: Creates a table and inserts data with the results of a SELECT query.
- CREATE TRANSIENT TABLE: Creates a table without storing its historical data for Time Travel.
- CREATE TABLE ... SNAPSHOT_LOCATION: Creates a table and inserts data with a snapshot file.
- CREATE TABLE ... EXTERNAL_LOCATION: Creates a table and specifies an S3 bucket for the data storage instead of the FUSE engine.
CREATE TABLE
CREATE [TRANSIENT] TABLE [IF NOT EXISTS] [db.]table_name
(
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
...
) [CLUSTER BY(<expr> [, <expr>, ...] )]
<data_type>:
TINYINT
| SMALLINT
| INT
| BIGINT
| FLOAT
| DOUBLE
| DATE
| TIMESTAMP
| VARCHAR
| ARRAY
| OBJECT
| VARIANT
For detailed information about the CLUSTER BY clause, see SET CLUSTER KEY.
CREATE TABLE ... LIKE
Creates an empty copy of an existing table, the new table automatically copies all column names, their data types, and their not-null constraints.
Syntax:
CREATE TABLE [IF NOT EXISTS] [db.]table_name
LIKE [db.]origin_table_name
CREATE TABLE ... AS
Creates a table and fills it with data computed by a SELECT command.
Syntax:
CREATE TABLE [IF NOT EXISTS] [db.]table_name
AS SELECT query
CREATE TRANSIENT TABLE
Creates a transient table.
Transient tables are used to hold transitory data that does not require a data protection or recovery mechanism. Dataebend does not hold historical data for a transient table so you will not be able to query from a previous version of the transient table with the Time Travel feature, for example, the AT clause in the SELECT statement will not work for transient tables. Please note that you can still drop and undrop a transient table.
Transient tables help save your storage expenses because they do not need extra space for historical data compared to non-transient tables. See example for detailed explanations.
Syntax:
CREATE TRANSIENT TABLE ...
CREATE TABLE ... SNAPSHOT_LOCATION
Creates a table and inserts data from a snapshot file.
Databend automatically creates snapshots when data updates occur, so a snapshot can be considered as a view of your data at a time point in the past. Databend may store many snapshots of a table (depending on the number of update operations you performed) for the Time Travel feature that allows you to query, back up, and restore from a previous version of your data within the retention period (24 hours by default).
This command enables you to insert the data stored in a snapshot file when you create a table. Please note that the table you create must have same column definations as the data from the snapshot.
Syntax:
CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
...
)
SNAPSHOT_LOCATION = '<SNAPSHOT_FILENAME>';
To obtain the snapshot information (including the snapshot locations) of a table, execute the following command:
SELECT *
FROM Fuse_snapshot('<database_name>', '<table_name>');
CREATE TABLE ... EXTERNAL_LOCATION
Creates a table and specifies an S3 bucket for the data storage instead of the FUSE engine.
Databend stores the table data in the location configured in the file databend-query.toml
by default. This option enables you to store the data (in parquet format) in a table in another bucket instead of the default one.
Syntax:
CREATE TABLE [IF NOT EXISTS] [db.]table_name
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
<column_name> <data_type> [ NOT NULL | NULL] [ { DEFAULT <expr> }],
...
's3://<bucket>/[<path>]'
CONNECTION = (AWS_KEY_ID = '<your_aws_key_id>' AWS_SECRECT_KEY = '<your_aws_secret_key>' ENDPOINT_URL = '<endpoint_url>');
Parameter | Description | Required |
---|---|---|
s3://<bucket>/[<path>] | Files are in the specified external location (S3-like bucket) | YES |
AWS_KEY_ID = '<your_aws_key_id>' AWS_SECRECT_KEY = '<your_aws_secret_key>' | The credentials for connecting to AWS and accessing the private/protected S3 bucket where the files to load are staged. | Optional |
ENDPOINT_URL = '<endpoint_url>' | S3-compatible endpoint URL like MinIO. Default: https://s3.amazonaws.com | Optional |
Column Nullable
By default, all columns are not nullable(NOT NULL), if you want to specify a column default to NULL
, please use:
CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
<column_name> <data_type> NULL,
...
)
Let check it out how difference the column is NULL
or NOT NULL
.
Create a table t_not_null
which column with NOT NULL
(Databend Column is NOT NULL
by default):
CREATE TABLE t_not_null(a INT);
DESC t_not_null;
+-------+-------+------+---------+
| Field | Type | Null | Default |
+-------+-------+------+---------+
| a | Int32 | NO | 0 |
+-------+-------+------+---------+
Create another table t_null
column with NULL
:
CREATE TABLE t_null(a INT NULL);
DESC t_null;
+-------+-------+------+---------+
| Field | Type | Null | Default |
+-------+-------+------+---------+
| a | Int32 | YES | NULL |
+-------+-------+------+---------+
Default Values
DEFAULT <expression>
Specifies a default value inserted in the column if a value is not specified via an INSERT or CREATE TABLE AS SELECT statement.
For example:
CREATE TABLE t_default_value(a TINYINT UNSIGNED, b VARCHAR DEFAULT 'b');
Desc the t_default_value
table:
DESC t_default_value;
+-------+------------------+------+---------+-------+
| Field | Type | Null | Default | Extra |
+-------+------------------+------+---------+-------+
| a | TINYINT UNSIGNED | NO | 0 | |
| b | VARCHAR | NO | b | |
+-------+------------------+------+---------+-------+
Insert a value:
INSERT INTO T_default_value(a) VALUES(1);
Check the table values:
SELECT * FROM t_default_value;
+------+------+
| a | b |
+------+------+
| 1 | b |
+------+------+
MySQL Compatibility
Databend’s syntax is difference from MySQL mainly in the data type and some specific index hints.
Examples
Create Table
CREATE TABLE test(a BIGINT UNSIGNED, b VARCHAR , c VARCHAR DEFAULT concat(b, '-b'));
DESC test;
+-------+--------+------+---------------+
| Field | Type | Null | Default |
+-------+--------+------+---------------+
| a | UInt64 | NO | 0 |
| b | String | NO | |
| c | String | NO | concat(b, -b) |
+-------+--------+------+---------------+
INSERT INTO test(a,b) VALUES(888, 'stars');
SELECT * FROM test;
+------+-------+---------+
| a | b | c |
+------+-------+---------+
| 888 | stars | stars-b |
+------+-------+---------+
Create Table ... Like
CREATE TABLE test2 LIKE test;
DESC test2;
+-------+--------+------+---------------+
| Field | Type | Null | Default |
+-------+--------+------+---------------+
| a | UInt64 | NO | 0 |
| b | String | NO | |
| c | String | NO | concat(b, -b) |
+-------+--------+------+---------------+
INSERT INTO test2(a,b) VALUES(888, 'stars');
SELECT * FROM test2;
+------+-------+---------+
| a | b | c |
+------+-------+---------+
| 888 | stars | stars-b |
+------+-------+---------+
Create Table ... As
CREATE TABLE test3 AS SELECT * FROM test2;
DESC test3;
+-------+--------+------+---------------+
| Field | Type | Null | Default |
+-------+--------+------+---------------+
| a | UInt64 | NO | 0 |
| b | String | NO | |
| c | String | NO | concat(b, -b) |
+-------+--------+------+---------------+
SELECT * FROM test3;
+------+-------+---------+
| a | b | c |
+------+-------+---------+
| 888 | stars | stars-b |
+------+-------+---------+
Create Transient Table
-- Create a transient table
CREATE TRANSIENT TABLE mytemp (c bigint);
-- Insert values
insert into mytemp values(1);
insert into mytemp values(2);
insert into mytemp values(3);
-- Only one snapshot is stored. This explains why the Time Travel feature does not work for transient tables.
select count(*) from fuse_snapshot('default', 'mytemp');
+---------+
| count() |
+---------+
| 1 |
Create Table ... Snapshot_Location
CREATE TABLE members
(
name VARCHAR
);
INSERT INTO members
VALUES ('Amy');
SELECT snapshot_id,
timestamp,
snapshot_location
FROM fuse_snapshot('default', 'members');
+-----------------------------------+----------------------------+------------------------------------------------------------+
| snapshot_id | timestamp | snapshot_location |
+-----------------------------------+----------------------------+------------------------------------------------------------+
| b5931727ee404869ab99b25bf9e672a9 | 2022-08-29 17:53:54.243561 | 418920/604411/_ss/b5931727ee404869ab99b25bf9e672a9_v1.json |
+-----------------------------------+----------------------------+------------------------------------------------------------+
INSERT INTO members
VALUES ('Bob');
SELECT snapshot_id,
timestamp,
snapshot_location
FROM fuse_snapshot('default', 'members');
+----------------------------------+----------------------------+------------------------------------------------------------+
| snapshot_id | timestamp | snapshot_location |
|----------------------------------|----------------------------|------------------------------------------------------------|
| b5931727ee404869ab99b25bf9e672a9 | 2022-08-29 17:53:54.243561 | 418920/604411/_ss/b5931727ee404869ab99b25bf9e672a9_v1.json |
| 12637e70dd1c4abbab15470fa0a6d69b | 2022-08-29 18:04:18.973272 | 418920/604411/_ss/12637e70dd1c4abbab15470fa0a6d69b_v1.json |
+----------------------------------+----------------------------+------------------------------------------------------------+
-- Create a new table with a snapshot (ID: b5931727ee404869ab99b25bf9e672a9)
CREATE TABLE members_previous
(
name VARCHAR
)
snapshot_location='418920/604411/_ss/b5931727ee404869ab99b25bf9e672a9_v1.json';
-- The created table contains "Amy" that is stored in the snapshot
SELECT *
FROM members_previous;
---
Amy
Create Table ... External_Location
-- Create a table named `mytable` and specify the location `s3://testbucket/admin/data/` for the data storage
CREATE TABLE mytable(a int)
's3://testbucket/admin/data/'
connection=(aws_key_id='<your_aws_key_id>' aws_secret_key='<your_aws_secret_key>' endpoint_url='https://s3.amazonaws.com');