From 94a969f8fe75edd6a16d17b72c4e96310f454bb9 Mon Sep 17 00:00:00 2001 From: Airton Lastori Date: Fri, 24 Oct 2025 20:32:11 -0400 Subject: [PATCH 01/14] Update Premium TOC for new migration guide --- TOC-tidb-cloud-premium.md | 2 +- .../premium/import-csv-files-premium.md | 221 ++++++++++ .../premium/import-from-mysql-premium.md | 175 ++++++++ .../premium/migrate-from-op-tidb-premium.md | 412 ++++++++++++++++++ 4 files changed, 809 insertions(+), 1 deletion(-) create mode 100644 tidb-cloud/premium/import-csv-files-premium.md create mode 100644 tidb-cloud/premium/import-from-mysql-premium.md create mode 100644 tidb-cloud/premium/migrate-from-op-tidb-premium.md diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index b797bf126cbf5..541efb87f7fab 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -204,7 +204,7 @@ - Migrate or Import Data - [Overview](/tidb-cloud/tidb-cloud-migration-overview.md) - Migrate Data into TiDB Cloud - - [Migrate from TiDB Self-Managed to TiDB Cloud](/tidb-cloud/migrate-from-op-tidb.md) + - [Migrate from TiDB Self-Managed to TiDB Cloud Premium](/tidb-cloud/premium/migrate-from-op-tidb-premium.md) - [Migrate and Merge MySQL Shards of Large Datasets](/tidb-cloud/migrate-sql-shards.md) - [Migrate from Amazon RDS for Oracle Using AWS DMS](/tidb-cloud/migrate-from-oracle-using-aws-dms.md) - Import Data into TiDB Cloud diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md new file mode 100644 index 0000000000000..0c102af49d6bb --- /dev/null +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -0,0 +1,221 @@ +--- +title: Import CSV files into TiDB Cloud Premium +summary: Learn how to import CSV files from Amazon S3 or Alibaba Cloud Object Storage Service (OSS) into TiDB Cloud Premium instances. +--- + +> **Warning:** +> +> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. + +# Import CSV files into TiDB Cloud Premium + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) or Alibaba Cloud Object Storage Service (OSS) into TiDB Cloud Premium instances. + +> **Note:** +> +> For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). + +## Limitations + +- To ensure data consistency, TiDB Cloud Premium allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. + +## Step 1. Prepare the CSV files + +1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size around 256 MB. + + TiDB Cloud Premium supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud Premium can process multiple files in parallel, which can greatly improve the import speed. + +2. 
Name the CSV files as follows:

    - If a CSV file contains all data of an entire table, name the file in the `${db_name}.${table_name}.csv` format, which maps to the `${db_name}.${table_name}` table when you import the data.
    - If the data of one table is separated into multiple CSV files, append a numeric suffix to these CSV files. For example, `${db_name}.${table_name}.000001.csv` and `${db_name}.${table_name}.000002.csv`. The numeric suffixes can be inconsecutive but must be in ascending order. You also need to add extra zeros before the number to ensure all the suffixes are of the same length.
    - TiDB Cloud Premium supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst`, and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, in which `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`.

    > **Note:**
    >
    > - To achieve better performance, it is recommended to limit the size of each compressed file to 100 MiB.
    > - The Snappy compressed file must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported.
    > - For uncompressed files, if you cannot update the CSV filenames according to the preceding rules in some cases (for example, the CSV file links are also used by your other programs), you can keep the filenames unchanged and use the **Mapping Settings** in [Step 4](#step-4-import-csv-files) to import your source data to a single target table.

## Step 2. Create the target table schemas

Because CSV files do not contain schema information, before importing data from CSV files into TiDB Cloud Premium, you need to create the table schemas using either of the following methods:

- Method 1: In TiDB Cloud Premium, create the target databases and tables for your source data.

- Method 2: In the Amazon S3 or Alibaba Cloud Object Storage Service (OSS) directory where the CSV files are located, create the target table schema files for your source data as follows:

    1. Create database schema files for your source data.

        If your CSV files follow the naming rules in [Step 1](#step-1-prepare-the-csv-files), the database schema files are optional for the data import. Otherwise, the database schema files are mandatory.

        Each database schema file must be in the `${db_name}-schema-create.sql` format and contain a `CREATE DATABASE` DDL statement. With this file, TiDB Cloud Premium will create the `${db_name}` database to store your data when you import the data.

        For example, if you create a `mydb-schema-create.sql` file that contains the following statement, TiDB Cloud Premium will create the `mydb` database when you import the data.

        ```sql
        CREATE DATABASE mydb;
        ```

    2. Create table schema files for your source data.

        If you do not include the table schema files in the Amazon S3 or Alibaba Cloud Object Storage Service directory where the CSV files are located, TiDB Cloud Premium will not create the corresponding tables for you when you import the data.

        Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. 
With this file, TiDB Cloud Premium will create the `${table_name}` table in the `${db_name}` database when you import the data.

        For example, if you create a `mydb.mytable-schema.sql` file that contains the following statement, TiDB Cloud Premium will create the `mytable` table in the `mydb` database when you import the data.

        ```sql
        CREATE TABLE mytable (
            ID INT,
            REGION VARCHAR(20),
            COUNT INT
        );
        ```

        > **Note:**
        >
        > Each `${db_name}.${table_name}-schema.sql` file should only contain a single DDL statement. If the file contains multiple DDL statements, only the first one takes effect.

## Step 3. Configure cross-account access

To allow TiDB Cloud Premium to access the CSV files in Amazon S3 or Alibaba Cloud Object Storage Service (OSS), do one of the following:

- If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access) for your cluster.

    You can use either an AWS access key or a Role ARN to access your bucket. Once finished, make a note of the access key (including the access key ID and secret access key) or the Role ARN value as you will need it in [Step 4](#step-4-import-csv-files).

- If your CSV files are located in Alibaba Cloud Object Storage Service (OSS), [configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access) for your cluster.

## Step 4. Import CSV files

To import the CSV files to TiDB Cloud Premium, take the following steps:

<SimpleTab>
<div label="Amazon S3">
+ +1. Open the **Import** page for your target cluster. + + 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + + 2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + +2. Click **Import data from Cloud Storage**. + +3. On the **Import Data from Cloud Storage** page, provide the following information: + + - **Storage Provider**: select **Amazon S3**. + - **Source Files URI**: + - When importing one file, enter the source file URI in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`. + - When importing multiple files, enter the source folder URI in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`. + - **Credential**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access). + - **AWS Role ARN**: enter the AWS Role ARN value. + - **AWS Access Key**: enter the AWS access key ID and AWS secret access key. + +4. Click **Next**. + +5. In the **Destination Mapping** section, specify how source files are mapped to target tables. + + When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. + + > **Note:** + > + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + + - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format. + + - To manually configure the mapping rules to associate your source CSV files with the target database and table, unselect this option, and then fill in the following fields: + + - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files. Only `*` and `?` wildcards are supported. + + - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. + - `my-data*.csv`: matches all CSV files that start with `my-data`, such as `my-data-2023.csv` and `my-data-final.csv`. + + - **Target Database** and **Target Table**: select the target database and table to import the data to. + +6. Click **Next**. TiDB Cloud Premium scans the source files accordingly. + +7. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + +8. When the import progress shows **Completed**, check the imported tables. + +
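    For example, you can run a quick sanity check from any SQL client after the import completes. The following is a minimal sketch that assumes a `bikeshare.trips` target table; substitute your own database and table names:

    ```sql
    -- The row count should be greater than zero after a successful import
    SELECT COUNT(*) FROM bikeshare.trips;

    -- Spot-check a few rows
    SELECT * FROM bikeshare.trips LIMIT 5;
    ```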
</div>
<div label="Alibaba Cloud OSS">
+ +1. Open the **Import** page for your target cluster. + + 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + + 2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + +2. Click **Import data from Cloud Storage**. + +3. On the **Import Data from Cloud Storage** page, provide the following information: + + - **Storage Provider**: select **Alibaba Cloud OSS**. + - **Source Files URI**: + - When importing one file, enter the source file URI in the following format `oss://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `oss://sampledata/ingest/TableName.01.csv`. + - When importing multiple files, enter the source folder URI in the following format `oss://[bucket_name]/[data_source_folder]/`. For example, `oss://sampledata/ingest/`. + - **Credential**: you can use an AccessKey pair to access your bucket. For more information, see [Configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access). + +4. Click **Next**. + +5. In the **Destination Mapping** section, specify how source files are mapped to target tables. + + When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. + + > **Note:** + > + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + + - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format. + + - To manually configure the mapping rules to associate your source CSV files with the target database and table, unselect this option, and then fill in the following fields: + + - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files. Only `*` and `?` wildcards are supported. + + - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. + - `my-data*.csv`: matches all CSV files that start with `my-data`, such as `my-data-2023.csv` and `my-data-final.csv`. + + - **Target Database** and **Target Table**: select the target database and table to import the data to. + +6. Click **Next**. TiDB Cloud Premium scans the source files accordingly. + +7. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + +8. When the import progress shows **Completed**, check the imported tables. + +
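    After the import completes, a quick row-count check confirms that data landed. A minimal sketch, assuming a `bikeshare.trips` target table:

    ```sql
    -- The row count should be greater than zero after a successful import
    SELECT COUNT(*) FROM bikeshare.trips;
    ```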
</div>
</SimpleTab>
+ +When you run an import task, if any unsupported or invalid conversions are detected, TiDB Cloud Premium terminates the import job automatically and reports an importing error. + +If you get an importing error, do the following: + +1. Drop the partially imported table. +2. Check the table schema file. If there are any errors, correct the table schema file. +3. Check the data types in the CSV files. +4. Try the import task again. + +## Troubleshooting + +### Resolve warnings during data import + +After clicking **Start Import**, if you see a warning message such as `can't find the corresponding source files`, resolve this by providing the correct source file, renaming the existing one according to [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md), or using **Advanced Settings** to make changes. + +After resolving these issues, you need to import the data again. + +### Zero rows in the imported tables + +After the import progress shows **Completed**, check the imported tables. If the number of rows is zero, it means no data files matched the Bucket URI that you entered. In this case, resolve this issue by providing the correct source file, renaming the existing one according to [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md), or using **Advanced Settings** to make changes. After that, import those tables again. diff --git a/tidb-cloud/premium/import-from-mysql-premium.md b/tidb-cloud/premium/import-from-mysql-premium.md new file mode 100644 index 0000000000000..c6701959d391a --- /dev/null +++ b/tidb-cloud/premium/import-from-mysql-premium.md @@ -0,0 +1,175 @@ +--- +title: Import data into TiDB Cloud Premium via MySQL Command-Line Client +summary: Learn how to import small CSV or SQL files into TiDB Cloud Premium instances using the MySQL Command-Line Client (`mysql`). +--- + +> **Warning:** +> +> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. + +# Import data into TiDB Cloud Premium using the MySQL Command-Line Client + +This document describes how to import data into TiDB Cloud Premium using the [MySQL Command-Line Client](https://dev.mysql.com/doc/refman/8.0/en/mysql.html) (`mysql`). The following sections provide step-by-step instructions for importing data from SQL or CSV files. These steps use a logical import, meaning the MySQL Command-Line Client replays SQL statements against TiDB Cloud from your local machine. + +> **Tip:** +> +> Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](/dumpling-overview.md) exports, see [Import CSV files into TiDB Cloud Premium](/tidb-cloud/premium/import-csv-files-premium.md). + +## Prerequisites + +Before you can import data via the MySQL Command-Line Client to a TiDB Cloud Premium instance, you need the following prerequisites: + +- You have access to your TiDB Cloud Premium instance. +- Install the MySQL Command-Line Client (`mysql`) on your local computer. + +## Step 1. Connect to your TiDB Cloud Premium instance + +Connect to your TiDB instance via the MySQL Command-Line Client. 
If this is your first time, you will need to configure the network connection and generate the TiDB SQL `root` user password following the steps below. + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and, if applicable, click **Switch to Private Preview** in the lower-left corner to enter the Premium workspace. Then navigate to the [**TiDB Instances**](https://tidbcloud.com/project/instances) page and click the name of your target instance to go to its overview page. + +2. Click **Connect** in the upper-right corner. A connection dialog is displayed. + +3. Ensure the configurations in the connection dialog match your operating environment. + + - **Connection Type** is set to `Public`. + - **Connect With** is set to `MySQL CLI`. + - **Operating System** matches your environment. + + > **Note:** + > + > Premium clusters ship with the public endpoint disabled by default. If you do not see the `Public` option, enable the public endpoint from the instance details page (in the **Network** tab), or ask an organization admin to enable it before proceeding. + +4. Click **Generate Password** to create a random password. If you have already configured a password, reuse that credential or rotate it before proceeding. + +## Step 2. Define the target database and table schema + +Before importing data, create the target table structure that matches your dataset. + +The following is an example SQL file (`products-schema.sql`) that creates a sample database and table. Update the database or table names to match your environment. + +```sql +CREATE DATABASE IF NOT EXISTS test; +USE test; + +CREATE TABLE products ( + product_id INT PRIMARY KEY, + product_name VARCHAR(255), + price DECIMAL(10, 2) +); +``` + +Run the schema file against your TiDB Cloud Premium instance so the database and table exist before you load data in the next step. + +## Step 3. Import data from a SQL or CSV file + +Use the MySQL Command-Line Client to load data into the schema you created in Step 2. Replace the placeholders with your own file paths, credentials, and dataset as needed, then follow the workflow that matches your source format. + + +
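For example, you can apply the schema file from Step 2 through the same client connection before loading any data. This is a minimal sketch; `<username>`, `<host>`, and `<CA_path>` are placeholders for your own connection details:

```bash
# Create the test database and products table defined in products-schema.sql
mysql --comments --connect-timeout 150 \
    -u '<username>' -h <host> -P 4000 \
    --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
    -p < products-schema.sql
```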
<SimpleTab>
<div label="SQL">

Do the following to import data from an SQL file:

1. Provide a real SQL file (for example, `products.sql`) that contains the data you want to import. This SQL file must include `INSERT` statements with real data, similar to the following:

    ```sql
    INSERT INTO products (product_id, product_name, price) VALUES
    (1, 'Laptop', 999.99),
    (2, 'Smartphone', 499.99),
    (3, 'Tablet', 299.99);
    ```

2. Use the following command to import data from the SQL file:

    ```bash
    mysql --comments --connect-timeout 150 \
    -u '<username>' -h <host> -P 4000 -D test \
    --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
    -p < products.sql
    ```

    Replace the placeholder values (for example, `<username>`, `<password>`, `<host>`, `<CA_path>`, and the SQL file name) with your own connection details and file path.

> **Note:**
>
> The sample schema creates a `test` database and the commands use `-D test`. Change both the schema file and the `-D` parameter if you plan to import into a different database.

> **Important:**
>
> The SQL user you authenticate with must have the required privileges (for example, `CREATE` and `INSERT`) to define tables and load data into the target database.
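For example, an administrator could provision a dedicated least-privilege user for import jobs. This is a sketch only; the user name, host pattern, and password below are hypothetical:

```sql
-- Create a user that can only define tables and load data in the test database
CREATE USER IF NOT EXISTS 'import_user'@'%' IDENTIFIED BY 'strong_password';
GRANT CREATE, INSERT ON test.* TO 'import_user'@'%';
```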
+
</div>
<div label="CSV">

Do the following to import data from a CSV file:

1. Ensure the target database and table exist in TiDB (for example, the `products` table you created in Step 2).

2. Provide a sample CSV file (for example, `products.csv`) that contains the data you want to import. The following is an example of a CSV file:

    **products.csv:**

    ```csv
    product_id,product_name,price
    1,Laptop,999.99
    2,Smartphone,499.99
    3,Tablet,299.99
    ```

3. Use the following command to import data from the CSV file:

    ```bash
    mysql --comments --connect-timeout 150 \
    -u '<username>' -h <host> -P 4000 -D test \
    --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
    -p \
    -e "LOAD DATA LOCAL INFILE '<local_csv_path>' INTO TABLE products
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES (product_id, product_name, price);"
    ```

    Replace the placeholder values (for example, `<username>`, `<password>`, `<host>`, `<CA_path>`, `<local_csv_path>`, and the table name) with your own connection details and dataset paths.

> **Note:**
>
> For more syntax details about `LOAD DATA LOCAL INFILE`, see [`LOAD DATA`](/sql-statements/sql-statement-load-data.md).
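If the client rejects the statement with an error such as `LOAD DATA LOCAL INFILE file request rejected due to restrictions on access`, local file loading is likely disabled on your client. As a sketch under that assumption, you can enable it explicitly with the client's `--local-infile` option:

```bash
# --local-infile=1 allows the client to send local files to the server
mysql --comments --connect-timeout 150 --local-infile=1 \
    -u '<username>' -h <host> -P 4000 -D test \
    --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
    -p \
    -e "LOAD DATA LOCAL INFILE '<local_csv_path>' INTO TABLE products
    FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
    IGNORE 1 LINES (product_id, product_name, price);"
```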
+
</div>
</SimpleTab>

## Step 4. Validate the imported data

After the import completes, run basic queries to confirm that the expected rows are present and the data looks correct.

Use the MySQL Command-Line Client to connect to the same database and run validation queries, such as counting rows and inspecting sample records:

```bash
mysql --comments --connect-timeout 150 \
    -u '<username>' -h <host> -P 4000 -D test \
    --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
    -p \
    -e "SELECT COUNT(*) AS row_count FROM products; \
    SELECT * FROM products ORDER BY product_id LIMIT 5;"
```

Expected output (example):

```text
+-----------+
| row_count |
+-----------+
|         3 |
+-----------+
+------------+--------------+--------+
| product_id | product_name | price  |
+------------+--------------+--------+
|          1 | Laptop       | 999.99 |
|          2 | Smartphone   | 499.99 |
|          3 | Tablet       | 299.99 |
+------------+--------------+--------+
```

Replace the placeholder values with your own connection details, and adjust the validation queries to suit the shape of your dataset.

diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md
new file mode 100644
index 0000000000000..c9df3d07cb761
--- /dev/null
+++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md
@@ -0,0 +1,412 @@
+---
title: Migrate from TiDB Self-Managed to TiDB Cloud Premium
summary: Learn how to migrate data from TiDB Self-Managed to TiDB Cloud Premium.
---

> **Warning:**
>
> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions.
>
> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website.

# Migrate from TiDB Self-Managed to TiDB Cloud Premium

This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud Premium (AWS) using Dumpling and TiCDC.

The overall procedure is as follows:

1. Build the environment and prepare the tools.
2. Migrate full data. The process is as follows:
    1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling.
    2. Import data from Amazon S3 to TiDB Cloud Premium.
3. Replicate incremental data by using TiCDC.
4. Verify the migrated data.

## Prerequisites

It is recommended that you put the S3 bucket and the TiDB Cloud Premium cluster in the same region. Cross-region migration might incur additional cost for data conversion.

Before migration, you need to prepare the following:

- An [AWS account](https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-up-s3.html#sign-up-for-aws-gsg) with administrator access
- An [AWS S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html)
- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target TiDB Cloud Premium cluster hosted on AWS

## Prepare tools

You need to prepare the following tools:

- Dumpling: a data export tool
- TiCDC: a data replication tool

### Dumpling

[Dumpling](https://docs.pingcap.com/tidb/dev/dumpling-overview) is a tool that exports data from TiDB or MySQL into SQL or CSV files. You can use Dumpling to export full data from TiDB Self-Managed.
Before you deploy Dumpling, note the following:

- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as the TiDB cluster in TiDB Cloud Premium.
- The recommended EC2 instance type is **c6g.4xlarge** (16 vCPU and 32 GiB memory). You can choose other EC2 instance types based on your needs. The Amazon Machine Image (AMI) can be Amazon Linux, Ubuntu, or Red Hat.

You can deploy Dumpling by using TiUP or by using the installation package.

#### Deploy Dumpling using TiUP

Use [TiUP](https://docs.pingcap.com/tidb/stable/tiup-overview) to deploy Dumpling:

```bash
## Deploy TiUP
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
source /root/.bash_profile
## Deploy Dumpling and update to the latest version
tiup install dumpling
tiup update --self && tiup update dumpling
```

#### Deploy Dumpling using the installation package

To deploy Dumpling using the installation package:

1. Download the [toolkit package](https://docs.pingcap.com/tidb/stable/download-ecosystem-tools).

2. Extract the package to the target machine. The `dumpling` binary is included in the extracted files. Alternatively, if you prefer TiUP, run `tiup install dumpling`, and then use `tiup dumpling ...` to run Dumpling. For more information, see [Dumpling introduction](https://docs.pingcap.com/tidb/stable/dumpling-overview#dumpling-introduction).

#### Configure privileges for Dumpling

You need the following privileges to export data from the upstream database:

- SELECT
- RELOAD
- LOCK TABLES
- REPLICATION CLIENT
- PROCESS

### Deploy TiCDC

You need to [deploy TiCDC](https://docs.pingcap.com/tidb/dev/deploy-ticdc) to replicate incremental data from the upstream TiDB cluster to TiDB Cloud Premium.

1. Confirm whether the current TiDB version supports TiCDC. TiDB v4.0.8.rc.1 and later versions support TiCDC. You can check the TiDB version by executing `select tidb_version();` in the TiDB cluster. If you need to upgrade TiDB, see [Upgrade TiDB Using TiUP](https://docs.pingcap.com/tidb/stable/upgrade-tidb-using-tiup).

2. Add the TiCDC component to the TiDB cluster. See [Add or scale out TiCDC to an existing TiDB cluster using TiUP](https://docs.pingcap.com/tidb/dev/deploy-ticdc#add-or-scale-out-ticdc-to-an-existing-tidb-cluster-using-tiup). Edit the `scale-out.yml` file to add TiCDC:

    ```yaml
    cdc_servers:
    - host: 10.0.1.3
      gc-ttl: 86400
      data_dir: /tidb-data/cdc-8300
    - host: 10.0.1.4
      gc-ttl: 86400
      data_dir: /tidb-data/cdc-8300
    ```

3. Add the TiCDC component and check the status. Replace `<cluster-name>` with the name of your TiDB cluster:

    ```shell
    tiup cluster scale-out <cluster-name> scale-out.yml
    tiup cluster display <cluster-name>
    ```

## Migrate full data

To migrate data from the TiDB Self-Managed cluster to TiDB Cloud Premium, perform a full data migration as follows:

1. Migrate data from the TiDB Self-Managed cluster to Amazon S3.
2. Migrate data from Amazon S3 to TiDB Cloud Premium.

### Migrate data from the TiDB Self-Managed cluster to Amazon S3

You need to migrate data from the TiDB Self-Managed cluster to Amazon S3 using Dumpling.

If your TiDB cluster is in a local IDC, or the network between the Dumpling server and Amazon S3 is not connected, you can export the files to the local storage first, and then upload them to Amazon S3 later.

#### Step 1. 
Disable the GC mechanism of the upstream TiDB Self-Managed cluster temporarily

To ensure that newly written data is not lost during incremental migration, you need to disable the upstream cluster's garbage collection (GC) mechanism before starting the migration to prevent the system from cleaning up historical data.

Run the following command to disable GC:

```sql
SET GLOBAL tidb_gc_enable = FALSE;
```

Then query the variable to verify the setting. The following is an example output, in which `0` indicates that GC is disabled.

```sql
SELECT @@global.tidb_gc_enable;
+-------------------------+
| @@global.tidb_gc_enable |
+-------------------------+
|                       0 |
+-------------------------+
1 row in set (0.01 sec)
```

#### Step 2. Configure access permissions to the Amazon S3 bucket for Dumpling

Create an access key in the AWS console. See [Create an access key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey) for details.

1. Use your AWS account ID or account alias, your IAM user name, and your password to sign in to [the IAM console](https://console.aws.amazon.com/iam/home#/security_credentials).

2. In the navigation bar on the upper right, choose your user name, and then click **My Security Credentials**.

3. To create an access key, click **Create access key**. Then choose **Download .csv file** to save the access key ID and secret access key to a CSV file on your computer. Store the file in a secure location. You will not have access to the secret access key again after this dialog box closes. After you download the CSV file, choose **Close**. When you create an access key, the key pair is active by default, and you can use the pair right away.

    ![Create access key](/media/tidb-cloud/op-to-cloud-create-access-key01.png)

    ![Download CSV file](/media/tidb-cloud/op-to-cloud-create-access-key02.png)

#### Step 3. Export data from the upstream TiDB cluster to Amazon S3 using Dumpling

Do the following to export data from the upstream TiDB cluster to Amazon S3 using Dumpling:

1. Configure the environment variables for Dumpling.

    ```shell
    export AWS_ACCESS_KEY_ID=${AccessKey}
    export AWS_SECRET_ACCESS_KEY=${SecretKey}
    ```

2. Get the S3 bucket URI and region information from the AWS console. See [Create a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) for details.

    The following screenshot shows how to get the S3 bucket URI information:

    ![Get the S3 URI](/media/tidb-cloud/op-to-cloud-copy-s3-uri.png)

    The following screenshot shows how to get the region information:

    ![Get the region information](/media/tidb-cloud/op-to-cloud-copy-region-info.png)

3. Run Dumpling to export data to the Amazon S3 bucket.

    ```shell
    dumpling \
    -u root \
    -P 4000 \
    -h 127.0.0.1 \
    -r 20000 \
    --filetype {sql|csv} \
    -F 256MiB \
    -t 8 \
    -o "${S3 URI}" \
    --s3.region "${s3.region}"
    ```

    The `-t` option specifies the number of threads for the export. Increasing the number of threads improves the concurrency of Dumpling and the export speed, but it also increases the database's memory consumption, so avoid setting it too high.

    For more information, see [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#export-to-sql-files).

4. Check the export data. 
Usually the exported data includes the following: + + - `metadata`: this file contains the start time of the export, and the location of the master binary log. + - `{schema}-schema-create.sql`: the SQL file for creating the schema + - `{schema}.{table}-schema.sql`: the SQL file for creating the table + - `{schema}.{table}.{0001}.{sql|csv}`: data files + - `*-schema-view.sql`, `*-schema-trigger.sql`, `*-schema-post.sql`: other exported SQL files + +### Migrate data from Amazon S3 to TiDB Cloud Premium + +After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to TiDB Cloud Premium. + +1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your TiDB Cloud Premium cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Copy the **Account ID** and **External ID** supplied in the wizard—you will need them when configuring AWS IAM. + +2. Configure access permissions for Amazon S3. Usually you need the following read-only permissions: + + - s3:GetObject + - s3:GetObjectVersion + - s3:ListBucket + - s3:GetBucketLocation + + If the S3 bucket uses server-side encryption SSE-KMS, you also need to add the KMS permission. + + - kms:Decrypt + +3. Configure the access policy. Go to the [AWS Console > IAM > Access Management > Policies](https://console.aws.amazon.com/iamv2/home#/policies) and switch to your region to check if the access policy for TiDB Cloud Premium exists already. If it does not exist, create a policy following this document [Creating policies on the JSON tab](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create-console.html). + + The following is an example template for the json policy. + + ```json + ## Create a json policy template + ##: fill in the path to the folder in the S3 bucket where the data files to be imported are located. + ##: fill in the ARN of the S3 bucket. You can click the Copy ARN button on the S3 Bucket Overview page to get it. + ##: fill in the ARN for the S3 bucket KMS key. You can get it from S3 bucket > Properties > Default encryption > AWS KMS Key ARN. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/viewing-bucket-key-settings.html + + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:GetObjectVersion" + ], + "Resource": "arn:aws:s3:::" + }, + { + "Effect": "Allow", + "Action": [ + "s3:ListBucket", + "s3:GetBucketLocation" + ], + "Resource": "" + } + // If you have enabled SSE-KMS for the S3 bucket, you need to add the following permissions. + { + "Effect": "Allow", + "Action": [ + "kms:Decrypt" + ], + "Resource": "" + } + , + { + "Effect": "Allow", + "Action": "kms:Decrypt", + "Resource": "" + } + ] + } + ``` + +4. Configure the role. See [Creating an IAM role (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user.html). In the Account ID field, enter the TiDB Cloud Premium Account ID and TiDB Cloud Premium External ID you noted in Step 1. + +5. Get the Role-ARN. Go to [AWS Console > IAM > Access Management > Roles](https://console.aws.amazon.com/iamv2/home#/roles). Switch to your region. Click the role you have created, and note down the ARN. You will use it when importing data into TiDB Cloud Premium. + +6. Import data to TiDB Cloud Premium by following [Import data from Amazon S3 into TiDB Cloud Premium](/tidb-cloud/premium/import-from-s3-premium.md). 
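Before you continue, note the export position that Dumpling records in the `metadata` file: you will use it as the starting point for incremental replication in the next section. The following is an illustrative sketch of what the file typically looks like (the timestamps and `Pos` value are made up):

```text
Started dump at: 2025-10-24 20:32:11
SHOW MASTER STATUS:
        Log: tidb-binlog
        Pos: 431434047157698561
Finished dump at: 2025-10-24 20:35:12
```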
## Replicate incremental data

To replicate incremental data, do the following:

1. Get the start time of the incremental data migration. For example, you can get it from the metadata file of the full data migration.

    ![Start Time in Metadata](/media/tidb-cloud/start_ts_in_metadata.png)

2. Grant TiCDC access to TiDB Cloud Premium.

    1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB Cloud Premium cluster to go to its overview page.
    2. In the left navigation pane, click **Settings** > **Networking**.
    3. On the **Networking** page, click **Add IP Address**.
    4. In the displayed dialog, select **Use IP addresses**, click **+**, fill in the public IP address of the TiCDC component in the **IP Address** field, and then click **Confirm**. Now TiCDC can access TiDB Cloud Premium. For more information, see [Configure an IP Access List](/tidb-cloud/configure-ip-access-list.md).

3. Get the connection information of the downstream TiDB Cloud Premium cluster.

    1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB Cloud Premium cluster to go to its overview page.
    2. Click **Connect** in the upper-right corner.
    3. In the connection dialog, select **Public** from the **Connection Type** drop-down list and select **General** from the **Connect With** drop-down list.
    4. From the connection information, you can get the host IP address and port of the cluster. For more information, see [Connect via public connection](/tidb-cloud/connect-via-standard-connection.md).

4. Create and run the incremental replication task. In the upstream cluster, run the following:

    ```shell
    tiup cdc cli changefeed create \
    --pd=http://172.16.6.122:2379 \
    --sink-uri="tidb://root:123456@172.16.6.125:4000" \
    --changefeed-id="upstream-to-downstream" \
    --start-ts="431434047157698561"
    ```

    - `--pd`: the PD address of the upstream cluster. The format is: `[upstream_pd_ip]:[pd_port]`
    - `--sink-uri`: the downstream address of the replication task. Configure `--sink-uri` according to the following format. Currently, the scheme supports `mysql`, `tidb`, `kafka`, `s3`, and `local`.

        ```shell
        [scheme]://[userinfo@][host]:[port][/path]?[query_parameters]
        ```

    - `--changefeed-id`: the ID of the replication task. The format must match the `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$` regular expression. If this ID is not specified, TiCDC automatically generates a version-4 UUID as the ID.
    - `--start-ts`: specifies the starting TSO of the changefeed. From this TSO, the TiCDC cluster starts pulling data. The default value is the current time.

    For more information, see [CLI and Configuration Parameters of TiCDC Changefeeds](https://docs.pingcap.com/tidb/dev/ticdc-changefeed-config).

5. Enable the GC mechanism again in the upstream cluster. If no error or delay is found in incremental replication, enable the GC mechanism to resume garbage collection of the cluster.

    Run the following command to enable GC:

    ```sql
    SET GLOBAL tidb_gc_enable = TRUE;
    ```

    Then query the variable to verify the setting. The following is an example output, in which `1` indicates that GC is enabled.
+ + ```sql + SELECT @@global.tidb_gc_enable; + +-------------------------+ + | @@global.tidb_gc_enable | + +-------------------------+ + | 1 | + +-------------------------+ + 1 row in set (0.01 sec) + ``` + +6. Verify the incremental replication task. + + - If the message "Create changefeed successfully!" is displayed in the output, the replication task is created successfully. + - If the state is `normal`, the replication task is normal. + + ```shell + tiup cdc cli changefeed list --pd=http://172.16.6.122:2379 + ``` + + ![Update Filter](/media/tidb-cloud/normal_status_in_replication_task.png) + + - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream TiDB Cloud Premium cluster. + +7. Set the same timezone for the upstream and downstream clusters. By default, TiDB Cloud Premium sets the timezone to UTC. If the timezone is different between the upstream and downstream clusters, you need to set the same timezone for both clusters. + + 1. In the upstream cluster, run the following command to check the timezone: + + ```sql + SELECT @@global.time_zone; + ``` + + 2. In the downstream cluster, run the following command to set the timezone: + + ```sql + SET GLOBAL time_zone = '+08:00'; + ``` + + 3. Check the timezone again to verify the setting: + + ```sql + SELECT @@global.time_zone; + ``` + +8. Back up the [query bindings](/sql-plan-management.md) in the upstream cluster and restore them in the downstream cluster. You can use the following query to back up the query bindings: + + ```sql + SELECT DISTINCT(CONCAT('CREATE GLOBAL BINDING FOR ', original_sql,' USING ', bind_sql,';')) FROM mysql.bind_info WHERE status='enabled'; + ``` + + If you do not get any output, query bindings might not be used in the upstream cluster. In this case, you can skip this step. + + After you get the query bindings, run them in the downstream cluster to restore the query bindings. + +9. Back up the user and privilege information in the upstream cluster and restore them in the downstream cluster. You can use the following script to back up the user and privilege information. Note that you need to replace the placeholders with the actual values. + + ```shell + #!/bin/bash + + export MYSQL_HOST={tidb_op_host} + export MYSQL_TCP_PORT={tidb_op_port} + export MYSQL_USER=root + export MYSQL_PWD={root_password} + export MYSQL="mysql -u${MYSQL_USER} --default-character-set=utf8mb4" + + function backup_user_priv(){ + ret=0 + sql="SELECT CONCAT(user,':',host,':',authentication_string) FROM mysql.user WHERE user NOT IN ('root')" + for usr in `$MYSQL -se "$sql"`;do + u=`echo $usr | awk -F ":" '{print $1}'` + h=`echo $usr | awk -F ":" '{print $2}'` + p=`echo $usr | awk -F ":" '{print $3}'` + echo "-- Grants for '${u}'@'${h}';" + [[ ! -z "${p}" ]] && echo "CREATE USER IF NOT EXISTS '${u}'@'${h}' IDENTIFIED WITH 'mysql_native_password' AS '${p}' ;" + $MYSQL -se "SHOW GRANTS FOR '${u}'@'${h}';" | sed 's/$/;/g' + [ $? -ne 0 ] && ret=1 && break + done + return $ret + } + + backup_user_priv + ``` + + After you get the user and privilege information, run the generated SQL statements in the downstream cluster to restore the user and privilege information. 
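    For example, you can redirect the script's output to a file and replay it against the downstream cluster. The following is a minimal sketch, assuming you saved the preceding script as `backup_user_priv.sh`; `<premium_host>` is a placeholder for your TiDB Cloud Premium endpoint:

    ```shell
    # Capture the generated CREATE USER and GRANT statements
    bash backup_user_priv.sh > user_grants.sql

    # Replay them in the downstream TiDB Cloud Premium cluster
    mysql -u root -h <premium_host> -P 4000 -p < user_grants.sql
    ```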
From 87cdb5ca3f9c387eecaeb107555c3d7165b1aa77 Mon Sep 17 00:00:00 2001 From: Airton Lastori Date: Fri, 24 Oct 2025 20:51:21 -0400 Subject: [PATCH 02/14] Describe Premium import wizard flow --- .../premium/import-csv-files-premium.md | 34 +++++--- tidb-cloud/premium/import-from-s3-premium.md | 77 +++++++++++++++++++ 2 files changed, 99 insertions(+), 12 deletions(-) create mode 100644 tidb-cloud/premium/import-from-s3-premium.md diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md index 0c102af49d6bb..a2af85f0e6fa7 100644 --- a/tidb-cloud/premium/import-csv-files-premium.md +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -114,12 +114,14 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - When importing one file, enter the source file URI in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`. - When importing multiple files, enter the source folder URI in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`. - **Credential**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access). - - **AWS Role ARN**: enter the AWS Role ARN value. + - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into TiDB Cloud Premium. - **AWS Access Key**: enter the AWS access key ID and AWS secret access key. + - **Test Bucket Access**: click this button after the credentials are in place to confirm that TiDB Cloud Premium can reach the bucket. + - **Target Connection**: supply the TiDB username and password that will run the import. Optionally click **Test Connection** to validate the credentials. 4. Click **Next**. -5. In the **Destination Mapping** section, specify how source files are mapped to target tables. +5. In the **Source Files Mapping** section, TiDB Cloud Premium scans the bucket and proposes mappings between the source files and destination tables. When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. @@ -129,8 +131,6 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format. - - To manually configure the mapping rules to associate your source CSV files with the target database and table, unselect this option, and then fill in the following fields: - - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files. Only `*` and `?` wildcards are supported. - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. 
@@ -138,9 +138,15 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - **Target Database** and **Target Table**: select the target database and table to import the data to. -6. Click **Next**. TiDB Cloud Premium scans the source files accordingly. + - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. + + > **Note:** + > + > Manual mapping is coming soon. The UI shows the related controls in a disabled state today, but the workflow below remains accurate for the upcoming release: + +6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. -7. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. +7. When the import progress shows **Completed**, check the imported tables. 8. When the import progress shows **Completed**, check the imported tables. @@ -167,10 +173,12 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - When importing one file, enter the source file URI in the following format `oss://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `oss://sampledata/ingest/TableName.01.csv`. - When importing multiple files, enter the source folder URI in the following format `oss://[bucket_name]/[data_source_folder]/`. For example, `oss://sampledata/ingest/`. - **Credential**: you can use an AccessKey pair to access your bucket. For more information, see [Configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access). + - **Test Bucket Access**: click this button after the credentials are in place to confirm that TiDB Cloud Premium can reach the bucket. + - **Target Connection**: supply the TiDB username and password that will run the import. Optionally click **Test Connection** to validate the credentials. 4. Click **Next**. -5. In the **Destination Mapping** section, specify how source files are mapped to target tables. +5. In the **Source Files Mapping** section, TiDB Cloud Premium scans the bucket and proposes mappings between the source files and destination tables. When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. @@ -180,8 +188,6 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format. - - To manually configure the mapping rules to associate your source CSV files with the target database and table, unselect this option, and then fill in the following fields: - - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files. Only `*` and `?` wildcards are supported. - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. 
@@ -189,11 +195,15 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - **Target Database** and **Target Table**: select the target database and table to import the data to. -6. Click **Next**. TiDB Cloud Premium scans the source files accordingly. + - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. -7. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + > **Note:** + > + > Manual mapping is coming soon. The UI shows the related controls in a disabled state today, but the workflow below remains accurate for the upcoming release: -8. When the import progress shows **Completed**, check the imported tables. +6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + +7. When the import progress shows **Completed**, check the imported tables. diff --git a/tidb-cloud/premium/import-from-s3-premium.md b/tidb-cloud/premium/import-from-s3-premium.md new file mode 100644 index 0000000000000..ab7280c6626e7 --- /dev/null +++ b/tidb-cloud/premium/import-from-s3-premium.md @@ -0,0 +1,77 @@ +--- +title: Import data from Amazon S3 into TiDB Cloud Premium +summary: Learn how to import CSV files from Amazon S3 into TiDB Cloud Premium instances using the console wizard. +--- + +> **Warning:** +> +> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. + +# Import data from Amazon S3 into TiDB Cloud Premium + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into TiDB Cloud Premium instances. The steps mirror the current Private Preview user interface and are intended as an initial scaffold for the public preview launch. + +> **Note:** +> +> For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). For TiDB Cloud Dedicated, see [Import CSV files from cloud storage into TiDB Cloud Dedicated](/tidb-cloud/import-csv-files.md). + +## Limitations + +- To ensure data consistency, TiDB Cloud Premium allows importing CSV files into empty tables only. If the target table already contains data, import into a staging table and then copy the rows with `INSERT ... SELECT`. +- During the preview, the UI only surfaces Amazon S3 as the storage provider. Support for additional providers will be tracked separately. +- Each import job maps one source pattern to a single destination table. + +## Step 1. Prepare the CSV files + +1. If a CSV file is larger than 256 MB, consider splitting it into smaller files around 256 MB so TiDB Cloud Premium can process them in parallel. +2. Name the CSV files following Dumpling conventions: + - Full-table files use the `${db_name}.${table_name}.csv` format. + - Sharded files append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`. + - Compressed files use `${db_name}.${table_name}.${suffix}.csv.${compress}`. +3. 
Optional schema files (`${db_name}-schema-create.sql`, `${db_name}.${table_name}-schema.sql`) help TiDB Cloud Premium create databases and tables automatically. + + +These naming conventions are identical to the TiDB Cloud Serverless workflow. Update this section after we validate the Premium defaults. + + +## Step 2. Create target schemas (optional) + +If TiDB Cloud Premium should create the databases and tables for you, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the objects manually in TiDB Cloud Premium before running the import. + +## Step 3. Configure Amazon S3 access + +To allow TiDB Cloud Premium to read your bucket: + +- Provide an AWS Role ARN that trusts TiDB Cloud and grants `s3:GetObject` and `s3:ListBucket` on the relevant paths, or +- Provide an AWS access key (access key ID and secret access key) with equivalent permissions. + +The wizard includes a helper link labeled **Click here to create new one with AWS CloudFormation**. Follow that link if you need TiDB Cloud Premium to pre-fill a CloudFormation stack that creates the role for you. + +## Step 4. Import CSV files from Amazon S3 + +1. Open the Premium workspace in the TiDB Cloud console and select your instance. +2. Go to **Data → Import** and click **Import data from Cloud Storage**. +3. On the **Source Connection** tab: + - Set **Storage Provider** to **Amazon S3**. + - Enter the **Source Files URI** for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`). + - Choose **AWS Role ARN** or **AWS Access Key** and provide the credentials. + - Click **Test Bucket Access** to validate connectivity. + Known preview issue: the button returns to the idle state without a success toast. +4. Click **Next** and supply the TiDB SQL username and password for the import job. Optionally test the connection. +5. Review the automatically generated source-to-target mapping. Disable automatic mapping if you need to define custom patterns and destination tables. +6. Click **Next** to run the pre-check. Resolve any warnings about missing files or incompatible schemas. +7. Click **Start Import** to launch the job group. +8. Monitor the job statuses until they show **Completed**, then verify the imported data in TiDB Cloud. + +## Troubleshooting + +- If the pre-check finds zero files, confirm the S3 path and IAM permissions. +- If jobs remain in **Preparing**, make sure the destination tables are empty and the required schema files exist. +- Use the **Cancel** action to stop a job group if you need to adjust mappings or credentials. + +## Next steps + +- [Import data into TiDB Cloud Premium via MySQL CLI](/tidb-cloud/premium/import-from-mysql-premium.md) for scripted imports. +- [Troubleshoot import access denied errors](/tidb-cloud/troubleshoot-import-access-denied-error.md) for IAM-related problems. 
From 8a9dd85831cef0305d107fd314231620e8b674b3 Mon Sep 17 00:00:00 2001 From: Airton Lastori Date: Fri, 24 Oct 2025 21:01:03 -0400 Subject: [PATCH 03/14] Align migration guide with Premium import flow --- .../premium/migrate-from-op-tidb-premium.md | 47 +++++++++---------- 1 file changed, 22 insertions(+), 25 deletions(-) diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md index c9df3d07cb761..9aa1cf51c5b6d 100644 --- a/tidb-cloud/premium/migrate-from-op-tidb-premium.md +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -207,29 +207,33 @@ Do the following to export data from the upstream TiDB cluster to Amazon S3 usin After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to TiDB Cloud Premium. -1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your TiDB Cloud Premium cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Copy the **Account ID** and **External ID** supplied in the wizard—you will need them when configuring AWS IAM. +1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your TiDB Cloud Premium cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Make a note of the **Account ID** and **External ID** displayed in the wizard—these values are embedded in the CloudFormation template. -2. Configure access permissions for Amazon S3. Usually you need the following read-only permissions: +2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create new one with AWS CloudFormation** and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). - - s3:GetObject - - s3:GetObjectVersion - - s3:ListBucket - - s3:GetBucketLocation + - Open the pre-filled CloudFormation template in the AWS console. + - Provide a role name, review the permissions, and acknowledge the IAM warning. + - Create the stack and wait for the status to change to **CREATE_COMPLETE**. + - On the **Outputs** tab, copy the newly generated Role ARN. + - Return to TiDB Cloud Premium, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. - If the S3 bucket uses server-side encryption SSE-KMS, you also need to add the KMS permission. +3. Continue with the remaining steps in the import wizard, using the saved Role ARN when prompted. - - kms:Decrypt +#### Manually create the IAM role (optional) -3. Configure the access policy. Go to the [AWS Console > IAM > Access Management > Policies](https://console.aws.amazon.com/iamv2/home#/policies) and switch to your region to check if the access policy for TiDB Cloud Premium exists already. If it does not exist, create a policy following this document [Creating policies on the JSON tab](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create-console.html). +If your organization cannot deploy CloudFormation stacks, create the access policy and IAM role manually: - The following is an example template for the json policy. +1. In AWS IAM, create a policy that grants the following actions on your bucket (and KMS key, if applicable): - ```json - ## Create a json policy template - ##: fill in the path to the folder in the S3 bucket where the data files to be imported are located. - ##: fill in the ARN of the S3 bucket. 
You can click the Copy ARN button on the S3 Bucket Overview page to get it.
-   ##<Your S3 bucket KMS key ARN>: fill in the ARN for the S3 bucket KMS key. You can get it from S3 bucket > Properties > Default encryption > AWS KMS Key ARN. For more information, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/viewing-bucket-key-settings.html
+    - `s3:GetObject`
+    - `s3:GetObjectVersion`
+    - `s3:ListBucket`
+    - `s3:GetBucketLocation`
+    - `kms:Decrypt` (only when SSE-KMS encryption is enabled)
+
+    The JSON template below shows the required structure. Replace the placeholders with your bucket path, bucket ARN, and (if needed) KMS key ARN.
 
+    ```json
     {
         "Version": "2012-10-17",
         "Statement": [
             {
@@ -248,8 +252,7 @@ After you export data from the TiDB Self-Managed cluster to Amazon S3, you need
                 "s3:GetBucketLocation"
             ],
             "Resource": "<Your S3 bucket ARN>"
-            }
-            // If you have enabled SSE-KMS for the S3 bucket, you need to add the following permissions.
+            },
             {
                 "Effect": "Allow",
                 "Action": [
@@ -257,19 +260,13 @@ After you export data from the TiDB Self-Managed cluster to Amazon S3, you need
                 ],
                 "Resource": "<Your S3 bucket KMS key ARN>"
             }
-            ,
-            {
-                "Effect": "Allow",
-                "Action": "kms:Decrypt",
-                "Resource": "<Your S3 bucket KMS key ARN>"
-            }
         ]
     }
     ```
 
-4. Configure the role. See [Creating an IAM role (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user.html). In the Account ID field, enter the TiDB Cloud Premium Account ID and TiDB Cloud Premium External ID you noted in Step 1.
+2. Create an IAM role that trusts TiDB Cloud Premium by supplying the **Account ID** and **External ID** noted in Step 1. Attach the policy from the previous step to this role.
 
-5. Get the Role-ARN. Go to [AWS Console > IAM > Access Management > Roles](https://console.aws.amazon.com/iamv2/home#/roles). Switch to your region. Click the role you have created, and note down the ARN. You will use it when importing data into TiDB Cloud Premium.
+3. Copy the resulting Role ARN and enter it in the TiDB Cloud Premium import wizard.
 
 6. Import data to TiDB Cloud Premium by following [Import data from Amazon S3 into TiDB Cloud Premium](/tidb-cloud/premium/import-from-s3-premium.md).

From 5d40e69410f6a99fee9e073a7798bc94a9d95806 Mon Sep 17 00:00:00 2001
From: Airton Lastori
Date: Fri, 24 Oct 2025 21:08:29 -0400
Subject: [PATCH 04/14] Clarify Premium CSV mapping behavior

---
 .../premium/import-csv-files-premium.md | 30 +++++++-------------
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md
index a2af85f0e6fa7..7fe88f7c4945e 100644
--- a/tidb-cloud/premium/import-csv-files-premium.md
+++ b/tidb-cloud/premium/import-csv-files-premium.md
@@ -129,27 +129,21 @@ To import the CSV files to TiDB Cloud Premium, take the following steps:
 
     >
     > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import.
 
-    - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format.
-
-    - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files.
Only `*` and `?` wildcards are supported. - - - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. - - `my-data*.csv`: matches all CSV files that start with `my-data`, such as `my-data-2023.csv` and `my-data-final.csv`. - - - **Target Database** and **Target Table**: select the target database and table to import the data to. + - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. Keep **CSV** selected as the data format. - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. > **Note:** > - > Manual mapping is coming soon. The UI shows the related controls in a disabled state today, but the workflow below remains accurate for the upcoming release: + > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: + > + > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). + > - **Target Database** and **Target Table**: choose the destination objects for the matched files. 6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. 7. When the import progress shows **Completed**, check the imported tables. -8. When the import progress shows **Completed**, check the imported tables. -
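Once a job shows **Completed**, a quick spot check from any MySQL client helps confirm that the rows landed in the intended table. This is a sketch only: the user, host, and port are placeholders, and `bikeshare.trips` is borrowed from the naming examples in this document.

```shell
# Compare the imported row count with the number of data rows
# in the source CSV files for the same table.
mysql -u <user> -h <your_instance_host> -P 4000 -p \
    -e "SELECT COUNT(*) FROM bikeshare.trips;"
```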
@@ -186,20 +180,16 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: > > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. - - To let TiDB Cloud Premium automatically map all source files that follow the [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to their corresponding tables, keep this option selected and select **CSV** as the data format. - - - **Source**: enter the file name pattern in the `[file_name].csv` format. For example: `TableName.01.csv`. You can also use wildcards to match multiple files. Only `*` and `?` wildcards are supported. - - - `my-data?.csv`: matches all CSV files that start with `my-data` followed by a single character, such as `my-data1.csv` and `my-data2.csv`. - - `my-data*.csv`: matches all CSV files that start with `my-data`, such as `my-data-2023.csv` and `my-data-final.csv`. - - - **Target Database** and **Target Table**: select the target database and table to import the data to. + - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. Keep **CSV** selected as the data format. - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. > **Note:** > - > Manual mapping is coming soon. The UI shows the related controls in a disabled state today, but the workflow below remains accurate for the upcoming release: + > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: + > + > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). + > - **Target Database** and **Target Table**: choose the destination objects for the matched files. 6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. From c5f0db10ae1b15dc41a6d93e963ac7ee42aa3030 Mon Sep 17 00:00:00 2001 From: Airton Lastori Date: Fri, 24 Oct 2025 21:27:03 -0400 Subject: [PATCH 05/14] Apply suggestions from code review Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- tidb-cloud/premium/import-csv-files-premium.md | 10 +++++----- tidb-cloud/premium/import-from-mysql-premium.md | 6 +++--- tidb-cloud/premium/import-from-s3-premium.md | 6 +++--- .../premium/migrate-from-op-tidb-premium.md | 16 ++++++++-------- 4 files changed, 19 insertions(+), 19 deletions(-) diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md index 7fe88f7c4945e..c10798e69641e 100644 --- a/tidb-cloud/premium/import-csv-files-premium.md +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -30,8 +30,8 @@ This document describes how to import CSV files from Amazon Simple Storage Servi 2. 
Name the CSV files as follows: - If a CSV file contains all data of an entire table, name the file in the `${db_name}.${table_name}.csv` format, which maps to the `${db_name}.${table_name}` table when you import the data. - - If the data of one table is separated into multiple CSV files, append a numeric suffix to these CSV files. For example, `${db_name}.${table_name}.000001.csv` and `${db_name}.${table_name}.000002.csv`. The numeric suffixes can be inconsecutive but must be in ascending order. You also need to add extra zeros before the number to ensure all the suffixes are in the same length. - - TiDB Cloud Premium supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst` and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, in which `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`. + - If the data of one table is separated into multiple CSV files, append a numeric suffix to these CSV files. For example, `${db_name}.${table_name}.000001.csv` and `${db_name}.${table_name}.000002.csv`. The numeric suffixes can be non-consecutive but must be in ascending order. You also need to add extra zeros before the number to ensure that all suffixes have the same length. + - TiDB Cloud Premium supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst` and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, where `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`. > **Note:** > @@ -53,7 +53,7 @@ Because CSV files do not contain schema information, before importing data from Each database schema file must be in the `${db_name}-schema-create.sql` format and contain a `CREATE DATABASE` DDL statement. With this file, TiDB Cloud Premium will create the `${db_name}` database to store your data when you import the data. - For example, if you create a `mydb-scehma-create.sql` file that contains the following statement, TiDB Cloud Premium will create the `mydb` database when you import the data. + For example, if you create a `mydb-schema-create.sql` file that contains the following statement, TiDB Cloud Premium will create the `mydb` database when you import the data. ```sql CREATE DATABASE mydb; @@ -63,7 +63,7 @@ Because CSV files do not contain schema information, before importing data from If you do not include the table schema files in the Amazon S3 or Alibaba Cloud Object Storage Service directory where the CSV files are located, TiDB Cloud Premium will not create the corresponding tables for you when you import the data. - Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. With this file, TiDB Cloud Premium will create the `${db_table}` table in the `${db_name}` database when you import the data. + Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. 
With this file, TiDB Cloud Premium will create the `${table_name}` table in the `${db_name}` database when you import the data. For example, if you create a `mydb.mytable-schema.sql` file that contains the following statement, TiDB Cloud Premium will create the `mytable` table in the `mydb` database when you import the data. @@ -114,7 +114,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - When importing one file, enter the source file URI in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`. - When importing multiple files, enter the source folder URI in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`. - **Credential**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access). - - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into TiDB Cloud Premium. + - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create a new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into TiDB Cloud Premium. - **AWS Access Key**: enter the AWS access key ID and AWS secret access key. - **Test Bucket Access**: click this button after the credentials are in place to confirm that TiDB Cloud Premium can reach the bucket. - **Target Connection**: supply the TiDB username and password that will run the import. Optionally click **Test Connection** to validate the credentials. diff --git a/tidb-cloud/premium/import-from-mysql-premium.md b/tidb-cloud/premium/import-from-mysql-premium.md index c6701959d391a..4d8ce9e4a79c8 100644 --- a/tidb-cloud/premium/import-from-mysql-premium.md +++ b/tidb-cloud/premium/import-from-mysql-premium.md @@ -19,7 +19,7 @@ This document describes how to import data into TiDB Cloud Premium using the [My ## Prerequisites -Before you can import data via the MySQL Command-Line Client to a TiDB Cloud Premium instance, you need the following prerequisites: +Before you can import data to a TiDB Cloud Premium instance via the MySQL Command-Line Client, you need the following prerequisites: - You have access to your TiDB Cloud Premium instance. - Install the MySQL Command-Line Client (`mysql`) on your local computer. @@ -72,7 +72,7 @@ Use the MySQL Command-Line Client to load data into the schema you created in St Do the following to import data from an SQL file: -1. Provide a real SQL file (for example, `products.sql`) that contains the data you want to import. This SQL file must include `INSERT` statements with real data, similar to the following: +1. Provide an SQL file (for example, `products.sql`) that contains the data you want to import. This SQL file must include `INSERT` statements with data, similar to the following: ```sql INSERT INTO products (product_id, product_name, price) VALUES @@ -107,7 +107,7 @@ Do the following to import data from a CSV file: 1. Ensure the target database and table exist in TiDB (for example, the `products` table you created in Step 2). -2. 
Provide a sample CSV file (for example, `products.csv`) that contains the data you want to import. The following is an example of a CSV file: +2. Provide a sample CSV file (for example, `products.csv`) that contains the data you want to import. The following is an example: **products.csv:** diff --git a/tidb-cloud/premium/import-from-s3-premium.md b/tidb-cloud/premium/import-from-s3-premium.md index ab7280c6626e7..9434a3829d34e 100644 --- a/tidb-cloud/premium/import-from-s3-premium.md +++ b/tidb-cloud/premium/import-from-s3-premium.md @@ -38,7 +38,7 @@ These naming conventions are identical to the TiDB Cloud Serverless workflow. Up ## Step 2. Create target schemas (optional) -If TiDB Cloud Premium should create the databases and tables for you, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the objects manually in TiDB Cloud Premium before running the import. +If you want TiDB Cloud Premium to create the databases and tables for you, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the objects manually in TiDB Cloud Premium before running the import. ## Step 3. Configure Amazon S3 access @@ -47,12 +47,12 @@ To allow TiDB Cloud Premium to read your bucket: - Provide an AWS Role ARN that trusts TiDB Cloud and grants `s3:GetObject` and `s3:ListBucket` on the relevant paths, or - Provide an AWS access key (access key ID and secret access key) with equivalent permissions. -The wizard includes a helper link labeled **Click here to create new one with AWS CloudFormation**. Follow that link if you need TiDB Cloud Premium to pre-fill a CloudFormation stack that creates the role for you. +The wizard includes a helper link labeled **Click here to create a new one with AWS CloudFormation**. Follow that link if you need TiDB Cloud Premium to pre-fill a CloudFormation stack that creates the role for you. ## Step 4. Import CSV files from Amazon S3 1. Open the Premium workspace in the TiDB Cloud console and select your instance. -2. Go to **Data → Import** and click **Import data from Cloud Storage**. +2. Go to **Data > Import** and click **Import data from Cloud Storage**. 3. On the **Source Connection** tab: - Set **Storage Provider** to **Amazon S3**. - Enter the **Source Files URI** for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`). diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md index 9aa1cf51c5b6d..7e3bf8a9612a8 100644 --- a/tidb-cloud/premium/migrate-from-op-tidb-premium.md +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -45,7 +45,7 @@ You need to prepare the following tools: Before you deploy Dumpling, note the following: -- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as the TiDB cluster in TiDB Cloud Premium. +- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target TiDB Cloud Premium cluster. - The recommended EC2 instance type is **c6g.4xlarge** (16 vCPU and 32 GiB memory). You can choose other EC2 instance types based on your needs. The Amazon Machine Image (AMI) can be Amazon Linux, Ubuntu, or Red Hat. You can deploy Dumpling by using TiUP or using the installation package. @@ -69,7 +69,7 @@ To deploy Dumpling using the installation package: 1. Download the [toolkit package](https://docs.pingcap.com/tidb/stable/download-ecosystem-tools). -2. Extract it to the target machine. 
You can get Dumpling using TiUP by running `tiup install dumpling`. Afterwards, you can use `tiup dumpling ...` to run Dumpling. For more information, see [Dumpling introduction](https://docs.pingcap.com/tidb/stable/dumpling-overview#dumpling-introduction). +2. Extract it to the target machine. You can get Dumpling using TiUP by running `tiup install dumpling`. Then, you can use `tiup dumpling ...` to run Dumpling. For more information, see [Dumpling introduction](https://docs.pingcap.com/tidb/stable/dumpling-overview#dumpling-introduction). #### Configure privileges for Dumpling @@ -184,14 +184,14 @@ Do the following to export data from the upstream TiDB cluster to Amazon S3 usin -P 4000 \ -h 127.0.0.1 \ -r 20000 \ - --filetype {sql|csv} \ + --filetype sql \ -F 256MiB \ -t 8 \ -o "${S3 URI}" \ --s3.region "${s3.region}" ``` - The `-t` option specifies the number of threads for the export. Increasing the number of threads improves the concurrency of Dumpling and the export speed, and also increases the database's memory consumption. Therefore, do not set a too large number for this parameter. + The `-t` option specifies the number of threads for the export. Increasing the number of threads improves the concurrency of Dumpling and the export speed, and also increases the database's memory consumption. Therefore, do not set this parameter to a very large number. For more information, see [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#export-to-sql-files). @@ -209,7 +209,7 @@ After you export data from the TiDB Self-Managed cluster to Amazon S3, you need 1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your TiDB Cloud Premium cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Make a note of the **Account ID** and **External ID** displayed in the wizard—these values are embedded in the CloudFormation template. -2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create new one with AWS CloudFormation** and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). +2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create a new one with AWS CloudFormation** and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). - Open the pre-filled CloudFormation template in the AWS console. - Provide a role name, review the permissions, and acknowledge the IAM warning. @@ -268,7 +268,7 @@ If your organization cannot deploy CloudFormation stacks, create the access poli 3. Copy the resulting Role ARN and enter it in the TiDB Cloud Premium import wizard. -6. Import data to TiDB Cloud Premium by following [Import data from Amazon S3 into TiDB Cloud Premium](/tidb-cloud/premium/import-from-s3-premium.md). +4. Import data to TiDB Cloud Premium by following [Import data from Amazon S3 into TiDB Cloud Premium](/tidb-cloud/premium/import-from-s3-premium.md). ## Replicate incremental data @@ -322,7 +322,7 @@ To replicate incremental data, do the following: SET GLOBAL tidb_gc_enable = TRUE; ``` - The following is an example output, in which `1` indicates that GC is disabled. + The following is an example output, in which `1` indicates that GC is enabled. 
```sql SELECT @@global.tidb_gc_enable; @@ -373,7 +373,7 @@ To replicate incremental data, do the following: SELECT DISTINCT(CONCAT('CREATE GLOBAL BINDING FOR ', original_sql,' USING ', bind_sql,';')) FROM mysql.bind_info WHERE status='enabled'; ``` - If you do not get any output, query bindings might not be used in the upstream cluster. In this case, you can skip this step. + If you do not get any output, it means that no query bindings are used in the upstream cluster. In this case, you can skip this step. After you get the query bindings, run them in the downstream cluster to restore the query bindings. From 207976516c818ea8dec4eb7b8bf396f29026f4ab Mon Sep 17 00:00:00 2001 From: Airton Lastori Date: Fri, 24 Oct 2025 21:41:42 -0400 Subject: [PATCH 06/14] Fix Dumpling link in Premium MySQL import doc --- tidb-cloud/premium/import-from-mysql-premium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/premium/import-from-mysql-premium.md b/tidb-cloud/premium/import-from-mysql-premium.md index 4d8ce9e4a79c8..df49c0126cd28 100644 --- a/tidb-cloud/premium/import-from-mysql-premium.md +++ b/tidb-cloud/premium/import-from-mysql-premium.md @@ -15,7 +15,7 @@ This document describes how to import data into TiDB Cloud Premium using the [My > **Tip:** > -> Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](/dumpling-overview.md) exports, see [Import CSV files into TiDB Cloud Premium](/tidb-cloud/premium/import-csv-files-premium.md). +> Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) exports, see [Import CSV files into TiDB Cloud Premium](/tidb-cloud/premium/import-csv-files-premium.md). ## Prerequisites From 23bc90dd49091b8c7b380483a0c48e5d63908523 Mon Sep 17 00:00:00 2001 From: Lilian Lee Date: Tue, 28 Oct 2025 11:16:59 +0800 Subject: [PATCH 07/14] Apply suggestions from code review --- tidb-cloud/premium/migrate-from-op-tidb-premium.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md index 7e3bf8a9612a8..678828bc461a5 100644 --- a/tidb-cloud/premium/migrate-from-op-tidb-premium.md +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -3,15 +3,15 @@ title: Migrate from TiDB Self-Managed to TiDB Cloud Premium summary: Learn how to migrate data from TiDB Self-Managed to TiDB Cloud Premium. --- +# Migrate from TiDB Self-Managed to TiDB Cloud Premium + +This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud Premium (on AWS) through Dumpling and TiCDC. + > **Warning:** > > TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. > -> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. - -# Migrate from TiDB Self-Managed to TiDB Cloud Premium - -This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud Premium (AWS) through Dumpling and TiCDC. 
+> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. The overall procedure is as follows: From de4c4ec00229dfc84fe931744614f4e8206d0f92 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 11:22:47 +0800 Subject: [PATCH 08/14] Refactor migration guide to use .premium variable Replaced hardcoded 'TiDB Cloud Premium' references with the '{{{ .premium }}}' variable throughout the migration documentation for improved maintainability and dynamic branding. --- .../premium/migrate-from-op-tidb-premium.md | 52 +++++++++---------- 1 file changed, 26 insertions(+), 26 deletions(-) diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md index 678828bc461a5..149a2cc936ae7 100644 --- a/tidb-cloud/premium/migrate-from-op-tidb-premium.md +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -1,15 +1,15 @@ --- -title: Migrate from TiDB Self-Managed to TiDB Cloud Premium -summary: Learn how to migrate data from TiDB Self-Managed to TiDB Cloud Premium. +title: Migrate from TiDB Self-Managed to {{{ .premium }}} +summary: Learn how to migrate data from TiDB Self-Managed to {{{ .premium }}}. --- -# Migrate from TiDB Self-Managed to TiDB Cloud Premium +# Migrate from TiDB Self-Managed to {{{ .premium }}} -This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud Premium (on AWS) through Dumpling and TiCDC. +This document describes how to migrate data from your TiDB Self-Managed clusters to {{{ .premium }}} (on AWS) through Dumpling and TiCDC. > **Warning:** > -> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> {{{ .premium }}} is currently available in **Private Preview** in select AWS regions. > > If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. @@ -18,19 +18,19 @@ The overall procedure is as follows: 1. Build the environment and prepare the tools. 2. Migrate full data. The process is as follows: 1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling. - 2. Import data from Amazon S3 to TiDB Cloud Premium. + 2. Import data from Amazon S3 to {{{ .premium }}}. 3. Replicate incremental data by using TiCDC. 4. Verify the migrated data. ## Prerequisites -It is recommended that you put the S3 bucket and the TiDB Cloud Premium cluster in the same region. Cross-region migration might incur additional cost for data conversion. +It is recommended that you put the S3 bucket and the {{{ .premium }}} cluster in the same region. Cross-region migration might incur additional cost for data conversion. 
Before migration, you need to prepare the following: - An [AWS account](https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-up-s3.html#sign-up-for-aws-gsg) with administrator access - An [AWS S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html) -- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least the [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target TiDB Cloud Premium cluster hosted on AWS +- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least the [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target {{{ .premium }}} cluster hosted on AWS ## Prepare tools @@ -45,7 +45,7 @@ You need to prepare the following tools: Before you deploy Dumpling, note the following: -- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target TiDB Cloud Premium cluster. +- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target {{{ .premium }}} cluster. - The recommended EC2 instance type is **c6g.4xlarge** (16 vCPU and 32 GiB memory). You can choose other EC2 instance types based on your needs. The Amazon Machine Image (AMI) can be Amazon Linux, Ubuntu, or Red Hat. You can deploy Dumpling by using TiUP or using the installation package. @@ -83,7 +83,7 @@ You need the following privileges to export data from the upstream database: ### Deploy TiCDC -You need to [deploy TiCDC](https://docs.pingcap.com/tidb/dev/deploy-ticdc) to replicate incremental data from the upstream TiDB cluster to TiDB Cloud Premium. +You need to [deploy TiCDC](https://docs.pingcap.com/tidb/dev/deploy-ticdc) to replicate incremental data from the upstream TiDB cluster to {{{ .premium }}}. 1. Confirm whether the current TiDB version supports TiCDC. TiDB v4.0.8.rc.1 and later versions support TiCDC. You can check the TiDB version by executing `select tidb_version();` in the TiDB cluster. If you need to upgrade it, see [Upgrade TiDB Using TiUP](https://docs.pingcap.com/tidb/dev/deploy-ticdc#upgrade-ticdc-using-tiup). @@ -108,10 +108,10 @@ You need to [deploy TiCDC](https://docs.pingcap.com/tidb/dev/deploy-ticdc) to re ## Migrate full data -To migrate data from the TiDB Self-Managed cluster to TiDB Cloud Premium, perform a full data migration as follows: +To migrate data from the TiDB Self-Managed cluster to {{{ .premium }}}, perform a full data migration as follows: 1. Migrate data from the TiDB Self-Managed cluster to Amazon S3. -2. Migrate data from Amazon S3 to TiDB Cloud Premium. +2. Migrate data from Amazon S3 to {{{ .premium }}}. ### Migrate data from the TiDB Self-Managed cluster to Amazon S3 @@ -203,11 +203,11 @@ Do the following to export data from the upstream TiDB cluster to Amazon S3 usin - `{schema}.{table}.{0001}.{sql|csv}`: data files - `*-schema-view.sql`, `*-schema-trigger.sql`, `*-schema-post.sql`: other exported SQL files -### Migrate data from Amazon S3 to TiDB Cloud Premium +### Migrate data from Amazon S3 to {{{ .premium }}} -After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to TiDB Cloud Premium. +After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to {{{ .premium }}}. -1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your TiDB Cloud Premium cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. 
Make a note of the **Account ID** and **External ID** displayed in the wizard—these values are embedded in the CloudFormation template. +1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your {{{ .premium }}} cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Make a note of the **Account ID** and **External ID** displayed in the wizard—these values are embedded in the CloudFormation template. 2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create a new one with AWS CloudFormation** and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). @@ -215,7 +215,7 @@ After you export data from the TiDB Self-Managed cluster to Amazon S3, you need - Provide a role name, review the permissions, and acknowledge the IAM warning. - Create the stack and wait for the status to change to **CREATE_COMPLETE**. - On the **Outputs** tab, copy the newly generated Role ARN. - - Return to TiDB Cloud Premium, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. + - Return to {{{ .premium }}}, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. 3. Continue with the remaining steps in the import wizard, using the saved Role ARN when prompted. @@ -264,11 +264,11 @@ If your organization cannot deploy CloudFormation stacks, create the access poli } ``` -2. Create an IAM role that trusts TiDB Cloud Premium by supplying the **Account ID** and **External ID** noted in Step 1. Attach the policy from the previous step to this role. +2. Create an IAM role that trusts {{{ .premium }}} by supplying the **Account ID** and **External ID** noted in Step 1. Attach the policy from the previous step to this role. -3. Copy the resulting Role ARN and enter it in the TiDB Cloud Premium import wizard. +3. Copy the resulting Role ARN and enter it in the {{{ .premium }}} import wizard. -4. Import data to TiDB Cloud Premium by following [Import data from Amazon S3 into TiDB Cloud Premium](/tidb-cloud/premium/import-from-s3-premium.md). +4. Import data to {{{ .premium }}} by following [Import data from Amazon S3 into {{{ .premium }}}](/tidb-cloud/premium/import-from-s3-premium.md). ## Replicate incremental data @@ -278,16 +278,16 @@ To replicate incremental data, do the following: ![Start Time in Metadata](/media/tidb-cloud/start_ts_in_metadata.png) -2. Grant TiCDC to connect to TiDB Cloud Premium. +2. Grant TiCDC to connect to {{{ .premium }}}. - 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB Cloud Premium cluster to go to its overview page. + 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target {{{ .premium }}} cluster to go to its overview page. 2. In the left navigation pane, click **Settings** > **Networking**. 3. On the **Networking** page, click **Add IP Address**. - 4. In the displayed dialog, select **Use IP addresses**, click **+**, fill in the public IP address of the TiCDC component in the **IP Address** field, and then click **Confirm**. Now TiCDC can access TiDB Cloud Premium. For more information, see [Configure an IP Access List](/tidb-cloud/configure-ip-access-list.md). + 4. 
In the displayed dialog, select **Use IP addresses**, click **+**, fill in the public IP address of the TiCDC component in the **IP Address** field, and then click **Confirm**. Now TiCDC can access {{{ .premium }}}. For more information, see [Configure an IP Access List](/tidb-cloud/configure-ip-access-list.md). -3. Get the connection information of the downstream TiDB Cloud Premium cluster. +3. Get the connection information of the downstream {{{ .premium }}} cluster. - 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB Cloud Premium cluster to go to its overview page. + 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target {{{ .premium }}} cluster to go to its overview page. 2. Click **Connect** in the upper-right corner. 3. In the connection dialog, select **Public** from the **Connection Type** drop-down list and select **General** from the **Connect With** drop-down list. 4. From the connection information, you can get the host IP address and port of the cluster. For more information, see [Connect via public connection](/tidb-cloud/connect-via-standard-connection.md). @@ -345,9 +345,9 @@ To replicate incremental data, do the following: ![Update Filter](/media/tidb-cloud/normal_status_in_replication_task.png) - - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream TiDB Cloud Premium cluster. + - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream {{{ .premium }}} cluster. -7. Set the same timezone for the upstream and downstream clusters. By default, TiDB Cloud Premium sets the timezone to UTC. If the timezone is different between the upstream and downstream clusters, you need to set the same timezone for both clusters. +7. Set the same timezone for the upstream and downstream clusters. By default, {{{ .premium }}} sets the timezone to UTC. If the timezone is different between the upstream and downstream clusters, you need to set the same timezone for both clusters. 1. In the upstream cluster, run the following command to check the timezone: From c47d4f9fb3ed86b4c07c750c0b81f2237f0c9dad Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 15:13:22 +0800 Subject: [PATCH 09/14] Update migrating from self-managed to premium Revised terminology to consistently use 'instance' instead of 'cluster' for TiDB Cloud Premium. Improved step-by-step instructions for data import and incremental replication, clarified IAM role creation, and enhanced overall clarity and accuracy throughout the migration documentation. --- .../premium/migrate-from-op-tidb-premium.md | 65 ++++++++++--------- 1 file changed, 35 insertions(+), 30 deletions(-) diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md index 149a2cc936ae7..ec3a6dab98ea4 100644 --- a/tidb-cloud/premium/migrate-from-op-tidb-premium.md +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -5,11 +5,11 @@ summary: Learn how to migrate data from TiDB Self-Managed to {{{ .premium }}}. # Migrate from TiDB Self-Managed to {{{ .premium }}} -This document describes how to migrate data from your TiDB Self-Managed clusters to {{{ .premium }}} (on AWS) through Dumpling and TiCDC. 
+This document describes how to migrate data from your TiDB Self-Managed clusters to {{{ .premium }}} (on AWS) instances using Dumpling and TiCDC. > **Warning:** > -> {{{ .premium }}} is currently available in **Private Preview** in select AWS regions. +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. > > If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. @@ -19,18 +19,18 @@ The overall procedure is as follows: 2. Migrate full data. The process is as follows: 1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling. 2. Import data from Amazon S3 to {{{ .premium }}}. -3. Replicate incremental data by using TiCDC. +3. Replicate incremental data using TiCDC. 4. Verify the migrated data. ## Prerequisites -It is recommended that you put the S3 bucket and the {{{ .premium }}} cluster in the same region. Cross-region migration might incur additional cost for data conversion. +It is recommended that you put the S3 bucket and the {{{ .premium }}} instance in the same region. Cross-region migration might incur additional cost for data conversion. Before migration, you need to prepare the following: - An [AWS account](https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-up-s3.html#sign-up-for-aws-gsg) with administrator access - An [AWS S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html) -- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least the [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target {{{ .premium }}} cluster hosted on AWS +- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least the [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target {{{ .premium }}} instance hosted on AWS ## Prepare tools @@ -45,7 +45,7 @@ You need to prepare the following tools: Before you deploy Dumpling, note the following: -- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target {{{ .premium }}} cluster. +- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target TiDB instance. - The recommended EC2 instance type is **c6g.4xlarge** (16 vCPU and 32 GiB memory). You can choose other EC2 instance types based on your needs. The Amazon Machine Image (AMI) can be Amazon Linux, Ubuntu, or Red Hat. You can deploy Dumpling by using TiUP or using the installation package. @@ -207,17 +207,22 @@ Do the following to export data from the upstream TiDB cluster to Amazon S3 usin After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to {{{ .premium }}}. -1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to your {{{ .premium }}} cluster, click **Data → Import**, and choose **Import data from Cloud Storage** > **Amazon S3**. Make a note of the **Account ID** and **External ID** displayed in the wizard—these values are embedded in the CloudFormation template. +1. In the [TiDB Cloud console](https://tidbcloud.com/), get the Account ID and External ID of your target TiDB instance. -2. 
In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create a new one with AWS CloudFormation** and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). + 1. Navigate to the **TiDB Instances** page, and click the name of your target instance. + 2. In the left navigation pane, click **Data** > **Import**. + 3. Choose **Import data from Cloud Storage** > **Amazon S3**. + 4. Note down the **Account ID** and **External ID** displayed in the wizard. These values are embedded in the CloudFormation template. - - Open the pre-filled CloudFormation template in the AWS console. - - Provide a role name, review the permissions, and acknowledge the IAM warning. - - Create the stack and wait for the status to change to **CREATE_COMPLETE**. - - On the **Outputs** tab, copy the newly generated Role ARN. - - Return to {{{ .premium }}}, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. +2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create a new one with AWS CloudFormation**, and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). -3. Continue with the remaining steps in the import wizard, using the saved Role ARN when prompted. + 1. Open the pre-filled CloudFormation template in the AWS console. + 2. Provide a role name, review the permissions, and acknowledge the IAM warning. + 3. Create the stack and wait for the status to change to **CREATE_COMPLETE**. + 4. On the **Outputs** tab, copy the newly generated Role ARN. + 5. Return to {{{ .premium }}}, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. + +3. Continue with the remaining steps in the import wizard, and use the saved Role ARN when prompted. #### Manually create the IAM role (optional) @@ -231,7 +236,7 @@ If your organization cannot deploy CloudFormation stacks, create the access poli - `s3:GetBucketLocation` - `kms:Decrypt` (only when SSE-KMS encryption is enabled) - The JSON template below shows the required structure. Replace the placeholders with your bucket path, bucket ARN, and (if needed) KMS key ARN. + The following JSON template shows the required structure. Replace the placeholders with your bucket path, bucket ARN, and KMS key ARN (if needed). ```json { @@ -264,7 +269,7 @@ If your organization cannot deploy CloudFormation stacks, create the access poli } ``` -2. Create an IAM role that trusts {{{ .premium }}} by supplying the **Account ID** and **External ID** noted in Step 1. Attach the policy from the previous step to this role. +2. Create an IAM role that trusts {{{ .premium }}} by providing the **Account ID** and **External ID** you have noted down earlier. Then, attach the policy created in the previous step to this role. 3. Copy the resulting Role ARN and enter it in the {{{ .premium }}} import wizard. @@ -280,17 +285,17 @@ To replicate incremental data, do the following: 2. Grant TiCDC to connect to {{{ .premium }}}. - 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target {{{ .premium }}} cluster to go to its overview page. + 1. 
In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB instance to go to its overview page. 2. In the left navigation pane, click **Settings** > **Networking**. 3. On the **Networking** page, click **Add IP Address**. 4. In the displayed dialog, select **Use IP addresses**, click **+**, fill in the public IP address of the TiCDC component in the **IP Address** field, and then click **Confirm**. Now TiCDC can access {{{ .premium }}}. For more information, see [Configure an IP Access List](/tidb-cloud/configure-ip-access-list.md). -3. Get the connection information of the downstream {{{ .premium }}} cluster. +3. Get the connection information of the downstream {{{ .premium }}} instance. - 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target {{{ .premium }}} cluster to go to its overview page. + 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB instance to go to its overview page. 2. Click **Connect** in the upper-right corner. 3. In the connection dialog, select **Public** from the **Connection Type** drop-down list and select **General** from the **Connect With** drop-down list. - 4. From the connection information, you can get the host IP address and port of the cluster. For more information, see [Connect via public connection](/tidb-cloud/connect-via-standard-connection.md). + 4. From the connection information, you can get the host IP address and port of the instance. For more information, see [Connect via public connection](/tidb-cloud/connect-via-standard-connection.md). 4. Create and run the incremental replication task. In the upstream cluster, run the following: @@ -345,9 +350,9 @@ To replicate incremental data, do the following: ![Update Filter](/media/tidb-cloud/normal_status_in_replication_task.png) - - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream {{{ .premium }}} cluster. + - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream {{{ .premium }}} instance. -7. Set the same timezone for the upstream and downstream clusters. By default, {{{ .premium }}} sets the timezone to UTC. If the timezone is different between the upstream and downstream clusters, you need to set the same timezone for both clusters. +7. Set the same timezone for the upstream cluster and downstream instance. By default, {{{ .premium }}} sets the timezone to UTC. If the timezone is different between the upstream cluster and downstream instance, you need to set the same timezone for both. 1. In the upstream cluster, run the following command to check the timezone: @@ -355,7 +360,7 @@ To replicate incremental data, do the following: SELECT @@global.time_zone; ``` - 2. In the downstream cluster, run the following command to set the timezone: + 2. In the downstream instance, run the following command to set the timezone: ```sql SET GLOBAL time_zone = '+08:00'; @@ -367,7 +372,7 @@ To replicate incremental data, do the following: SELECT @@global.time_zone; ``` -8. Back up the [query bindings](/sql-plan-management.md) in the upstream cluster and restore them in the downstream cluster. 
You can use the following query to back up the query bindings: +8. Back up the [query bindings](/sql-plan-management.md) in the upstream cluster and restore them in the downstream instance. You can use the following query to back up the query bindings: ```sql SELECT DISTINCT(CONCAT('CREATE GLOBAL BINDING FOR ', original_sql,' USING ', bind_sql,';')) FROM mysql.bind_info WHERE status='enabled'; @@ -375,9 +380,9 @@ To replicate incremental data, do the following: If you do not get any output, it means that no query bindings are used in the upstream cluster. In this case, you can skip this step. - After you get the query bindings, run them in the downstream cluster to restore the query bindings. + After you get the query bindings, run them in the downstream instance to restore the query bindings. -9. Back up the user and privilege information in the upstream cluster and restore them in the downstream cluster. You can use the following script to back up the user and privilege information. Note that you need to replace the placeholders with the actual values. +9. Back up the user and privilege information in the upstream cluster and restore them in the downstream instance. You can use the following script to back up the user and privilege information. Note that you need to replace the placeholders with the actual values. ```shell #!/bin/bash @@ -387,7 +392,7 @@ To replicate incremental data, do the following: export MYSQL_USER=root export MYSQL_PWD={root_password} export MYSQL="mysql -u${MYSQL_USER} --default-character-set=utf8mb4" - + function backup_user_priv(){ ret=0 sql="SELECT CONCAT(user,':',host,':',authentication_string) FROM mysql.user WHERE user NOT IN ('root')" @@ -402,8 +407,8 @@ To replicate incremental data, do the following: done return $ret } - + backup_user_priv ``` - - After you get the user and privilege information, run the generated SQL statements in the downstream cluster to restore the user and privilege information. + + After you get the user and privilege information, run the generated SQL statements in the downstream TiDB instance to restore the user and privilege information. 
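As a minimal sketch of that final restore step, assume the script above is saved as `backup_user_priv.sh` (the file name, host, and port below are illustrative, not part of the original guide) and that it prints the generated SQL statements to standard output, as the surrounding text suggests. You can then capture the statements into a file and replay them on the downstream TiDB instance:

```shell
# On a host that can reach the upstream cluster: capture the generated statements.
./backup_user_priv.sh > user_grants.sql

# Replay the captured statements against the downstream TiDB instance.
mysql -u root -h <downstream_tidb_host> -P 4000 -p < user_grants.sql
```

Afterwards, running `SHOW GRANTS FOR '<user>'@'<host>';` on the downstream instance is a quick way to confirm that the restored privileges match the upstream cluster.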
From 21bd3f4e36ad075d375b57764d84293c2aff237d Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 15:42:16 +0800 Subject: [PATCH 10/14] Update importing csv to premium --- TOC-tidb-cloud-premium.md | 2 +- tidb-cloud/import-csv-files-serverless.md | 6 +- tidb-cloud/import-csv-files.md | 6 +- .../premium/import-csv-files-premium.md | 87 ++++++++++--------- 4 files changed, 52 insertions(+), 49 deletions(-) diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index 541efb87f7fab..be2d580fabe3e 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -209,7 +209,7 @@ - [Migrate from Amazon RDS for Oracle Using AWS DMS](/tidb-cloud/migrate-from-oracle-using-aws-dms.md) - Import Data into TiDB Cloud - [Import Sample Data (SQL Files) from Cloud Storage](/tidb-cloud/import-sample-data-serverless.md) - - [Import CSV Files from Cloud Storage](/tidb-cloud/import-csv-files-serverless.md) + - [Import CSV Files from Cloud Storage](/tidb-cloud/premium/import-csv-files-premium.md) - [Import Parquet Files from Cloud Storage](/tidb-cloud/import-parquet-files-serverless.md) - [Import Snapshot Files from Cloud Storage](/tidb-cloud/import-snapshot-files-serverless.md) - [Import with MySQL CLI](/tidb-cloud/import-with-mysql-cli-serverless.md) diff --git a/tidb-cloud/import-csv-files-serverless.md b/tidb-cloud/import-csv-files-serverless.md index 8b8b55abdbc94..0a8c4f0212dcd 100644 --- a/tidb-cloud/import-csv-files-serverless.md +++ b/tidb-cloud/import-csv-files-serverless.md @@ -13,13 +13,13 @@ This document describes how to import CSV files from Amazon Simple Storage Servi ## Limitations -- To ensure data consistency, TiDB Cloud allows to import CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. +- To ensure data consistency, TiDB Cloud allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size around 256 MB. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size around 256 MiB. - TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. + TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. 2. Name the CSV files as follows: diff --git a/tidb-cloud/import-csv-files.md b/tidb-cloud/import-csv-files.md index c72e8a89b2549..a68eac4546532 100644 --- a/tidb-cloud/import-csv-files.md +++ b/tidb-cloud/import-csv-files.md @@ -10,15 +10,15 @@ This document describes how to import CSV files from Amazon Simple Storage Servi ## Limitations -- To ensure data consistency, TiDB Cloud allows to import CSV files into empty tables only. 
To import data into an existing table that already contains data, you can use TiDB Cloud to import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. +- To ensure data consistency, TiDB Cloud allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can use TiDB Cloud to import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. - If a TiDB Cloud Dedicated cluster has a [changefeed](/tidb-cloud/changefeed-overview.md) or has [Point-in-time Restore](/tidb-cloud/backup-and-restore.md#turn-on-point-in-time-restore) enabled, you cannot import data to the cluster (the **Import Data** button will be disabled) because the current data import feature uses the [physical import mode](https://docs.pingcap.com/tidb/stable/tidb-lightning-physical-import-mode). In this mode, the imported data does not generate change logs, so the changefeed and Point-in-time Restore cannot detect the imported data. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size of around 256 MB. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size of around 256 MiB. - TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. + TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. 2. Name the CSV files as follows: diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md index c10798e69641e..a10c0a668f532 100644 --- a/tidb-cloud/premium/import-csv-files-premium.md +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -1,37 +1,38 @@ --- -title: Import CSV files into TiDB Cloud Premium -summary: Learn how to import CSV files from Amazon S3 or Alibaba Cloud Object Storage Service (OSS) into TiDB Cloud Premium instances. +title: Import CSV Files from Cloud Storage into {{{ .premium }}} +summary: Learn how to import CSV files from Amazon S3 or Alibaba Cloud Object Storage Service (OSS) into {{{ .premium }}} instances. --- +# Import CSV Files from Cloud Storage into {{{ .premium }}} + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) or Alibaba Cloud Object Storage Service (OSS) into {{{ .premium }}} instances. + > **Warning:** > -> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. > -> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. 
- -# Import CSV files into TiDB Cloud Premium +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. -This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) or Alibaba Cloud Object Storage Service (OSS) into TiDB Cloud Premium instances. - -> **Note:** +> **Tip:** > -> For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). +> - For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). +> - For TiDB Cloud Dedicated, see [Import CSV Files from Cloud Storage into TiDB Cloud Dedicated](/tidb-cloud/import-csv-files.md). ## Limitations -- To ensure data consistency, TiDB Cloud Premium allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. +To ensure data consistency, {{{ .premium }}} allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size around 256 MB. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size around 256 MiB. - TiDB Cloud Premium supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud Premium can process multiple files in parallel, which can greatly improve the import speed. + {{{ .premium }}} supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because {{{ .premium }}} can process multiple files in parallel, which can greatly improve the import speed. 2. Name the CSV files as follows: - If a CSV file contains all data of an entire table, name the file in the `${db_name}.${table_name}.csv` format, which maps to the `${db_name}.${table_name}` table when you import the data. - If the data of one table is separated into multiple CSV files, append a numeric suffix to these CSV files. For example, `${db_name}.${table_name}.000001.csv` and `${db_name}.${table_name}.000002.csv`. The numeric suffixes can be non-consecutive but must be in ascending order. You also need to add extra zeros before the number to ensure that all suffixes have the same length. - - TiDB Cloud Premium supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst` and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, where `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`. 
+ - {{{ .premium }}} supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst` and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, where `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`. > **Note:** > @@ -41,9 +42,9 @@ This document describes how to import CSV files from Amazon Simple Storage Servi ## Step 2. Create the target table schemas -Because CSV files do not contain schema information, before importing data from CSV files into TiDB Cloud Premium, you need to create the table schemas using either of the following methods: +Because CSV files do not contain schema information, before importing data from CSV files into {{{ .premium }}}, you need to create the table schemas using either of the following methods: -- Method 1: In TiDB Cloud Premium, create the target databases and tables for your source data. +- Method 1: In {{{ .premium }}}, create the target databases and tables for your source data. - Method 2: In the Amazon S3 or Alibaba Cloud Object Storage Service (OSS) directory where the CSV files are located, create the target table schema files for your source data as follows: @@ -51,9 +52,9 @@ Because CSV files do not contain schema information, before importing data from If your CSV files follow the naming rules in [Step 1](#step-1-prepare-the-csv-files), the database schema files are optional for the data import. Otherwise, the database schema files are mandatory. - Each database schema file must be in the `${db_name}-schema-create.sql` format and contain a `CREATE DATABASE` DDL statement. With this file, TiDB Cloud Premium will create the `${db_name}` database to store your data when you import the data. + Each database schema file must be in the `${db_name}-schema-create.sql` format and contain a `CREATE DATABASE` DDL statement. With this file, {{{ .premium }}} will create the `${db_name}` database to store your data when you import the data. - For example, if you create a `mydb-schema-create.sql` file that contains the following statement, TiDB Cloud Premium will create the `mydb` database when you import the data. + For example, if you create a `mydb-schema-create.sql` file that contains the following statement, {{{ .premium }}} will create the `mydb` database when you import the data. ```sql CREATE DATABASE mydb; @@ -61,11 +62,11 @@ Because CSV files do not contain schema information, before importing data from 2. Create table schema files for your source data. - If you do not include the table schema files in the Amazon S3 or Alibaba Cloud Object Storage Service directory where the CSV files are located, TiDB Cloud Premium will not create the corresponding tables for you when you import the data. + If you do not include the table schema files in the Amazon S3 or Alibaba Cloud Object Storage Service directory where the CSV files are located, {{{ .premium }}} will not create the corresponding tables for you when you import the data. - Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. With this file, TiDB Cloud Premium will create the `${table_name}` table in the `${db_name}` database when you import the data. 
+ Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. With this file, {{{ .premium }}} will create the `${table_name}` table in the `${db_name}` database when you import the data. - For example, if you create a `mydb.mytable-schema.sql` file that contains the following statement, TiDB Cloud Premium will create the `mytable` table in the `mydb` database when you import the data. + For example, if you create a `mydb.mytable-schema.sql` file that contains the following statement, {{{ .premium }}} will create the `mytable` table in the `mydb` database when you import the data. ```sql CREATE TABLE mytable ( @@ -80,30 +81,30 @@ Because CSV files do not contain schema information, before importing data from ## Step 3. Configure cross-account access -To allow TiDB Cloud Premium to access the CSV files in Amazon S3 or Alibaba Cloud Object Storage Service (OSS), do one of the following: +To allow {{{ .premium }}} to access the CSV files in Amazon S3 or Alibaba Cloud Object Storage Service (OSS), do one of the following: -- If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access) for your cluster. +- If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access) for your TiDB instance. You can use either an AWS access key or a Role ARN to access your bucket. Once finished, make a note of the access key (including the access key ID and secret access key) or the Role ARN value as you will need it in [Step 4](#step-4-import-csv-files). -- If your CSV files are located in Alibaba Cloud Object Storage Service (OSS), [configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access) for your cluster. +- If your CSV files are located in Alibaba Cloud Object Storage Service (OSS), [configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access) for your TiDB instance. ## Step 4. Import CSV files -To import the CSV files to TiDB Cloud Premium, take the following steps: +To import the CSV files to {{{ .premium }}}, take the following steps:
-1. Open the **Import** page for your target cluster. +1. Open the **Import** page for your target TiDB instance. - 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page of your project. + 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. > **Tip:** > > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. - 2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. 2. Click **Import data from Cloud Storage**. @@ -114,25 +115,26 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - When importing one file, enter the source file URI in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`. - When importing multiple files, enter the source folder URI in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`. - **Credential**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access). - - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create a new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into TiDB Cloud Premium. + - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create a new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into {{{ .premium }}}. - **AWS Access Key**: enter the AWS access key ID and AWS secret access key. - - **Test Bucket Access**: click this button after the credentials are in place to confirm that TiDB Cloud Premium can reach the bucket. - - **Target Connection**: supply the TiDB username and password that will run the import. Optionally click **Test Connection** to validate the credentials. + - **Test Bucket Access**: click this button after the credentials are in place to confirm that {{{ .premium }}} can reach the bucket. + - **Target Connection**: provide the TiDB username and password that will run the import. Optionally, click **Test Connection** to validate the credentials. 4. Click **Next**. -5. In the **Source Files Mapping** section, TiDB Cloud Premium scans the bucket and proposes mappings between the source files and destination tables. +5. In the **Source Files Mapping** section, {{{ .premium }}} scans the bucket and proposes mappings between the source files and destination tables. When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. 
> **Note:** > - > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and {{{ .premium }}} automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. Keep **CSV** selected as the data format. - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. + > **Note:** > > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: @@ -140,7 +142,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). > - **Target Database** and **Target Table**: choose the destination objects for the matched files. -6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. +6. {{{ .premium }}} automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. 7. When the import progress shows **Completed**, check the imported tables. @@ -148,7 +150,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps:
-1. Open the **Import** page for your target cluster. +1. Open the **Import** page for your target TiDB instance. 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page of your project. @@ -156,7 +158,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: > > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. - 2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. 2. Click **Import data from Cloud Storage**. @@ -167,23 +169,24 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: - When importing one file, enter the source file URI in the following format `oss://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `oss://sampledata/ingest/TableName.01.csv`. - When importing multiple files, enter the source folder URI in the following format `oss://[bucket_name]/[data_source_folder]/`. For example, `oss://sampledata/ingest/`. - **Credential**: you can use an AccessKey pair to access your bucket. For more information, see [Configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access). - - **Test Bucket Access**: click this button after the credentials are in place to confirm that TiDB Cloud Premium can reach the bucket. - - **Target Connection**: supply the TiDB username and password that will run the import. Optionally click **Test Connection** to validate the credentials. + - **Test Bucket Access**: click this button after the credentials are in place to confirm that {{{ .premium }}} can reach the bucket. + - **Target Connection**: provide the TiDB username and password that will run the import. Optionally, click **Test Connection** to validate the credentials. 4. Click **Next**. -5. In the **Source Files Mapping** section, TiDB Cloud Premium scans the bucket and proposes mappings between the source files and destination tables. +5. In the **Source Files Mapping** section, {{{ .premium }}} scans the bucket and proposes mappings between the source files and destination tables. When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. > **Note:** > - > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and TiDB Cloud Premium automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and {{{ .premium }}} automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. 
Keep **CSV** selected as the data format. - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. + > **Note:** > > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: @@ -191,7 +194,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). > - **Target Database** and **Target Table**: choose the destination objects for the matched files. -6. TiDB Cloud Premium automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. +6. {{{ .premium }}} automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. 7. When the import progress shows **Completed**, check the imported tables. @@ -199,7 +202,7 @@ To import the CSV files to TiDB Cloud Premium, take the following steps: -When you run an import task, if any unsupported or invalid conversions are detected, TiDB Cloud Premium terminates the import job automatically and reports an importing error. +When you run an import task, if any unsupported or invalid conversions are detected, {{{ .premium }}} terminates the import job automatically and reports an importing error. If you get an importing error, do the following: From 90e3922058095b472a5b501e649110f23289eac5 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 15:45:04 +0800 Subject: [PATCH 11/14] Update wording for consistency --- tidb-cloud/migrate-from-op-tidb.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tidb-cloud/migrate-from-op-tidb.md b/tidb-cloud/migrate-from-op-tidb.md index 68d8a3d6619d0..a104ed292ca21 100644 --- a/tidb-cloud/migrate-from-op-tidb.md +++ b/tidb-cloud/migrate-from-op-tidb.md @@ -5,7 +5,7 @@ summary: Learn how to migrate data from TiDB Self-Managed to TiDB Cloud. # Migrate from TiDB Self-Managed to TiDB Cloud -This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud (AWS) through Dumpling and TiCDC. +This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud (on AWS) through Dumpling and TiCDC. The overall procedure is as follows: @@ -13,7 +13,7 @@ The overall procedure is as follows: 2. Migrate full data. The process is as follows: 1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling. 2. Import data from Amazon S3 to TiDB Cloud. -3. Replicate incremental data by using TiCDC. +3. Replicate incremental data using TiCDC. 4. Verify the migrated data. 
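+
+As a quick spot-check for step 4, you can run the same statements on the source and target clusters and compare the output. This is only a minimal sketch; the `mydb.mytable` table and its `id` column are hypothetical placeholders for your own schema:
+
+```sql
+-- Run on both clusters and compare: row counts plus a deterministic sample.
+SELECT COUNT(*) FROM mydb.mytable;
+SELECT * FROM mydb.mytable ORDER BY id LIMIT 10;
+```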
## Prerequisites From 85e8193b082e3eac91eeb4ef57b5f5cb939c4b29 Mon Sep 17 00:00:00 2001 From: Lilian Lee Date: Tue, 28 Oct 2025 15:53:39 +0800 Subject: [PATCH 12/14] Remove wording about projects --- tidb-cloud/premium/import-csv-files-premium.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md index a10c0a668f532..c9f270734c753 100644 --- a/tidb-cloud/premium/import-csv-files-premium.md +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -102,7 +102,7 @@ To import the CSV files to {{{ .premium }}}, take the following steps: > **Tip:** > - > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + > You can use the combo box in the upper-left corner to switch between organizations and instances. 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. @@ -156,7 +156,7 @@ To import the CSV files to {{{ .premium }}}, take the following steps: > **Tip:** > - > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + > You can use the combo box in the upper-left corner to switch between organizations and instances. 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. From 1f646e3b19808a236c40451ab7223b069ff40f2b Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 17:03:59 +0800 Subject: [PATCH 13/14] Update importing using MySQL Command-Line Client --- TOC-tidb-cloud-premium.md | 2 +- .../import-with-mysql-cli-serverless.md | 2 +- tidb-cloud/import-with-mysql-cli.md | 2 +- .../premium/import-csv-files-premium.md | 4 +- ...um.md => import-with-mysql-cli-premium.md} | 52 ++++++++++--------- 5 files changed, 33 insertions(+), 29 deletions(-) rename tidb-cloud/premium/{import-from-mysql-premium.md => import-with-mysql-cli-premium.md} (62%) diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index be2d580fabe3e..2842946aa8cd2 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -212,7 +212,7 @@ - [Import CSV Files from Cloud Storage](/tidb-cloud/premium/import-csv-files-premium.md) - [Import Parquet Files from Cloud Storage](/tidb-cloud/import-parquet-files-serverless.md) - [Import Snapshot Files from Cloud Storage](/tidb-cloud/import-snapshot-files-serverless.md) - - [Import with MySQL CLI](/tidb-cloud/import-with-mysql-cli-serverless.md) + - [Import Data Using MySQL CLI](/tidb-cloud/premium/import-with-mysql-cli-premium.md) - Reference - [Configure External Storage Access for TiDB Cloud](/tidb-cloud/serverless-external-storage.md) - [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md) diff --git a/tidb-cloud/import-with-mysql-cli-serverless.md b/tidb-cloud/import-with-mysql-cli-serverless.md index 5de46158802d9..9d25b8416f6c3 100644 --- a/tidb-cloud/import-with-mysql-cli-serverless.md +++ b/tidb-cloud/import-with-mysql-cli-serverless.md @@ -53,7 +53,7 @@ INSERT INTO products (product_id, product_name, price) VALUES (3, 'Tablet', 299.99); ``` -## Step 3. Import data from a SQL or CSV file +## Step 3. Import data from an SQL or CSV file You can import data from an SQL file or a CSV file. The following sections provide step-by-step instructions for importing data from each type. 
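+
+As a rough illustration of the CSV path, a local file can be replayed through the client with a `LOAD DATA` statement. This is a sketch under stated assumptions: the file name `products.csv` is hypothetical, the `products` table is the one created in Step 2, and the client must be started with the `--local-infile=1` option:
+
+```sql
+-- Load a local CSV file into the products table, skipping its header row.
+LOAD DATA LOCAL INFILE 'products.csv'
+INTO TABLE products
+FIELDS TERMINATED BY ',' ENCLOSED BY '"'
+LINES TERMINATED BY '\n'
+IGNORE 1 LINES
+(product_id, product_name, price);
+```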
diff --git a/tidb-cloud/import-with-mysql-cli.md b/tidb-cloud/import-with-mysql-cli.md index a7e0057918f2a..22e3e800069e8 100644 --- a/tidb-cloud/import-with-mysql-cli.md +++ b/tidb-cloud/import-with-mysql-cli.md @@ -49,7 +49,7 @@ INSERT INTO products (product_id, product_name, price) VALUES (3, 'Tablet', 299.99); ``` -## Step 3. Import data from a SQL or CSV file +## Step 3. Import data from an SQL or CSV file You can import data from an SQL file or a CSV file. The following sections provide step-by-step instructions for importing data from each type. diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md index c9f270734c753..358157167ca4d 100644 --- a/tidb-cloud/premium/import-csv-files-premium.md +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -15,8 +15,8 @@ This document describes how to import CSV files from Amazon Simple Storage Servi > **Tip:** > -> - For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). -> - For TiDB Cloud Dedicated, see [Import CSV Files from Cloud Storage into TiDB Cloud Dedicated](/tidb-cloud/import-csv-files.md). +> - For {{{ .starter }}} or Essential, see [Import CSV Files from Cloud Storage into {{{ .starter }}} or Essential](/tidb-cloud/import-csv-files-serverless.md). +> - For {{{ .dedicated }}}, see [Import CSV Files from Cloud Storage into {{{ .dedicated }}}](/tidb-cloud/import-csv-files.md). ## Limitations diff --git a/tidb-cloud/premium/import-from-mysql-premium.md b/tidb-cloud/premium/import-with-mysql-cli-premium.md similarity index 62% rename from tidb-cloud/premium/import-from-mysql-premium.md rename to tidb-cloud/premium/import-with-mysql-cli-premium.md index df49c0126cd28..39185d496ad95 100644 --- a/tidb-cloud/premium/import-from-mysql-premium.md +++ b/tidb-cloud/premium/import-with-mysql-cli-premium.md @@ -1,46 +1,48 @@ --- -title: Import data into TiDB Cloud Premium via MySQL Command-Line Client -summary: Learn how to import small CSV or SQL files into TiDB Cloud Premium instances using the MySQL Command-Line Client (`mysql`). +title: Import Data into {{{ .premium }}} using the MySQL Command-Line Client +summary: Learn how to import small CSV or SQL files into {{{ .premium }}} instances using the MySQL Command-Line Client (`mysql`). --- -> **Warning:** -> -> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. -> -> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. +# Import Data into {{{ .premium }}} using the MySQL Command-Line Client -# Import data into TiDB Cloud Premium using the MySQL Command-Line Client +This document describes how to import data into {{{ .premium }}} using the [MySQL Command-Line Client](https://dev.mysql.com/doc/refman/8.0/en/mysql.html) (`mysql`). The following sections provide step-by-step instructions for importing data from SQL or CSV files. This process performs a logical import, where the MySQL Command-Line Client replays SQL statements from your local machine against TiDB Cloud. -This document describes how to import data into TiDB Cloud Premium using the [MySQL Command-Line Client](https://dev.mysql.com/doc/refman/8.0/en/mysql.html) (`mysql`). 
The following sections provide step-by-step instructions for importing data from SQL or CSV files. These steps use a logical import, meaning the MySQL Command-Line Client replays SQL statements against TiDB Cloud from your local machine. +> **Warning:** +> +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. > **Tip:** > -> Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) exports, see [Import CSV files into TiDB Cloud Premium](/tidb-cloud/premium/import-csv-files-premium.md). +> - Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) exports, see [Import CSV Files from Cloud Storage into {{{ .premium }}}](/tidb-cloud/premium/import-csv-files-premium.md). +> - For {{{ .starter }}} or Essential, see [Import Data into {{{ .starter }}} or Essential via MySQL CLI](/tidb-cloud/import-with-mysql-cli-serverless.md). +> - For {{{ .dedicated }}}, see [Import Data into {{{ .dedicated }}} via MySQL CLI](/tidb-cloud/import-with-mysql-cli.md). ## Prerequisites -Before you can import data to a TiDB Cloud Premium instance via the MySQL Command-Line Client, you need the following prerequisites: +Before you can import data to a {{{ .premium }}} instance via the MySQL Command-Line Client, you need the following prerequisites: -- You have access to your TiDB Cloud Premium instance. +- You have access to your {{{ .premium }}} instance. - Install the MySQL Command-Line Client (`mysql`) on your local computer. -## Step 1. Connect to your TiDB Cloud Premium instance +## Step 1. Connect to your {{{ .premium }}} instance -Connect to your TiDB instance via the MySQL Command-Line Client. If this is your first time, you will need to configure the network connection and generate the TiDB SQL `root` user password following the steps below. +Connect to your TiDB instance using the MySQL Command-Line Client. If this is your first time, perform the following steps to configure the network connection and generate the TiDB SQL `root` user password: -1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and, if applicable, click **Switch to Private Preview** in the lower-left corner to enter the Premium workspace. Then navigate to the [**TiDB Instances**](https://tidbcloud.com/project/instances) page and click the name of your target instance to go to its overview page. +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/project/instances) page. Then, click the name of your target instance to go to its overview page. 2. Click **Connect** in the upper-right corner. A connection dialog is displayed. -3. Ensure the configurations in the connection dialog match your operating environment. +3. Ensure that the configurations in the connection dialog match your operating environment. - **Connection Type** is set to `Public`. - **Connect With** is set to `MySQL CLI`. 
- **Operating System** matches your environment. > **Note:** - > - > Premium clusters ship with the public endpoint disabled by default. If you do not see the `Public` option, enable the public endpoint from the instance details page (in the **Network** tab), or ask an organization admin to enable it before proceeding. + > + > {{{ .premium }}} instances have the public endpoint disabled by default. If you do not see the `Public` option, enable the public endpoint on the instance details page (under the **Network** tab), or ask an organization admin to enable it before proceeding. 4. Click **Generate Password** to create a random password. If you have already configured a password, reuse that credential or rotate it before proceeding. @@ -61,9 +63,9 @@ CREATE TABLE products ( ); ``` -Run the schema file against your TiDB Cloud Premium instance so the database and table exist before you load data in the next step. +Run the schema file against your {{{ .premium }}} instance so the database and table exist before you load data in the next step. -## Step 3. Import data from a SQL or CSV file +## Step 3. Import data from an SQL or CSV file Use the MySQL Command-Line Client to load data into the schema you created in Step 2. Replace the placeholders with your own file paths, credentials, and dataset as needed, then follow the workflow that matches your source format. @@ -96,9 +98,11 @@ Do the following to import data from an SQL file: > > The sample schema creates a `test` database and the commands use `-D test`. Change both the schema file and the `-D` parameter if you plan to import into a different database. -> **Important:** -> -> The SQL user you authenticate with must have the required privileges (for example, `CREATE` and `INSERT`) to define tables and load data into the target database. + + +The SQL user you authenticate with must have the required privileges (for example, `CREATE` and `INSERT`) to define tables and load data into the target database. + +
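+
+For example, you can set up a dedicated import account with those privileges. This is only a sketch; the `import_user` name and password are hypothetical, and `test` is the database from the sample schema:
+
+```sql
+-- Create the import account and grant the minimum privileges needed to define tables and load data.
+CREATE USER 'import_user'@'%' IDENTIFIED BY 'choose_a_strong_password';
+GRANT CREATE, INSERT ON test.* TO 'import_user'@'%';
+```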
@@ -142,7 +146,7 @@ Do the following to import data from a CSV file: ## Step 4. Validate the imported data -After the import completes, run basic queries to confirm that the expected rows are present and the data looks correct. +After the import is complete, run basic queries to verify that the expected rows are present and the data is correct. Use the MySQL Command-Line Client to connect to the same database and run validation queries, such as counting rows and inspecting sample records: From f179d83a8e842d3485d5fdeb235468109e7ab0e5 Mon Sep 17 00:00:00 2001 From: lilin90 Date: Tue, 28 Oct 2025 18:28:23 +0800 Subject: [PATCH 14/14] Update importing from S3 --- TOC-tidb-cloud-premium.md | 1 + tidb-cloud/premium/import-from-s3-premium.md | 79 ++++++++++---------- 2 files changed, 41 insertions(+), 39 deletions(-) diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index 2842946aa8cd2..3364707897627 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -210,6 +210,7 @@ - Import Data into TiDB Cloud - [Import Sample Data (SQL Files) from Cloud Storage](/tidb-cloud/import-sample-data-serverless.md) - [Import CSV Files from Cloud Storage](/tidb-cloud/premium/import-csv-files-premium.md) + - [Import CSV Files from Amazon S3](/tidb-cloud/premium/import-from-s3-premium.md) - [Import Parquet Files from Cloud Storage](/tidb-cloud/import-parquet-files-serverless.md) - [Import Snapshot Files from Cloud Storage](/tidb-cloud/import-snapshot-files-serverless.md) - [Import Data Using MySQL CLI](/tidb-cloud/premium/import-with-mysql-cli-premium.md) diff --git a/tidb-cloud/premium/import-from-s3-premium.md b/tidb-cloud/premium/import-from-s3-premium.md index 9434a3829d34e..8d7e5f5a7f069 100644 --- a/tidb-cloud/premium/import-from-s3-premium.md +++ b/tidb-cloud/premium/import-from-s3-premium.md @@ -1,65 +1,66 @@ --- -title: Import data from Amazon S3 into TiDB Cloud Premium -summary: Learn how to import CSV files from Amazon S3 into TiDB Cloud Premium instances using the console wizard. +title: Import Data from Amazon S3 into {{{ .premium }}} +summary: Learn how to import CSV files from Amazon S3 into {{{ .premium }}} instances using the console wizard. --- +# Import Data from Amazon S3 into {{{ .premium }}} + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into {{{ .premium }}} instances. The steps reflect the current private preview user interface and serve as an initial framework for the upcoming public preview launch. + > **Warning:** > -> TiDB Cloud Premium is currently available in **Private Preview** in select AWS regions. +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. > -> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us form](https://www.pingcap.com/contact-us) on our website. - -# Import data from Amazon S3 into TiDB Cloud Premium - -This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into TiDB Cloud Premium instances. The steps mirror the current Private Preview user interface and are intended as an initial scaffold for the public preview launch. 
+> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. -> **Note:** +> **Tip:** > -> For TiDB Cloud Serverless or Essential, see [Import CSV files from cloud storage into TiDB Cloud](/tidb-cloud/import-csv-files-serverless.md). For TiDB Cloud Dedicated, see [Import CSV files from cloud storage into TiDB Cloud Dedicated](/tidb-cloud/import-csv-files.md). +> - For {{{ .starter }}} or Essential, see [Import CSV Files from Cloud Storage into {{{ .starter }}} or Essential](/tidb-cloud/import-csv-files-serverless.md). +> - For {{{ .dedicated }}}, see [Import CSV Files from Cloud Storage into {{{ .dedicated }}}](/tidb-cloud/import-csv-files.md). ## Limitations -- To ensure data consistency, TiDB Cloud Premium allows importing CSV files into empty tables only. If the target table already contains data, import into a staging table and then copy the rows with `INSERT ... SELECT`. -- During the preview, the UI only surfaces Amazon S3 as the storage provider. Support for additional providers will be tracked separately. -- Each import job maps one source pattern to a single destination table. +- To ensure data consistency, {{{ .premium }}} allows importing CSV files into empty tables only. If the target table already contains data, import into a staging table and then copy the rows using the `INSERT ... SELECT` statement. +- During the private preview, the user interface currently supports Amazon S3 as the only storage provider. Support for additional providers will be added in future releases. +- Each import job maps a single source pattern to one destination table. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files around 256 MB so TiDB Cloud Premium can process them in parallel. -2. Name the CSV files following Dumpling conventions: - - Full-table files use the `${db_name}.${table_name}.csv` format. - - Sharded files append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`. - - Compressed files use `${db_name}.${table_name}.${suffix}.csv.${compress}`. -3. Optional schema files (`${db_name}-schema-create.sql`, `${db_name}.${table_name}-schema.sql`) help TiDB Cloud Premium create databases and tables automatically. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files around 256 MiB so {{{ .premium }}} can process them in parallel. +2. Name your CSV files according to the Dumpling naming conventions: + - Full-table files: use the `${db_name}.${table_name}.csv` format. + - Sharded files: append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`. + - Compressed files: use the `${db_name}.${table_name}.${suffix}.csv.${compress}` format. +3. Optional schema files (`${db_name}-schema-create.sql`, `${db_name}.${table_name}-schema.sql`) help {{{ .premium }}} create databases and tables automatically. - + ## Step 2. Create target schemas (optional) -If you want TiDB Cloud Premium to create the databases and tables for you, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the objects manually in TiDB Cloud Premium before running the import. +If you want {{{ .premium }}} to create the databases and tables automatically, place the schema files generated by Dumpling in the same S3 directory. 
Otherwise, create the databases and tables manually in {{{ .premium }}} before running the import.

-## Step 3. Configure Amazon S3 access
+## Step 3. Configure access to Amazon S3

-To allow TiDB Cloud Premium to read your bucket:
+To allow {{{ .premium }}} to read your bucket, use either of the following methods:

-- Provide an AWS Role ARN that trusts TiDB Cloud and grants `s3:GetObject` and `s3:ListBucket` on the relevant paths, or
+- Provide an AWS Role ARN that trusts TiDB Cloud and grants the `s3:GetObject` and `s3:ListBucket` permissions on the relevant paths.
 - Provide an AWS access key (access key ID and secret access key) with equivalent permissions.

-The wizard includes a helper link labeled **Click here to create a new one with AWS CloudFormation**. Follow that link if you need TiDB Cloud Premium to pre-fill a CloudFormation stack that creates the role for you.
+The wizard includes a helper link labeled **Click here to create a new one with AWS CloudFormation**. Follow this link if you need {{{ .premium }}} to pre-fill a CloudFormation stack that creates the role for you.

 ## Step 4. Import CSV files from Amazon S3

-1. Open the Premium workspace in the TiDB Cloud console and select your instance.
-2. Go to **Data > Import** and click **Import data from Cloud Storage**.
-3. On the **Source Connection** tab:
-   - Set **Storage Provider** to **Amazon S3**.
-   - Enter the **Source Files URI** for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`).
-   - Choose **AWS Role ARN** or **AWS Access Key** and provide the credentials.
-   - Click **Test Bucket Access** to validate connectivity. Known preview issue: the button returns to the idle state without a success toast.
-4. Click **Next** and supply the TiDB SQL username and password for the import job. Optionally test the connection.
+1. In the [TiDB Cloud console](https://tidbcloud.com/), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your TiDB instance.
+2. In the left navigation pane, click **Data** > **Import**, and choose **Import data from Cloud Storage**.
+3. In the **Source Connection** dialog:
+   - Set **Storage Provider** to **Amazon S3**.
+   - Enter the **Source Files URI** for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`).
+   - Choose **AWS Role ARN** or **AWS Access Key** and provide the credentials.
+   - Click **Test Bucket Access** to validate connectivity.
+
+4. Click **Next** and provide the TiDB SQL username and password for the import job. Optionally, test the connection.
 5. Review the automatically generated source-to-target mapping. Disable automatic mapping if you need to define custom patterns and destination tables.
 6. Click **Next** to run the pre-check. Resolve any warnings about missing files or incompatible schemas.
 7. Click **Start Import** to launch the job group.
@@ -67,11 +68,11 @@ The wizard includes a helper link labeled **Click here to create a new one with

 ## Troubleshooting

-- If the pre-check finds zero files, confirm the S3 path and IAM permissions.
-- If jobs remain in **Preparing**, make sure the destination tables are empty and the required schema files exist.
+- If the pre-check reports zero files, verify the S3 path and IAM permissions.
+- If jobs remain in **Preparing**, ensure that the destination tables are empty and the required schema files exist.
 - Use the **Cancel** action to stop a job group if you need to adjust mappings or credentials. 
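+
+If an import is blocked because the destination table already holds rows, the staging-table workaround from the limitations above applies. A minimal sketch, assuming the hypothetical tables `mydb.trips_staging` (the empty import target) and `mydb.trips` (the existing table with the same schema):
+
+```sql
+-- Copy the imported rows into the existing table, then drop the staging table.
+INSERT INTO mydb.trips SELECT * FROM mydb.trips_staging;
+DROP TABLE mydb.trips_staging;
+```
+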
## Next steps -- [Import data into TiDB Cloud Premium via MySQL CLI](/tidb-cloud/premium/import-from-mysql-premium.md) for scripted imports. -- [Troubleshoot import access denied errors](/tidb-cloud/troubleshoot-import-access-denied-error.md) for IAM-related problems. +- See [Import Data into {{{ .premium }}} using the MySQL Command-Line Client](/tidb-cloud/premium/import-with-mysql-cli-premium.md) for scripted imports. +- See [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md) for IAM-related problems.