diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index b797bf126cbf5..3364707897627 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -204,15 +204,16 @@ - Migrate or Import Data - [Overview](/tidb-cloud/tidb-cloud-migration-overview.md) - Migrate Data into TiDB Cloud - - [Migrate from TiDB Self-Managed to TiDB Cloud](/tidb-cloud/migrate-from-op-tidb.md) + - [Migrate from TiDB Self-Managed to TiDB Cloud Premium](/tidb-cloud/premium/migrate-from-op-tidb-premium.md) - [Migrate and Merge MySQL Shards of Large Datasets](/tidb-cloud/migrate-sql-shards.md) - [Migrate from Amazon RDS for Oracle Using AWS DMS](/tidb-cloud/migrate-from-oracle-using-aws-dms.md) - Import Data into TiDB Cloud - [Import Sample Data (SQL Files) from Cloud Storage](/tidb-cloud/import-sample-data-serverless.md) - - [Import CSV Files from Cloud Storage](/tidb-cloud/import-csv-files-serverless.md) + - [Import CSV Files from Cloud Storage](/tidb-cloud/premium/import-csv-files-premium.md) + - [Import CSV Files from Amazon S3](/tidb-cloud/premium/import-from-s3-premium.md) - [Import Parquet Files from Cloud Storage](/tidb-cloud/import-parquet-files-serverless.md) - [Import Snapshot Files from Cloud Storage](/tidb-cloud/import-snapshot-files-serverless.md) - - [Import with MySQL CLI](/tidb-cloud/import-with-mysql-cli-serverless.md) + - [Import Data Using MySQL CLI](/tidb-cloud/premium/import-with-mysql-cli-premium.md) - Reference - [Configure External Storage Access for TiDB Cloud](/tidb-cloud/serverless-external-storage.md) - [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md) diff --git a/tidb-cloud/import-csv-files-serverless.md b/tidb-cloud/import-csv-files-serverless.md index 8b8b55abdbc94..0a8c4f0212dcd 100644 --- a/tidb-cloud/import-csv-files-serverless.md +++ b/tidb-cloud/import-csv-files-serverless.md @@ -13,13 +13,13 @@ This document describes how to import CSV files from Amazon Simple Storage Servi ## Limitations -- To ensure data consistency, TiDB Cloud allows to import CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. +- To ensure data consistency, TiDB Cloud allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size around 256 MB. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size around 256 MiB. - TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. + TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. 2. 
Name the CSV files as follows: diff --git a/tidb-cloud/import-csv-files.md b/tidb-cloud/import-csv-files.md index c72e8a89b2549..a68eac4546532 100644 --- a/tidb-cloud/import-csv-files.md +++ b/tidb-cloud/import-csv-files.md @@ -10,15 +10,15 @@ This document describes how to import CSV files from Amazon Simple Storage Servi ## Limitations -- To ensure data consistency, TiDB Cloud allows to import CSV files into empty tables only. To import data into an existing table that already contains data, you can use TiDB Cloud to import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. +- To ensure data consistency, TiDB Cloud allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can use TiDB Cloud to import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. - If a TiDB Cloud Dedicated cluster has a [changefeed](/tidb-cloud/changefeed-overview.md) or has [Point-in-time Restore](/tidb-cloud/backup-and-restore.md#turn-on-point-in-time-restore) enabled, you cannot import data to the cluster (the **Import Data** button will be disabled) because the current data import feature uses the [physical import mode](https://docs.pingcap.com/tidb/stable/tidb-lightning-physical-import-mode). In this mode, the imported data does not generate change logs, so the changefeed and Point-in-time Restore cannot detect the imported data. ## Step 1. Prepare the CSV files -1. If a CSV file is larger than 256 MB, consider splitting it into smaller files, each with a size of around 256 MB. +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size of around 256 MiB. - TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. + TiDB Cloud supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because TiDB Cloud can process multiple files in parallel, which can greatly improve the import speed. 2. Name the CSV files as follows: diff --git a/tidb-cloud/import-with-mysql-cli-serverless.md b/tidb-cloud/import-with-mysql-cli-serverless.md index 5de46158802d9..9d25b8416f6c3 100644 --- a/tidb-cloud/import-with-mysql-cli-serverless.md +++ b/tidb-cloud/import-with-mysql-cli-serverless.md @@ -53,7 +53,7 @@ INSERT INTO products (product_id, product_name, price) VALUES (3, 'Tablet', 299.99); ``` -## Step 3. Import data from a SQL or CSV file +## Step 3. Import data from an SQL or CSV file You can import data from an SQL file or a CSV file. The following sections provide step-by-step instructions for importing data from each type. diff --git a/tidb-cloud/import-with-mysql-cli.md b/tidb-cloud/import-with-mysql-cli.md index a7e0057918f2a..22e3e800069e8 100644 --- a/tidb-cloud/import-with-mysql-cli.md +++ b/tidb-cloud/import-with-mysql-cli.md @@ -49,7 +49,7 @@ INSERT INTO products (product_id, product_name, price) VALUES (3, 'Tablet', 299.99); ``` -## Step 3. Import data from a SQL or CSV file +## Step 3. Import data from an SQL or CSV file You can import data from an SQL file or a CSV file. The following sections provide step-by-step instructions for importing data from each type. 
diff --git a/tidb-cloud/migrate-from-op-tidb.md b/tidb-cloud/migrate-from-op-tidb.md index 68d8a3d6619d0..a104ed292ca21 100644 --- a/tidb-cloud/migrate-from-op-tidb.md +++ b/tidb-cloud/migrate-from-op-tidb.md @@ -5,7 +5,7 @@ summary: Learn how to migrate data from TiDB Self-Managed to TiDB Cloud. # Migrate from TiDB Self-Managed to TiDB Cloud -This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud (AWS) through Dumpling and TiCDC. +This document describes how to migrate data from your TiDB Self-Managed clusters to TiDB Cloud (on AWS) through Dumpling and TiCDC. The overall procedure is as follows: @@ -13,7 +13,7 @@ The overall procedure is as follows: 2. Migrate full data. The process is as follows: 1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling. 2. Import data from Amazon S3 to TiDB Cloud. -3. Replicate incremental data by using TiCDC. +3. Replicate incremental data using TiCDC. 4. Verify the migrated data. ## Prerequisites diff --git a/tidb-cloud/premium/import-csv-files-premium.md b/tidb-cloud/premium/import-csv-files-premium.md new file mode 100644 index 0000000000000..358157167ca4d --- /dev/null +++ b/tidb-cloud/premium/import-csv-files-premium.md @@ -0,0 +1,224 @@ +--- +title: Import CSV Files from Cloud Storage into {{{ .premium }}} +summary: Learn how to import CSV files from Amazon S3 or Alibaba Cloud Object Storage Service (OSS) into {{{ .premium }}} instances. +--- + +# Import CSV Files from Cloud Storage into {{{ .premium }}} + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) or Alibaba Cloud Object Storage Service (OSS) into {{{ .premium }}} instances. + +> **Warning:** +> +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. + +> **Tip:** +> +> - For {{{ .starter }}} or Essential, see [Import CSV Files from Cloud Storage into {{{ .starter }}} or Essential](/tidb-cloud/import-csv-files-serverless.md). +> - For {{{ .dedicated }}}, see [Import CSV Files from Cloud Storage into {{{ .dedicated }}}](/tidb-cloud/import-csv-files.md). + +## Limitations + +To ensure data consistency, {{{ .premium }}} allows importing CSV files into empty tables only. To import data into an existing table that already contains data, you can import the data into a temporary empty table by following this document, and then use the `INSERT SELECT` statement to copy the data to the target existing table. + +## Step 1. Prepare the CSV files + +1. If a CSV file is larger than 256 MiB, consider splitting it into smaller files, each with a size around 256 MiB. + + {{{ .premium }}} supports importing very large CSV files but performs best with multiple input files around 256 MiB in size. This is because {{{ .premium }}} can process multiple files in parallel, which can greatly improve the import speed. + +2. Name the CSV files as follows: + + - If a CSV file contains all data of an entire table, name the file in the `${db_name}.${table_name}.csv` format, which maps to the `${db_name}.${table_name}` table when you import the data. + - If the data of one table is separated into multiple CSV files, append a numeric suffix to these CSV files. 
For example, `${db_name}.${table_name}.000001.csv` and `${db_name}.${table_name}.000002.csv`. The numeric suffixes can be non-consecutive but must be in ascending order. You also need to add extra zeros before the number to ensure that all suffixes have the same length. + - {{{ .premium }}} supports importing compressed files in the following formats: `.gzip`, `.gz`, `.zstd`, `.zst` and `.snappy`. If you want to import compressed CSV files, name the files in the `${db_name}.${table_name}.${suffix}.csv.${compress}` format, where `${suffix}` is optional and can be any integer such as '000001'. For example, if you want to import the `trips.000001.csv.gz` file to the `bikeshare.trips` table, you need to rename the file as `bikeshare.trips.000001.csv.gz`. + + > **Note:** + > + > - To achieve better performance, it is recommended to limit the size of each compressed file to 100 MiB. + > - The Snappy compressed file must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported. + > - For uncompressed files, if you cannot update the CSV filenames according to the preceding rules in some cases (for example, the CSV file links are also used by your other programs), you can keep the filenames unchanged and use the **Mapping Settings** in [Step 4](#step-4-import-csv-files) to import your source data to a single target table. + +## Step 2. Create the target table schemas + +Because CSV files do not contain schema information, before importing data from CSV files into {{{ .premium }}}, you need to create the table schemas using either of the following methods: + +- Method 1: In {{{ .premium }}}, create the target databases and tables for your source data. + +- Method 2: In the Amazon S3 or Alibaba Cloud Object Storage Service (OSS) directory where the CSV files are located, create the target table schema files for your source data as follows: + + 1. Create database schema files for your source data. + + If your CSV files follow the naming rules in [Step 1](#step-1-prepare-the-csv-files), the database schema files are optional for the data import. Otherwise, the database schema files are mandatory. + + Each database schema file must be in the `${db_name}-schema-create.sql` format and contain a `CREATE DATABASE` DDL statement. With this file, {{{ .premium }}} will create the `${db_name}` database to store your data when you import the data. + + For example, if you create a `mydb-schema-create.sql` file that contains the following statement, {{{ .premium }}} will create the `mydb` database when you import the data. + + ```sql + CREATE DATABASE mydb; + ``` + + 2. Create table schema files for your source data. + + If you do not include the table schema files in the Amazon S3 or Alibaba Cloud Object Storage Service directory where the CSV files are located, {{{ .premium }}} will not create the corresponding tables for you when you import the data. + + Each table schema file must be in the `${db_name}.${table_name}-schema.sql` format and contain a `CREATE TABLE` DDL statement. With this file, {{{ .premium }}} will create the `${table_name}` table in the `${db_name}` database when you import the data. + + For example, if you create a `mydb.mytable-schema.sql` file that contains the following statement, {{{ .premium }}} will create the `mytable` table in the `mydb` database when you import the data. 
+ + ```sql + CREATE TABLE mytable ( + ID INT, + REGION VARCHAR(20), + COUNT INT ); + ``` + + > **Note:** + > + > Each `${db_name}.${table_name}-schema.sql` file should only contain a single DDL statement. If the file contains multiple DDL statements, only the first one takes effect. + +## Step 3. Configure cross-account access + +To allow {{{ .premium }}} to access the CSV files in Amazon S3 or Alibaba Cloud Object Storage Service (OSS), do one of the following: + +- If your CSV files are located in Amazon S3, [configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access) for your TiDB instance. + + You can use either an AWS access key or a Role ARN to access your bucket. Once finished, make a note of the access key (including the access key ID and secret access key) or the Role ARN value as you will need it in [Step 4](#step-4-import-csv-files). + +- If your CSV files are located in Alibaba Cloud Object Storage Service (OSS), [configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access) for your TiDB instance. + +## Step 4. Import CSV files + +To import the CSV files to {{{ .premium }}}, take the following steps: + + +
+ +1. Open the **Import** page for your target TiDB instance. + + 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations and instances. + + 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + +2. Click **Import data from Cloud Storage**. + +3. On the **Import Data from Cloud Storage** page, provide the following information: + + - **Storage Provider**: select **Amazon S3**. + - **Source Files URI**: + - When importing one file, enter the source file URI in the following format `s3://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `s3://sampledata/ingest/TableName.01.csv`. + - When importing multiple files, enter the source folder URI in the following format `s3://[bucket_name]/[data_source_folder]/`. For example, `s3://sampledata/ingest/`. + - **Credential**: you can use either an AWS Role ARN or an AWS access key to access your bucket. For more information, see [Configure Amazon S3 access](/tidb-cloud/serverless-external-storage.md#configure-amazon-s3-access). + - **AWS Role ARN**: enter the AWS Role ARN value. If you need to create a new role, click **Click here to create a new one with AWS CloudFormation** and follow the guided steps to launch the provided template, acknowledge the IAM warning, create the stack, and copy the generated ARN back into {{{ .premium }}}. + - **AWS Access Key**: enter the AWS access key ID and AWS secret access key. + - **Test Bucket Access**: click this button after the credentials are in place to confirm that {{{ .premium }}} can reach the bucket. + - **Target Connection**: provide the TiDB username and password that will run the import. Optionally, click **Test Connection** to validate the credentials. + +4. Click **Next**. + +5. In the **Source Files Mapping** section, {{{ .premium }}} scans the bucket and proposes mappings between the source files and destination tables. + + When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. + + > **Note:** + > + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and {{{ .premium }}} automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + + - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. Keep **CSV** selected as the data format. + + - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. + + + > **Note:** + > + > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: + > + > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). + > - **Target Database** and **Target Table**: choose the destination objects for the matched files. + +6. 
{{{ .premium }}} automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + +7. When the import progress shows **Completed**, check the imported tables. + +
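
To spot-check the result, you can connect with any SQL client and confirm that the imported tables are no longer empty. A minimal sketch, assuming the `bikeshare.trips` example table from the naming example in Step 1:

```sql
-- List the tables that the import created in the target database.
SHOW TABLES IN bikeshare;

-- Confirm that the imported table contains rows.
SELECT COUNT(*) FROM bikeshare.trips;
```

If the count is zero, see [Zero rows in the imported tables](#zero-rows-in-the-imported-tables) later in this document.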
+ +
+ +1. Open the **Import** page for your target TiDB instance. + + 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations and instances. + + 2. Click the name of your target TiDB instance to go to its overview page, and then click **Data** > **Import** in the left navigation pane. + +2. Click **Import data from Cloud Storage**. + +3. On the **Import Data from Cloud Storage** page, provide the following information: + + - **Storage Provider**: select **Alibaba Cloud OSS**. + - **Source Files URI**: + - When importing one file, enter the source file URI in the following format `oss://[bucket_name]/[data_source_folder]/[file_name].csv`. For example, `oss://sampledata/ingest/TableName.01.csv`. + - When importing multiple files, enter the source folder URI in the following format `oss://[bucket_name]/[data_source_folder]/`. For example, `oss://sampledata/ingest/`. + - **Credential**: you can use an AccessKey pair to access your bucket. For more information, see [Configure Alibaba Cloud Object Storage Service (OSS) access](/tidb-cloud/serverless-external-storage.md#configure-alibaba-cloud-object-storage-service-oss-access). + - **Test Bucket Access**: click this button after the credentials are in place to confirm that {{{ .premium }}} can reach the bucket. + - **Target Connection**: provide the TiDB username and password that will run the import. Optionally, click **Test Connection** to validate the credentials. + +4. Click **Next**. + +5. In the **Source Files Mapping** section, {{{ .premium }}} scans the bucket and proposes mappings between the source files and destination tables. + + When a directory is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is selected by default. + + > **Note:** + > + > When a single file is specified in **Source Files URI**, the **Use [File naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) for automatic mapping** option is not displayed, and {{{ .premium }}} automatically populates the **Source** field with the file name. In this case, you only need to select the target database and table for data import. + + - Leave automatic mapping enabled to apply the [file naming conventions](/tidb-cloud/naming-conventions-for-data-import.md) to your source files and target tables. Keep **CSV** selected as the data format. + + - **Advanced options**: expand the panel to view the `Ignore compatibility checks (advanced)` toggle. Leave it disabled unless you intentionally want to bypass schema compatibility validation. + + + > **Note:** + > + > Manual mapping is coming soon. When the toggle becomes available, clear the automatic mapping option and configure the mapping manually: + > + > - **Source**: enter a filename pattern such as `TableName.01.csv`. Wildcards `*` and `?` are supported (for example, `my-data*.csv`). + > - **Target Database** and **Target Table**: choose the destination objects for the matched files. + +6. {{{ .premium }}} automatically scans the source path. Review the scan results, check the data files found and corresponding target tables, and then click **Start Import**. + +7. When the import progress shows **Completed**, check the imported tables. + +
+ +
+ +When you run an import task, if any unsupported or invalid conversions are detected, {{{ .premium }}} terminates the import job automatically and reports an importing error. + +If you get an importing error, do the following: + +1. Drop the partially imported table. +2. Check the table schema file. If there are any errors, correct the table schema file. +3. Check the data types in the CSV files. +4. Try the import task again. + +## Troubleshooting + +### Resolve warnings during data import + +After clicking **Start Import**, if you see a warning message such as `can't find the corresponding source files`, resolve this by providing the correct source file, renaming the existing one according to [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md), or using **Advanced Settings** to make changes. + +After resolving these issues, you need to import the data again. + +### Zero rows in the imported tables + +After the import progress shows **Completed**, check the imported tables. If the number of rows is zero, it means no data files matched the Bucket URI that you entered. In this case, resolve this issue by providing the correct source file, renaming the existing one according to [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md), or using **Advanced Settings** to make changes. After that, import those tables again. diff --git a/tidb-cloud/premium/import-from-s3-premium.md b/tidb-cloud/premium/import-from-s3-premium.md new file mode 100644 index 0000000000000..8d7e5f5a7f069 --- /dev/null +++ b/tidb-cloud/premium/import-from-s3-premium.md @@ -0,0 +1,78 @@ +--- +title: Import Data from Amazon S3 into {{{ .premium }}} +summary: Learn how to import CSV files from Amazon S3 into {{{ .premium }}} instances using the console wizard. +--- + +# Import Data from Amazon S3 into {{{ .premium }}} + +This document describes how to import CSV files from Amazon Simple Storage Service (Amazon S3) into {{{ .premium }}} instances. The steps reflect the current private preview user interface and serve as an initial framework for the upcoming public preview launch. + +> **Warning:** +> +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. + +> **Tip:** +> +> - For {{{ .starter }}} or Essential, see [Import CSV Files from Cloud Storage into {{{ .starter }}} or Essential](/tidb-cloud/import-csv-files-serverless.md). +> - For {{{ .dedicated }}}, see [Import CSV Files from Cloud Storage into {{{ .dedicated }}}](/tidb-cloud/import-csv-files.md). + +## Limitations + +- To ensure data consistency, {{{ .premium }}} allows importing CSV files into empty tables only. If the target table already contains data, import into a staging table and then copy the rows using the `INSERT ... SELECT` statement. +- During the private preview, the user interface currently supports Amazon S3 as the only storage provider. Support for additional providers will be added in future releases. +- Each import job maps a single source pattern to one destination table. + +## Step 1. Prepare the CSV files + +1. 
If a CSV file is larger than 256 MiB, consider splitting it into smaller files around 256 MiB so {{{ .premium }}} can process them in parallel. +2. Name your CSV files according to the Dumpling naming conventions: + - Full-table files: use the `${db_name}.${table_name}.csv` format. + - Sharded files: append numeric suffixes, such as `${db_name}.${table_name}.000001.csv`. + - Compressed files: use the `${db_name}.${table_name}.${suffix}.csv.${compress}` format. +3. Optional schema files (`${db_name}-schema-create.sql`, `${db_name}.${table_name}-schema.sql`) help {{{ .premium }}} create databases and tables automatically. + + + +## Step 2. Create target schemas (optional) + +If you want {{{ .premium }}} to create the databases and tables automatically, place the schema files generated by Dumpling in the same S3 directory. Otherwise, create the databases and tables manually in {{{ .premium }}} before running the import. + +## Step 3. Configure access to Amazon S3 + +To allow {{{ .premium }}} to read your bucket, use either of the following methods: + +- Provide an AWS Role ARN that trusts TiDB Cloud and grants the `s3:GetObject` and `s3:ListBucket` permissions on the relevant paths. +- Provide an AWS access key (access key ID and secret access key) with equivalent permissions. + +The wizard includes a helper link labeled **Click here to create a new one with AWS CloudFormation**. Follow this link if you need {{{ .premium }}} to pre-fill a CloudFormation stack that creates the role for you. + +## Step 4. Import CSV files from Amazon S3 + +1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your TiDB instance. +2. In the left navigation pane, click **Data** > **Import**, and choose **Import data from Cloud Storage**. +3. In the **Source Connection** dialog: + - Set **Storage Provider** to **Amazon S3**. + - Enter the **Source Files URI** for a single file (`s3://bucket/path/file.csv`) or for a folder (`s3://bucket/path/`). + - Choose **AWS Role ARN** or **AWS Access Key** and provide the credentials. + - Click **Test Bucket Access** to validate connectivity. + +4. Click **Next** and provide the TiDB SQL username and password for the import job. Optionally, test the connection. +5. Review the automatically generated source-to-target mapping. Disable automatic mapping if you need to define custom patterns and destination tables. +6. Click **Next** to run the pre-check. Resolve any warnings about missing files or incompatible schemas. +7. Click **Start Import** to launch the job group. +8. Monitor the job statuses until they show **Completed**, then verify the imported data in TiDB Cloud. + +## Troubleshooting + +- If the pre-check reports zero files, verify the S3 path and IAM permissions. +- If jobs remain in **Preparing**, ensure that the destination tables are empty and the required schema files exist. +- Use the **Cancel** action to stop a job group if you need to adjust mappings or credentials. + +## Next steps + +- See [Import Data into {{{ .premium }}} using the MySQL Command-Line Client](/tidb-cloud/premium/import-with-mysql-cli-premium.md) for scripted imports. +- See [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md) for IAM-related problems. 
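
If your export contains files much larger than the 256 MiB recommended in [Step 1](#step-1-prepare-the-csv-files), you can split them before uploading them to Amazon S3. The following is a minimal sketch using GNU coreutils `split` (BSD/macOS `split` might not support these options); it assumes the source file `trips.csv` has no header row, rows do not contain embedded newlines, and the hypothetical target table is `bikeshare.trips`:

```bash
# Split a large CSV export into roughly 256 MiB pieces without breaking rows,
# producing bikeshare.trips.000000.csv, bikeshare.trips.000001.csv, and so on.
split -C 256M -d -a 6 --additional-suffix=.csv \
  trips.csv bikeshare.trips.
```

The numeric suffixes produced by `split` are in ascending order, which matches the naming conventions described in Step 1.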
diff --git a/tidb-cloud/premium/import-with-mysql-cli-premium.md b/tidb-cloud/premium/import-with-mysql-cli-premium.md new file mode 100644 index 0000000000000..39185d496ad95 --- /dev/null +++ b/tidb-cloud/premium/import-with-mysql-cli-premium.md @@ -0,0 +1,179 @@ +--- +title: Import Data into {{{ .premium }}} using the MySQL Command-Line Client +summary: Learn how to import small CSV or SQL files into {{{ .premium }}} instances using the MySQL Command-Line Client (`mysql`). +--- + +# Import Data into {{{ .premium }}} using the MySQL Command-Line Client + +This document describes how to import data into {{{ .premium }}} using the [MySQL Command-Line Client](https://dev.mysql.com/doc/refman/8.0/en/mysql.html) (`mysql`). The following sections provide step-by-step instructions for importing data from SQL or CSV files. This process performs a logical import, where the MySQL Command-Line Client replays SQL statements from your local machine against TiDB Cloud. + +> **Warning:** +> +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. + +> **Tip:** +> +> - Logical imports are best suited for relatively small SQL or CSV files. For faster, parallel imports from cloud storage or to process multiple files from [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) exports, see [Import CSV Files from Cloud Storage into {{{ .premium }}}](/tidb-cloud/premium/import-csv-files-premium.md). +> - For {{{ .starter }}} or Essential, see [Import Data into {{{ .starter }}} or Essential via MySQL CLI](/tidb-cloud/import-with-mysql-cli-serverless.md). +> - For {{{ .dedicated }}}, see [Import Data into {{{ .dedicated }}} via MySQL CLI](/tidb-cloud/import-with-mysql-cli.md). + +## Prerequisites + +Before you can import data to a {{{ .premium }}} instance via the MySQL Command-Line Client, you need the following prerequisites: + +- You have access to your {{{ .premium }}} instance. +- Install the MySQL Command-Line Client (`mysql`) on your local computer. + +## Step 1. Connect to your {{{ .premium }}} instance + +Connect to your TiDB instance using the MySQL Command-Line Client. If this is your first time, perform the following steps to configure the network connection and generate the TiDB SQL `root` user password: + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**TiDB Instances**](https://tidbcloud.com/project/instances) page. Then, click the name of your target instance to go to its overview page. + +2. Click **Connect** in the upper-right corner. A connection dialog is displayed. + +3. Ensure that the configurations in the connection dialog match your operating environment. + + - **Connection Type** is set to `Public`. + - **Connect With** is set to `MySQL CLI`. + - **Operating System** matches your environment. + + > **Note:** + > + > {{{ .premium }}} instances have the public endpoint disabled by default. If you do not see the `Public` option, enable the public endpoint on the instance details page (under the **Network** tab), or ask an organization admin to enable it before proceeding. + +4. Click **Generate Password** to create a random password. 
If you have already configured a password, reuse that credential or rotate it before proceeding. + +## Step 2. Define the target database and table schema + +Before importing data, create the target table structure that matches your dataset. + +The following is an example SQL file (`products-schema.sql`) that creates a sample database and table. Update the database or table names to match your environment. + +```sql +CREATE DATABASE IF NOT EXISTS test; +USE test; + +CREATE TABLE products ( + product_id INT PRIMARY KEY, + product_name VARCHAR(255), + price DECIMAL(10, 2) +); +``` + +Run the schema file against your {{{ .premium }}} instance so the database and table exist before you load data in the next step. + +## Step 3. Import data from an SQL or CSV file + +Use the MySQL Command-Line Client to load data into the schema you created in Step 2. Replace the placeholders with your own file paths, credentials, and dataset as needed, then follow the workflow that matches your source format. + + +
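
Step 2 asks you to run the schema file before loading data. If you have not applied it yet, the following is a minimal sketch of doing so with the same client. The `<username>`, `<host>`, `<CA_path>`, and `<password>` placeholders are assumptions; substitute the values from your connection dialog:

```bash
# Create the test database and the products table defined in Step 2.
# The schema file selects the database itself, so -D is not required here.
mysql --comments --connect-timeout 150 \
  -u '<username>' -h <host> -P 4000 \
  --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
  -p<password> < products-schema.sql
```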
Do the following to import data from an SQL file:

1. Provide an SQL file (for example, `products.sql`) that contains the data you want to import. This SQL file must include `INSERT` statements with data, similar to the following:

    ```sql
    INSERT INTO products (product_id, product_name, price) VALUES
        (1, 'Laptop', 999.99),
        (2, 'Smartphone', 499.99),
        (3, 'Tablet', 299.99);
    ```

2. Use the following command to import data from the SQL file:

    ```bash
    mysql --comments --connect-timeout 150 \
      -u '<username>' -h <host> -P 4000 -D test \
      --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
      -p<password> < products.sql
    ```

    Replace the placeholder values (for example, `<username>`, `<host>`, `<password>`, `<CA_path>`, and the SQL file name) with your own connection details and file path.

> **Note:**
>
> The sample schema creates a `test` database and the commands use `-D test`. Change both the schema file and the `-D` parameter if you plan to import into a different database.

The SQL user you authenticate with must have the required privileges (for example, `CREATE` and `INSERT`) to define tables and load data into the target database.
+
Do the following to import data from a CSV file:

1. Ensure the target database and table exist in TiDB (for example, the `products` table you created in Step 2).

2. Provide a sample CSV file (for example, `products.csv`) that contains the data you want to import. The following is an example:

    **products.csv:**

    ```csv
    product_id,product_name,price
    1,Laptop,999.99
    2,Smartphone,499.99
    3,Tablet,299.99
    ```

3. Use the following command to import data from the CSV file:

    ```bash
    mysql --comments --connect-timeout 150 \
      -u '<username>' -h <host> -P 4000 -D test \
      --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
      -p<password> \
      -e "LOAD DATA LOCAL INFILE '<csv_file_path>' INTO TABLE products
      FIELDS TERMINATED BY ','
      LINES TERMINATED BY '\n'
      IGNORE 1 LINES (product_id, product_name, price);"
    ```

    Replace the placeholder values (for example, `<username>`, `<host>`, `<password>`, `<CA_path>`, `<csv_file_path>`, and the table name) with your own connection details and dataset paths.

> **Note:**
>
> For more syntax details about `LOAD DATA LOCAL INFILE`, see [`LOAD DATA`](/sql-statements/sql-statement-load-data.md).
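
Depending on how your MySQL client is built and configured, the client might reject `LOAD DATA LOCAL INFILE` until local data loading is enabled on the client side. If you hit such an error, the following is a minimal variation of the preceding command that enables it explicitly (same placeholders as above):

```bash
mysql --comments --connect-timeout 150 --local-infile=1 \
  -u '<username>' -h <host> -P 4000 -D test \
  --ssl-mode=VERIFY_IDENTITY --ssl-ca=<CA_path> \
  -p<password> \
  -e "LOAD DATA LOCAL INFILE '<csv_file_path>' INTO TABLE products
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
  IGNORE 1 LINES (product_id, product_name, price);"
```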
+
+ +## Step 4. Validate the imported data + +After the import is complete, run basic queries to verify that the expected rows are present and the data is correct. + +Use the MySQL Command-Line Client to connect to the same database and run validation queries, such as counting rows and inspecting sample records: + +```bash +mysql --comments --connect-timeout 150 \ + -u '' -h -P 4000 -D test \ + --ssl-mode=VERIFY_IDENTITY --ssl-ca= \ + -p \ + -e "SELECT COUNT(*) AS row_count FROM products; \ + SELECT * FROM products ORDER BY product_id LIMIT 5;" +``` + +Expected output (example): + +```text ++-----------+ +| row_count | ++-----------+ +| 3 | ++-----------+ ++------------+---------------+--------+ +| product_id | product_name | price | ++------------+---------------+--------+ +| 1 | Laptop | 999.99 | +| 2 | Smartphone | 499.99 | +| 3 | Tablet | 299.99 | ++------------+---------------+--------+ +``` + +Replace the placeholder values with your own connection details, and adjust the validation queries to suit the shape of your dataset. diff --git a/tidb-cloud/premium/migrate-from-op-tidb-premium.md b/tidb-cloud/premium/migrate-from-op-tidb-premium.md new file mode 100644 index 0000000000000..ec3a6dab98ea4 --- /dev/null +++ b/tidb-cloud/premium/migrate-from-op-tidb-premium.md @@ -0,0 +1,414 @@ +--- +title: Migrate from TiDB Self-Managed to {{{ .premium }}} +summary: Learn how to migrate data from TiDB Self-Managed to {{{ .premium }}}. +--- + +# Migrate from TiDB Self-Managed to {{{ .premium }}} + +This document describes how to migrate data from your TiDB Self-Managed clusters to {{{ .premium }}} (on AWS) instances using Dumpling and TiCDC. + +> **Warning:** +> +> {{{ .premium }}} is currently available in **private preview** in select AWS regions. +> +> If Premium is not yet enabled for your organization, or if you need access in another cloud provider or region, click **Support** in the lower-left corner of the [TiDB Cloud console](https://tidbcloud.com/), or submit a request through the [Contact Us](https://www.pingcap.com/contact-us) form on the website. + +The overall procedure is as follows: + +1. Build the environment and prepare the tools. +2. Migrate full data. The process is as follows: + 1. Export data from TiDB Self-Managed to Amazon S3 using Dumpling. + 2. Import data from Amazon S3 to {{{ .premium }}}. +3. Replicate incremental data using TiCDC. +4. Verify the migrated data. + +## Prerequisites + +It is recommended that you put the S3 bucket and the {{{ .premium }}} instance in the same region. Cross-region migration might incur additional cost for data conversion. + +Before migration, you need to prepare the following: + +- An [AWS account](https://docs.aws.amazon.com/AmazonS3/latest/userguide/setting-up-s3.html#sign-up-for-aws-gsg) with administrator access +- An [AWS S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html) +- [A TiDB Cloud account](/tidb-cloud/tidb-cloud-quickstart.md) with at least the [`Project Data Access Read-Write`](/tidb-cloud/manage-user-access.md#user-roles) access to your target {{{ .premium }}} instance hosted on AWS + +## Prepare tools + +You need to prepare the following tools: + +- Dumpling: a data export tool +- TiCDC: a data replication tool + +### Dumpling + +[Dumpling](https://docs.pingcap.com/tidb/dev/dumpling-overview) is a tool that exports data from TiDB or MySQL into SQL or CSV files. You can use Dumpling to export full data from TiDB Self-Managed. 
+ +Before you deploy Dumpling, note the following: + +- It is recommended to deploy Dumpling on a new EC2 instance in the same VPC as your target TiDB instance. +- The recommended EC2 instance type is **c6g.4xlarge** (16 vCPU and 32 GiB memory). You can choose other EC2 instance types based on your needs. The Amazon Machine Image (AMI) can be Amazon Linux, Ubuntu, or Red Hat. + +You can deploy Dumpling by using TiUP or using the installation package. + +#### Deploy Dumpling using TiUP + +Use [TiUP](https://docs.pingcap.com/tidb/stable/tiup-overview) to deploy Dumpling: + +```bash +## Deploy TiUP +curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh +source /root/.bash_profile +## Deploy Dumpling and update to the latest version +tiup install dumpling +tiup update --self && tiup update dumpling +``` + +#### Deploy Dumpling using the installation package + +To deploy Dumpling using the installation package: + +1. Download the [toolkit package](https://docs.pingcap.com/tidb/stable/download-ecosystem-tools). + +2. Extract it to the target machine. You can get Dumpling using TiUP by running `tiup install dumpling`. Then, you can use `tiup dumpling ...` to run Dumpling. For more information, see [Dumpling introduction](https://docs.pingcap.com/tidb/stable/dumpling-overview#dumpling-introduction). + +#### Configure privileges for Dumpling + +You need the following privileges to export data from the upstream database: + +- SELECT +- RELOAD +- LOCK TABLES +- REPLICATION CLIENT +- PROCESS + +### Deploy TiCDC + +You need to [deploy TiCDC](https://docs.pingcap.com/tidb/dev/deploy-ticdc) to replicate incremental data from the upstream TiDB cluster to {{{ .premium }}}. + +1. Confirm whether the current TiDB version supports TiCDC. TiDB v4.0.8.rc.1 and later versions support TiCDC. You can check the TiDB version by executing `select tidb_version();` in the TiDB cluster. If you need to upgrade it, see [Upgrade TiDB Using TiUP](https://docs.pingcap.com/tidb/dev/deploy-ticdc#upgrade-ticdc-using-tiup). + +2. Add the TiCDC component to the TiDB cluster. See [Add or scale out TiCDC to an existing TiDB cluster using TiUP](https://docs.pingcap.com/tidb/dev/deploy-ticdc#add-or-scale-out-ticdc-to-an-existing-tidb-cluster-using-tiup). Edit the `scale-out.yml` file to add TiCDC: + + ```yaml + cdc_servers: + - host: 10.0.1.3 + gc-ttl: 86400 + data_dir: /tidb-data/cdc-8300 + - host: 10.0.1.4 + gc-ttl: 86400 + data_dir: /tidb-data/cdc-8300 + ``` + +3. Add the TiCDC component and check the status. + + ```shell + tiup cluster scale-out scale-out.yml + tiup cluster display + ``` + +## Migrate full data + +To migrate data from the TiDB Self-Managed cluster to {{{ .premium }}}, perform a full data migration as follows: + +1. Migrate data from the TiDB Self-Managed cluster to Amazon S3. +2. Migrate data from Amazon S3 to {{{ .premium }}}. + +### Migrate data from the TiDB Self-Managed cluster to Amazon S3 + +You need to migrate data from the TiDB Self-Managed cluster to Amazon S3 using Dumpling. + +If your TiDB cluster is in a local IDC, or the network between the Dumpling server and Amazon S3 is not connected, you can export the files to the local storage first, and then upload them to Amazon S3 later. + +#### Step 1. 
Disable the GC mechanism of the upstream TiDB Self-Managed cluster temporarily + +To ensure that newly written data is not lost during incremental migration, you need to disable the upstream cluster's garbage collection (GC) mechanism before starting the migration to prevent the system from cleaning up historical data. + +Run the following command to verify whether the setting is successful. + +```sql +SET GLOBAL tidb_gc_enable = FALSE; +``` + +The following is an example output, in which `0` indicates that it is disabled. + +```sql +SELECT @@global.tidb_gc_enable; ++-------------------------+ +| @@global.tidb_gc_enable | ++-------------------------+ +| 0 | ++-------------------------+ +1 row in set (0.01 sec) +``` + +#### Step 2. Configure access permissions to the Amazon S3 bucket for Dumpling + +Create an access key in the AWS console. See [Create an access key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey) for details. + +1. Use your AWS account ID or account alias, your IAM user name, and your password to sign in to [the IAM console](https://console.aws.amazon.com/iam/home#/security_credentials). + +2. In the navigation bar on the upper right, choose your user name, and then click **My Security Credentials**. + +3. To create an access key, click **Create access key**. Then choose **Download .csv file** to save the access key ID and secret access key to a CSV file on your computer. Store the file in a secure location. You will not have access to the secret access key again after this dialog box closes. After you download the CSV file, choose **Close**. When you create an access key, the key pair is active by default, and you can use the pair right away. + + ![Create access key](/media/tidb-cloud/op-to-cloud-create-access-key01.png) + + ![Download CSV file](/media/tidb-cloud/op-to-cloud-create-access-key02.png) + +#### Step 3. Export data from the upstream TiDB cluster to Amazon S3 using Dumpling + +Do the following to export data from the upstream TiDB cluster to Amazon S3 using Dumpling: + +1. Configure the environment variables for Dumpling. + + ```shell + export AWS_ACCESS_KEY_ID=${AccessKey} + export AWS_SECRET_ACCESS_KEY=${SecretKey} + ``` + +2. Get the S3 bucket URI and region information from the AWS console. See [Create a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) for details. + + The following screenshot shows how to get the S3 bucket URI information: + + ![Get the S3 URI](/media/tidb-cloud/op-to-cloud-copy-s3-uri.png) + + The following screenshot shows how to get the region information: + + ![Get the region information](/media/tidb-cloud/op-to-cloud-copy-region-info.png) + +3. Run Dumpling to export data to the Amazon S3 bucket. + + ```shell + dumpling \ + -u root \ + -P 4000 \ + -h 127.0.0.1 \ + -r 20000 \ + --filetype sql \ + -F 256MiB \ + -t 8 \ + -o "${S3 URI}" \ + --s3.region "${s3.region}" + ``` + + The `-t` option specifies the number of threads for the export. Increasing the number of threads improves the concurrency of Dumpling and the export speed, and also increases the database's memory consumption. Therefore, do not set this parameter to a very large number. + + For more information, see [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#export-to-sql-files). + +4. Check the export data. Usually the exported data includes the following: + + - `metadata`: this file contains the start time of the export, and the location of the master binary log. 
+ - `{schema}-schema-create.sql`: the SQL file for creating the schema + - `{schema}.{table}-schema.sql`: the SQL file for creating the table + - `{schema}.{table}.{0001}.{sql|csv}`: data files + - `*-schema-view.sql`, `*-schema-trigger.sql`, `*-schema-post.sql`: other exported SQL files + +### Migrate data from Amazon S3 to {{{ .premium }}} + +After you export data from the TiDB Self-Managed cluster to Amazon S3, you need to migrate the data to {{{ .premium }}}. + +1. In the [TiDB Cloud console](https://tidbcloud.com/), get the Account ID and External ID of your target TiDB instance. + + 1. Navigate to the **TiDB Instances** page, and click the name of your target instance. + 2. In the left navigation pane, click **Data** > **Import**. + 3. Choose **Import data from Cloud Storage** > **Amazon S3**. + 4. Note down the **Account ID** and **External ID** displayed in the wizard. These values are embedded in the CloudFormation template. + +2. In the **Source Connection** dialog, select **AWS Role ARN**, then click **Click here to create a new one with AWS CloudFormation**, and follow the on-screen guidance. If your organization cannot launch CloudFormation stacks, see [Manually create the IAM role](#manually-create-the-iam-role-optional). + + 1. Open the pre-filled CloudFormation template in the AWS console. + 2. Provide a role name, review the permissions, and acknowledge the IAM warning. + 3. Create the stack and wait for the status to change to **CREATE_COMPLETE**. + 4. On the **Outputs** tab, copy the newly generated Role ARN. + 5. Return to {{{ .premium }}}, paste the Role ARN, and click **Confirm**. The wizard stores the ARN for subsequent import jobs. + +3. Continue with the remaining steps in the import wizard, and use the saved Role ARN when prompted. + +#### Manually create the IAM role (optional) + +If your organization cannot deploy CloudFormation stacks, create the access policy and IAM role manually: + +1. In AWS IAM, create a policy that grants the following actions on your bucket (and KMS key, if applicable): + + - `s3:GetObject` + - `s3:GetObjectVersion` + - `s3:ListBucket` + - `s3:GetBucketLocation` + - `kms:Decrypt` (only when SSE-KMS encryption is enabled) + + The following JSON template shows the required structure. Replace the placeholders with your bucket path, bucket ARN, and KMS key ARN (if needed). + + ```json + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:GetObjectVersion" + ], + "Resource": "arn:aws:s3:::" + }, + { + "Effect": "Allow", + "Action": [ + "s3:ListBucket", + "s3:GetBucketLocation" + ], + "Resource": "" + }, + { + "Effect": "Allow", + "Action": [ + "kms:Decrypt" + ], + "Resource": "" + } + ] + } + ``` + +2. Create an IAM role that trusts {{{ .premium }}} by providing the **Account ID** and **External ID** you have noted down earlier. Then, attach the policy created in the previous step to this role. + +3. Copy the resulting Role ARN and enter it in the {{{ .premium }}} import wizard. + +4. Import data to {{{ .premium }}} by following [Import data from Amazon S3 into {{{ .premium }}}](/tidb-cloud/premium/import-from-s3-premium.md). + +## Replicate incremental data + +To replicate incremental data, do the following: + +1. Get the start time of the incremental data migration. For example, you can get it from the metadata file of the full data migration. + + ![Start Time in Metadata](/media/tidb-cloud/start_ts_in_metadata.png) + +2. Grant TiCDC to connect to {{{ .premium }}}. + + 1. 
In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB instance to go to its overview page. + 2. In the left navigation pane, click **Settings** > **Networking**. + 3. On the **Networking** page, click **Add IP Address**. + 4. In the displayed dialog, select **Use IP addresses**, click **+**, fill in the public IP address of the TiCDC component in the **IP Address** field, and then click **Confirm**. Now TiCDC can access {{{ .premium }}}. For more information, see [Configure an IP Access List](/tidb-cloud/configure-ip-access-list.md). + +3. Get the connection information of the downstream {{{ .premium }}} instance. + + 1. In the [TiDB Cloud console](https://tidbcloud.com/tidbs), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, and then click the name of your target TiDB instance to go to its overview page. + 2. Click **Connect** in the upper-right corner. + 3. In the connection dialog, select **Public** from the **Connection Type** drop-down list and select **General** from the **Connect With** drop-down list. + 4. From the connection information, you can get the host IP address and port of the instance. For more information, see [Connect via public connection](/tidb-cloud/connect-via-standard-connection.md). + +4. Create and run the incremental replication task. In the upstream cluster, run the following: + + ```shell + tiup cdc cli changefeed create \ + --pd=http://172.16.6.122:2379 \ + --sink-uri="tidb://root:123456@172.16.6.125:4000" \ + --changefeed-id="upstream-to-downstream" \ + --start-ts="431434047157698561" + ``` + + - `--pd`: the PD address of the upstream cluster. The format is: `[upstream_pd_ip]:[pd_port]` + - `--sink-uri`: the downstream address of the replication task. Configure `--sink-uri` according to the following format. Currently, the scheme supports `mysql`, `tidb`, `kafka`, `s3`, and `local`. + + ```shell + [scheme]://[userinfo@][host]:[port][/path]?[query_parameters] + ``` + + - `--changefeed-id`: the ID of the replication task. The format must match the ^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$ regular expression. If this ID is not specified, TiCDC automatically generates a UUID (the version 4 format) as the ID. + - `--start-ts`: specifies the starting TSO of the changefeed. From this TSO, the TiCDC cluster starts pulling data. The default value is the current time. + + For more information, see [CLI and Configuration Parameters of TiCDC Changefeeds](https://docs.pingcap.com/tidb/dev/ticdc-changefeed-config). + +5. Enable the GC mechanism again in the upstream cluster. If no error or delay is found in incremental replication, enable the GC mechanism to resume garbage collection of the cluster. + + Run the following command to verify whether the setting works. + + ```sql + SET GLOBAL tidb_gc_enable = TRUE; + ``` + + The following is an example output, in which `1` indicates that GC is enabled. + + ```sql + SELECT @@global.tidb_gc_enable; + +-------------------------+ + | @@global.tidb_gc_enable | + +-------------------------+ + | 1 | + +-------------------------+ + 1 row in set (0.01 sec) + ``` + +6. Verify the incremental replication task. + + - If the message "Create changefeed successfully!" is displayed in the output, the replication task is created successfully. + - If the state is `normal`, the replication task is normal. 
+ + ```shell + tiup cdc cli changefeed list --pd=http://172.16.6.122:2379 + ``` + + ![Update Filter](/media/tidb-cloud/normal_status_in_replication_task.png) + + - Verify the replication. Write a new record to the upstream cluster, and then check whether the record is replicated to the downstream {{{ .premium }}} instance. + +7. Set the same timezone for the upstream cluster and downstream instance. By default, {{{ .premium }}} sets the timezone to UTC. If the timezone is different between the upstream cluster and downstream instance, you need to set the same timezone for both. + + 1. In the upstream cluster, run the following command to check the timezone: + + ```sql + SELECT @@global.time_zone; + ``` + + 2. In the downstream instance, run the following command to set the timezone: + + ```sql + SET GLOBAL time_zone = '+08:00'; + ``` + + 3. Check the timezone again to verify the setting: + + ```sql + SELECT @@global.time_zone; + ``` + +8. Back up the [query bindings](/sql-plan-management.md) in the upstream cluster and restore them in the downstream instance. You can use the following query to back up the query bindings: + + ```sql + SELECT DISTINCT(CONCAT('CREATE GLOBAL BINDING FOR ', original_sql,' USING ', bind_sql,';')) FROM mysql.bind_info WHERE status='enabled'; + ``` + + If you do not get any output, it means that no query bindings are used in the upstream cluster. In this case, you can skip this step. + + After you get the query bindings, run them in the downstream instance to restore the query bindings. + +9. Back up the user and privilege information in the upstream cluster and restore them in the downstream instance. You can use the following script to back up the user and privilege information. Note that you need to replace the placeholders with the actual values. + + ```shell + #!/bin/bash + + export MYSQL_HOST={tidb_op_host} + export MYSQL_TCP_PORT={tidb_op_port} + export MYSQL_USER=root + export MYSQL_PWD={root_password} + export MYSQL="mysql -u${MYSQL_USER} --default-character-set=utf8mb4" + + function backup_user_priv(){ + ret=0 + sql="SELECT CONCAT(user,':',host,':',authentication_string) FROM mysql.user WHERE user NOT IN ('root')" + for usr in `$MYSQL -se "$sql"`;do + u=`echo $usr | awk -F ":" '{print $1}'` + h=`echo $usr | awk -F ":" '{print $2}'` + p=`echo $usr | awk -F ":" '{print $3}'` + echo "-- Grants for '${u}'@'${h}';" + [[ ! -z "${p}" ]] && echo "CREATE USER IF NOT EXISTS '${u}'@'${h}' IDENTIFIED WITH 'mysql_native_password' AS '${p}' ;" + $MYSQL -se "SHOW GRANTS FOR '${u}'@'${h}';" | sed 's/$/;/g' + [ $? -ne 0 ] && ret=1 && break + done + return $ret + } + + backup_user_priv + ``` + + After you get the user and privilege information, run the generated SQL statements in the downstream TiDB instance to restore the user and privilege information.
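
For example, you can save the script above as a file, capture its output, and replay the generated statements against the downstream instance. The following is a minimal sketch; the `backup_user_priv.sh` and `user_grants.sql` file names, the `<premium_host>` placeholder, and the `root` user are assumptions to adjust for your environment:

```shell
# Save the statements printed by the backup script (assuming you saved it as backup_user_priv.sh).
bash backup_user_priv.sh > user_grants.sql

# Replay the generated CREATE USER and GRANT statements on the downstream instance.
# You are prompted for the password of the downstream user.
mysql -u root -h <premium_host> -P 4000 -p < user_grants.sql
```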