Change Data Capture (CDC)
- Objective: Learn to deploy TiCDC in a TiDB cluster on AWS (with Kubernetes)
- Prerequisites:
  - Background knowledge of TiDB components
  - Background knowledge of Kubernetes and TiDB Operator
  - Background knowledge of TiCDC
- Optionality: Optional
- Estimated time: 30 mins
Deploy Downstream TiDB Cluster
- Optionality: Optional
Change Data Capture requires a second (downstream) cluster for the primary (upstream) cluster to write to.
Please consult the instructions in Create Downstream TiDB Cluster.
Provision TiCDC Nodes
variable "create_cdc_node_pool" {
  description = "whether creating node pool for cdc"
  default     = true
}
variable "cluster_cdc_count" {
  default = 3
}
variable "cluster_cdc_instance_type" {
  default = "c5.2xlarge"
}
To apply the changes, you can run:
It might take 10 minutes or more to finish the process.
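The apply step can be sketched as follows, assuming the variables above live in a Terraform working directory that has already been initialized. The function name `provision_cdc_nodes` is illustrative, and the `-var` overrides only restate the defaults shown above, so they can be dropped.

```shell
# Sketch only: run from the Terraform directory that declares the variables above
# (assumed to be initialized with `terraform init` already).
provision_cdc_nodes() {
  cdc_count="${1:-3}"               # same as the cluster_cdc_count default
  cdc_instance="${2:-c5.2xlarge}"   # same as the cluster_cdc_instance_type default
  terraform apply \
    -var "create_cdc_node_pool=true" \
    -var "cluster_cdc_count=${cdc_count}" \
    -var "cluster_cdc_instance_type=${cdc_instance}"
}
```

Calling `provision_cdc_nodes` uses the defaults; `provision_cdc_nodes 5` would request five TiCDC nodes. Terraform prints the plan and asks for confirmation before changing anything.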
Deploy TiCDC
To deploy TiCDC, you can edit the TidbCluster CR:
In the editor, add the TiCDC specification:
ticdc:
  baseImage: pingcap/ticdc
  replicas: 3
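The edit step can be sketched as follows. The cluster name `basic` and the namespace `demo` are assumptions inferred from the pod names and TiCDC addresses on this page, and the function names are illustrative. The second function is a non-interactive alternative that merges the same `ticdc` section into `spec` with `kubectl patch`.

```shell
# Assumed names, inferred from the pod and service names on this page.
NAMESPACE="${NAMESPACE:-demo}"
CLUSTER="${CLUSTER:-basic}"

# Open the TidbCluster CR in an editor; add the ticdc section under spec.
edit_tidbcluster() {
  kubectl edit tidbcluster "$CLUSTER" -n "$NAMESPACE"
}

# Non-interactive alternative: merge the same ticdc section into spec.
add_ticdc_spec() {
  kubectl patch tidbcluster "$CLUSTER" -n "$NAMESPACE" --type merge \
    -p '{"spec":{"ticdc":{"baseImage":"pingcap/ticdc","replicas":3}}}'
}
```

Either `edit_tidbcluster` or `add_ticdc_spec` should leave the CR with the same TiCDC specification.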
Once you have saved the changes, TiDB Operator starts deploying TiCDC. You can use the following command to check the status of the TiCDC pods:
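A minimal sketch of that check, assuming the cluster is named `basic` and runs in the `demo` namespace (both inferred from names on this page). The helper names are illustrative; `wait_for_ticdc_pod` is an optional extra that blocks until one TiCDC pod reports Ready.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace
CLUSTER="${CLUSTER:-basic}"      # assumed cluster name

# List every pod of the cluster, including the new TiCDC pods.
list_cluster_pods() {
  kubectl get pods -n "$NAMESPACE"
}

# Block until the TiCDC pod with the given ordinal (default 0) is Ready.
wait_for_ticdc_pod() {
  ordinal="${1:-0}"
  kubectl wait --for=condition=Ready "pod/${CLUSTER}-ticdc-${ordinal}" \
    -n "$NAMESPACE" --timeout=300s
}
```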
NAME                              READY   STATUS    RESTARTS   AGE
basic-discovery-6bb656bfd-sps8z   1/1     Running   0          4h7m
basic-pd-0                        1/1     Running   0          4h7m
basic-pd-1                        1/1     Running   0          4h7m
basic-pd-2                        1/1     Running   2          4h7m
basic-ticdc-0                     1/1     Running   0          3h15m
basic-ticdc-1                     1/1     Running   0          3h15m
basic-ticdc-2                     1/1     Running   0          3h15m
basic-tidb-0                      2/2     Running   0          4h6m
basic-tidb-1                      2/2     Running   0          4h6m
basic-tikv-0                      1/1     Running   0          4h7m
basic-tikv-1                      1/1     Running   0          4h7m
basic-tikv-2                      1/1     Running   0          4h7m
You can use the following command to check the status of the TiCDC service:
NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)              AGE
basic-discovery    ClusterIP   10.108.13.111    <none>        10261/TCP            3h37m
basic-pd           ClusterIP   10.103.226.105   <none>        2379/TCP             3h37m
basic-pd-peer      ClusterIP   None             <none>        2380/TCP             3h37m
basic-ticdc-peer   ClusterIP   None             <none>        8301/TCP             165m
basic-tidb         ClusterIP   10.108.186.92    <none>        4000/TCP,10080/TCP   3h35m
basic-tidb-peer    ClusterIP   None             <none>        10080/TCP            3h35m
basic-tikv-peer    ClusterIP   None             <none>        20160/TCP            3h36m
Take note of the ClusterIP of basic-pd and basic-tidb; you will need both when creating the changefeed.
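The service listing and the two ClusterIPs can be fetched with a sketch like the one below. The `demo` namespace is an assumption inferred from the TiCDC addresses on this page, and the helper names are illustrative.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace

# List all services of the cluster.
list_services() {
  kubectl get svc -n "$NAMESPACE"
}

# Print only the ClusterIP of the named service.
cluster_ip() {
  kubectl get svc "$1" -n "$NAMESPACE" -o jsonpath='{.spec.clusterIP}'
}
```

For example, `PD_IP="$(cluster_ip basic-pd)"` and `TIDB_IP="$(cluster_ip basic-tidb)"` capture the two addresses in shell variables instead of copying them by hand.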
Create Changefeed
To create a changefeed, first log in to one of the TiCDC pods:
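A minimal sketch of the login step, assuming the `demo` namespace and `basic` cluster name inferred from this page, and assuming the TiCDC image ships a `sh` binary. The function name is illustrative.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace
CLUSTER="${CLUSTER:-basic}"      # assumed cluster name

# Open an interactive shell in the TiCDC pod with the given ordinal (default 0).
open_ticdc_shell() {
  ordinal="${1:-0}"
  kubectl exec -it "${CLUSTER}-ticdc-${ordinal}" -n "$NAMESPACE" -- sh
}
```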
Inside the pod, you can first inspect the TiCDC cluster:
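A minimal sketch of the inspection command, to be run inside the pod. The `/cdc` path matches the other commands on this page; the PD address reuses the `basic-pd` ClusterIP from the service listing and should be replaced with your own. The variable and function names are illustrative.

```shell
CDC_BIN="${CDC_BIN:-/cdc}"                         # binary path used on this page
PD_ADDR="${PD_ADDR:-http://10.103.226.105:2379}"   # basic-pd ClusterIP; replace with yours

# List the TiCDC captures (one per TiCDC pod) registered with PD.
list_captures() {
  "$CDC_BIN" cli capture list --pd="$PD_ADDR"
}
```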
[
  {
    "id": "391d4695-a4fb-456a-b800-5a07fb1bc9d6",
    "is-owner": false,
    "address": "basic-ticdc-0.basic-ticdc-peer.demo.svc:8301"
  },
  {
    "id": "659b88a5-0656-47bf-997f-f47956ae9e1e",
    "is-owner": true,
    "address": "basic-ticdc-2.basic-ticdc-peer.demo.svc:8301"
  },
  {
    "id": "c83b6c55-8293-4613-9f49-73c6142abc75",
    "is-owner": false,
    "address": "basic-ticdc-1.basic-ticdc-peer.demo.svc:8301"
  }
]
To create a changefeed, you can execute the following command:
/cdc cli changefeed create --sink-uri="mysql://root:@${tidb_CLUSTER-IP}:4000/" --pd="http://${pd_CLUSTER-IP}:2379"
Create changefeed successfully!
ID: 145ee6dd-1220-43f2-8d0b-423ab175944f
Info: {"sink-uri":"mysql://root:@10.104.118.45:4000/","opts":{},"create-time":"2020-05-30T19:34:11.4398499Z","start-ts":417036304749166593,"target-ts":0,"admin-job-type":0,"sort-engine":"memory","sort-dir":".","config":{"case-sensitive":true,"filter":{"ignore-txn-start-ts":null,"ddl-white-list":null},"mounter":{"worker-num":16},"sink":{"dispatch-rules":null},"cyclic-replication":{"enable":false,"replica-id":0,"filter-replica-ids":null,"id-buckets":0,"sync-ddl":false}}}
You can check the processors that are currently in progress:
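A minimal sketch of that check, run inside the TiCDC pod. The PD address is the `basic-pd` ClusterIP from the service listing and should be replaced with your own; the names are illustrative. `query_changefeed` is an optional extra that prints the details of a single changefeed by ID.

```shell
CDC_BIN="${CDC_BIN:-/cdc}"                         # binary path used on this page
PD_ADDR="${PD_ADDR:-http://10.103.226.105:2379}"   # basic-pd ClusterIP; replace with yours

# List the processors, each pairing a changefeed ID with a capture ID.
list_processors() {
  "$CDC_BIN" cli processor list --pd="$PD_ADDR"
}

# Show the details of one changefeed, given its ID.
query_changefeed() {
  "$CDC_BIN" cli changefeed query --changefeed-id="$1" --pd="$PD_ADDR"
}
```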
[
  {
    "changefeed-id": "145ee6dd-1220-43f2-8d0b-423ab175944f",
    "capture-id": "e2692613-9aaf-408e-8718-3d710fd2117e"
  }
]
Run Sysbench
It is recommended to explore TiCDC with an empty database.
mysql-host=${upstream_tidb_EXTERNAL-IP}
mysql-port=4000
mysql-user=root
mysql-db=cdc
time=1200
threads=8
report-interval=10
db-driver=mysql
To prepare data, you can run the following command:
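A minimal sketch of the preparation step, under several assumptions: the options above are saved in a file named `sysbench.cfg` (a hypothetical name), the `oltp_insert` workload is used (any built-in `oltp_*` script creates the same `sbtest` tables during `prepare`), and the table count and size are illustrative values. The `cdc` database must exist before sysbench can fill it.

```shell
SYSBENCH_CONFIG="${SYSBENCH_CONFIG:-sysbench.cfg}"   # hypothetical file holding the options above

# Create the empty cdc database on the upstream TiDB ($1 = host).
create_cdc_database() {
  mysql -h "$1" -P 4000 -u root -e "CREATE DATABASE IF NOT EXISTS cdc"
}

# Create and fill the sbtest tables ($1 = table count, $2 = rows per table).
prepare_sysbench_data() {
  tables="${1:-1}"
  table_size="${2:-10000}"
  sysbench --config-file="$SYSBENCH_CONFIG" oltp_insert \
    --tables="$tables" --table-size="$table_size" prepare
}
```

With the defaults this creates a single table, `cdc.sbtest1`, which is the table checked in the verification step.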
Verify Data
Verify Data is Synced
You can get the checksum of the cdc.sbtest1 table in both the upstream and downstream TiDB clusters:
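A minimal sketch of the comparison, assuming the `mysql` client can reach both clusters on port 4000 as `root` with an empty password (as in the sink URI used earlier). It relies on TiDB's `ADMIN CHECKSUM TABLE` statement, whose third result column is the checksum. The function names and the host arguments are illustrative.

```shell
# Print the checksum of cdc.sbtest1 on the given TiDB host ($1).
table_checksum() {
  # -N drops the header row, -B gives tab-separated output;
  # the third column is Checksum_crc64_xor.
  mysql -h "$1" -P 4000 -u root -N -B \
    -e "ADMIN CHECKSUM TABLE cdc.sbtest1" | awk '{print $3}'
}

# Compare upstream ($1) and downstream ($2); return non-zero on mismatch.
compare_checksums() {
  upstream_sum="$(table_checksum "$1")"
  downstream_sum="$(table_checksum "$2")"
  if [ -n "$upstream_sum" ] && [ "$upstream_sum" = "$downstream_sum" ]; then
    echo "match: $upstream_sum"
  else
    echo "mismatch: upstream=$upstream_sum downstream=$downstream_sum"
    return 1
  fi
}
```

Run the comparison only after the workload has stopped and replication has caught up; while writes are still in flight the downstream is expected to lag behind.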
The checksum values should match. You can run SQL queries for further data verification.
Cleanup
Remove Changefeed
/cdc cli changefeed remove --changefeed-id=145ee6dd-1220-43f2-8d0b-423ab175944f --pd="http://10.103.226.105:2379"
Check that the removal was successful:
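One way to run this check inside the TiCDC pod is sketched below. Using `changefeed list` here is an assumption (`processor list` would serve equally well), and the PD address is the one from the removal command above. The function name is illustrative.

```shell
CDC_BIN="${CDC_BIN:-/cdc}"                         # binary path used on this page
PD_ADDR="${PD_ADDR:-http://10.103.226.105:2379}"   # PD address from the removal command

# Succeed only when no changefeed is left.
changefeeds_removed() {
  remaining="$("$CDC_BIN" cli changefeed list --pd="$PD_ADDR")"
  # Strip whitespace so "[]" and "[ ]" compare equal.
  [ "$(printf '%s' "$remaining" | tr -d '[:space:]')" = "[]" ]
}
```

`changefeeds_removed && echo "no changefeeds left"` prints the message only when the list is empty.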
[]
Remove TiCDC in TidbCluster CR
You can remove TiCDC from the TidbCluster CR:
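The removal can be done interactively by deleting the `ticdc` section in `kubectl edit tidbcluster basic -n demo`, or non-interactively with a JSON patch as sketched below. The cluster name `basic` and namespace `demo` are assumptions inferred from this page, and the function name is illustrative.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace
CLUSTER="${CLUSTER:-basic}"      # assumed cluster name

# Remove the ticdc section from the TidbCluster spec.
remove_ticdc_spec() {
  kubectl patch tidbcluster "$CLUSTER" -n "$NAMESPACE" --type json \
    -p '[{"op":"remove","path":"/spec/ticdc"}]'
}
```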
Delete TiCDC StatefulSet
After that, you can delete the TiCDC StatefulSet:
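A minimal sketch of the listing and deletion, assuming the `demo` namespace and `basic` cluster name inferred from this page. The function names are illustrative.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace
CLUSTER="${CLUSTER:-basic}"      # assumed cluster name

# List the StatefulSets of the cluster.
list_statefulsets() {
  kubectl get statefulset -n "$NAMESPACE"
}

# Delete only the TiCDC StatefulSet.
delete_ticdc_statefulset() {
  kubectl delete statefulset "${CLUSTER}-ticdc" -n "$NAMESPACE"
}
```

Running `list_statefulsets` again after the deletion confirms that `basic-ticdc` is gone.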
NAME          READY   AGE
basic-pd      0/3     2d12h
basic-ticdc   0/3     2d11h
basic-tidb    0/2     2d12h
basic-tikv    0/3     2d12h
statefulset.apps "basic-ticdc" deleted
You can verify that the StatefulSet is successfully deleted:
Troubleshooting
If the TiCDC pods are stuck in the Terminating state, you can force delete them:
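A minimal sketch of the force deletion, assuming the `demo` namespace, the `basic` cluster name, and three TiCDC replicas as configured earlier. The function name is illustrative.

```shell
NAMESPACE="${NAMESPACE:-demo}"   # assumed namespace
CLUSTER="${CLUSTER:-basic}"      # assumed cluster name

# Force delete TiCDC pods 0..N-1 ($1 = replica count, default 3).
force_delete_ticdc_pods() {
  replicas="${1:-3}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    kubectl delete pod "${CLUSTER}-ticdc-${i}" -n "$NAMESPACE" \
      --force --grace-period=0
    i=$((i + 1))
  done
}
```

Force deletion removes the pod object without waiting for the kubelet to confirm that the containers have stopped, so use it only when a normal delete does not complete.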