Data Migration
1. Introduction
Data migration is the process of moving data from one system, storage medium, or application to another. When TuGraph is upgraded or the system hardware environment changes, the data in the original TuGraph service needs to be migrated. Depending on the system environment and software version, this document divides data migration into three schemes:
Compatible migration: when the system environment is the same before and after the migration and the TuGraph software versions are compatible, you can migrate data directly by backup and recovery;
Upgrade migration: when the system environment differs before and after the migration or the TuGraph software versions are incompatible, you must export the data and then re-import it;
Online migration: when the migration is performed on a high-availability cluster and the cluster's network environment is good, the original cluster can be switched smoothly to the new cluster by adding and removing nodes.
The following sections describe these three schemes in detail.
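The selection rule above can be sketched as a small shell function. This is only an illustration: the function name and its yes/no arguments are hypothetical, and checking compatibility before considering online migration is an assumption of this sketch, not something TuGraph prescribes.

```shell
# Pick a migration scheme from three yes/no answers.
# Arguments (hypothetical names, each "yes" or "no"):
#   $1 same_env    system environment is identical before and after
#   $2 compatible  TuGraph versions are compatible
#   $3 ha_cluster  service runs as an HA cluster with a good network
choose_migration_scheme() {
    local same_env="$1"
    local compatible="$2"
    local ha_cluster="$3"
    if [ "$same_env" != "yes" ] || [ "$compatible" != "yes" ]; then
        # Data files cannot be reused as-is: export and re-import.
        echo "upgrade"
    elif [ "$ha_cluster" = "yes" ]; then
        # Assumption: online migration is preferred when it is available.
        echo "online"
    else
        echo "compatible"
    fi
}
```

For example, `choose_migration_scheme yes no no` prints `upgrade`.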
2. Compatible Migration
Compatible migration applies when the system environment remains unchanged and the TuGraph software versions are compatible. In that case the data and stored procedures of the original service can be used in the new service as-is, so they can be migrated directly.
Users can first use the lgraph_backup tool to back up the data, then transfer the backup to the new machine and start the service there. The specific migration steps are as follows:
2.1. Back up data
Back up the data using the lgraph_backup tool:
lgraph_backup -s db -d db.bck
You can also use the cp command directly in this step, but cp copies some redundant metadata. In HA mode it also copies the raft metadata, which causes the cluster to fail to restart after migration.
Therefore, the lgraph_backup tool is recommended over the cp command for data migration.
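As a minimal sketch, the backup step can be wrapped in a function that also packs the result into a single archive for transfer. The function name `backup_and_pack` and the archive naming are assumptions made for illustration; only the `lgraph_backup -s ... -d ...` invocation comes from the step above.

```shell
# Back up a TuGraph data directory and pack the backup for transfer.
# Usage: backup_and_pack <source_dir> <backup_dir>
backup_and_pack() {
    local src="$1"
    local dst="$2"
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    # Same invocation as in the backup step above.
    lgraph_backup -s "$src" -d "$dst" || return 1
    # Pack the backup into <backup_dir>.tar.gz so that it can be copied
    # to the new machine as a single file (for example with scp or rsync).
    tar -czf "$dst.tar.gz" -C "$(dirname "$dst")" "$(basename "$dst")"
}
```

Calling `backup_and_pack db db.bck` would leave both `db.bck` and `db.bck.tar.gz` in the current directory.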
2.2. Start a new service
Use the following command to start the new service. The stored procedures are loaded into the new service automatically:
lgraph_server -c /usr/local/etc/lgraph.json --directory db.bck -d start
2.3. Stop the original service
Use the following command to stop the original service:
lgraph_server -c /usr/local/etc/lgraph.json --directory db.bck -d stop
3. Upgrade Migration
When the user wants to migrate the original service to a different environment (for example, from CentOS 7 to Ubuntu 18.04), or when the TuGraph versions differ significantly and are incompatible (for example, 3.4.0 and 3.6.0), users can first use the lgraph_export tool to export the data to files, transfer them to the new machine, and then use the lgraph_import tool to re-import the data and restart the cluster.
This approach ensures that the data is usable in the new environment, but it is slower, and the stored procedures need to be reloaded. The specific migration steps are as follows:
3.1. Export data
Use the lgraph_export tool to export the data, then transfer the exported data to the new machine:
lgraph_export -d db -e db.export
3.2. Import data
Use the lgraph_import tool to import the data, then manually load the stored procedures (see the client operation steps for details):
lgraph_import -c db.export/import.config -d db
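The export and import steps can be guarded with a simple sanity check, sketched below. It assumes, as the import command above implies, that lgraph_export writes an `import.config` file into the export directory. The function names `export_graph` and `import_graph` are hypothetical.

```shell
# On the original machine: export, then fail early if import.config is missing.
# Usage: export_graph <db_dir> <export_dir>
export_graph() {
    local db_dir="$1"
    local export_dir="$2"
    lgraph_export -d "$db_dir" -e "$export_dir" || return 1
    if [ ! -f "$export_dir/import.config" ]; then
        echo "export incomplete: $export_dir/import.config not found" >&2
        return 1
    fi
}

# On the new machine: import from the transferred export directory.
# Usage: import_graph <export_dir> <db_dir>
import_graph() {
    local export_dir="$1"
    local db_dir="$2"
    if [ ! -f "$export_dir/import.config" ]; then
        echo "nothing to import: $export_dir/import.config not found" >&2
        return 1
    fi
    lgraph_import -c "$export_dir/import.config" -d "$db_dir"
}
```

With the names used in this section, the calls would be `export_graph db db.export` on the original machine and `import_graph db.export db` on the new one.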
3.3. Start a new service
Start the new service on the imported data directory with the following command:
lgraph_server -c /usr/local/etc/lgraph.json --directory db -d start
3.4. Stop the original service
Use the following command to stop the original service:
lgraph_server -c /usr/local/etc/lgraph.json --directory db.export -d stop
4. Online Migration
When migrating data on a server cluster deployed with the high-availability version of TuGraph, you can migrate the service online by adding new nodes and removing old ones, provided the network bandwidth is sufficient. The specific migration steps are as follows:
4.1. Copy data
Use the following command to copy the data on the leader node, then transfer the copy to the machines of the new cluster. Because the leader node has the most complete raft log, copying the leader's data minimizes the time the new nodes need to catch up on the log.
cp -r db db.cp
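Because the copy has to travel to another machine, it can help to pack it and record a checksum before transfer. The sketch below is one possible way to do that with standard tools (`tar` and GNU coreutils `sha256sum`). It is not part of TuGraph, and the function names are hypothetical.

```shell
# Pack a copied data directory and record the archive's SHA-256 checksum.
# Usage: pack_with_checksum <dir>   (path without a trailing slash)
pack_with_checksum() {
    local dir="$1"
    local parent
    local name
    parent=$(dirname "$dir")
    name=$(basename "$dir")
    tar -czf "$parent/$name.tar.gz" -C "$parent" "$name" || return 1
    # Record the bare file name so the .sha256 file stays valid when
    # both files are moved together to another directory or machine.
    ( cd "$parent" && sha256sum "$name.tar.gz" > "$name.tar.gz.sha256" )
}

# Verify an archive against its .sha256 file after transfer.
# Usage: verify_checksum <archive>
verify_checksum() {
    local archive="$1"
    ( cd "$(dirname "$archive")" && sha256sum -c "$(basename "$archive").sha256" )
}
```

Run `pack_with_checksum db.cp` on the leader, copy `db.cp.tar.gz` and `db.cp.tar.gz.sha256` to the new node, and run `verify_checksum db.cp.tar.gz` there before unpacking.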
4.2. Start a new node
Use the following command to join the new node to the cluster. After the node joins, incremental data is synchronized to it automatically:
lgraph_server -c /usr/local/etc/lgraph_ha.json --directory db.cp --ha_conf 192.168.0.1:9090,192.168.0.2:9090,192.168.0.3:9090 -d start
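The `--ha_conf` value in the command above is a comma-separated list of `host:port` entries. When the node list changes during a migration, building that string by hand is error-prone, so a small helper such as the following sketch may be useful. The function name `build_ha_conf` is hypothetical, and the sketch assumes that every node uses the same port, as in the example above.

```shell
# Build the comma-separated host:port list passed to --ha_conf.
# Usage: build_ha_conf <port> <host> [<host> ...]
build_ha_conf() {
    local port="$1"
    shift
    local conf=""
    local host
    for host in "$@"; do
        # Prefix a comma only when conf already holds an entry.
        conf="${conf:+$conf,}$host:$port"
    done
    printf '%s\n' "$conf"
}

ha_conf=$(build_ha_conf 9090 192.168.0.1 192.168.0.2 192.168.0.3)
# The result can then be passed to the server, for example:
# lgraph_server -c /usr/local/etc/lgraph_ha.json --directory db.cp --ha_conf "$ha_conf" -d start
```

Here `ha_conf` holds `192.168.0.1:9090,192.168.0.2:9090,192.168.0.3:9090`, the same value used in the command above.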
4.3. Stop the original node
Stop the service on the original node, and send subsequent application requests directly to the new cluster:
lgraph_server -c /usr/local/etc/lgraph_ha.json --directory db.cp --ha_conf 192.168.0.1:9090,192.168.0.2:9090,192.168.0.3:9090 -d stop