TuGraph DataX

This document describes how to compile and install TuGraph DataX and shows usage examples.

1.Introduction

TuGraph DataX builds on Alibaba's open-source DataX. It adds a TuGraph writer plug-in and support for the jsonline data format, so that other data sources can write data into TuGraph through DataX. The TuGraph DataX repository is at https://github.com/TuGraph-family/DataX. Supported features include:

  • Import data into TuGraph from various heterogeneous data sources such as MySQL, SQL Server, Oracle, PostgreSQL, HDFS, Hive, HBase, OTS, ODPS, Kafka and so on.

  • Export data from TuGraph to the corresponding target data source (to be developed).

For an introduction to the original DataX project, see https://github.com/alibaba/DataX

2.Compile and Install

git clone https://github.com/TuGraph-family/DataX.git
yum install maven
mvn -U clean package assembly:assembly -Dmaven.test.skip=true

The compiled DataX package is in the target directory.

3.Import TuGraph

3.1.Importing text data into TuGraph with DataX

Using the data from the lgraph_import section of the TuGraph manual as an example, we have three CSV data files, as follows:

actors.csv

nm015950,Stephen Chow
nm0628806,Man-Tat Ng
nm0156444,Cecilia Cheung
nm2514879,Yuqi Zhang

movies.csv

tt0188766,King of Comedy,1999,7.3
tt0286112,Shaolin Soccer,2001,7.3
tt4701660,The Mermaid,2016,6.3

roles.csv

nm015950,Tianchou Yin,tt0188766
nm015950,Steel Leg,tt0286112
nm0156444,PiaoPiao Liu,tt0188766
nm2514879,Ruolan Li,tt4701660
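Each row of the roles data refers to an actor and a movie by ID, so it can help to confirm that those IDs exist before importing. The following is a minimal sketch, not part of TuGraph or DataX: the helper names first_column and dangling_roles are hypothetical, and the sample data is inlined from the files above so that the block is self-contained.

```python
import csv
import io

ACTORS = """nm015950,Stephen Chow
nm0628806,Man-Tat Ng
nm0156444,Cecilia Cheung
nm2514879,Yuqi Zhang
"""

MOVIES = """tt0188766,King of Comedy,1999,7.3
tt0286112,Shaolin Soccer,2001,7.3
tt4701660,The Mermaid,2016,6.3
"""

ROLES = """nm015950,Tianchou Yin,tt0188766
nm015950,Steel Leg,tt0286112
nm0156444,PiaoPiao Liu,tt0188766
nm2514879,Ruolan Li,tt4701660
"""


def first_column(text):
    """Return the set of values in the first column of headerless CSV text."""
    return {row[0] for row in csv.reader(io.StringIO(text)) if row}


def dangling_roles(actors, movies, roles):
    """Return the role rows whose actor ID or movie ID is not defined."""
    actor_ids = first_column(actors)
    movie_ids = first_column(movies)
    return [
        row
        for row in csv.reader(io.StringIO(roles))
        if row and (row[0] not in actor_ids or row[2] not in movie_ids)
    ]
```

Calling dangling_roles(ACTORS, MOVIES, ROLES) returns an empty list for the sample data, since every role row points at a known actor and movie.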

Then create three DataX job configuration files:

job_actors.json

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "txtfilereader",
          "parameter": {
            "path": ["actors.csv"],
            "encoding": "UTF-8",
            "column": [
              { "index": 0, "type": "string" },
              { "index": 1, "type": "string" }
            ],
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "tugraphwriter",
          "parameter": {
            "host": "",
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
              {
                "label": "actor",
                "type": "VERTEX",
                "properties": [
                  { "name": "aid", "type": "STRING" },
                  { "name": "name", "type": "STRING" }
                ],
                "primary": "aid"
              }
            ],
            "files": [
              {
                "label": "actor",
                "format": "JSON",
                "columns": ["aid", "name"]
              }
            ]
          }
        }
      }
    ]
  }
}

job_movies.json

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "txtfilereader",
          "parameter": {
            "path": ["movies.csv"],
            "encoding": "UTF-8",
            "column": [
              { "index": 0, "type": "string" },
              { "index": 1, "type": "string" },
              { "index": 2, "type": "string" },
              { "index": 3, "type": "string" }
            ],
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "tugraphwriter",
          "parameter": {
            "host": "",
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
              {
                "label": "movie",
                "type": "VERTEX",
                "properties": [
                  { "name": "mid", "type": "STRING" },
                  { "name": "name", "type": "STRING" },
                  { "name": "year", "type": "STRING" },
                  { "name": "rate", "type": "FLOAT", "optional": true }
                ],
                "primary": "mid"
              }
            ],
            "files": [
              {
                "label": "movie",
                "format": "JSON",
                "columns": ["mid", "name", "year", "rate"]
              }
            ]
          }
        }
      }
    ]
  }
}

job_roles.json

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "txtfilereader",
          "parameter": {
            "path": ["roles.csv"],
            "encoding": "UTF-8",
            "column": [
              { "index": 0, "type": "string" },
              { "index": 1, "type": "string" },
              { "index": 2, "type": "string" }
            ],
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "tugraphwriter",
          "parameter": {
            "host": "",
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
              {
                "label": "play_in",
                "type": "EDGE",
                "properties": [{ "name": "role", "type": "STRING" }]
              }
            ],
            "files": [
              {
                "label": "play_in",
                "format": "JSON",
                "SRC_ID": "actor",
                "DST_ID": "movie",
                "columns": ["SRC_ID", "role", "DST_ID"]
              }
            ]
          }
        }
      }
    ]
  }
}

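The vertex job files share almost all of their structure and differ only in the file path, label, and property list. As an illustration, the sketch below generates such a job from a short description. It assumes the usual DataX job layout (job, then setting and content, with one reader and one writer per content entry); make_vertex_job is a hypothetical helper, not part of DataX or TuGraph, and it does not cover edge jobs or the "optional" property flag.

```python
import json


def make_vertex_job(csv_path, label, properties, primary, host=""):
    """Build a txtfilereader -> tugraphwriter job for one vertex label.

    properties is a list of (name, type) pairs in CSV column order.
    """
    reader = {
        "name": "txtfilereader",
        "parameter": {
            "path": [csv_path],
            "encoding": "UTF-8",
            # One reader column per CSV field, all read as strings.
            "column": [
                {"index": i, "type": "string"} for i in range(len(properties))
            ],
            "fieldDelimiter": ",",
        },
    }
    writer = {
        "name": "tugraphwriter",
        "parameter": {
            "host": host,  # empty by default, as in the examples
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
                {
                    "label": label,
                    "type": "VERTEX",
                    "properties": [
                        {"name": name, "type": type_}
                        for name, type_ in properties
                    ],
                    "primary": primary,
                }
            ],
            "files": [
                {
                    "label": label,
                    "format": "JSON",
                    "columns": [name for name, _ in properties],
                }
            ],
        },
    }
    return {
        "job": {
            "setting": {"speed": {"channel": 1}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }


actors_job = make_vertex_job(
    "actors.csv", "actor", [("aid", "STRING"), ("name", "STRING")], primary="aid"
)
actors_json = json.dumps(actors_job, indent=2)
```

Writing actors_json to a file gives a job equivalent to job_actors.json. Generating the files this way keeps the reader column count and the writer column list in step.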
Start TuGraph with ./lgraph_server -c lgraph_standalone.json -d 'run', then run the following commands in sequence:

python3 datax/bin/datax.py  job_actors.json
python3 datax/bin/datax.py  job_movies.json
python3 datax/bin/datax.py  job_roles.json
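The three commands can also be driven from a small script that stops at the first failure, so that the edge job is not attempted if a vertex job did not finish. This is a sketch under two assumptions: that datax.py exits with a non-zero status when a job fails, and that the script is started from the directory containing datax/. The names build_command and run_jobs are hypothetical.

```python
import subprocess
import sys

# Vertex jobs first, then the edge job that connects them.
JOBS = ["job_actors.json", "job_movies.json", "job_roles.json"]


def build_command(job_file, datax_script="datax/bin/datax.py"):
    """Return the argument list for running one DataX job."""
    return [sys.executable, datax_script, job_file]


def run_jobs(job_files, run=subprocess.run):
    """Run the jobs in order and return the ones that succeeded.

    Stops at the first job whose process exits with a non-zero status.
    The run argument exists so that the function can be tested without DataX.
    """
    finished = []
    for job_file in job_files:
        result = run(build_command(job_file))
        if result.returncode != 0:
            break
        finished.append(job_file)
    return finished
```

Calling run_jobs(JOBS) returns the list of jobs that completed; if it is shorter than JOBS, the first missing entry is the job that failed.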

3.2.Importing MySQL data into TuGraph with DataX

Create the following movies table in the test database:

CREATE TABLE `movies` (
  `mid`  varchar(200) NOT NULL,
  `name` varchar(100) NOT NULL,
  `year` int(11) NOT NULL,
  `rate` float(5,2) unsigned NOT NULL,
  PRIMARY KEY (`mid`)
);

Insert some data into the table

insert into
test.movies (mid, name, year, rate)
values
('tt0188766', 'King of Comedy', 1999, 7.3),
('tt0286112', 'Shaolin Soccer', 2001, 7.3),
('tt4701660', 'The Mermaid',   2016,  6.3);

Create a DataX job configuration file:

job_mysql_to_tugraph.json

Configure by table and fields:

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "root",
            "column": ["mid", "name", "year", "rate"],
            "splitPk": "mid",
            "connection": [
              {
                "table": ["movies"],
                "jdbcUrl": ["jdbc:mysql://"]
              }
            ]
          }
        },
        "writer": {
          "name": "tugraphwriter",
          "parameter": {
            "host": "",
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
              {
                "label": "movie",
                "type": "VERTEX",
                "properties": [
                  { "name": "mid", "type": "STRING" },
                  { "name": "name", "type": "STRING" },
                  { "name": "year", "type": "STRING" },
                  { "name": "rate", "type": "FLOAT", "optional": true }
                ],
                "primary": "mid"
              }
            ],
            "files": [
              {
                "label": "movie",
                "format": "JSON",
                "columns": ["mid", "name", "year", "rate"]
              }
            ]
          }
        }
      }
    ]
  }
}

Alternatively, write a simple SQL query:

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "root",
            "connection": [
              {
                "querySql": [
                  "select mid, name, year, rate from test.movies where year > 2000;"
                ],
                "jdbcUrl": ["jdbc:mysql://"]
              }
            ]
          }
        },
        "writer": {
          "name": "tugraphwriter",
          "parameter": {
            "host": "",
            "port": 7071,
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "default",
            "schema": [
              {
                "label": "movie",
                "type": "VERTEX",
                "properties": [
                  { "name": "mid", "type": "STRING" },
                  { "name": "name", "type": "STRING" },
                  { "name": "year", "type": "STRING" },
                  { "name": "rate", "type": "FLOAT", "optional": true }
                ],
                "primary": "mid"
              }
            ],
            "files": [
              {
                "label": "movie",
                "format": "JSON",
                "columns": ["mid", "name", "year", "rate"]
              }
            ]
          }
        }
      }
    ]
  }
}

Start TuGraph with ./lgraph_server -c lgraph_standalone.json -d 'run', then run the following command:

python3 datax/bin/datax.py  job_mysql_to_tugraph.json
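The two MySQL job variants in this section differ only in the reader: one names a table and its columns, the other supplies a complete query. When reviewing job files it can be useful to tell the two apart mechanically. The sketch below does that; reader_mode is a hypothetical helper, not a DataX API, and it treats querySql as taking precedence over table and column, which is our understanding of mysqlreader and should be checked against the DataX documentation.

```python
def reader_mode(parameter):
    """Return "querySql" or "table" for a mysqlreader parameter dict."""
    connection = parameter["connection"][0]
    if "querySql" in connection:
        # Assumed to override any table/column settings.
        return "querySql"
    if "table" in connection and "column" in parameter:
        return "table"
    raise ValueError("reader needs either querySql or table plus column")


# Reader parameters copied from the two job variants in this section.
table_params = {
    "username": "root",
    "password": "root",
    "column": ["mid", "name", "year", "rate"],
    "splitPk": "mid",
    "connection": [{"table": ["movies"], "jdbcUrl": ["jdbc:mysql://"]}],
}
query_params = {
    "username": "root",
    "password": "root",
    "connection": [
        {
            "querySql": [
                "select mid, name, year, rate from test.movies where year > 2000;"
            ],
            "jdbcUrl": ["jdbc:mysql://"],
        }
    ],
}
```

With the sample data, the table variant imports all three movies, while the query variant imports only the two released after 2000.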

4.Export TuGraph

4.1. Configuration example

TuGraph supports exporting data using DataX. Use the following configuration to export data to a text file:

{
  "job": {
    "setting": {
      "speed": {
        "channel": 1
      }
    },
    "content": [
      {
        "reader": {
          "name": "tugraphreader",
          "parameter": {
            "username": "admin",
            "password": "73@TuGraph",
            "graphName": "Movie_8C5C",
            "queryCypher": "match (n:person) return n.id,n.name,n.born;",
            "url": "bolt://"
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "./result",
            "fileName": "luohw",
            "writeMode": "truncate"
          }
        }
      }
    ]
  }
}

This configuration exports the id, name, and born properties of all person nodes in the TuGraph Movie_8C5C subgraph to the result directory under the current directory. The output file name is luohw followed by a random suffix.

4.2. Parameter Description

When using DataX to export TuGraph data, you need to set the reader to tugraphreader and configure the following 5 parameters:

  • url

    • Description: TuGraph’s bolt server address

    • Required: Yes

    • Default value: None

  • username

    • Description: TuGraph’s username

    • Required: Yes

    • Default value: None

  • password

    • Description: TuGraph’s password

    • Required: Yes

    • Default value: None

  • graphName

    • Description: The selected TuGraph subgraph to be synchronized

    • Required: Yes

    • Default value: None

  • queryCypher

    • Description: The Cypher statement used to read data from TuGraph

    • Required: No

    • Default value: None
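The parameter list above translates directly into a pre-flight check on a job file. The following is a minimal sketch, assuming only what the list states (four required parameters, queryCypher optional); missing_required is a hypothetical helper and not part of tugraphreader.

```python
# Required parameters of tugraphreader, as listed above.
REQUIRED = ("url", "username", "password", "graphName")


def missing_required(parameter):
    """Return the required tugraphreader parameters that are absent or empty."""
    return [key for key in REQUIRED if not parameter.get(key)]


# Reader parameters from the export example in section 4.1.
example = {
    "username": "admin",
    "password": "73@TuGraph",
    "graphName": "Movie_8C5C",
    "queryCypher": "match (n:person) return n.id,n.name,n.born;",
    "url": "bolt://",
}
```

For the example configuration, missing_required(example) returns an empty list. Running such a check before starting DataX reports an incomplete reader section without having to wait for the job to start.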