# TuGraph DataX
> This document describes how to compile, install, and use TuGraph DataX, with usage examples.
## 1. Introduction
TuGraph DataX builds on Alibaba's open-source DataX and adds a TuGraph writer plug-in and support for the jsonline data format, so that other data sources can write data into TuGraph through DataX.
The TuGraph DataX project is hosted at [https://github.com/TuGraph-family/DataX](https://github.com/TuGraph-family/DataX). Supported features include:
- Import data into TuGraph from various heterogeneous data sources such as MySQL, SQL Server, Oracle, PostgreSQL, HDFS, Hive, HBase, OTS, ODPS, Kafka, and so on.
- Export data from TuGraph to the corresponding target data source (to be developed).
For an introduction to the original DataX project, see [https://github.com/alibaba/DataX](https://github.com/alibaba/DataX).
## 2. Compile and Install
```bash
git clone https://github.com/TuGraph-family/DataX.git
cd DataX
# Install Maven if it is not already available (yum-based distributions)
yum install maven
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
```
The compiled DataX package is placed in the `target` directory.
## 3. Import Data into TuGraph
### 3.1. Importing text data into TuGraph with DataX
Taking the data from the `lgraph_import` section of the TuGraph manual as an example, there are three CSV data files:
`actors.csv`
```
nm015950,Stephen Chow
nm0628806,Man-Tat Ng
nm0156444,Cecilia Cheung
nm2514879,Yuqi Zhang
```
`movies.csv`
```
tt0188766,King of Comedy,1999,7.3
tt0286112,Shaolin Soccer,2001,7.3
tt4701660,The Mermaid,2016,6.3
```
`roles.csv`
```
nm015950,Tianchou Yin,tt0188766
nm015950,Steel Leg,tt0286112
nm0628806,,tt0188766
nm0628806,coach,tt0286112
nm0156444,PiaoPiao Liu,tt0188766
nm2514879,Ruolan Li,tt4701660
```
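Each row in `roles.csv` refers to an actor and a movie by ID, so it can be worth checking that every referenced ID is defined before importing. The following is a minimal sketch of such a check. The helper names `read_rows` and `dangling_roles` are hypothetical, and the sketch assumes that an edge whose endpoint is missing would not import cleanly.

```python
import csv
import io

def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))

def dangling_roles(actors, movies, roles):
    """Return the role rows whose actor ID or movie ID is not defined."""
    actor_ids = {row[0] for row in actors}
    movie_ids = {row[0] for row in movies}
    return [r for r in roles if r[0] not in actor_ids or r[2] not in movie_ids]

# The sample data from this section, inlined so the sketch is self-contained.
ACTORS = """nm015950,Stephen Chow
nm0628806,Man-Tat Ng
nm0156444,Cecilia Cheung
nm2514879,Yuqi Zhang
"""
MOVIES = """tt0188766,King of Comedy,1999,7.3
tt0286112,Shaolin Soccer,2001,7.3
tt4701660,The Mermaid,2016,6.3
"""
ROLES = """nm015950,Tianchou Yin,tt0188766
nm015950,Steel Leg,tt0286112
nm0628806,,tt0188766
nm0628806,coach,tt0286112
nm0156444,PiaoPiao Liu,tt0188766
nm2514879,Ruolan Li,tt4701660
"""

missing = dangling_roles(read_rows(ACTORS), read_rows(MOVIES), read_rows(ROLES))
print(missing)  # prints [] because every role refers to a known actor and movie
```

For real files, the same functions can be fed with the contents of `actors.csv`, `movies.csv`, and `roles.csv` read from disk.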
Then create three DataX job configuration files:
`job_actors.json`
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "txtfilereader",
"parameter": {
"path": ["actors.csv"],
"encoding": "UTF-8",
"column": [
{
"index": 0,
"type": "string"
},
{
"index": 1,
"type": "string"
}
],
"fieldDelimiter": ","
}
},
"writer": {
"name": "tugraphwriter",
"parameter": {
"host": "127.0.0.1",
"port": 7071,
"username": "admin",
"password": "73@TuGraph",
"graphName": "default",
"schema": [
{
"label": "actor",
"type": "VERTEX",
"properties": [
{ "name": "aid", "type": "STRING" },
{ "name": "name", "type": "STRING" }
],
"primary": "aid"
}
],
"files": [
{
"label": "actor",
"format": "JSON",
"columns": ["aid", "name"]
}
]
}
}
}
]
}
}
```
`job_movies.json`
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "txtfilereader",
"parameter": {
"path": ["movies.csv"],
"encoding": "UTF-8",
"column": [
{
"index": 0,
"type": "string"
},
{
"index": 1,
"type": "string"
},
{
"index": 2,
"type": "string"
},
{
"index": 3,
"type": "string"
}
],
"fieldDelimiter": ","
}
},
"writer": {
"name": "tugraphwriter",
"parameter": {
"host": "127.0.0.1",
"port": 7071,
"username": "admin",
"password": "73@TuGraph",
"graphName": "default",
"schema": [
{
"label": "movie",
"type": "VERTEX",
"properties": [
{ "name": "mid", "type": "STRING" },
{ "name": "name", "type": "STRING" },
{ "name": "year", "type": "STRING" },
{ "name": "rate", "type": "FLOAT", "optional": true }
],
"primary": "mid"
}
],
"files": [
{
"label": "movie",
"format": "JSON",
"columns": ["mid", "name", "year", "rate"]
}
]
}
}
}
]
}
}
```
`job_roles.json`
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "txtfilereader",
"parameter": {
"path": ["roles.csv"],
"encoding": "UTF-8",
"column": [
{
"index": 0,
"type": "string"
},
{
"index": 1,
"type": "string"
},
{
"index": 2,
"type": "string"
}
],
"fieldDelimiter": ","
}
},
"writer": {
"name": "tugraphwriter",
"parameter": {
"host": "127.0.0.1",
"port": 7071,
"username": "admin",
"password": "73@TuGraph",
"graphName": "default",
"schema": [
{
"label": "play_in",
"type": "EDGE",
"properties": [{ "name": "role", "type": "STRING" }]
}
],
"files": [
{
"label": "play_in",
"format": "JSON",
"SRC_ID": "actor",
"DST_ID": "movie",
"columns": ["SRC_ID", "role", "DST_ID"]
}
]
}
}
}
]
}
}
```
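The three job files above differ only in the input path, the number of columns, and the `schema` and `files` entries. As a rough sketch, they could be generated from a small helper instead of being written by hand. The function `make_job` and its parameters are hypothetical and not part of DataX; the connection defaults are simply copied from the examples above.

```python
import json

def make_job(csv_path, n_columns, schema_entry, file_entry,
             host="127.0.0.1", port=7071,
             username="admin", password="73@TuGraph", graph="default"):
    """Build a txtfilereader -> tugraphwriter job as a Python dict."""
    reader = {
        "name": "txtfilereader",
        "parameter": {
            "path": [csv_path],
            "encoding": "UTF-8",
            # Every column is read as a string, as in the examples above.
            "column": [{"index": i, "type": "string"} for i in range(n_columns)],
            "fieldDelimiter": ",",
        },
    }
    writer = {
        "name": "tugraphwriter",
        "parameter": {
            "host": host,
            "port": port,
            "username": username,
            "password": password,
            "graphName": graph,
            "schema": [schema_entry],
            "files": [file_entry],
        },
    }
    return {
        "job": {
            "setting": {"speed": {"channel": 1}},
            "content": [{"reader": reader, "writer": writer}],
        }
    }

actors_job = make_job(
    "actors.csv", 2,
    {"label": "actor", "type": "VERTEX",
     "properties": [{"name": "aid", "type": "STRING"},
                    {"name": "name", "type": "STRING"}],
     "primary": "aid"},
    {"label": "actor", "format": "JSON", "columns": ["aid", "name"]},
)
print(json.dumps(actors_job, indent=2))
```

Writing the result of `json.dumps` to `job_actors.json` gives a file equivalent to the hand-written one; the movie and role jobs follow the same pattern with their own schema and file entries.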
Start TuGraph with `./lgraph_server -c lgraph_standalone.json -d 'run'`, then run the following commands in sequence:
```
python3 datax/bin/datax.py job_actors.json
```
```
python3 datax/bin/datax.py job_movies.json
```
```
python3 datax/bin/datax.py job_roles.json
```
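The order matters here because the `play_in` edges refer to actor and movie vertices, so the vertex jobs run first. The sketch below shows one way to script the sequence and stop at the first failing job. It assumes that `datax.py` exits with a non-zero status when a job fails; the helper names `build_command` and `run_jobs` are hypothetical.

```python
import subprocess

# Vertex jobs first, then the edge job that refers to them.
JOBS = ["job_actors.json", "job_movies.json", "job_roles.json"]

def build_command(job_file, datax_py="datax/bin/datax.py"):
    """Return the argument list for running one DataX job."""
    return ["python3", datax_py, job_file]

def run_jobs(job_files, runner=subprocess.run):
    """Run jobs in order; stop at the first failure.

    Returns the list of job files that completed successfully.
    The runner is injectable so the logic can be tried without DataX installed.
    """
    done = []
    for job in job_files:
        result = runner(build_command(job))
        # Assumption: a non-zero return code means the job failed.
        if result.returncode != 0:
            break
        done.append(job)
    return done
```

Calling `run_jobs(JOBS)` from the directory that contains `datax/` and the job files runs the same three commands as above.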
### 3.2. Importing MySQL data into TuGraph with DataX
Create the following `movies` table in the `test` database:
```sql
CREATE TABLE `movies` (
`mid` varchar(200) NOT NULL,
`name` varchar(100) NOT NULL,
`year` int(11) NOT NULL,
`rate` float(5,2) unsigned NOT NULL,
PRIMARY KEY (`mid`)
);
```
Insert some data into the table:
```sql
insert into
test.movies (mid, name, year, rate)
values
('tt0188766', 'King of Comedy', 1999, 7.3),
('tt0286112', 'Shaolin Soccer', 2001, 7.3),
('tt4701660', 'The Mermaid', 2016, 6.3);
```
Create a DataX job configuration file in one of the following two ways:
`job_mysql_to_tugraph.json`
**Configure the table and columns**
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"username": "root",
"password": "root",
"column": ["mid", "name", "year", "rate"],
"splitPk": "mid",
"connection": [
{
"table": ["movies"],
"jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/test?useSSL=false"]
}
]
}
},
"writer": {
"name": "tugraphwriter",
"parameter": {
"host": "127.0.0.1",
"port": 7071,
"username": "admin",
"password": "73@TuGraph",
"graphName": "default",
"schema": [
{
"label": "movie",
"type": "VERTEX",
"properties": [
{ "name": "mid", "type": "STRING" },
{ "name": "name", "type": "STRING" },
{ "name": "year", "type": "STRING" },
{ "name": "rate", "type": "FLOAT", "optional": true }
],
"primary": "mid"
}
],
"files": [
{
"label": "movie",
"format": "JSON",
"columns": ["mid", "name", "year", "rate"]
}
]
}
}
}
]
}
}
```
**Write a simple SQL query**
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "mysqlreader",
"parameter": {
"username": "root",
"password": "root",
"connection": [
{
"querySql": [
"select mid, name, year, rate from test.movies where year > 2000;"
],
"jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/test?useSSL=false"]
}
]
}
},
"writer": {
"name": "tugraphwriter",
"parameter": {
"host": "127.0.0.1",
"port": 7071,
"username": "admin",
"password": "73@TuGraph",
"graphName": "default",
"schema": [
{
"label": "movie",
"type": "VERTEX",
"properties": [
{ "name": "mid", "type": "STRING" },
{ "name": "name", "type": "STRING" },
{ "name": "year", "type": "STRING" },
{ "name": "rate", "type": "FLOAT", "optional": true }
],
"primary": "mid"
}
],
"files": [
{
"label": "movie",
"format": "JSON",
"columns": ["mid", "name", "year", "rate"]
}
]
}
}
}
]
}
}
```
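The two reader configurations above differ only in how the source rows are selected: by `table` and `column`, or by `querySql`. The following sketch builds either form from one helper. The function `mysql_reader` is hypothetical, and treating the two forms as mutually exclusive is an assumption made here for clarity, not a rule stated in this document.

```python
def mysql_reader(jdbc_url, username, password,
                 table=None, columns=None, split_pk=None, query_sql=None):
    """Build a mysqlreader block from either table+columns or a SQL query."""
    # Assumption: exactly one of the two selection styles is used per job.
    if (query_sql is None) == (table is None):
        raise ValueError("pass either table (with columns) or query_sql")

    parameter = {"username": username, "password": password}
    if query_sql is not None:
        connection = {"querySql": [query_sql], "jdbcUrl": [jdbc_url]}
    else:
        if not columns:
            raise ValueError("columns are required when table is given")
        parameter["column"] = list(columns)
        if split_pk is not None:
            parameter["splitPk"] = split_pk
        connection = {"table": [table], "jdbcUrl": [jdbc_url]}
    parameter["connection"] = [connection]
    return {"name": "mysqlreader", "parameter": parameter}

URL = "jdbc:mysql://127.0.0.1:3306/test?useSSL=false"

by_table = mysql_reader(URL, "root", "root", table="movies",
                        columns=["mid", "name", "year", "rate"], split_pk="mid")
by_query = mysql_reader(
    URL, "root", "root",
    query_sql="select mid, name, year, rate from test.movies where year > 2000;",
)
```

Either result can be placed under `"reader"` in the job's `content` entry, with the `tugraphwriter` block left unchanged.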
Start TuGraph with `./lgraph_server -c lgraph_standalone.json -d 'run'`, then run the following command:
```shell
python3 datax/bin/datax.py job_mysql_to_tugraph.json
```
## 4. Export Data from TuGraph
### 4.1. Configuration example
TuGraph supports exporting data using DataX. Use the following configuration to export data to a text file:
```json
{
"job": {
"setting": {
"speed": {
"channel": 1
}
},
"content": [
{
"reader": {
"name": "tugraphreader",
"parameter": {
"username": "admin",
"password": "73@TuGraph",
"graphName": "Movie_8C5C",
"queryCypher": "match (n:person) return n.id,n.name,n.born;",
"url": "bolt://100.83.30.35:27687"
}
},
"writer": {
"name": "txtfilewriter",
"parameter": {
"path": "./result",
"fileName": "luohw",
"writeMode": "truncate"
}
}
}
]
}
}
```
With this configuration file, DataX exports the `id`, `name`, and `born` properties of all `person` vertices in the TuGraph subgraph `Movie_8C5C` to the `result` directory under the current directory.
The output file name is `luohw` followed by a random suffix.
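Because the output file name carries a random suffix, a script that consumes the export has to find the file by prefix. The sketch below collects every file in `./result` whose name starts with `luohw` and parses it as delimited text. It assumes the writer uses a comma as the field delimiter and UTF-8 encoding, since the configuration above sets neither explicitly; the function `read_export` is hypothetical.

```python
import csv
import glob
import os

def read_export(directory="./result", prefix="luohw", delimiter=","):
    """Read all exported files matching the prefix and return their rows.

    Assumptions: comma-delimited UTF-8 text, one record per line.
    """
    rows = []
    pattern = os.path.join(directory, prefix + "*")
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as f:
            rows.extend(csv.reader(f, delimiter=delimiter))
    return rows
```

Each returned row is a list of strings in the order of the `return` clause in `queryCypher`, here `n.id`, `n.name`, and `n.born`.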
### 4.2. Parameter Description
When using DataX to export TuGraph data, set the reader to `tugraphreader` and configure the following five parameters:
* **url**
  * Description: The address of TuGraph's Bolt server
  * Required: Yes
  * Default value: None
* **username**
  * Description: The TuGraph username
  * Required: Yes
  * Default value: None
* **password**
  * Description: The TuGraph password
  * Required: Yes
  * Default value: None
* **graphName**
  * Description: The TuGraph subgraph to be synchronized
  * Required: Yes
  * Default value: None
* **queryCypher**
  * Description: The Cypher statement used to read data from TuGraph
  * Required: No
  * Default value: None
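A job file can be checked against the list above before it is handed to DataX. The following is a minimal sketch that reports which required `tugraphreader` parameters are missing or empty. The names `REQUIRED_READER_PARAMS` and `missing_reader_params` are hypothetical, and the check covers only presence, not whether the values are valid.

```python
# Parameters marked "Required: Yes" in the list above.
REQUIRED_READER_PARAMS = ("url", "username", "password", "graphName")

def missing_reader_params(parameter):
    """Return the required tugraphreader parameters that are absent or empty."""
    return [key for key in REQUIRED_READER_PARAMS if not parameter.get(key)]

# The reader parameters from the configuration example in section 4.1.
example = {
    "username": "admin",
    "password": "73@TuGraph",
    "graphName": "Movie_8C5C",
    "queryCypher": "match (n:person) return n.id,n.name,n.born;",
    "url": "bolt://100.83.30.35:27687",
}
print(missing_reader_params(example))  # prints []
```

An empty list means all required parameters are present; otherwise the list names the ones to add.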