London, July 6, 2016
In the previous blog post we analyzed the reasons that may lead us to configure a migration performed through OrientDB Teleporter. We have mainly seen that the issue with missing foreign keys’ definition can be overcome thanks to the mapping between logical Relationships of the source database and the Edge classes of the Graph Model. Today we’ll see that we can exploit Teleporter’s configuration even further.
Thanks to this configuration you can:
- Add new Edge classes in the Graph Model even though Relationships are missing.
- Modify existing mappings between Relationships and Edge classes. In fact, if the Relationship you are mapping is already present in your database, it will be overridden with the parameters you defined in the configuration.In this way you can change the name, the direction and the properties of the Edge class correspondent to a Relationship already present in the database schema.
- Define Edge classes for join tables during a migration exploiting the aggregation strategy.
Last time we analyzed a use case where we exploited the first feature using mapping, in order to avoid that missing foreign keys in the database schema could lead to a graph database without edges.
By analyzing another use case in this blog post, we will demonstrate how take advantage of the other two functionalities.
In fact, in some instances we may want to change the default names chosen by Teleporter for the Edge classes, as they are inferred starting from the attributes’ names and they may not be so significant; we might need to add or remove properties to a specific Edge class, or want to modify the attributes of an “aggregator-edge” representing complex Relationships N-N, direction included.
Let’s suppose we have the following E-R Model:
Then let’s presume all the foreign keys are defined in the database schema at the physical layer.
Let’s suppose we want adopt the aggregation strategy for the migration towards OrientDB in order to aggregate the Project_Employee join table. In this way, the join table is not converted into a Vertex class but into an appropriate Edge class, and each field not involved in any relationship with other tables is aggregated in the properties of the new built edge.
Even though the new Graph Model does not reflect the original E-R Model, the aggregation leads to a great saving in terms of resources and avoids a substantial overhead as the aggregation process simplifies the structure of the resulting Graph Model, producing a simpler graph. In fact, each record in the join table will not correspond to a vertex, but an edge between two vertices.
If you want to know more about the aggregation strategy, you can refer to the documentation or you can read this article.
Therefore, performing a direct aggregating migration without any configuration we will obtain the following graph Model:
Now suppose we want to map the following Relationships and Edge classes in a different way:
- Aggregate-Relationship between Project and Employee: we want to invert the direction of the Project2Employee Edge class and consequently change its name to WorksAtProject. Moreover, we want to add a new role property to the Edge class properties list.
- Relationship between Employee and Department: we want to have WorksIn as the new name for the HasDepartment Edge class and add a new sinceproperty to the Edge class.
Regarding the Relationships between Department and City, we want to maintain the default mapping, so we will not configure anything about them.
Consequently, it’s important to allow for the configuration to override the default mapping inferred and executed by Teleporter: so if you must simply change a few details in the final graph model, you will not have to write down the complete configuration, but you may just specify what you want to change.
This is the full configuration we must submit to Teleporter in order to achieve the set goals:
{ "edges": [{ "WorksIn": { "mapping": { "fromTable": "EMPLOYEE", "fromColumns": ["department"], "toTable": "DEPARTMENT", "toColumns": ["id"], "direction": "direct" }, "properties": { "since": { "type": "DATE", "mandatory": true, "readOnly": false, "notNull": false } } }, "WorksAtProject": { "mapping": { "fromTable": "PROJECT", "fromColumns": ["id"], "toTable": "EMPLOYEE", "toColumns": ["id"], "joinTable": { "tableName": "PROJECT_EMPLOYEE", "fromColumns": ["project_id"], "toColumns": ["employee_id"] }, "direction": "inverse" }, "properties": { "role": { "type": "STRING", "mandatory": true, "readOnly": false, "notNull": true } } } }] }
Let’s take a closer look at each Relationship-Edge class mapping.
1. Relationship: Employee-Department, cardinality 1-N.
"WorksIn": { "mapping": { "fromTable": "EMPLOYEE", "fromColumns": ["department"], "toTable": "DEPARTMENT", "toColumns": ["id"], "direction": "direct" }, "properties": { "since": { "type": "DATE", "mandatory": true, "readOnly": false, "notNull": false } } }
In the configuration above we have defined which parameters must to be overridden. We decided to change the name of the Edge class through the WorksIn field, add a since property with a DATE type and maintain the original direction from Employee vertices to Department vertices.
Teleporter will recognize the Relationship you want to override on the basis of the values:
- fromTable
- fromColumns
- toTable
- toColumns
These values must to be coherent with the direction of the foreign key representing the above-mentioned Relationship. Otherwise Teleporter will interpret the relationship as a new one. So, if we want override the Relationship starting from Employee and ending with the Department Vertex class through the following configuration:
"WorksIn": { "mapping": { "fromTable": "DEPARTMENT", "fromColumns": ["id"], "toTable": "EMPLOYEE", "toColumns": ["department"], "direction": "direct" }, …
As result we would obtain the adding of a second relationship with an inverted direction between the two tables. Therefore, our graph model would have two Edge classes where the second one is totally wrong.
So remember to be coherent with the underlying physical schema during the mapping definition.
If you want to reverse the Edge, do not use the fromTable and toTable fields, instead take advantage of the direction field using “inverse” as value.
2. Relationship: Project-Employee, cardinality N-N.
"WorksAtProject": { "mapping": { "fromTable": "PROJECT", "fromColumns": ["id"], "toTable": "EMPLOYEE", "toColumns": ["id"], "joinTable": { "tableName": "PROJECT_EMPLOYEE", "fromColumns": ["project_id"], "toColumns": ["employee_id"] }, "direction": "inverse" }, "properties": { "role": { "type": "STRING", "mandatory": true, "readOnly": false, "notNull": true } } }
The aggregation during the migration is performed on join tables which only allow join operations between two tables.
Each candidate join table is converted into an appropriate Edge class, and each attribute of the table itself is aggregated into the properties of the new built Edge. We can define this “aggregator-edge” both if foreign keys between joint table and external tables are defined or not.
If no foreign keys are defined you can kill two birds with one stone. In fact, you can declare the two Relationships with the external tables and define the mapping with an Aggregator-Edge class in one shot. Otherwise, if foreign keys are present in the schema you can act on the mapping anyway. In our example, we changed the direction of the Edge class and consequently, through the WorksAtProject field, its name as well.
The implicit direction of the Edge is set according to the fromTable and toTable fields, setting them as follows:
- "fromTable": "PROJECT"
- "toTable": "EMPLOYEE"
We decided to express the relationship between Projects and Employees through edges starting from Project vertices and ending into Employee vertices.
But we are even free to exploit the semantics of the direction field in order to reverse the direction of the final Edge, conveniently renamed WorksAtProject.
Moreover we added a role property of type STRING using the syntax already adopted for the previous mapping.
As shown in the example, in order to activate the aggregation you must define the additional joinTable field:
"joinTable": { "tableName": "PROJECT-EMPLOYEE", "fromColumns": ["project_id"], "toColumns": ["employee_id"] }
This info is essential for Teleporter to infer all the single relationships between the records belonging to the "EMPLOYEE" and "PROJECT" tables and to coherently build all the edges:
- tableName: the name of the join table which will be aggregated into the declared Edge.
- fromColumns: the join table’s columns involved in the relationship with the "fromTable".
- toColumns: the join table’s columns involved in the relationship with the "toTable".
Submitting this configuration to Teleporter, we will finally obtain the following Graph Model:
That’s it, we achieved our goals by acting directly on the mapping between the Relationships of the source database and the Edge classes of the OrientDB graph database.
Now you are ready to start with your own data!
Stay tuned,
Gabriele Ponzi
OrientDB Ltd
References
Teleporter documentation index:
https://github.com/orientechnologies/orientdb-docs/blob/master/Teleporter-Index.md
Teleporter configuration:
https://github.com/orientechnologies/orientdb-docs/blob/master/Teleporter-Import-Configuration.md