Import from a Relational Database
Relational databases typically query and manipulate data with SQL. Given that OrientDB supports a subset of SQL, it is relatively straightfoward to import data from a Relational databases to OrientDB. You can manage imports using the Java API, OrientDB Studio or the OrientDB Console. The examples below use the Console.
This guide covers importing into the Document Model. Beginning with version 2.0, you can import into the Graph Model using the ETL Module. From version 1.7.x you can still use ETL by installing it as a separate module
For these examples, assume that your Relational database, (referred to as
reldb in the code), contains two tables:
Comment. The relationship between these tables is one-to-many.
SELECT * FROM post;+----+----------------+ | id | title | +----+----------------+ | 10 | NoSQL movement | | 20 | New OrientDB | +----+----------------+ reldb>
SELECT * FROM comment;+----+--------+--------------+ | id | postId | text | +----+--------+--------------+ | 0 | 10 | First | | 1 | 10 | Second | | 21 | 10 | Another | | 41 | 20 | First again | | 82 | 20 | Second Again | +----+--------+--------------+
Given that the Relational Model doesn't use concepts from Object Oriented Programming, there are some things to consider in the transition from a Relational database to OrientDB.
In Relational databases there is no concept of class, so in the import to OrientDB you need to create on class per table.
In Relational databases, one-to-many references invert from the target table to the source table.
Table POST <- (foreign key) Table COMMENT
In OrientDB, it follows the Object Oriented Model, so you have a collection of links connecting instances of
Class POST ->* (collection of links) Class COMMENT
Exporting Relational Databases
Most Relational database management systems provide a way to export the database into SQL format. What you specifically need from this is a text file that contains the SQL
INSERT commands to recreate the database from scratch. For example,
- MySQL: the
- Oracle Database: the Datapump utilities.
- Microsoft SQL Server: the Import and Export Wizard.
When you run this utility on the example database, it produces an
.sql file that contains the exported SQL of the Relational database.
DROP TABLE IF EXISTS post; CREATE TABLE post ( id INT(11) NOT NULL AUTO_INCREMENT, title VARCHAR(128), PRIMARY KEY (id) ); DROP TABLE IF EXISTS comment; CREATE TABLE comment ( id INT(11) NOT NULL AUTO_INCREMENT, postId INT(11), text TEXT, PRIMARY KEY (id), CONSTRAINT `fk_comments` FOREIGN KEY (`postId` ) REFERENCES `post` (`id` ) ); INSERT INTO POST (id, title) VALUES( 10, 'NoSQL movement' ); INSERT INTO POST (id, title) VALUES( 20, 'New OrientDB' ); INSERT INTO COMMENT (id, postId, text) VALUES( 0, 10, 'First' ); INSERT INTO COMMENT (id, postId, text) VALUES( 1, 10, 'Second' ); INSERT INTO COMMENT (id, postId, text) VALUES( 21, 10, 'Another' ); INSERT INTO COMMENT (id, postId, text) VALUES( 41, 20, 'First again' ); INSERT INTO COMMENT (id, postId, text) VALUES( 82, 20, 'Second Again' );
Modifying the Export File
Importing from the Relational database requires that you modify the SQL file to make it usable by OrientDB. In order to do this, you need to open the SQL file, (called
export.sql below), in a text editor and modify the commands there. Once this is done, you can execute the file on the Console using batch mode.
In order to import a data into OrientDB, you need to have a database ready to receive the import. Note that the example
export.sql file doesn't include statements to create the database. You can either create a new database or use an existing one.
Using New Databases
In creating a database for the import, you can either create a volatile in-memory database, (one that is only available while OrientDB is running), or you can create a persistent disk-based database. For a persistent database, you can create it on a remote server or locally through the PLocal mode.
The recommended method is PLocal, given that it offers better performance on massive inserts.
Using the embedded Plocal mode:
CREATE DATABASE PLOCAL:/tmp/db/blog admin_user admin_passwd PLOCAL DOCUMENT
CREATE DATABASEcommand creates a new database at
Using the Remote mode:
CREATE DATABASE REMOTE:localhost/blog root_user dkdf383dhdsj PLOCAL DOCUMENT
This creates a database at the URL
NOTE: When you create remote databases, you need the server credentials to access it. The user
rootand its password are stored in the
Using Existing Databases
In the event that you already have a database set up and ready for the import, instead of creating a database add a line that connects to that databases, using the
Using the embedded PLocal mode:
CONNECT PLOCAL:/tmp/db/blog admin_user admin_passwd
This connects to the database at
Using the Remote mode:
CONNECT REMOTE:localhost/blog admin_user admin_passwd
This connects to the database at the URL
In the SQL file, after you create or connect to the database, you need to declare your intention to perform a massive insert. Intents allow you to utilize automatic tuning OrientDB for maximum performance on particular operations, such as large inserts or reads.
DECLARE INTENT MASSIVEINSERT
Relational databases have no parallel to concepts in Object Oriented programming, such as classes. Conversely, OrientDB doesn't have a concept of tables in the Relational sense.
Modify the SQL file, changing
CREATE TABLE statements to
CREATE CLASS commands:
CREATE CLASS Post CREATE CLASS Comment
NOTE: In cases where your Relational database was created using Object Relational Mapping, or ORM, tools, such as Hibernate or Data Nucleus, you have to rebuild the original Object Oriented Structure directly in OrientDB.
In the Relational database, the relationship between the
comment was handled through foreign keys on the
id fields. OrientDB handles relationships differently, using links between two or more records of the Document type.
By default, the
CREATE LINK command creates a direct relationship in your object model. Navigation goes from
Comment and not vice versa, which is the case for the Relational database. You'll need to use the
INVERSE keyword to make the links work in both directions.
Add the following line after the
CREATE LINK comments TYPE LINKSET FROM comment.postId TO post.id INVERSE
Unlike how Relational databases handle tables, OrientDB does not require you to create a strict schema on your classes. The properties on each class are defined through the
INSERT statements. That is,
Given that you created a link in the above section, the property
postId is no longer necessary. Instead of modifying each
INSERT statement, you can use the
UPDATE command to remove them at the end:
UPDATE comment REMOVE postId
Bear in mind, this is an optional step. The database will still function if you leave this field in place.
When you've finished, remove any statements that OrientDB does not support. With the changes above this leaves you with a file similar to the one below:
cat export.sqlCONNECT plocal:/tmp/db/blog admin admin DECLARE INTENT MASSIVEINSERT CREATE CLASS Post CREATE CLASS Comment INSERT INTO Post (id, title) VALUES( 10, 'NoSQL movement' ) INSERT INTO Post (id, title) VALUES( 20, 'New OrientDB' ) INSERT INTO Comment (id, postId, text) VALUES( 0, 10, 'First' ) INSERT INTO Comment (id, postId, text) VALUES( 1, 10, 'Second' ) INSERT INTO Comment (id, postId, text) VALUES( 21, 10, 'Another' ) INSERT INTO Comment (id, postId, text) VALUES( 41, 20, 'First again' ) INSERT INTO Comment (id, postId, text) VALUES( 82, 20, 'Second Again' ) CREATE LINK comments TYPE LINKSET FROM Comment.postId TO Post.id INVERSE UPDATE Comment REMOVE postId
When you finish modifying the SQL file, you can execute it through the Console in batch mode. This is done by starting the Console with the SQL file given as the first argument.
When the OrientDB starts, it executes each of the commands given in the SQL files, creating or connecting to the database, creating the classes and inserting the data from the Relational database. You now have a working instance of OrientDB to use.
Using the Database
You now have an OrientDB Document database where relationships are direct and handled without the use of joins.
Query for all posts with comments:
SELECT FROM Post WHERE comments.size() > 0
Query for all posts where the comments contain the word "flame" in the
SELECT FROM Post WHERE comments CONTAINS(text LIKE '%flame%')
Query for all posts with comments made today, assuming that you have added a
dateproperty to the
SELECT FROM Post WHERE comments CONTAINS(date > '2011-04-14 00:00:00')
For more information, see