OrientDB with Apache TinkerPop 3
(since v 3.0)
OrientDB adheres to the Apache TinkerPop standard and implements TinkerPop Stack interfaces.
In versions pior to 3.0, OrientDB uses the TinkerPop 2.x implementation as the default for Java Graph API.
Starting from version 3.0, OrientDB ships its own APIs for handling Graphs (Multi-Model API). Those APIs are used to implement the TinkerPop 3 interfaces.
The OrientDB TinkerPop development happens here
Installation
Since TinkePop stack has been removed as a dependency from the OrientDB Community Edition, starting with version 3.0 it will be available for download as an Apache TinkerPop 3 enabled edition of OrientDB based on the Community Edition.
It contains all the features of the OrientDB Community Edition plus the integration with the Tinkerpop stack:
Source Code Installation
In addition to the binary packages, you can compile the OrientDB TinkerPop enabled distribution from the source code. This process requires that you install Git and Apache Maven on your system.
As a pre-requirement, you have to build the same branch for OrientDB core distribution:
$ git clone https://github.com/orientechnologies/orientdb
$ cd orientdb
$ git checkout develop
$ mvn clean install -DskipTests
$ cd ..
Then you can proceed and build orientdb-gremlin
$ git clone https://github.com/orientechnologies/orientdb-gremlin
$ cd orientdb-gremlin
$ git checkout develop
$ mvn clean install
It is possible to skip tests:
$ mvn clean install -DskipTests
This project follows the branching system of OrientDB.
The build process installs all jars in the local maven repository and creates archives under the distribution
module inside the target
directory. At the time of writing, building from branch develop
gave:
$ls -l distribution/target/
total 266640
drwxr-xr-x 2 staff staff 68B Mar 21 13:07 archive-tmp/
drwxr-xr-x 3 staff staff 102B Mar 21 13:07 dependency-maven-plugin-markers/
drwxr-xr-x 12 staff staff 408B Mar 21 13:07 orientdb-community-3.0.0-SNAPSHOT/
drwxr-xr-x 3 staff staff 102B Mar 21 13:07 orientdb-gremlin-community-3.0.0-SNAPSHOT.dir/
-rw-r--r-- 1 staff staff 65M Mar 21 13:08 orientdb-gremlin-community-3.0.0-SNAPSHOT.tar.gz
-rw-r--r-- 1 staff staff 65M Mar 21 13:08 orientdb-gremlin-community-3.0.0-SNAPSHOT.zip
$
The directory orientdb-gremlin-community-3.0.0-SNAPSHOT.dir
contains the OrientDB Gremlin distribution uncompressed.
Gremlin Console
The Gremlin Console is an interactive terminal or REPL that can be used to traverse graphs and interact with the data that they contain. For more information about it see the documentation here
To start the Gremlin console run the gremlin.sh (or gremlin.bat on Windows OS) script located in the bin directory of the OrientDB Gremlin Distribution
$ ./gremlin.sh
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.orientdb
gremlin>
Alternatively if you downloaded the Gremlin Console from the Apache Tinkerpop Site, just install the OrientDB Gremlin Plugin with
gremlin> :install com.orientechnologies orientdb-gremlin {version}
and then activate it
gremlin> :plugin use tinkerpop.orientdb
Open the graph database
Before playing with Gremlin you need a valid OrientGraph instance that points to an OrientDB database. To know all the database types look at Storage types.
When you're working with a local or an in-memory database, if the database does not exist it's created for you automatically. Using the remote connection you need to create the database on the target server before using it. This is due to security restrictions.
Once created the OrientGraph instance with a proper URL is necessary to assign it to a variable. Gremlin is written in Groovy, so it supports all the Groovy syntax, and both can be mixed to create very powerful scripts!
Example with a memory database (see below for more information about it):
gremlin> graph = OrientGraph.open();
==>orientgraph[memory:orientdb-0.5772652169975153]
Some useful links:
Working with in-memory database
In this mode the database is volatile and all the changes will be not persistent. Use this in a clustered configuration (the database life is assured by the cluster itself) or just for test.
gremlin> graph = OrientGraph.open()
==>orientgraph[memory:orientdb-0.4274754723616194]
Working with local database
This is the most often used mode. The console opens and locks the database for exclusive use. This doesn't require starting an OrientDB server.
gremlin> graph = OrientGraph.open("embedded:/tmp/gremlin/demo");
==>orientgraph[plocal:/tmp/gremlin/demo]
Working with a remote database
To open a database on a remote server be sure the server is up and running first. To start the server just launch server.sh (or server.bat on Windows OS) script. For more information look at OrientDB Server
gremlin> graph = OrientGraph.open("remote:localhost/demodb");
==>orientgraph[remote:localhost/demodb]
Use security
OrientDB supports security by creating multiple users and roles associated with certain privileges. To know more look at Security. To open the graph database with a different user than the default, pass the user and password as additional parameters:
gremlin> graph = OrientGraph.open("remote:localhost/demodb","reader","reader");
==>orientgraph[remote:localhost/demodb]
With Configuration
gremlin> config = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@2d38edfd
gremlin> config.setProperty("orient-url","remote:localhost/demodb")
==>null
gremlin> graph = OrientGraph.open(config)
==>orientgraph[remote:localhost/demodb]
Available configurations are :
- orient-url: Connection URL.
- orient-user: Database User.
- orient-pass: Database Password.
- orient-transactional: Transactional. Configuration. If true the
OrientGraph
instance will support transactions (Disabled by default).
Mutating the Graph
Create a new Vertex
To create a new vertex, use the addVertex() method. The vertex will be created and a unique id will be displayed as the return value.
graph.addVertex();
==>v[#5:0]
Create an edge
To create a new edge between two vertices, use the v1.addEdge(label, v2) method. The edge will be created with the label specified.
In the example below two vertices are created and assigned to a variable (Gremlin is based on Groovy), then an edge is created between them.
gremlin> v1 = graph.addVertex();
==>v[#10:0]
gremlin> v2 = graph.addVertex();
==>v[#11:0]
gremlin> e = v1.addEdge('friend',v2)
==>e[#17:0][#10:0-friend->#11:0]
Close the database
To close a graph use the close() method:
gremlin> graph.close()
==>null
This is not strictly necessary because OrientDB always closes the database when the Gremlin Console quits.
Transactions
OrientGraph supports Transactions. By default Tx are disabled. Use configuration to enable it.
gremlin> config = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@2d38edfd
gremlin> config.setProperty("orient-url","remote:localhost/demodb")
==>null
gremlin> config.setProperty("orient-transactional",true)
==>null
gremlin> graph = OrientGraph.open(config)
==>orientgraph[remote:localhost/demodb]
When using transactions OrientDB assigns a temporary identifier to each vertex and edge that is created.
use graph.tx().commit()
to save them.
Transaction Example
gremlin> config = new BaseConfiguration()
==>org.apache.commons.configuration.BaseConfiguration@6b7d1df8
gremlin> config.setProperty("orient-url","remote:localhost/demodb")
==>null
gremlin> config.setProperty("orient-transactional",true)
==>null
gremlin> graph = OrientGraph.open(config)
==>orientgraph[remote:localhost/demodb]
gremlin> graph.tx().isOpen()
==>true
gremlin> v1 = graph.addVertex()
==>v[#-1:-2]
gremlin> v2 = graph.addVertex()
==>v[#-1:-3]
gremlin> e = v1.addEdge("friends",v2)
==>e[#-1:-4][#-1:-2-friends->#-1:-3]
gremlin> graph.tx().commit()
==>null
Traversal
The power of Gremlin is in traversal. Once you have a graph loaded in your database you can traverse it in many different ways. For more info about traversal with Gremlin see here
The entry point for traversal is a TraversalSource that can be easily obtained by an opened OrientGraph instance with:
gremlin> graph = OrientGraph.open()
==>orientgraph[memory:orientdb-0.41138060448051794]
gremlin> g = graph.traversal()
==>graphtraversalsource[orientgraph[memory:orientdb-0.41138060448051794], standard]
Retrieve a vertex
To retrieve a vertex by its ID, use the V(id) method passing the RecordId as an argument (with or without the prefix '#'). This example retrieves the first vertex created in the above example.
gremlin> g.V('#33:0')
==>v[#33:0]
Get all the vertices
To retrieve all the vertices in the opened graph use .V() (V in upper-case):
gremlin> g.V()
==>v[#33:0]
==>v[#33:1]
==>v[#33:2]
==>v[#33:3]
Retrieve an edge
Retrieving an edge is very similar to retrieving a vertex. Use the E(id)
method passing the RecordId as an argument (with or without the prefix '#'). This example retrieves the first edge created in the previous example.
gremlin> g.E('145:0')
==>e[#145:0][#121:0-IsFromCountry->#33:29]
Get all the edges
To retrieve all the edges in the opened graph use .E()
(E in upper-case):
gremlin> g.E()
==>e[#145:0][#121:0-IsFromCountry->#33:29]
==>e[#145:1][#121:1-IsFromCountry->#40:11]
==>e[#145:2][#121:2-IsFromCountry->#37:0]
Basic Traversal
To display all the outgoing edges of a vertex use .outE(<label>)
.
Label is optional and if passed the outgoing edges will be filtered by that label
Example:
gremlin> g.V('#41:1').outE()
==>e[#221:6][#41:1-HasFriend->#42:1]
==>e[#222:6][#41:1-HasFriend->#43:1]
To display all the incoming edges of a vertex use .inE(<label>)
. Example:
gremlin> g.V('#41:1').inE()
==>e[#224:0][#41:0-HasFriend->#41:1]
==>e[#217:2][#42:0-HasFriend->#41:1]
==>e[#217:3][#43:0-HasFriend->#41:1]
==>e[#224:3][#44:0-HasFriend->#41:1]
==>e[#222:4][#45:0-HasFriend->#41:1]
==>e[#219:5][#46:0-HasFriend->#41:1]
==>e[#223:5][#47:0-HasFriend->#41:1]
==>e[#218:6][#48:0-HasFriend->#41:1]
==>e[#185:0][#121:0-HasProfile->#41:1]
For more information look at the Gremlin Traversal.
Filter results
This example returns all the outgoing edges of all the vertices with label equal to 'friend'.
gremlin> g.V('#41:1').inE('HasProfile')
==>e[#185:0][#121:0-HasProfile->#41:1]
gremlin>
Create complex paths
Gremlin allows you to concatenate expressions to create more complex traversals in a single line:
g.V().in().out()
Of course this could be much more complex. Below is an example with the graph taken from the official documentation:
gremlin> g.V('44:0').out('HasFriend').out('HasFriend').values('Name').dedup()
==>Frank
==>Emanuele
==>Paolo
==>Colin
==>Andrey
==>Sergey
gremlin>
Gremlin Server
There are two ways to use OrientDB inside the Gremlin Server
- Use the OrientDB-TP3 distribution that embedd the Gremlin Server
- Install the OrientDB-Gremlin Driver into a GremlinServer
OrientDB-TP3
Dowload the latest version of OrientDB-TP3 here.
and start OrientDB to automatically start the embedded Gremlin Server.
The configuration of the Gremlin Server is in $ORIENTDB_HOME/config
.
When using the embedded version by default the Authentication manager authenticate the users against OrientDB server user
with permission gremlin.server
.
Install OrientDB-Gremlin
Download the latest Gremlin Server distribution here
and then install the OrientDB Gremlin driver with
bin/gremlin-server.sh -i com.orientechnologies orientdb-gremlin ${version}
OrientDB Gremlin Server configuration
YAML configuration example
host: localhost
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
graph : conf/orientdb-empty.properties
}
scriptEngines: {
gremlin-groovy: {
plugins: { org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
org.apache.tinkerpop.gremlin.orientdb.jsr223.OrientDBGremlinPlugin: {},
org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [ scripts/empty-sample.groovy]}}}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.orientdb.io.OrientIoRegistry] }} # application/vnd.gremlin-v3.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} # application/vnd.gremlin-v3.0+gryo-stringd
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.orientdb.io.OrientIoRegistry] }} # application/json
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
- { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000}}
strictTransactionManagement: false
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
enabled: false}
Graph Configuration properties example
gremlin.graph=org.apache.tinkerpop.gremlin.orientdb.OrientFactory
orient-url=plocal:/tmp/graph
orient-user=admin
orient-pass=admin
OrientDB TinkerPop Graph API
In order to use the OrientDB TinkerPop Graph API implementation, you need to create an instance of the OrientGraph
class. If the database already exists, the Graph API opens it.
NOTE: When creating a database through the Graph API, you can only create
PLocal
andMemory
databases. Remote databases must already exist.NOTE: In v. 2.2 and following releases, when using PLocal or Memory,please set MaxDirectMemorySize (JVM setting) to a high value, like 512g
-XX:MaxDirectMemorySize=512g
When building multi-threaded application, use one instance of OrientGraph
per thread. Bear in mind that all graph components, such as vertices and edges, are not thread safe. So, sharing them between threads may result in unpredictable results.
Remember to always close the graph instance when you are done with it, using the .close()
method.
Working with in-memory database
OrientGraph graph = OrientGraph.open();
try {
...
} finally {
graph.close();
}
Working with local database
OrientGraph graph = OrientGraph.open("embedded:/tmp/gremlin/demo");
try {
...
} finally {
graph.close();
}
Working with remote database
OrientGraph graph = OrientGraph.open("remote:localhost/demodb");
try {
...
} finally {
graph.close();
}
Using the Graph Factory
The OrientGraphFactory
act as a factory for OrientGraph
instances
Using a Graph Factory in your application is relatively straightforward. First, create and configure a factory instance using the OrientGraphFactory
class. Then, use the factory whenever you need to create an OrientGraph
instance and shut it down to return it to the pool. When you're done with the factory, close it to release all instances and free up system resources.
OrientGraphFactory factory = new OrientGraphFactory("remote:localhost/demodb","admin","admin");
Then you can:
- retrieve a transactional instance, use the getTx() method on your factory object:
OrientGraph txGraph = factory.getTx();
- retrieve a non-transactional instance, use the getNoTx() method on your factory object:
OrientGraph noTxGraph = factory.getNoTx();
Shutting Down Graph Instances
When you're done with a Graph Database instance, you can return it to the pool by calling the close()
method on the instance. This method does not close the instance. The instance remains open and available for the next requester:
graph.close();
Releasing the Graph Factory
When you're ready to release all instances, call the close()
method on the factory. In the case of pool usage, this also frees up the system resources claimed by the pool:
factory.close();