Gremlin is a language specialized for working with property graphs. It is part of the TinkerPop family of open source products.
To learn more about Gremlin and the other TinkerPop products, subscribe to the Gremlin Group.
Launch the gremlin.sh console script (gremlin.bat on Windows), located in the bin directory:
> gremlin.bat
         \,,,/
         (o o)
-----oOOo-(_)-oOOo-----
Before you can use Gremlin you need a valid OrientGraph instance that points to an OrientDB database. For the full list of database types, see Storage types.
When you work with a local or memory database, the database is created automatically if it does not exist. With a remote connection you must create the database on the target server before using it. This is due to security restrictions.
Once you have created the OrientGraph instance with a proper URL, assign it to a variable. Gremlin is written in Groovy, so it supports the full Groovy syntax, and the two can be mixed to create very powerful scripts.
Example with a local database (see below for more information about it):
gremlin> g = new OrientGraph("local:/home/gremlin/db/demo");
==>orientgraph[local:/home/gremlin/db/demo]
Local mode is the most commonly used. The console opens the database and locks it for exclusive use. No OrientDB Server needs to be running.
gremlin> g = new OrientGraph("local:/home/gremlin/db/demo");
==>orientgraph[local:/home/gremlin/db/demo]
In remote mode you open a database on a remote server. Make sure the server is up and running. To start it, launch the server.sh script (server.bat on Windows). For more information, see OrientDB Server.
gremlin> g = new OrientGraph("remote:localhost/demo");
==>orientgraph[remote:localhost/demo]
In memory mode the database is volatile and changes are not persisted. Use it in a cluster configuration (where the cluster itself keeps the database alive) or just for testing.
gremlin> g = new OrientGraph("memory:demo");
==>orientgraph[memory:demo]
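The URL prefix passed to OrientGraph selects one of the three modes shown above. As a rough summary, the plain-Java sketch below maps a URL to its mode and records whether a missing database is created automatically, as stated earlier for local and memory databases. The StorageMode enum and its fromUrl method are hypothetical names used only for illustration; they are not part of the OrientDB API.

```java
// Hypothetical summary of the three URL prefixes described above; not an OrientDB API.
enum StorageMode {
    LOCAL("local:", true),     // opened directly by the console, no server needed
    REMOTE("remote:", false),  // must already exist on the target server
    MEMORY("memory:", true);   // volatile, changes are not persisted

    final String prefix;
    final boolean createdAutomatically; // true if a missing database is created on open

    StorageMode(String prefix, boolean createdAutomatically) {
        this.prefix = prefix;
        this.createdAutomatically = createdAutomatically;
    }

    // Picks the mode whose prefix starts the URL, or rejects unknown prefixes.
    static StorageMode fromUrl(String url) {
        for (StorageMode mode : values()) {
            if (url.startsWith(mode.prefix)) {
                return mode;
            }
        }
        throw new IllegalArgumentException("Unknown storage prefix in URL: " + url);
    }
}
```

For example, StorageMode.fromUrl("remote:localhost/demo") yields REMOTE, whose createdAutomatically flag is false, matching the requirement to create remote databases beforehand.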
OrientDB supports security through multiple users and roles, each associated with privileges. To learn more, see Security. To open the graph database as a user other than the default, pass the user name and password as additional parameters:
gremlin> g = new OrientGraph("memory:demo", "reader", "reader");
==>orientgraph[memory:demo]
To create a new vertex, use the addVertex() method. The vertex is created and its unique ID is displayed as the return value.
gremlin> g.addVertex();
==>v[#5:0]
To create a new edge between two vertices, use the addEdge(v1, v2, label) method. The edge is created with the specified label.
In the example below, two vertices are created and each is assigned to a variable (Gremlin is based on Groovy), then an edge is created between them.
gremlin> v1 = g.addVertex();
==>v[#5:0]
gremlin> v2 = g.addVertex();
==>v[#5:1]
gremlin> e = g.addEdge(v1, v2, 'friend');
==>e[#6:0][#5:0-friend->#5:1]
OrientDB assigns a temporary identifier to each newly created vertex and edge. To save them to the database, call stopTransaction(SUCCESS):
gremlin> g.stopTransaction(SUCCESS)
To retrieve a vertex by its ID, use the v(id) method, passing the RecordId as argument (with or without the '#' prefix). This example retrieves the first vertex created in the example above.
gremlin> g.v('5:0')
==>v[#5:0]
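Since the RecordId can be written with or without the leading '#', both spellings refer to the same record. The plain-Java sketch below illustrates this by normalizing either form to '#cluster:position'. The RecordIdSketch class and its normalize method are hypothetical names for illustration only; they are not part of OrientDB or TinkerPop, and OrientDB's own parsing may differ in detail.

```java
// Hypothetical helper, not part of OrientDB: shows that "5:0" and "#5:0"
// name the same record once the optional '#' prefix is stripped.
final class RecordIdSketch {

    // Returns the id in canonical "#cluster:position" form, or throws if malformed.
    static String normalize(String id) {
        String body = id.startsWith("#") ? id.substring(1) : id;
        String[] parts = body.split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected cluster:position, got: " + id);
        }
        int cluster = Integer.parseInt(parts[0]);  // cluster id, e.g. 5
        long position = Long.parseLong(parts[1]);  // position inside the cluster, e.g. 0
        return "#" + cluster + ":" + position;
    }

    public static void main(String[] args) {
        // Both calls print the same canonical id.
        System.out.println(normalize("5:0"));
        System.out.println(normalize("#5:0"));
    }
}
```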
To retrieve all the vertices in the opened graph use .V (V in upper-case):
gremlin> g.V
==>v[#5:0]
==>v[#5:1]
Retrieving an edge is very similar: use the e(id) method, passing the RecordId as argument (with or without the '#' prefix). This example retrieves the first edge created in the example above.
gremlin> g.e('6:0')
==>e[#6:0][#5:0-friend->#5:1]
To retrieve all the edges in the opened graph use .E (E in upper-case):
gremlin> g.E
==>e[#6:0][#5:0-friend->#5:1]
The power of Gremlin lies in traversal. Once you have a graph loaded in your database, you can traverse it in many ways.
To display all the outgoing edges of the first vertex created above, append .outE to the vertex. Example:
gremlin> v1.outE
==>e[#6:0][#5:0-friend->#5:1]
To display all the incoming edges of the second vertex created in the previous examples, append .inE to the vertex. Example:
gremlin> v2.inE
==>e[#6:0][#5:0-friend->#5:1]
In this case the edge is the same: it is the outgoing edge of #5:0 and the incoming edge of #5:1.
For more information, see Basic Traversal with Gremlin.
This example returns, for all vertices, the outgoing edges whose label is 'friend':
gremlin> g.V.outE('friend')
==>e[#6:0][#5:0-friend->#5:1]
To close a graph use the shutdown() method:
gremlin> g.shutdown()
==>null
This is not strictly necessary because OrientDB always closes the database when the Gremlin console quits.
Gremlin lets you chain expressions to build more complex traversals in a single line:
v1.outE.inV
Of course, a traversal can be much more complex. Below is an example using the graph from the official documentation:
g = new OrientGraph('memory:test')
// calculate basic collaborative filtering for vertex 1
m = [:]
g.v(1).out('likes').in('likes').out('likes').groupCount(m)
m.sort{a,b -> a.value <=> b.value}
// calculate the primary eigenvector (eigenvector centrality) of a graph
m = [:]; c = 0;
g.V.out.groupCount(m).loop(2){c++ < 1000}
m.sort{a,b -> a.value <=> b.value}
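To make the collaborative filtering expression above more concrete, the plain-Java sketch below mimics what out('likes').in('likes').out('likes').groupCount(m) counts on a tiny in-memory graph. Starting from one user, it walks to the liked items, back to everyone who likes those items, then out to everything those users like, counting how often each item is reached. The class name, the sample data and the map-based graph are assumptions made for illustration; this is not how Gremlin or OrientDB implement the pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a map-based stand-in for a graph with 'likes' edges.
final class CollaborativeFilteringSketch {

    // Counts how often each item is reached via start -> likes -> liked-by -> likes.
    static Map<String, Integer> recommend(Map<String, List<String>> likes, String start) {
        // Reverse index: item -> users who like it (the in('likes') direction).
        Map<String, List<String>> likedBy = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : likes.entrySet()) {
            for (String item : entry.getValue()) {
                likedBy.computeIfAbsent(item, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        List<String> none = Collections.emptyList();
        Map<String, Integer> counts = new HashMap<>();               // plays the role of m
        for (String item : likes.getOrDefault(start, none)) {        // out('likes')
            for (String user : likedBy.getOrDefault(item, none)) {   // in('likes')
                for (String other : likes.getOrDefault(user, none)) { // out('likes')
                    counts.merge(other, 1, Integer::sum);            // groupCount(m)
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> likes = new HashMap<>();
        likes.put("alice", Arrays.asList("x", "y"));
        likes.put("bob", Arrays.asList("x", "z"));
        System.out.println(recommend(likes, "alice"));
    }
}
```

With this sample data, starting from alice reaches x three times, y twice and z once. As in the Gremlin expression, the starting user's own items are not filtered out.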
Some Gremlin expressions need input parameters to be declared before they can run. This is the case, for example, with bound variables, as described in JSR223 Gremlin Script Engine. OrientDB provides a mechanism for passing variables to a Gremlin pipeline declared in a command, as shown below:
Map<String, Object> params = new HashMap<String, Object>();
params.put("map1", new HashMap());
params.put("map2", new HashMap());
db.command(new OCommandSQL(
    "select gremlin('"
    + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
    + "')")).execute(params);
You can also use the native Java GremlinPipeline API, for example:
new GremlinPipeline(g.getVertex(1)).out("knows").property("name").filter(new PipeFunction<String,Boolean>() {
  public Boolean compute(String argument) {
    return argument.startsWith("j");
  }
}).back(2).out("created");
For more information: Using Gremlin through Java
In the simplest case, the output of the last step (https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps) in the Gremlin pipeline is the output of the overall Gremlin expression. However, you can instruct the Gremlin engine to treat any of the input variables as the output instead. Declare this by passing the variable name under the "output" parameter:
Map<String, Object> params = new HashMap<String, Object>();
params.put("map1", new HashMap());
params.put("map2", new HashMap());
params.put("output", "map2");
db.command(new OCommandSQL(
    "select gremlin('"
    + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
    + "')")).execute(params);
There are other ways to define the output of a Gremlin pipeline, so this mechanism is expected to be extended in the future. Please contact the OrientDB mailing list to discuss customized outputs.
Now that you have learned how to use Gremlin on top of OrientDB, the best place to go deeper into this powerful language is the Gremlin Wiki.