OrientDB Manual 1.7.8

Indexes

OrientDB supports 4 kinds of indexes:

Index type Durable Transactional Range queries Best features Description
SB-Tree YES YES YES Good mix of all the default one is durable and transactional
Hash YES YES no Super fast lookup, very light on disk Works like a HashMap so it's faster on punctual lookup (select from xxx where salary = 1000) and consumes less resources, but you cannot use it for range queries (select from xxx where salary between 1000 and 2000)
Lucene YES YES YES Good on full-text and spatial indexes Lucene indexes can be used only for full-text and spatial
MVRB-Tree no YES YES Deprecated since v1.6 It's a proprietary algorithm (Open Sourced) with the best of B+Tree (the most used by RDBMS and NoSQL products) and Red-Black Tree. This means fast insertion and update and fast lookup. It's not durable but it's transactional

What's an index?

Indexes can be handled like classes (or tables for RDBMS users) using the SQL language and prefixing with "index:" the index name. The index is like a class (or table) with 2 properties:

  • key, as the index's key
  • rid, as the RecordId that points to the record associated with the key

Index target

Indexes can be updated:

  • Automatically when are bound to schema properties, example "User.id". If you have a schema-less database and you want to create an automatic index, then you need to create the class and the property before using indexing.
  • Manually, handled by the developer using Java API and SQL commands (see below). You can use them as Persistent Maps where they entry's value are the records pointed by index.

Index types

The index type cannot be changed once created. The supported index types are the following:

  • SB-Tree algorithm:
    • UNIQUE, doesn't allow duplicates. For composite index means uniqueness of composite keys.
    • NOTUNIQUE, allows duplicates
    • FULLTEXT, by indexing any single word of the text. It's used in query with the operator CONTAINSTEXT
    • DICTIONARY, like UNIQUE but in case the key already exists replace the record with the new one
  • HashIndex algorithm:
    • UNIQUE_HASH_INDEX, doesn't allow duplicates. For composite index means uniqueness of composite keys.
    • NOTUNIQUE_HASH_INDEX, allows duplicates
    • FULLTEXT_HASH_INDEX, by indexing any single word of the text. It's used in query with the operator CONTAINSTEXT
    • DICTIONARY_HASH_INDEX, like UNIQUE but in case the key already exists replace the record with the new one
  • Lucene engine:

    • FULLTEXT, it uses Lucene to index the string content. Use the LUCENE operator to retrieve it.
    • SPATIAL, it uses Lucene to index the geo spatial coordinates.
  • Any 3rd party index plugged

    Dictionary

    Every single database has a default manual index of type "DICTIONARY" called dictionary with Strings as keys. This is very useful to:

  • handle root records of trees and graphs
  • handle singleton records used for configuration

Operations against indexes

Create an index

Creates a new index. To create an automatic index bound to a schema property use section "ON" of create index command or use as name the <class.property> notation. But assure to have created the schema for it before the index. See the example below.

Syntax:

CREATE INDEX <name> [ON <class-name> (prop-names)] <type> [<key-type>]
                    [METADATA {<metadata>}]

Where:

  • name logical name of index. Can be <class>.<property> to create an automatic index bound to a schema property. In this case class is the class of the schema and property, is the property created into the class. Notice that in another case index name can't contain '.' symbol
  • class-name name of class that automatic index created for. Class with such name must already exist in database
  • prop-names comma-separated list of properties for which automatic index is created for. Property with such name must already exist in schema. If property belongs to one of the Map types (LINKMAP, EMBEDDEDMAP) you can specify the keys or values used for index generation. Use "by key" or "by value" expressions for that, if nothing will be specified keys will be used during index creation.
  • type, can be any index among the supported ones:
    • unique, uses the MVRB-Tree algorithm. Supports range queries.
    • notunique, uses the MVRB-Tree algorithm. Supports range queries.
    • fulltext, uses the MVRB-Tree algorithm. Supports range queries.
    • dictionary, uses the MVRB-Tree algorithm. Supports range queries.
    • unique_hash_index, uses the HashIndex algorithm. Doesn't supports range queries. Available since 1.5.x.
    • notunique_hash_index, uses the HashIndex algorithm. Doesn't supports range queries. Available since 1.5.x.
    • fulltext_hash_index, uses the HashIndex algorithm. Doesn't supports range queries. Available since 1.5.x.
    • dictionary_hash_index, uses the HashIndex algorithm. Doesn't supports range queries. Available since 1.5.x.
  • key-type, is the type of key (Optional). On automatic indexes is auto-determined by reading the target schema property where the index is created. If not specified for manual indexes, at run-time during the first insertion the type will be auto determined by reading the type of the class.
  • metadata is a json representing all the additional metadata as key/value

Examples of custom index:

CREATE INDEX mostRecentRecords unique date

Examples of automatic index bound to the property "id" of class "User":

CREATE PROPERTY User.id BINARY
CREATE INDEX User.id UNIQUE

Another index for "id" property of class "User":

CREATE INDEX indexForId ON User (id) unique

Examples of index for "thumbs" property of class "Movie".

CREATE INDEX thumbsAuthor ON Movie (thumbs) unique
CREATE INDEX thumbsAuthor ON Movie (thumbs by key) unique
CREATE INDEX thumbsValue ON Movie (thumbs by value) unique

Example of composite index

CREATE PROPERTY Book.author STRING
CREATE PROPERTY Book.title STRING
CREATE PROPERTY Book.publicationYears EMBEDDEDLIST INTEGER
CREATE INDEX books ON Book (author, title, publicationYears) unique

For more information look at Create index command.

Drop an index

Drop an index. Linked records will be not removed. Syntax:

DROP INDEX <name>

Where:

  • name of the index to drop

For more information look at Drop index command.

Lookup

Returns all the records with the requested key.

select from index:<index-name> where key = <key>

Example:

select from index:dictionary where key = 'Luke'

Put an entry

Inserts a new entry in the index with key and rid.

insert into index:<index-name> (key,rid) values (<key>,<rid>)

Example:

insert into index:dictionary (key,rid) values ('Luke',#10:4)

Query range

Retrieves the key ranges between min and max.

select from index:<index-name> where key between <min> and <max>

Example:

select from index:coordinates where key between 10.3 and 10.7

Remove entries by key

Deletes all the entries with the requested key.

delete from index:<index-name> where key = <key>

Example:

delete from index:addressbook where key = 'Luke'

Remove an entry

Deletes an entry by passing key and rid. Returns true if removed, otherwise false if the entry wasn't found.

delete from index:<index-name> where key = <key> and rid = <rid>

Example:

delete from index:dictionary where key = 'Luke' and rid = #10:4

Remove all references to a record

Removes all the entries with the rid passed.

delete from index:<index-name> where rid = <rid>

Example:

delete from index:dictionary where rid = #10:4

Count all the entries

Returns the number of entries on that index.

select count(*) as size from index:<index-name>

Example:

select count(*) as size from index:dictionary

Retrieve all the keys

Retrieves all the keys of the index.

select key from index:<index-name>

Example:

select key from index:dictionary

Retrieve all the entries

Retrieves all the entries of the index as pairs key and rid.

select key, value from index:<index-name>

Example:

select key, value from index:dictionary

Clear the index

Removes all the entries. The index will be empty after this call. This removes all the entries of an index.

delete from index:<index-name>

Example:

delete from index:dictionary

Null values

Indexes by default ignore null values. For such reason queries against NULL value that use indexes return no entries.

If you want to index also null values set { ignoreNullValues : false } as metadata. Example:

CREATE INDEX addresses ON Employee (address) notunique
             METADATA {ignoreNullValues : false}

Composite keys

You can do the same operations with composite indexes.

A composite key is a collection of values by its nature, so syntactically it is defined as a collection. For example, if we have class book, indexed by its three fields:

  • author,
  • title and
  • publication year

We can use following query to look up a book:

select from index:books where key = ["Donald Knuth", "The Art of Computer Programming", 1968]

Or to look up a book within a publication year range:

select from index:books where key between ["Donald Knuth", "The Art of Computer Programming", 1960] and ["Donald Knuth", "The Art of Computer Programming", 2000]

This is a mechanism that allows searching index record by several first fields of its composite key. In this case the remaining fields with undefined value can match any value in result.

Composite indexes are used for partial match search only when the declared fields in composite index are used from left to right. Using the example above, if you search only for title, the composite index cannot be used, but it will be used if you search for author + title.

For example, if we don't care when books have been published, we can throw away the publication year field from the query. So, the result of the following query will be all books with this author and title and any publication year

select from index:books where key = ["Author", "The Art of Computer Programming"]

If we also don't know title, we can keep only author field in query. Result of following query will be all books of this author.

select from index:books where key = ["Donald Knuth"]

Or equivalent

select from index:books where key = "Donald Knuth"

Range Queries

In case of range queries, the field subject to the range must be the last one. Example:

select from index:books
  where key between ["Donald Knuth", "The Art of Computer Programming", 1900] 
    and ["Donald Knuth", "The Art of Computer Programming", 2014]

Direct insertion for composite indexes

Unsupported yet.

Custom keys

OrientDB since release 1.0 supports custom keys for indexes. This could give a huge improvement if you want to minimize memory used using your own serializer.

Below an example to handle SHA-256 data as binary keys without using a STRING to represent it saving disk space, cpu and memory.

Create your own type

public static class ComparableBinary implements Comparable<ComparableBinary> {
  private byte[]    value;

  public ComparableBinary(byte[] buffer) {
    value = buffer;
  }

  public int compareTo(ComparableBinary o) {
    final int size = value.length;
    for (int i = 0; i < size; ++i) {
      if (value[i] > o.value[i])
    return 1;
      else if (value[i] < o.value[i])
        return -1;
    }
    return 0;
  }

  public byte[] toByteArray() {
    return value;
  }
}

Create your own binary serializer

public static class OHash256Serializer implements OBinarySerializer<ComparableBinary> {

  public static final OBinaryTypeSerializer INSTANCE = new OBinaryTypeSerializer();
  public static final byte ID = 100;
  public static final int LENGTH = 32;

  public int getObjectSize(final int length) {
    return length;
  }

  public int getObjectSize(final ComparableBinary object) {
    return object.toByteArray().length;
  }

  public void serialize(final ComparableBinary object, final byte[] stream, final int startPosition) {
    final byte[] buffer = object.toByteArray();
    System.arraycopy(buffer, 0, stream, startPosition, buffer.length);
  }

  public ComparableBinary deserialize(final byte[] stream, final int startPosition) {
    final byte[] buffer = Arrays.copyOfRange(stream, startPosition, startPosition + LENGTH);
    return new ComparableBinary(buffer);
  }

  public int getObjectSize(byte[] stream, int startPosition) {
    return LENGTH;
  }

  public byte getId() {
    return ID;
  }
}

Register your serializer

OBinarySerializerFactory.INSTANCE.registerSerializer(new OHash256Serializer(), null);
index = database.getMetadata().getIndexManager().createIndex("custom-hash", "UNIQUE", new ORuntimeKeyIndexDefinition(OHash256Serializer.ID), null, null);

Usage

ComparableBinary key1 = new ComparableBinary(new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1 });
ODocument doc1 = new ODocument().field("k", "key1");
index.put(key1, doc1);

Tips and Tricks

Retrieve the list of indexes

Since OrientDB 1.6:

select expand(indexes) from metadata:indexmanager

Since OrientDB 1.0:

select expand(indexes) from cluster:0

Before OrientDB 1.0:

select expand(indexes) from #0:2

Create your index engine

Here a guide how to create a custom index engine.