Clusters

The Cluster is a place where a group of records are stored. Like the Class, it is comparable with the collection in traditional document databases, and in relational databases with the table. However, this is a loose comparison given that unlike a table, clusters allow you to store the data of a class in different physical locations.

To list all the configured clusters on your system, use the CLUSTERS command in the console:

orientdb> CLUSTERS

CLUSTERS:
-------------+------+-----------+-----------+
 NAME        | ID   | TYPE      | RECORDS   |
-------------+------+-----------+-----------+
 account     | 11   | PHYSICAL  |      1107 |
 actor       | 91   | PHYSICAL  |         3 |
 address     | 19   | PHYSICAL  |       166 |
 animal      | 17   | PHYSICAL  |         0 |
 animalrace  | 16   | PHYSICAL  |         2 |
 ....        | .... | ....      |      .... |
-------------+------+-----------+-----------+
 TOTAL                                23481 |
--------------------------------------------+

Understanding Clusters

By default, OrientDB creates one cluster for each Class. Starting from v2.2, OrientDB automatically creates multiple clusters per each class (the number of clusters created is equals to the number of CPU's cores available on the server) to improve using of parallelism. All records of a class are stored in the same cluster, which has the same name as the class. You can create up to 32,767 (or, 215 - 1) clusters in a database. Understanding the concepts of classes and clusters allows you to take advantage of the power of clusters in designing new databases.

While the default strategy is that each class maps to one cluster, a class can rely on multiple clusters. For instance, you can spawn records physically in multiple locations, thereby creating multiple clusters.

Class-Custer

Here, you have a class Customer that relies on two clusters:

  • USA_customers, which is a cluster that contains all customers in the United States.

  • China_customers, which is a cluster that contains all customers in China.

In this deployment, the default cluster is USA_customers. Whenever commands are run on the Customer class, such as INSERT statements, OrientDB assigns this new data to the default cluster.

Class-Cluster

The new entry from the INSERT statement is added to the USA_customers cluster, given that it's the default. Inserting data into a non-default cluster would require that you specify the cluster you want to insert the data into in your statement.

When you run a query on the Customer class, such as SELECT queries, for instance:

Class-Cluster

OrientDB scans all clusters associated with the class in looking for matches.

In the event that you know the cluster in which the data is stored, you can query that cluster directly to avoid scanning all others and optimize the query.

Class-Cluster

Here, OrientDB only scans the China_customers cluster of the Customer class in looking for matches

Note: The method OrientDB uses to select the cluster, where it inserts new records, is configurable and extensible. For more information, see Cluster Selection.

Working with Clusters

In OrientDB there are two types of clusters:

  • Physical Cluster (known as local) which is persistent because it writes directly to the file system
  • Memory Cluster where everything is volatile and will be lost on termination of the process or server if the database is remote

For most cases, physical clusters are preferred because databases must be persistent. OrientDB creates physical clusters by default.

You may also find it beneficial to locate different clusters on different servers, physically separating where you store records in your database. The advantages of this include:

  • Optimization Faster query execution against clusters, given that you need only search a subset of the clusters in a class.
  • Indexes With good partitioning, you can reduce or remove the use of indexes.
  • Parallel Queries: Queries can be run in parallel when made to data on multiple disks.
  • Sharding: You can shard large data-sets across multiple instances.

Adding Clusters

When you create a class, OrientDB creates a default cluster of the same name. In order for you to take advantage of the power of clusters, you need to create additional clusters on the class. This is done with the ALTER CLASS statement in conjunction with the ADDCLUSTER parameter.

To add a cluster to the Customer class, use an ALTER CLASS statement in the console:

orientdb> ALTER CLASS Customer ADDCLUSTER UK_Customers

Class updated successfully

You now have a third cluster for the Customer class, covering those customers located in the United Kingdom.

Viewing Records in a Cluster

Clusters store the records contained by a class in OrientDB. You can view all records that belong to a cluster using the BROWSE CLUSTER command and the data belonging to a particular record with the DISPLAY RECORD command.

In the above example, you added a cluster to a class for storing records customer information based on their locations around the world, but you did not create these records or add any data. As a result, running these commands on the Customer class returns no results. Instead, for the examples below, consider the ouser cluster.

OrientDB ships with a number of default clusters to store data from its default classes. You can see these using the CLUSTERS command. Among these, there is the ouser cluster, which stores data of the users on your database.

To see records stored in the ouser cluster, run the BROWSE CLUSTER command:

orientdb> BROWSE CLUSTER OUser

---+------+--------+--------+----------------------------------+--------+-------+
 # | @RID | @CLASS | name   | password                         | status | roles |
---+------+-------+--------+-----------------------------------+--------+-------+
 0 | #5:0 | OUser | admin  | {SHA-256}8C6976E5B5410415BDE90... | ACTIVE | [1]   |
 1 | #5:1 | OUser | reader | {SHA-256}3D0941964AA3EBDCB00CC... | ACTIVE | [1]   |
 2 | #5:2 | OUser | writer | {SHA-256}B93006774CBDD4B299389... | ACTIVE | [1]   |
---+------+--------+--------+----------------------------------+--------+-------+

The results are identical to executing BROWSE CLASS on the OUser class, given that there is only one cluster for the OUser class in this example.

In the example, you are listing all of the users of the database. While this is fine for your initial setup and as an example, it is not particularly secure. To further improve security in production environments, see Security.

When you run BROWSE CLUSTER, the first column in the output provides the identifier number, which you can use to display detailed information on that particular record.

To show the first record browsed from the ouser cluster, run the DISPLAY RECORD command:

orientdb> DISPLAY RECORD 0

------------------------------------------------------------------------------+
 Document - @class: OUser                      @rid: #5:0      @version: 1    |
----------+-------------------------------------------------------------------+
     Name | Value                                                             |
----------+-------------------------------------------------------------------+
     name | admin                                                             |
 password | {SHA-256}8C6976E5B5410415BDE908BD4DEE15DFB167A9C873F8A81F6F2AB... |
   status | ACTIVE                                                            |
    roles | [#4:0=#4:0]                                                       |
----------+-------------------------------------------------------------------+

Bear in mind that this command references the last call of BROWSE CLUSTER. You can continue to display other records, but you cannot display records from another cluster until you browse that particular cluster.

results matching ""

    No results matching ""