Workiva/eva

- What is Eva?
- Getting Started
- Development
- Configuration
- About the Eva Data Model
- Components
- Running with Docker
- Additional Resources
- FAQ
- Maintainers and Contributors
  - Active Maintainers
  - Previous Contributors

What is Eva?

Eva is a distributed database-system implementing an entity-attribute-value data-model
that is time-aware, accumulative, and atomically consistent. Its API is
by-and-large compatible with Datomic’s. This software should be considered alpha
for the purposes of quality and stability. Check out the FAQ for more info.

Getting Started

If you are brand new to Eva, we suggest reading through this entire readme to familiarize yourself with Eva as a whole. Afterwards, be sure to check out the Eva tutorial series, which breaks down and goes over almost everything you will want to know.

Development

Required Tools

Example: Hello World

First we kick off the repl with:

lein repl

Next we create a connection (conn) to an in-memory Eva database. We also need to define the fact (datom) we want to add to Eva. Finally we use the transact call to add the fact into the system.

(def conn (eva/connect {:local true}))
(def datom [:db/add (eva/tempid :db.part/user)
            :db/doc "hello world"])
(deref (eva/transact conn [datom]))

Note: deref can be used interchangeably with the @ symbol.

Now we can run a query to get this fact out of Eva. We don’t use conn to make a query but rather we obtain an immutable database value like so:

(def db (eva/db conn))

Next we execute a query that returns all entity ids in the system matching the doc string "hello world".

(eva/q '[:find ?e :where [?e :db/doc "hello world"]] db)

If we want to return the full representation of these entities, we can do that by adding pull to our query.

(eva/q '[:find (pull ?e [*]) :where [?e :db/doc "hello world"]] db)

Project Structure

- project.clj: contains the project build configuration
- core/*: primary releasable codebase for Eva Transactor and Peer-library
  - core/src: clojure source files
  - core/java-src: java source files
  - core/test: test source files
  - core/resources: non-source files
- dev/*: codebase used during development, but not released
  - dev/src: clojure source files
  - dev/java-src: java source files
  - dev/test: test source files
  - dev/resources: non-source files
  - dev/test-resources: non-source files used to support integration testing

Development Tasks

Running the Test Suite

lein test

Configuration

Eva exposes a number of configuration-properties that can be configured
using java system-properties. Some specific configuration-properties can
also be configured using environment-variables.

The eva.config namespace contains descriptions and default values
for the config vars.
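
As a rough illustration of the lookup pattern described above (a system-property with an environment-variable alternative and a default), here is a minimal sketch in plain Clojure. It does not use eva.config. The function name, the property and variable names, and the precedence order (system-property first) are all assumptions made for the example.

```clojure
;; Hypothetical helper, NOT part of Eva: resolves a setting from a java
;; system-property, then an environment-variable, then a default.
;; The precedence order is an assumption for illustration only.
(defn config-value
  [property-name env-name default]
  (or (System/getProperty property-name)
      (System/getenv env-name)
      default))

;; The property name below is made up for the example.
(System/setProperty "example.hypothetical.setting" "42")

(def resolved
  (config-value "example.hypothetical.setting"
                "EXAMPLE_HYPOTHETICAL_SETTING"
                "7"))
```

For the real property names and defaults, consult the eva.config namespace itself.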

About the Eva Data Model

Entity-Attribute-Value (EAV)

EAV data-entities consist of:

a distinct entity-id
1-or-more attribute-value pairs associated with a single entity-id

EAV data can be represented in the following (equivalent) forms:

as an object or map:

{:db/id 12345,
 :attribute1 "value1",
 :attribute2 "value2"}

as a list of EAV tuples:

[
  [12345, :attribute1, "value1"],
  [12345, :attribute2, "value2"]
]
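
To make the equivalence of the two forms concrete, the following sketch converts between them using only core Clojure. The helper names are illustrative and are not part of Eva's API.

```clojure
;; Illustrative helpers only; these names are not part of Eva's API.
(defn entity-map->tuples
  "Expands an entity map into a vector of [e a v] tuples."
  [entity]
  (let [e (:db/id entity)]
    (vec (for [[a v] (dissoc entity :db/id)]
           [e a v]))))

(defn tuples->entity-maps
  "Groups [e a v] tuples back into one map per entity-id."
  [tuples]
  (vec (for [[e eavs] (group-by first tuples)]
         (into {:db/id e}
               (map (fn [[_ a v]] [a v]) eavs)))))

(def example-entity
  {:db/id 12345
   :attribute1 "value1"
   :attribute2 "value2"})

;; One tuple per attribute-value pair of the entity.
(def example-tuples (entity-map->tuples example-entity))
```

Round-tripping example-tuples through tuples->entity-maps yields the original map again, which is what "equivalent" means here.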

Time-Aware

To make the EAV data-model time-aware, we extend the EAV-tuple
into an EAVT-tuple containing the transaction-id (T) that introduced the tuple:

[
;;  E      A            V         T
   [12345, :attribute1, "value1", 500],
   [12345, :attribute2, "value2", 500]
]

Accumulative

To make the EAVT data-model accumulative, we extend the EAVT-tuple
with a final flag that indicates if the EAV information was added or removed
at the transaction-id (T).

[
;;  E      A            V         T    added?
   [12345, :attribute1, "value1", 500, true],
   [12345, :attribute2, "value2", 500, true]
]

Under this model, common data operations (create, update, delete) are represented like this:

Create: a single tuple with added? == true

[[12345, :attribute1, "create entity 12345 with field :attribute1 at transaction 500", 500, true]]

Delete: a single tuple with added? == false

[[12345, :attribute1, "create entity 12345 with field :attribute1 at transaction 500", 501, false]]

Update: a pair of deletion and creation tuples

[
 ;; At transaction 502
 ;;   invalidate the old entry for :attribute2
      [12345, :attribute2, "old-value", 502, false]
 ;;   add a new entry for :attribute2
      [12345, :attribute2, "new-value", 502, true]
]

The complete history of the database is the cumulative list of these tuples.
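
Because the history is just an ordered list of tuples, the current state (or the state as of any earlier transaction) can be derived by folding over it. The sketch below shows the idea with plain Clojure data. It is a conceptual model, not how Eva's indexes work, and the function names are made up for the example.

```clojure
;; Conceptual model only; Eva does not expose these functions.
(defn current-facts
  "Folds [e a v t added?] tuples, ordered by t, into the set of
  [e a v] facts that are still asserted."
  [history]
  (reduce (fn [facts [e a v _t added?]]
            (if added?
              (conj facts [e a v])
              (disj facts [e a v])))
          #{}
          history))

(defn as-of
  "Keeps only the tuples introduced at or before transaction t."
  [history t]
  (filter (fn [[_e _a _v tx _added?]] (<= tx t)) history))

(def history
  [[12345 :attribute1 "value1"    500 true]    ; create
   [12345 :attribute2 "old-value" 500 true]    ; create
   [12345 :attribute1 "value1"    501 false]   ; delete
   [12345 :attribute2 "old-value" 502 false]   ; update: retract old value
   [12345 :attribute2 "new-value" 502 true]])  ; update: add new value

(def latest   (current-facts history))
(def at-t-500 (current-facts (as-of history 500)))
```

Here latest contains only the "new-value" fact, while at-t-500 still contains both facts asserted at transaction 500. Nothing is ever overwritten, so earlier states stay recoverable.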

Atomic Consistency

Data-updates are submitted as transactions that are processed atomically.
This means that when you submit a transaction, either all the changes in
the transaction are applied, or none of the changes are applied.

Transactions

Transactions are submitted as a list of data-modification commands.

The simplest data-modification commands (:db/add, :db/retract) correspond
to the accumulative tuples described above:

[
  [:db/retract 12345 :attribute2 "old-value"]
  [:db/add 12345 :attribute2 "new-value"]
]

When this transaction is committed it will produce the following tuples in the database history
(where next-tx is the next transaction-number):

[
  [12345, :attribute2, "old-value", next-tx, false]
  [12345, :attribute2, "new-value", next-tx, true]
]
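
The mapping from commands to history tuples can be sketched as a small pure function. This is only a model of the correspondence described above, not the transactor's implementation. The function name and the transaction-number 503 are illustrative.

```clojure
;; Illustrative model of the command-to-tuple correspondence; not Eva code.
(defn commands->tuples
  "Turns [:db/add e a v] and [:db/retract e a v] commands into
  [e a v t added?] tuples stamped with the transaction-number next-tx.
  Throws on an unknown command, so no tuples are produced at all
  unless every command is recognized."
  [next-tx commands]
  (mapv (fn [[op e a v]]
          (case op
            :db/add     [e a v next-tx true]
            :db/retract [e a v next-tx false]))
        commands))

(def produced
  (commands->tuples 503
                    [[:db/retract 12345 :attribute2 "old-value"]
                     [:db/add     12345 :attribute2 "new-value"]]))
```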

Using Object/Map form in transactions

In addition to the command-form, you can also create/update data using the
object/map form of an entity:

[
  {:db/id 12345
   :attribute1 "value1"
   :attribute2 "value2"}
]

This form is equivalent to the command-form:

[
  [:db/add 12345 :attribute1 "value1"]
  [:db/add 12345 :attribute2 "value2"]
]
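
The expansion from map form to command-form is mechanical, as this sketch shows. The helper is hypothetical. Eva performs the equivalent expansion internally when it processes a transaction.

```clojure
;; Hypothetical helper showing the expansion; not part of Eva's API.
(defn entity-map->commands
  "Expands an entity map into one [:db/add e a v] command per attribute."
  [entity]
  (let [e (:db/id entity)]
    (mapv (fn [[a v]] [:db/add e a v])
          (dissoc entity :db/id))))

(def expanded
  (entity-map->commands {:db/id 12345
                         :attribute1 "value1"
                         :attribute2 "value2"}))
```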

Schemas

Because all stored data reduces to EAVT tuples,
schemas are defined per Attribute, rather than per Entity.

Schema definitions are simply Entities that have special schema-attributes.

Defining the schema for `:attribute1`:

[
  {:db/id #db/id[:db.part/db]
   :db/ident :attribute1
   :db/doc "Schema definition for attribute1"
   :db/valueType :db.type/string
   :db/cardinality :db.cardinality/one
   :db.install/_attribute :db.part/db}
]

Taking each key-value pair of the example in turn:

- :db/id #db/id[:db.part/db]: declares a new entity-id in the :db.part/db id-partition
- :db/ident :attribute1: declares that :attribute1 is an alias for the entity-id
- :db/doc "Schema definition for attribute1": human-readable string documenting the purpose of :attribute1
- :db/valueType :db.type/string: declares that only string values are allowed for :attribute1
- :db/cardinality :db.cardinality/one: declares that an entity may have no more than one :attribute1.
  This means that for a given entity-id, there will only ever be one current tuple of [entity-id :attribute1 value].
  Adding a new tuple with this attribute will cause any existing tuple to be removed.
- :db.install/_attribute :db.part/db: declares that :attribute1 is registered with the database as
  an installed attribute
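
The effect of :db.cardinality/one on the history can be modeled as follows. This is a simplified sketch in plain Clojure with a made-up function name. It ignores value-type checking and everything else a real transactor does.

```clojure
;; Simplified model of cardinality-one behavior; not Eva's implementation.
(defn assert-cardinality-one
  "Given the set of current [e a v] facts, returns the [e a v t added?]
  tuples to record at transaction tx when asserting [e a v] for a
  cardinality-one attribute: any existing value for the same entity and
  attribute is retracted before the new value is added."
  [facts tx [e a v]]
  (if (contains? facts [e a v])
    []                                  ; already asserted: nothing to record
    (let [replaced (filter (fn [[e' a' _]] (and (= e e') (= a a')))
                           facts)]
      (conj (mapv (fn [[e' a' v']] [e' a' v' tx false]) replaced)
            [e a v tx true]))))

(def recorded
  (assert-cardinality-one #{[12345 :attribute1 "value1"]}
                          503
                          [12345 :attribute1 "value2"]))
```

In this model, asserting "value2" while "value1" is current records a retraction of "value1" followed by an addition of "value2", both at the same transaction.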

Running with Docker

The included docker compose file can be used to spin up a completely integrated Eva environment. This includes:

transactor
eva-catalog
activemq
maria-db image

To spin up said environment run the following commands:

make gen-docker-no-tests # to build Eva with the latest changes
make run-docker

To shut down the environment use the following command:

In order to open a repl container that can talk to the environment use:

And run the following to initially set up the repl environment:

(require '[eva.catalog.client.alpha.client :as catalog-client])
(def config (catalog-client/request-flat-config "http://eva-catalog:3000" "workiva" "eva-test-1" "test-db-1"))
(def conn (eva/connect config))

Finally, test that everything is working with an empty transaction:

(deref (eva/transact conn []))

A similar result to this should be expected:

{:tempids {}, :tx-data (#datom[4398046511105 15 #inst "2018-06-06T17:35:07.516-00:00" 4398046511105 true]), :db-before #DB[0], :db-after #DB[1]}

Additional Documentation

- The reference documentation at http://docs.datomic.com/ is highly relevant for development within Eva.
- http://www.learndatalogtoday.org/ is a great resource for learning Datalog, the underlying query language in Eva.
- https://github.com/kristianmandrup/datascript-tutorial is another great resource for learning Datalog and how it applies to Datomic (or Eva).

FAQ

Is this project or Workiva in any way affiliated with Cognitect?

No. Eva is its own project we built from the ground up. The API and high-level
architecture are largely compatible with Datomic, but the database, apart from
some EPL-licensed code, was entirely built in-house. We maintain a list of the
most notable API differences.

Should I use Eva instead of Datomic?

If you are looking for an easy system to move to production quickly, almost
certainly not. Eva is far less mature and has seen
far less time in battle. Datomic Cloud is an amazing (and supported) product
that is far easier to stand up and run with confidence. Eva is provided as-is.

What are the key differences between Eva and Datomic?

There are a handful of small API differences in the Peer, whereas the Clients
are quite distinct. For example, a Connection in Eva is constructed using a
baroque configuration map, not a string. From an operational standpoint,
Datomic is far more turn-key. There are also likely some low-level
architectural differences between the systems that will cause them to
exhibit different run-time characteristics. For example, our indexes are
backed by a persistent B^𝜀-tree, whereas Datomic’s indexes seem to exhibit
properties more like a persistent log-structured merge-tree. For a more detailed
list, see the project's list of API differences.

Why did Workiva build Eva?

Workiva’s business model requires fine-grained and scalable
multi-tenancy with a degree of control and flexibility that fulfill our
own evolving and unique compliance requirements. Additionally, development
on Eva began before many powerful features of Datomic were released, including
Datomic Cloud and Datomic Client.

Why is Workiva open sourcing Eva?

The project is, by and large, nearly feature complete, and we believe it is generally
technically sound. Workiva has decided to discontinue closed development on Eva,
but sees a great deal of potential value in opening the code base to the OSS
community. It is our hope that the community finds value in the project as a
whole.