For some applications, typically those where the volume of data is reasonable small (e.g. configuration databases), it is useful to access a database in a INSERT-only fashion. i.e. no UPDATEs or DELETEs.
Every time you perform an UPDATE or DELETE you are throwing away data.
Even if you currently consider the old data to be useless, that may not always be the case.
Amending and Hiding Data
I can imagine your response – OK, how do I amend or hide data?
Assume your domain object is a
foo. Normally you would create a table called
foo, and that would be the end of the matter.
When using an Event Sourcing design, you have similar table called
foo_created which stores the details of foos at the time they were created.
Amendments are simply stored in various other tables, linked to the
Deletetions are simply yet another table, called
foo_deleted, linked to the
The Current State of Foo
Getting the current state requires linking the various tables together. This will always be slower that retrieving one record, but you can cache the results if required.
By writing queries with care, the complexity of joining the tables can be kept to O(1). i.e. constant time, regardless of the size of the database.
To demonstrate an implementation with MySQL, we will model a simple ‘users’ class, which has an email address, a password hash, the time of last password change, and the time of last login.
This would traditionally be modelled in MySQL as:
1 2 3 4 5 6 7 8 9
Each time that a user changed their password, or logged in, the old state would be thrown away.
To store the same information in an Event Sourcing system, we need tables to store events, and queries or views to serve up the current state.
The events/tables will be
The schemas for these tables are as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Next we need to be able to find the last event, for a particular user_id, efficiently.
The following form of query does that (for
1 2 3 4 5 6 7
First we’ll form a query which uses
user_created as its base,
with the password_hash value being superseded
by the latest
user_password_changed new_password_hash (if any).
1 2 3 4 5 6 7 8 9 10
COALESCE is used to give us either the
user_password_changed or (if the former is NULL) from
Now we can add in the latest values for the
Here is the Current State Query for a user:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Isn’t this Query the Spawn of Satan?
If you run ‘explain’ on it you’ll see that three foreign field selects have a note ‘Select tables optimized away’.
This means that MySQL is not having the look at the tables in order to find out which record contains the latest event for the
user_id (for the fields
It is still more expensive than a query on the
but it’s not too bad.
What can you do with an Event Sourcing design, that you couldn’t with a traditional one?
Locking – with no updates, no locking is required, and instead we can move from prohibiting clashes to resolving them later. This works well with sharding and high availablity. It’s what banks do – they don’t use transactions for everything – it’s important that a cash machine works even if its link is down.
History – by adding
timestamp < Xclauses to the Current State Query we can see the state of an object at any point in the past.
Undo – because we store every event, we can undo changes by simply adding a new event. We can undo any change, not just the latest change.
Debugging – we now know about every event in the system, which makes debugging a breeze.
See Martin Fowler’s page if you are unfamiliar with Event Sourcing.