This is the first part of a series on building an event sourced application. We’ll build a simple blogging application (inspired by the Ruby on Rails “Getting Started” tutorial), so the domain should be familiar. This allows us to focus on implementing a memory image based architecture using event sourcing. Another goal is to show that this kind of architecture is not more complex (and arguably, simpler) than those implemented by traditional database centered applications. The example code is in Scala using the Play! web framework, but should be understandable enough if you come from a Java (or similar) background.
The example application is built using a Memory Image. This means that the current state of the application is not stored in a database, but kept in main memory instead. This immediately raises the question of how our state is going to survive application restarts or crashes. Since we cannot save the entire state of our application on every change, we’ll need to keep something like a durable transaction log, just like databases do. To implement this, every change our application makes is first represented as a domain event. These domain events are stored as our application’s transaction log, using an event store.
After a restart, we can replay the saved domain events to rebuild the in-memory data structure. This concept is known as event sourcing. Keeping this durable record of domain events also means we will continue to have access to all historical information, which is very useful for auditing, troubleshooting, or data mining purposes.
Keeping our current state in memory has some obvious advantages:
There are some disadvantages too:
The code for this part can be found on https://github.com/zilverline/event-sourced-blog-example. You can use the git clone -b part-1 https://github.com/zilverline/event-sourced-blog-example command to clone and check out the code that matches the contents of this part.
To run the example you need to install either Play! 2.0.2 (or later) or sbt 0.11.3 (or later). If you have a Mac and use Homebrew you can simply install these using brew install play or brew install sbt.
After installing, execute play run or sbt run to start the application in development mode. After all required dependencies have been downloaded and the code has been compiled, the application should be available at http://localhost:9000/.
Click around a bit to see how everything works. The functionality is pretty minimal, so this shouldn’t take long.
The domain events (and supporting classes) for blog postings are defined in PostEvents.scala. Since the application is (still) extremely simple, the events model adding a new blog post, and editing or deleting an existing one. These mimic the typical create/update/delete actions, but are named in terms our users understand. This way we also capture the intent of the user, not just the effects of the action they just performed. Later we’ll add more events, such as Post Published and Comment Added.
sealed trait PostEvent {
  def postId: PostId
}
case class PostAdded(postId: PostId, content: PostContent) extends PostEvent
case class PostEdited(postId: PostId, content: PostContent) extends PostEvent
case class PostDeleted(postId: PostId) extends PostEvent
Notice that events are named in the past tense. This can be a bit confusing when writing code that actually generates the events, but at all other times events have always happened in the past so this naming convention makes sense.
The events are implemented using Scala case classes. The Scala compiler will translate these into ordinary JVM classes but will add the fields specified in the constructor, accessors, field-based equals/hashCode implementations, and a factory method so that you do not need to use the new keyword when instantiating an object. Case classes can also be used with Scala’s match expression.
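To make the generated members concrete, here is a minimal sketch that exercises them on PostContent. The CaseClassDemo object and the sample values are made up for illustration; copy is another method the compiler generates for case classes, and the model code later in this part relies on it.

```scala
// The article's PostContent, repeated so the sketch is self-contained.
case class PostContent(author: String, title: String, content: String)

// Hypothetical demo object, not part of the example application.
object CaseClassDemo {
  def main(args: Array[String]): Unit = {
    // Generated factory method: no `new` keyword needed.
    val first = PostContent("Alice", "Hello", "First post")
    val second = PostContent("Alice", "Hello", "First post")

    // Field-based equals and hashCode: equal fields mean equal values.
    println(first == second)
    println(first.hashCode == second.hashCode)

    // Generated accessor for each constructor parameter.
    println(first.title)

    // copy builds a modified value and leaves the original untouched.
    val retitled = first.copy(title = "Hello again")
    println(retitled.title)

    // Case classes can be taken apart again in a match expression.
    val summary = retitled match {
      case PostContent(author, title, _) => title + " by " + author
    }
    println(summary)
  }
}
```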
In Scala the type is written after the name of a variable or parameter. Explicit types are usually only required on parameters, as the Scala compiler can usually figure out the type in other cases by itself.
The events all extend the PostEvent trait (which is translated into a JVM interface). The PostEvent trait is marked as sealed so that it can only be extended in the same source file. This allows the Scala compiler to do compile-time analysis of match expressions to ensure all possible cases are covered. This is very useful when new events are added later!
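As a small sketch of what this buys us, consider a function that matches on every event type. The EventDescriptions object and its describe method are hypothetical helpers, not part of the example application; the event definitions are repeated from PostEvents.scala so the snippet stands on its own.

```scala
import java.util.UUID

case class PostId(uuid: UUID)
case class PostContent(author: String, title: String, content: String)

sealed trait PostEvent { def postId: PostId }
case class PostAdded(postId: PostId, content: PostContent) extends PostEvent
case class PostEdited(postId: PostId, content: PostContent) extends PostEvent
case class PostDeleted(postId: PostId) extends PostEvent

// Hypothetical helper, for illustration only.
object EventDescriptions {
  def describe(event: PostEvent): String = event match {
    case PostAdded(_, content)  => "added '" + content.title + "'"
    case PostEdited(_, content) => "edited '" + content.title + "'"
    case PostDeleted(_)         => "deleted a post"
    // Because PostEvent is sealed, removing any of the cases above makes
    // the compiler warn that the match may not be exhaustive.
  }
}
```

When a new event type is added to the sealed trait later, every match like this one is flagged at compile time instead of failing at runtime with a MatchError.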
The events also use two support classes to represent the blog post’s identifier and content, respectively:
import java.util.UUID
import scala.util.control.Exception.catching

case class PostContent(author: String, title: String, content: String)
case class PostId(uuid: UUID)
object PostId {
  def generate(): PostId = PostId(UUID.randomUUID())
  // Parses the output of PostId.toString; returns None for any other input.
  def fromString(s: String): Option[PostId] = s match {
    case PostIdRegex(uuid) => catching(classOf[RuntimeException]) opt { PostId(UUID.fromString(uuid)) }
    case _                 => None
  }
  private val PostIdRegex = """PostId\(([a-fA-F0-9-]{36})\)""".r
}
The PostContent class is a simple case class with three fields, while the PostId class has a companion object with a generate factory method and a fromString parse method.
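A quick sketch of how generate and fromString fit together: fromString accepts exactly the text that the case class's generated toString produces, so an identifier survives a round trip through a string. The PostIdDemo object is a made-up name for illustration; the PostId definition is repeated from above.

```scala
import java.util.UUID
import scala.util.control.Exception.catching

case class PostId(uuid: UUID)
object PostId {
  def generate(): PostId = PostId(UUID.randomUUID())
  def fromString(s: String): Option[PostId] = s match {
    case PostIdRegex(uuid) => catching(classOf[RuntimeException]) opt { PostId(UUID.fromString(uuid)) }
    case _                 => None
  }
  private val PostIdRegex = """PostId\(([a-fA-F0-9-]{36})\)""".r
}

// Hypothetical demo object, not part of the example application.
object PostIdDemo {
  def main(args: Array[String]): Unit = {
    val id = PostId.generate()

    // toString yields "PostId(<uuid>)", which is what fromString expects.
    val text = id.toString
    println(PostId.fromString(text) == Some(id))

    // Anything that does not match the pattern is rejected with None
    // instead of an exception.
    println(PostId.fromString("not-a-post-id"))
  }
}
```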
Since events make up the durable record of everything that happened in our domain, they need to be stable. Stability can be achieved by getting the design right (hard!) and ensuring the event-related definitions have very few dependencies on other code. See Stability (PDF) by Robert C. Martin for a great explanation of managing stability and dependencies within a program.
Although events are very useful to track transactional and historical information, they cannot easily be used to determine the current state. So in addition to capturing the events, we also need to derive the current state from these events. We do not necessarily need to track every piece of data, only what we need for our application to make decisions (by creating new events) or to show information to the user (by rendering views, sending out emails, etc.). The state needed for the blogging application is implemented in Post.scala:
/**
 * A specific blog post with its current content.
 */
case class Post(id: PostId, content: PostContent)

/**
 * The current state of blog posts, derived from all committed PostEvents.
 */
case class Posts(byId: Map[PostId, Post] = Map.empty, orderedByTimeAdded: Seq[PostId] = Vector.empty) {
  def get(id: PostId): Option[Post] = byId.get(id)
  def mostRecent(n: Int): Seq[Post] = orderedByTimeAdded.takeRight(n).reverse.map(byId)

  def apply(event: PostEvent): Posts = event match {
    case PostAdded(id, content) =>
      this.copy(byId = byId.updated(id, Post(id, content)), orderedByTimeAdded = orderedByTimeAdded :+ id)
    case PostEdited(id, content) =>
      this.copy(byId = byId.updated(id, byId(id).copy(content = content)))
    case PostDeleted(id) =>
      this.copy(byId = byId - id, orderedByTimeAdded = orderedByTimeAdded.filterNot(_ == id))
  }
}
object Posts {
  def fromHistory(events: PostEvent*): Posts = events.foldLeft(Posts())(_ apply _)
}
Since the model classes are in-memory only, we’re free to use regular Scala data structure classes to implement whatever query capabilities we
need. There is no need to worry about Object-Relational Mapping, etc. Here we use a map to track each blog post by its identifier and a simple Seq
(ordered collection or list) to keep track of the order that posts were added. Notice that this entire model is represented using immutable values,
which allows us to safely share the current state between multiple concurrent requests.
The Posts.get method simply looks up a post by its identifier. The Posts.mostRecent method takes the last n added post identifiers, reverses the result (so the most recently added post is first), and translates each identifier into a post using the byId map.
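The three steps of mostRecent are easiest to follow on a tiny data set. In this sketch plain Ints stand in for PostId and Strings stand in for Post; the MostRecentDemo object and its sample data are made up for illustration. The comments show the intermediate values for n = 2.

```scala
// Hypothetical demo object with simplified stand-in types.
object MostRecentDemo {
  val byId = Map(1 -> "first", 2 -> "second", 3 -> "third", 4 -> "fourth")
  val orderedByTimeAdded = Vector(1, 2, 3, 4)

  def mostRecent(n: Int): Seq[String] =
    orderedByTimeAdded
      .takeRight(n) // Vector(3, 4): the last n identifiers that were added
      .reverse      // Vector(4, 3): most recently added first
      .map(byId)    // Vector("fourth", "third"): a Map is also a function from key to value
}
```

Asking for more posts than exist is safe, since takeRight simply returns everything when n exceeds the number of elements.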
The Posts.apply method implements updating the current state based on an event. It matches on the type of event and updates its state accordingly. Posts.fromHistory builds up the current state by folding a sequence of events using the Posts.apply method. As with all immutable structures, an update returns a new copy of the original state with the necessary changes applied. Fortunately we do not need to copy everything to apply changes; only the parts that are changed need to be copied. This makes immutable data structures efficient enough to be practical.
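Here is a small sketch of replaying a history and then applying one more event. The model code is repeated from Post.scala and PostEvents.scala so the snippet stands on its own; the ReplayDemo object and its sample events are made up for illustration.

```scala
import java.util.UUID

case class PostId(uuid: UUID)
case class PostContent(author: String, title: String, content: String)

sealed trait PostEvent { def postId: PostId }
case class PostAdded(postId: PostId, content: PostContent) extends PostEvent
case class PostEdited(postId: PostId, content: PostContent) extends PostEvent
case class PostDeleted(postId: PostId) extends PostEvent

case class Post(id: PostId, content: PostContent)

case class Posts(byId: Map[PostId, Post] = Map.empty, orderedByTimeAdded: Seq[PostId] = Vector.empty) {
  def get(id: PostId): Option[Post] = byId.get(id)
  def mostRecent(n: Int): Seq[Post] = orderedByTimeAdded.takeRight(n).reverse.map(byId)
  def apply(event: PostEvent): Posts = event match {
    case PostAdded(id, content) =>
      this.copy(byId = byId.updated(id, Post(id, content)), orderedByTimeAdded = orderedByTimeAdded :+ id)
    case PostEdited(id, content) =>
      this.copy(byId = byId.updated(id, byId(id).copy(content = content)))
    case PostDeleted(id) =>
      this.copy(byId = byId - id, orderedByTimeAdded = orderedByTimeAdded.filterNot(_ == id))
  }
}
object Posts {
  def fromHistory(events: PostEvent*): Posts = events.foldLeft(Posts())(_ apply _)
}

// Hypothetical demo object with made-up sample events.
object ReplayDemo {
  val first = PostId(UUID.randomUUID())
  val second = PostId(UUID.randomUUID())

  val history: Seq[PostEvent] = Seq(
    PostAdded(first, PostContent("Alice", "Hello", "First post")),
    PostAdded(second, PostContent("Bob", "Scala", "Second post")),
    PostEdited(first, PostContent("Alice", "Hello again", "First post, edited")),
    PostDeleted(second))

  // Replaying the full history rebuilds the current state.
  val current = Posts.fromHistory(history: _*)

  // Applying one more event returns a new value; `current` itself is unchanged.
  val next = current.apply(PostDeleted(first))
}
```

After the replay, current contains only the edited first post, while next is empty. Any request that is still rendering a view from current is unaffected by the later update, which is what makes sharing the state between concurrent requests safe.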
Another advantage of using an in-memory model is that it can be thoroughly tested, with no need to start or stub a database. PostsSpec.scala uses both manually written examples and randomly generated data to test the model. Using the randomly generated data we can run hundreds of tests in less than a second.
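To give an idea of what testing with randomly generated data can look like, here is a rough sketch that uses only scala.util.Random. It is not the code from PostsSpec.scala: the RandomizedPostsCheck object, its generator, and the properties it checks are assumptions made for this illustration, and a real test suite would more likely use a property-based testing library.

```scala
import java.util.UUID
import scala.util.Random

case class PostId(uuid: UUID)
case class PostContent(author: String, title: String, content: String)

sealed trait PostEvent { def postId: PostId }
case class PostAdded(postId: PostId, content: PostContent) extends PostEvent
case class PostEdited(postId: PostId, content: PostContent) extends PostEvent
case class PostDeleted(postId: PostId) extends PostEvent

case class Post(id: PostId, content: PostContent)
case class Posts(byId: Map[PostId, Post] = Map.empty, orderedByTimeAdded: Seq[PostId] = Vector.empty) {
  def mostRecent(n: Int): Seq[Post] = orderedByTimeAdded.takeRight(n).reverse.map(byId)
  def apply(event: PostEvent): Posts = event match {
    case PostAdded(id, content) =>
      this.copy(byId = byId.updated(id, Post(id, content)), orderedByTimeAdded = orderedByTimeAdded :+ id)
    case PostEdited(id, content) =>
      this.copy(byId = byId.updated(id, byId(id).copy(content = content)))
    case PostDeleted(id) =>
      this.copy(byId = byId - id, orderedByTimeAdded = orderedByTimeAdded.filterNot(_ == id))
  }
}
object Posts {
  def fromHistory(events: PostEvent*): Posts = events.foldLeft(Posts())(_ apply _)
}

// Hypothetical randomized check, for illustration only.
object RandomizedPostsCheck {
  // Generates a random but valid history: only existing posts are edited or deleted.
  def randomHistory(random: Random, length: Int): Vector[PostEvent] = {
    val start = (Vector.empty[PostId], Vector.empty[PostEvent])
    val (_, events) = (1 to length).foldLeft(start) {
      case ((live, soFar), _) =>
        random.nextInt(3) match {
          case 0 if live.nonEmpty =>
            val victim = live(random.nextInt(live.size))
            (live.filterNot(_ == victim), soFar :+ PostDeleted(victim))
          case 1 if live.nonEmpty =>
            val target = live(random.nextInt(live.size))
            (live, soFar :+ PostEdited(target, PostContent("author", "edited", "content")))
          case _ =>
            val id = PostId(new UUID(random.nextLong(), random.nextLong()))
            (live :+ id, soFar :+ PostAdded(id, PostContent("author", "title", "content")))
        }
    }
    events
  }

  // Properties that should hold for the state derived from any valid history.
  def holdsFor(history: Seq[PostEvent]): Boolean = {
    val added = history.count(_.isInstanceOf[PostAdded])
    val deleted = history.count(_.isInstanceOf[PostDeleted])
    val posts = Posts.fromHistory(history: _*)
    posts.byId.size == added - deleted &&
      posts.mostRecent(5).size == math.min(5, added - deleted) &&
      posts.orderedByTimeAdded.toSet == posts.byId.keySet
  }

  def main(args: Array[String]): Unit = {
    val random = new Random(42L) // fixed seed keeps failures reproducible
    val allHold = (1 to 200).forall(_ => holdsFor(randomHistory(random, random.nextInt(50))))
    println(allHold)
  }
}
```

Since nothing here touches a database or the network, each generated history only costs a handful of in-memory operations.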
The UI pulls everything together into a working application. In this case it is a standard Play! 2.0 Scala application and the main work is done by the PostsController.
In this example application the PostsController keeps a Software Transactional Memory (STM) reference to the current state of the application. This reference is our only piece of mutable data in the entire application!
object PostsController extends Controller {
  /**
   * A Scala STM reference holding the current state of the application,
   * which is derived from all committed events.
   */
  val posts = Ref(Posts()).single
By using an STM reference any controller method can always access the current state simply by reading from the reference. New events are applied to the current state using the commit method:
  /**
   * Commits an event and applies it to the current state.
   */
  def commit(event: PostEvent): Unit = {
    posts.transform(_.apply(event))
    Logger.debug("Committed event: " + event)
  }
The posts reference is updated by using the transform method. This ensures the update occurs in a concurrency-safe manner. In this version of the application the events are not yet saved to durable storage, but to help you see what is happening while running the application the event is printed to the debug log instead. Later we’ll implement saving the committed events to durable storage before updating the current state. The commit method returns Unit, which is similar to void in Java and provides no useful information to the caller.
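To see why a single atomically updated reference is enough, here is a sketch that runs without any extra libraries. Note that it deliberately swaps Scala STM for java.util.concurrent.atomic.AtomicReference (Java 8 or later), which offers a comparable "replace the value with the result of a function" operation. The CommitSketch object, its State type, and the string events are stand-ins made up for this illustration.

```scala
import java.util.concurrent.atomic.AtomicReference
import java.util.function.UnaryOperator

// Hypothetical stand-in for the controller's state handling.
object CommitSketch {
  // Stand-in for the article's Posts: an immutable list of applied events.
  type State = Vector[String]

  // The single piece of mutable data: a reference to the current immutable state.
  private val state = new AtomicReference[State](Vector.empty)

  // Atomically replaces the state with an updated copy. The update function may
  // be retried when another thread wins the race, so it must be free of side effects.
  def commit(event: String): Unit = {
    state.updateAndGet(new UnaryOperator[State] {
      def apply(current: State): State = current :+ event
    })
    ()
  }

  // Readers simply take the current immutable snapshot.
  def current: State = state.get()

  def main(args: Array[String]): Unit = {
    val threads = (1 to 4).map { t =>
      new Thread(new Runnable {
        def run(): Unit = (1 to 1000).foreach(i => commit("event-" + t + "-" + i))
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    println(current.size) // 4000: no update is lost
  }
}
```

Because the state itself is immutable, readers never see a half-applied update: they either get the snapshot from before a commit or the one from after it.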
The PostsController.index method is invoked by Play! to render a list of posts (the mapping from URL to a controller method is configured in the routes file). The index method returns a Play! action which generates an HTTP response from an HTTP request. In this case the response is an HTTP OK (200) containing the most recent 20 posts as rendered by the index template. Notice the use of the parentheses to read the current value of the posts reference:
  /**
   * Show an overview of the most recent blog posts.
   */
  def index = Action { implicit request =>
    Ok(views.html.posts.index(posts().mostRecent(20)))
  }
To change an existing post we need to implement an HTTP GET to render an HTML form, and an HTTP POST to validate the form and update the post. The postContentForm defines the mapping between the HTML add/edit form and the PostContent class using a Play! form:
  /*
   * Blog content form definition.
   */
  private val postContentForm = Form(mapping(
    "author" -> trimmedText.verifying(minLength(3)),
    "title" -> trimmedText.verifying(minLength(3)),
    "content" -> trimmedText.verifying(minLength(3)))(PostContent.apply)(PostContent.unapply))
The show method first tries to look up the specified post by its id. If successful it fills the form with the current contents of the post and renders the views.html.posts.edit template. Otherwise it returns an HTTP 404 (Not Found) response:
  def show(id: PostId) = Action { implicit request =>
    posts().get(id) match {
      case Some(post) => Ok(views.html.posts.edit(id, postContentForm.fill(post.content)))
      case None       => NotFound(notFound(request, None))
    }
  }
When the user has made the modifications and hits the save button the submit method is invoked. First we bind the HTTP POST parameters to the form (bindFromRequest) and then handle the outcome of the validation (fold). If form validation succeeds the method commits a new PostEdited event and redirects the browser to the PostsController.show action to show the updated post. Otherwise the method rerenders the HTML form together with the validation errors:
  def submit(id: PostId) = Action { implicit request =>
    postContentForm.bindFromRequest.fold(
      formWithErrors => BadRequest(views.html.posts.edit(id, formWithErrors)),
      postContent => {
        commit(PostEdited(id, postContent))
        Redirect(routes.PostsController.show(id)).flashing("info" -> "Post saved.")
      })
  }
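If the fold call looks unfamiliar, the following sketch shows the same shape using only the standard library. It is not Play! code: the FormFoldSketch object, its validate and respond methods, and the response strings are made up for this illustration, with Either standing in for a bound form that either carries errors or a valid PostContent.

```scala
case class PostContent(author: String, title: String, content: String)

// Hypothetical stand-in for form binding and validation, not the Play! API.
object FormFoldSketch {
  // Mirrors the form's rules: every field is trimmed and needs at least 3 characters.
  def validate(params: Map[String, String]): Either[Seq[String], PostContent] = {
    val fields = Seq("author", "title", "content").map(name => name -> params.getOrElse(name, "").trim)
    val errors = fields.collect { case (name, value) if value.length < 3 => name + ": minimum length is 3" }
    if (errors.isEmpty) {
      val values = fields.toMap
      Right(PostContent(values("author"), values("title"), values("content")))
    } else {
      Left(errors)
    }
  }

  // fold takes one function per outcome, with the error case first,
  // just like the submit action above.
  def respond(params: Map[String, String]): String =
    validate(params).fold(
      errors => "rerender form with errors: " + errors.mkString(", "),
      content => "redirect after saving '" + content.title + "'")
}
```

The benefit of this style is that both outcomes must be handled explicitly; there is no way to get at the PostContent without also saying what happens when validation fails.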
Note that our “domain logic” is implemented directly in the controller. This is fine for simple applications like this example, but a rich domain model is usually introduced when the logic to generate the events becomes less straightforward.
As discussed above, one of the advantages of an in-memory model is that there is no need to talk to a database just to render a view. Even though this application has not been performance tuned, it is still interesting to get a basic idea of the performance. I ran some trivial benchmarks on my laptop (an early 2010 MacBook Pro with 2.66 GHz dual core i7 with hyper-threading).
Just rendering the blog posts index view runs at about 5,500 GETs per second (with the three example blog posts you see when you start the application). This was measured with Apache JMeter using 25 concurrent client threads running on the same machine as the server.
Just submitting the blog post edit form runs at 4,200 HTTP POSTs per second or so.
In both tests the server is basically CPU bound, since we’re not performing any disk I/O. It will be interesting to see how well we can do with an actual event store implementation. The server was running on JDK 1.7.0u5 with the Garbage First GC (JVM option -XX:+UseG1GC).
The example application shows a simplified implementation of the event sourcing and memory image principles. Domain events are generated and used to apply updates to the memory image, which is then used to render views, without the need for a traditional database.
Hopefully you’ve noticed that none of the code is very complex. Each part (events, model, controller actions) simply focuses on a single task, without any of the “magic” that is so common with many web- or CRUD-frameworks. Even the most complicated method (Posts.apply) is a rather straightforward translation of events into updates to the current state and is easy to thoroughly test.
But the events are not yet committed to durable storage, which is necessary to build a production-ready application. In the next part we’ll see what is needed to capture the events correctly so that we can start writing the events to durable storage.