Domain-Driven Design with Functional Programming in Scala
Implementing a domain model using functional programming in Scala.
November 02, 2020 · I’d been thinking about writing this article for a long time. The final push came after working on a few projects with complex domains, when I came across Domain Modeling Made Functional by Scott Wlaschin. Reading this book led me to several observations I’d like to share.
Scala is a perfect match for modeling complex domains. Its advanced type system allows expressing business concepts and rules more precisely and directly. The benefits are evident in code that becomes more type-safe, descriptive, and comprehensible. It improves communication inside and outside the team, prevents bugs, and reduces the number of unit tests needed, as we let the compiler do as much of the checking as possible.
When I’m thinking about domain modeling in Scala, I see the following aspects:
- Using Algebraic Data Types (ADTs) to model data structures that represent different concepts, states, and relations in our domain model.
- Using Option, Either, IO/Future types to model different effects of operations in our domain model.
- Using function composition to describe business processes occurring in our domain model.
- Making illegal states in the domain model unrepresentable.
A domain model does not consist only of entities like Orders or Payments. Orders can be placed, shipped, or fulfilled. Payments can be authorized, captured, or settled. All of those state transitions are usually parts of larger business processes, which are modeled as chains of composed functions operating on ADTs, where compilation fails if the code tries to express an illegal state. This is where functional programming shines if you’re modeling non-trivial domains.
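To make the idea of state transitions as composed functions a bit more tangible, here is a minimal sketch. All names in it (OrderId, PlacedOrder, ShippedOrder, FulfilledOrder, OrderProcess, the tracking number format) are hypothetical and chosen only for illustration; a real process would carry far more data and would likely return effects.

```scala
// Hypothetical, simplified order states; each state is its own type.
final case class OrderId(value: String)

final case class PlacedOrder(id: OrderId)
final case class ShippedOrder(id: OrderId, trackingNumber: String)
final case class FulfilledOrder(id: OrderId, trackingNumber: String)

object OrderProcess {
  // Each step accepts only the state it is allowed to start from.
  val ship: PlacedOrder => ShippedOrder =
    placed => ShippedOrder(placed.id, trackingNumber = s"TRACK-${placed.id.value}")

  val fulfill: ShippedOrder => FulfilledOrder =
    shipped => FulfilledOrder(shipped.id, shipped.trackingNumber)

  // The whole process is the composition of its steps.
  val process: PlacedOrder => FulfilledOrder = ship andThen fulfill

  // fulfill(PlacedOrder(OrderId("1"))) would not compile:
  // an order that has not been shipped cannot be fulfilled.
}
```

Because every state has its own type, skipping a step is a type error rather than a runtime check.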
In this article, we’ll be talking about implementing a domain model using Functional Programming in Scala. We’ll be using Scala 2, but we’ll also see what changes are coming with Scala 3. We’ll see how libraries like cats, refined, newtype and enumeratum can be used when implementing a domain model.
But before we jump into the code, let’s talk about Domain-Driven Design (DDD) first.
Domain-Driven Design 101
A domain model is the only touchpoint between the real world and the code. It’s neither a diagram nor classes/ADTs in your code. Not a database schema either. It’s a mental model of how the business works, shared between the developers and domain experts (a.k.a. the Business). You can, however, implement the domain model as code using your favorite programming language, as a diagram, or as textual documentation. With that in mind, Domain-Driven Design is all about making the domain model the focal point of the project. DDD comes with a set of techniques ranging from ways of working with domain experts to explore the model (Event Storming, Domain Storytelling), to improving communication (Ubiquitous Language), splitting the domain into cohesive groups (Bounded Contexts), and even architectural patterns (Hexagonal Architecture, Event Sourcing, CQRS). All of them revolve around the domain model.
What is a model? A model is a simplification of reality that is used to solve a particular problem. A city map is a simplification that allows you to find a route. It doesn’t contain unnecessary details, like what color the buildings are, how they look, or what type of pavement is used. A city map is just a network of intersecting lines, which is just enough to solve the problem of finding a route.
What is the domain? Cambridge dictionary gives some nice explanations:
- one’s area of interest or of knowledge
- a particular interest, activity, or type of knowledge
In the context of software engineering, I’d slightly extend that definition and say: one’s area of interest or of knowledge where we apply the code to solve problems. For example, E-commerce or Payments fall in that category and are very valid domains some of you might have already been involved in. In other words, it’s an area of expertise of domain experts.
Why do I need to know about DDD as a Scala developer? If you’re reading this article, it may or may not shock you to hear that you’re probably already using DDD to some extent, more or less consciously. DDD was first presented in 2003 in Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans (a.k.a. the Blue Book). The book is a collection of observations and patterns that naturally emerge “in the wild”. It’s likely they’re present in your environment as well. Again, DDD at its core is all about a shared understanding of the domain model and improved communication, achieved by explicitly making the model the focal point of a project.
If there’s a shared understanding of the domain between the developers and domain experts, then the application code should reflect the reality. This is where you, the Scala developer, step in and make sure it does by using available language constructs and libraries.
For what kind of Scala applications can I use DDD? Where not? DDD can be applied in backend systems, where your goal is to create software that solves non-trivial business problems recognized by domain experts. Gaming or hardware might not fall in that category. Simple CRUDs might not either. However, more complex systems with non-trivial business logic like processing payment transactions or orders in an e-commerce store are good candidates.
What about Spark? Well, as mentioned, DDD is about designing backend systems where the focal point is the domain model. On the other hand, DDD is a set of techniques, and I’d say some of them can certainly be used with Spark applications (e.g. Event Storming or Ubiquitous Language).
Enough about DDD. There is much more about DDD that would be relevant here, but I want to focus on the practical use of Scala.
However, if you’re keen to do more reading on that, here are some recommended resources:
- Domain Modeling Made Functional: Tackle Software Complexity with Domain-Driven Design and F# by Scott Wlaschin
- Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
- https://github.com/heynickc/awesome-ddd
DDD + FP + Scala
Features of Scala and FP concepts that make it easy to implement domain models. How would this diagram look with different programming languages?
So we’ve talked about DDD and its goal of improved communication and understanding between the developers and domain experts. Now let’s assume we used Event Storming to understand the business processes we want to automate. We now have a clearer picture of what we want to deliver in the first iteration, and it’s just the right moment to start writing code!
Types…. Types, everywhere…..
The first thing we can do when implementing a domain model is to use types that are part of the ubiquitous language, which makes the implementation of the domain model more precise. Let’s see how to do that by looking first at a counterexample.
In some codebases, you can spot the following primitive type based implementation of a domain model:
import java.time.ZonedDateTime

case class Invoice(
  companyName: String,
  taxId: String,
  street: String,
  zipCode: String,
  city: String,
  country: String,
  amount: BigDecimal,
  currency: String,
  payableUntil: ZonedDateTime
)
The values don’t have any domain meaning. They’re just strings with labels. It’ll hurt you in the long run, especially once you perform more operations on the Invoice, as it’s easy to confuse the fields. You could end up with:
Invoice(
  companyName = "123-45-67-890",
  taxId = "",
  street = "SW1A 0AA",
  zipCode = "",
  city = "UK",
  country = "London",
  amount = BigDecimal(10),
  currency = "Oxford Street",
  payableUntil = ZonedDateTime.now().plusWeeks(2)
)
and the project would still compile.
We can do better with Entities and Value Objects
Domain modeling with types is all about making the business knowledge explicit in the code. This is where DDD comes into play with concepts called Entities and Value Objects.
- Entities are objects (not in OOP terms) that have a domain meaning with an identity and often a life cycle. For example Order, Customer, Shipment, Contract.
- Value Objects represent values with a domain meaning. They lack identity and lifecycle. This is what makes them different from entities. Money, Address, or CompanyName would fall in that category.
It can happen that in one domain some concepts can be modeled as value objects, in another one as entities. In an address book application, an address would probably be an entity, while in an invoicing system it could be a value object. There are more details regarding those constructs and you can find more about them in “The Blue Book” by Eric Evans.
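One way to picture the difference in code is to look at what “the same” means for each kind of object. The sketch below is only an illustration: Address, Customer, CustomerId and the sameEntity helper are names I made up for it, and comparing entities by id is a common convention rather than something the compiler enforces.

```scala
import java.util.UUID

// Value object: no identity; two instances with equal fields are interchangeable.
final case class Address(street: String, city: String)

// Entity: identified by its id; the other fields may change over its life cycle.
final case class CustomerId(value: UUID)
final case class Customer(id: CustomerId, name: String, address: Address)

object Customer {
  // Hypothetical helper: two snapshots describe the same entity when their ids match.
  def sameEntity(a: Customer, b: Customer): Boolean = a.id == b.id
}
```

A customer who moves to a new address produces a snapshot that is no longer structurally equal to the old one, yet sameEntity still treats both as the same customer. Two Address values with identical fields, on the other hand, are simply equal.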
Going back to our Invoice example, how can we improve it? A good starting point is replacing all primitives with more meaningful types:
import java.time.ZonedDateTime
import java.util.UUID

case class InvoiceId(value: UUID)
case class CompanyName(value: String)
case class TaxId(value: String)
case class Street(value: String)
case class ZipCode(value: String)
case class City(value: String)
case class Country(value: String)
case class Amount(value: BigDecimal)
case class Currency(value: String)
case class Invoice(
  invoiceId: InvoiceId,
  companyName: CompanyName,
  taxId: TaxId,
  street: Street,
  zipCode: ZipCode,
  city: City,
  country: Country,
  amount: Amount,
  currency: Currency,
  payableUntil: ZonedDateTime
)
That’s better: the fields now have distinct types, so we can no longer confuse them. We can improve that even further:
case class InvoiceId(value: UUID)
case class Company(
  name: CompanyName,
  billingAddress: BillingAddress,
  taxId: TaxId
)
case class BillingAddress(
  street: Street,
  zipCode: ZipCode,
  city: City,
  country: Country
)
sealed trait Currency
case object PLN extends Currency
case object EUR extends Currency
case object GBP extends Currency
case class Money(amount: BigDecimal, currency: Currency)
case class Invoice(
  invoiceId: InvoiceId,
  company: Company,
  amount: Money,
  payableUntil: ZonedDateTime
)
Side note: You might have doubts about runtime performance. Introducing so many new types, each instantiated along with every Invoice, must impose some performance penalty. We’ll address this problem later.
We extracted all meaningful values to separate case classes (a.k.a. Value Objects), that are now a part of an Invoice case class (a.k.a. Entity). We gain several benefits from such design:
- All terms, like billing address, currency, or tax id exist in the ubiquitous language and we use it every day to communicate with domain experts. If you show the snippet above to a non-technical domain expert, they’ll understand and validate it. They wouldn’t get what Strings or BigDecimals are since they know only about invoices, tax ids, or company names.
- We can add validation to every value object we extracted. Then, using function composition, we can construct a valid invoice. We’ll see how to do that in the next section.
- Every value object can contain behavior associated with it. What I mean is that you can define functions operating on e.g. Money, that would for example sum 2 amounts, and it’d be of type
sum: (Money, Money) => Money
instead of operating on raw BigDecimals. Again: better type safety, readability, and less confusion.
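As a rough sketch of such behavior, here is one way a sum on Money could look. Note the deviation from the signature above: I’m assuming that adding amounts in different currencies is a domain error, so this variant returns an Either. That assumption, and the String error type, are mine; your domain experts may want conversion or something else entirely.

```scala
sealed trait Currency
case object PLN extends Currency
case object EUR extends Currency
case object GBP extends Currency

final case class Money(amount: BigDecimal, currency: Currency)

object Money {
  // Assumption: mixing currencies is rejected instead of converted,
  // hence Either[String, Money] rather than a bare Money.
  def sum(a: Money, b: Money): Either[String, Money] =
    if (a.currency == b.currency) Right(Money(a.amount + b.amount, a.currency))
    else Left(s"Cannot add ${a.currency} to ${b.currency}")
}
```

Callers work with Money values only; the raw BigDecimal arithmetic stays inside the value object.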
It might also be worth pointing out that Currency is an enum and can be easily implemented using the Enumeratum library in Scala 2.x. What we gain is automatic type class derivation for enumeratum enums (e.g. Circe or Doobie codecs), convenience constructors (withName, withNameOption, withNameUppercaseOnlyOption, …), and the possibility to list all enum entries. Scala 3, however, comes with the new enum construct, which might be the way to go.
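To show what kind of boilerplate an enum library saves, here is a hand-rolled, dependency-free sketch of two such helpers. It deliberately does not use Enumeratum; the name field and the manually maintained values list are my own simplifications, and only the withNameOption name is borrowed from the list above.

```scala
sealed trait Currency { def name: String }

object Currency {
  case object PLN extends Currency { val name = "PLN" }
  case object EUR extends Currency { val name = "EUR" }
  case object GBP extends Currency { val name = "GBP" }

  // Kept in sync by hand here; an enum library derives this list for you.
  val values: List[Currency] = List(PLN, EUR, GBP)

  // Hand-written counterpart of a withNameOption-style lookup.
  def withNameOption(name: String): Option[Currency] =
    values.find(_.name == name)
}
```

The weak spot is obvious: adding a new currency and forgetting to extend values compiles just fine, which is exactly the kind of mistake a library or the Scala 3 enum construct removes.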
Making illegal states unrepresentable
Making illegal states unrepresentable is another aspect that might bring our implementation of the domain model closer to reality.
This concept boils down to forbidding illegal states in our application instead of performing defensive runtime checks or validations. If we try to encode such an invalid state, the compiler will raise an error. There are three primary ways to achieve that:
- By using ADTs properly.
- By using smart constructors.
- By using type refinement.
Sounds enigmatic? Let’s see them in action.
Making illegal states unrepresentable using ADTs
Let’s imagine we have a system that processes jobs asynchronously. It allows fetching the result of each job by ID. A naive way of modeling the job result would be:
import scala.concurrent.duration.Duration

case class JobResult[R](
  result: Option[R],
  progress: Double,
  eta: Option[Duration],
  isComplete: Boolean,
  isFailed: Boolean
)
When we see such a definition, we can only guess at the intent. It’s easy to put JobResult in an invalid state. For example, what would be the meaning of:
import java.util.concurrent.TimeUnit

JobResult(
  result = Some(42),
  progress = 47.34,
  eta = Some(Duration(3, TimeUnit.HOURS)),
  isComplete = true,
  isFailed = false
)
Instead, we can model the job result as follows:
sealed trait JobResult
case class Done[R](result: R) extends JobResult
case class InProgress(progress: Double, eta: Option[Duration]) extends JobResult
case class Failed(error: Throwable) extends JobResult
Now it’s impossible to put the job result in an invalid state and the intentions are clearly expressed using types.
Hint: if you see booleans in your domain model, think if it makes sense to convert them to distinct types.
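A nice side effect of the ADT version is that consumers have to handle every state, and the compiler checks that for them. The sketch below is a slight variation on the definition above: I parameterized the trait (JobResult[+R]) so the result type is visible to callers, and the describe function is a made-up example of a consumer.

```scala
import scala.concurrent.duration.Duration

// Variation on the article's ADT: the trait carries the result type.
sealed trait JobResult[+R]
final case class Done[R](result: R) extends JobResult[R]
final case class InProgress(progress: Double, eta: Option[Duration]) extends JobResult[Nothing]
final case class Failed(error: Throwable) extends JobResult[Nothing]

object JobResult {
  // Hypothetical consumer. Because the trait is sealed, the compiler
  // warns if one of the cases is missing from this match.
  def describe[R](job: JobResult[R]): String = job match {
    case Done(result)              => s"done: $result"
    case InProgress(progress, eta) => s"in progress: $progress%" + eta.fold("")(e => s", eta $e")
    case Failed(error)             => s"failed: ${error.getMessage}"
  }
}
```

There is no branch for “complete and failed at the same time”, because such a value cannot be constructed in the first place.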
Making illegal states unrepresentable using smart constructors
In the past, I had an interesting discussion with a colleague about validation. When I joined the project, I noticed all validations were performed inside domain services. Here’s a simplified version of the OrderService:
import cats.effect.IO

case class Order(customerId: CustomerId, amount: Money)
class OrderService(...) {
  def fulfill(order: Order): IO[Either[Error, Unit]] = {
    // Defensive checks performed before the actual business logic
    if (order.amount.amount > BigDecimal(10)) {
      IO.pure(Left(Error("Order amount is too high")))
    } else if (order.customerId.value.isEmpty) {
      IO.pure(Left(Error("CustomerId is empty")))
    } else {
      // continue with order fulfillment
      ???
    }
  }
}
What they wanted to achieve was reasonable. We should make sure Order is valid before starting the fulfilment process. This was a requirement from the domain expert.
Unfortunately, from a technical perspective, such a defensive approach to validation doesn’t scale. As the number of validation rules grows, the order fulfilment function gets cluttered with validation logic, not to mention the exploding number of unit tests. What if we want to implement order shipping? We’d probably have to duplicate all the validations there.
What if instead of performing validation inside the OrderService, we could ensure it’s impossible to create an invalid Order in our system? So whenever an Order is constructed, it is already valid, and OrderService now:
- Doesn’t have to perform Order validation.
- Doesn’t have to figure out what to do if it’s invalid.
In other words, we’d make Order in an illegal state (e.g. with invalid amount) unrepresentable in our application. Any other piece of code operating on an Order, not necessarily the OrderService, could immediately assume it’s valid. Let’s try it!
First of all, let’s make the constructor private:
case class Order private(customerId: CustomerId, amount: Money)
Unfortunately, we’re still able to use the apply method in the companion object to construct an Order:
@ Order(CustomerId("a"), Money(BigDecimal(10), PLN))
res10: Order = Order(CustomerId("a"), Money(10, PLN))
To forbid that, we’ll make the apply method private:
object Order {
  private def apply(
    customerId: CustomerId,
    amount: Money
  ): Order = new Order(customerId, amount)
}
It’s still possible to create an invalid order by creating a valid one and then copying it with the copy method, setting invalid values. To prevent that, we can use a trick and define Order as a sealed abstract case class. This makes copying impossible. Whether to use a sealed abstract case class or just a case class is a matter of taste: on one hand you forbid copying, on the other the signature becomes longer. Luckily, in Scala 3 this is not an issue anymore: case classes with private constructors have private apply and copy methods as well.
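For reference, this is roughly what the sealed abstract case class trick looks like. It’s a simplified sketch: the amount is a plain BigDecimal, the error is a String, and the create factory (anticipating the smart constructor we’re about to introduce) is just a placeholder to show how instances get built.

```scala
final case class CustomerId(value: String)

// `sealed abstract` stops the compiler from generating `apply` and `copy`.
sealed abstract case class Order(customerId: CustomerId, amount: BigDecimal)

object Order {
  // The only way to obtain an Order. The anonymous subclass is allowed
  // because it is defined in the same file as the sealed class.
  def create(customerId: CustomerId, amount: BigDecimal): Either[String, Order] =
    if (amount > 0) Right(new Order(customerId, amount) {})
    else Left("amount should be greater than 0")
}
```

With this definition, neither Order(...) nor order.copy(...) compiles outside the companion, so a validated instance cannot be turned into an invalid one.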
Now we want to allow constructing only valid Orders. To do that, we’ll create a smart constructor that performs the validation:
object Order {
  def create(
    customerId: CustomerId,
    amount: Money
  ): Either[ValidationError, Order] = ???
}
Implementing validation
Let’s have a look at how we could approach the implementation of the create smart constructor. A naive way to do it would be:
def create(
  customerId: CustomerId,
  amount: Money
): Either[ValidationError, Order] =
  if (customerId.value.isEmpty) {
    Left(ValidationError("customerId should be nonempty"))
  } else if (amount.amount <= 0) {
    // zero is rejected as well, to match the error message
    Left(ValidationError("Amount should be greater than 0"))
  } else {
    Right(Order(customerId, amount))
  }
But it doesn’t scale. With more fields and validation rules, the create method would grow quickly. When introducing Value Objects, I mentioned they could perform validation on their own. With such an approach, we wouldn’t be able to create an invalid CustomerId or Money in our application.
case class CustomerId private (value: String)
object CustomerId {
  private def apply(
    value: String
  ): CustomerId = new CustomerId(value)

  // Smart constructor: the only public way to obtain a CustomerId
  def create(
    value: String
  ): Either[ValidationError, CustomerId] =
    if (value.nonEmpty) Right(CustomerId(value))
    else Left(ValidationError("customerid should be nonempty"))
}
Assuming that we do the same with Money, the Order create method would look as follows:
def create(
  customerId: CustomerId,
  amount: Money
): Either[ValidationError, Order] =
  // both arguments are already valid, so there is nothing left to check
  Right(Order(customerId, amount))
We don’t need to validate the CustomerId or the Money when constructing an Order. We could even make the Order#apply method public again now. That’s because we pushed the responsibility of constructing a valid CustomerId and Money to the caller.
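The Money counterpart isn’t shown above, so here is one possible sketch of it. The parseCurrency helper, the error messages, and the decision to accept the currency as a String code are my assumptions for illustration; only the overall shape (private constructor, private apply, public create) follows the CustomerId example.

```scala
final case class ValidationError(message: String)

sealed trait Currency
case object PLN extends Currency
case object EUR extends Currency
case object GBP extends Currency

final case class Money private (amount: BigDecimal, currency: Currency)

object Money {
  private def apply(amount: BigDecimal, currency: Currency): Money =
    new Money(amount, currency)

  // Hypothetical parser for the three currencies used in this article.
  private def parseCurrency(code: String): Either[ValidationError, Currency] =
    code match {
      case "PLN" => Right(PLN)
      case "EUR" => Right(EUR)
      case "GBP" => Right(GBP)
      case other => Left(ValidationError(s"unsupported currency: $other"))
    }

  // Smart constructor: checks the amount first, then the currency code.
  def create(amount: BigDecimal, currency: String): Either[ValidationError, Money] =
    if (amount <= 0) Left(ValidationError("amount should be greater than 0"))
    else parseCurrency(currency).map(Money(amount, _))
}
```

Keep in mind that in Scala 2 the copy method of such a case class is still public, as discussed earlier.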
Constructing *Order* from primitive types
Yet, we still might want to construct an Order from primitive types. One way to implement it is by using a for-comprehension:
def create(
  customerId: String,
  amount: BigDecimal,
  currency: String
): Either[ValidationError, Order] =
  for {
    customerId <- CustomerId.create(customerId)
    money      <- Money.create(amount, currency)
  } yield Order(customerId, money)
In this approach, however, the execution short-circuits on the first validation error found. If customerId is invalid, the program won’t even try to create money. The for-comprehension immediately returns a Left with the ValidationError instead. Let’s see that in action:
@ Order.create("", -10, "PLN")
res8: Left(ValidationError(customerid should be nonempty))
@ Order.create("nonemptyCustomerId", -10, "PLN")
res9: Left(ValidationError(amount should be greater than 0))
@ Order.create("nonemptyCustomerId", 42, "PLN")
res10: Right(Order(CustomerId(nonemptyCustomerId),Money(42,PLN)))
Maybe we expect such fail-fast behavior, maybe not. Oftentimes when performing validation we want to verify all the fields at once and accumulate any errors that occur. Unfortunately, that’s not doable with a for-comprehension over Either. To achieve this, we can use another data type called Validated, available in Cats.
Using Validated data type
The Validated data type is very similar to Either. It has the same shape, Validated[E, A], and serves a similar purpose. Why would you consider using it if it’s the same, though? The difference is that Validated accumulates all the errors during validation, while Either stops at the first one.
… but why? This difference in behavior comes from the fact that Validated is only an applicative, while Either is a monad. That means you cannot flatMap a Validated; you can only compose it using applicative composition (e.g. mapN).
Side note: Functional constructs like functors, applicatives (a.k.a. applicative functors), or monads are not the goal of this article. While you’re not required to understand them as a functional programmer, it’s worth doing that, and Validated is in fact a great way to learn about applicative functors. I can recommend reading documentation about Validated, which describes what an applicative is and how it differs from a monad.
Let’s get back to the code. With Validated, the implementation would look as follows:
import cats.implicits._
import cats.data.NonEmptyList

def create(
  customerId: String,
  amount: BigDecimal,
  currency: String
): Either[NonEmptyList[ValidationError], Order] =
  (
    CustomerId.create(customerId).toValidatedNel,
    Money.create(amount, currency).toValidatedNel
  ).mapN(Order.apply).toEither
We converted all smart constructors returning Eithers to Validated using toValidatedNel. It converts an Either[E, A] to a Validated[NonEmptyList[E], A], as the errors will be accumulated inside the NonEmptyList. We’re using mapN, which is one of the ways to compose applicatives. In the end, we convert back to Either[NonEmptyList[ValidationError], Order] using toEither.
Let’s try it out:
@ Order.create("", -10, "PLN")
res10: Left(NonEmptyList(ValidationError(customerid should be nonempty), ValidationError(amount should be greater than 0)))
@ Order.create("nonemptyCustomerId", 42, "PLN")
res11: Right(Order(CustomerId(nonemptyCustomerId),Money(42,PLN)))
As we can see all the validation errors are accumulated now.
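If you’re wondering what makes accumulation possible, it helps to write the combining step by hand. The sketch below uses no Cats at all: Accumulate.map2 is a hypothetical helper I made up, working on plain Either with a List of errors, and it is not how Cats implements Validated internally. It only illustrates the principle: both inputs are computed independently, so the function can look at both of them before deciding on the outcome.

```scala
object Accumulate {
  // Hypothetical helper, not part of Cats: combines two independent results
  // and keeps the errors from both sides.
  def map2[E, A, B, C](
      fa: Either[List[E], A],
      fb: Either[List[E], B]
  )(f: (A, B) => C): Either[List[E], C] =
    (fa, fb) match {
      case (Right(a), Right(b)) => Right(f(a, b))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2) // both failed: keep all errors
      case (Left(e1), _)        => Left(e1)
      case (_, Left(e2))        => Left(e2)
    }
}
```

With flatMap, the second computation is a function of the first one’s result, so when the first one fails there is nothing to run and no second error to collect.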
Error accumulation with Either using cats Parallel
While Either’s own composition always fails fast, we can still get accumulating behavior out of it by using the Parallel type class from Cats.
Parallel applies to monads that can also be composed in parallel, as if they were plain applicatives. All monads are applicatives: there’s an applicative instance available for every monad (but not the other way around), and it’s implemented in terms of flatMap (the so-called monad-applicative consistency law). In other words, you can call mapN on e.g. the Either monad, but it’ll still compose sequentially, i.e. with fail-fast behavior.
Source: https://impurepics.com/posts/2019-03-18-monad-applicative-consistency-law.html
Parallel solves that problem by converting a Monad to Applicative under the hood. What it means in the context of Order validation is that we could implement it as follows:
def create(
  customerId: String,
  amount: BigDecimal,
  currency: String
): Either[NonEmptyList[ValidationError], Order] =
  (
    CustomerId.create(customerId).toEitherNel,
    Money.create(amount, currency).toEitherNel
  ).parMapN(Order.apply)
We use toEitherNel, which simply converts any Either[E, A] to Either[NonEmptyList[E], A]. Then, instead of mapN, we use parMapN, which does the job for us: it converts Either to Validated, runs the validation, and converts back to Either. Neat!
Making illegal states unrepresentable using type refinement
Type refinement is another way to make invalid states unrepresentable. For example, say Order is modeled as follows:
case class CustomerId(value: String)
case class Money(amount: BigDecimal, currency: Currency)
case class Order(customerId: CustomerId, amount: Money)
Let’s say that we want customerId to be always non-empty and the money amount to be always greater than 0. All other values should be forbidden. We could achieve that using so-called refined types. What it means is that we can express the business constraints (customerId != “” and Money amount > 0) using types, and validation will work both at compile time and at runtime. We’ll use the fthomas/refined library for that.
For example, to make sure CustomerId is always non-empty, we’ll use NonEmptyString instead of String. The same goes for the amount: instead of BigDecimal, we’ll use PosBigDecimal, which accepts only values greater than 0 (NonNegBigDecimal would also let 0 through).
import eu.timepit.refined.types.all._
case class CustomerId(value: NonEmptyString)
case class Money(amount: PosBigDecimal, currency: Currency)
case class Order(customerId: CustomerId, amount: Money)
To create values of the refined types we can use the following functions:
import eu.timepit.refined.refineMV
import eu.timepit.refined.refineV
import eu.timepit.refined.collection.NonEmpty
@ refineMV[NonEmpty]("a")
res0: NonEmptyString = a
@ refineV[NonEmpty]("b")
res1: Either[String, NonEmptyString] = Right(b)
@ refineV[NonEmpty]("")
res2: Either[String, NonEmptyString] = Left(Predicate isEmpty() did not fail.)
- refineMV turns a literal value (known at compile time) into a refined type.
- refineV turns a runtime value into a refined type.
We can still create smart constructors that internally use refineV and expose our own validation error types instead of the error types coming from refined.
With type refinement, we can see what the possible values are just by looking at the types. We also don’t have to implement the validation ourselves, as refined does that for us with the refineV construct. There are more constructs, and all of them can be found in the docs.
Optimizing runtime performance
We learned that to make our domain model rich, we want to use value objects instead of primitive types and build the behaviors around them. However, such an approach costs us performance.
case class Street(value: String)
case class ZipCode(value: String)
case class City(value: String)
case class Country(value: String)
case class BillingAddress(
  street: Street,
  zipCode: ZipCode,
  city: City,
  country: Country
)
To create a BillingAddress, we need to instantiate Street, ZipCode, City, and Country. They’ll also need to be garbage collected at some point. That all sounds like we’d hamper the runtime performance.
Fortunately, Scala and its libraries offer a few mechanisms to avoid that:
a) Value classes allow avoiding unnecessary allocations just by making a case class that holds a single value extend AnyVal. The wrapped type will be used at runtime instead. With value classes, the code above would look as follows:
case class Street(value: String) extends AnyVal
case class ZipCode(value: String) extends AnyVal
case class City(value: String) extends AnyVal
case class Country(value: String) extends AnyVal
case class BillingAddress(
  street: Street,
  zipCode: ZipCode,
  city: City,
  country: Country
)
Yet it’s not a silver bullet. There is a variety of situations where it would actually box and that’s why it’s not the ultimate solution. The documentation specifies situations where allocation would be necessary, but in practice, it’s hard to predict.
b) newtypes come with the @newtype macro annotation, which completely avoids boxing at runtime for case classes holding a single value. Newtypes in Scala are available through the estatico/scala-newtype library. They come with limitations. Newtypes definitely deserve a separate article. Here’s a brief comparison of value classes and newtypes.
c) Opaque types (Scala 3) allow creating zero-runtime-cost wrappers for our types, and will potentially become the way to go when Scala 3 is released. You can see some usage examples here or in the docs.
Summary
I started this article by saying that Scala is a perfect match for modeling complex domains, and I think the examples above support that claim.
We saw what DDD means in the context of backend Scala applications. With a DDD mindset, we try to keep the domain model as close as possible to reality, knowing it’s the only touchpoint between the code and the fuzzy real world. Scala, in contrast to other programming languages, definitely helps us with that.
By using more types and ADTs we make the domain model expressive and more understandable even to non-technical domain experts. With pure functions, we avoid surprises, as we immediately know about all kinds of possible results of business operations. Function composition helps us construct big business processes out of smaller actions and build big entities out of small value objects. Finally, the compiler helps us to keep illegal states not representable in our application.
Now I have a question for you: what constructs in Scala 2/3 or libraries do you find helpful when implementing a domain model? Please share your observations in the comments.
One last thing: there’s a concept I deliberately didn’t cover here: modeling DDD aggregates. This topic is big enough for its own article that I might write in the future.
I hope you pulled some interesting ideas from this article and now you can start using them in your project!
You can find all the code examples on Github: bszwej/ddd-with-fp-in-scala