Semantic Typing We Ignore


TL;DR

Semantic typing in Kotlin involves creating meaningful type subsets to express intent and prevent misuse, enhancing compile-time safety, documentation, and validation. It shifts from structural types like 'String' to semantic types like 'Email', centralizing constraints and reducing duplication. This practice applies to both application and library code, guiding deliberate type narrowing for clearer, safer software.

Key Takeaways

  • Semantic typing creates meaningful type subsets (e.g., 'Email' instead of 'String') to express intent and prevent accidental misuse, improving code clarity and safety.
  • Benefits include compile-time safety by making invalid calls impossible, self-documenting code that reduces duplication, and centralized validation that eliminates scattered checks.
  • It is particularly useful in domain layers (e.g., Value Objects in DDD) to enforce business rules and in libraries to ensure robust, maintainable APIs.
  • While it adds upfront effort, it reduces long-term maintenance by centralizing logic and preventing bugs, making it a worthwhile practice in many contexts.

Tags

kotlin, architecture, programming, codequality

In Kotlin, we constantly narrow our types. We prefer String over Any for text, Int over Any or even Number for integers. This feels obvious — almost trivial. Done, right?

So what is this article actually about?

While we subconsciously avoid Any in most situations — for good reasons, or sometimes simply because "why would I put Any here instead of Int?" — we often don't apply the same thinking when modeling our own software: its types, its behavior, and the relationships between them.

This article explores the practice of semantic typing — what it means, why you might want it, when it improves your code, and when it becomes a hindrance. We'll look at how semantic typing shows up in real Kotlin projects, both in application code and library design, and what tends to work well — or not.

Even if you're not following Domain-Driven Design, these ideas still apply. You already narrow types every day; this article is about doing it deliberately and thoughtfully.

We'll distill these lessons into practical rules to help you decide when to narrow types semantically — guided by reason, not intuition or habit.

What is semantic typing?

Let's start by defining what we're actually talking about, why it matters, and why it's not something you can just ignore.

Semantic typing is the practice of creating semantically meaningful subsets of general-purpose types to express intent and prevent accidental misuse. It's about shifting from structural types ("this is a string") to behavioral or semantic types ("this is an email").

The idea of semantic typing isn't unique to Kotlin — it has long been a common practice in languages like Java, where developers define lightweight wrapper classes (e.g. UserId, Email) around primitives to encode domain meaning and prevent accidental misuse.

For example:

@JvmInline
value class EmailAddress(val raw: String) {...}

It's worth mentioning that semantic typing isn't limited to wrapping primitives; it applies equally to wrapping any type — whether a simple Int or a complex Map<K, V>. The core idea is to give semantic meaning and stronger type safety by distinguishing values beyond their raw structure.
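
As a rough sketch of wrapping something richer than a primitive, consider a type over a Map<String, String>. The HttpHeaders name, its lookup rule, and the contentTypeOf helper are assumptions made up for illustration; they are not taken from any library:

```kotlin
// Hypothetical example: a semantic type over a Map rather than a primitive.
@JvmInline
value class HttpHeaders(val raw: Map<String, String>) {
    // Header names are treated as case-insensitive, so the lookup rule lives on the type
    // instead of being re-implemented by every caller.
    operator fun get(name: String): String? =
        raw.entries.firstOrNull { it.key.equals(name, ignoreCase = true) }?.value
}

// Accepts only HttpHeaders, not an arbitrary Map<String, String>.
fun contentTypeOf(headers: HttpHeaders): String? = headers["content-type"]
```

The map itself is unchanged; what the wrapper adds is a name and one place to keep the rules that belong to that name.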

Why should you care? This is usually where skepticism appears. Whenever this approach comes up, the reactions often sound like this:

What do I gain in return for this extra code? Isn't this just more boilerplate?

Those are completely valid questions: it does take more time to think through how to model your code, and more time to write it. I'm all for KISS and for avoiding the wrong-abstraction problem... and now onto a big BUT! 🌚

What you get in return is actually not an abstraction for its own sake — it's clarity. Let's look at what this means in practice.

Why?

Compile-time safety

Let's create an example:

/**
 * @param value The width in density-independent pixels. 
 */
fun setWidth(value: Int) { ... }

At first glance, this looks fine. But it's easy to accidentally swap the measurement unit when calling it:

val containerSize = 1000 // random int or just raw pixels value
card.setWidth(containerSize) // dp was expected

The code compiles and may even look valid, but it hides a serious bug that the function itself has no way to detect.

By narrowing types, we make invalid calls impossible at compile time. For example, we can introduce a semantic type, Dp:

@JvmInline
value class Dp(public val raw: Int) {...}

/**
 * @param value The width in density-independent pixels. 
 */
fun setWidth(value: Dp) { ... }

Now, whenever the value argument is passed, the call site has to be explicit about the measurement unit we expect.
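
A minimal sketch of what the call site could look like. The Card class and the Int.dp extension are assumptions added for illustration, not part of the original example:

```kotlin
@JvmInline
value class Dp(val raw: Int)

// Assumed convenience extension so call sites read as `16.dp`.
val Int.dp: Dp get() = Dp(this)

// Hypothetical view class standing in for `card` from the earlier snippet.
class Card {
    private var width: Dp = Dp(0)

    fun setWidth(value: Dp) { width = value }
    fun width(): Dp = width
}

fun demo(): Dp {
    val card = Card()
    val containerSize = 1000 // raw pixels
    // card.setWidth(containerSize) // does not compile: an Int is not a Dp
    card.setWidth(16.dp)            // the unit is spelled out at the call site
    return card.width()
}
```

Nothing stops a caller from writing containerSize.dp, of course, but doing so is now a visible decision rather than a silent mix-up.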

Documentation

I'm not the only one who's occasionally too lazy to check or write documentation, right? But don't blame me — like a wise man once said (Robert C. Martin, in Clean Code):

The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures. We must have them because we cannot always figure out how to express ourselves without them, but their use is not a cause for celebration. So when you find yourself in a position where you need to write a comment, think it through and see whether there isn’t some way to turn the tables and express yourself in code.

Semantic typing doesn't just let you wrap a value in a class — it lets you document the type itself. You convey exactly what kind of data it expects, without repeating that information in every function that uses it. Self-documenting code, done right.

Coming back to our example:

/**
 * @param value The width in density-independent pixels. 
 */
fun setWidth(value: Dp) { ... }

You might find the documentation a little funny — as if you could just pass something else instead of Dp. But besides making that comment almost obsolete, we also eliminate another problem: documentation duplication.

Think about it: if we have setWidth, we probably also have setHeight, setSpacing, and similar functions. Without semantic typing, the same documentation gets copied everywhere — or worse, it's incomplete, missing or outdated entirely somewhere, because somebody was lazy or simply forgot. Then anyone reading the code has to guess the expected input based on other parts of the code they might not even be calling. With a narrowed type, that guesswork disappears — you just reuse the type where it's semantically appropriate.

But there's more. Beyond "wrapping" data, you need to consider identity and semantics. You're not just slapping a random name on it, like the one GitHub suggested for your repo; you're giving it real meaning. A type should stand alone, without requiring extra context, so you don't end up with something like this:

fun setWidth(dp: Int) { ... }
// vs
fun setWidth(dp: Size) { ... }

Here you're trying to express the unit of measurement through parameter naming, and documenting the function instead doesn't do a better job either. Who forces you or me to check the parameter name? What if, after a long day of work, I wrongly assumed that it accepts raw pixels?

Even in the second example, you could still mess up the units, effectively making it another Int, just wrapped and with slightly fewer flaws (we narrowed it from a random Int to a random Size). And the old Int version? Totally generic: it tells us something and nothing at the same time. Semantic typing forces the code to say what it means, loud and clear. That's what makes it powerful and self-documenting.
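
To make the contrast with a generic Size concrete, here is a sketch in which each unit gets its own type and crossing between them requires an explicit conversion. The Px type, the toPx helper, and its rounding rule are assumptions for illustration:

```kotlin
import kotlin.math.roundToInt

@JvmInline
value class Px(val raw: Int)

@JvmInline
value class Dp(val raw: Int)

// Assumed conversion rule: px = dp * density, rounded to the nearest pixel.
fun Dp.toPx(density: Float): Px = Px((raw * density).roundToInt())

// The signature alone says which unit is expected; no parameter naming tricks needed.
// It returns a description only so the sketch has something observable.
fun setWidth(value: Dp): String = "width=${value.raw}dp"

// setWidth(Px(32)) // does not compile: Px and Dp are unrelated types
```

With one type per unit, a tired reader can no longer assume the wrong unit, because the compiler rejects the assumption.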

Validation

Another benefit of introducing semantic typing is validation. We're circling back to self-documentation, but this time the focus is on reuse.

In our previous examples, every function had to validate density-independent pixels individually to fail fast (hopefully we were doing it, right?) and avoid hidden bugs:

fun setWidth(dp: Int) {
    // require throws when the condition is false, so we assert the valid case
    require(dp >= 0) { "dp cannot be negative" }
    // ...
}

fun setHeight(dp: Int) {
    require(dp >= 0) { "dp cannot be negative" }
    // ...
}
// ...

Now we can move that logic into the type itself, making it a true subset type, just as we defined earlier:

@JvmInline
value class Dp(public val raw: Int) {
    init {
        // Validated once, at construction, for every function that accepts Dp
        require(raw >= 0) { "dp cannot be negative" }
    }
}

Notice how this eliminates repeated validation, guarantees correctness everywhere, and clearly documents the intended constraints.

Note
You can, of course, introduce more robust validation to be more type-safe depending on where and how it's used; I'm planning to publish a separate article about it in a few days, so stay tuned and don't panic when seeing simplified validation. 😄

We could have actually merged Documentation and Validation into one section — but I kept them separate deliberately. Why? Because some validation challenges are more subtle, and using a semantic type highlights them perfectly.

Take a real-world example we hinted at in the introduction:

@JvmInline
value class EmailAddress(val raw: String) {...}

Email validation has always been a headache. Different parts of a system may enforce different rules, often without anyone noticing. One function might just check that the string contains @. Another might use a simplified regex copied from Stack Overflow. A third might try to implement the full RFC standard.

The result? Three functions that semantically expect the same input may behave differently: one accepts a string, others reject it. Bugs like this are subtle, hard to catch, and annoying to debug.

By narrowing it to a semantic type, you centralize the constraint:

  • Any EmailAddress instance is guaranteed to be valid according to your rules, and those rules are the same everywhere.
  • Consumers of the type don't need to repeat validation.
  • The compiler enforces that only EmailAddress values, which are validated on construction, flow through your system.

Validation + self-documentation + compile-time safety: this is the power of semantic typing in practice.
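
One possible way to centralize the rule is a private constructor plus a single factory. This is a sketch only: the parseOrNull name, the domainOf consumer, and the deliberately simplified rule (exactly one @ with text on both sides) are assumptions, not a recommendation for real email validation:

```kotlin
@JvmInline
value class EmailAddress private constructor(val raw: String) {
    companion object {
        // Hypothetical, simplified rule: exactly one '@' with non-empty text on both sides.
        private fun isValid(input: String): Boolean {
            val parts = input.split("@")
            return parts.size == 2 && parts[0].isNotEmpty() && parts[1].isNotEmpty()
        }

        /** Returns an EmailAddress, or null when the input breaks the rule above. */
        fun parseOrNull(input: String): EmailAddress? =
            if (isValid(input)) EmailAddress(input) else null
    }
}

// Consumers accept the type and never re-validate.
fun domainOf(email: EmailAddress): String = email.raw.substringAfter('@')
```

Because the constructor is private, the factory is the only way in, so whichever rule you settle on is applied in exactly one place.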

When to use it?

With these examples in mind, we can work out a rule for when to use semantic typing and when it's most likely not worth it.

The rule may vary with your project or application-level code requirements, but not by much. We definitely don't want to over-engineer!

Application code

Let's start with what Kotlin focuses on first — application-level code. Usually, in our projects we apply architecture patterns such as Clean Architecture, Domain-Driven Design, and sometimes even Hexagonal Architecture. All of them share a common layer — the domain — which derives in some form from DDD (check my article if you're not familiar with it: Digging Deep to Find the Right Balance Between DDD, Clean and Hexagonal Architectures).

In the context of domain layers, we typically enforce business rules, constraints, and overall core business logic, isolated from infrastructure or application concerns. Since domains are intended to reflect the language of domain experts, it's usually beneficial to introduce semantic typing, and more specifically, Value Objects.

Let's take an example — a User aggregate:

data class User(
    val id: Int,
    val email: String,
    val name: String,
    val bio: String,
)

To enforce business rules and constraints, you could go down a couple of paths. Let's start with the opposite of semantic typing, just to see what can go wrong:

data class User(
    val id: Int,
    val email: String,
    val name: String,
    val bio: String?,
) {
    init {
        require(id >= 0)
        require(email.matches(emailRegex))
        require(name.length in 2..50)
        require(bio == null || bio.length in 0..500)
    }
}

This is not an invalid approach, but let's reason about what can go wrong or feel suboptimal here.

Duplication is one obvious problem. It's common to see multiple representations of the same entity (in this case, User) with the same properties, which forces you to duplicate validation:

data class PublicUser(
    val id: Int,
    val name: String,
    val bio: String?, // nullable, matching User and the null check below
) {
    init {
        require(id >= 0)
        require(name.length in 2..50)
        require(bio == null || bio.length in 0..500)
    }
}

Here, User contains full information for the owner, while PublicUser is returned to everyone else without the email. Aside from the unease that duplication causes, the rules and constraints tend to change over time, which makes this decentralized approach fragile: it's easy to forget to update one of the copies.

The solution? Introduce Value Objects with semantic typing. Each property becomes a type that encodes its constraints, centralizing validation and making your domain model self-documenting:

@JvmInline
value class UserId(val raw: Int) {
    init { require(raw >= 0) }
}

// emailRegex is assumed to be defined elsewhere, as in the previous snippet
@JvmInline
value class Email(val raw: String) {
    init { require(raw.matches(emailRegex)) }
}

@JvmInline
value class UserName(val raw: String) {
    init { require(raw.length in 2..50) }
}

@JvmInline
value class Bio(val raw: String) {
    init { require(raw.length in 0..500) }
}

data class User(
    val id: UserId,
    val email: Email,
    val name: UserName,
    val bio: Bio?,
)

With this approach, validation is centralized, duplication disappears, and your domain objects become self-explanatory and safer by design. Each property now carries semantic meaning, and any changes to rules need to be made only in the Value Object itself, not scattered across multiple classes.
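
To see the duplication actually disappear, here is a sketch of PublicUser built from the same value objects. The toPublic mapper and the placeholder @ check in Email are assumptions for illustration; substitute whatever email rule your project really uses:

```kotlin
@JvmInline
value class UserId(val raw: Int) {
    init { require(raw >= 0) { "id cannot be negative" } }
}

@JvmInline
value class Email(val raw: String) {
    // Placeholder rule, assumed for this sketch only.
    init { require('@' in raw) { "email must contain @" } }
}

@JvmInline
value class UserName(val raw: String) {
    init { require(raw.length in 2..50) { "name must be 2..50 characters" } }
}

@JvmInline
value class Bio(val raw: String) {
    init { require(raw.length <= 500) { "bio must be at most 500 characters" } }
}

data class User(val id: UserId, val email: Email, val name: UserName, val bio: Bio?)

// No init block needed: every property already carries its own constraints.
data class PublicUser(val id: UserId, val name: UserName, val bio: Bio?)

// Hypothetical mapper; the values were validated when they were first constructed.
fun User.toPublic(): PublicUser = PublicUser(id, name, bio)
```

If the name length limit changes tomorrow, only UserName changes; User and PublicUser pick it up automatically.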

Let's also consider a more complex structure instead of just wrapping primitives. Imagine we have two bounded contexts (features):

  • One is responsible for the shopping cart,
  • The other is responsible for payments.

Both features share the same underlying data — selected products — but the business rules differ. For the shopping cart, it's valid to have an empty cart while the user is still selecting products. For the payment feature, however, it's crucial that the cart is not empty — the user must arrive with at least one selected product:

data class PaymentCart(
    val userId: Int,
    val products: List<Product>
) {
    init {
        require(userId >= 0)
        require(products.isNotEmpty()) // must not be empty for payment
    }
}

Meanwhile, you might have something like this instead:

@JvmInline
value class ProductSelection(val raw: List<Product>) {
    // can be something more robust with factory pattern and type-safe result for outer layer, as it's part of user input
    init { require(raw.isNotEmpty()) } // enforces non-empty list
}

data class PaymentCart(
    val userId: Int,
    val products: ProductSelection
)

Now, the type itself guarantees the constraint: you cannot construct a ProductSelection with an empty list, eliminating the risk of forgetting validation if we have, for example, more than one aggregate, like PaymentCart and Bill.
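
The code comment above mentions a factory with a type-safe result for the outer layer. One way that could look is sketched below; the ProductSelectionResult type, the create function, the describe helper, and the placeholder Product class are all assumptions for illustration:

```kotlin
// Placeholder product type, assumed for this sketch.
data class Product(val name: String)

// Hypothetical result type: the caller has to handle the empty case explicitly.
sealed interface ProductSelectionResult {
    data class Success(val selection: ProductSelection) : ProductSelectionResult
    object Empty : ProductSelectionResult
}

@JvmInline
value class ProductSelection private constructor(val raw: List<Product>) {
    companion object {
        // Returns a result instead of throwing, since the list comes from user input.
        fun create(products: List<Product>): ProductSelectionResult =
            if (products.isEmpty()) ProductSelectionResult.Empty
            else ProductSelectionResult.Success(ProductSelection(products.toList()))
    }
}

// Example consumer in the outer layer: both branches must be covered.
fun describe(result: ProductSelectionResult): String = when (result) {
    is ProductSelectionResult.Success -> "${result.selection.raw.size} product(s) selected"
    ProductSelectionResult.Empty -> "cart is empty"
}
```

Compared with require in an init block, this trades a little ceremony for a failure mode that shows up in the type signature rather than as a runtime exception.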

We can also give ProductSelection additional behavior. For instance, if certain campaigns restrict delivery costs depending on the products purchased:

@JvmInline
value class ProductSelection(val raw: List<Product>) {
    init { require(raw.isNotEmpty()) }

    val specialProducts: List<Product>
        get() = raw.filter { /* some filtering logic */ }
}

data class PaymentCart(
    val userId: Int,
    val products: ProductSelection
) {
    val deliveryCosts: Money
        get() = if (products.specialProducts.size >= 3) Money.ZERO else /* normal cost */
}

While you could technically implement this logic inside the aggregate itself, localizing responsibilities in the Value Object (semantic type) simplifies the code, makes the API more explicit, and keeps the aggregate focused on core domain logic. And why should it apply only to code in the domain layer?

Library code

Even though library code is often an additional component of any application — meaning it doesn't always follow the strict rules, conventions, or approaches typical of application code (aside from generic best practices) — I would still strongly recommend using the same approach to narrow your types almost everywhere.

Application code is usually written with the goal of minimizing upfront cost. This means that while it can be worthwhile to apply strict modeling in specific parts (like the domain layer) to reduce future testing and maintenance overhead, maintaining the same strict model everywhere often seems unnecessary or excessive. However, in my opinion, libraries don't fall under this "cost-saving" rationale.

Consider the following example:

@Serializable
sealed interface JsonRpcResponse<out T> {
    val jsonrpc: JsonRpcVersion
    val id: JsonRpcRequestId

    data class Success<T>(
        override val jsonrpc: JsonRpcVersion,
        override val id: JsonRpcRequestId,
        val result: T,
    ) : JsonRpcResponse<T>

    // An error carries no result, so it can stand in for any JsonRpcResponse<T>
    data class Error(
        override val jsonrpc: JsonRpcVersion,
        override val id: JsonRpcRequestId,
        val error: JsonRpcResponseError,
    ) : JsonRpcResponse<Nothing>
}

@Serializable
@JvmInline
value class JsonRpcVersion(val string: String) {
    companion object {
        val V2_0 = JsonRpcVersion("2.0")
    }
}

@Serializable
@JvmInline
value class JsonRpcRequestId(val jsonElement: JsonElement) {
    init {
        require(jsonElement is JsonNull || jsonElement is JsonPrimitive) {
            "JSON-RPC ID must be a primitive or null"
        }
        