A question that simplifies state management

This content originally appeared on DEV Community and was authored by Ayron Wohletz

Is the state essential or derived? Making the distinction helps simplify programs by eliminating unnecessary mutable state.

Essential data sits at the root of the software system. It is the information inherent to the problem domain. For example, in a note-taking app, the text of the note is essential data. No matter the surrounding implementation, we're going to have to store the text that the user has entered.

Derived data gets created based on essential data. Based on the note text, the note-taking app can create a full-text search index. It can display the text in the UI. It can give the user statistics on average sentence length, and so on.

When creating/storing derived state, you can make it mutable or immutable.

Immutable derived state gets re-computed on the fly (i.e. whenever essential state changes upstream). It doesn't live in any persistent location that you have to keep updated. For example, if you give input data to a chain of pure functions, you always get the correct derived state as output, because the functions do not store anything. Or you might use a reactive library/framework that takes care of propagating changes to derived state. This is great for simplicity: when the essential state changes, you don't have to track down every derived place that needs updating.
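
To make the "chain of pure functions" idea concrete, here is a minimal sketch in plain JavaScript. It derives the average sentence length from the note text. The function names and the sentence-splitting rule are assumptions made for illustration, not part of any particular app.

```javascript
// Essential state: the raw note text. Everything below is derived from it.
// All function names here are hypothetical, chosen for illustration.

// Split text into sentences on ., ! or ? followed by whitespace or end of text.
const splitSentences = (text) =>
    text
        .split(/[.!?]+(?:\s+|$)/)
        .map((s) => s.trim())
        .filter((s) => s.length > 0);

// Count whitespace-separated words in one sentence.
const countWords = (sentence) =>
    sentence.split(/\s+/).filter((w) => w.length > 0).length;

// Average words per sentence; 0 for empty text.
const averageSentenceLength = (text) => {
    const sentences = splitSentences(text);
    if (sentences.length === 0) return 0;
    const totalWords = sentences.reduce((sum, s) => sum + countWords(s), 0);
    return totalWords / sentences.length;
};

const noteText = "Buy milk. Call the bank about the card.";
const stats = { averageSentenceLength: averageSentenceLength(noteText) };
```

Nothing here is stored between calls, so there is no derived copy that could fall out of date: calling averageSentenceLength again with new text simply yields the new answer.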

Frameworks like React demonstrate this approach. In the old days, frontend devs had mutable state everywhere in the DOM and would manually update it with jQuery. It was easy to miss a spot. With React, you change the essential state (e.g. the component props), your component transforms that into VDOM, and then React transforms the VDOM into real DOM mutations. So you can trust that when state changes, the DOM will accurately reflect it. In database terms, the DOM is a "materialized view" of the app state.

The potential downside of immutable derived state is performance. If it's expensive to derive on the fly, then you can consider making it mutable derived state -- like a "materialized view" of the essential state. To be precise, the derived state itself may be an immutable value, but that value is stored in a mutable location. That way you can access it quickly without re-deriving it, as you would with a caching layer. That, however, introduces another problem: data synchronization/replication.

That problem happens whenever you have data in one place and need to keep a (derived) copy of it in another place up to date. That's a hard problem. Databases have long had solutions, e.g. primary-to-secondary replication or materialized views. The frontend ecosystem encountered this problem too: we need to keep our app state in memory and keep a derived transformation of it in the DOM up to date. The ecosystem addressed this with reactive frameworks like Elm, ClojureScript re-frame, React, et al. The problem pops up again if you have a caching layer in your architecture (e.g. AWS ElastiCache or HTTP caching) -- how do you keep the cache fresh?
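
One common way to keep a small in-process cache fresh is to key the cached value on the essential state it was derived from, so a changed input invalidates it automatically. The sketch below is only an illustration under that assumption. memoizeLast and buildWordIndex are hypothetical names, and the word index is a stand-in for any expensive derivation.

```javascript
// Hypothetical helper: cache one derived value, keyed on the input it came from.
const memoizeLast = (derive) => {
    let hasCache = false;
    let lastInput;
    let lastOutput;
    return (input) => {
        // Re-derive only when the essential input has changed.
        if (!hasCache || !Object.is(input, lastInput)) {
            lastOutput = derive(input);
            lastInput = input;
            hasCache = true;
        }
        return lastOutput;
    };
};

// Stand-in for an expensive derivation; counts calls so reuse is visible.
let deriveCalls = 0;
const buildWordIndex = (text) => {
    deriveCalls += 1;
    const index = new Map();
    text.toLowerCase().split(/\W+/).filter(Boolean).forEach((word, position) => {
        if (!index.has(word)) index.set(word, []);
        index.get(word).push(position);
    });
    return index;
};

const getWordIndex = memoizeLast(buildWordIndex);

getWordIndex("milk and more milk"); // derives
getWordIndex("milk and more milk"); // same input: served from the cache
getWordIndex("milk and eggs");      // input changed: derived again
```

The cached index still lives in a mutable location, but callers never update it by hand. This only works when the cache can see the essential state directly; a remote cache usually cannot, which is why the general problem stays hard.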

On a small scale, every variable that contains derived state creates its own little data sync problem: you now have to update that variable every time the essential state changes.

Here's a contrived example, just to illustrate the point. A React component could do this:

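import { useEffect, useState } from "react";

const UsernameInput = () => {
    // Essential state: what the user typed.
    const [firstName, setFirstName] = useState("");
    const [lastName, setLastName] = useState("");
    // Derived state kept in its own mutable slot.
    const [fullName, setFullName] = useState("");

    // Manual sync: re-runs whenever either input changes.
    useEffect(() => {
        setFullName(firstName + " " + lastName);
    }, [firstName, lastName]);

    return <form>
        ...form inputs...
        Your name is {fullName}
    </form>
}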

But here fullName is mutable derived state. Unless there's a reason it needs to be its own piece of state, it's simpler to make it immutable derived state:

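import { useState } from "react";

const UsernameInput = () => {
    // Essential state: what the user typed.
    const [firstName, setFirstName] = useState("");
    const [lastName, setLastName] = useState("");
    // Derived on every render, so there is nothing to keep in sync.
    const fullName = firstName + " " + lastName;

    return <form>
        ...form inputs...
        Your name is {fullName}
    </form>
}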

Adding these instances up over a whole codebase could mean lots of lines of unnecessary code.

What is the "real" essential data?

This is an almost philosophical question. Software cannot "know" anything beyond what it "perceives" through input devices like mouse, keyboard, network connection, file system, etc. So I would say the closest software can get to the essence of things is storing the raw perceptions. For example, let's say a note-taking app stores the notes in a SQLite database. If the app instead stored an immutable log of all the user input events (mouse and keyboard), it could derive the contents of the database by scanning through that log from the beginning. Thus, I could say that mutable databases typically don't contain purely essential data. It's just that, for pragmatic reasons, we don't typically design systems that store raw perceptions.
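
As a rough sketch of the log idea, the snippet below replays a list of events to derive the current note text. The event shapes ("insert" and "delete" with positions) are a simplifying assumption. Real mouse and keyboard events would be much lower level, and applyEvent and replay are hypothetical names.

```javascript
// Hypothetical event shapes; a real app would record much richer input events.
const eventLog = Object.freeze([
    { type: "insert", position: 0, text: "Buy mlk" },
    { type: "delete", position: 4, length: 3 },
    { type: "insert", position: 4, text: "milk" },
]);

// Pure reducer: given the text so far and one event, return the next text.
const applyEvent = (text, event) => {
    switch (event.type) {
        case "insert":
            return text.slice(0, event.position) + event.text + text.slice(event.position);
        case "delete":
            return text.slice(0, event.position) + text.slice(event.position + event.length);
        default:
            return text; // ignore unknown events
    }
};

// Replaying the whole log from the start derives the current note text.
const replay = (log) => log.reduce(applyEvent, "");

const replayedText = replay(eventLog);
```

Here the log is the essential data and the note text is derived from it. Replaying from the beginning gets slower as the log grows, which is one of the pragmatic reasons to keep a mutable snapshot such as the database instead.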

Levels of abstraction

What is considered essential and derived varies with levels of abstraction. The system as a whole has essential data. And individual components/modules have their own definition of essential. That definition is based on what that component can perceive. For example, a single React component cannot perceive anything outside of the props it receives (if it's a pure component). So the props are its essential data.
