The Source of Truth - thecultofian.com

I have a weird background. Largely because I’m old, and these things weren’t as important to working in technology when I started as they are now, I don’t have formal computer science training or education. Instead, I have formal training in electronics and communication engineering, formal education in philosophy and the history of the natural sciences and mathematics, and several decades of experience making computers do things and talk to one another. I started as a fixer who had to understand why the complex, inscrutable thing failed before being able to restore it, and later became the designer of the complex, inscrutable thing. I occasionally like to say that it gives me “an outsider’s perspective with an insider’s knowledge”.

One of the side-effects of this weird background is that there are commonly accepted phrases, idioms, or jargon that I have visceral reactions to because they are very much “in the bubble” and people in the bubble don’t know how wrong they are, either as a matter of etymology or as a matter of semantics. “The source of truth” is one of them.

The quote below is from Kin Lane, and this isn’t a diss track on him or even on this blog post. I was lucky enough to work with Kin for a short time, and even though office politics and stupid manager tricks prevented that from being as fruitful as it could have been, I really liked talking with him and thought we agreed a lot more than we disagreed. What is great about this quote is that it clearly defines this phrase in a way that helps me understand why I hate it.

I use the phrase “source of truth” a lot to describe the authoritative, centralized, up-to-date artifact and discussion surrounding APIs.
Kin Lane, API Evangelist
The API Source of Truth As Well As The Echoes and Overlays of That Truth

Authoritative. Centralized. Up-to-date. These are not qualities of truth, they are qualities of facts. A fact can be authoritative, because it derives from an authoritative act of creation or an authoritative observation. Collections of facts can be stored in central locations, because facts are immutable and therefore portable. Facts exist in time, and consequently there is recency to the information provided by a fact: “the jar is blue” is a fact, and that fact will remain a fact even as time moves forward and something happens to make “the jar is green” a fact. We can time-order facts to find the most recent ones and that lets us understand change a little better.
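One way to picture immutable, time-ordered facts is a small append-only log. This is only a sketch under my own assumptions: the names `Fact`, `FactLog`, `history`, and `as_of` are made up for illustration, and the dates are arbitrary. The point it shows is that recording “the jar is green” does not erase “the jar is blue”; both stay in the log, and ordering them by time tells us what changed and when.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: once recorded, a fact cannot be altered
class Fact:
    subject: str
    attribute: str
    value: str
    observed_at: datetime


class FactLog:
    """Hypothetical append-only store: facts are added, never updated or deleted."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def record(self, fact: Fact) -> None:
        self._facts.append(fact)

    def history(self, subject: str, attribute: str) -> list[Fact]:
        # Every fact about this subject/attribute, oldest first.
        matching = [
            f for f in self._facts
            if f.subject == subject and f.attribute == attribute
        ]
        return sorted(matching, key=lambda f: f.observed_at)

    def as_of(self, subject: str, attribute: str, when: datetime) -> Optional[Fact]:
        # The most recent fact observed at or before `when`, if any.
        earlier = [f for f in self.history(subject, attribute) if f.observed_at <= when]
        return earlier[-1] if earlier else None


# Arbitrary example dates, chosen only for illustration.
log = FactLog()
log.record(Fact("jar", "color", "blue", datetime(2024, 1, 1, tzinfo=timezone.utc)))
log.record(Fact("jar", "color", "green", datetime(2024, 6, 1, tzinfo=timezone.utc)))

in_march = log.as_of("jar", "color", datetime(2024, 3, 1, tzinfo=timezone.utc))
in_july = log.as_of("jar", "color", datetime(2024, 7, 1, tzinfo=timezone.utc))
print(in_march.value, in_july.value)
```

Asking the log about March and about July gives two different answers, and neither one invalidates the other. That is recency, not truth.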

For more about facts, see Rich Hickey’s fantastic talks Deconstructing the Database and The Database as a Value.

But facts are not, in a philosophical sense, Truth. What is True is, what is Untrue is not. And when we use truth and fact synonymously we lose the very precision that folks who care about things enough to want a “source of truth” are seeking in the first place.

A git repository can be your source of facts, but neither absolute nor relative truth is going to be found there because they are the result of a collective reality that emerges from many points of view. Program listing text is not True or Untrue, because text files that list program instructions are abstractions, and abstractions are lies. Useful, productive, valuable lies, but they are lies.

We do not tell a computer what to do with a programming language, we tell a compiler or interpreter what to do, then it does things, and produces some less abstract set of instructions, and that set of instructions may tell a computer what to do, or it may simply tell a different compiler or interpreter what to do, and so on until finally some set of instructions in the instruction set of the CPU is produced, and they are finally converted into binary state in the billions of transistors that collectively make up the processor.
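You can peel back one layer of this yourself. The sketch below assumes CPython, whose standard library ships the `dis` module; the function `add` is a throwaway example of mine. It shows that the text of a Python function has already been turned into bytecode for the interpreter's virtual machine, which is still not the instruction set of the CPU. The exact instruction names differ between Python versions, so I am not showing expected output.

```python
import dis


def add(a, b):
    return a + b


# The source text above was compiled into bytecode when the function was defined.
# These are instructions for the CPython virtual machine, not for the CPU.
instructions = list(dis.get_instructions(add))
opnames = [ins.opname for ins in instructions]

# The same instructions as the raw bytes the interpreter actually reads.
raw_bytes = add.__code__.co_code

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
print(len(raw_bytes), "bytes of bytecode")
```

Each printed line is a fact of the form “the code says this, and the compiler produced that”. Going further down, to what the interpreter's own machine code does with each instruction, takes different tools, but it is the same exercise.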

Turtles all the way down.

This Baumkuchen of lies is what makes finding Truth in a computer or network or any complex system so very difficult and why finding facts is much more useful. It is a fact that “the code says this” and that “when it says that, the compiler produces this instruction”. Keep doing this for a few iterations and you find yourself doing the same things that are done when tracing a program, and if you pause to look around, you’ll see that there are tools, techniques, standards, and research to make this operational task more efficient. An event is a fact with a timestamp, and events are the bedrock foundation of the entire practice of Observability, which seeks to orient people to the landscape of cyberspace so they can begin to form a clear view of reality, despite the fact that much of that reality exists solely as digital data.

Any collection of people, from a small team to a global enterprise, is a complex, inscrutable thing because it is made up of people and a person is a complex, inscrutable thing. This is a fact. A person is complex because it is, by best estimate, made up of 10²⁷ atoms and 5.32 x 10¹² cells per 10 kilograms of mass, both of which are astoundingly large numbers that we cannot comprehend, and yet they interact to make meat instead of rock or goo or whatever. And not just any meat, but meat that thinks and talks and conquers planets. And a person is inscrutable precisely because they are not singular. Why we are thinking, talking, conquering meat is mysterious, but out of that mystery emerges a species that behaves, and that behavior is, as far as we have learned, erratic and unpredictable, implying a kind of wild, rule-less nature, or at the very least, if there are rules, those rules are much less abstract than what we can see.

In some very real sense, the desire for a “source of truth” in anything related to the World Wide Web platform is tied back to how the WWW is a place-oriented system: “go to this place and GET what is there” is the fundamental command of the Web, and “what is there” is under no obligation to stay the same. This is foundational. And that foundation is bad for building remote procedure calls because what is found at https://my.example.com/thing-1.html is under no obligation to be immutable. In fact, if thing-1.html is dynamically created by the server, then if you send the GET request 1000 times, you could get back 1000 unique answers. And we don’t know what is the True thing-1.html or even if there is such a thing as “the True thing-1.html”. We do know, however, the fact that the resource https://my.example.com/thing-1.html is mutable.
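The “same place, different answer” behavior is easy to reproduce on your own machine. This is a minimal sketch, not a claim about any real server: it starts a throwaway HTTP server on the loopback interface using Python's standard library, and the handler name `DynamicHandler`, the helper `fetch_many`, and the response text are all my own inventions. The server builds the body of thing-1.html fresh for every GET, so repeated requests to one URL come back different every time.

```python
import itertools
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical dynamic resource: each GET is rendered with a new counter value.
render_count = itertools.count(1)


class DynamicHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = f"thing-1, rendering #{next(render_count)}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging to keep the example quiet


def fetch_many(n):
    """GET the same URL n times from a local throwaway server; return the bodies."""
    server = HTTPServer(("127.0.0.1", 0), DynamicHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Ignore any proxy settings in the environment; this never leaves the machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        url = f"http://127.0.0.1:{server.server_address[1]}/thing-1.html"
        bodies = []
        for _ in range(n):
            with opener.open(url) as response:
                bodies.append(response.read().decode())
        return bodies
    finally:
        server.shutdown()
        server.server_close()


responses = fetch_many(5)
print(len(set(responses)), "distinct answers from one URL")
```

None of the five bodies is “the True thing-1.html”. What we have are five facts, each of the form “at this moment, a GET to this place returned that”.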

So much of building programs, protocols, platforms, and systems ends up being conflicts between how complex, inscrutable people orient themselves to the landscape so they can form a clear view of reality, despite the fact that much of that reality exists solely as the emergent behavior of complex, inscrutable things. The tools, techniques, standards, and discovery that we make to help us understand are either going to reinforce our existing reality or they are going to define reality in new ways, but they can’t do both. When I read the end of Kin’s blog post, I think that the code, or the OpenAPI spec or the Postman collection being independent is the actual technical problem. We have tools that reinforce our existing realities. And those realities are each lying to us in different ways, keeping us in conflict by giving us different abstractions that we want to be of the same thing but because that thing is complex and inscrutable, we can never actually be sure that they are.

Authoritative, centralized, up-to-date facts can help us agree on what we think we’re abstracting, but finding common abstractions across incompatible realities is about tossing out the tools that maintain and reinforce our existing, incompatible realities so that we can share the same set of useful, productive, valuable lies.