Overview
- Status: Delivered 2024-06-11 for GOTO Amsterdam 2024
- Home
- Slides
- Code: NA
- Video
- Audio: NA
- Transcript
Transcript
Thank you. Thank you. Yes, I did mention that there will be 75 things covered, but it’s, like, a list. There’ll be a list. So, we don’t have to memorize everything right away. But it’s good to see you here today. Thank you very much. It’s great to be at the event as well, and talking about this notion of RESTful web API patterns and practices. That’s a nice, long title, with enough, sort of, like, trigger words that should interest somebody along the way. So, this is me. This is how you can find me on LinkedIn and GitHub, and YouTube, and Mastodon, and whatever other social media you dare take part in. I would love to hear from you. Almost everything that I will tell you today I’ve learned from someone else, and I would love to learn from you. I would love to start to include your experience that I can share with others. We did a workshop yesterday on this. We spent a day on this, and I got a chance to learn from that class as well, so please connect with me, tell me what you’re working on, and I’d be happy to continue.
So, the material that I’ll be talking about today is primarily from this book, which was released two years ago. Two years ago? Two years ago, by O’Reilly, and it’s literally a collection of these patterns, these recipes, they like to call them at O’Reilly, that I’ve used over and over again, that I’ve learned from other people on how to construct applications. And pattern thinking, I think, is very, very important. It’s a very powerful way to go about your work, because pattern thinking gives you an opportunity to be a bit more generalized, a bit more abstract, and actually to see the connections in other things. And seeing the connections, making connections between software that maybe you haven’t even seen before, is really kind of what the web is really all about, right?
So, this is an opportunity to sort of apply that same ethos, or that same viewpoint, to your everyday work. So, I thought what I’d do, I have a series of these patterns, grouped into collections, design, clients, services, data, and workflow. We’ll talk about those, and highlight, like, at least one from each, but I also wanted to talk about pattern thinking in general, to start.
So, what is pattern thinking? Why do I think it’s important? Why do I think it comes in really, really handy? So, I think this is a pretty good definition: A framework for understanding, designing, and constructing systems. Many of us participate in some kind of design or construction or maintenance. Sometimes all three. Sometimes we know all the other people we’re working with. Sometimes we don’t. But often, what we’re really doing is we’re repeating patterns, repeating steps over and over again, and recognizing those patterns ahead of time often is what we use to apply our skills to a new problem.
If you work as a, sort of, an expert in a field, or a consultant type, or we would say in America, the "hired gun" type of developer or programmer, very often what you’re doing is you’re thinking in patterns. You, to quote the "Jurassic Park" line, "Oh, this is Linux. I’ve seen this before." Right? I’ve seen this before. I know what this pattern is. I know what your problem is. I have a pretty good sense of what it is you’re doing. So, thinking in patterns helps us do this.
Another way to think about pattern thinking is to think of it as inductive reasoning. Now, we hear a lot about deductive reasoning, but inductive reasoning is a bit different. So, in inductive reasoning, we observe. We observe the world, and we observe details about the world, and we create some kind of generalization. Okay, and some kind of general idea. You know what? These are all similar in a certain kind of way. And then we take that generalization and we try to convert it into a paradigm, into a way of thinking. You know, I build lots of applications for the web, and there’s lots of lists, and people wanna see the first page of lists and the next page of list, and the previous page of list, the last pages of list.
We know what the list navigation pattern is, right? So, no matter what kind of data we’re shipping around, we can start to have this paradigm about list navigation, or paging, right? And we can apply that idea to lots and lots of cases. So, inductive reasoning is observation, generalization, and then creating a paradigm, and that’s really what these patterns are. These are generalizations from existing application systems.
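A minimal sketch of that list navigation paradigm, assuming a page-number query string. The function name, the `page` and `size` parameters, and the `/customers` URL are hypothetical and only there for illustration.

```python
def page_links(base_url: str, page: int, page_size: int, total_items: int) -> dict:
    """Return first/prev/next/last links for one page of any list."""
    last_page = max(1, -(-total_items // page_size))  # ceiling division
    page = min(max(page, 1), last_page)               # clamp into range

    def url(n: int) -> str:
        return f"{base_url}?page={n}&size={page_size}"

    links = {"first": url(1), "last": url(last_page)}
    if page > 1:
        links["prev"] = url(page - 1)
    if page < last_page:
        links["next"] = url(page + 1)
    return links


# The same navigation applies to customers, invoices, or any other list.
print(page_links("/customers", page=2, page_size=10, total_items=35))
```

Nothing in the function knows what kind of data is in the list, which is the point of treating paging as a paradigm rather than a per-resource feature.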
That’s very different than deductive reasoning, where, in deductive reasoning, we start with a theory first. We say, you know, "I think it would probably happen like this." We go ahead and make predictions about what that theory would produce, and then we look in the real world, we do experiments, and we look in the real world, and we see if those predictions match. If those predictions have a high level of match, then we think that our theory is pretty valuable, right? So, we’re really just kinda reversing the process. Observation, generalization, paradigm.
Now, this notion of pattern thinking, especially applied to software, comes from this gentleman, Christopher Alexander. So, Christopher Alexander is a physical architect, was active in the '70s and '80s, and before that, and he wrote a series of books about patterns in architecture. His observation was that every house has a kitchen, every house has a doorway, every house has bedrooms, every house has living spaces. However, they don’t all look the same, right? They don’t look the same in the equatorial region as they do in the northern regions. They don’t look the same when it’s a climate that’s very cold versus a climate that’s very hot, or a climate that’s rainy. They all operate slightly differently. And they’re also based on materials. Some dwellings might be made more out of wood, others out of stone, so on and so forth. So, creating those generalizations, he comes up with this notion of patterns, and that everyone can use patterns to plan, use patterns to kind of observe and predict what’s going on in an environment.
And one of the key elements that he talks about in his pattern-collection thinking is that everyone can participate in architecture. You don’t need a degree in architecture to talk about planning your city, because you’re a citizen of that city, and you know the paradigms. You know the generalizations. You know the specifics. So, one of the other things underlying this idea of pattern thinking is that we can all participate. You don’t have to be just the sort of elite architecture team in a particular organization. Everyone throughout the organization can participate in these patterns, in these generalizations. So, he also establishes this notion of describing a problem which occurs over and over again, and then the core solution of that problem, but describing the solution in a way that you can use it a million times without ever doing it the same way twice.
So, that’s a challenge for those of us who work as programmers or developers, because we sort of like coming up with the way to solve a particular problem. Often, what we do is, as developers, we actually construct solutions to problems. It’s much more challenging, and actually, I think much more advantageous, to construct problems, to think about, "Oh, what are the common problems that everyone needs to face? Let’s list those, and give everyone an opportunity to contribute to how to solve those." And also, when we think about the way the web works, when I can put up a webpage without getting permissions from anyone, where I can publish anything I want, where I can connect to other sources, that’s sort of the same thing, right? We’re empowering people to make their own decisions and do things. So, pattern thinking is a way that helps us do this exact same idea.
Okay. So, from a web point of view, we’ve talked about pattern thinking in general, we’ll talk about pattern thinking as architecture. From a web point of view, what does pattern thinking help us do? And in a web-centric way, there are really three pillars that we all have to deal with in all… This is the generalization that we all have. So, the first one is messages. So, we’re sending messages back and forth. Messages represent the way we share information. So, I might share a JSON message, I might share an HTML message, an XML message, a CSV message. I can actually share the same information in various message formats, right? I can send sales figures as an HTML table. I can send sales figures as a CSV to upload into a spreadsheet. I can send sales figures as a pie chart, as a diagram, right? So, the representation or the message that we ship back and forth is a key element in all of this.
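To make the "same information, different message" point concrete, here is a small sketch that renders one set of sales figures as JSON, CSV, and an HTML table. The sample data and function names are invented for illustration; only the Python standard library is used.

```python
import csv
import io
import json
from html import escape

# Hypothetical sales figures; the values are made up.
SALES = [
    {"region": "North", "total": 1200},
    {"region": "South", "total": 950},
]


def as_json(rows):
    return json.dumps(rows)


def as_csv(rows):
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["region", "total"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def as_html(rows):
    body = "".join(
        f"<tr><td>{escape(str(r['region']))}</td><td>{r['total']}</td></tr>"
        for r in rows
    )
    return f"<table><tr><th>region</th><th>total</th></tr>{body}</table>"


# One piece of information, three representations.
RENDERERS = {"application/json": as_json, "text/csv": as_csv, "text/html": as_html}


def represent(rows, media_type):
    return RENDERERS[media_type](rows)


print(represent(SALES, "text/csv"))
```

The information (regions and totals) never changes; only the structure wrapped around it does.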
The actions are actually the reason we’re sending a message. I wanna share, I wanna approve, I wanna unload, I wanna submit, I wanna copy, I wanna delete. These are all actions. In fact, I think it really helps to think of web activity as pretty much based on actions. We often think of it as based on data. Someone once told me, "Data is just the leftover evidence that somebody did something." So, when you think of data as sort of the evidence, rather than the primary element, all of a sudden, actions come to the fore. This is why event-driven architecture is so powerful, because it gets us to think about actions first, and what actions have happened, and what actions might be sort of a domino-type effect.
So, actions are the reason we share information. Vocabularies are the information we share. That is actually the meaning of the message. So, when we share an HTML message, HTML messages have a very common architecture. But the information that an HTML message sends isn’t <p> and <a> and <div> and <form>. That’s just the structure, right? That’s the structure of the message. The meaning of the message is what’s inside the <a> and what’s inside the <div> and what’s inside the table. So, thinking about vocabularies as the actual information, or the semantic meaning of what we’re sending, is also a very powerful paradigm. So, really, we have this notion of all three of these being very, very important. I need to send messages that have some kind of structure, in order to accomplish actions, and those messages need to use a similar vocabulary that we know ahead of time.
How many people have worked in some kind of domain-driven design or something like that, right? So, that’s really all about getting everyone to talk in the same way, right? Having a common, ubiquitous language that we all speak in. That’s the vocabulary, right? The classes and things like that, that’s the structure of what we’re doing, but the actual vocabulary, the words, the meanings, that’s really why we’re building whatever we’re building today.
Okay. So, having talked about thinking in patterns, and what that can possibly give us, let’s talk a little bit about the RESTful patterns that I collected up in the book. And as I mentioned, there are about 75 of them. I didn’t get as many in as I had hoped. We kind of ran out of time there, actually, I will tell you now. There are actually another 20 or so that didn’t make it. Maybe next time, maybe in a second edition we’ll get to talk about those. But we’ll talk about them each in turn.
So, in design. We’re gonna talk about design patterns. Design patterns are often just ways to start thinking ahead of time about where we’re gonna end up or how we’re going to get from A to B. So, I love this quote from J.C.R. Licklider. Has anybody ever heard of J.C.R. Licklider before? Oh, this’ll be good. So, I will read this quote, because I think it’s very powerful. This is a quote from a memo that he produced, a government memo in the United States. "The problem is essentially one discussed by science fiction writers. How do you get communication started among totally uncorrelated sapient beings?" Does anybody know what a sapient being is? Anybody have an idea? Yes. That’s right, a thinking being.
Now, in 1966, J.C.R. Licklider is working for the U.S. government in the computer world, and there’s something else going on in the '60s in America. There’s a race between the Russians and the Americans. What is that race? The space race, right? So, what is J.C.R. Licklider worried about in 1966? He’s worried about aliens. Exactly right. If we ever did actually meet up with aliens, how would we communicate with them? How would we start to talk to each other? If our computer met an alien computer, how would those computers talk to each other? It turns out J.C.R. Licklider is sort of the grandfather of a lot of the things that we take advantage of today. He actually helped fund the original ARPANET, through government contracts with the Defense Department, and so on and so forth. If you’d like to learn more about Licklider, there’s a great book called "The Dream Machine," which I think covers his life very, very well.
But really, what we’re really trying to do when we design computer systems, when we design systems on the web, is we’re trying to start communications among uncorrelated sapient beings, people I have never met. I’m trying to communicate with you in some kind of way. And we have some languages and shared actions that make that possible. So, another way to think about designing is that we wanna make systems where machines built by different people who have never met can successfully interact with each other. That’s not hard. Is it? We’re actually going to use things built by somebody else who we’ve never met, possibly for reasons they have never thought of. That’s building the web. That’s actually what we do when we build the web. We don’t control people. We give people opportunities, and that’s really what our design patterns let us do.
So, thinking ahead about how we can start to design systems that have that feature about them, that allow people to communicate with each other, even if we are not correlated. There are 11 patterns in the book that I talk about, and the first group of them have a lot to do with things like interoperability, compatibility, and vocabularies and semantics. These are the things that really give us a baseline for getting started. We wanna be able to talk to each other. That means we need to have interoperability between message formats, right? That’s what JSON and XML and CSV give us, right? They give us this opportunity. Oh, I know JSON. I recognize this.
We also want future compatibility. So, we don’t need just curly braces. We need some promises about here’s the data, here are links, here’s a version, here’s some additional information. Here’s some metadata about the data. If you think about HTTP as a protocol, it’s really a whole bunch of metadata, name-value pairs, and then a body of data. So, it’s metadata and data, metadata and data, metadata and data. In fact, the more involved your system becomes, the more metadata you have. Your metadata actually increases faster than your data.
Sharing domain specifics, we’re gonna talk about accounting, we’re gonna talk about insurance, we’re gonna talk about PSD2. We’re gonna talk about open banking. We’re gonna talk about BIAN banking. We’re gonna talk about the ACORD insurance vocabulary. We’re gonna talk about HL7 health. These are all ways to share semantics together, and to actually then express them as problems. What are all the challenges that we have in accounting, in banking, in healthcare? And they’re endless, right? So, when we build something, we’re actually describing problems inside some domain.
So, then, once we’re ready to start describing those problems in a forward-compatible way, how do we actually enable them? How do we focus on things like the actions, not just the messages, but also the actions? A very common thing that was built into the way the web was built 30 years ago was this notion of hypermedia, or of links and forms. I can link from one thing to another. One of the most powerful elements of the web as it’s conceived is I can create a link to your page without asking you. I don’t have to have a meeting. We don’t have to change formats or anything. I just have a link, and I connect. That’s pretty powerful. That’s pretty amazing. So, expressing not just links, but also domain actions, like reading and writing and submitting and approving, is also a very powerful pattern.
Designing consistent writes. If you work in a computer system where everybody’s spread out, and I wanna send a message like, "I wanna debit $100 from your account," and I send that message over the web, here’s account one, account two, $100, and I never hear back, what has happened? I don’t know. Did it work, and I just never got the return message? Did it never get there? Idempotent messaging lets us actually repeat the same thing over and over again. Idempotent means "same strength," from Latin. That means PUT is a much more powerful tool for writing data on the web than POST. We could talk a little bit more about that if we need to.
The last set of design patterns I talk about is repeatability, reversibility, extensibility, and modifiability. All of these things become really important. Repeatable is the thing I was just talking about a second ago for the bank, right? If I know I’m using this idempotent, or same action, same strength action, I can repeat, and see if I can get a different response. I can safely repeat, and know I’m not creating three, four, five, six, seven deductions from your account.
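The difference between repeating a POST and repeating a PUT can be sketched with a toy in-memory store. This is only an illustration of the idea, not a real HTTP service; the class, the method names, and the `debit-2024-0001` identifier are hypothetical.

```python
import itertools


class DebitStore:
    """In-memory stand-in for a service that records debits."""

    def __init__(self):
        self.debits = {}
        self._ids = itertools.count(1)

    def post(self, debit: dict) -> str:
        """POST-style create: the server picks a new id on every call."""
        debit_id = f"server-{next(self._ids)}"
        self.debits[debit_id] = debit
        return debit_id

    def put(self, debit_id: str, debit: dict) -> str:
        """PUT-style create: the client picks the id, so a repeat overwrites."""
        self.debits[debit_id] = debit
        return debit_id


debit = {"from": "account-1", "to": "account-2", "amount": 100}

retried_post = DebitStore()
for _ in range(3):  # no reply arrived, so the client retries
    retried_post.post(debit)

retried_put = DebitStore()
for _ in range(3):
    retried_put.put("debit-2024-0001", debit)

print(len(retried_post.debits), len(retried_put.debits))
```

Three retried POSTs leave three debits behind, while three retried PUTs to the same identifier leave exactly one, which is what makes the repeat safe.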
Reversibility is very important. If I wanna write something and I change my mind, I should be able to reverse that. We’ve built that into accounting systems for thousands and thousands of years. We can do that in Kafka systems. We can build that into everything that we do. Extensibility means that I can use your service, and extend it in some way without asking your permission or without getting all your other users to agree with me. We don’t have to version the thing if I wanna add a feature. And then the idea of modifiable means that you can change the service I’m using at runtime without breaking my application.
So, these are the key elements for design. These give us a foundation for creating messages, and using them in these actions. I’ll highlight one of them, which is describing problem spaces, because I think it’s such a powerful idea to describe what’s the space, and allow people to come up with solutions. This is actually a format called Application-Level Profile Semantics. It’s called ALPS. It was designed by me and a few other people. It’s actually about 10 years old now. And it helps you produce documentation for your vocabulary. So, this is the whole vocabulary, the semantic elements. It also produces a connection diagram, which tells you all the ways in which this particular problem space is connected. There’s a home. I can go from home to the list, a collective list, and from the list, I can go to a detail. And at the list, I can do such things as create or filter, so on and so forth. On the detail items, I can update and remove. There are lots and lots of other possibilities here, but this diagram gives me a chance to understand how things connect to each other.
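As a rough sketch of what such a description can look like, the fragment below is a hand-written document in the style of ALPS, held as a Python dictionary, plus a helper that lists the connections. The descriptor ids (`goList`, `goDetail`, and so on) are invented here to mirror the home, list, and detail example; the ALPS specification is the authority on the real structure.

```python
# A hand-written fragment in the style of an ALPS document (ids are hypothetical).
alps = {
    "alps": {
        "version": "1.0",
        "descriptor": [
            # Semantic descriptors name the places in the problem space
            # and point at the actions available from each one.
            {"id": "home", "type": "semantic",
             "descriptor": [{"href": "#goList"}]},
            {"id": "list", "type": "semantic",
             "descriptor": [{"href": "#goDetail"}, {"href": "#filter"},
                            {"href": "#create"}]},
            {"id": "detail", "type": "semantic",
             "descriptor": [{"href": "#update"}, {"href": "#remove"}]},
            # Action descriptors: safe reads, unsafe writes, idempotent writes.
            # "rt" names where the action leads.
            {"id": "goList", "type": "safe", "rt": "#list"},
            {"id": "goDetail", "type": "safe", "rt": "#detail"},
            {"id": "filter", "type": "safe", "rt": "#list"},
            {"id": "create", "type": "unsafe", "rt": "#list"},
            {"id": "update", "type": "idempotent", "rt": "#detail"},
            {"id": "remove", "type": "idempotent", "rt": "#list"},
        ],
    }
}


def connections(doc):
    """List (from, action, to) triples: what you could do, not what you must do."""
    descriptors = doc["alps"]["descriptor"]
    actions = {d["id"]: d for d in descriptors if d["type"] != "semantic"}
    edges = []
    for state in descriptors:
        for ref in state.get("descriptor", []):
            action = actions[ref["href"].lstrip("#")]
            edges.append((state["id"], action["id"], action["rt"].lstrip("#")))
    return edges


for edge in connections(alps):
    print(edge)
```

The triples are the raw material for a connection diagram: they describe the possible moves without dictating any particular sequence.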
So, it’s not the same as a C4 diagram if you were in here earlier today. It’s not the same as a state diagram, which gives you very specific sequential actions. This doesn’t dictate where people go. It just says what you could possibly do. So, I like this notion of describing problem spaces. Overall, the idea is to make designs composable. Gartner talks a lot about composability right now, so it’s that same notion of making things composable, so that we can then later connect them even if we didn’t plan on doing that ahead of time.
So, we’ll talk about clients. Good news about computers is they do what you tell them to do. That’s also the bad news. Ted Nelson, anybody heard of Ted Nelson before? Ted Nelson is a person who’s credited with the notion of creating the words hyperlink, hyperdata, hypermedia. Ted Nelson is still alive and kicking in a farm in upstate New Jersey, and he’s just as irascible as ever.
So, client patterns. I choose to talk about client patterns before talking about services, because I think clients are more important than services. It’s the clients that we’re working toward. It’s the clients we want to enable. It’s the clients that do the work. So, client consumer applications, API consumer applications, often make very few assertions on how they communicate. They will know a protocol, like HTTP, they’ll know a message model, like JSON or HTML, and they’ll know a vocabulary, like banking or something, but that’s it. They don’t necessarily know when they start from the first page, what’s going to happen next. That’s what links and forms are all about. So, we have lots of opportunity to empower clients to navigate this problem space however they wish.
So, the starting elements are really making sure that you’ve got a client that knows how to get started. We talk about using, really just kind of memorizing one URL. One of the patterns I talk about in the book is the home pattern, which is the guaranteed URL for starting, and the very next thing that you find at that page are all the versions that are possible, right? So, when you change versions, all you do is add more to the collection of possibilities, rather than try to replace the existing possibilities.
So, message-centric applications is really the idea of our messaging back and forth, and understanding vocabularies is the other leg of that three-legged stool, right? Is this notion of understanding what we’re gonna be talking about. Often what we’re doing is we’re trying to figure out, okay, at runtime, we’re uncorrelated. How do we make that first connection? I would like to talk about banking in a JSON format. Is that good with you? Yes, that’s good with me, as long as we do it over MQTT. Oh, okay. Right, we can communicate, machines can actually make these decisions.
Managing representation formats, this idea of, "I can get you the information as CSV, as HTML, as SVG, as a canvas document." Those are all representations. Every element in the response needs an identifier. I need to be able to know where the given name is, where the family name is, where the postal code is inside the document. You might think that schemas are gonna tell me the solution. That’s fine. But there are many, many applications where they don’t use a fixed schema for every conversation. HTML is one of them, right? There’s no schema that we do for HTML every time we get a document, because there’s so many various possibilities. So, breaking outside that role where I just have a fixed set of objects, and I can actually send you a message that contains one or more objects, is very, very powerful as well.
The other thing that I talk about here, that, mentioned down here in the bottom, and you may not be able to see, maintaining your own state, and having a goal in mind. Often we think of client applications as sort of the, just the followers, the robot. They don’t have their own state information. They’re not making their own decisions. They’re just following what the server tells them. The most powerful applications, the most powerful API consumers, are the ones that are solving their own problems, not yours. They might actually enlist lots of different APIs, lots of different services, and these services don’t even know each other, and they’re trying to solve their problem. They have their own goal in mind. I’m trying to actually do some shopping at the shopping site. I’m going to use this credit card to pay for that. I’m gonna take all that information and hand it over to this shipper. That’s the client that’s actually making those kinds of decisions.
So, one of the patterns I like in this category is this notion of representation formats. So, you can build a client application that actually says, "You know what? If I receive the response in this particular format and representation, then I’m gonna parse it as a Collection+JSON message, and then act accordingly." But I might get the same information in some other format, like SIREN, or maybe HAL, and I can just simply parse them accordingly, and still have all the information I need. What I’m looking for is this, this object as a person, this schema as a person. So, I could get that person information in lots of different formats.
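A minimal sketch of that kind of client, assuming simplified versions of the three formats. The parser functions and sample payloads are hypothetical; the media type strings are the registered names for Collection+JSON, Siren, and HAL.

```python
def from_collection_json(doc):
    # Collection+JSON carries each item as a list of name/value pairs.
    item = doc["collection"]["items"][0]
    return {field["name"]: field["value"] for field in item["data"]}


def from_siren(doc):
    # Siren keeps an entity's state under "properties".
    return dict(doc["properties"])


def from_hal(doc):
    # HAL keeps state at the top level; "_links" and "_embedded" are reserved.
    return {k: v for k, v in doc.items() if not k.startswith("_")}


PARSERS = {
    "application/vnd.collection+json": from_collection_json,
    "application/vnd.siren+json": from_siren,
    "application/hal+json": from_hal,
}


def parse_person(content_type, doc):
    """Pick a parser from the response's media type and return a plain person."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in PARSERS:
        raise ValueError(f"unsupported format: {media_type}")
    return PARSERS[media_type](doc)


cj = {"collection": {"items": [{"data": [
    {"name": "givenName", "value": "Ada"},
    {"name": "familyName", "value": "Example"},
]}]}}
siren = {"class": ["person"],
         "properties": {"givenName": "Ada", "familyName": "Example"}}
hal = {"givenName": "Ada", "familyName": "Example",
       "_links": {"self": {"href": "/people/1"}}}

print(parse_person("application/vnd.collection+json", cj))
print(parse_person("application/vnd.siren+json", siren))
print(parse_person("application/hal+json; charset=utf-8", hal))
```

Supporting one more format means adding one more entry to `PARSERS`; the rest of the client keeps working with the same person dictionary.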
In fact, I can also extend this application to actually include another format that’s not on this list here, right? So, I can expand this application. So I can extend this, this is an example of extending, so I can even talk to another server. Maybe that other server gives me the information in RDF, or Hydra, or something like that, and so I can add that as well without upsetting anyone else or any other service. So, the whole idea about the client pattern group is to make clients adaptable. That means sort of reactionary in a lot of ways. I want them to be able to react to changes in the environment, and still be alive, and still be consistent, and still keep running. In any system that we have lots and lots of applications or lots and lots of APIs, there’s always gonna be some part of the system that’s not running efficiently. There’s always gonna be somebody who’s upgrading or versioning or changing or modifying some part of the system. But the system has to keep running all the time, right? Those of us who work in enterprises, that’s sort of our real critical goal, is how do we keep everything up and running all the time, even when we’re making changes? So, creating clients that are adaptable, that can react to new formats, that can be extended individually, that understand the vocabularies ahead of time, are a great way to improve that resilience in your system.
I love this quote from Paul Clements. Paul Clements was one of the earliest people who kind of came up with this notion of software architecture, Carnegie Mellon University, "The best software architecture knows what changes often, and makes that easy." I know there are certain aspects, like maybe it’s the URLs that are gonna change quite a bit, so I’m gonna make the URLs easy to change inside the software. Maybe it’s the number of users, maybe it’s the profile information, maybe it’s just the content itself. But there are certain things that I know I need to make easy. Now, when you’re creating services, you’re creating an interface. The API is the contract. And that’s really full stop what it’s about, right? You’re making a promise, and that promise needs to be kept for a long, long, long, long, long time, because people are gonna be around for a long time as well, and your service needs to be dependable, needs to be valuable.
The diagramming, which, by the way, is, this is a fantastic set of diagramming by Fagner Brock [SP] for this book. This one in particular has lots and lots of meta information. There’s build time work for services, there’s runtime work for services, there’s content negotiation. There’s a lot going on in the service realm. Services are really the way we enable clients to do their work, right? Services aren’t the work. Services are the enablers to let clients do their work, and there are lots of service patterns covered in the book. And the big ones right in the very beginning talk about this notion of model leaking. How do we prevent model leaks? How do we prevent the notion of when you change the model, and you’ve got five classes instead of four classes, that doesn’t change the contract. That means you need to construct your services, and the interface that goes with them, the API that goes with them, in a decoupled way that makes sense.
How do you convert internal models to external messages? So, a customer class or a person class inside code needs to be converted to a message. We have simple serializers that maybe just serialize that into JSON or XML, but maybe you need more than that. Think of this case where you have an object, you’ve got a customer object, you’ve got some invoices, and then you’ve got some sales information about that customer. What you wanna do is present this as a single message, right? I wanna have the customer, their list of orders, and their list of all their recent sales contacts. That’s a message, right? So, we need to convert those models into messages, back and forth.
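One way to picture the model-to-message step is a small function that composes several internal objects into a single outgoing message. The classes, field names, and sample values below are hypothetical; the point is only that the message is assembled deliberately rather than serialized straight from the internal model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    id: str
    name: str
    credit_score: int  # internal detail that should not leak into the message


@dataclass
class Order:
    id: str
    total: float


@dataclass
class SalesContact:
    date: str
    note: str


def customer_message(customer, orders, contacts):
    """Compose three internal models into one external message."""
    return {
        "customer": {"id": customer.id, "name": customer.name},
        "orders": [{"id": o.id, "total": o.total} for o in orders],
        "salesContacts": [{"date": c.date, "note": c.note} for c in contacts],
    }


message = customer_message(
    Customer("c-1", "Ada Example", credit_score=712),
    [Order("o-9", 120.0), Order("o-10", 35.5)],
    [SalesContact("2024-05-30", "Asked about renewal")],
)
print(message)
```

Because the message is built by hand, the internal classes can be split or renamed later without changing what clients receive.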
And then expressing internal functions as actions. How do I express, submit, and share, and update, and approve, and deny, all those kinds of things? How do I turn those into exposed elements? And you’ll do that differently in an AsyncAPI than you would in a RESTful API, or than you would in GraphQL, and so on and so forth. So, that ability to think about that as an abstract element is important. A big chunk of the elements in the services list of patterns has a lot to do with metadata, as I was saying before, support for client preferences. Telling clients, you know what? I support four different formats, two different protocols, and two examples, two editions of the same semantic profile. Right? That’s a lot of metadata that you can empower clients with.
Content negotiation, what format do you wanna talk about today? Publishing service definition documents. Service definition documents like OpenAPI, and Schema Definition Language for GraphQL. Protobuf for gRPC, and AsyncAPI for evented APIs, right? These are all definition documents that are going to make it easier for others to build clients or SDKs, or even mimic the services at other locations. So, they’re very, very important. Publishing API metadata, such as terms of service, service level agreements, where you can find the documentation, who you can call if there’s a problem. There’s a great format called APIs.json. Has anyone heard this? APIs.json is a great way to express this kind of metadata information.
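Content negotiation can be sketched as a function that compares the client's Accept header with the formats a service supports. This is a deliberately simplified illustration: it honors `q` values and `*/*` but ignores partial wildcards such as `text/*`, and the `SUPPORTED` list is an assumption.

```python
# Formats this hypothetical service can produce, in the server's preferred order.
SUPPORTED = ["application/json", "text/csv", "text/html"]


def negotiate(accept_header, supported=SUPPORTED):
    """Return the supported media type the client prefers, or None (think HTTP 406)."""
    preferences = []
    for part in accept_header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type, quality = pieces[0].lower(), 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media_type:
            preferences.append((quality, media_type))

    # Highest q first; sorted() is stable, so header order breaks ties.
    for quality, media_type in sorted(preferences, key=lambda p: -p[0]):
        if quality <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None


print(negotiate("text/csv;q=0.8, text/html"))
print(negotiate("application/xml"))
```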
Supporting health monitoring, simple heartbeats, as well as what the average throughput is and so on and so forth, becomes really powerful when you start orchestrating services, in figuring out the health of elements. This is often available in closed systems, like Kubernetes or something like that, but it needs to be available to anyone. Anyone, anywhere, all the time.
Standardizing error reporting. There’s an RFC called HTTP problem format, which standardizes the way we tell people, "Okay, there was a 400. You need to include these fields," or, "I couldn’t take that money out of your account because you don’t have enough money in your account." Standardizing that in a way that allows all the clients to understand them is really, really important. And then the last set of service patterns are optimizations, improving things. Service discoverability. You can have an API catalog which lists all your possible services, but that doesn’t tell you what’s up and running right now. You can have a service discovery document, which basically says every time somebody fires up a service, it makes an entry in a registry. Now other people can find that entry in the registry, and they can bind at runtime, to make sure that they can do something like approve a payment or handle a shipping or something else like that.
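For the error-reporting pattern, here is a small sketch that builds a response in the shape of the "Problem Details for HTTP APIs" RFC, which defines the `type`, `title`, `status`, `detail`, and `instance` members and the `application/problem+json` media type. The helper name, the example URI, the status code choice, and the `balance` extension field are assumptions made for illustration.

```python
import json


def problem(status, title, detail=None, type_uri="about:blank", **extensions):
    """Build (status, headers, body) for a problem-details style error."""
    body = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    body.update(extensions)  # extra members a client may use, such as a balance
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)


status, headers, body = problem(
    403,
    "Insufficient funds",
    detail="The account does not hold enough money for this debit.",
    type_uri="https://api.example.org/problems/insufficient-funds",
    balance=30,
)
print(status, headers["Content-Type"])
print(body)
```

Every client that understands this one format can report any error the service produces, even errors added after the client was written.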
There’s also some talk about idempotent create statements. So, we talked about idempotent earlier. So, rather than using POST to create, using PUT to create is a really powerful way to make sure that system is resilient, so even if the network goes down for some reason, you could replay that, and make sure that you’re not repeating the same action or creating duplicate records. There’s something about fallbacks. I always need to provide a fallback. If I’m connected to some other service, and that service is unavailable, I’m gonna have to do something. I’m gonna have to return a 500, which says it’s unavailable, or an alternate, or if it’s a read situation, I might give you an older version of that particular product record or something like that, but we always need fallbacks for every single connection that we make.
Okay. So, to highlight, I love this service registry pattern we talked about earlier. Almost every language has a pretty easy way to do this. When the service fires up, it registers, it sends some response data, some metadata about that service, what it’s available for, how it can search it, what you need to start. Maybe it needs mutual TLS or something like that. And then also, when the service shuts down, then it also de-registers itself. So, the service registry becomes the sort of active reflection of what’s going on at this moment in your part of the world, in your space.
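A minimal sketch of that register-on-start, de-register-on-stop cycle, assuming an in-process registry. The class, the metadata fields, and the example URL are all hypothetical; a real registry would be a separate networked service.

```python
from contextlib import contextmanager


class Registry:
    """Hypothetical registry holding metadata for services that are up right now."""

    def __init__(self):
        self.entries = {}

    def register(self, name, metadata):
        self.entries[name] = metadata

    def deregister(self, name):
        self.entries.pop(name, None)

    def find(self, capability):
        # Return the names of running services that offer the capability.
        return [n for n, m in self.entries.items() if capability in m["capabilities"]]


@contextmanager
def running_service(registry, name, metadata):
    """Register on startup and always de-register on shutdown, even after an error."""
    registry.register(name, metadata)
    try:
        yield
    finally:
        registry.deregister(name)


registry = Registry()
payments_metadata = {
    "url": "https://payments.example.org",  # made-up address
    "capabilities": ["approve-payment"],
    "requires": "mutual-tls",
}

with running_service(registry, "payments", payments_metadata):
    found_while_up = registry.find("approve-payment")

found_after_shutdown = registry.find("approve-payment")
print(found_while_up, found_after_shutdown)
```

Clients look up a capability such as approve-payment at runtime rather than hard-coding an address, so the registry reflects only what is running at that moment.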
So, making services modifiable is the parallel, or the handoff between making clients adaptable. I can change the service in certain ways. I can change the underlying information without breaking the contract, and that means that the two of us can work together.
All right. Let’s talk a little bit about data. Irakli Nadareishvili, with JP Morgan Chase, he’s the head of their innovation labs, internet innovation labs. He’s got a great line about data. The first step in breaking a data-centric habit is to design systems, not as a collection of data services, but instead as a set of business capabilities. So, this goes back to this notion about data as evidence, rather than as mining gold. So, data is actually the evidence of a business capability. I wanna be able to approve a payment. I wanna be able to share information with others. I wanna be able to onboard a new customer. That’s what I wanna do. Making data allow us to accomplish the actions we need is very important.
This is the one time I will do a self-referential quote, which is kind of creepy, but I’m gonna do it anyway. So, this is something I talked about about 10 years ago. "Your data model is not your object model is not your resource model is not your representation model." Each one of those is a layer. And you should have the freedom to adjust that layer, or even replace that layer, without upsetting the others in that stack. And that’s a bit of a challenge. But data messaging is critical in that sense, because often what we run into are tools that actually just tell us to use our data model, and generate an API from the data model. And it really does us a disservice. Now, sometimes, if I’m just trying to solve a problem myself, today, on my machine, with some command line apps, that’ll probably be a good idea. But if I wanna have any longevity, any future, if I wanna interact with uncorrelated sapient beings, it’s probably not a good idea, because too much will change at that data layer.
So, hiding your data storage is really, really important, is a great example of a way to start. I don’t care if you’re using T-SQL or MongoDB or physical files. I don’t really care. Just, please, let’s talk about the data that I need in order to do the actions. And I don’t care if you change that later. I don’t care if you change that to a hierarchical database or an object-oriented database from Datomic to T-SQL. It doesn’t matter to me. So, hiding that technology’s really important. Now, often it’s pretty easy to hide the technology. You know, I’m using Kafka, I’m using Mongo, but it’s often not easy to hide the query language, right? A lot of us still wanna do SELECT * FROM, or actually do something else that’s very specific to a particular technology. So, making sure you hide all of the technology and all of the storage internals is very, very important.
It’s also important to leverage things like the way HTTP URLs work. You’ll notice that sort of the default for HTTP URLs is not LESS THAN or GREATER THAN or GROUP BY or IN BETWEEN or all those kinds of things like that. It’s simply equals. This field equals, name equals Mike, right? And something, something, something. So, implementing your query language to mimic that, at least at that level, is super, super powerful. Now, you may have very powerful data tools behind that, but if you just promise this information, where this value contains something, "name equals MI" would return Mike, for example, right, because MI is contained, then it’s very, very powerful.
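Here is one way the "equals means contains" convention could look, as a sketch. The sample records and the query function are assumptions; the only library call is urllib.parse.parse_qs from the Python standard library, which splits a query string into fields and values.

```python
from urllib.parse import parse_qs

# Hypothetical sample data.
PEOPLE = [
    {"name": "Mike", "city": "Amsterdam"},
    {"name": "Mary", "city": "Berlin"},
    {"name": "Amir", "city": "Utrecht"},
]


def query(records, query_string):
    """Treat each field=value pair as a case-insensitive 'contains' match."""
    criteria = parse_qs(query_string)

    def matches(record):
        return all(
            any(value.lower() in str(record.get(field, "")).lower() for value in values)
            for field, values in criteria.items()
        )

    return [record for record in records if matches(record)]


by_name = [p["name"] for p in query(PEOPLE, "name=MI")]
by_name_and_city = [p["name"] for p in query(PEOPLE, "name=MI&city=amsterdam")]
print(by_name, by_name_and_city)
```

Note that "name=MI" matches both Mike and Amir, since both contain "mi". Nothing about the storage engine or its query language leaks into the URL.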
Also, returning metadata for queries. So, especially internal, on the internal side, if you’re making informational queries, information retrieval queries, rather than data update or data management queries. I want all the records that have anything to do with GOTO in Amsterdam over the years. Well, that could be hundreds and hundreds, maybe thousands and thousands of records, depending on what I’m looking for, right? So, we don’t really know for sure. So, often I need to know ahead of time, you know, I’ll give you the first 100, but I’m telling you, there’s about 11,000 on this. You might wanna refine your query. Or, there are so many possible returns, I’m not gonna actually even start retrieving. You need to narrow your query. That happens in sharing metadata and meta information.
A classic case, for those of us who work in applications, if I have a query that returns no records, what is my HTTP value? 200 or 400? Who says 200? Who says 400? It’s an empty page. All right, I didn’t get anybody on that one. But having a pattern over and over again, if you think in terms of messages, it doesn’t matter if you have 1 record or 500 records or no records, I still get a message back every time. So, 200 is what makes sense.
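The last two points, returning metadata with a query and answering 200 even when nothing matches, can be sketched together. The envelope shape (meta, items), the hint text, and the page_size and max_total parameters are assumptions for illustration, not a standard.

```python
def search(records, predicate, page_size=100, max_total=50_000):
    """Return (status, message); the message has the same shape for any result size."""
    # A real store would run a count query instead of loading every match.
    matches = [r for r in records if predicate(r)]
    total = len(matches)

    if total > max_total:
        # Too many matches: send no items, just the metadata and a hint.
        meta = {"total": total, "returned": 0, "hint": "narrow your query"}
        return 200, {"meta": meta, "items": []}

    items = matches[:page_size]
    meta = {"total": total, "returned": len(items)}
    if total > len(items):
        meta["hint"] = "refine your query"
    return 200, {"meta": meta, "items": items}


records = [{"id": i} for i in range(11_000)]

big_status, big = search(records, lambda r: True)
empty_status, empty = search(records, lambda r: False)
refused_status, refused = search(records, lambda r: True, max_total=10_000)

print(big["meta"], empty["meta"], refused["meta"])
```

The client always gets a message back, whether it holds 100 items out of 11,000, zero items, or only an instruction to narrow the query.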
In some cases, you can actually use a media type for a query. Some very long, involved queries make it really difficult to, like, put on an HTTP line, right? I actually need a rather involved set of statements. So, there is, actually, an SQL media type, application/sql. It’s been registered for more than a decade. So, I can actually send SQL messages back and forth, using POST bodies, or even updates, PUT bodies, in messages over HTTP. How many people have used Solr, or some other kind of system, right? ElasticSearch, something like that? They have their own languages as well, and they can be converted into media types, where we send rather involved Solr or ElasticSearch queries over messages in HTTP.
Finally, there’s some ways to improve the data with caching. Modifying data in production, modifying models in production, happens quite often. What happens when suddenly I wanna add a field, I wanna move between… This used to be in two tables, now it’s gonna be in three tables. I need to be able to do that without breaking things, and I need to make it reversible, so that if I change my mind, that I can go back to the old model.
Extending remote datastores. I’m working with Salesforce, and they’re doing some information for me, but I need some extra fields. Maybe Salesforce doesn’t allow me to do that. Actually, they do, but some other product might not, HubSpot or Workday. I need to create my own database and then connect it to theirs. For every query that comes back, I need to attach my data, so that my users can use that information. That’s a great example of extending. Limiting large-scale responses we talked about earlier, and then pass-through proxies as well.
So, modifying models in real time is pretty common. This is my message. This is my data. It turns out I also need to add a middle name along with family name and given name, but this database is maybe owned by HubSpot or Workday. I can’t change their data. So, what I’m gonna have to do is create my own name-value pair sidecar table, and associate it with individual records, and now I can keep track of that information. In fact, what I might do is actually just expose this information to everyone. They don’t need to know that some of these are from a sidecar, and some of these are from some other source. So, this is a powerful way to extend data without having to get other people involved.
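A small sketch of the sidecar idea, assuming the remote system is read-only and the extra field lives in a local name-value table. All record ids, names, and values here are made up.

```python
# Hypothetical data owned by the remote system; we cannot change its fields.
REMOTE_CONTACTS = {
    "c1": {"givenName": "Jan", "familyName": "Jansen"},
    "c2": {"givenName": "Eva", "familyName": "Smit"},
}

# Local sidecar table: one (record_id, name, value) row per extra field.
SIDECAR = [
    ("c1", "middleName", "Maria"),
]


def read_contact(record_id):
    """Merge the remote record with its sidecar rows into one message."""
    record = dict(REMOTE_CONTACTS[record_id])  # copy, never mutate the remote data
    for row_id, name, value in SIDECAR:
        if row_id == record_id:
            record.setdefault(name, value)  # remote fields win on a name clash
    return record


with_extra = read_contact("c1")
without_extra = read_contact("c2")
print(with_extra, without_extra)
```

Callers of read_contact see one flat record and have no way to tell which fields came from the sidecar.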
People know the Bezos everything-should-be-an-API story. Does anybody know this story? So, Bezos started AWS with this notion that everything’s gonna be an API, and you can only talk to the API, and you can’t be talking to the software people anymore and go around their backs. And people think this mostly had to do with some architectural understanding about creating microservices or something like that. There was no such thing as microservices in the '90s, when he promulgated this message. What he was trying to do is get people to stop having meetings. Do not have meetings. Do not call a bunch of expensive architects together and argue about whether or not we should add the middle name field to your database, because nobody else wants it, just one team. Just add the thing yourself. Yes. So, that’s really where that pattern comes from. Okay. Making data portable is really powerful.
Let’s talk about workflow. "Productivity is never an accident. It’s the result of a commitment to excellence." Paul J. Meyer is a motivational speaker type. I just like this quote because of this notion that it takes work for things to work properly, and that’s really what workflow is all about. Workflow is really, really hard. So, what we’re trying to do is enlist other services in some kind of composable way, so that we can accomplish something that maybe each individual service doesn’t know about. So, I have common patterns, I have this notion of tasks and jobs that need to get accomplished, and I have this notion of knowing how my tasks and jobs are going. There are lots of patterns in workflow in the book. We won’t get a chance to talk about a lot of them, but a bunch of them have this notion about what does a workflow-compliant service look like? What features does it have?
We know that HTTP resources have things like GET and PUT and POST and DELETE. That’s a sort of a paradigm, right? That’s a pattern. What is the pattern for compliant services? They need to be able to commit, they need to be able to roll back, they need to be able to cancel, they need to be able to redo, right? There’s a series of steps, and some of these messages talk about that. Shared state for workflows is really important as well. If I’m enlisting services I’ve never met, where does the state go? Sometimes it goes on the client, but sometimes there are multiple clients. There are different ways to think about how to share state. Describing workflow as code is how most of us do it. I’m gonna call this service, I’m gonna write some code for it, and then I’m gonna take data from that service and hand it to this service, and then I’m gonna take data from that result and hand it over there. We write code.
But sometimes we create a domain-specific language. Has anybody heard of the language Ballerina? Ballerina is a language designed specifically for this notion of workflow, or interconnections, integrations. I love that they called it Ballerina because they do choreography. If anybody knows about service choreography, it’s like ballet. You could also have workflow as documents. You can pass around documents like invoices, like paperwork, and this is the invoice, or this is the shipping manifest that goes with this document across the world. Or you can actually create your own kind of job control language. I give an example of one that I’ve learned, in a couple of different formats, called RESTful job control. We’ll take a look at that in just a second.
And then there’s a whole bunch of everyday things. How do I know how my progress is going? Are we done? Are we halfway done? Is something stuck? What are all the possible things I could do right now? Returning all the actions. What are the most-recently-used actions? People who open up a ticket often wanna do this next step, which is assign it to somebody for work. Send that information along ahead of time. Make it easy for a client to solve a particular problem. Supporting stateful work in progress. There’s, like, five or six different tabs in a screen or something. How do I support that ahead of time? How do I create a stateful step-by-step until I finally submit the final item? Establishing list navigation, we’ve talked about. Partial form submit, which is very powerful. I need five inputs, I give you three. Rather than rejecting the inputs, I’ll take those three. Please give me two more. Right? And this means that you can create very powerful autobots, using these kinds of technologies.
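The partial form submit mentioned above can be sketched as a small stateful object. The field names and the response shape are assumptions; the behavior shown is simply "keep what was sent, report what is still missing."

```python
# Hypothetical list of the five inputs the form needs.
REQUIRED = ("givenName", "familyName", "email", "street", "city")


class PartialForm:
    """Accept inputs in any number of steps instead of rejecting incomplete ones."""

    def __init__(self):
        self.values = {}

    def submit(self, inputs):
        # Keep every recognized input that arrived, then report what is missing.
        self.values.update({k: v for k, v in inputs.items() if k in REQUIRED})
        missing = [field for field in REQUIRED if field not in self.values]
        status = "complete" if not missing else "incomplete"
        return {"status": status, "missing": missing}


form = PartialForm()
first_reply = form.submit(
    {"givenName": "Jan", "familyName": "Jansen", "email": "jan@example.org"}
)
second_reply = form.submit({"street": "Damrak 1", "city": "Amsterdam"})
print(first_reply, second_reply)
```

The first reply names the two missing fields, which gives a human or an automated client exactly what it needs to finish the job.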
A stateful pattern to enable client workflow is this notion of clients wanna manage their own time. Maybe they’re paying attention to the temperature in the room, and their job is to maintain it at a certain temperature. So, let clients decide when to turn things on and turn things off, by giving them enough information to know about what the temperature is in various parts of the building. And then there’s some optimizations about replays, and incomplete work, and automatic retries and rollbacks. We’ve talked about some of these before as well. But creating them as recipes and as paradigms means we can talk about them in a kind of a general way. Okay.
So, this is actually that job control language we talked about earlier. I can go from home to get a list of jobs. I can perform jobs and do various steps, and if everything’s working out, then I’m successful. If not, I roll back. So, making workflow flexible is very important. It’s really a meta thing. It isn’t enough just to enable a workflow. The goal is to enable other people to make workflows. That’s super challenging.
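The run-the-steps-or-roll-back flow can be sketched in a few lines. This is not the RESTful job control format from the book, only a generic illustration; the run_job function, the task names, and the state dictionary are all made up. Each task carries its own undo action, which is what makes a rollback possible.

```python
def run_job(tasks):
    """Run (name, action, undo) tasks in order; on failure, undo finished ones in reverse."""
    done = []
    log = []
    for name, action, undo in tasks:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, done_undo in reversed(done):
                done_undo()
                log.append(f"rolled-back:{done_name}")
            return "rolled-back", log
        done.append((name, undo))
        log.append(f"done:{name}")
    return "success", log


state = {"reserved": False}


def reserve():
    state["reserved"] = True


def unreserve():
    state["reserved"] = False


def charge():
    raise RuntimeError("payment declined")  # simulated failure


def refund():
    pass  # nothing to undo in this sketch


outcome, log = run_job([
    ("reserve-stock", reserve, unreserve),
    ("charge-card", charge, refund),
])
print(outcome, log, state)
```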
All right. Let’s wrap it up. All of the things that are included here actually follow this sort of RESTful principle, that I’ve carried with me for a long time. And you’ve heard me mention it in various ways already. Leveraging global reach, the internet, to solve problems you haven’t thought of, for people you’ve never met. And think about it. That’s what good software does. That’s what spreadsheet software does, right? Spreadsheet software lets people that the spreadsheet software creators have never met over the last decades, create solutions for problems they haven’t thought of. That’s what we wanna do.
And there are various ways to think about good recipes or good patterns. The ability to share solutions yourself, and find other people’s solutions. Make them available for others to use in ways you haven’t thought of, and let them extend it. Let them do other things with it. Don’t close them off. Don’t prevent them from being creative. Letting strangers safely and successfully interact to solve a problem. Guess why you give everybody rollbacks? They try something out. Oops, didn’t like that. Let them roll that back. And this idea of promoting longevity and independent evolution, so that I can change my service without worrying about changing everybody else. And then finally, this idea of understanding that time is actually an architectural element. It will take a lot of time. Time changes things, priorities change over time, so our services need to be able to change as well.
So, we’ve talked about each one of these things, and I’ll just leave you with one more message, from Donella Meadows, who wrote a great book called "Thinking in Systems." "Everything we think we know about the world is just a model," right? There’s another phrase, "The map is not the territory," from Korzybski. This idea that it’s the model that lets us see the world. And we can change that model, we can create other models, and we can share models with other people, and in that way, we can actually change what the world looks like. We can empower others to do all sorts of amazing things, especially when we think in these terms of paradigms and patterns, rather than specifics, and let other people be creative in solving their problems. And that’s what I have for you today. Thank you.