Noel: Hello, welcome to Pod Rocket. I'm Noel and with me today is Phil Wilkins. Phil is a cloud developer evangelist at Oracle, he's an author, and much more. Welcome to the podcast, Phil. Phil: Hello. Thank you. Noel: Of course. So recently you gave a talk, API Stop Polling, Let's Go Streaming. Did I get that title right? Phil: That's right, yep. Noel: Perfect, perfect. And that was at the Dev Innovation Summit this year, and I kind of wanted to just talk about that and pick it apart a little bit. But before we get into it, can we kind of get into your background a little bit more? Can you tell us a little bit about your role at Oracle and what a cloud developer evangelist does? Phil: Okay, so Oracle is a bit more than just a database company. Oracle have a hyperscale cloud offering, trying to compete with Microsoft and AWS and GCP and so on. And my role, amongst a couple of us, is to work inwardly with our product development teams to share insights that we pick up when out talking with customers or potential customers and the community at large, and at the same time just sort of take what they're sharing with me, and my understanding of the platform and what we can do, and present at conferences, write papers, write content for the Oracle website, expressing how our technologies work. A lot of that these days is built around and supporting open source such as Kubernetes and GraphQL, and if you're a Java developer you'll know all about VM and the Java platform that Oracle drives. So it's taking a look at all those sorts of things and getting those messages out there, sharing insights and practices that we think are particularly effective, or warning people of potential potholes that they may encounter. Noel: Gotcha. Yeah, that makes sense. That makes sense. So on your website you kind of say that your technical skill set is centered around API application development and integrations for on-prem and cloud, primarily using open source.
So would you say that your career is kind of focused on this API development and integrations space specifically when you're going out and writing articles and talking to devs, or is it more broad than that? Phil: It's broader than that. I mean, I started my career more years ago than I care to remember as a developer, and I've come up through that path, and I worked in some fairly demanding real-time environments. So I got taught some pretty disciplined development techniques. That stayed with me, although as I've grown through my career I've done consulting, which gives me the expertise to be able to sit down and talk with people about their problems and their challenges and share ideas and techniques that we've proven with real-world problems. So whilst my specialty is APIs, I've dealt with low-code environments, but I do tend to focus on the back end. APIs kind of go everywhere, and that developer skill gives me the understanding and the ability to do microservices, but not everybody needs that or wants that for solving system problems. Noel: Yeah, I mean, I think that kind of way of thinking about APIs, it's easy to apply that to microservices, like endpoint architecture, but it's also something you have to think about even when you're just modeling classes or functions within a given application. There's still an API here. Maybe not in the strictest definition, but there's still an interface here where I'm thinking about the consumer. It kind of all folds back in on itself. Phil: Absolutely. I mean, if you think about older techniques, we've gone through talking about test-driven development, which is kind of working with the idea of an API: your tests are formulating and expressing what you think the contract is going to be. And then we've had contract-led development, where we've said, look, here's my interface, whether it's a .h file from C days or a Java interface or whatever your preferred language's approach to it is.
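To make the contract idea concrete, here is a minimal Python sketch, not anything shown in the talk. The `PriceFeed` interface, both implementations, and the `check_contract` helper are hypothetical names invented for illustration. The interface plays the role of the .h file or Java interface Phil mentions, and the check function plays the role of a test that expresses what the consumer expects.

```python
from abc import ABC, abstractmethod


class PriceFeed(ABC):
    """Hypothetical contract: the only thing a consumer may rely on."""

    @abstractmethod
    def latest_price(self, symbol: str) -> float:
        """Return the most recent price for `symbol` as a positive float."""


class DictPriceFeed(PriceFeed):
    """One implementation: prices held directly as floats."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def latest_price(self, symbol: str) -> float:
        return float(self._prices[symbol])


class CentsPriceFeed(PriceFeed):
    """A different implementation: prices stored as integer cents."""

    def __init__(self, cents):
        self._cents = dict(cents)

    def latest_price(self, symbol: str) -> float:
        return self._cents[symbol] / 100


def check_contract(feed: PriceFeed) -> None:
    """A test written against the contract, not against an implementation."""
    price = feed.latest_price("ORCL")
    assert isinstance(price, float)
    assert price > 0


# Both implementations satisfy the same contract, so either can sit behind it.
check_contract(DictPriceFeed({"ORCL": 120.5}))
check_contract(CentsPriceFeed({"ORCL": 12050}))
```

Because the check only touches the interface, swapping one implementation for the other does not change what the consumer sees.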
Noel: So then maybe to bring us back to the topic of the talk, do you feel that the way in which someone consumes an API is pivotal to its design? Does that dictate, if you're designing an API, how it should be thought about: is this going to be a polling API or a streaming API? Phil: I would always start with what's the problem you're trying to solve rather than necessarily starting with a technique. Once you've clarified and you understand the problems that you are trying to solve, then we can go, "Okay, well, what's the right approach?" And then we get into the details of how do I express that API using those techniques or technologies. So the coding bit of delivering the implementation of an API is almost the last thing you worry about, and that's quite often referred to as the API-first paradigm or the API-first approach, and it gives you a nice robust API that is less likely to be impacted the moment you want to change how your back end works. You start with one programming language and go, "Do you know what, Python's not giving me the grunt that I want. I need a pre-compiled language, I'm going to switch, replace my back-end implementation with, say, Go." If you've done that design first, then when you come to change your back-end implementation it shouldn't impact your API. But if you go the other way, which is quite often what people do when I think they're in a hurry, you write a whole chunk of code really quickly. It fires up, it does what you want it to do from the developer's viewpoint, but then when someone tries to come and use it, it's quite a bit harder because it might not behave how you expect it to, or the API reflects quirks in your implementation. Noel: Yeah, yeah, I think that makes sense.
I think a lot of devs that have gone through the process several times would likely agree, but I think the piece that I'm curious about is, again, say we've got an API that people are hitting and they're pulling a lot of data from very quickly. Whether one is polling or streaming that API, maybe whether one expects most clients to consume it via polling or streaming or webhooks or any number of methods, do you think that those kinds of distinctions should impact the way that an API is designed? Phil: Yes, they do need to. What you're actually looking at is the problem of how frequently and how reliably I can get the data. So the key point, which you mentioned yourself, is the velocity point. If you're wanting a fairly steady flow of data, then a polling technique is not necessarily the most efficient. And of course, if you want your client to be bang up to date all the time, you need to be pushing that data as events occur at the back end. So it naturally would suggest a streaming technique rather than, okay, the client is going to go and start asking the back end for data at some arbitrary interval. Noel: Gotcha. So maybe to define some terms a little bit for listeners, because we have people all across the gamut of developer familiarity here. When we're talking about polling, what is that exactly? What is an API that one is polling to pull data? Phil: Polling is the traditional API implementation. You have some sort of URL and a payload, you make a request on that. If we're RESTful, we're using HTTP or HTTPS and we just get an answer back and that's it. That's the conversation over and done with. That's a polling model. So the client is always the initiator and the last part of the conversation.
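As a rough illustration of the polling model just described, here is a minimal Python sketch. It is an in-process simulation, not real HTTP: the `Backend` and `PollingClient` classes and the `demo` function are hypothetical names, and a real client would issue HTTP requests on a timer. It shows the two costs mentioned in the conversation: changes between polls go unseen, and a poll that finds nothing new still costs a request.

```python
class Backend:
    """Hypothetical server holding one value that changes over time."""

    def __init__(self, price):
        self.price = price
        self.requests_served = 0

    def update(self, price):
        # Something changes on the server; no client is told about it.
        self.price = price

    def handle_request(self):
        # One request, one response, and the conversation is over.
        self.requests_served += 1
        return self.price


class PollingClient:
    """The client is always the initiator and only sees state when it asks."""

    def __init__(self, backend):
        self.backend = backend
        self.seen = []

    def poll_once(self):
        self.seen.append(self.backend.handle_request())


def demo():
    backend = Backend(100.0)
    client = PollingClient(backend)
    client.poll_once()                 # first poll sees the initial value
    for price in (101.0, 102.0, 103.0):
        backend.update(price)          # three changes happen between polls
    client.poll_once()                 # only the latest value is visible
    client.poll_once()                 # nothing changed, but it still costs a request
    return client.seen, backend.requests_served
```

In this sketch the client never observes 101.0 or 102.0, which is fine for an end-of-day check and a problem for anything time sensitive.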
In a streaming approach, the client normally initiates the conversation and will say, look, I'm interested in ticker events, stock exchange trades, and every time there's a trade I want to know about it. It will establish some sort of handshake, after which the back end will push data down to the client when the data changes. So you've got this constant flow, hence the stream. We're all familiar with video streaming, Netflix and all of that sort of thing; that's just a very specialized use case. You start your Netflix client, whether that's a browser on your computer or your smart TV, it does a handshake, it requests what it wants, and then the server will just keep pushing it at a tempo that's appropriate. A new bit of data and another bit of data, and those little bits of data are your frames of picture. Noel: Enjoying the podcast? Consider hitting that follow button for even more great episodes. Was the impetus for your talk that you felt that these kinds of APIs, or this method of communicating data between two systems, is underutilized and we're overly relying on polling right now? Phil: People tend to just think, oh, I want to create an API and therefore I'm just going to create a REST endpoint. And I've seen solutions, even products out there, that are supporting that because customers of those products are perhaps not thinking, actually, do you know what, I want the product I'm talking to, whether that's a finance system or a hotel system, to tell me when something occurs, not for me to have to go and ask for it. Noel: I see. So you brought up the hotel system just now, and we have Oracle Hospitality specifically having switched to using streaming APIs for a bunch of its internal data transfer. Is that right? Phil: That's right, yes. They're starting to offer that as well for the clients in certain things, because think about your customer journey within... Or perhaps not the customer journey but the staff within a hotel.
You check out, and they can't go and clean your room until you've checked out or there's some signal that suggests you've left the hotel, but you don't want the hotel staff to keep going click, click, click, okay, I can now go clean room 10, room 11. You kind of want them to be told the next thing on the list will be room 11. So when they've finished the room, they're not having to wait until the browser refreshes with the state, they've got it in front of them ready, and that's kind of where push can come into it. The back end decides who's going to be most effective to clean that room and you push it to their client machine or mobile device. So they've got that information. Noel: Yeah, yeah, I mean, I think that makes sense to me. Was the primary goal of the project just to essentially ensure that there wasn't stale data being referenced at the consumer level at the very end? Phil: There's a raft of use cases like that. There's a lot of integration going on when you start looking at how hotels work. There are lots and lots of third-party systems all integrating and communicating with your central management at the hotel, and you want to push the data back and forth rather than constantly polling. If we've got a system dealing with, say, enabling and disabling key cards, I check in, they'll give me the key card. You want to enable the door, or enable a third-party system so when you go into a hotel room, you've got on the TV, "Welcome, Joe Bloggs. Nice to have you with us." You want to be able to push to that system: we've checked Joe into room two, and please, system that deals with the entertainment, display this information. And you want to push it so that by the time the person's got to that room, it's up on the TV and it's correct. It's not waiting for the next time that entertainment system polls the hotel management to say, have I got any new check-ins to deal with?
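The hotel scenario above can be sketched as a push model in a few lines of Python. This is a simplified, in-process stand-in: plain callbacks take the place of a real streaming connection, and `HotelBackend`, the topic names, and `demo` are all hypothetical. The point is only the direction of the call: consumers register interest once, and the back end calls them when an event occurs.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class HotelBackend:
    """Hypothetical hotel management system that pushes events to subscribers."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The consumer initiates once, saying which events it cares about.
        self._handlers[topic].append(handler)

    def _push(self, topic: str, event: dict) -> None:
        # From here on the back end initiates: it calls every subscriber.
        for handler in list(self._handlers[topic]):
            handler(event)

    def check_in(self, guest: str, room: int) -> None:
        self._push("check_in", {"guest": guest, "room": room})

    def check_out(self, room: int) -> None:
        self._push("check_out", {"room": room})


def demo():
    backend = HotelBackend()
    cleaning_queue = []   # what the housekeeping device shows
    tv_messages = {}      # what the in-room entertainment system shows

    backend.subscribe("check_out", lambda e: cleaning_queue.append(e["room"]))

    def show_welcome(event: dict) -> None:
        tv_messages[event["room"]] = f"Welcome {event['guest']}"

    backend.subscribe("check_in", show_welcome)

    backend.check_out(11)       # housekeeping learns about room 11 immediately
    backend.check_in("Joe", 2)  # the TV in room 2 is updated without polling
    return cleaning_queue, tv_messages
```

Neither consumer ever asks "has anything changed?"; each one is told at the moment the check-in or check-out happens.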
Noel: I feel that in this scenario the front desk, or whoever is activating the key, it doesn't feel as important to me that their system is streaming, right? Because they're just going to write data when they know that it's ready to be written. Okay, the person's checked in, I hit the push button, I send the request up to the server. But it does make sense to me for whatever is displaying or reading that data, especially if it's displaying data that is potentially being written to by lots of clients and we want to ensure that it is always up to date without having to determine thresholds for polling, for example. I feel like streaming does seem highly beneficial there. Is that a pattern that is pretty general? Is data that is often written by many clients a good candidate to have its readers streaming updates to it? Phil: It depends on how sensitive to time the client particularly is. If your client wants to have the current view at any moment in time, then streaming is a good vehicle for it. If it's not time sensitive, then the fact that three or four things have happened before you do the next poll may not be relevant. You don't want to be told that somebody's just done something when the client doesn't care. It's not relevant to the user of the client that that's happening. When you're looking at stock trading, you absolutely want to know by the second when things change. So the client needs to know, it needs to be up to date and have every event as it happens. But if I wasn't that worried and I was just doing an end-of-day bank account check, I don't need every event on my bank account to come through. I just need to go and ask once at the end of the day. It's all about time sensitivity and things like that. Noel: Gotcha. That makes sense to me. What, I guess, then are the downsides of using streaming over a more traditional polling-based approach?
Because it sounds like one could ask, well, why wouldn't I always just want the most up-to-date data, even when I'm logging into my bank account at the end of the day? What's it hurt to just have that thing streaming all the time? Why don't we just use these streaming paradigms for everything? Phil: Streaming raises a number of challenges. The benefit is obviously that when events happen it's going to come to your client, and that's fantastic, but it makes the server and the network infrastructure work that much harder. Once you've created that connection to start that stream, that's some sort of network socket typically, not always, but more often than not. That's using bandwidth and compute cycles to maintain and keep that connection alive. So even if you're just sending a heartbeat, that's still computational load. Now, one or two clients are not very thirsty. When you multiply that up to hundreds and thousands and tens of thousands, that suddenly starts to cost, and you've got to manage all of that, and that requires compute cycles and memory to keep the sockets open and all sorts of things like that. So it becomes less efficient. Depending on the streaming API technology you're using, you can literally sacrifice some security considerations, particularly if you're trying to cut the streaming costs down. One way of doing that is to use a technique called webhooks, where both ends use traditional requests. So you open up the connection, ask for some data, close the connection, communication finished. The difference is that with the webhook, the client provides the server with a URI for that client. So when the server has a change, it actually calls that URI and talks as if it was doing a standard web call. So the client gets called when it needs the information. The benefit of that is that you can exploit some of the security features of the HTTP protocol. When it's a pure socket, you are actually coming down in the abstraction, towards the infrastructure.
So you have to start doing things yourself. With HTTP, you know how it's constructed and we've got plenty of libraries. Over a WebSocket, you have to understand what the format of the data representation is, and that means that you've got a lot more work in deserializing the payload potentially, unless you've got a library and both ends have agreed that that's the way it's going to be done. Emily: It's Emily again, producer for Pod Rocket, and I want to talk to you. Yeah, the person who's listening but won't stop talking about your new favorite front-end framework to your friends even though they don't want to hear about it anymore. Well, I do want to hear about it, because you are really important to us as a listener. So what do you think of Pod Rocket? What do you like best? What do you absolutely hate? What's the one thing in the entire world that you want to hear about? Edge computing, weird little component libraries, how to become a productive developer when your WiFi's out? I don't know, and that's the point. If you get in contact with us, you can rant about how we haven't had your favorite dev advocate on, or tell us we're doing great, whatever. And if you do, we'll give you a $25 gift card. That's pretty sweet. So reach out to us. Links are in the description. $25 gift card. Noel: Do you see a trend, I guess on the topic of libraries, that a lot of people that end up adopting streaming in some form are more often than not using a library off the shelf or even a platform? I'm thinking of Firebase or Supabase or something, right, where it's like you've got a JavaScript library that gives you a functional, stream-like interface where you're just like, I want to watch this object in the cloud. Tell me, the client, when this happens. Do you think that a lot of people that end up adopting streaming are going that route because of these difficulties that you're covering?
Phil: We are seeing libraries being used, and actually one of the things to consider, particularly when you're dealing with client devices, is that it's generally considered good practice to provide a library to conceal your streaming mechanism. And even when you're providing traditional REST APIs, it's considered good practice to give the developer an SDK, if you like, or a library, just so that they can write their code against that interface. It deals with all the handshakes, it deals with the issues of what authentication mechanism is to be used, or encryption and things like that. So it takes all those problems away for you. For any API, if the team have got the bandwidth to do it, provide your consumer with an SDK. It makes their life so much easier, and it also gives you a little bit more influence over how you implement things and how you therefore can change that handshake. Noel: Does Oracle offer any specific tooling to make that easy? Say you're a front-end or a full-stack web developer and you've got some data source that you want to start getting streamed to clients, say it's stock prices or something like you used as an example before. Would there be any tooling that you could pull off the shelf to make that easier, so you wouldn't have to write the WebSocket logic yourself? Phil: So within Oracle, in a lot of the technologies we use, we actually use common standards. A lot of products these days, not every product that we've got, have got an SDK. Probably one of the best examples I can give off the top of my head is the APIs into the actual cloud platform itself. When we are using the Oracle Cloud Infrastructure, you've got multiple ways of managing it: through the UI, through Terraform and so on. But it's also got an SDK for everything. And those SDKs are actually in five or six different programming languages. Regardless of your background, whether you're a Ruby or JavaScript or Python person, you can code your interactions if you want. And that's really powerful.
And then when we are using a lot of technologies, you can use the native API directly, and we tend to use proper RESTful principles and use JSON. So you can choose your own library then. It's about giving the developer freedom to work the way they want, but we empower them to work quickly as well if they don't want that fine-grained control. Noel: Gotcha. So I guess that maybe is a decent segue then into some of the different streaming API options. So we talked a little bit about webhooks and WebSockets. Are there any other potential tooling or technologies that people should be looking at? Phil: So there are two big ones outside of those that I tend to encourage people to look at, because they take a lot of the pain away. One is GraphQL, and GraphQL's getting a lot of traction. Its benefit comes from some of the problems that Facebook were facing 10-plus years ago, where a lot of people use Facebook from their phone, smartphone or whatever, and the web client or the client on the mobile device just couldn't deliver a good user experience, because it would make a RESTful API call for one bit of information, that comes back, then it has to make the next one and that comes back, and you have this huge great chain of API calls to gather data. And what they started to look at is, what if we put all these APIs together and have one uber API? So when you make a call you get all the data back in one great big clutch. But that really thrashed the back end, gathering a lot of data that the client actually had no interest in at all. So what they did is develop this GraphQL, and GraphQL has the ability to communicate in the request what attributes of the API are wanted. So the client only gets the data it wants, and the GraphQL server in the back end can mix and match data, which becomes really clever.
So rather than doing that continuous series of calls, GraphQL can give you back just the data you want across multiple entities, and what they've done is taken the next step and said, well, when you're streaming it's the same request, but all you're saying is I want you to send me the answer as it happens, with just these bits. So that's one, and GraphQL is incredibly powerful and flexible to do that. Writing the back end can be a little bit more challenging because it's so flexible. But it is one that I would always say is a technology to go and look at. It's an open standard, it's governed independently. Although Facebook drove it initially, it has now got its independent governance, so you're not going to get dictated to by any one organization. The other one that we're seeing picking up is gRPC, which was originally driven by Google but, like GraphQL, has now gone into governance under an independent organization. It's a lot lower level, it's quite a bit more technical, but it's highly performant and it's great for microservice-to-microservice communication. I wouldn't use it in a use case where you want more agility and the ability to evolve the payloads a lot more quickly. It does support that, but it requires a lot more work to do it, because it works a bit like how CORBA used to, if you've been around long enough to remember CORBA, where you define the payload and then it generates code and compiles that code that understands that payload so it can then deserialize it. It's not doing any interpretation or understanding a schema and saying, okay, well, that's that and that's this. Because it generates that code, the moment you want to change that definition at all, you've got to look at taking the new binary into both ends, the client and the server. Noel: Do you think that a lot of people that end up...
You're talking about gRPC being significantly lower level. Do you think that for most application developers, say they're writing full-stack web apps, so they've got a back end with some services they define and a React front end or something, do you think that gRPC would not be well suited to a lot of those use cases? Or, I guess, would you recommend they maybe reach for a library that kind of implements a gRPC interface on top of their existing model? Phil: So trying to put an abstraction layer on top of gRPC, because it generates the code, gets quite challenging. It is possible but it is not easy, and libraries of that sort of nature are going to end up being home grown more often than not. So I would always say to people, if you are unsure of the client or you have not got control of both ends of the communication, I would err away from gRPC unless you really are desperately in need of that performance. Noel: Gotcha. Phil: Which is why it's particularly good for microservice-to-microservice communication. Because in most cases people have got control of both ends. It might be different teams within an organization, but it's still within your organization, and you've then got that high-performance communication between the microservices, which is fantastic because it's not just sending text, it compresses the payload down and does all sorts of clever things to do with understanding how your payload structure works to optimize it. If you are uncertain, or you want to issue... you allow third parties who might not want to use your library or be forced into using your library, I would go GraphQL because it's essentially JSON based. It doesn't use pure REST paradigms because everything's a POST. But it's an awful lot easier to get on with and you don't need these compiling stages from the payload definition. Noel: So I haven't done much streaming work within GraphQL.
Say I'm composing a query and I want to stream it, is it all at the root query level where I tell the server end, stream these changes to me? How is that communicated in the API? Phil: You essentially just add one keyword, the word streaming, into the query or into the mutator definition. It doesn't make much sense on the mutator, but on the query you just add the keyword streaming and it will tell the back end. Of course, you are dependent on the implementer having built the means to have the back end recognize that something's changed and it needs to be pushed. But once you've got past that, then you are up and running. Noel: So when you're in this stream mode, say you have some deeply nested GraphQL query and there's a list 18 layers deep that could be being appended to, is only the data that is changing all that ends up being sent over the wire? Like the new items in this list, or how does that end up working? Phil: So it's not something I've seen done too much in terms of having lots of very deep nesting, but the way I believe it would resolve is that it's going to be at the top level. So you'll get the payload of the attributes you asked for rather than just the one attribute that may have changed. So if you were watching a price changing, you wouldn't get just the price. If you asked for price and the name, you'd still get price and name. So you've got the context in which that data change is working. And in that sense that's just the same as REST. You aren't looking at implicit context. You have to be explicit and you can't effectively rely on context within the communication handshake. Noel: Right, right. Then there's a handful of problems that you have to figure out. I guess the more sparse the data that you're getting back in a big complex query, in terms of streaming, the more work you probably have to do as a developer to figure out how this data fits into the whole.
Or if you're watching a list and you're only getting new items, well, is this at the top of the list, the middle of the list, the end of the list? It's simpler if you just always get the whole list. But it seems like it could be a bit more data over the wire. Phil: I mean, it depends on the expression of the query. The back end would look at the whole list, but if your query said, "Oh, and these two attributes are my first and last date of change," then you would only get a subset of the list. But it's still all the attributes you've asked for, not just the one that's changed. Noel: Yeah. Very cool. So how about, like you said before, WebSockets are, or sorry, webhooks are kind of in this weird space where it's behaviorally very similar to streaming, but it is still a much more... it is kind of polling, but we just flip a request in the middle to make a polling-like call where the server is now polling the client in a sense. Is there a reason that you would still steer devs to set up webhooks? What's the beauty there? Phil: So therein lies... One of the benefits of a webhook-style approach is that if the data is slow changing but you still want to effect that push, you're not keeping that communication channel open. The channel is only open for the data flow to start and end, and that makes it quite efficient. Another benefit is that the server doesn't need to know the client's web address until that client makes its request, and then the server can call it. It's almost like you're on a mailing list and you subscribe and you unsubscribe. I'll keep getting called on this API for as long as I'm interested, but don't tell me after I've had enough. Noel: Yeah, I feel like in a lot of more public APIs, that ends up being what a lot of people are leaning on. It's just like, well, we have this data that we want people to be able to subscribe to changes on. In highly public APIs of that nature.
I don't feel like I've encountered any GraphQL APIs in the wild that had streaming options available, but I feel like webhooks are pretty prolific at this point. Phil: Yeah, webhooks have been around a lot longer than GraphQL. I think a lot of people are using GraphQL to tune the payload that they get, and they're getting their efficiency gains that way, so it isn't as costly to do polling. And the other thing is, what use cases need that constant push? When it comes to client devices, there are fewer use cases than system-to-system use cases, where you want those systems to be a lot more communicative and you effectively get your processes going as quickly and as efficiently as possible. You want to keep pushing as much as possible, but perhaps you don't want to get involved with having brokers everywhere. You want to be able to just create a connection between two services by initiating that request, and that other service then just starts pumping data to you. Noel: Yeah, there still is, I don't know if simplicity is the right term, but it feels very agnostic still. Like, oh well, it's easier to join systems that may not be quite as tightly coupled as others would be. Or if you are using some other paradigm entirely, with the webhook, the ubiquity of making simple web requests is still just that. It makes it easy to integrate into a lot of setups. Cool. So I feel like we've covered a lot. Are there any other points or anything you'd advocate for to try to motivate devs to go consider streaming instead of polling for APIs, or setting it up? Phil: So the key really is, think about your user. How many times have we sat on an app that perhaps doesn't call the back end frequently, and you're expecting change, and you're sat there and the user ends up going, I'm going to just refresh the whole app or refresh the webpage, because you think, well, that looks out of date. I'm sure things have changed since I last looked at it.
So you end up refreshing the whole app, and that creates many, many more API calls on the back end, making it work a lot harder. So you actually end up creating [inaudible]. In those sorts of scenarios, consider streaming, and try to mitigate the fact that your back end works a little bit harder by using techniques like GraphQL, where you are stipulating just what you want quite effectively. And even on traditional REST calls, a lot of people miss quite a few tricks. There are HTTP header attributes that are great for communicating, I only want to know data since this point in time. So they're the kinds of scenarios that are worth doing. Frameworks like gRPC and GraphQL start to embed security mechanisms and frameworks to make your security a lot easier. Far, far more robust and secure than potentially just creating your own WebSocket solution. With WebSockets, it's like many things: if you go down to the very basics, you've got to build all the extra supporting layers, your monitoring, your security, yourself. Why do that when people have gone through the pain of fixing that for you? And security in this day and age is not to be treated lightly. Noel: Especially with the amount of user data and everything we're talking about. Even just, if you start trying to send everything through a streaming paradigm, it's kind of like, well, I don't know if we need our auth calls to be going over a WebSocket. Maybe that's not the best plan. Yeah. Cool. I guess more broadly, is there anything else you would kind of point listeners at? Anything they should check out? Anything on the horizon? Phil: I think that the key ones that we've talked about are going to be around for a while. I've seen sort of nuances and variations on the themes that we've talked about, but sockets, if you want to get down and dirty, are the main way to go.
If you want to go streaming, if you want to deal with something that's akin to REST, then go GraphQL. If you want to do something that's a bit more hardcore but alleviates some of the performance considerations, then gRPC. Others are all just variations on that. There is server-side push, and we'll see, I think, in HTTP/3, some new features around that. Server-side push today has a whole raft of limitations because you can't acknowledge the receipt of a code, and that's something you've got to consider, particularly if the client disappears. On a mobile device, it moves to a different network. It's going to end up creating a fresh connection for you. How do you know it's the same session? How do you know that previous session is dead? Noel: Yeah, there's a lot to consider, but I appreciate you coming online and kind of, I don't know, demystifying stuff, talking API with us a little bit. It's been a pleasure. Thank you so much, Phil. Phil: Thank you, thank you for having me on. Emily: Hey, this is Emily, one of the producers for Pod Rocket. I'm so glad you're enjoying this episode. You probably hear this from lots of other podcasts, but we really do appreciate our listeners. Without you there would be no podcast, and because of that it would really help if you could follow us on Apple Podcasts so we can continue to bring you conversations with great devs like Evan You and Rich Harris. In return, we'll send you some awesome Pod Rocket stickers. So check out the show notes on this episode and follow the link to claim your stickers as a small thanks for following us on Apple Podcasts.