David Egts: Back in Episode 261, we were talking about, I think it was British Standard 7666, about building reliable gazetteers,…
Gunnar Hellekson: Yes. Yes.
David Egts: And then we mentioned Steve Pousty, as we do whenever we think about GIS. We have that correlation.
Gunnar Hellekson: I love that. Yeah. Immediate, that's right.
David Egts: And we all know, too, it's like if you say Steve Pousty's name three times in the mirror at night, he magically appears, right?
Gunnar Hellekson: Yeah, yeah, that's right. That's been my experience. Anyway, that's it.
David Egts: Yeah, so he's on the show right now. He's with us. We've channeled him.
Steve Pousty (TheSteve0): Yeah, you guys did it three times last night. I had to show up. I'm bound by it,…
Gunnar Hellekson: That's amazing.
David Egts: Yeah.
Steve Pousty (TheSteve0): So I couldn't not be here.
Gunnar Hellekson: Some kind of delightful GIS demon. I love it.
Steve Pousty (TheSteve0): Yes, you're at the cocktail party and you need someone with geospatial knowledge to fill in: Steve Pousty. Yeah.
David Egts: Life of the party. Yeah, yeah. So we're glad you're here.
Gunnar Hellekson: Great.
Steve Pousty (TheSteve0): Me too. I have to say your podcast is one of the most fun interviews I've had,…
David Egts: We try. Yeah. Good.
Steve Pousty (TheSteve0): and I love the way you edited it last time, because before we actually podcasted I was talking about how I had got stung by a wasp and then I felt kind of dizzy. That was the opening voice cut you put in the podcast, "and then things start to get itchy," and then you moved into the rest of the podcast, which I thought was great, because I love real-life stuff. So anyway,…
David Egts: Yeah.
Steve Pousty (TheSteve0): So, I'm excited to be here. Let's go to that. Here's the thing: I can't remember exactly your point, though I do remember this part, and then you can tell me if there were other parts to it,…
David Egts: Yes, yes.
Steve Pousty (TheSteve0): but I do remember them saying they were changing their street names so that computers could deal with them better, and that is garbage.
David Egts: Apostrophes. St. Mary's: get rid of the apostrophe.
Steve Pousty (TheSteve0): Yeah, garbage. You don't have everybody in the country change the names of their streets because you can't program a database correctly. That is ridiculous. Technology is a tool, not our god,…
David Egts: Yeah.
Steve Pousty (TheSteve0): so yeah, no, do not do that. And then there was something about standards coming up, like paying for standards and how much they cost,…
David Egts: Right.
Steve Pousty (TheSteve0): and all that stuff actually comes up in a bunch of places, and I hate it. You have one of those with the European Petroleum Survey Group. When you go from 3D to 2D, you use math to squish the orange, is what I like to say, right? Because you can't keep everything consistent…
David Egts: Yeah.
Steve Pousty (TheSteve0): when you go from 3D, which is a globe, to 2D. So there are all sorts of different mathematical formulas, both for the shape of the earth and for the mathematics you want to use to put it together, because the earth is not actually round. Even mountains matter,…
David Egts: Right.
Steve Pousty (TheSteve0): so they have a standard which lists them, and they have a number that goes with each one. You can use the number. That's not a problem, and everybody knows what they are. But if you actually want to get your hands on the book, the document that talks about it, you've got to pay. Why are we doing this? That is crazy.
If you want to be this good.
Gunnar Hellekson: But are they doing it to recover the cost of the research? Or what's the...
Steve Pousty (TheSteve0): Yeah, the petroleum group really needs that. No, they're not making a big... They're not.
David Egts: That's our exit. Yeah.
Steve Pousty (TheSteve0): Yeah. I saw a petroleum executive the other day with his cup, shaking it around. No, they don't actually calculate those things. They just bring it into a standard and have the meetings around it that make it a standard and stuff, but they're not doing the real work. And gazetteers, that's a blast from the past. Gazetteer, right?
David Egts: Yeah. Yeah, I had to look up what a gazetteer was. It's only a fancy word that people use in English clubs and stuff.
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): And so, if I remember correctly, because I remember when we were talking about it, I think a gazetteer is: you look up a name and it tells you where that is on the map, right? So,…
David Egts: Yeah. Yeah.
Steve Pousty (TheSteve0): you know what we call that? We call that, basically, geocoding. Not reverse geocoding. Geocoding is: I have an address, put me on the map.
David Egts: Okay.
Steve Pousty (TheSteve0): I have an address. Where is that in the real world? So geocoding, basically, is what you're doing. You want another fun fact about geocoding? It is an embedding technique. With all these people talking about embeddings in AI,…
David Egts: Sure.
Gunnar Hellekson: Sure.
Steve Pousty (TheSteve0): the reason we need embeddings is because this is unstructured data. A big image doesn't mean anything in and of itself. The computer doesn't know what to do with that. It's not an ASCII character or whatever.
So when you geocode, you're actually taking unstructured data, which is basically an address, and you're turning it into a two-dimensional vector, which is the X and Y coordinates of that spot on Earth. So when I teach vector databases, I tell the GIS people: you've been doing it all along. Don't worry about it, because how you get the vectors can change, but that's still vector encoding.
Gunnar Hellekson: Yeah. And so, tell me, I just realized you're exactly the person who's going to explain this to me. So we have addresses,…
Steve Pousty (TheSteve0): Yeah. Yes.
Gunnar Hellekson: which address some of the problem space, but you can't use a street address to get to anywhere in the world. And so you have other, alternate, I guess, geocoding systems. You've got what3words, which is a thing.
Steve Pousty (TheSteve0): Yeah, that's a...
Gunnar Hellekson: And then Google Maps keeps wanting to give me some kind of goofy hexadecimal hash, which is apparently easier than the GPS numbers, I guess, or the latitude and longitude. But what I notice is everybody seems to want to solve this problem: what3words, Google's goofy hex whatever, and then, of course, we've got traditional latitude and longitude. Do we need so many systems?
Steve Pousty (TheSteve0): Right. No.
Gunnar Hellekson: Or what is the...
Steve Pousty (TheSteve0): I don't understand why they came up with what3words, and I don't really understand the use case. I've always kind of pooh-poohed what3words. Starting to tell people where you live on the earth: "I live on franklin, ice cream, something else." How is that any better than two simple X and Y coordinates? Right? That's just ridiculous.
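A minimal sketch of the point above, that geocoding is a kind of vector encoding: free-form address text goes in, a two-dimensional (X, Y) vector comes out, and once you have vectors you can do math on them. Everything here is an assumption for illustration: the `GAZETTEER` table, the `geocode` and `haversine_km` names, and the approximate coordinates are hypothetical and not from any real geocoding service.

```python
import math

# Hypothetical mini-gazetteer: normalized address text -> (longitude, latitude)
# in degrees. Coordinates are approximate and for illustration only.
GAZETTEER = {
    "1600 pennsylvania ave nw, washington, dc": (-77.0365, 38.8977),
    "1 ferry building, san francisco, ca": (-122.3937, 37.7955),
}


def geocode(address: str) -> tuple[float, float]:
    """Turn unstructured address text into a 2-D vector (x=lon, y=lat)."""
    # Normalize case and whitespace so small formatting differences still match.
    key = " ".join(address.lower().split())
    return GAZETTEER[key]


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in kilometres between two (lon, lat) vectors."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # 6371 km is a commonly used mean Earth radius (spherical approximation).
    return 2 * 6371.0 * math.asin(math.sqrt(h))


if __name__ == "__main__":
    dc = geocode("1600 Pennsylvania Ave NW, Washington, DC")
    sf = geocode("1 Ferry Building, San Francisco, CA")
    print(dc, sf, round(haversine_km(dc, sf)), "km apart")
```

A real geocoder replaces the lookup table with address parsing and interpolation against reference data, but the shape of the operation is the same: text in, coordinate vector out, and distance between vectors becomes meaningful.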
Gunnar Hellekson: So for what it's worth, it was explained to me as useful in disaster, FEMA-type scenarios, right, where somebody doesn't necessarily have an address, or you're trying to communicate over a bad radio or something like that, and you can use the three words instead of reading out 16 numbers to get the same data. Which is fine, I guess. Seems like a very elaborate way of solving that problem, right?
Steve Pousty (TheSteve0): It sounds like somebody's making up something to justify themselves, right?
Gunnar Hellekson: Yeah. Yeah.
Steve Pousty (TheSteve0): What3words, I don't think that's okay. But not having addresses everywhere is a very real problem. In a former life, I worked for deCarta, and deCarta was the LBS, location-based services, engine under Google when it first launched, and it was under Yahoo Maps and all that other stuff. Back in the day, when there were aggregators, before OpenStreetMap and before the Overture Maps Foundation, there were Navteq and Tele Atlas, and they gave you good worldwide data. They were verifying directions, speeds, all that stuff. And we were in India.
Steve Pousty (TheSteve0): And most of India, especially outside of the cities, has no addresses. It's all wayfinding. It's all: go down here, take a right at the dairy, then take another left, then you'll see this old store that's kind of closed. Go there, do this. That's the only way they do addresses. And I'm guessing they get mail, because they know a city. They've probably standardized for cities, and so you just say so-and-so in this city, and then I guess the letter carrier is assumed to know: I know where that person is, and…
David Egts: Yeah.
Steve Pousty (TheSteve0): I'm going to bring it to their house. So yeah, I don't think there is a good... I don't think what3words really solves the problem of creating an address for a place, and then you also have to get people to adopt addresses.
They're used to wayfinding. When I was a kid, you did that, and still up until even when I was older than a kid. It was amazing that you could print out the, not Google Maps, who was it?
Gunnar Hellekson: MapQuest.
David Egts: Yeah.
Steve Pousty (TheSteve0): MapQuest, yes. You could print out the entire MapQuest, and…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): I want the image too, because I want to see what it's going to look like. And then I got all the instructions. It's kind of wayfinding, right? It's like: turn left here, do this here. Up until we had portable navigation devices with us at all times, that's the way we moved around, right? So I don't know.
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): Yeah. I think people...
David Egts: I heard in Russia, house numbering wasn't in numerical order. It was in the order that the houses were built.
Steve Pousty (TheSteve0): Right, right. So it becomes like, if you want to know how old the house is, or what's the oldest house on the street, you've got it. Yeah. The other thing is, that's also a problem that we run into. Suppose I had a piece of property. It was just land, and there's a house on one side and a house on the other side, but nothing in the middle. So we give this house six, and since we're on the right side of the street, it's six and then eight. And then, in the empty lot in between them, I decide to build on my property, and I put a house there. Does it become 6A? Do you renumber the entire street? That has happened to my parents, where they lived on Grandview Avenue and it was 512. And then there were enough houses being built that it suddenly became: you're now 283. Right, they just totally changed the number on the house. The Russian system would avoid that, though, because you just keep going, even if there's a million places.
So it depends on what you optimize for. It's all about picking the right tool for the right job.
David Egts: Yeah. And I'd have to dig it up, but there is an article I saw too about how different cultures use different ways to orient themselves. And there's one where they had an example, it's like, "You have a fly on your northwest leg."
Steve Pousty (TheSteve0): Yeah, I think it's in Australia, or I don't remember. It's some native group where there is no left and right. It is all given based on…
David Egts: Yeah.
Steve Pousty (TheSteve0): where north, west, whatever it is. Like, I'm on your west side,…
David Egts: Where the sun is? Yeah.
Steve Pousty (TheSteve0): and then if I turn this way, you become on my east side. You're not on my right side anymore.
David Egts: Yeah. It's relative to the person. Yeah.
Steve Pousty (TheSteve0): You're right. From where you are. But I guess they probably have some front and behind. I don't know. But left and right don't exist. It's just the direction of...
Gunnar Hellekson: When I was growing up in Hawaii, we had the four. It took me forever to learn the cardinal directions when I moved to North America, because in Hawaii it was always: you were mauka, which is mountain side, or makai, which is water side, or…
Steve Pousty (TheSteve0): Right.
Gunnar Hellekson: Ewa, which is basically one mountain, or Koko Head. And so that was the Ewa plains and…
Steve Pousty (TheSteve0): Right.
Gunnar Hellekson: Koko Head, and that kind of oriented you left and right in the city. And then you are either water side or…
Steve Pousty (TheSteve0): Yeah.
Gunnar Hellekson: "No, no, it's mauka of that old bakery."
Steve Pousty (TheSteve0): Right. Yes, "Smith Street, mauka," right? Or whatever. Yeah.
Gunnar Hellekson: Right? Yeah, yeah, right.
Gunnar Hellekson: I got to the US and I heard people throwing around north, south, east, west, and I knew in principle what that meant, but I couldn't get it in my head. It took me several years to get used to an alternate orientation system.
Steve Pousty (TheSteve0): And that makes sense, though. If you think about it for real, we can't sense cardinal directions. So what good does it do to say, "It's north"? Great, which way am I facing right now? How do I know…
David Egts: Right.
Steve Pousty (TheSteve0): which way is actually north? So I feel bad for people in Iowa or some of the plains of Wisconsin. How do you orient? There's nothing there. You can't say "this side of that," because you can't see it.
David Egts: Yeah.
Steve Pousty (TheSteve0): Whereas here, I get mountains, ocean, and then in between, you go down Watsonville way or up to Half Moon Bay. You can say which way I'm on Route 1, the Half Moon Bay side or the other, and then it's on that side. Yeah, so addresses, I'm going to bet they're yet another thing that was forced on humans by moving into big cities and stuff. And north, south, east, and west, I bet it was only really used by explorers and things like that who had a compass with them. So there's my little take on that one. But was there something about open source? So, back to standards. I think, if you want to be a standard, you have to be open source. That is my feeling: if you want to be the standard,
Steve Pousty (TheSteve0): and everybody has to build to that standard or do something that's standard, then you cannot deny people access to understanding that standard based on their ability to pay. You can, but I don't think that's right, is what I'm saying. So all standards should...
Gunnar Hellekson: But, I agree with what you say, and it is also true that somehow we've managed to build a pretty sophisticated society with the vast majority of standards actually being gated. Plumbing standards: gated. Building…
Steve Pousty (TheSteve0): Yeah.
Gunnar Hellekson: standards: gated. Huge chunks of stuff that we do are kind of locked behind these either credentialed or for-pay, you know, whatever.
Steve Pousty (TheSteve0): Yeah. But I think that partly contributes to the concentration of power in corporations, if we're going to get into that kind of stuff, right? Because those…
Gunnar Hellekson: Right, yeah. Right.
Steve Pousty (TheSteve0): who have money can make money, and those who don't have money, cry as hard as they may,…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): they still can't. "But I can't afford the standard, and so therefore I'm going to be the apprentice electrician until I understand the standard, and then I can go do what I want to do." So I don't like it at all. And then I think that was also related to, you guys were talking about data availability, I think. And I think part of my outlook on that standards stuff also comes from being in the United States. We have been very fortunate, still not with healthcare, but all data created by the US government is in the public domain, and local governments as well. All data created by the government has to be given to the people. And so when I was going to early GIS conferences, you'd go to try to get Canadian or UK data. Okay, pony up.
Steve Pousty (TheSteve0): How much? And it was not cheap, either. They were trying to recover some of the cost of the work that they did in producing it,…
Steve Pousty (TheSteve0): but the US didn't have that. So I always felt like it should be that way. Canada has now become much more that way, and…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): the UK has done that as well too, where they've started to open up theirs.
Gunnar Hellekson: Do you feel like the broad and free availability of that data kind of... Because I'm thinking in my head about where most of the interesting GIS software is happening, and I think it's all happening in the US. Am I wrong about this?
Steve Pousty (TheSteve0): On this part, yes. There's a good contingent in Europe, not a big company, but a good contingent of people working on it there.
Gunnar Hellekson: Yeah. Right.
Steve Pousty (TheSteve0): I mean, there are people everywhere, but at least in the FOSS4G world, the free open source for GIS, there's a big European center, and then there's North America,…
Gunnar Hellekson: Yeah. Yeah.
Steve Pousty (TheSteve0): the USA and Canada and stuff like that. Yeah. And your point about that was... sorry, you were going to bring it...
Gunnar Hellekson: Just that the availability of data is kind of a pure good all by itself, but it also creates a bed for innovation that can happen,…
Steve Pousty (TheSteve0): Completely, completely.
Gunnar Hellekson: right?
Steve Pousty (TheSteve0): Have you guys heard of the Overture Maps Foundation?
Gunnar Hellekson: No. You mentioned that earlier, in the same breath as OpenStreetMap. I know OpenStreetMap, right? But tell me about Overture.
Steve Pousty (TheSteve0): Right. Remember I said I wasn't going to talk politics? I'm about to discuss OpenStreetMap, and that's a hot-button issue.
David Egts: There we go.
Steve Pousty (TheSteve0): So this part of the discussion about OpenStreetMap is solely my opinion and not any official position for anybody. OpenStreetMap is awesome. I have been contributing since my kids were little. I actually took them to...
We went to a mapping event in San Jose where we tried to find toilets, public toilets, right? And you can see pictures. They sent out a flyer with my kids and my dog in it, and it was great. So I love OpenStreetMap, and I know Steve Coast, the founder. He and I go way back, and I like him. I don't like the licensing, though. I think the licensing stifled innovation,…
David Egts: Mmm.
Steve Pousty (TheSteve0): It was one of those... If I remember correctly, and I haven't looked at it in so long because I just gave up on it, it's something like: if you make any changes to the data, you're required to give it back and feed it back into the original data. So if you wanted to improve the data and you wanted some advantage from that, you couldn't actually use their data to do that, which I always thought was wrong. I don't like that idea.
Gunnar Hellekson: So it's like a GPL equivalent.
Steve Pousty (TheSteve0): Yeah, yes.
David Egts: Yeah.
Steve Pousty (TheSteve0): But for some reason, I want to say it's a hardcore GPL. There is something about it where I was like, whoa. Maybe it was that I thought a lot of corporations would love to collect more data and maybe give it to OpenStreetMap if they were... I can't remember what it was, but it was hindering corporate... I forget what it was. Maybe I'll send you the notes on it or something once I've figured it out. That's their license. But the thing is, they've had discussions about licenses, because they wanted to open it up for more people to use, and that community in general is pretty cantankerous, and they're pretty "our way or see you later." At least that's been my experience. It may have changed. I haven't been as involved. So what ends up happening is, Meta makes a base map for all that Facebook stuff. Amazon makes a base map.
Steve Pousty (TheSteve0): Apple makes... everybody's making base maps with the same basic information over and over. And there's no gain from it.
It's just the same thing being duplicated a bunch of times.
Gunnar Hellekson: Right.
Steve Pousty (TheSteve0): So what the Overture Foundation is, which I think is great, I'll say that, what the Overture Maps Foundation is, it's a Linux Foundation project. And I like to think of it as the Linux kernel for map data. So what they're doing is handling all the base layers. They're not going to get into ice cream stores specifically. They're going to get into stores: what's the geolocation of stores, maybe their opening and closing hours, but not into high-end, particular data things, right? And then the licensing on that is more like an MIT license. It's one of the open data licenses, so I really like it a lot.
Steve Pousty (TheSteve0): It's under the Linux Foundation, so right now it still has a good funding source. And just like the Linux kernel, Meta is a big member, Amazon, Google's a big member, or maybe Google's not. I'm trying to think of the engineers I worked with there. The founding members have to give a certain number of FTEs to actually work on the project as well. It's like an in-kind contribution.
Gunnar Hellekson: I see. So with each of these separate base map fiefdoms, this is a way for them to join forces, kind of get some economies out of having one common base map,…
Steve Pousty (TheSteve0): Yes. Exactly.
Gunnar Hellekson: and then everybody can differentiate on top, because the licensing allows them to do it.
Steve Pousty (TheSteve0): And that's the point. Nobody's exclusively getting value because they have a little bit better base map. "This person's set of IDs is different from that person's set of IDs": that's not where the competition is anymore. It's what you put on top of it. The other thing, and if they pull this off it will be awesome: they're trying to come up with globally unique entity IDs.
So basically, unless that building is destroyed, in every release of their map data it will still have the same ID. And unless that road is removed... There, they're actually taking some of the OpenStreetMap data and processing it into longer segments, because OpenStreetMap splits the roads a lot. And so they're trying to say: no, we're going to do bigger. You can split all you want inside, but the ID of that road is going to stay the same. So therefore people could...
Steve Pousty (TheSteve0): which would be the best, is I can get the ID for the ice cream shop down there and it'll be the same. I can link that to all sorts of other information, with that ID as a primary key for everything. So that'll be huge.
Gunnar Hellekson: And where do you start? I mean, you have to generate it algorithmically, right? You have to, right? Yeah.
Steve Pousty (TheSteve0): Yeah, I don't know how they generate it. I don't know how they're doing it, but everything, like all their base layers,…
Gunnar Hellekson: Okay. Yeah.
Steve Pousty (TheSteve0): they're trying as much as possible to give them unique IDs, and…
Gunnar Hellekson: Okay.
David Egts: Mm-hmm.
Steve Pousty (TheSteve0): they'll stay the same over time. That's pretty cool.
Gunnar Hellekson: Okay, maybe this is a good segue into something I was curious about asking: I don't know if you've heard about artificial intelligence and machine learning? Yeah,…
Steve Pousty (TheSteve0): Wait, what?
Gunnar Hellekson: no, it's a... We're making rocks think. Kind of.
David Egts: No, we've gone 20 minutes and haven't mentioned it. Yeah.
Steve Pousty (TheSteve0): I know, I know, and I just started working for a company that does this kind of stuff as well. So,
Gunnar Hellekson: The real reason I'm asking is because I'm curious where you think... because obviously every industry is trying to figure out how to apply AI. It's construction,
it's medical, whatever. It's not immediately obvious to me how AI should be applied to the geospatial problem space. So is there anything interesting going on there? Or any kind of novel applications for GIS?
Steve Pousty (TheSteve0): Okay, so before we get started, now you've touched on a really big opinion issue. It's called AI because the tech bros wanted to sell more software and more services. It is a very, very excellent and amazing machine learning model, better than anything we've ever done before,…
Gunnar Hellekson: Yes. Yes.
Steve Pousty (TheSteve0): but it is still machine learning. It is not AI. So, Joe Rogan, you can let go of the idea that you're hurting something when you talk to it. I mean, I think that whole thing, one of the reasons they're doing that is because, okay, so here we're getting more political, but my opinion again is, I think the big companies that have built the big indices, right, the big databases, they used material they shouldn't have. And so to get around that, they say the AI "learned," because you can learn anything from material and then regurgitate it in a different way, and that's not copying, right? But they're going to try. So they want to say this machine was learning, and now it's giving back information after it learned.
Gunnar Hellekson: Yes, because there was a legal transformation that we went through here, right? Yeah.
Steve Pousty (TheSteve0): Yes. And so... so it's not...
Gunnar Hellekson: So sorry. When I said... decision trees.
Steve Pousty (TheSteve0): No, you can say it, because that's what everybody's saying, right? And we can, as long as we establish up front that when we say it, we're just using it because the industry is using it, and not that we believe it's AI.
Gunnar Hellekson: Yeah. Yeah.
Steve Pousty (TheSteve0): So then they make up all sorts of terms. "Hallucinations": it's not a hallucination when it says something weird that the model has predicted. It's called an error. The model produced an error. It's not hallucinating anything. You have forced the model to keep making words, and you said, "Make a word no matter how confident you are in it." So it makes the word. And if that word's crazy, then it's going to start going down a crazy path, because you told it, "Keep going no matter what." So it's not a hallucination. And so that brings up two things for me. On this one, I am really upset with the companies that produce this kind of stuff. They actually know the probability of each word as they're putting it down, what the probability of that word is. I've seen an OpenAI demo, they used to have this, where you could show it in different shades of red.
Steve Pousty (TheSteve0): If they had kept that up for the general public, it would be so helpful, at least in terms of understanding where I should pay attention because this thing might have gone wrong. So if it has a low-probability word, even if it's followed by a whole bunch of high-probability words, you should know that at that point in the discussion it might be really wacky, because it didn't really know what to put there. So it made up a word, basically, to put there. And they don't do it. And so what ends up happening is, when the statement comes out of the computer, it reads as an actual true statement. The whole thing is a fact. And so people, and this is why I don't like calling it AI and hallucinations and all that stuff, people believe computers. Remember numbers in an Excel spreadsheet?
Steve Pousty (TheSteve0): How many times have you heard people say, "That number is wrong." "No, no, but the spreadsheet said that, so it's that number," right? And so we already have this tendency to believe computers, if it comes out of the computer.
So when you get it starting to talk in natural language and looking pretty much like a human talking, it's 10 times worse. There's my AI thing. Yeah, and then we can talk about open source AI. But you asked about the geography thing first.
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): So, the geography thing. What I've seen, mostly there are two spaces. I've seen it using natural language processing, not quite as much. What I think I've seen most people try to do there is build an interface where you can tell it, by speaking, the analysis that you want: "I want all the blah-blah within blah-blah, and then make sure they're blah," and it'll turn that into the thing that actually does it, right? So I've seen some work on that. Most of the work I've seen has been more in remote sensing and computer vision, right, where these transformer models and all the high power we have now are doing better jobs of creating more accurate maps from remote sensing, or finding more accurate data from the remote sensing.
Gunnar Hellekson: Okay.
Steve Pousty (TheSteve0): Actually, Gunnar, you're still at Red Hat, right?
Gunnar Hellekson: I am.
Steve Pousty (TheSteve0): Yeah, so your company was one of the first ones. Along with NASA, I mean your parent company, not Red Hat but the parent company, released one of the first foundational models for remote sensing. It's called Prithvi or something like that. If you look, you can find it. But now there are other people coming out with other foundation models. There's a ton in other vision places, but there are ones specifically coming out for remote sensing, which is awesome, because I think the real future around this kind of stuff is: fine, thanks for giving me that big foundational model, that's great. It now knows how to speak and in general understands how language works. Now I want it to understand this stuff and then speak to me about this very specific thing, right? And the foundational models are great…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): because the earth is very... like, a rainforest in one place doesn't look like a rainforest in another place, so you can tune to the area. So that's…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): what I've seen. Yeah, and I've just started entering the world of computer vision. I've done image processing for GIS: different satellite images, aerial photos, photogrammetry. But I just started working at Voxel51, and they're all about the data that goes into the image processing and then evaluating how the model did afterwards. They're not building the foundational models, but they let you know... because, this is another thing, we can go on forever. Everybody keeps trying to build these bigger and better models, or everybody thinks that's what they should do. And if you look at the leaderboards, moving half a percent makes you the new leader, right? Which is immaterial, basically.
Steve Pousty (TheSteve0): But what I've heard from most people, and I can believe it given my own background in statistics, is: if you have $10,000 to spend on a new model versus $10,000 to spend on cleaning up your data, cleaning up your data will outperform tuning your model, anything you can do with the model, by far and away,…
Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): because that's dependent on the data that you use and how clean it is and what the distribution of it is. People's face colors: what's your distribution of people's face colors, right? So yeah, people should really care about data. I've always loved data, but...
Gunnar Hellekson: Here's a funny consequence of what you're talking about, making sure your inputs are clean. That's kind of a theme in the show that we've started: keeping your inputs clean.
Gunnar Hellekson: There's an open database of English word counts, so naturally occurring,…
Steve Pousty (TheSteve0): Mmm.
Gunnar Hellekson: how often does this word appear in the wild, right? Useful for lexicographers and dictionaries, the OED. But two years ago... They recently stopped building the database because they started scraping… David Egts: Yeah. Gunnar Hellekson: because of course they use the Internet as the corpus, and… Steve Pousty (TheSteve0): Yeah. Gunnar Hellekson: the amount of AI-generated English text on the Internet is now hosing the database because it's skewing it. Because now words like "delve" or… David Egts: Yes. Gunnar Hellekson: something are way more popular than they were before, because for some reason the LLM loves the word "delve," right? Now,… Steve Pousty (TheSteve0): Right. Right. Gunnar Hellekson: it's impossible for them to get a clean, human-generated corpus Gunnar Hellekson: of the Internet. The most recent one, I think, was from 2022, before it started getting poisoned. Steve Pousty (TheSteve0): Yeah, totally. Okay. Yeah, there's many things to go from that. Universal income. I mean, you think, what's the connection between that and universal income? Partly it's the tech people's fault, because when people building cars were getting replaced by robots, we were like, that's just the way of the future, they can work on a higher and better thing. Then what would give them time to do that? We'll retrain them. But now that it's coming for tech, you're like, we gotta watch out for this thing. Gunnar Hellekson: Yeah. Yeah. Steve Pousty (TheSteve0): The only thing I can see as a logical conclusion of this is some sort of universal basic income. If you keep making everything automated, if there's way more people than there are things to actually do, you can't expect... I mean, I don't know. Gunnar Hellekson: But we've talked about this on the show before: absent some mechanism like universal basic income, that profit, that margin goes somewhere, and absent a universal basic income, that margin accrues to capital, right? The bosses are going to take it.
Steve Pousty (TheSteve0): Right. Gunnar Hellekson: The company is going to take the money. The corporation is going to take the money. Yeah. And… Steve Pousty (TheSteve0): For sure. David Egts: Buy more GPUs. Gunnar Hellekson: so it would be nice to have this AI-driven, automated future where our whole lives are easier, and that actually ends up enriching the life of an individual as opposed to enriching a balance sheet somewhere, right? That would be Steve Pousty (TheSteve0): Yeah, and we're really far down that path right now of origin. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): I mean, it's both the Clinton administration... We weren't going to talk about politics. Done on that one. We're not getting a basic economy,… Gunnar Hellekson: So okay,… Steve Pousty (TheSteve0): income or anything. Gunnar Hellekson: we'll... Watch me pivot away from politics. So... What's that? Steve Pousty (TheSteve0): I got it, too. You want... you're good? I've been talking about,… Gunnar Hellekson: There, are you? Yeah, good. Yeah. Steve Pousty (TheSteve0): No, talking about data. The big hullabaloo, which I'm really passionate about, is: what does it mean to be open source AI? Gunnar Hellekson: That was my pivot. Okay, go. Steve Pousty (TheSteve0): Brilliant. That's what I look at Michelle. We're all on the same wavelength. David Egts: Yeah. Steve Pousty (TheSteve0): It's awesome. Steve Pousty (TheSteve0): OSI is messing it up. And I'm really disappointed in them. I don't know, Meta's on the board and Amazon's on the board, voting for what the definition is. I'm sure Oracle and Microsoft would have loved to be on the board for the definition of what open source software was. What are they doing on the board? I don't even understand why they have a vote at all. So that's one thing, and the next: basically what they've come up with, and I don't know who they're hearing it from or why they are so bent on this.
They are not requiring you to share your data to call yourself open source AI. And so they say things like, what if it's proprietary healthcare data? They can't share that. I'm like, you're right, they can't share that: they're not open source. There seems to be this alter thing happening with OSI right now… David Egts: Right. Steve Pousty (TheSteve0): where they want everything to be called open source, and it's just not. And so why are you trying to force it? We say that all the time in software. We told MongoDB, we told Redis, we told a bunch of people: that is not open source. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): You can do whatever you want, but you can't use that label. For some reason, here, they don't want to constrain the label and they want to make it spread much farther, and it's wrong. I wrote a whole blog post on it, and I can give it to you if you want to put it in the show notes, but they're violating so many different freedoms by not requiring the source data. That's basic: even if I can't compile the code, I should be able to look at the code and see if there's something in it. Without the data set, in a machine learning context, you do not have the source. There's absolutely no way you can get to the weights without looking at the source data. The source data together is like the code in open source software, and the actual model weights are like the compiled thing, that... yes, the binary, exactly. David Egts: Binary. Steve Pousty (TheSteve0): So I gave a thought experiment in that post as well about why data is required. Right now, this instant, every single instance of the PostgreSQL binary disappears off the planet. Gunnar Hellekson: but, Steve Pousty (TheSteve0): For some reason, whatever. The next instant, someone could grab that source code again and reconstitute that exact same binary. You might have different timestamps, and so therefore it has a different whatever, but it's still the exact same binary. Imagine, for some reason,
all the Llama 3.1 weights disappear; they're no longer here. Are you going to regenerate those without the data? There's no way to do it. And I'm so disappointed. And okay, I'll say this off the record, kind of, but I hope somebody seizes on this. I hope the Software Conservancy or somebody says, you no longer own the rights to that term, because you're violating the principle. I hope that someone seizes on it, because they're really, really dropping the ball. And now there's also talk... One of the proposals also was, we should give them grades. Steve Pousty (TheSteve0): Like, you're this level of open source. I was like, why are you starting with AI to do that? If you want to do that, start with software. You understand that really well, and… Gunnar Hellekson: Yes. Right. Yeah. Steve Pousty (TheSteve0): there's a long history. Start there, and if it works, bring that over here. But for now, keep it nice and tight. And if you want to expand in the future, you always can. And this is the other thing: there's nothing to stop anyone from doing all the things in it without the open source label. It's a marketing term, right? Basically, it's saying our license is approved by OSI, so we can call ourselves open source. But if I wanted to do all the things in the open source license without calling it open source or putting it under an open source license, I can just make my own contract, and I've done the same things. So there's nothing stopping any of these companies who say they want to be open source from doing all the steps. So why aren't they doing it already? If you're open source, do it. Why aren't you doing it now? It's because you want the marketing term, and if you're doing it because you want the marketing term, then that's not the way it should be for open source. Right. Gunnar Hellekson: So just to pull on the thread a little bit. So the… Steve Pousty (TheSteve0): Okay.
Gunnar Hellekson: here's my understanding of open source. Why is it important to have an open source AI? It's important, yeah, as you said, for reconstituting, for rebuilding. It's okay, all the binaries disappear, I need to be able to go do it again. I think what most folks are after when they hear the words "open source AI," what the intention of it is, is that I can somehow improve the model, or I can somehow enhance the training, or I can enrich the model somehow. And there's some kind of feedback loop between me and the builder of the base model, where maybe I keep some of my work and I have a fork that I enjoy, or… Steve Pousty (TheSteve0): Right. Gunnar Hellekson: or maybe I hand some of the data back and I enrich the base model, and… Steve Pousty (TheSteve0): Yeah. Gunnar Hellekson: that's the kind of dynamic that open source AI is trying to evoke or trying to encourage, right? So… Steve Pousty (TheSteve0): Yes. Gunnar Hellekson: which actually makes your critique... Gunnar Hellekson: It's even more perplexing that they would allow for a license like this. Because if you can't share the data or the weights, there is no way for anyone else to improve the model. Steve Pousty (TheSteve0): You can. Look, so, I don't know… Gunnar Hellekson: What are we doing here, right? Steve Pousty (TheSteve0): what they're doing is, they're sharing the weights. So they're basically sharing the binary, and they're saying, you can fine-tune it,… David Egts: That's open. Steve Pousty (TheSteve0): you can tune it. Gunnar Hellekson: Yeah. Yeah. Steve Pousty (TheSteve0): So you can add on after the fact and change the behavior of the weights, which to me is: we have an extension mechanism. We have a way of putting in plugins, we have a way of linking to other libraries, and you can do it that way. Steve Pousty (TheSteve0): Just even the most... If OSI wanted to be associated with good in the world, they would force this.
There are so many examples already where, without being able to see the source data... The data sets are so huge, you can't know what's in them. There was that whole example of a very popular vision data set built by some company; turns out there's like 12,000 c********* pictures in the whole thing, and the only reason that people know it is because they looked at the source data. Right? And there's a bunch of other examples like that. So to me, I don't understand how you can't require it. Just, here, trust me, here's the weights? That's one of the big things about open source: all bugs are shallow, right? Because we can all see the source code. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): In this case all bugs are not shallow. We don't even know if it's a bug, because this is a huge model producing a big set of weights, and we don't really understand how to analyze them yet. So, yeah, it's very, very disturbing to me, especially after what just happened with Redis and Elasticsearch, where it was pure: you either are or you're not. And then along comes open source AI, and we want everyone to be open source. Gunnar Hellekson: Yes. Yeah. David Egts: So to me,… Steve Pousty (TheSteve0): David, you wanted to say something? Yeah. David Egts: yeah, so I always go back to the good old days of open source, of: it's like buying a pack of hot dogs without knowing the ingredients, unlabeled, right? And the thing here is that I also think about the political aspects, the global politics side of it, where a certain model has to conform to a certain country and what they believe are their beliefs, right? David Egts: In Europe, they'll say, we've got to have our own AI, because the AI that is coming out of the United States has a very American slant to it. So even if I talk to a model, get an answer back, and then translate it to a different language, it's still going to have American values associated with it.
And if you provide those weights, you don't know what sources are being used. And you could also see where thumbs are being placed on the scale, in terms of "this is the wrong kind of thought." And so I think that's important, too. Steve Pousty (TheSteve0): Yeah, I mean, this is the thing I don't think most people understand, or maybe they're starting to understand now, about those AI models. They're just reflecting the data that they were trained on. That's all they're doing. David Egts: Yeah. Steve Pousty (TheSteve0): So if you train a model on the works of Shakespeare, it's gonna talk to you in Elizabethan English like that, because that's all it knows. And the same way, if you trained on... So, I'm watching Shogun, right? And at least at that time... Gunnar Hellekson: It's great. Steve Pousty (TheSteve0): It seems like,… Gunnar Hellekson: By the… Steve Pousty (TheSteve0): yeah, it's really. Gunnar Hellekson: by the… Gunnar Hellekson: official Dave and Gunnar Show endorsement? Steve Pousty (TheSteve0): Yes, yes,… Gunnar Hellekson: Shogun is great. Yeah. Steve Pousty (TheSteve0): it's good. And I think they're probably doing a really good job showing how the Brit probably acted, but that clash of values, when he was like, we'll just run away together, and she's like, that is not something that's in our universe. There's a totally different conception of self and… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): relation to society. But let's just say that I understand what Shogun's trying to say, and I'm correct in understanding that: it seems like death is not the worst thing to have happen in that Japanese society at that time. The worst thing is dishonor. So, all these people talking about the AI being alive, when the AI says, I don't want to die: that's because that's what we usually say. My hypothesis is, if you trained on that data, it would say, dying's fine.
Just don't dishonor me. It's because those are the values that are associated. But people don't understand that somehow. Again, back to the computer: it's truth, right? And they don't think about all the data going into it, all that stuff. So, it's not. Gunnar Hellekson: There's a term we've used on the show before, and if I remember correctly it came from a friend, Julian Sanchez. When I was living in DC, he was a guy I knew, and he came up with this term, which is fantastic. First, of course, it sounds amazing. The term is epistemic closure, right? Steve Pousty (TheSteve0): Like epistemology or… Gunnar Hellekson: This idea that, yeah,… Steve Pousty (TheSteve0): whatever that is. Yeah. Gunnar Hellekson: the epistemology is now closed. It is hermetic, right? Steve Pousty (TheSteve0): Yeah. Gunnar Hellekson: Nothing goes into it. And when you have a model, you have basically wrapped an entire knowledge set in Saran Wrap, because nothing is going into it, as you're pointing out, and you're only gonna get out of it what you put into it. So you're basically snapping a chalk line and saying, this is how smart it's gonna be, or this is the opinion that it's gonna have. Steve Pousty (TheSteve0): Right. Gunnar Hellekson: It's not gonna later learn unless we have a facility for RAG-ing our way into some other set of information. Steve Pousty (TheSteve0): Right, but it won't inherently come out of the information itself. It's not going to de novo think of, there's the four freedoms of open source,… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): if it's never read that stuff. And I think this is one of the other things. So in that blog post, for every one of the violations I thought of, I said, here's the way you can prove me wrong. Demonstrate this and I'll take back what I'm saying. But if you can't, then it seems like my point stands.
And so one of the ones I addressed is the fine-tuning idea, because some people say fine-tuning is the ability to modify the model, and I think, no, it's not. If fine-tuning is the exact same as modifying the model, I should be able to take the fine-tuning data, enter it at the beginning when you're building the foundation model, and I should get the same weights as if I fine-tuned on them. And if I don't, that means there's actually something different in the process. So go ahead, do that. Show me, and then we can talk about it, but I am pretty darn sure... Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): that if you put the data in at the beginning rather than fine-tuning on it afterwards, it makes a huge difference in terms of how that data is factored in. So it's not the same,… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): it's just not the same. So I think OSI is messing it up, and I hope somebody does better. And I was kind of upset with OSI at the time of the Redis stuff and all that stuff coming up, because it was obvious to me, and I think to a lot of people, what those companies were trying to do: they're trying to stop big corporations from taking their software and not contributing back, and basically keeping them out of the market too. You can make no money on your own open source software, so you're not a company. And I know open source is not a business plan, but Steve Pousty (TheSteve0): OSI was like, nothing, there isn't? We're not budging an inch. And I'd be like, maybe you should come up with a new license that can at least help these people who are trying to build a business. I want to encourage people to make open source software, and if you make it basically so that I make open source software and then Amazon, Microsoft, or Google is gonna own it, I have no incentive to do it unless I think I'm gonna get acquired. So, David Egts: Right.
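Editor's sketch: Steve's fine-tuning challenge from this exchange (train on everything from the start versus train first and fine-tune afterwards, then compare the weights) can be tried on a toy problem. The example below is only an illustration under loud assumptions: a one-weight model `y = w * x`, plain gradient descent, and made-up data, none of which comes from the conversation or from any real foundation model. It shows the two procedures landing on different weights; it does not prove anything about large models.

```python
def train(w, data, epochs, lr=0.01):
    """Plain gradient descent on mean squared error for the one-weight model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

# Hypothetical data: the "pretraining" corpus follows y = 2x,
# the "fine-tuning" corpus follows y = 3x.
base_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
tune_data = [(x, 3.0 * x) for x in (1.0, 2.0)]

# Option A: put the fine-tuning data in at the beginning, alongside everything else.
w_joint = train(0.0, base_data + tune_data, epochs=200)

# Option B: build the base model first, then fine-tune on the new data afterwards.
w_base = train(0.0, base_data, epochs=200)
w_sequential = train(w_base, tune_data, epochs=20)

print(f"joint training:      w = {w_joint:.3f}")
print(f"base, then fine-tune: w = {w_sequential:.3f}")
```

In this toy, joint training settles on a compromise between the two data sets, while fine-tuning drags the base weight toward the newer data, so the final weights differ even though both runs saw exactly the same examples.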
Gunnar Hellekson: I mean, the irony is the root of where OSI came from. It was the FSF versus the OSI, and OSI was actually the more business-friendly alternative to the more doctrinaire FSF. Steve Pousty (TheSteve0): Yeah. Yeah, and Gunnar Hellekson: So as usual, you live long enough to become the villain, right? Steve Pousty (TheSteve0): Exactly. Not me though. I always change. I'm strong beliefs, loosely held, strong belief in Seattle. So that's my feeling on the open source AI stuff. Okay, and I don't like all this focus on natural language processing. This is going to be a challenge for me as a developer advocate for a computer vision company. And I talked about this during my interviews with them, because we were talking about vision and text. Text is pretty easy to convey to other people, to convey what you're doing, and they can kind of understand it, because the grammar is... it's constrained. It's linear and it's constrained, and we all know how to read, or you have to know how to read to use these. And so, it's something we've learned. And so it doesn't take... but learning to read was a conscious thing. You had to go to school. Someone taught you to read. Steve Pousty (TheSteve0): We do vision without anybody teaching us; we just do vision. And so most people don't think about, okay, what's happening in our neurons, what's happening in the processing center of our brain, the whole pathway to get us to say, that's a dog, right? And what generalizations we've done over time, because there's a lot of steps that matter in image processing that you don't think about. What's the size of the pixel you're gonna use? Because depending on your pixel size, things have more or less resolution. Things that were a bush become just a blob of green, or, if you go too fine, that bush has black in the center, and it's gonna say that's not part of the bush because it's not a... So that's just one example of things that you have to start teaching people.
I mean, the hardest we get to, I think, in terms of a base level for doing NLP is you have to think about chunking. Steve Pousty (TheSteve0): And that is a real concern. What everybody's showing now is that how you chunk has a huge influence on what the output is. But that's about it. There's tokenizing. And then I really wish they would just say... Can I just say basically what these models are? Autoregressive time series, constrained by some input you put in. So, autoregressive means it keeps going in a direction that it wants to: once it starts in a direction, it's going to keep going in that direction. Steve Pousty (TheSteve0): It's a time series, except the time steps are words, right? And then the extra constraint is your prompt or whatever you said. And so it's gonna say, okay, I'm gonna predict this next word based on the words behind me, and I'm constrained in the universe of answers based on what the prompt was. Under the hood, I mean, there's a lot of stuff happening. At the highest level, that's what it is. But that's not the same with vision. There is autoregression in it, in that you pass little filter boxes around. A lot of times you chunk everything up; it's the vision version of chunking. You have to take an image and break it into small chips, and then you pass filters over it, stuff like that. So it'll be interesting to see how I'm going to Steve Pousty (TheSteve0): I've done vision, image processing, for the longest time, and so it'll be fun to try to explain it from the beginning to people that aren't GIS people, because we learned remote sensing, and then you get a lot of these concepts. So I'm excited for that part, it should be very fun. And I like it… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): because, when we say there's a lot of different use cases for NLP, it's always, basically,
some sort of better text interface to something else that we have, or a better knowledge base for something that you have. With image processing, there's so many different things that you could actually be trying to do, right? So there's object detection: is there an object? Where is it? And then you can have segmentation, which is either, find me all the pixels that are this class, or, for this entire picture, I want you to assign every pixel to a class, right? And so then you know where things are. You have that. Now, depth perception: there's also depth that people can work with. Just recently, a new model... I don't know how it works. It's magic. They just released a new model. The model doesn't need to know the focal length, and it's monocular. So it's a single shot. Steve Pousty (TheSteve0): It's not like how we see, because we have two things separated,… Gunnar Hellekson: Yeah. Yeah. Steve Pousty (TheSteve0): this is just a single shot. They're estimating depth within a picture and keeping fine-grained detail. I'm like, magic? Yeah, so I think it was last week or something like that. It's amazing for today. But it was really Gunnar Hellekson: I mean, if I wrote something like that, it would just be: the stuff in the center is closer. I mean, it's like, there's nothing, right? Steve Pousty (TheSteve0): That exactly, at the biggest scale, that we have the same. Gunnar Hellekson: It's like, Steve Pousty (TheSteve0): I think it's called Tobler's law. In the geography and GIS field, it's: things closer together are more similar. Gunnar Hellekson: Right. Yeah,… Steve Pousty (TheSteve0): Right? And… Gunnar Hellekson: right. Yeah. Steve Pousty (TheSteve0): that works pretty well for predicting stuff. So, yeah, we'll see what happens. It's fun. I mean, I went to a startup because I decided I really don't... I can work for a big enterprise company, that's not a problem, I can, but I enjoy working for startups, even if they're not going to succeed. I mean, I'm not in it to make a million bucks.
Did you guys ever see the movie Hoop Dreams? Did I talk about this in the last podcast? Gunnar Hellekson: I think so. Steve Pousty (TheSteve0): Have you ever seen the movie Hoop Dreams? It's… Gunnar Hellekson: I haven't. David Egts: so a while ago, Steve Pousty (TheSteve0): It's a documentary about two inner-city boys who are both basketball players, and their life as they're doing their last year of high school and then try to go on to college. And as part of the movie, they show the statistics on what's the chance of becoming, like, an NBA player, and they basically go: there's this many kids in the world who play basketball. This many will play in high school. This many will play in college. This many will play in something after. And so basically that's the hoop dream, right? I'm special, I'm gonna make it. And I think if you're going for that huge valuation, it's basically hoop dreams. Like, most startups fail, and that's just… David Egts: Yeah. Steve Pousty (TheSteve0): how it is. And you might be the exception, but go into it knowing that you're probably not exceptional. Everybody thinks they're Facebook, right? So I like it more because the whole company feels like a team. And there's politics, because humans are involved. So, anyways, there's politics. But the type of politics are ones that I can... I think a little bit better, where they're not going to frustrate me as much, because there's none of that. What I get frustrated with sometimes in enterprise companies is, ooh, I have this good idea, I want to do that thing. And what happens is that's in somebody else's business unit or something like that, and they have to protect their fiefdom. So no, you can't do that, we have to do it. And then you sit there and wait, and then they don't deliver. It's not their priority and all this, and it just never happens. Whereas in a startup, there's always more to do than there are people to do it, and you're like, I have this great idea. Yeah, that is a great idea. Go do it, right? Gunnar Hellekson: Yeah.
Steve Pousty (TheSteve0): There's just a lot less of that inter-team dynamic, people trying to retain headcount and… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): make sure their team looks good. So, I think that's the part I like, and I like the smallness of it, and being, I guess, closer to the action, also. Gunnar Hellekson: For sure, yeah, that's right. David Egts: Yeah. What you do is tightly aligned to the outcomes, and… David Egts: as opposed to you being a small cog in a huge machine, and you don't know if you're making a difference or not. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): Right, right. And I mean, there's some nice things, though. So, I got laid off from Broadcom. I didn't even bring this up. This is a whole other topic. You guys are gonna totally have to edit this. Maybe you split it in two: this is a good chunk of stuff, and this is another good chunk of stuff. I got laid off from Broadcom/VMware right after they were acquired by Broadcom. I think 50% of my unit has now been laid off, right. Broadcom's basically a private capital firm that pretends to be a hardware company. They buy the company, squeeze everything they can out of it. The first thing they do is just cut costs. So I got laid off. VMware, being a Silicon Valley company, had a very nice severance package. It was quite generous, and so, Steve Pousty (TheSteve0): I was like, I've always wanted to be a consultant, and I've got to pump the money in, additional money, so that I can actually try doing this. Hence my fancy new email that I was having you guys 17. David Egts: Mmm. Steve Pousty (TheSteve0): That's my company, Tech Raven Consulting. I love ravens. Ravens are amazing. And Steve Pousty (TheSteve0): I've been doing it, and I met the financial goals that I had set out for myself this year. But I don't like it.
by myself. I want a business partner, and then I'll dig consulting. A bunch of my work is, like, I'm subbing to another consulting firm, and that I like, because I'm not having to go out and do the BD, all that stuff. That is really hard work. Gunnar Hellekson: That is hard work, like,… David Egts: Yeah. Steve Pousty (TheSteve0): And it involves talking about money. Gunnar Hellekson: That's right. Steve Pousty (TheSteve0): Which is really uncomfortable for me. I'm definitely not a sales engineer, because sales engineers are comfortable talking about money, and… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): I'm a developer advocate, because I just want to talk to you about how great it is. They'll convince you to buy. I'll make you want to buy it, but you have to talk to them about how much it costs you. And so, I don't like that conversation at all. And then what also comes out of it is the insecurity about the paycheck, month to month. David Egts: Yes. Steve Pousty (TheSteve0): And I guess it's just my risk envelope. I'm not confident in my ability to sell, to make sure that... My risk envelope for this: dipping into my retirement is not going to happen. So Gunnar Hellekson: Yeah. It sounds like a startup is exactly the right balance between risk and… Steve Pousty (TheSteve0): Yes. Exactly. Gunnar Hellekson: flexibility and… David Egts: Yes. Gunnar Hellekson: freedom of movement and… Steve Pousty (TheSteve0): And that was the other thing when I was looking at startups,… Gunnar Hellekson: stuff like that, right? Yeah. Steve Pousty (TheSteve0): I mean, I'm a more senior person in general. And so what ends up happening is you think, I went into this thing. I was at Denver. That salary requirement, you better lower that a whole bunch. Otherwise, your salary is one third of a round, right? So, you can't be going in asking for that. And… Gunnar Hellekson: Right. Steve Pousty (TheSteve0): they'll give me equity, and maybe not, who knows. David Egts: Yeah.
Yeah. Gunnar Hellekson: Yeah, right. Steve Pousty (TheSteve0): But I make a decent salary. And I'll have some place to work, and I should get used to that and not being special. I think in the United States, everybody, we are taught this message that you Angelina. And my wife and I were talking about this: we're taught that you need to be the most something, as opposed to, I'm just gonna live a nice life. I'm not gonna be the most spectacular anything, but I'm having a good time doing that whole thing. My way I understand it. That's where I'm trying to get to right now. Gunnar Hellekson: That's great. Sounds like you're well on your way. Steve Pousty (TheSteve0): Yeah. Synthetic data. You want to talk about synthetic data? My first take on it with NLP was like, I kind of understand how it helps, but it doesn't help in the breadth of your model. Synthetic data, basically, my understanding of synthetic data is that it's what we used to call, in statistics, imputation. Like, there's a missing value. So the idea is you generate new data based on the distribution of your existing data. Steve Pousty (TheSteve0): So what's great about that is you more completely fill the parameter space, or the data space, right? Because you're putting new things in there which meet the probabilities. So that could be good. If your data is unbalanced, you could take some of the lower stuff and generate more of that to maybe at least help it have the same amount of weight. But the problem with it is it won't teach you anything new, right? You can't use synthetic data to say, in 20 years this is what will be happening, because you don't understand the relationship, and the data can't extrapolate that way either. These machine learning models in general are not based on understanding, they're based on prediction. Their goal is to predict, not to understand. And so, I think synthetic data is okay. With vision, it matters also, though, and I think it actually is good in vision.
Steve Pousty (TheSteve0): Because what they usually do is they take an image for synthetic data, and they'll chop it, crop it to the first half. They'll turn the contrast really different. They'll take it and rotate it this much. So basically, you permute that image a bunch of different ways it is that currently, so that it knows that that's a teddy bear whether it's sitting like this, whether the sun is shining on it or not. Like, you basically push,… Gunnar Hellekson: Right. Steve Pousty (TheSteve0): you can push it in ways they might actually get. So I understand… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): why they do that. I think that's actually a very valid use case there. It depends on how generally you Gunnar Hellekson: Remind me. This is related. So IBM Research built InstructLab. The user experience of it is some SME can write a FAQ, basically, frequently asked questions, and then feed it through the robot and end up enriching the model as a result. What it's actually doing behind the covers is taking the question and answer and then generating a bunch of synthetic data in order to coerce the model into incorporating the answers, right? Steve Pousty (TheSteve0): Right. Right. Gunnar Hellekson: It's not a step I would have come to naturally, but … Steve Pousty (TheSteve0): No. Gunnar Hellekson: I understand. Okay, yeah, that's kind of how it would have to work, right? If you're gonna go, like,… Steve Pousty (TheSteve0): Right. Yeah,… Gunnar Hellekson: rejigger all the weights accordingly. Yeah. Steve Pousty (TheSteve0): you have to. So basically what you're saying is, if I understand it, because I have to have it come back out of my mouth again to understand it: I want to add... We're building a model of Steve Pousty (TheSteve0): I don't know. I'm stuck on ice cream flavors, like… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): how to make ice cream flavors, all that stuff.
and I come in and I say, butter pecan, pecan, however you want to say it, is a valid ice cream flavor and it's made in the following way. Because it's not in this database currently,… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): we already assume it's not. So what's gonna happen is that foundational model, the weights or whatever, are going to predict the response to what I just wrote and then feed that predicted response back into the training set, or fine-tune, based on that training set, the foundational model to a new state. Is that basically everything? Gunnar Hellekson: So you have to, let me do it. Let me try to put it even simpler. My understanding is that InstructLab… Steve Pousty (TheSteve0): Okay. Gunnar Hellekson: makes it possible to shove a bunch of butter pecan into the model where it wouldn't have been otherwise, in order to coerce the model into answering correctly. Whereas it would have been Steve Pousty (TheSteve0): That's actually different. So you're saying it. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): I say, butter pecan, or, what's the flavor of butter pecan? Or, what's it made of? You're saying it's gonna answer and… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): then it's going to, not to me directly yet, its idea is it's gonna start generating data to shove the weights in such a way that they'll be better able to understand the Q&A answer. Gunnar Hellekson: Yes, right. Steve Pousty (TheSteve0): And then I get next more. Gunnar Hellekson: Yes. Yeah. Steve Pousty (TheSteve0): So basically your idea is to give it a bunch of questions that it currently doesn't have, right? Gunnar Hellekson: Yeah, it introduces a bias into the model, along the lines of the Q&A, right? Steve Pousty (TheSteve0): Yeah. Yeah, that's interesting. David Egts: Yeah. Steve Pousty (TheSteve0): That's, yeah. I'm playing a lot with RAG. Do you know Coté from RedMonk? He used to be at RedMonk, he's at VMware now. Gunnar Hellekson: Yeah. Yeah.
Steve Pousty (TheSteve0): So I saw a video of him trying to use OpenAI as a dungeon master. Did you see that one,… Gunnar Hellekson: Yes, I saw the… David Egts: No. Gunnar Hellekson: Yeah, yeah. Steve Pousty (TheSteve0): And if you notice, when he's doing that, he has to keep reminding the AI about the rules, right? He's like, No, don't forget kobolds, like, people don't know. And, Now you need to roll this type of dice, and, You almost had it. And I was like, what am I watching? Like, this is RAG. This is the perfect RAG opportunity. Gunnar Hellekson: Right. Steve Pousty (TheSteve0): So I contacted… David Egts: Yeah. Steve Pousty (TheSteve0): because we worked at the same company at that time, and I've known him since his Red Hat days, and I think he's great. And so I said, Hey dude, this is a perfect example for RAG, let's build it together. He's like, I don't have any technical skills that way. So, that's fine. You just have to get me the Dungeons and Dragons, I played Dungeons and Dragons also, but he wants fifth edition and I was first edition. Steve Pousty (TheSteve0): So I was like, got it. Let's try to find it. Sure enough, all the fifth edition rule books and some of the campaigns are in Markdown. So here we go, easy enough. And then LangChain, I think it's LangChain, yeah, LangChain has a Markdown splitter. And I used the model, and then I used Nomic, that's the embedding model, which I think has a pretty long context length, which is one of the reasons I chose it, so that I wasn't worried that… David Egts: Steve Pousty (TheSteve0): if I split on Markdown headers, that it wouldn't all fit into the context length. And so I've got the vector embeddings and the code that does that. Now I have to write the other piece, which is taking someone's query, querying the database, then sending that off to something else and seeing if it does a good job, right? But I… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): if RAG works, it's got to work in this case. Right, because… Gunnar Hellekson: Right.
Steve Pousty (TheSteve0): if I say, I throw an arrow or I throw a spear at the kobold, you're gonna,… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): hopefully get things that talk about kobolds and spears in the data set. Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): And it's gonna say, Hey, here's some context for spears, do this, constrain that answer. When I explain it to students, I'm like, how RAG works is, it's basically saying the foundational model was trained on the entire Internet, so it has a very general vocabulary. When you do RAG, what you're doing is you're saying, Hey, I don't need the entire vocabulary. I need you to focus on the stuff I'm sending in. Constrain your answers to be within the area that I'm trying to do. And so that's… 01:05:00 Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): what I was doing. You want to, although I have something in three,… Gunnar Hellekson: Sure. Steve Pousty (TheSteve0): I want to start with three. Another one is the debate between fine-tuning and RAG. So Grant,… Gunnar Hellekson: Yeah, good. Steve Pousty (TheSteve0): my former boss at Red Hat, and again, is part of the InstructLab team, not the technical team, but the marketing team. We, and some other people in our Slack group of the former team, were talking about, should you fine-tune or should you RAG? Right. And I'm of the opinion, because I think of it as me, you should do RAG, because it's super cheap and it's easy to understand and it's cheap to do and you don't need data scientists to do it, you can just do it on any commodity hardware. And then I interviewed with a company that was doing, Steve Pousty (TheSteve0): their whole thing was to help you fine-tune. And so I asked him,… David Egts: Steve Pousty (TheSteve0): I said, I always tell people to do RAG. What's your opinion on this? Fine-tuning is so much less expensive. And it's like… Gunnar Hellekson: Yes.
Steve Pousty (TheSteve0): what? And he's, he didn't say this, he wouldn't give an inch on RAG being any use at all, and that's fine. But it seems to me like the distinction is RAG is great if you're trying to decide, will it work? And you're just prototyping and… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): you're getting, or, maybe it's even better, but you don't have any data scientists on your team. And then fine-tuning comes in when you say, this actually is gonna work. I'm gonna keep this in production for a while. I don't want to maintain a database and that complicated architecture to send things through. I just want to use a fine-tune. Which I hadn't thought about. Gunnar Hellekson: That's right. Yeah. Yeah,… Steve Pousty (TheSteve0): So it's cheaper in the long run, but harder and more expensive in the short run. Gunnar Hellekson: which is, this is the magic of InstructLab, as it makes that process of fine-tuning much easier and approachable without a whole fleet of data scientists to go to actually do the fine-tuning, right? Yeah. Steve Pousty (TheSteve0): Yeah, at least, I did a talk at ODSC East, the Open Data Science Conference. It was good, that was one of the best ODSCs I've been in. It was one of the best conferences. All the talks were all about all this stuff happening, and all the big players are trying to make dashboards to make fine-tuning better for people, because I think they all know that's kind of where the money is. And what was I gonna say about? Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): There's something else that you said that I wanted to say something about though. Gunnar Hellekson: RAG, tuning, InstructLab, Grant. Steve Pousty (TheSteve0): Yes, got it. Thanks, it's not exactly what you said, but it kind of helped constrain my problem space. David Egts: What? Steve Pousty (TheSteve0): We get to the answer. Now, I lost it again. Steve Pousty (TheSteve0): InstructLab, fine-tuning, problem space.
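The RAG flow Steve walks through earlier (split the Markdown rule books on headers, embed each chunk, look up the chunks closest to the player's query, then send those chunks along as context) can be sketched with nothing but the standard library. This is only an illustration under loud assumptions: the word-count "embedding" is a toy stand-in for a real embedding model such as the Nomic one he mentions, the splitter is hand-rolled rather than LangChain's, the rules text is invented, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter

# Invented rules excerpt for illustration; not taken from any real rulebook.
RULES_MD = """# Monsters
## Kobold
Kobolds are small reptilian creatures that fight in packs.
## Goblin
Goblins are sneaky and fond of ambushes.
# Weapons
## Spear
A spear can be thrown; roll a d20 to hit, then a d6 for damage.
## Longbow
A longbow fires an arrow; roll a d20 to hit, then a d8 for damage.
"""

STOPWORDS = {"a", "an", "the", "i", "at", "to", "for", "then", "are",
             "in", "of", "and", "that", "be", "can"}


def split_on_headers(markdown):
    """Split Markdown into (header path, body text) chunks, one per section."""
    chunks, path, body = [], [], []

    def flush():
        text = " ".join(body).strip()
        if text:
            chunks.append((" > ".join(path), text))
        body.clear()

    for line in markdown.splitlines():
        match = re.match(r"^(#+)\s+(.*)", line)
        if match:
            flush()                      # close the section we were reading
            level = len(match.group(1))
            del path[level - 1:]         # drop headers at this depth or deeper
            path.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts, minus stopwords, crude plural strip."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    kept = (w for w in words if w not in STOPWORDS)
    return Counter(w[:-1] if w.endswith("s") and len(w) > 3 else w for w in kept)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, index, k=2):
    """Return the k indexed chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)[:k]


def build_prompt(query, hits):
    """Assemble a prompt that asks the model to stay within the retrieved context."""
    context = "\n".join(f"[{header}] {text}" for header, text, _ in hits)
    return f"Answer using only this context:\n{context}\n\nPlayer: {query}"


# Index once, then retrieve per query.
index = [(h, t, embed(h + " " + t)) for h, t in split_on_headers(RULES_MD)]
query = "I throw a spear at the kobold"
hits = retrieve(query, index)
prompt = build_prompt(query, hits)
print(prompt)
```

The prompt that comes out carries only the kobold and spear sections, which is the "focus on the stuff I'm sending in" behavior described above. The last step, sending `prompt` to a language model, is left out because it depends on whichever model you use.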
David Egts: We said it would be cheaper in the long run. Steve Pousty (TheSteve0): Yeah. It was a little bit off of that though. Gunnar Hellekson: You're talking about RAG being good for ad hoc work. But if you're gonna do it on a regular basis, probably it's cheaper for you to go fine-tune. Steve Pousty (TheSteve0): There's so many things that come up with this though. I have another topic that came out of it: should you have your own servers or should you pay the cloud vendors for your servers? And I was just talking with a couple people today who are in the field. It's morally irresponsible to pay cloud providers for your AI work. He's like, Thank you. Gunnar Hellekson: Morally. Steve Pousty (TheSteve0): I'm making it moral. He probably said something like, It's idiotic, or, fiscally irresponsible. It's probably But I can go. Right from here. This one part of New York I can just talk That was my smack meter going up. Especially with this GPU stuff,… David Egts: just, Steve Pousty (TheSteve0): unless you're a small company, there's lines usually to get onto that hardware. It's not like these are things that you automatically spin up and spin down. The small company, yes, right, because you don't have a big team of data scientists. But if you're a big company, the overhead on Amazon infrastructure is so incredibly high. When I say Amazon, in this case I just mean any hyperscalers. It's so incredibly high that within the first six months, you've recouped all your costs by just running it in some data center somewhere. So, that's… Gunnar Hellekson: Provided. Steve Pousty (TheSteve0): what, yeah,… Gunnar Hellekson: You have access to the GPU in the first place, right? This is, yeah. Yeah. Steve Pousty (TheSteve0): You could buy one because someone's not snapping it all. Yeah, and then we talked about why nobody's using AMD. That was fascinating.
This is one of the CTO of me Steve Pousty (TheSteve0): and the reason no one's using AMD is because AMD just said, here's some open source drivers, and chucked them over the fence. And he's like, nobody wants that. What they should have done is paid people to say, We're gonna learn to the AMD infrastructure. We're gonna port all the Python libraries to the AMD infrastructure, and then you'll get people to adopt it. Nobody has time for unpaid work to do that. That's a lot of work. That's not like a few tweaks here and there. So I was like, yeah, you're right. And then that made me think Intel's gonna have to spend a bunch of money if they want to get any place. That's amazing to me Qualcomm but Gunnar Hellekson: Intel is doing what you're saying AMD should have done. They've made big investments in stuff like oneAPI and a bunch of higher stacks of software in order to make their hardware more accessible to the rest of the market. Steve Pousty (TheSteve0): Yeah. Gunnar Hellekson: Right? With Steve Pousty (TheSteve0): Yeah, and I think Intel already learned that game. Even before GPUs were a thing, you'd go to the market and they'd be like, We built this library so you could do X on the Intel architecture, and it takes advantage of our special switches and… Gunnar Hellekson: There. Yeah,… Steve Pousty (TheSteve0): stuff. So they understood that game. Gunnar Hellekson: yeah. That's, Steve Pousty (TheSteve0): And so, it's all kind of crazy. I can't remember what I was saying. It was a good one, but I forgot. Gunnar Hellekson: That's all right. Yeah, you've got to go in five minutes, should be okay,… Steve Pousty (TheSteve0): Yeah, well, I've got to go in five minutes anyway, so Gunnar Hellekson: so I don't even know how to land this plane. David Egts: Steve, you said a lot, and… Steve Pousty (TheSteve0): I did. David Egts: I'm gonna spend the next three weeks doing show notes of everything. Steve Pousty (TheSteve0): Yeah.
No,… David Egts: We talked about fact-checking everything and Steve Pousty (TheSteve0): no, no, no, no, wait, time out. No fact check. What is that? You didn't tell me beforehand that there would be fact-checking. Like, that's ridiculous. David Egts: Yeah, right, yeah. Intern, I'll have my intern do it… Steve Pousty (TheSteve0): Okay. David Egts: With all that, for people to appreciate all the effort I put into the show notes, where should we send everybody to take a look at those show notes and… Steve Pousty (TheSteve0): Yes. David Egts: to dive deep on everything you said today? Steve Pousty (TheSteve0): So if they've been living under a rock and don't understand, they haven't seen the shining gleam and glow coming off of this podcast, they should go to Dgshow.org, and I absolutely listen to you guys on there. Definitely on Spotify, because I listen to you guys on Spotify. I'm almost positive you're on Apple. And for those stalwarts… Gunnar Hellekson: Yeah. Steve Pousty (TheSteve0): that listen to it somewhere else, I'm sure it's in one of the major subscription feeds. And your life has been, I mean, I brought you in, so that's great, but your life would be so much better if you listen to the show regularly. I mean, it's funny and topical, it's great. David Egts: Awesome. Gunnar Hellekson: That is a well compensated endorsement from Steve. Yeah. Steve Pousty (TheSteve0): Yeah. Considering I got zero dollars. David Egts: Yeah. You don't like it, money back. Yeah. Steve Pousty (TheSteve0): You got a lot of that one. Yeah. All… Gunnar Hellekson: That's good. Gunnar Hellekson: Steve, thank you so much for being on the show. Really appreciate Steve Pousty (TheSteve0): Sure, anytime you want to ask me back. I love this show. Steve Pousty (TheSteve0): This is so much fun. Just blah blahing about stuff. It's not blah blahing, but it's like I get to vent. The, yeah,… David Egts: Get you on your soapbox.
Steve Pousty (TheSteve0): yeah, and David, you're like, somehow we have this really long one. I'm like, as if you thought it was gonna be any different with me on the podcast. Next time you should schedule about four hours. Gunnar Hellekson: Yeah. All… Steve Pousty (TheSteve0): All right. Yeah. Thanks, you guys. Gunnar Hellekson: Steve, thank you. Thanks. David Egts: Yeah. Seriously. Steve Pousty (TheSteve0): Bye. David Egts: All right, that was fun. Gunnar Hellekson: It's great. This editable transcript was computer generated and might contain errors. People can also change the text after it was created.