AUDIO EDIT Adam Argyle - PodRocket ===
Jack: [00:00:00] Hi, and welcome to PodRocket, a web development podcast brought to you by LogRocket. I'm Jack Herrington. I'm here today with junior developer Adam Argyle, who's just learning CSS lately, and he wants to talk about how it works with AI, so let's talk about how AI sucks at the front end. Hey, Adam, how you doing?
Adam: I am good. I just learned about display: flex yesterday and it's blowing my mind.
Jack: Yes. One of the eight different ways of doing layout on the web. Ugh.
Adam: I have, though, mastered centering already. So even though I started yesterday, I can center divs.
Jack: Yes. Or you can pay Opus an obscene amount of money, apparently, to center a div for you. God.
Adam: Or can it? That's actually part of the conversation today. Can it do that? I've had it burn my entire session's tokens on one CSS task before, and then I went in and looked it up, and it was one CSS toggle that I did in DevTools. And I was like, I just wasted who knows how many dollars on you just sucking at that. So anyway. Other things it does great, though, so it's weird.
Jack: Yeah. People [00:01:00] talk about how they have five different agents running simultaneously, and then I'll watch my agent go off the rails and start just doing stuff, and it's like, dude, no, you're totally wrong at this point. Couldn't be more wrong. Stop. Please. I can't imagine doing it with five. But anyways, I was just joking up front. Adam Argyle is amazing. Adam and I spoke at Cascadia last year, actually. That was fantastic. And if you ever get a chance to see Adam, it's like a whole thing. He's just this amazing ball of energy, and it's just him. It's not an act, that's just him.
Adam: Thank you.
Jack: And, yeah, your blog is fantastic for basically showing off essentially everything that the browser can do. I remember when view transitions came out, your demos were fantastic. It was like, oh, check it out, I can do this, I can do this. And then I would go take a look at the code and it was that, and you'd added like five extra examples. It was fantastic.
Adam: Oh, thanks, man. Yep. I [00:02:00] obsess with it. This stuff's just so fun.
Jack: It is.
Adam: Kid in a candy shop. You gave me a new toy and I'm gonna go use it. And then my blog is just my little playground. At one point I called it the cesspool of web platform features, because it would break so many demos, because even the browser engineers, they work with unit tests, right? They're like, look, I made a feature and it does a thing. And I'm like, I combined it with eight other brand new features and crashed
Jack: And it blew up. Yeah. Awesome. Okay, let's talk about AI in the context of this. I seem to think that you might have some negative-ish opinions about AI and its ability to do UI work.
Adam: There's something really ironic going on, and some of it makes sense and a lot of it's solvable. But the irony is, you sit down and you use Lovable, or you write a prompt and you're like, hey, I have Markdown content, make me a cool site tool. And then it goes, wow. And you're like, damn, that's actually pretty good. And then you're like, okay, but I want you to tweak this one little thing over here. And it goes,
Jack: Yeah.
Adam: it, like, falls over. And you're like, you just generated a pretty good looking site.
And then I asked for a [00:03:00] tiny change and you died. That's weird. So that's one of the ironies of what's going on. It is quite good. It's very capable, but in other moments it's just weird with you, and you're like, what is going on here? So I decided to write a blog post about all my interactions with front end and using AI to generate things and to edit things, and just my observations on why it sucks. And really, here's my hot take, dude: CSS is gonna be the last language that models are good at. We'll get there, I'll explain it. But yeah, I think it's just really hard for LLMs.
Jack: Yeah, I think you may be right. And if I was listening to this podcast sight unseen, I might think, it's always done a pretty good job laying out forms and stuff. And yeah, it does a pretty good job at making the right input fields and doing that kind of stuff, because that part's pretty easy. But if you wanna do Adam Argyle-style CSS, where you really [00:04:00] leverage the best parts of CSS and make a really exciting, dynamic experience for the customer, yeah, I think we're gonna have some problems with that.
Adam: We got some problems. So the blog post goes through it. The first thing I'm like, it's trained on ancient garbage. They had to. That's all they had at hand. If you think about the history of CSS, they needed training data, so they go gobble up everything that they can, and the web is covered in CSS. And if you think about the amount of modern, good CSS it got, it was like a fraction, like 5% of the entirety of what it got. So obviously it's going to regurgitate ancient garbage that it found, and that's
Jack: Floats.
Adam: That's what it does.
Yeah, it does floats. Yeah. And that's why it doesn't really write CSS grid unless you ask it. It doesn't do logical properties unless you ask it. So it makes a very Latin, English, United States looking website. It doesn't give you something that's international or accessible, because it's just trained on what it's trained on. So point number one can be fixed. I'm expecting specialist models to eventually come out, and you could find [00:05:00] a CSS specialist model that omits the garbage from its brain and finds just the good stuff. So somebody retrains a new model that's smaller, better, faster, and specifically tailored towards dope front ends. I could see that totally happening, but for right now,
Jack: I'm totally with you on this, and I think we're gonna start seeing that more and more across all the vectors. You're gonna start seeing a model that's really good at MongoDB and another one that's really good at React and that kind of thing. Because one, these frontier models are starting to spike in price. We're actually starting to get charged the amount that they should have charged us. And that's gonna push us to local models or small models. And I think that's where you can get this really cool, hyper-trained stuff. Like, you can get the Adam Argyle
Adam: Yeah.
Jack: model that's not gonna give you floats, baby. It's not gonna give you old school table layout. No. It's gonna give you the real deal.
Adam: Yep. And yeah, it's like we'll just wait for that once the month because, [00:06:00] yeah, we needed to generalize that. We're still exploring what these things do, so it makes sense where we're at. I'm not chastising them too much.
It's just a result of the training data, and that's fine. So that's number one. The first point of this "why AI sucks at front end" was that it's trained on ancient garbage. And yeah, that's a provocative way to put it, ancient garbage. There's good stuff in there. But anyway, that was the first
Jack: You gotta do the clickbait thing, man. Content person to content person, every once in a while you gotta
Adam: It works. Yeah. They gotta come in with that. The second thing was, an LLM can't see. It doesn't have eyes. It's also not a human, it doesn't have a design eye. So there's all these things it does poorly. It's also bad at math. And all of this comes together. You can give it a picture, and it doesn't know what a picture is, Jack. It's like, oh, pixels on a screen, I need to turn that into a layout. Ooh, that's actually really awkward, 'cause I can't see the layout that you need me to see. It flies blind. Even if you give it a browser MCP [00:07:00] or a screenshot loop or all these different strategies, you still find that it's not that great. It's 'cause it's pretty bad at math, and layout math is some of the hardest math. You think about the DOM: by the time you get to the component you're trying to edit, how much layout math has happened to get there? And the LLM is just, I am in context corruption and pollution right now, just trying to conceptualize the size of this box.
Jack: It's actually the smallest thing, and yet somehow it's loaded up with all this garbage, and it's also using all this context about Next.js or React and all of that stuff that, within the context of that moment, is completely meaningless. It's just CSS.
Adam: Yeah. Okay, Jack, question for you. Have you ever asked an LLM to do something front end related?
And it said, Jack, I got you. It's done. And then you go look and you're like, there's
Jack: A hot mess.
Adam: It's a hot mess. You're like, hey, there's things missing. I can't even see them. Are they visibility: hidden? What happened? LLM, you really messed up. And it's like, you're [00:08:00] absolutely right. And then it goes back and it messes up again, and you're like, it's because it can't see, man. Even when you give it tools to inspect the DOM, even if the DOM is there, dude, anything could be impacting this thing to make it not visible. CSS and the page are just so wildly global and local, and all these things are coming together, that it's even unfair to ask an LLM to understand this dynamic layout, painting, and rendering, all this stuff together. However, I think it'll get there one day. I think we can give it enough tools and enough scenarios that it could start to be more confident and produce better results.
Jack: That is an interesting point, though. There's just not that many sites that really do it well. So the training set is actually a lot smaller than I would hope. And in general, I think there's a lot of sites out there where they're just looking for that kind of SaaS three-by responsive layout. Okay, there it is. And yeah, I've [00:09:00] definitely found that it can do that if you're doing stuff that's very similar to what you've done in the past. Like, oh, lay out tic-tac-toe for me. It can nail that. It can nail that a hundred times in its sleep. But if you wanna explain to it, okay, so we're gonna have this animated transition between here and here, yeah, good luck with that, man. That's just not happening.
Adam: Yeah, interactions are really tough for it. I fully agree. But there are tools that are trying to mitigate this right now. Have you seen Agentation? Agentation
Jack: No.
Adam: or Impeccable,
Jack: Ooh, no.
Adam: v0.
Jack: Yes. See,
Adam: Okay, so v0 has in its feature set, and a lot of 'em do now too, where after you've prompted and you get a thing... because what we used to do is, we would open up DevTools or see the preview and be like, oh, okay, that header text that starts with "global" whatever. Then you'd go to Claude and be like, okay, Claude, now find the header text that has this class name, that has this text content. Okay, now that you know what the hell I'm talking about, I want you to modify [00:10:00] the margin on it or whatever. And that was our loop: we had to translate and try to give it. Anyway, so now you have these scenarios, like in v0, where you just click on the element and you annotate. And that's what Agentation does. It's a Chrome extension. Impeccable is a skill that runs a script on your dev server that then lets you click and describe and annotate. So we're trying to reduce the scope of all of the mess, 'cause it can't see, and give it a very particular thing to focus on. Then you ask for your change. We're trying to make tooling to facilitate this flow, because it was all otherwise pretty clumsy.
Jack: Yeah, that makes perfect sense to me. We had a POC of something similar with Anec ai, and just, lack of resources, couldn't get there with it. But no, I think this is a fantastic idea, because otherwise you have a hard time even explaining where on the page the problem is.
Adam: Yep. And it's nice in v0 'cause it knows which React component it's using.
Agentation also knows [00:11:00] which React component. Agentation is weird, where you annotate and then you copy, and it creates this Markdown chunk for you that uses the source maps and other stuff. But the source maps are in there, so it knows which file to change. So when you go paste it in Claude, it knows. Anyway, still a little clumsy.
Jack: That's clumsy. That's clumsy.
Adam: Impeccable is pretty cool, though, everybody. If you haven't tried that one out, definitely go look at that one.
Jack: Yeah. Are these paid services? I mean, v0 is obviously a paid service. My wife was just like, oh yeah, we've just canceled that service. I'm like, yes. Okay. But what about Impeccable? It's a skill?
Adam: Impeccable is a skill. Yeah. And it's on v3 now. It used to be 10 skills that were like, make this more fun, make this more consistent, make this more compact, make this more spacious. It had these designer verbs that were slash skills that you could use. By now, though, they're all rolled up into one omni skill. It'll do skill discovery based on what you ask. Yeah, the uber skill. And then it comes with [00:12:00] this, if you want to use it: this thing will spin up with your dev server and create a whole UI about you selecting an element, asking and annotating a change using the skills that it has built in. Then it creates you three versions of it, and you just hit the left and right arrows and preview variants. And so you're,
Jack: That I like.
Adam: It's free, everybody. So go check out Impeccable by Paul Bakaus, who's a prior Chrome engineer that I worked with. A very good designer as well, so
Jack: Kudos to that guy.
That was one of my... You were probably a Photoshop guy too, back in the day. And there used to be a thing, and it probably still is, where they would have an effects gallery where it'd show you nine variants of what it would do to this particular thing, and then you can
Adam: Oh yeah. Liquify and grainy, and they were all there. Yeah.
Jack: And that was great, because then it was so much easier to be like, oh, okay, actually I want you to push it this way or push it that way. Yeah. That's great. I love it.
Adam: So yeah, the [00:13:00] fact that LLMs can't see is fine. I think we'll get there. We'll give them the things that they need eventually. And we have all these cool new browsers coming out too. You have agent browser. What's the penguin one, the Rust one that doesn't even need Chrome headless anymore? It's super fast and can do all this stuff.
Jack: I don't know that one, but I've got
Adam: A bunch of people hacking on that.
Jack: agentic browsers. Oh my God, dude. Like they're
Adam: Oh yeah. Yeah. I've got them all installed too. I think I only opened up Dia very
Jack: Yeah. Actually I'm on Dia lately. There's Arc, and Google Chrome now has Gemini built in, and blah, blah, blah.
Adam: Yeah. Yep. So that was point two in the article, that LLMs can't see. We can teach them to see. And here's just a small rant again, 'cause the first one was a rant too. The second rant is, they're using our computers now, Jack, they can't even save it. They can click buttons, order things, move around, switch apps, do all this stuff, but they can't stinkin' do a layout. So that's why I had to write this article. It's like, why can't it do this?
It can [00:14:00] do so much. It can write Rust apps, but it can't write grid. What is going on? So
Jack: I think, one thing is, these are large language models, right? The whole idea is that it predicts the next token. That is it, right? And it's
Adam: I call them word vomit machines. These clankers. Yep.
Jack: It's just a weird model for then asking it to go around and, one, tell you if you should invade Iran or whatever. Don't use that. But also, I've tried it for 3D modeling and stuff like that, or even 2D modeling, and it's terrible. It's terrifying. 'Cause again,
Adam: All the fingers on the things you make. I tried to make a little boy flying in the air, and it made this freak kid with way too many fingers, and I was like, yeah, put it away.
Jack: Yeah, the image models are great, but they're meant for more creative kind of stuff. There's no model that understands spatial reasoning. And I think that's a hard problem. When you think about it, how do [00:15:00] you train it? What's right?
Adam: Yeah. A lot of training data. Yeah.
Jack: Yeah.
Adam: But yeah, you're right. And that's part of the other thing: it can't see, but it's making very realistic images, and now it's making video. People are making entire videos by stringing together all the clips that they get outta these tools. And it's pretty good. But again, why is a webpage so hard when it can make movies that are, like, Pixar quality? Why? Anyway, we'll go on to point number three from the blog post, which is, it doesn't know why we do things. And I think with video and images and some of these other examples, it's a little easier to know why something's happening.
And plus, it's all in the prompt. You're like, I want a penguin riding a bicycle. Or what is it, a pelican? I want a pelican riding a bicycle. And it's like, the user just wants an image, I don't need to know why this is happening. But with a webpage, there's a user goal, and there's probably 50 user goals all at one time on the webpage. And somehow it has to balance all of those, but one of them at the same time. And us, [00:16:00] as plebes and normies, right, we're here balancing that. Part of our job is to make something user-centric. And I think you can easily overload an LLM if you tried to tell it all the reasons why this should exist. So limiting the focus and the scope when interacting with an LLM is how you get good results, right? Because it just prints out the next word. So it's on us. This one is a little harder, but I do think LLMs will start to understand why better. But for right now, when we make architectural decisions, they don't know why, and they'll even be sycophantic with you. Hey, great architectural
Jack: Oh my God, right?
Adam: can't believe you chose that. That's the smartest thing I've heard anyone say all day. And you're like, shut up, LLM. Just get outta
Jack: I know. I'm so sick of the glazing. I've turned it off everywhere I can. Just added to the system prompt, I'm not superhuman, don't glaze me, whatever. And it's actually a much more pleasant experience. I mean, I kinda miss it now and again. If you're having a bad day, it's like, say a little
Adam: Tell me I'm pretty.
Jack: Yeah. Okay. But yeah, [00:17:00] I know what you mean.
And that's interesting, because, I don't know, at least on the current path and the way that these models are built, I don't ever see them going there. And that's why, just recently, Kent Dodds put out an article about how now we're all product engineers, and I think that's true, right? We all should have been, the whole time, trying to understand the customer and what it is that they're doing and why they're doing it. Even as an engineer, you need to know this stuff, because the spec document is not gonna have all of that. You really need to work with your product manager and understand why people are doing it so you can make the right decisions. So we were always supposed to be there, but LLMs are pushing us more in that direction. And also having taste, being able to look at something and say, yeah, that's hot garbage. Why are you using so many fonts? You shouldn't be bombarding users with eight different colors. Unless that's the point, but
Adam: Yep. [00:18:00] Yep. And in the blog post, and you mentioned it already, we've got spec-driven development, which I'm very positive about, because I want more people to participate in the creation of software, and I think specs make that much more approachable. It's like natural language. But then you have BDD, behavioral-driven design, and state machines. So if you practice spec-driven design and behavioral-driven design, and you create state machines, these things can really put enough bounds on the why. It's like, here's what the user's goal is. These are the BDD things. It says, this user needs to be able to log out. Great.
So then you have a state machine that represents logged in and logged out. And so now you're shaping the world of this thing, right? Now it knows a little bit about why, it knows how the states work, and then you've got a spec that outlines all these other important details for you about how this should operate. And that is going pretty good. But even with those things, you'll find that, yeah, it doesn't do good animations. It lacks the robustness of all the potential states it could be in. So you still need to drive it. You're right, you still need to be a [00:19:00] product owner. There's a lot that you still contribute as a human in this process. But yeah, the point of this one was that it doesn't know why we do things. That's on you, generally. Unless, maybe one day it'll be able to work backwards. I'd be like, I want a tuner application, I want a banjo tuner. And it goes, I know why he wants one, he wants to tune his banjo. And I'd be like, brilliant, LLM. Now work backwards. What are all the things I'm gonna need? Here's a crazy thought, Jack. This is a modern, this-week thought for me. The diffusion models have been making me think about working backwards from a problem. So if you've got, hey, user asked for a banjo app, you can fuzzy out to think about all the other things that tuners have. What are the classic things and attributes of a tuner, and the reasons that someone might use a tuner? Fuzzy out again, expand all these things, and gather data, information, context, patterns, consistencies, and desires, and then wrap it all back into the final product.
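For the logged-in / logged-out state machine mentioned above, a minimal TypeScript sketch might look like the following. Everything here is hypothetical and for illustration only: the names `AuthState`, `AuthEvent`, and `transition` do not come from the episode or the blog post.

```typescript
// Hypothetical names throughout; an illustration, not code from the episode.
type AuthState = "loggedOut" | "loggedIn";
type AuthEvent = "LOG_IN" | "LOG_OUT";

// Transition table: every (state, event) pair is spelled out,
// so there is no undefined state left for a model (or a person) to guess at.
const transitions: Record<AuthState, Record<AuthEvent, AuthState>> = {
  loggedOut: { LOG_IN: "loggedIn", LOG_OUT: "loggedOut" },
  loggedIn: { LOG_IN: "loggedIn", LOG_OUT: "loggedOut" },
};

// Pure function: given the current state and an event, return the next state.
function transition(state: AuthState, event: AuthEvent): AuthState {
  return transitions[state][event];
}

// BDD-style walk through the goal "this user needs to be able to log out".
const afterLogin = transition("loggedOut", "LOG_IN");
const afterLogout = transition(afterLogin, "LOG_OUT");
console.log(afterLogin, afterLogout);
```

The table form is one design choice among several; its appeal is that the full set of states and events is visible in one place, which is the kind of bound on "the why" the conversation is pointing at.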
That could be a really cool way to use models: extrapolate, extrapolate, identify [00:20:00] patterns, and aggregate and bring them all back together on the most common best practices and patterns into a final solution.
Jack: Unless you
Adam: know why we do things, but yeah.
Jack: want a banjo tuner. And then, when it does that expansion and contraction thing, now it's talking about, well, a Chinese lute doesn't have the same harmonic
Adam: Yeah. The shamisen. Do you need to tune a shamisen? I'm like, I don't.
Jack: You want the singing bowl tuning as well? I came up with that feature. No, man. I just want to tune my freaking banjo, man.
Adam: Oh, that's hilarious. You are right. It could definitely go off the rails. So, yeah. Rails. You gotta get these things rails. Otherwise they just make more words. Like, here's more words for you. And I'm like, I didn't want those ones. Anyway, anything else you wanna comment on in the "it doesn't know why we do things" section of the article?
Jack: No. Let's keep going. I'm in. I'm
Adam: So yeah, those first three, I know I complain about them. I think they're worthy complaints. They're just the state of now. I think they're solvable. This fourth one, though, this is the kicker, Jack. This is the one it's not gonna be able to do. And it's summed up best [00:21:00] in, let's see, here's one of my quotes from it. It's: dammit, humans. We're an LLM combinatorial explosion. We're unpredictable targets. We change our minds. We switch viewports. We change theme preferences. We change our devices, we change our browser, we change browser versions. We switch the way that we do inputs: with our finger, with a mouse, with a keyboard, with our eyes, or whatever.
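To put a rough number on the combinatorial explosion described above, here is a minimal TypeScript sketch that multiplies out a few of the listed dimensions. The dimension names and values (`viewport`, `theme`, `input`, `browser`) are hypothetical placeholders chosen for illustration, not figures from the blog post, and real pages have far more axes than four.

```typescript
// Hypothetical dimensions and values, loosely following the list above.
const dimensions: Record<string, string[]> = {
  viewport: ["phone", "tablet", "desktop"],
  theme: ["light", "dark"],
  input: ["touch", "mouse", "keyboard"],
  browser: ["chromium", "firefox", "safari"],
};

// Cartesian product: every combination of one value per dimension.
function combinations(dims: Record<string, string[]>): Record<string, string>[] {
  let result: Record<string, string>[] = [{}];
  for (const [name, values] of Object.entries(dims)) {
    const next: Record<string, string>[] = [];
    for (const partial of result) {
      for (const value of values) {
        // Extend each partial combination with one value of this dimension.
        next.push({ ...partial, [name]: value });
      }
    }
    result = next;
  }
  return result;
}

const all = combinations(dimensions);
console.log(all.length); // 3 * 2 * 3 * 3 = 54
```

Adding one more dimension with four values (say, browser versions) would already take this to 216 combinations, which is the growth pattern the quote is complaining about.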
We change our everything. And we are not a static target, Jack. We're not a pattern that can be learned. You can't pin us. We're not a pinnable version. Rust is a thing that you can pin. You say, I'm gonna make an app for this version. Backend developers, they can pin a version. Frontend developers can't pin shit, man. I'm sorry. You could bleep that out if you want to. But you can't pin the browser that I brought to the table. You also can't pin the size that I brought the browser to the table at. You can't pin almost anything about it, which makes the matrix of state wild. And CSS is built to handle that matrix of state, but an LLM isn't. Matrices of state [00:22:00] drive it nuts. And so this is why I think CSS will be the last thing it knows how to write: you can't pin anything about it. CSS has always been a request. It's not a demand. The browser and the user ultimately have had control this entire time.
Jack: Yeah.
Adam: It's a complex environment for it to try to do stuff in. And anytime you change the state of it, the whole thing could re-shift again, and then the LLM has to reevaluate everything. You can just see the LLM going, oh, let me go chase down this rabbit hole. There's eight more rabbit holes. There's eight more from each of the eight. And that's why it's like, I think I'm done with the layout, and it just gave up.
Jack: Yeah, exactly. But that models the human, our interface with it. I've had jobs in the past where they're trying to go for, and I hate these words so much, pixel perfection, right?
Adam: Ooh. Yeah.
Jack: It's supposed to be this.
And then of course, they didn't give you the mobile version, they didn't give you the tablet version, they [00:23:00] didn't give you anything. And now you've gotta do this dance of, okay, I'm gonna be reinterpreting this, and blah, blah, blah. And then invariably they're like, oh, gotta make this change. And you tweak that one thing, and then 15 other things break, because now, oh wait, hold on, all the layouts are wrong and all that. That's actually one of the reasons why I was so excited about the sub-layout thing, the container stuff in CSS, because it allowed me to scope a component down to really what it is. If you lay it out in the sidebar, it's gonna lay out like this, because that's the width of the container. And if you lay it out in the center section, it's gonna be this. And I was like, yes, bingo. 'Cause then I can literally just look at this component and be like, yep, that's the one. But of course, I don't see that many uses of container out there, sadly. It
Adam: Yeah, LLMs haven't trained enough on container queries, so they're not gonna produce them. Dude, have you heard the joke where it's like, CSS walks into a bar, and a bar [00:24:00] stool in another bar falls over?
Jack: Yes. I haven't heard that, but it's exactly true.
Adam: Right? Yeah. You're just like, I'm just gonna change this one margin value, and a whole avalanche falls down in your app, and you're like, no, that's not what I wanted. Oh,
Jack: It's a cascade, right? I mean, this whole CSS thing, we never even use it that way anymore, is baked in from, like, Word documents way back. And it's like, you're supposed to be able to define a class and then a subclass on that class.
Nobody ever used it like this, but it cascades, right? And the reason we don't use it like that is because of the bar stool problem.
Adam: Yeah. And the cascade is only part of the issue. That's how styles are resolved. But then there's the concept of extrinsic versus intrinsic layouts. Extrinsic is what designers always do. They draw a box and then they paste... I almost said shit inside, but then I said it, so bleep me. I'm so sorry. Anyway, they draw a box and they put something inside, and that's not the way the web wants to work. Or, the web can work in a way where it makes [00:25:00] a box perfectly fit for the content, which means the content can change size based on user preferences, can change languages, can change all these different things, and always have a perfect box around it. That's an intrinsic request. It's like asking the human, hey, how tall are you? Let me make a form fit for you. Actually, let me make a coffin for you. How tall are you? But there's a positive one: here, let me make an airplane for you. I'll make you a personal aircraft at your size, right? And that's intrinsic. But we tend to not do intrinsic because it has combinatorial explosions. Also, if you think about it, that means when a short person comes in and they get a small aircraft, your factory has to be prepared for a small
Jack: A small engine. How does that work?
Adam: Yeah, exactly. So that churn causes people to go extrinsic and make one size fits all. And that's the pixel-perfect thing, which, again, always comes back to bite you. But it allows you to get somewhere quick in the beginning. So anyway, the web has many intrinsic powers, and even powers where it'll adjust layouts based on [00:26:00] space being requested by the inner content.
~Um, ~which is another thing LLMs don't like, is that the, a lot of times layouts work from the inside out and it wants to think about outside in. ~Um,~ Jack: Yeah,~ that,~ that one makes my blood boil. ~Like, ~I gotta say just as an engineer,~ like,~ I'm like, oh no, that's not good because I, ~you know, ~it is always one of those things where it's ~like, you know, ~why is this bounding box not staying static? ~I'm, I'm, ~I'm hitting it with ~like ~eight different versions of width, and then it doesn't, and then ~it, ~it blows up and you're like, Adam: Yeah. Stinking height a hundred percent too. You're like height a hundred percent. And it's ~like, ~I am. And you're like, no, you're not. It's like, where are you trying to resolve this from? Dom Node? Yeah. Anyway. Jack: That's actually,~ I,~ I think that would be a great version of,~ uh,~ use for AI would be why, ~you know, ~just literally click on something and ~like, ~be, why is this happening? ~Like, ~what, where does that come Adam: Yeah, I think that's in Chrome dev tools. That's what part of what? ~Um, ~yeah. They've got,~ uh,~ the a i a assist in there. Jack: Is it Gemini? It's probably Gemini. It's probably one of those small models. Adam: It might be the small one. Yeah. 'cause it's built in there. Yeah. Jack: ~I mean, ~we experimented [00:27:00] with small models Adam: Yes, we ~did, ~didn't we? Yeah. gradient.style. Yeah, Jack: Yeah. That didn't work so good. Adam: it was a great idea. Maybe it's better now. That was a, it was ahead of its time. Jack. You were ahead of Jack: was, and I actually, I think Web MCP might be ~the, ~the solve for that, but ~that's, ~that's a different story. Adam: It might actually just be skills, like ~how many, ~how many color combinations and other stuff could there be? So maybe you use,~ uh,~ the weak LLM to just interpret the prompt request and then feed it into ~a, ~a pre-made set of, ~you know, ~gorgeous presets that a skill helps to find. So Jack: Yeah, there you go. 
And again, you're doing that smart thing, which is ~constraining, ~constraining the LLM to try and get it into actually making a decision that makes some amount of sense. Adam: Yeah. 'cause what was your prompt? You're like, make me a gradient that's like a sunset or something like that. ~Right. Um, ~yeah, ~we don't, ~we don't need ~a, ~a huge frontier model to do this work. Oh, speaking of that, there was a model that came out today and I was like, Hey, this is ~kind of ~relevant to,~ uh,~ the conversation and I can't remember the name of it, so let me pull it up. 'cause it had ~the, ~the craziest name. It was cracking me up. ~Uh, ~let's see. It was like Nemotron three [00:28:00] Mini Max thing. Do you know what I'm talking about? The one from Nvidia? Jack: Oh, I've seen the Nemotron models and ~I've, ~I've been using them for code, but I haven't been using them for actually doing app development. Adam: Yeah,~ I,~ I haven't tried them either, but ~the, ~the Nemotron three Nano Omni model. Jack: Ooh, exciting. Adam: ~But you know, ~it reads your screen. So I was talking earlier about how it can't see and there's people trying to work on that. So that's,~ uh,~ this brand new one that just came out today has better vision of what's going ~on ~on your screen than ever before and can direct you on how to use any app that you want. ~Um, ~I don't know if it can do CSS layout, but they're trying to give it vision. ~Right. ~They're trying to solve the problem of it can't see. Jack: Yeah, ~I mean, it, ~it's not like it, ~I mean, ~can't see is, I think, strong. Like I, when I use Cursor, it does a lot of screen grabs to go and try and navigate Adam: or makes videos now. Yeah. Jack: Oh, I haven't seen that. Adam: Oh, I don't make videos. Jack: I'll give that a go. Adam: How about them tokens? Jack: yeah. ~That's, ~that's a whole different story.
~Uh, ~but, and then I, when I use Claude, I've been doing ~this, ~this stuff out in my garage and, ~you know, ~just ~kind of ~sending it pictures and it's really good. It's actually [00:29:00] been really good at ~like, ~oh yeah, no, that, that wire doesn't go there, that goes over there, kind of thing. So ~it, ~it's getting there. Adam: It's Jack: but it doesn't have taste. ~I mean, ~I don't really think, like taste is something you just can't give, I think, to an LLM, ~I mean, or, ~or if you do, it's gonna just give it,~ like,~ it's always gonna give you the ~same, ~same look and feel. 'cause that's what it considers, ~you know, ~the taste from this year or whatever. Adam: Yep. And there's also ~like, ~you'd think it could be good at symmetry ~and it, ~and it can to a point, but there's a lot of weird things with design where the mathematical symmetrical answer,~ um,~ is often visually off. And so like you have these differences between optical centering and mathematical centering or optical color correction versus mathematical color. Color palettes are a really good example. ~Um, ~math. Jack: like another, that, that was another one where, yeah, that just, there's been some CSS stuff lately to ~like ~give Adam: text-wrap: balance. And text-wrap. Yeah. Jack: ~and, and, ~and even the documentation on it's ~kind of ~vague. It's ~like, ~this is better. It's just better. ~Like, ~I can't really [00:30:00] enumerate how or why, but ~it's, it's, ~it's better, Adam: Yeah, ~pretty ~pretty's the one that's like very subtle. It's ~like, ~oh, we'll,~ uh,~ we'll barely touch the line breaking, just so you don't have any widows or orphans. You're like, okay, great. ~Um, ~but yeah, color palettes are notorious for this. You get all these tools with math and you're like, okay, I want a 10% even grade of changes from the dark to the light one.
~You know, ~gimme a nice palette,~ uh,~ zero to 10 evenly spaced, and then you get your color palette back and you're like, Ew, Jack: ~Mm. ~Yeah. Adam: weird. ~Um, ~or the palette looks good and then you go use it in practice and you're like, it's not producing good results. And ~then you, ~then you go down the rabbit hole of,~ um,~ how do you make good color palettes? And that is still being explored by humanity. ~Um, ~'cause we're still not done with color spaces and all sorts of things. So anyway, there's a whole can of worms and like math design versus optical and subjective human design. And I guess, yeah, it comes down to taste, dude, just like you're saying, we, I don't know how we're gonna give LLMs taste, but we can ask for taste in someone else's trained, ~you know, ~agent maybe. ~Um, ~we'll see.[00:31:00] Jack: yeah. ~You know, ~or maybe we Exactly. We have ~like ~the Miami Vice Pastel agent, ~you know, ~who's really good coming up with, ~you know, ~flamingo type designs, ~you know, ~that kind of thing. Adam: Yeah. But then you're gonna need to have language,~ uh,~ you need to know ~that ~that's even an option. ~Uh, ~you have to go ask for it. It's not just gonna give you something with good taste. It's,~ here's,~ here's a creative tip for everybody out there, and this is, I'm finding is not very wildly known, is that when you ask for something,~ um,~ creative from the agent, ~you know, ~it's only gonna give you the highest probability answer. Jack: Of course. Yeah. Adam: Okay, you can ask it to give you five results and tell you what the probability is. And so you can be like, I want five new layout options here. Tell me the probability of each one that, that you would've chosen. And let me try them on. And it'll be like, oh,~ it,~ it doesn't know why or what it's doing, right? So it's ~like, ~oh, you're right. I hear here's my five options. Option one was 80% likely, I was gonna give you that one. 40% likely for this one 30. 
And it just goes down into here. And I'm like, cool. Try the one at the bottom first. 'cause I'm looking for [00:32:00] creativity and everybody else is getting the top one. Nobody else asks for anything else. So they're always getting the most probable answer, which means if you are not doing this technique, you're making mainstream shit, y'all, you are just,~ uh,~ you're getting slop, you're getting, you're not gonna be unique. You ask for creativity, but you're just getting the average, you're getting the most mainstream thing possible. So break it down and work your way up and see what else you can get in there. Jack: Dude, that is Adam: for, so anyway, hot tip for y'all. Jack: yeah, that is a, I was, if you showed up on the podcast and you got that tip right now that ~you, ~you, your hour ~of, ~of listening time was justified because that is a ~fantastic, ~fantastic tip. I love that. I'm gonna do that myself. Yeah. 'cause I know, like underneath the hood, I've actually ~like ~worked directly with these LM APIs like at under. Under the level that, that most folks do. And yeah, it's all probability. Like all of this is all just probability. ~And, ~and yeah, it's an interesting idea. It's like normally we just, yeah, every UI is gonna just pick off the top one because it's the highest probability. And [00:33:00] yeah, there is, there's interesting content. There's this an interesting fruit underneath that tree that maybe ~not the, ~not the perfect one. ~You know, ~maybe ~that ~that rotten one over there is, has some interesting aspects to it, ~you know? ~Or what Adam: And yeah, ~you can, ~you can fork, fork and branch off of the third option. And that's what I do. A lot of times when I'm being creative, I'm like, gimme five creative options for ~like ~a weird cool thing that happens here. And I'm like, I didn't like any of 'em. But the third one had attributes that I liked make five more based off of that particular branch and just continue. 
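A minimal sketch of the ask-for-probabilities tip, with the assumptions called out: `Option`, `build_prompt`, and `least_likely_first` are hypothetical names, not part of any tool mentioned in the episode. The probabilities are whatever the model self-reports, so treat them as a rough ranking hint, not real sampling probabilities. The layout labels are made up; the numbers echo the ones Adam rattles off.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Option:
    label: str
    probability: float  # self-reported by the model, so only a rough hint


def build_prompt(task: str, n: int = 5) -> str:
    """Ask for several options plus the model's own default-choice probability."""
    return (
        f"Give me {n} distinct options for: {task}. "
        "For each one, tell me the probability that you would have "
        "chosen it by default."
    )


def least_likely_first(options: List[Option]) -> List[Option]:
    """Order options so the least 'mainstream' one gets tried first."""
    return sorted(options, key=lambda o: o.probability)


if __name__ == "__main__":
    print(build_prompt("a new layout for the pricing page"))
    # Hypothetical reply, parsed by hand into Option objects.
    reply = [
        Option("hero with three cards", 0.80),
        Option("split comparison table", 0.40),
        Option("scrolling timeline", 0.30),
    ]
    for option in least_likely_first(reply):
        print(option.label, option.probability)
```

To branch the way Adam describes, the same prompt can be reused with the chosen option's label folded into `task`, for example "five more variations based on the scrolling timeline".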
So it's like iterating ~and ~and exploring. Jack: I want this to be a product, Adam. ~I, ~I want this to be something that ~I~ Adam: Oh dang. Jack: integrate into Cursor and just be like, okay, now let's do the, let's do the Adam thing. Gimme some options, ~you know, ~and now we're gonna get Adam: show up in the matrix. You're like, okay, you picked that one. You're like, I want more. ~Like ~that just expands into five more. You're like, more like that one. Expands again. And you're like, okay, combine these two. Expands. And then you're like, that's the one right there. And yeah, that's like a whole two weeks of ~like ~iterating with an art director and a designer and a developer all in one. And yeah, [00:34:00] man. Jack: ~And, ~and Adam: I'll go write Jack: the art director to ~like, ~no, be crazier. ~Be ~be ~da da da da ~da. And then they come back and they're, ~you know, ~they're ~kind of ~expecting that you don't want that or whatever, and you're like,~ no, no, ~no. Whatever you thought I didn't want, that's probably what I do want. ~You know, ~that kind of thing. And ~that, ~that kind of reminds me of like the early days of AI where it was so much fun to just have it go and build out the app, ~you know, the, ~the to-do list or whatever it was I was working on at the time. Commit that, okay, it's working now. Let's have some fun. Let's actually go and ~like, ~play with it. Play with the design, throw some crazy stuff at it and see what happens. And ~that was, ~that was fun. ~That, ~that was Adam: Yeah. Now we're just like, shadcn and Tailwind, ~you know,~ Jack: what's your take on Adam: make it fine. Jack: Are you a hater? Adam: No, but someone said Failwind last week, and I thought that was funny. ~Um, ~no, I like Tailwind. Tailwind has a special superpower in the CSS world that is unchallengeable and that is that you reach a maxima of the style sheet.
So one style sheet, as soon as it's, ~you know, well ~now we have Tailwind four, [00:35:00] which will build a style sheet based on what's used,~ um,~ which does grow theoretically. However, it can reach a Maxima and make n number of pages with that one style sheet. So the fact that these all basically turn into pure functions, almost like functional programming, and then you can compose any number of pages off of it, that is amazing. ~Um, ~but the sides of it that I don't like are that there's a lot of CSS that doesn't fit into the class paradigm anymore because again, CSS is dealing with a wild amount of context and state and scenarios and syntax and just, and so sometimes Tailwind can get a little hairy trying to squeeze it into a class name. ~Um, ~and you have to break out and it's, and it feels naughty as soon as you break out and you write Jack: I know, right? Yeah, Adam: ~you're like, ~you're like, ah, this is now, I don't have the promise anymore. My, my style sheet now can grow as more of these infect my style sheet. ~Like, ~I guess there are container queries, but there's things like view transitions, and you need to define a custom view transition. You're not gonna do that [00:36:00] on the node, that's like a global thing. You have to go create app properties. ~Um, ~even custom properties in a lot of scenarios, like it gets really awkward the more you're,~ um,~ you branch out. And so ~there's a, ~there's a battle happening and I don't know what the result will be, but the pure class-based authoring style and promise is,~ um,~ losing a little bit to CSS growing and advancing and not caring that everything needs to be a class. ~Uh, ~it's doing its own thing. CSS will look out for itself. And so ~there's~ Jack: I was ~kind of ~surprised, Adam: there. Yeah. Jack: working with Opus just like four, seven, just recently, and it was adding, it was actually making its own and on a tailwind project, and it was making its own custom classes, and I was like, whoa. 
Okay. That's new. That's novel. All right,~ well,~ okay, if you think so, let's go. Adam: Yeah. And sometimes it's just nice to style all the children based off a parent. I know, ~um, ~you can do that with the group syntax and some of the other things inside of Tailwind, but people don't tend to do that. They tend to go put all the styles on the li instead of putting it on the ul that styles the [00:37:00] li. ~Um, ~and ~so, ~yeah, sometimes CSS is like ~a, ~a one-liner and you get this phenomenal power and in Tailwind you have to go mark up every single instance. And so trade-offs all over the board. ~Uh, ~overall though, I think Tailwind is great and has done a lot of great things,~ um,~ for the community. So Jack: to let Paige know. 'cause Paige is ~like, ~ugh, Tailwind, ugh. She hates Tailwind. ~So,~ Adam: is very polarizing. Jack: yeah. Alright,~ well, I, ~I, yeah. Okay. So we talked a lot about what you don't like and we've, you put in a couple of amazing suggestions for us, but like, how do you, when do you think it, like the current set of LLMs work really well and how do you ~kind of ~work in your workflow around that Adam: Yeah. ~Um, ~I've been getting some great results recently through, have you heard, Jack? Have you tried auto research yet? Jack: No. Adam: Oh my goodness. Other than OpenClaw, which I'm Jack: every day there's a new thing. Adam: ~I know, ~I know. But,~ um,~ like I'm an OpenClaw fan. I now have my own Claude Claw. ~Uh, ~so I can see. Anyway, I'm not gonna talk about that. ~Um, ~but maybe we will. I'm really enjoying that. ~Um, ~but I'm doing this thing where, [00:38:00] like at Shopify we are using auto research a lot and auto research is like an AFK type of way to do work with LLMs. You stick them in a loop. ~Um, ~it's got a reinforcement loop,~ uh,~ it's got saves in it and it keeps track of all the data and makes these charts. And ~so, um, ~I'll set it up. ~Um, ~I'll give it a plan, tell it.
It needs to, ~you know, ~attempt to make the score better on this one thing. And we can talk about how to use auto research later. But just imagine I want a score to improve and I give it the criteria of how it could go about doing that. But hey, get creative in the way you do it. Anyway, I wake up the next morning. To it, having done anywhere from four to eight hours of loop work. And it has results for me and it's got it in a data format. I think it's CSV,~ uh,~ it's got a markdown thing, and then I can have,~ uh,~ Claude or whatever LLM go review the eight hour log transcript basically and summarize it. But here's what I do, dude, here's what's really cool is I wake up and I'm like, okay, Claude,~ uh,~ go through all the data, summarize it,~ um,~ and then we're gonna make a, an interactive scrolly telling website for it. [00:39:00] And I want it to be, I want you to show code snippets about when things improved. I want you to make it really interactive where people can toggle the old and the new, or they can go through a little wizard that helps 'em,~ um,~ also follow along with the logic that led to this improvement. And ultimately then use all those impeccable design skills to make the site minimal and have all these, ~you know, ~anyway, I just do a bunch of art direction and it takes me a couple minutes to write like this one paragraph prompt. But ~the, ~the juice of what auto research gave me was content invaluable content improvements Jack: Did he actually get some insights outta whatever this was? Adam: Oh,~ always,~ always. And it's empirical. ~It's, ~it's proven and it can tell you how and where it got there, it can give you the actual results and make it all interactive. And so in five minutes,~ uh, you know, ~I've got a site, I push it up on an internal hosting thing and I send a URL out to anybody who's interested. 
And instead of me making a dinky ass doc that I know you're only gonna read the first paragraph and then go, ah, ~you know, ~you can open up a site and you start scrolling it and it's telling you the story of all the improvements that it was [00:40:00] able to do overnight and it's giving you action items. And I've already got a PR up that makes the thing, ~you know, ~it takes the learnings and applies it into our code base. And now we have a faster loading homepage. ~Um. ~Dude, it's wild. And so I've been, yeah, so using these design skills, like impeccable, having content, you need to have content. So whatever can generate meaningful, juicy information ~in a, ~in a small format. Now you're just making websites all day with this stuff. ~Uh, ~and it can look great once you get a good grip on some design workflows. ~So, ~and you can try your creativity tools in there too. Be like, make me this site, make three different versions. Tell me the probability of why you were gonna do that one. And I'm gonna choose the lower probability so I get something a little bit more unique than the other schmo that asks for the same thing. Jack: Now I can, I didn't totally see why you do it. Okay. That makes sense. ~So, ~yeah. Okay. The thing is that you understand the power of visual storytelling though, Adam: ~Mm-hmm.~ Jack: And I don't think a lot of engineers do. And like I, I was lucky in that at one of my first jobs, the [00:41:00] boss was like a huge fan of Edward Tufte. And so he was like, you gotta go see this guy. You gotta read his books. And Edward Tufte is all about ~like ~the power of visual storytelling and how individual infographics can ~like ~stop a war or tell you that you shouldn't have, ~you know, ~launched the Space Shuttle Challenger or whatever.
And it can be very concise, but, ~you know, ~again, finding ~that, ~those key insights and then understanding how to take that and actually architect like a story around it, is an amazing skill that, you know,~ that,~ that very few people have. But ~I'm, ~I'm curious, do you have some good resources for folks who wanna learn more ~about, ~about this kind of visual storytelling and how to apply it in their jobs? Adam: ~Hmm, ~that's a good point. I've mostly acquired it 'cause I've been making, I've got 20 years of making marketing websites, reviewing other marketing websites by Apple. ~Um, you know, ~I think the best thing to do ~is, ~is find something that told [00:42:00] you a story that you liked and emulate it. ~Um, ~there's even tools now with LLMs. You can point it at a URL and be like, turn this into a design system. Turn this into a gradable skill. ~Um, ~so yeah, you could go find a storytelling,~ um,~ example, turn it into something that's replayable in a skill, and then ask for the skill to be applied to your custom data set and see what it does. Jack: Nice. That's a great point. ~You know, ~imitation, sincerest form of flattery, ~you know, ~best form of flattery, that kinda thing. Yeah. ~Um, ~and there, there are like, there are resources out there in terms of,~ uh, you know, ~books that folks have written around data visualizations. So like that they can,~ like,~ you can literally, they're almost like coffee table books. You can just ~kind of ~flip through and be like, oh, I see. That's a cool visualization. That was, ~you know, not, ~not the way that you'd expect. ~Um, ~and yeah, they, I tell you what, man, a cool visualization that actually like imparts a lot of data in a very short, in a very small window of time,~ is,~ is a fantastic resource for folks. Adam: It's good.
You do need to verify your data though, 'cause it can hallucinate and so don't go make a fancy ass website about some hallucinated data because it will [00:43:00] be very convincing and people will want it, and then you'll be like, oops, it was wrong. Jack: So this auto research tool, which is giving you ~this, ~this good verifiable data,~ uh,~ is that something that's just Shopify internal or is that Adam: No,~ um,~ Karpathy, ~Karpathy ~Karpathy, Jack: Karpathy. Adam: Andrej Karpathy, he's like a very famous,~ uh,~ OpenAI, I think,~ uh,~ early,~ uh,~ AI model, a great follow on Twitter. Pretty much all the tools that he makes,~ uh,~ go wildly viral and for a good reason. ~Um, ~although I was working on my second brain before he told everyone to make that,~ so, uh, ~I was ahead of him and ~just kidding, ~just kidding. ~Um, ~so he came up with this thing called auto research and it was that,~ um,~ I love doing an impersonation of what it does, so ~I, ~I make this plan and,~ um, uh, ~you, you wake up an LLM, so the LLM goes poof. Hey, I'm here to do some work. And it's ~like, ~Hey, LLM, I'm trying to make this score go up. Here's the past failures of the past two LLMs that have,~ uh,~ attempted stuff. They failed, the score went down. So don't do what they did. I want you to find a way to make the score go up and the LLM's,~ like,~ cool, and it comes back and it's like, Hey,~ I,~ I tried to make the score go up, but,~ uh,~ it didn't go up. And then the orchestrator goes, cool, you're [00:44:00] dead. Jack: dead. Bye. Oh yeah. Adam: agent comes up, Hey, I'm here to do some work, ~you know, ~and ~the, ~the Orchestrator of Auto Research says, Hey, so here's the past three failures of the people before you. Don't do anything that they did. You should try to do something new. And it's ~like, ~okay, cool, boss. I'll go do something new. Comes back, Hey, I made the score go up by one.
And the orchestrator goes, Ooh,~ uh,~ we have a threshold that says the score has to go up by at least three, or else you die, or else I what? ~You know, he gets, ~he gets Jack: Bye. Adam: So then you bring up another one. So anyway, what happens ~is, ~is if it, this score is improved. It get, saves it and records what they did and then puts it in the data set. So you get to see there's these flat lines as LLMs try stuff because they're just next word vomit machines based on some context. They try stuff. Then all of a sudden the score jumps boop, and then it's flatlined for a little while and then boop jumps against. So you see the stair step of improvement across the score. ~Uh, ~and that's ~the, ~the core concept is there's an orchestrator that's managing the loop. You can assign which LLM you want to use. So you could use, ~you know, ~a lower quality LLM if you need ~to, ~to save some [00:45:00] bucks or not. ~Um, ~it's up to you. But the thing that's crucial, at least for me, is that a lot of people look at the limitation of it as a score, as a bummer. Meaning that it's mostly a performance tool, which by the way, it is a phenomenal performance impacting Jack: I can imagine, but no, I almost, anything can be evaluated as a score, ~you know?~ Adam: That is correct, especially with l lms. So you can teach them a rubric, but ~like, ~real quick, I was trying to convince,~ uh,~ the remix folks. I was like, you wanna get your per your render performance up, just chuck it at,~ uh,~ auto research. Just be like, here's all the benchmarks. Do whatever it takes to make the score go up and overnight you'll wake up to improvements 100%. So every framework out there, if you're,~ uh,~ he hasn't listened to me Jack: Aw man. Adam: trying to convince him that auto research is cool. He's ~like, ~no, I'm gonna do it by hand. I'm like, all right, whatever. ~Uh, ~I was about to do it for him anyway, ~but you know, ~I don't wanna shove,~ uh,~ that in front of him anyway. I. 
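A minimal sketch of the loop as Adam impersonates it, not the actual auto research implementation. Everything here is an assumption for illustration: `run_loop`, `make_fake_agent`, and the log fields are hypothetical names, and the "agent" is a deterministic stand-in where a real setup would spin up a fresh LLM with the list of past failures as context. What it shows is the keep-or-discard rule: an attempt is saved only if it beats the current best score by at least the threshold, otherwise it joins the failure list handed to the next agent.

```python
from typing import Callable, Dict, List, Tuple

# An agent takes (current best score, past failed ideas) and returns
# (description of what it tried, the score it achieved).
Agent = Callable[[float, List[str]], Tuple[str, float]]


def run_loop(
    propose: Agent, baseline: float, threshold: float, rounds: int
) -> Tuple[float, List[Dict]]:
    """Orchestrator: keep an attempt only if it improves by >= threshold."""
    best = baseline
    failures: List[str] = []
    log: List[Dict] = []
    for i in range(rounds):
        idea, score = propose(best, failures)  # a fresh agent every round
        kept = score - best >= threshold
        log.append({"round": i, "idea": idea, "score": score, "kept": kept})
        if kept:
            best = score           # the stair step: save the new best
        else:
            failures.append(idea)  # the next agent is told not to repeat this
    return best, log


def make_fake_agent(gains: List[float]) -> Agent:
    """Hypothetical stand-in for an LLM agent, with scripted score changes."""
    scripted = iter(gains)

    def agent(best: float, failures: List[str]) -> Tuple[str, float]:
        gain = next(scripted)
        return f"attempt {len(failures)} (gain {gain})", best + gain

    return agent


if __name__ == "__main__":
    best, log = run_loop(
        make_fake_agent([1, 0, 4, 2, 5]), baseline=50.0, threshold=3.0, rounds=5
    )
    for row in log:
        print(row)
    print("best score:", best)
```

Plotting `score` for the kept rows over `round` gives the flat-then-jump stair step described above; the log is also the kind of data a later agent could summarize.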
So I'm doing it with more subjective things where I'll teach,~ uh,~ an LLM,~ uh,~ a rubric,~ uh,~ a scoring matrix. And I'm like, here's the qualities of things to look for. ~Um, ~if they have them, they get a [00:46:00] point. If they don't have them, they don't get a point. Here's things that decrease the score. These are extremely negative attributes of what I'm looking for. Great. So now you have an orchestrator agent that spins up naive agents to try to improve a score. Then you have an agent that reviews the work of the other agents to produce the score. And you can also have another agent that tries to learn from the prior successes and failures to generate a new context for the next agent. So now you have this self-healing, self-improving loop. This is what I keep calling loop engineering. So if you listen to,~ uh,~ the show I'm on,~ uh,~ whiskey web and whatnot, we talk a lot about agentic loop engineering, and this is the a FK, this is all the people running. I've got five agents doing stuff. I'm doing that because I am literally running a team of five agents. ~Um, ~that are self looping and managing and trying to improve something based on criteria that I've taught it. So if you can fathom criteria that can create a score, it doesn't have to be performance benchmarks. It can be something ~kind of ~subjective and you'll wake up to improvements. ~Um, ~[00:47:00] it's worked. Every time I've tried auto research, LLMs have been able to come up with something clever and improve something. ~Um, ~man, is it cool? It's Jack: is really cool. Adam: so beware. Jack: it's similar to something that,~ uh,~ was around boy a long time ago called Artificial Life. And the idea was that you would have the, yeah, like a rubric and then you'd have different, al you'd have an algorithm that would have like different parameters effectively, and you just essentially just randomize those parameters at the start. 
And then you'd, ~you know, ~create a thousand, ver a thousand randomized versions, figured out which ones are the best, and then you'd have ~like ~a slight mutation on each one and you just continue. On those paths. And yeah, it's a sort of similar idea here. It's ~like, ~okay, that guy's, yeah, he's getting a little better, ~you know?~ So work on that and keep on going. And ~the, the, ~the thing you're gonna run into though, and the in only running one at a time ~is, ~is like a local minimum problem where you fall into a pit of success effectively, or whatever. You reach a peak and it's an [00:48:00] artificial peak. You didn't actually reach the ~peak ~peak, ~you know, you, ~you just, ~you know, kind of ~reach like a nice plateau and you couldn't find a way to get off of that plateau without lowering the score ~so, ~so much that, ~you know, ~you kinda, whatever. So it might be one of those things where like ~an, ~an addition onto that is ~like, ~cool. That's great. I think ~we've, ~we've reached like a steady state here where we haven't actually made any improvement over a number of generations. ~Like, ~let's just freeze that and then almost ~like ~start again in some entirely new random direction. But the nice thing is ~you're, you're getting, ~you're getting that randomness just from the nature of the models, the,~ like,~ the nature of the way that we run the models themselves, which is we start with, ~you know, ~essentially random waitings and we go from there. Adam: Yeah, that's a very astute,~ um,~ observation and that led me to, so I've been using auto research for a few weeks now, and the most advanced thing that I did was. Run, I looped the loopers. So just like you're saying, you ~kind of ~gotta step back because you will only in, in that single narrow thing, you'll optimize that one narrow idea. But what if you want to compare competing ideas? And so I had a thing where I had,~ uh,~ five [00:49:00] strategies for this problem I was trying to solve. 
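Jack's plateau point can be sketched on a toy problem. This is an illustration under stated assumptions, not anything from auto research: `hill_climb`, `with_restarts`, and the `two_peaks` score function are hypothetical, and the "search space" is just integers. A single climber that only accepts improvements gets stuck on the smaller peak; starting again from a different place and keeping the best result finds the taller one.

```python
from typing import Callable, Iterable, Tuple


def hill_climb(
    score: Callable[[int], float], start: int, step: int = 1, max_iters: int = 1000
) -> Tuple[int, float]:
    """Greedy climb: move to the better neighbor until nothing improves."""
    x = start
    for _ in range(max_iters):
        best_neighbor = max((x - step, x + step), key=score)
        if score(best_neighbor) <= score(x):
            break  # stuck: every neighbor is worse or equal
        x = best_neighbor
    return x, score(x)


def with_restarts(
    score: Callable[[int], float], starts: Iterable[int]
) -> Tuple[int, float]:
    """Freeze each run's result, restart elsewhere, keep the overall best."""
    return max((hill_climb(score, s) for s in starts), key=lambda r: r[1])


def two_peaks(x: int) -> float:
    """Toy landscape: a small peak at x=2 and a taller one at x=10."""
    return max(10 - (x - 2) ** 2, 25 - (x - 10) ** 2)


if __name__ == "__main__":
    print(hill_climb(two_peaks, start=0))          # stops on the small peak
    print(with_restarts(two_peaks, starts=[0, 8]))  # finds the taller peak
```

Running several strategies in parallel and comparing them at the end, as Adam goes on to describe, plays the same role as the list of `starts` here.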
And I was like, any one of these five could, can solve it. The question is, which one's the best? And so I set up a criteria, I set up parallel,~ uh,~ auto researching to happen. ~So, ~so that each of these five ideas can be the best and represented that they're possible being. And then at the very end, I compare and I compete them against each other to see which one actually,~ uh,~ emulates the qualities and gets the highest score based on the rubric that I'm looking for. So ~it, ~it allowed me to take this month's amount of research, condense it into an overnight thing. You see the best representation of each strategy down to the tactics because it's LLM. So there was actual code behind ~the, ~the score and behind the quality and behind the implementation. See which one actually scored at the top. See what trade-offs I was making at that particular point based on the different strategies and make an extremely informed decision. ~Um, ~because I spun up multiple LLMs, which were also representing a team. It was ~kind of ~fun that I was mentioning like a, there's a scorer. ~Um, ~I also had a, [00:50:00] there was a designer, a developer, and a product owner,~ uh,~ on the team, and it was scoring what they produced. ~So, uh, ~I got the idea from Dude Jack. You ever heard these people that are taking,~ uh,~ they want to user test their apps? And so they spin up 500 different moms in an LLM context, and then they have them try the app out and then they're like, Hey, moms, give us a review. And it's like surprisingly good. Jack: Yeah. ~Uh, ~IBM ~uh, ~had ~like this, ~this interesting like project where it was,~ uh,~ like basically ~like a focus, ~a focus group. And you could just have, ~you know, ~you submit an idea and I was using this for like my content stuff. ~Like, ~Hey, what do you think about an idea about, ~you know, ~this particular content thing? And I had ~like ~a senior engineer and a junior engineer and blah, blah, blah, blah, blah. 
And they would all give feedback on ~like, ~oh, I, I need more of this, needs more of that kind of stuff. Adam: That is a great idea. Yeah. Yeah. You workshop it on sim because they're simulated on Reddit, so they're gonna ~kind of ~represent, ~you know, ~a lot of people. So Jack: least it's not Nextdoor. Ugh. Adam: I, yeah. Jack: ~Uh, ~alright, ~well ~listen, we talked a lot. I do want one, [00:51:00] one more topic though before we go. So you mentioned OpenClaw. What does your OpenClaw do? Adam: Yes. So mine does not do email. It does not do the things that other people do. I Jack: mine doesn't either, but yeah, go on. Adam: ~Uh, ~and ~uh, ~real quick too, I use Claude Claw so I can still use my OAuth, so I'm still using my sign in and that's 'cause it uses the CLI. So it's a Bun server listening to Discord messages that then spin up the Claude CLI and that's, it works around their harness requirement. And they've just recently stated they're gonna continue to support the CLI. So they don't like Pi as the harness,~ uh,~ but they will let you use the CLI. So anyway,~ um,~ I've gone through and I've started, I call it liberating my data. I'm doing air quotes, everybody who can't see me,~ um,~ because I'm scraping my data off of places where I've been investing in, downloading it onto my machine. ~Um, ~and now I build my own custom apps with that data and I interface with it through Discord. So a good example is,~ um,~ birthday tracking. I now have my own birthday app. It gives me birthday alerts. ~Uh, ~I can add new birthdays into it just by texting Discord. I can,~ uh,~ I get Discord messages about upcoming birthdays, and now I'm free from what, [00:52:00] whoever owned it before, which is like Google Calendar,~ uh,~ all of the banjo songs I can play. ~Uh, ~I took all of my guitar tabs out of guitar tabs. You probably have a guitar tabs account. Dude, I took 'em all out. I took 'em all out.
And I have my own, they're all on my machine now, so the whole world could go down. And I still have all of my guitar tabs,~ uh,~ right here on my computer. ~Uh, ~it's a second brain. So it's been tracking a lot of things for me that I have trouble remembering. Like names, I suck at names so bad, but now I show up at the baseball game for my kids and ~so, ~and I'll show up and I'll be like, what's up? And I know everybody's name at the baseball thing because I reviewed them before I got there. So I'm like, Hey,~ uh,~ my,~ uh,~ Claude Claw's called Punk Ass, which is short for punk assistant. ~You know, ~it's punk Jack: Yes, of course it is. Of course it is. Adam: ~uh, ~and I'm like, Hey, Punk Ass. ~Uh, ~what are all the parents' names for Lincoln's birthday or for Lincoln's baseball team? And it goes, boop. And I could just show up and I get to look like a pro. Jack: Do you, does it have images too? For me, it would have to have images. 'cause I need, Adam: have images. I can remember faces, but it's like their names are just complete. I've said Jack: I just need the face to the Adam: and they just fall off. Oh, you could do that. You could do that. ~Um, ~so yeah, I'm building personal [00:53:00] software. I'm taking my data out of places where it was a walled garden into my own space where I own it. Nobody can take it from me. And my apps are my own. And ~so. Um, ~and then I'm building cool experiences. They're all PWAs. Dude. Actually, here, lemme show you. Check this out. I'm opening up my iPhone. I'm gonna swipe over to this home screen and look at all the icons right here. Jack: Oh, those are all just your custom apps. Adam: Those are all different PWAs that I've,~ uh,~ built with,~ uh,~ my OpenClaw machine. So I have tabs, I have a beer tracker, I have a whiskey tracker, I have a birthday tracker, and I have a banjo tracker. ~Uh, ~and then I'm also having OpenClaw build a Shopify store and some other different software for me,~ uh,~ asynchronously.
~Um, ~man, it's cool. Jack: Yeah,~ I,~ I'm ~kind of ~with you. ~I, ~I have really invested in like a home setup lately and, ~you know, I, I, ~I got my, I had a studio machine, Mac Studio that was running my actual studio and I wasn't using it all that much. So I've turned it into basically an Ollama server that runs OrbStack that has, ~you know, ~10 different apps on it for doing finances and all this stuff. And it's all ~like, ~it's all local. ~I mean, ~there is a cloud backup, but it's on my own thing. [00:54:00] And yeah,~ we,~ we are really starting to go like, I, ~you know, ~gosh, I guess vendor free or something. I am sure that there's some word for it, like cord cutting. But yes,~ I'm,~ I'm totally with you. Great use. So much better than the other OpenClaw uses I've heard of. All right. ~Well, ~this was, hey Adam, always a pleasure seeing you again. And thank you to Log Rocket for hosting this podcast and hosting,~ uh,~ Adam and I. We'll see you on the next Log Rocket podcast.