Benchmarks for Mortals
March 26, 2025 | 01:17:13


[00:00:00] This is AI Inside, episode 61, recorded Wednesday, March 26, 2025. Benchmarks for Mortals. This episode of AI Inside is made possible by our wonderful patrons at patreon.com slash AI Inside Show. If you like what you hear, head on over and support us directly. Thank you for making independent podcasting possible.

[00:00:27] Hello, everybody. Welcome to yet another episode of AI Inside, the show where we take a look at the AI that is layered throughout so much of the world of technology in so many different directions. Everybody has a new model that's the best model ever. That's like week after week that happens, and this week is no different. I'm Jason Howell, joined, as always, by Jeff Jarvis, my friend and co-host. How's it going, Jeff? It's our latest model of AI Inside. That's right. New features.

[00:00:55] The latest and greatest episode of AI Inside is this one, the one you're watching right now. Good to see you, sir. Good to have you on board. Real quick, just a thank you to those of you who support us on Patreon. We appreciate your support. Patreon.com slash AI Inside Show. Neil Wood is our patron this week from the early days of the show.

[00:01:22] So thank you, Neil, for sticking around and continuing to support us. We appreciate it. And if you happen to be watching live, of course, we love having our live viewers. It's great to have you all here. But you can subscribe as well, AIinside.show, if you want to just go to a web page and find an RSS link and subscribe, or just open up your podcatcher of choice. Find AI Inside. Pretty easy to find in there. And subscribe so you don't miss a single episode, even if you happen to miss the live recording.

[00:01:52] Which you can get on YouTube, by the way. Sometimes I don't mention that. YouTube.com slash at AI Inside Show. Just search for AI Inside Show. And those who follow me on Twitter, I won't call it X. I put it there every week. And also either Facebook or LinkedIn. I can't do three, so I choose. You tell me which I choose. Yeah. It's a bummer that StreamYard doesn't allow for more than that. But guaranteed to be on Twitter, on Jeff's Twitter. Cool. With that out of the way, let's get to the latest and greatest.

[00:02:19] Because that is really the top of what we got going on here. And actually, so text-to-image generation. I feel like we're a couple of years into this whole thing. And it's kind of like, okay, yeah, we know what we're going to get. But I think that text-to-image generation with text in the image is still kind of one of those things that's improved tremendously from when, you know, from a couple of years ago, but still kind of developing.

[00:02:46] And OpenAI released an update to their GPT-4o chatbot or system, LLM, whatever you want to call it. And this really focuses on image generation. So about 10 months ago when they first announced GPT-4o as their first, like, you know, their primary multimodal model, eventually it would get to the point to where we are right now.

[00:03:13] Where not only could you feed it images to understand, but you could also generate images with it and not have to use, is it DALL-E? I think DALL-E is OpenAI's kind of primary image generation model. And so where they're at now with this update that's now releasing after, you know, like 10 months, 10 months since being announced, what they call our most advanced image generator. As you'll see in today's show, this is a pattern.

[00:03:42] This is just what happens with AI companies when they release a new model. It's always the most, the best, the whatever. But you can see here, so take a look at this command or this prompt that they put in. And, you know, I won't read the whole thing, but essentially what it is, you know, a wide image taken with a phone of a glass whiteboard. The field of view shows a woman writing, wearing a t-shirt with an OpenAI logo. Handwriting looks natural, a bit messy. We see the photographer's reflection in the whiteboard.

[00:04:11] And then it lists out everything that's written on the whiteboard. You know, the text reads on the left side, it says transfer between modality, blah, blah, blah, blah, blah, blah. A pros and cons list on the right side, fixes, model, compress. You know, this is a really kind of creative way to detail the strengths of this new AI model. But what you see in the output, I got to say, that's pretty impressive that it got all that in there.

[00:04:37] And then when you're looking at the text and the handwriting and everything, I mean, this text is, you know, it's written without any misspellings. This is actually taking the text that was included in the prompt and not going off the rails. Isn't that appropriate handwriting font? Yeah, it looks pretty handwriting-ish. Yeah, I got to say, it's pretty impressive, you know. And then they do like a little tweak. Like, how about the photographer is giving a high five to the person at the whiteboard? And they came up, you know, with this.

[00:05:07] But anyways, it's meant to kind of illustrate just how good these things are getting at text rendering, which is pretty, you know, that opens up some interesting doors as we get further and further down that road. I've never fully understood why text rendering was so hard. Good question. Because it's the... So I'm writing this book called Hot Type about the history of the Linotype, and I've just written the part about PostScript and how that eliminates it all.

[00:05:37] And the thing about PostScript, of course, is that it does not see a letter as a letter. It describes a shape. Right. Right. And actually, it used formulae that were originally done by French car companies to describe the shapes of a fender on a car. And they realized that it could describe the shapes of letters in a way that was changeable platform to platform, font to font, size to size, and so on and so forth. Right? Yeah.

[00:06:07] I get it that when the models see an image and there's text on it, it's just a shape to them. Yeah. It's just a row of pixels, a pattern of pixels versus, oh, I see a letter. I mean... Except that it does recognize a dog as a dog. It recognizes a cat as a cat. So why can't it recognize an M as an M? And that's what I've not fully understood of why that hasn't worked,

[00:06:37] why it's so difficult, been so difficult for it to add text in. Yeah. And we'll have to get somebody on who understands how these models are trained. But I find that really fascinating how hard text is. Yeah. How hard it has been. And not to mention, it's kind of an interesting disconnect because when I think of modern AI, so much of it is centered around text.

[00:07:05] I mean, these are LLMs that we're talking about. Like, you know what I mean? At their core, it is all text-based. And yet, when it comes to image, turning that into pixels, it just completely throws it off. I'm not sure what the reason is behind that either. I'm curious about that. Well, but there is a difference here. Diffusion models versus LLMs. Right. Yes, right.

[00:07:29] So, a diffusion model, and I'm going to get this bad, generates images by iteratively removing noise. Right. Thank you, Google AI overview. Explain yourself. And they're also being explored as a new way to do language generation. But language generation is different, right? That we've talked about many times. That's about the tokens and the relationships of the words. And you know what the word is. Well, it doesn't understand the meaning. It doesn't know what the word means.

[00:07:55] But it knows this word is next to that word this often, along with these other words. Right. So, that's how the text part works. But the diffusion part works differently. It says, well, is this a dog? Let me try to get rid of more noise, and more noise, and more noise until it feels closer. Right. So, in that sense, the diffusion model doesn't speak text. Mm-hmm. Right. It only speaks pixels and shapes and images. Yes, exactly. Exactly.
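To make the distinction the hosts are drawing a bit more concrete, here is a minimal toy sketch in Python of the two generation styles: a diffusion-style loop that starts from noise and repeatedly removes it, and an LLM-style loop that picks the next token from learned probabilities. Everything here (the stand-in "model," the bigram table, the numbers) is invented for illustration and is nothing like a production system.

```python
# Toy sketch of the two generation styles discussed above. Assumes only NumPy;
# the "model" here is a hand-written stand-in, not a trained network.

import numpy as np

rng = np.random.default_rng(0)

# --- Diffusion-style image generation: start from pure noise and repeatedly
# --- subtract a predicted correction until an image emerges.
def toy_denoise_step(noisy_image, step, total_steps):
    # A real model would predict the noise with a neural net; here we just
    # nudge pixels toward a flat gray target to illustrate the shape of the loop.
    target = np.full_like(noisy_image, 0.5)
    blend = 1.0 / (total_steps - step)
    return noisy_image + blend * (target - noisy_image)

image = rng.random((8, 8))          # start as random pixels ("pure noise")
steps = 20
for t in range(steps):
    image = toy_denoise_step(image, t, steps)

# --- LLM-style text generation: pick the next token from a probability table
# --- conditioned on the previous token (a tiny bigram toy model).
bigram_probs = {
    "the": {"dog": 0.6, "cat": 0.4},
    "dog": {"runs": 1.0},
    "cat": {"sleeps": 1.0},
}
token = "the"
sentence = [token]
while token in bigram_probs:
    choices = bigram_probs[token]
    token = rng.choice(list(choices), p=list(choices.values()))
    sentence.append(str(token))

print("denoised image mean:", image.mean())
print("generated text:", " ".join(sentence))
```

The point of the sketch is only that the two loops operate on entirely different material: one works on arrays of pixels, the other on tokens and their co-occurrence statistics, which is why text inside a generated image is not "text" to the image model at all.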

[00:08:23] So, it's fascinating that when you hear the AI folks say constantly, well, we don't really know how it works. Right. It does. But we're not sure how. And I think that's the case here. But I have to believe that text is going to be a fairly easily solvable problem in the future. Totally. Where you get an image and you can add text onto it. It'll probably be a separate routine that deals with it. Mm-hmm. But it's text. Mm-hmm. Yeah.

[00:08:51] It's just, yes, it's one of those examples of things that, as a human, we find not that complicated. But as a machine trying to understand it or trying to make sense of it enough to kind of give that back to us in a way that we understand, apparently, it's very difficult. Because we're now two and a half years into this. And this is still a really, really big challenge. Even fonts.

[00:09:18] Like, I've noticed even when something comes back looking pretty good and the words are spelled correctly and everything and it's in a particular font. Once you start really zooming in on that, you start to see how not precise the kind of generation around that font actually is.

[00:09:35] If you go into Illustrator or Photoshop and you load up a font and type it in and really zoom in on it, there's so much precision to how these fonts are created. And the lines are incredibly straight and exact and just precise.

[00:09:55] And with AI, even when you zoom in on what looks from afar to be a pretty, you know, like I'm sure if on the video version I'm looking at kind of a, I don't know, a triangle wheeled vehicle patent that was created, you know, an image. And it has some text on it. And from a distance, this looks great. I'm sure if we kind of started to zoom in, I wonder if I can even do that with, yeah, it's looking all right from this perspective.

[00:10:21] I've seen some other things, though, where it's like you start to see as you get closer, like, oh, it's an approximation, but it's not quite the same. Let me tackle this in two ways. My friend David Weinberger is writing a new book. I got the privilege to read part of it. And I don't want to give away his book. But he has an interesting little chunk in the part I read about how if you describe handwriting, and we just saw an image where there was handwriting, and you say, what does an L look like?

[00:10:49] It's going to be one line goes down, one line goes over, right? What does a T look like? What does an I look like? It's going to be a vertical line. But then he looked at, I think, what's used for OCR at, like, banks or something. And that's pretty meaningless. The rule of what the character looks like is meaningless because when people draw a one, it's cockeyed this way or it's that way. It's not fully vertical.

[00:11:16] And there's a commonality of where the pixels are to start to say, I think that's a one versus other characters, right? And so that's one angle of it: if you're trying to have this image-analyzing program analyze text of many, many different kinds, handwriting plus an infinite number of fonts, it's not as easy as it sounds to understand it.
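A rough sketch of the idea being described here: instead of a rule like "a one is a vertical line," a recognizer compares where the ink pixels tend to fall against templates averaged from examples. The tiny 3x3 "images" below are made up purely for illustration and bear no resemblance to a real OCR system.

```python
# Toy template-matching "OCR": classify a character by which averaged pixel
# template it is closest to, rather than by a hand-written rule.

import numpy as np

# Tiny 3x3 "images": 1 = ink, 0 = blank. Invented examples.
examples = {
    "1": [np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
          np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0]])],   # a slightly cockeyed one
    "7": [np.array([[1, 1, 1], [0, 0, 1], [0, 1, 0]])],
}

# "Training": average the examples into one template per character.
templates = {ch: np.mean(imgs, axis=0) for ch, imgs in examples.items()}

def classify(img):
    # Pick the template whose pixel pattern is closest (smallest squared error).
    return min(templates, key=lambda ch: np.sum((templates[ch] - img) ** 2))

unknown = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])
print("best guess:", classify(unknown))   # prints "1"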

[00:11:43] Then on the generation side, as I read through the history, and I'm surrounded by stacks of paper having read through this, it was just a wonderful history of this. And I'll give a plug to my friend Matthew Kirschenbaum's Track Changes book, which was about the beginnings of word processing.

[00:11:59] But even before that, the creation of PostScript, and there I should plug Pamela Pfiffner's Inside the Publishing Revolution: The Adobe Story, and the Computer History Museum did a great seminar with lots of videos from, I think, about 2017 of the people, including John Warnock and John Seybold and others who were involved in this early process. Finally, to the point, it sounds easy to say we're going to take a font and we're going to put it on a screen.

[00:12:30] The only way to do that at first was to bitmap it, was to do bit by bit, this is what this letter is going to look like. Well, then I want to go to a higher resolution. Well, no, there's other bits here now. No, I want to go to a different size. No, that's other bits now. So what it was going to require was having to manually bitmap every font, every letter, and every size. Clearly, that was not sustainable, and especially in early PCs with less memory and processing speed. That was too much.

[00:12:57] And so how did you come up with a mechanism to produce a font independent of device, independent of manufacturer, and independent of size, and also enable things like twisting it around and doing things to it? That's what led to PostScript. And PostScript was a language of Bézier curves. As I said before, it came from a French car manufacturer. I think that one was Renault.
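For readers curious what "a language of Bézier curves" means in practice, here is a minimal sketch: a glyph stroke stored as a handful of control points can be evaluated and scaled to any size from the same description, which is the device- and size-independence being described. The control points and the 100-units-per-em scale below are assumptions for illustration, not real font data.

```python
# Evaluate one cubic Bézier curve: the outline is stored as control points,
# and any rendering size just scales the sampled curve.

def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bézier curve at parameter t in [0, 1]."""
    x = ((1 - t) ** 3 * p0[0] + 3 * (1 - t) ** 2 * t * p1[0]
         + 3 * (1 - t) * t ** 2 * p2[0] + t ** 3 * p3[0])
    y = ((1 - t) ** 3 * p0[1] + 3 * (1 - t) ** 2 * t * p1[1]
         + 3 * (1 - t) * t ** 2 * p2[1] + t ** 3 * p3[1])
    return x, y

# Hypothetical control points for one stroke of a letter, in font units.
stroke = [(0, 0), (10, 80), (60, 80), (70, 0)]

for size in (12, 72):                      # render the same outline at two sizes
    scale = size / 100.0                   # assume 100 font units per em
    points = [cubic_bezier(*stroke, t / 10) for t in range(11)]
    scaled = [(round(x * scale, 2), round(y * scale, 2)) for x, y in points]
    print(f"{size}pt outline samples:", scaled[:3], "...")
```

The same four control points produce both the 12-point and the 72-point output; nothing has to be redrawn bit by bit, which is the problem PostScript was built to avoid.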

[00:13:25] And so to generate the letters is really complicated. Even when you had the early bitmaps, to make it pleasing to the eye, they had to adjust for the screen and how many bits there were and all this kind of stuff. Right. So all I'm saying, letters are hard to both recognize for the machine and then produce for the machine. You're right, Jason. The fact that it's text, this word is dog, is not hard for the machine. But how do you then display that in a way that makes sense?

[00:13:55] Yeah, represent that. In a program that isn't used to dealing with text, and none of them can do meaning well, but in a program that is really not used to meaning because it just deals with images. So I get why it's complicated. I can go through all that. But still, to us, shouldn't it be easy to say, go, Spot, go? Can't we figure that out? Right. So it's fascinating how this works. And so when you see these advances, and they are advances, it is worth being impressed by them.

[00:14:25] Yeah. Yeah, indeed. God, I feel like I saw also an example here that I'm searching for of like a comic book was another thing that I saw. Yeah, there was. I can't remember when I saw that either. Oh, comic strip. Okay. Yeah, I found that. Now let's see here. You know, make an image of a four-panel strip, some padding around the border, a little snail. You know, it explains it with the kind of punchline and everything.

[00:14:53] And yeah, I mean, suddenly anybody could create a comic strip. I guess you could do this before. But now that the text is more dialed in, now that you can say this bubble says this, this bubble says this, becomes a lot more definitive that what you're going to get is actually going to make sense on the other side. What's most interesting to me here is when you have a bubble, the same as the whiteboard, you had a space for the text. Yeah.

[00:15:22] Okay, I can block that off, and that's where this different thing is going. What's harder is to label things, but in this cartoon, there's a drawn sign that says cars. Mm-hmm. I guess that's just another space for text. Mm-hmm. But it's there, and then the car itself has an S on it in different angles. Mm-hmm. So that's more complicated. Yeah, totally. I think a lot of what they're showing here is really kind of – and the whiteboard image is a prime example of that.

[00:15:52] Like, the fact that it's able to kind of get those reflections and then the text – like, I was looking at this and how the text kind of gets – because of the perspective and the angle, the text gets bigger as it gets closer. It's smaller as it's further away, and it's kind of scaled along the way. It's not just getting the text correct, but it's getting the text correct to match the scale and the aspect. Well, what's hilarious, too, is that the photographer is visible. Yes.

[00:16:19] And what's behind the photographer is – I think that's the Bay Bridge, yes? Yeah, I think the model – the question – yes, it is. It says overlooking the Bay Bridge. You're right. Right. So I bet it managed to do the Bay Bridge behind. Yeah. Yeah. It's pretty impressive. You know, one thing I didn't know is that there is a wine glass problem, apparently, with AI rendering.

[00:16:40] And I thought this was kind of fascinating, is that up until now, image generation has had a really hard time dealing with what they call the wine glass problem, which is essentially like if you ask an image generator to create a glass of wine that is filled to the brim with wine, it has a hard time doing it because most of the training data –

[00:17:04] like, you rarely ever see any sort of images where the wine is poured right up to the brim. It's usually like halfway or three-quarters or whatever. Like, I even tried to do it with this model because I read somewhere – I think it was on Forbes to give at least some credit because I didn't notate it. But I read somewhere that this was the thing and that this model had solved it. I tried it myself, and I still couldn't get it to the brim.

[00:17:31] But apparently the reason for this is because as humans, we can conceive of what full means even if we haven't seen it before. If it's a glass and we say, fill that – I want to imagine this glass being full to the brim with wine, even if we've never seen that before, we can kind of understand what that probably looks like. But to an image generator, their model, their training data is so filled with half full that it's hard for it to conceive of that. And apparently it still is.

[00:18:01] Well, this is what we talked about last week. I'm trying to find my notes from last week about Jensen Huang talking about the physics engine. Right? That's right. Yeah. We want to discuss this afterwards on the next show. Leo Laporte said, well, yeah, game designers have always had physics engines. They've always had the way to tell it what reality is. But that's explicitly said. The ball will bounce, and this is what bouncing means or whatever. This is all about the next phase.

[00:18:29] We hear this from a lot of people, especially Yann LeCun has been talking about this. The next phase of AI development is reality. How does it know reality? And that's a physics engine. Mm-hmm. Mm-hmm. So the wine being full, it has to experience that. It has to know. It has to know what overflowing looks like. Yeah. Right. And I use the verb know advisedly. Right. No. Yes, exactly. While not actually knowing. Okay, cool.

[00:18:57] I can actually give credit where it is due. It was Asat Dedezad, I believe is how you pronounce Asat's name. But anyways, wrote about the full glass of wine and the image that apparently was created with the current model with the wine all the way up to the top. So it is possible. I just couldn't get it to do it, but I thought that was interesting. I had not heard about that before. And it's a really good example of some of the challenges of the logic and the real world.

[00:19:26] I can't remember the physics terms, but then you also have that the top isn't really the top, because it will have a slight indentation because of the way the surface tension works. Yes, right. And all that, right? There's further difficulties. Indeed. Yeah, that's a good point. Well, so that is OpenAI's most advanced blah-de-blah. Then we've got Google, and Google has their most advanced – actually, no, they call it most intelligent AI model.

[00:19:56] So we're picking up on a marketing pattern here. This is Gemini 2.5, and it's smarter. It's faster. It's better at reasoning primarily, beating OpenAI's GPT 4.5, beating DeepSeek, at least as far as the LM Arena leaderboard is concerned. Again, that's another thing that always seems to happen with these new releases. It's the biggest, it's the best. And oh, by the way, it beat everything on this thing for the next week or two.

[00:20:26] It'll reign supreme until the next one comes along. Anyways, not to denigrate Gemini 2.5 Pro. It gets a score of 74% on the Aider polyglot benchmark. Yeah. Whoa. You know that. Okay. Convince me now. AGI is around the corner. Yeah, that's the problem with these benchmarks is they mean – to us, they have no connection to our reality at all. Yeah. As mortals.

[00:20:50] I suppose if you're steeped in this stuff and you're working with these on a regular basis and you're trawling through the ratings and doing these comparisons, at a certain point it probably makes sense. You know one thing I noticed with these things is people who follow this stuff a lot seem to have like – and I don't know if they've come up with it on their own or if they've researched enough to come across these things and use it as their pinnacle kind of target.

[00:21:20] But people often seem to have the prompt that is guaranteed to throw everything off, kind of like the wine glass thing. It's kind of like I'm walking the earth with this prompt that I know is going to trick every single currently existing model out there. But someday there's going to be a model that can solve this and that's when I know it's really good. And it seems like a lot of people have that and I don't have it. Should I have one of those? I don't know. It's a whole discipline almost now.

[00:21:49] If you go to agi.safe.ai, you will find Humanity's Last Exam, which is one of the things that Google has tested on. And by the way, the test goes – so Gemini 2.5 Pro gets an 18.8%. OpenAI o3-mini gets 14%. OpenAI GPT-4.5 gets 6.4%. Claude gets 8.9%, right? So it's a low number. This test is hard. And so there's a whole question here.

[00:22:18] If you go down, you'll see these questions are submitted by people. So Henry T. at Merton College, Oxford, gives a representation of a Roman inscription originally found on a tombstone, provide a translation for the Palmyrene script. A transliteration of the text is provided and then it spells it out. And rah, rah, rah, rah, rah, rah. Oh, okay. Next one.

[00:22:41] Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by the sesamoid bone? Answer with a number. Three. So this is the stuff that they put against it.

[00:23:09] And then the next one, when you flip to the math, I'm lost immediately. Jeez. Right? There's linguistics, physics, trivia. Right? So yeah, there's a constant effort to come up with these things. But part of the problem is when you put them on the web, guess what? The machine can read it. It becomes part of the training set. Right. So there's the need to have things that are not put up so that they can be used and shared. So it's a fascinating discipline.

[00:23:40] This is all Turing test times a thousand. In Greek mythology, who was Jason's maternal great-grandfather? Now that is interesting because then you've got to figure out what is a maternal side, what's a great-grandfather, in addition to who's Jason in Greek mythology. Yeah. Right. Interesting. So then this leads us to the thing that I've been wondering about for today's show. Yeah.

[00:24:09] Let me do this as a question first. Without use of dictionary or thesaurus or AI model, how would you define reasoning? How would I define reasoning? Hmm. I think, I mean, just, okay.

[00:24:30] Asking a lot of questions to understand as many aspects of something that can be gleaned from asking those questions to, yeah, to get a better understanding of a single thing. I suppose something along those lines. Right. So you're kind of, I think, presaging the way the AI models define reasoning because you're witnessing that, right? What it does is it cuts things into a task. Right. Yes.

[00:24:59] And so they're all bragging about, well, we're reasoning now because we've cut it into tasks. I don't know that that's reasoning. Yeah. I mean, yeah, I don't know that it is either. And I don't know that my definition of reasoning isn't at least partially defined by what I've seen from reasoning models. Exactly. That's why I asked, right? Because I think that's what we kind of, it becomes a self-fulfilling prophecy. Yeah. Definitionally.

[00:25:25] Because they all tell us their reasoning and then this is how they say their reasoning. And then we say, oh, that's reasoning. But I'm now asking, is that reasoning? Is that what human reasoning really is? If I need to decide what's the best thing to do in this case, yeah, I might try to cut up a task, but that's not the only way.

[00:25:46] I also need to, in the view of digital twins, look at alternative futures and weigh them against multiple criteria and factors and odds and risks and benefits and rewards. That's all reasoning to me. And just slicing something up to say, well, okay, Greek mythology, Jason, okay, that says who Jason is. Now what's a grandfather? What's a great-grandfather? What's the maternal side?

[00:26:16] That's not to me really reasoning. That is a logical or not even logical. It's a dissection of a question. Or deduction. Into steps. Is it deduction? Yeah. That's another good question. I don't know. Deductive reasoning would imply a lot of experience, I think. Oh, okay. Yeah.

[00:26:44] So all I'm saying with all this is that I think we bought this idea that these models are reasoning because we've seen that behavior. But I want to question that behavior as reasoning. Well, and I mean so much of this is marketing as well, right? It's like what name do we give this process? Reasoning sounds right. All right. All right. Let's go on with reasoning.

[00:27:04] Can people connect with the concept of reasoning when they see these questions being broken down into multiple different avenues and steps to explore? If I search online, Butte College, I'm only picking it because it's at the top of the results, says reasoning is the process of using existing knowledge to draw conclusions, make predictions or construct explanations.

[00:27:36] I don't think that fits with what they're doing. Yeah. I mean, that's a piece of what they're doing, I suppose. They're constructing explanations. Yeah. I don't know. That's a really great question. Does reasoning actually describe what's happening? That's why I like this show because we ask those big questions without any answers. Yes, exactly. We may never have the answers, but we find lots of questions to ask, so at least there's that.

[00:28:06] Let me, let me, okay. Okay. So I'm going to ask Gemini to define reasoning. Okay. All right. Reasoning at its core involves the process of thinking and drawing conclusions from evidence or premises. Oh, you just said that. Here's a breakdown. A use of logic, drawing inferences or conclusions. Oh, you said all that. Seeking understanding. Different forms of reasoning exist, including deductive reasoning, moving from general principles to specific conclusions. Inductive reasoning, moving from specific observations to general conclusions.

[00:28:36] Abductive reasoning, forming the most likely explanation based on incomplete information. Hmm. AI does a lot of that. Yeah. Okay. That was fun. Yeah. Interesting. Well, nonetheless, whatever you want to call it, I suppose they're all chasing that dragon right now. And Google is but the latest.

[00:29:02] Gemini 2.5 Pro is the beginning of what they are calling the reasoning capability, kind of, you know, their next phase of reasoning in their AI models. And they've said that all future models that they release are going to have these capabilities in them. Also, and I think this is a strength of Gemini, is a large context window. One million tokens, expanding to two million tokens at some point.

[00:29:29] And, you know, some people see that as a real big strength for what Google's doing with Gemini. So, yeah. Cool stuff. Actually, what did I see? Oh, yeah. Real quick before we go to the break. They shared, Google shared a video, which I won't have sound on it, but a video of using Gemini 2.5 Pro to make me a captivating, endless runner game, key instructions on the screen, blah, blah, blah. You know, it's a very short, short kind of prompt to make a pixelated dinosaur running game.

[00:30:00] Runs the process, shows you the process. The video is sped up. It takes maybe a minute, I think, a minute and a half to create all the code. And then they copy that code and put it into, you know, an online kind of code, whatever you want to call it, thingy thing, that turns it into a game. And suddenly you see the game is playing. So, you know, AI generated games from start to finish in a minute.

[00:30:26] It's not the most complicated game in the world, but still, you know, shows the direction of where all this stuff is headed. And it's pretty impressive. Yeah, I didn't know how impressed to be. Yeah, yeah. I mean, yeah, maybe we should expect more than a pixelated dinosaur running game. A couple of years ago, it would be really impressive. I don't know. You know, I'm not a coder. I'm not a developer. So you're right. I don't actually know how impressive this is.

[00:30:56] I'd be curious to hear from one of you. So, you know, find us on social. Let us know. Same as we just had with the Humanity's Last Exam test. Somebody needs to come up with those kinds of tests that we mortals will understand. Mm-hmm. Mm-hmm. And I think it would be more about questions we would normally ask and tasks we would normally give it and say, how sophisticated is this against those? We need a different set of benchmarks. We need benchmarks for mortals. Mm-hmm. Mm-hmm.

[00:31:26] I like that. So get to work on that. All right. Well, I usually, you know, we have the thing we always ask it to do. So I was trying to use the new ChatGPT or OpenAI drawing thing, and it was busy, so it couldn't do it for me. Yeah. But I always ask it to have Johannes Gutenberg at a laptop. Oh, and how does it do? I didn't know from this one because it was too busy. But generally, it does pretty badly. It does very badly.

[00:31:55] And so it's interesting to me to see if I use the same question over and over and over again, will I see progression? Yeah. Well, and that's kind of what we were talking about earlier, how everybody seems to have their thing. Image of – let's see here. So let's try this. We might as well. It's our show. We can do what we want. Image of Johannes Gutenberg, which I copy and pasted because I didn't want to misspell it. Sitting at a laptop? Just add a laptop. See what it does. At a laptop.

[00:32:25] See if it could be for presumed sitting or working or whatever. Yeah. And so it gets started. I did have noticed that image generation takes a little bit of time. It's not the kind of thing with this model that it just spits it out immediately. But how about this? We're going to take a break. Yeah, it's starting right now. We're going to take a quick break. When we come back, we'll check in on Johannes Gutenberg sitting at a laptop. That's coming up in a second. Trust isn't just earned. It's demanded.

[00:32:55] And whether you're a startup founder navigating your first audit or a seasoned security professional scaling your GRC program, proving your commitment to security has never been more critical or more complex. That's where Vanta comes in. Businesses use Vanta to establish trust by automating compliance needs across over 35 frameworks like SOC 2 and ISO 27001, centralized security workflows, complete questionnaires up to five times faster and proactively manage vendor risk.

[00:33:23] Vanta not only saves you time, it can also save you money. A new IDC white paper found that Vanta customers achieve $535,000 per year in benefits. And the platform pays for itself in just three months. Join over 9,000 global companies like Atlassian, Quora, and Factory who use Vanta to manage risk and prove security in real time. For a limited time, our audience gets $1,000 off Vanta at vanta.com slash AI inside.

[00:33:50] That's V-A-N-T-A dot com slash AI inside for $1,000 off. Everyone's talking about AI these days, right? It's changing how we work, how we learn, and how we interact with the world at a tremendous pace. It's a gold rush at the frontier. But if we're not careful, we might end up in a heap of trouble. Red Hat's podcast compiler is diving deep into how AI is reshaping the world we live in. From the ethics of automation to the code behind machine learning, it's breaking down the requirements, capabilities, and implications of using AI.

[00:34:19] Check out the new season of Compiler, an original podcast from Red Hat. Subscribe now wherever you get your podcasts. All right, and we're back, and it is now filling in the details. You can see it's pretty slow. Bit by bit. Bit by bit. But hey, this is magic happening right before your eyes. You can tell me at least by the time we get through a third of the image if this even looks like Johannes. Is it Johannes or Johannes? It's Johannes. Johannes Gutenberg.

[00:34:48] The thing is that nobody actually knows what he looks like. He's presumed to have a beard, which is what he's going to have here. But at that point, in his nation and kind of caste, they didn't have beards. Got it. Oh. But images of him have this forked beard. Huh. I wonder why that is. Why? Yeah. So this is pretty straightforward. It's a guy with a beard and a weird hat sitting at a laptop with his fingers. Let's see. How many fingers does he have? That seems to be right. Yes.

[00:35:18] Okay. So it's okay. I would say so. He's not really looking at the laptop, but we didn't say look at the laptop. No. He's thinking. He's in the midst of writing and writing something. And he's thinking before looking back down at his computer to type whatever. Does the shape of the laptop seem odd to you? No. There's something about the shape of the laptop. There's something a little bit off, but all things considered, not too shabby. We don't have a logo.

[00:35:46] Maybe we could add a fancy Gutenberg logo to that. Yeah. Fun times. I love that stuff. That's great. I'm happy we did that. Okay. Let's see here. OpenAI, still talking about OpenAI, conducted research into how interacting with chatbots impacts users on an emotional level, which is definitely a topic that we talked about before on this show.

[00:36:13] This was in collaboration with MIT Media Lab. They did a couple of studies. And 40 million interactions, somewhere around 4,000 users, 4,076 users, sorry, were analyzed for this. Most users treat ChatGPT as a productivity tool, but a subset of users engaged on more of an emotional level.

[00:36:37] And when that happened, the study showed results of higher loneliness, higher dependency. Ultimately, users who trusted and bonded with the chatbot had higher likelihood of that loneliness and dependency. And strangely, it seemed like voice interactions actually lessened the emotional dependency of the users, at least initially.

[00:37:02] If they chose the opposite gender for that voice, then that seemed to elevate things. So, I don't know. Interesting. I mean, it is an open AI study, but it's in collaboration with MIT Media Lab. What were your thoughts on this? It's also very short term. Very, yeah. And the issue with these – I'm not a social scientist.

[00:37:23] I don't know how to do the methodology of these things, but I think that trying to ascribe abstract and important emotional descriptions to a single factor, they used ChatGPT or they didn't, is a bit limiting. Right? Because there's obviously many other – it's the same problem with media analysis. When you say, well, TikTok did this to the youth of America.

[00:37:51] Well, nobody spends every minute of every day just watching TikTok. There's a lot of other factors. There's the news. There's all kinds of things. And so, it's hard for me because I'm not a social scientist to understand the methodology here and thus judge it. But I think we've always got to – in all these cases, you have to take it with a grain of salt. It's interesting that open AI is doing it and that it had its negative aspects. Yeah. Yeah.

[00:38:16] It did leave those out and it was okay with kind of addressing, I suppose, that in the study. So, that gives me a little bit more – yeah, when I first saw open AI has released research, my immediate thought is, oh, okay, well, it's coming direct from open AI. Like, how accurate? How on board? But, you know, MIT Media Lab in collaboration with, so that certainly helps to have somebody else integrated with. And it wasn't an overly optimistic, positive study either.

[00:38:46] So, at least there's that. Okay. The Atlantic. This was such a big story last week and then I feel like it just kind of dropped off. The Atlantic released a tool to allow anyone to run a keyword search on the LibGen database. This is the database. I think it was originally created like in Russia or something back in 2005 or something. It's been around for a very long time.

[00:39:12] But anyways, it's been updated and used by Meta to train its AI systems. And actually, the Atlantic posted a search tool so that anyone can go in there and get furious at what they find. Although, I don't have the – yeah, I don't have the version of this article that is open to everyone. But there was a way to get it. I'm a subscriber to it, so I can put in any author you want. Is it an obscure author? Let's see.

[00:39:42] Let's say Jason Howell. You're not obscure. Oh, you'll find me in there. I mean, it's not me, but you'll find a Jason Howell there. I did search for that. I was like, oh, I've written a bunch of scientific journals and stuff. I had no idea. Let's see if Leo Laporte is in there. Yes. His 2005 gadget guide is. Yep. And then yes, I'm in there. Yeah, of course. Would you say all your major work is in there? What Would Google Do? This is why I'm going to very cleverly get in plugs for every one of my books.

[00:40:11] Indeed, yes. What Would Google Do?, The Gutenberg Parenthesis, Public Parts, Geeks Bearing Gifts, The Web We Weave, and Magazine. Yep. Yeah. All of them, including The Web We Weave. It just came out, you know? And so then the question, of course, and I asked this knowing the answer already, but for people watching and listening, how do you feel, author Jeff Jarvis, about your books appearing in this database?

[00:40:40] I feel like you're holding a microphone up to my face. So how do you feel? I'm glad about it. And I know that'll get me in trouble and people won't like that view. But if I were left out, I write to be read. I write to put my crazy ideas into the world. And I want AI models, if we're going to be using them, and we are, to have as broad a base of input as they can have.

[00:41:09] And so, yeah, I'm kind of glad it's here. I understand people are not happy, but this is where we have to, again, go back to the analysis of whether or not it is used to merely train the model. Right. In which case, it's like teaching a kid to speak. Right. It's knowledge. It's not. Yeah. Well, it's not even knowledge. It's a skill. Oh, okay. Yeah. Yeah. Look at that. Right. It's just a skill.

[00:41:38] Then the next question is, and this is very relevant to this, how was it acquired? Was it acquired? Very relevant. Yeah. And if it was stolen and it wasn't bought or it wasn't properly borrowed from the library and scanned or however you choose to look at that acquisition, then that's an issue. If the model quotes at length without permission, that's an issue. But in terms of training, I'm cool with it. In fact, I wish my blog were here.

[00:42:07] I wish other things were here. They're not. And they could be and should be. So let's talk then about the acquisition aspect, because a big part of this is that this is all related to the release of court documents. My understanding, anyway, is that it's related to court documents in the case that was brought a few years ago by Sarah Silverman and others.

[00:42:31] And through that, it has involved the release of emails and communication and stuff that was essentially released to the public like a couple of weeks ago or whatever. And in that communication, it's pretty obvious that the folks at Meta using this data set knew a few things about it were not quite – they weren't being completely straight.

[00:42:59] They were using BitTorrent to torrent a lot of this material. BitTorrent, of course, is a file sharing protocol. It's not just that you're using it to download someone's copyrighted works. But you're also kind of like simultaneously sharing that out to others. And so you're distributing copyrighted material potentially when you're doing that. So there's that aspect.

[00:43:23] There were also emails where they were talking about, do we pay for the rights to do this? Do we pay someone to gain access? And in the emails, they basically say that takes too much time. And if we even pay for one of these, then our fair use argument goes out the window. And so the released documents illustrate a knowledge inside Meta that what they were doing,

[00:43:52] the methodology of how they were collecting this information to use it inside of the database might not have been very defensible or on the up and up, whatever you want to call it. And I don't know. Does that change things? Yeah, I think it does. I mean, in the New York Times' suit against OpenAI, my argument – and I'm not a lawyer and I'm certainly not a judge –

[00:44:15] my argument is that if OpenAI has one subscription to the New York Times and the machine gets to read it, legit. That's my view. I'm not sure that that'll be what happens in court or certainly not what the New York Times thinks. Right. And they'll argue scale and they'll argue all kinds of other things, but I think that that's the case.

[00:44:35] But then the issue becomes, again, we've got to ask a larger question as a society, is do we want these things that purport to have a window on our knowledge as a society be incomplete? Mm-hmm. And so let's just say this for the sake of it. Let's say that Libgen indeed bought every book. Let's say they did. Mm-hmm.

[00:45:03] They'd buy one copy of each book. One copy of every single book that existed. One copy. Yeah. It didn't get you very far, right? Nobody's going to get rich off that. Nobody's going to argue, well, you stole 15% of $20 from me. So the damages to each individual author are de minimis. Now, the author might say, well, but because of this, I'm not selling other books. Well, no.

[00:45:33] If it doesn't regurgitate major chunks of it, that's not the case. Right. Because it was used for learning. Mm-hmm. So I think it's interesting to examine the actual damage here. It's not a loss of sale. It's a loss of one sale and then an ethical violation, but not a huge monetary issue. Mm-hmm.

[00:45:59] So I think we have authors screaming like stuck pigs here and artists, but no. The other example I would give is that this week we saw, and Jason and I were wowing at this before we got on, the Atlantic Signal story. Mm-hmm. Mm-hmm. And so it's a really good example to talk about. I don't know if it's still the case, but if you go to Techmeme, they- Oh, yeah. I know where you're headed. Yeah. Right. So there was only one authority.

[00:46:29] No, it's not there now. But there was only- I go to Yesterday's. Yesterday's. And I've got it in my, good. I've got it in my social. So there's only one authoritative version of the Jeffrey Goldberg Atlantic story. That is Jeffrey Goldberg's in the Atlantic. And Techmeme lists, I can't count them, 80, 100? I mean, yeah, it's massive. It takes up the entire screen. Other sites that write about it.

[00:46:58] Well, they can only have rewritten, read and rewritten the Atlantic. Right. There's no other source of this information. Right. That was the only authoritative original source is Jeffrey Goldberg in the Atlantic. And so are we going to have the same issues with Wired, CNET, ZDNet, Forbes, Wall Street Journal, The Bulwark, BBC, Politico, Reuters, and so on that we have with LibGen or OpenAI? Because what journalists, and if journalists want to have that issue, they've got to deal with this

[00:47:27] because they all read and learn from each other. They all reuse each other's information. And yeah, they give credit in the case where they quote it at length, but they learned from it and they wrote about it. And they maybe added value too, but so do transformative models add value. Oh, yeah, 100%. There's a precedent here in journalism that the journalists too often want to forget. Yeah. Yeah, it's kind of a, what is the phrase that I'm looking for?

[00:47:55] It's having your cake and eating it too. It's like, do as we say, not as we do, or something along those lines, I suppose. Yeah, it's a really interesting kind of example of that. Really kind of a great way to kind of flip the lid a little bit and show the other side of that. Because it's true. That's how journalism works a lot of the time. That's, yeah. Fascinating. And education and art and all kinds of things. Mm-hmm. Right. Yeah. Interesting.

[00:48:23] Well, also, real quick before we move on. So parts of the database were filled in using BitTorrent to get the sources. Trying to understand where that fits in. Because the database existed already. So is that to say that Meta downloaded a version of the database, a snapshot of the database as it was,

[00:48:46] and then continued to add to their own version of that database with the BitTorrent sources? Is that how that goes? I don't know. You know what I mean? I thought that there was a story about LibGen some months ago, and I searched it then, and I wasn't in it. And some of these things are obviously more recent than others, so I think it's a screen of information, yeah. Yeah, yeah.

[00:49:10] But I mean, it's not like I made the mistake when I first read this of going like, oh, it's Meta's LibGen database. Well, it's not Meta's. No, it's not Meta's. It did not begin with Meta. It began well before. And so a lot of that stuff was put into the database. I mean, maybe someone does know how that stuff was put in there or procured. But part of this story was the whole BitTorrent aspect of getting current works and feeding it in.

[00:49:39] And I'm just out of just sheer curiosity, you know, wonder how that fits in with an existing database. I don't know how LibGen works. Yeah. Yeah, I don't either. Interesting, nonetheless. And then AI web crawlers. This was a story that I saw on today's Techmeme list that I thought was interesting. They are overwhelming the infrastructure of the open source software community.

[00:50:03] And of course, you know, we talk about these models requiring high quality information, high quality data. And that's what we want out of them. We want them to be the best versions of themselves. And so they have to, you know, they have to have high quality data to train on in order to do that. Well, apparently AI web crawlers are just going to town on open source repositories like GitHub and so many of the other ones.

[00:50:32] It's leading to service disruptions. It's leading to an increase in cost. The bots account for 97% of traffic on some projects, at least as far as this story on Ars Technica by Benji Edwards, you know, mentions. They can get around standard blocking measures. So if you've got a robots.txt on your site or rate limiting or whatever, apparently these bots are really good at getting around those restrictions.
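For context on what those "standard blocking measures" look like, here is a minimal sketch of a crawler that actually respects them, using Python's standard-library robots.txt parser. The rules, URLs, and user-agent string below are placeholders invented for illustration; this is what a well-behaved crawler is supposed to do, not a description of how any particular AI crawler behaves.

```python
# Sketch of a "polite" crawler: honor robots.txt and self-impose a crawl delay.
# The robots.txt rules and URLs below are made up for illustration.

import time
from urllib import robotparser

rules = """
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())          # normally you'd fetch the site's real robots.txt

urls = [
    "https://example.org/repo/README.md",
    "https://example.org/private/build-logs",
]

delay = rp.crawl_delay("MyCrawler/1.0") or 1   # fall back to 1 second if unspecified
for url in urls:
    if not rp.can_fetch("MyCrawler/1.0", url):
        print("skipping (disallowed by robots.txt):", url)
        continue
    print("would fetch:", url)
    time.sleep(delay)                 # crude rate limiting between requests
```

The complaint in the Ars Technica story is essentially that many AI crawlers skip both of these steps, ignoring the disallow rules and hammering repositories without any delay.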

[00:51:00] And primarily because these are such rich sources of information, sources of data when you're talking about, you know, learning. It's structured, it's technical content, it's high quality. And so you can understand why AI companies would want this to train their systems. But open source, you know, repositories and the people who run these code projects are just saying it's too much.

[00:51:28] It makes us want to close them off. So this is relevant, I think, to the prior discussion. And it's relevant also to what I've been arguing with the news industry. I tell the news industry, I wish you would create an API to news. So you make it very convenient for the AI companies to say one-stop shopping. Here is current, credible, verifiable, well-branded information. And let's talk. And we now have the bait. We're going to make it easy for you to get it. We're not going to sue you.

[00:51:54] And you're going to be free, okay, but then you're going to pay us and you're going to agree to whatever we negotiate about presentation and branding and links or whatever. Well, the book publishers should gang together and do the same thing. The problem is they're going to think that they're worth $3 trillion. And the AI companies are going to say, well, actually, for training, you're only worth so much. And you end up back to the thing where if every author got money for one book being sold, then they get a dollar, right?

[00:52:24] Yeah, right. BFD. Same with these open-source repositories. I think that's a more likely case where they could say, listen, we're going to gang together. We're going to create a crawl like Common Crawl Foundation, our first guest on this podcast. That's right. And say, here, guys, you can have it. Just stop bugging us 100 times over because there's nothing special about your crawl versus anybody else's crawl of all of this. And I think it goes back to the discussion about Google Books scanning back in the day.

[00:52:53] When libraries – I thought of this example. This is an example for you. So a library says we own the copy of the book. We acquired this book. We bought it because libraries buy books. And we're choosing to let Google scan it. That led to a whole massive kerfuffle in the courts. And on the other end of it, it turned out okay because especially older books that I, as a researcher, I couldn't otherwise get.

[00:53:18] Now all the time I can get through HathiTrust and I get through other methods because it was scanned, because there was a model there. So I think we've got to wake up and smell the AI, understand that they're going to go out there and they're going to try to get this information. And the smart thing to do is rather than resisting it is to say, okay, how do we do it in a way that's going to be mutually beneficial? But these industries are not that forward thinking. But I would think that open source places could be. Yeah. Yeah. Well, yeah.

[00:53:47] It is a – when it impacts the resources the way this sounds like, the immediate knee-jerk reaction, and I understand it, is this is – yeah. This is unsustainable. What can we do to protect, you know? But yeah, you're right.

[00:54:04] That would be nice to have a single kind of – a single thing, a single nugget for these systems to pull from instead of having to go in there day after day, time after time to do it over and over and over again. Yeah, this conversation would actually be really interesting to have with Rich – is it Rich Skrenta from – Rich Skrenta, yeah. Yeah, Rich Skrenta from Common Crawl. I'm sure he'd have a lot to say about this.

[00:54:34] Perplexity wants to buy TikTok. This is nothing new. Which they've been saying. They've been saying that. But there's something new here. But they want to – yeah. So they are pitching their vision to rebuild TikTok in America. They want to rebuild the algorithm, which is a signature selling point of the platform. I think that would make anyone slightly nervous. Like, wait a minute. That's the thing that really, really works well, you know, depending on how you define well.

[00:55:04] They also want to add community notes. They want to open source the For You page. Make a lot of changes in TikTok. To TikTok, I suppose. In an effort to save it before the ban on it expires – or sorry, the order – the executive order that protects it from a ban is set to expire on April 5th. And so Perplexity is basically saying, oh, we're okay. We could do that. We could do that.

[00:55:33] I don't know how the resources stack up against Oracle, but – Yeah. A Washington Post story today said that the ban is unlikely to happen because we haven't heard much steam around it. And Trump is likely to extend it further and keep going. We'll see. Yeah. You can't predict anything these days. And again, I think Perplexity, this is not going to happen. But one more time, I really admire their creativity and chutzpah. Their tenacity. Yeah. Yeah.

[00:55:57] They stick themselves into things for good ideas, and they're not just doing the, we have a new model, we have a new model, we have a new model game. They're trying to make this stuff useful in different ways. And so – Well, they kind of need to. They kind of need to because they aren't a company that necessarily has a new model. True. They are a wrapper for a lot of other models. True. And so many people would use the term moat. They don't have much of a moat around them. True. Well said.

[00:56:25] And so this might be an effort to make – to give them more longevity, more relevance in this ever-shifting AI world. But with things like Discover, their news side, they're doing a really good job of creating a useful layer on top of models. So, yeah. And they could do it with social too. Use it every day.

[00:56:50] I mean, another part of this, of course, that they mentioned is that they would integrate TikTok's short-form videos into its search engine, which that – I heard bells on that one. I was like, oh, okay. So that's why you really want TikTok. Yeah. You want TikTok so that you can bring it in and be the kind of go-to TikTok source in the world of AI or whatever. But, yeah. You know, they put themselves into the conversation. Got to give them credit for that. Quick break.

[00:57:18] Mike, then we're going to come back and speak some German. That's coming up in a second. All right. I did not put in the article that is entirely in German. Although, I realize – is it ZIT? ZIT online? Die Zeit. Zeit. In German, you always pronounce the second vowel. Oh, interesting. I didn't know that. Yeah. So it's Die Zeit. Die Zeit. Die Zeit online.

[00:57:46] It's an interview with – Richard, is it? Richard Sutton. Richard Sutton. Turing Award winner. This is a German weekly newspaper. But I will say before we kind of get into – because I'm curious. You definitely have thoughts on this and you pulled some quotes from it. But I'm just so impressed by Google's – by Chrome specifically, its ability to translate on the fly. Yeah. It's amazing.

[00:58:15] It wasn't long ago that an article like this was just unreadable by someone who only reads English. And now it does it immediately and it's totally readable. It's not like weird, broken translation. No. It's excellent. And I've taught – mein Deutsch ist sehr schlecht (my German is very bad). I didn't pay attention in high school and college. It sucks. I'm too old to learn now. That's the reason I go to places like Die Zeit is I want to learn. But now it's just so easy. They right-click and translate on Google and it's there and it's brilliant.

[00:58:44] I often do quote these things and I will choose to try to – if I'm going to quote a paragraph, I'll go ahead and at least test the translation and see whether I agree with it. Not that I'm very good at it, but I can at least make my own judgment. So poor Jason, he saw me put in the rundown, KI ist nur ein Werkzeug. Jedes Werkzeug kann missbraucht werden. What the hell has Jarvis done now? I'm like, I'm just going to hand the keys over to Jeff.

[00:59:14] I'm just going to be like, you tell me what you got. Here's the Volkswagen, which translates to: AI is just a tool, and any tool can be misused. The reason I put this up there is because it's a Q&A and I did use Google Translate to translate the whole thing. I wasn't going to spend the whole time to do it. But I found this was a really good explanation. We talk about AGI all the time. And they asked him which models are kind of closest to AGI.

[00:59:38] And Sutton responded, models like ChatGPT are trained once and then don't learn anything new. Oh, that's interesting. He said, furthermore, these models are still very poor at generalizing, i.e. inferring from their training data to new unknown data. Okay. Like that. Above all, I don't believe AGI is possible without reinforcement learning.

[01:00:05] And he's one of the founders of reinforcement learning, so it makes sense that he would see that. This aspect that AI has a goal and learns from experience is missing from language models. That's why I don't believe language models are ultimately sufficient to achieve AGI. I thought that was a really good and cogent explanation of where we are. Now, we can still debate whether there's any such thing as AGI. I still think it's BS. I think it's the wrong path to go down.

[01:00:33] But even if you want to try to buy some of the definitions, I think Sutton is really on it right here, is that this idea that it doesn't really learn. It learns a skill. It uses the skill. It doesn't learn anything new. It doesn't have any reinforcement from reality. So it can't be more intelligent than us. It cannot solve problems that we can solve because we have so many more inputs. A three-year-old, this is Yann LeCun again. He says, if we could only get to the point of a three-year-old toddler and look at the input they have in their life. Or a cat and a cat brain.

[01:01:02] They have more reality. So I thought this was really interesting. He went on – this is a different topic here. Yeah. He said, I think fear of AI is being deliberately stoked. And that's a shame. Of course, it's possible that jobs will be lost as a result. But that also happened in the Industrial Revolution. In the end, it created more jobs. Die Zeit asked: AI could lead to an even greater concentration of power and money.

[01:01:29] And he said, but AI could also lead to the decentralization of power. There's no such thing as AI. What we understand by that is constantly evolving. Sure, the big tech companies are investing a lot of money now. But at the same time, there are also open source projects. So I just thought this was somebody who is on the inside of this, who understands it extraordinarily well, better than I ever will, who has this kind of smart, reasoned view of it. So I just wanted to share that. Yeah. Yeah. That is really fascinating. Interesting.

[01:01:59] And I love that we can even read it, even if we don't – ein bisschen Deutsch, a little German – don't read German. Interesting. Yeah. So my thought on the whole Industrial Revolution comment, because we've talked about that before, how this is just kind of the evolution of the world. Moments like this happen and things shift around.

[01:02:21] My thought on that is just we are seemingly in the eye of the storm currently, where it seems like, okay, we're on the precipice of major change because of this technology. And if we went through the Industrial Revolution and came out on the other side and survived, and I'm sure people lost their jobs along the way, but also new things were created and everything.

[01:02:48] I guess my empathetic perspective is that all of that can be true. And there may be a time when we're finally at a point where we say, see, it turned out all right. But when we're in the eye of the storm, it can be very difficult to see that. Absolutely. And being in the eye of the storm right now, I don't begrudge anyone for worrying about that. No, and I think that present-tense worry is the thing to worry about.

[01:03:17] Jobs and environment and labor, how labor is used for these things. I agree with that. I think that's what matters more than the doom crap. So four or five years ago, I went to the International Journalism Festival in Perugia, Italy, which is a wonderful thing to do. The pasta is great. The cacio e pepe is to die for. Of course. And I was having a debate with a German regulator. Ooh, what fun.

[01:03:43] And I said, well, you know, after Gutenberg, there were peasants' wars and there was a Thirty Years' War. Maybe we have a thirty years' war ahead of us. And he said, without the slightest irony, being German – this is German day on AI Inside – he said, yeah, it is too soon to joke about that. A million people died. Okay. Yeah. But it was centuries ago. Right. Right. But yes, to your point, in the middle of it: well, print's not going to turn out to be a big deal.

[01:04:13] It's okay. Don't worry about it. It'll be a great thing. But if you're living in the middle of the 16th century, it was a big deal. Yeah. It was a big deal. And it took a while to get to the point where it wasn't. Yeah. So maybe that's where we are now. Yeah. Yeah. No, I think so. And not even in the middle – it kind of feels like we're at the start. Yeah. That is very true. It is at the start. Yeah. I fully believe that. Yes. Yeah. Absolutely. Okay.

[01:04:42] Well, then, to lighten things up for the end here – after all that destruction – let's turn to movies. I have often wondered when we might start to see films in different languages, not only translated and dubbed into your language, as we've seen, and as AI has gotten really good at translation and voicing and all that kind of stuff.

[01:05:10] But also seeing a film that's been translated and dubbed, where AI has been used to make it look like it's in its native language when you're watching it. And I don't think it's a stretch to believe that would happen eventually. So I'm sure I'm not alone in wondering when that would happen.

[01:05:34] Well, there's a Swedish film, a science fiction film called Watch the Skies, that's going to show in AMC theaters starting May 9th. And it will have visual dubbing for the theatrical release in the US. It's not going to be in a ton of theaters – I think it's going to be on like 100 screens or whatever. But it's interesting. I was like, well, how does this technology stack up? And it's done by a company called Flawless. Their technology is called TrueSync.

[01:06:03] And, you know, they say: your film, their language. What's notable here is that it complies with SAG-AFTRA rules. And those are the rules that came into effect after that very lengthy four-month strike in 2023, when they were really going to town on the rights of people in Hollywood and their ability to continue to work even in light of these AI tools coming onto the scene. So this is fully in line with those rules. And, you know, they've got a little promo.

[01:06:32] Hopefully you can hear it. If you go right to 28 seconds in. Introducing TrueSync. Okay, 28 seconds. Immersive and authentic viewing experiences in over 40 languages. And it's all my fault. At speed. And at scale. That's it. That's exactly how we make movies. With TrueSync technology. Yeah, it's interesting. I mean, you know, so they're tracking.

[01:06:58] So it's not just, you know, that they're moving the lips – they are kind of tapping into the emotion, I guess, of the actor. Right. I mean, it looks pretty good. I wonder, with 90 minutes of that, or however long a film is, on a very big screen, if you get to the point where you're like, eh, I'm seeing a little too much of it. You know what I mean? Kind of from the same perspective of how, 20 years ago, CGI in films looked cutting edge.

[01:07:25] But now you look at CGI from 20 years ago and you're like, eh, my eyes, you know, see it. You can't unsee the fakeness of it. And I wonder if we'll kind of get that with perspective and time on stuff like this. But to me, there's two parts of this. The less important part is that when I see someone who is, let's say, French, the way they move their mouths is shaped around the French language. Yes.

[01:07:54] And you can see, I can see somebody with no sound on sometimes and guess whether they're French or Swedish or Chinese based on kind of the vowel shapes that occur. Or Latin languages with rolled R's and things like that. Right? So there's a little bit of that. I think that's minor. The important part here is the drama, the inflection. Is this a joke? Is this angry? Is this sad? Is this scared? It's weird.

[01:08:22] And in that little clip, it sounds like it did pretty well. Of course, it's their promotional clip – they took all their best examples and put them in there. Yeah. But we talked about this before, in terms of being able to – if you wanted to have something that reads a book, let's say. I've talked about the desire for an emotional markup language. This is a punchline. This is angry. This is whatever. So that the computer can know those kinds of things and convey them. In this case, it has something to mimic.
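As a purely hypothetical illustration of what Jeff's "emotional markup language" could look like: real speech-synthesis markup (SSML) exists, and some text-to-speech vendors offer expressive-style extensions, but the <emotion> tag and its attributes in this Python sketch are invented here just to show the concept of annotating delivery separately from content.

```python
# A hypothetical "emotional markup" sketch. The <emotion> tag below is
# invented for illustration; it is not a real SSML element or vendor API.
import re

script = """
<emotion style="deadpan">So I told him the server was fine.</emotion>
<emotion style="angry" intensity="0.8">It was not fine.</emotion>
<emotion style="punchline">It had been on fire for a week.</emotion>
"""

# A renderer (a human actor or a TTS engine) would treat each tag like a
# stage direction: same words, different performance.
for style, line in re.findall(r'<emotion style="(\w+)"[^>]*>(.*?)</emotion>', script):
    print(f"[{style:>9}] {line}")
```

The point of the sketch is that the words and the delivery are separate channels; in the dubbing case Jeff describes, the original performance supplies the delivery channel for free.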

[01:08:50] It has the original to mimic, which I think is important. But I also hope for that next step, where you could edit not just the language, but the emotion. Yeah. Interesting. Because I'm also realizing, as I was poking around on this site – let me see if I can share this real quick – poking around on the site of Flawless, you know, they have their TrueSync.

[01:09:17] They also have something they call Deep Editor, which is meant to be an editor for when you've shot your footage, you have your footage, and you realize, when you're putting it all together, that you're missing something critical to a scene or whatever.

[01:09:35] Instead of, you know, the kind of costly reshoot – getting the cast back together and setting all that up – you can take the material that you have and build around it. And I think there is some aspect of this that ties into what emotion your character needs to show. And I don't know if that's a markup language or more of a graphical kind of approach. But yeah, I think that is happening.

[01:10:05] I don't know if it's a markup language necessarily, but they're certainly paying attention to that and working around it. So interesting. And in five years, I imagine the tools that people who are making these things are already using will be integrated with this stuff, you know, and it won't be this obscure extra thing. Actually, I don't know that for sure. But that's what we're seeing with Adobe tools, you know.

[01:10:34] Making things less expensive to make is obviously a benefit for big companies, but it's also a benefit potentially for new small competitors to be able to make things at that level. For sure. So that's what I hope for. Yeah, indeed. Well, that is it for this week's episode. Lots of fun topics to talk about. That's a lot of high altitude. Plus German. Plus German. Yes. Plus a German education.

[01:10:59] If you don't have access to the Meta database – the, whatever it's called, LibGen – and you want to find out what Jeff has written, well, you don't have to use that search. You can go to jeffjarvis.com instead. And you can find the books and get them for yourselves. The Web We Weave, the Gutenberg Parenthesis, Magazine, and soon to be another book adding to that shelf.

[01:11:27] And we'll certainly let you know when that happens. Jeffjarvis.com. Thank you, Jeff. Thank you, boss. AIinside.show is the page, the web page that you can go to for all the ways to subscribe and to follow and everything. All episodes are listed there with audio and video. Everything you need to know about this show can be found at AIinside.show.

[01:11:51] And then finally, we have our Patreon, patreon.com slash AIinsideshow, where you can support this show directly. And if you are supporting at a certain level, you get ad-free episodes. You get access to the Discord. You get a whole bunch of other perks. You can also get a T-shirt if you're an executive producer of the show. And I'm scrambling to find the button in order to put that on the screen. There we go. Sorry, that's the wrong one.

[01:12:21] Oh, it's just falling apart. It's entirely falling apart. There we go. That's it, I know. There's too many dang buttons. Executive producers this week: Dr. Dude, Jeffrey Maricini, WPVM 103.7 in Asheville, North Carolina, Dante St. James, Bono DeRick, Jason Neffer, and Jason Brady. So many Jasons. And it's just great to have you all supporting us each and every week. We can't thank you enough. So thank you for that.

[01:12:47] Thank you, everybody, for watching and for listening to this episode of AI Inside. We will be back once again next Wednesday with another episode and a whole lot more to talk about. Thank you again, Jeff. Thank you, everybody. We'll see you next time on AI Inside. Bye, everybody.