This week Jason Howell and Jeff Jarvis talk with Dan Patterson of Blackbird.AI about their context-providing service Compass before diving into the week's top AI stories, including Google's "biased" Gemini model, data-sharing deals for AI model training like Reddit's with Google, and the risks and benefits of open sourcing AI systems.
INTERVIEW
- Overview of Blackbird AI's mission to track narrative threats and attacks like misinformation and disinformation
- Introduction of Blackbird's new product Compass for providing context around claims using AI analysis
- Explanation of how Compass works to check claims and provide contextual information from authoritative sources
- Discussion around Compass being built on Blackbird's Raven Risk large language model (LLM) and related APIs
- Examples provided of using Compass for real-world claims like "Is the earth flat?"
- Intention for Compass to help provide clarity and essential context to media content
- Discussion around target users for Compass - social media companies, comms agencies, journalists
- Explanation that Compass determines authority based on how authoritative sites reference each other
- Discussion around Compass having a framework for integrating with fact-checking databases
NEWS
- Discussion around the challenges and nuances of implementing guardrails for AI
- News segment on Google's Gemini model controversy over biased image generation
- Deals emerging between tech companies to sell data for AI model training, including Reddit-Google and rumored Tumblr-OpenAI
- Tyler Perry putting his studio expansion plans on hold due to the emergence of AI like OpenAI's Sora
- Analysis of benefits and risks of open-sourcing AI models
Well, what's up, everybody? Welcome to another episode of AI Inside. This is your weekly show where we, you know, sometimes we talk to people who are doing amazing things in the world of AI.
Sometimes we just talk about the news, the biggest news that's happened the last week in artificial intelligence. All the time, we have fun while we do this show. And I hope that you enjoy the show while we're having fun.
You're having fun too. I'm Jason Howell, one of the hosts joined as always by Jeff Jarvis. How are you doing, Jeff? Hi there.
I'm surprised the music was going on. I know. Yeah, sometimes I hit the wrong button. This is the challenge of switching a show while you're hosting it.
It's not always the easiest thing. Good to see you, Jeff. Good to see you.
Good to have you here once again. And this week we do have a guest joining us. Last couple of weeks we've been doing just kind of like newsy shows and we've been getting some pretty good feedback on that.
We're going to have that later in the show. We're going to talk about some of the big news of the week. But before we get there, we should invite in our guest who's actually a good friend of mine, a good friend of ours, Dan Patterson, who is just an all-around awesome guy, but also happens to be director of content and communication at Blackbird AI. How's it going, Dan?
Hey, great to see you both. I always learn something from the guests you have on the shows, but you two have such a rapport that the news shows, just you two talking about AI, I learn stuff from those all the time as well.
Oh, right. Well, the point is, Dan, we always say that we're here to learn too because we don't know anything. Yeah. Me neither. It's a journey together. Yeah.
Exactly. We're learning right along with everybody. That's kind of the beauty of the show. And I think I've said this in previous episodes. It's like a selfish desire to do this show because I know that by doing this show, I'm going to learn a lot more about artificial intelligence. And it really seems like the right moment in the world to know as much as we can about this technology.
It feels like- What I enjoy about it so much is that one of these stories comes up. It's not the story we talk about so much, but the implications of it. Yeah. The context of it. What does this mean? That's just so much fun to try to figure out. Absolutely.
Because it's new. Yes. And figure out we shall, or at least we will attempt to do today. Real quick, thank you to everyone who's posting the reviews for the show. We love you. We love all the positive, I'm just hearing so much positive feedback from you all. If you haven't done it already, please give us a review and share your thoughts on Apple Podcasts or wherever you review your podcasts. It really does help. And then finally, before we get into the interview to start things off, I do want to mention our Patreon.
Patreon.com. Sorry, that's the wrong one. I keep meaning to remove that.
I have deleted it officially. It is Patreon.com slash AI Inside Show. That is where you can go to support this show directly. You're supporting Jeff and me in doing this show on a weekly basis as we build it up from scratch, which we have been doing the past month.
We're seeing some really positive growth right now. The folks in the Patreon are literally helping us directly and participating in the creation of this show. You get some special perks when you do that. And one of those perks happens to be getting your name called out on the show. So this week, we're going to call out Abdullah, who was one of the first, I believe the third person, to support us in the Patreon when we launched it a little more than a month ago, for making sure that this show happens.
So Abdullah, thank you so much. Thank you to everyone who supports us on Patreon. Patreon.com slash AI Inside Show. Okay, we've got the housekeeping out of the way.
It's time to get to the fun, the meat of the show. We'll start off with Blackbird, which, as I mentioned kind of in the intro here, Dan Patterson, you are working with a company called Blackbird AI. And you have a new product that has hit the market.
But before we kind of get to Compass, it's probably a good idea to kind of start with the mission, like the company mission. What is Blackbird AI all about? People who want to go check it out for themselves, go to blackbird.ai. But tell us a little bit about the foundational aspects of the company.
So, you know, as both of you know, because we've done shows together for years, but maybe your audience doesn't, I was a journalist for a very long time, 20-plus years at various media firms, large and small. And my role at Blackbird is kind of similar. I direct editorial, meaning I run our blog, which is a balance: I spend time with our threat analysts and with our technical teams, the artificial intelligence teams.
And just like on the show, I feel very fortunate because I learn stuff every single day. The fundamentals are, we track what we call narrative threats or narrative attacks. And that's a simpler way of explaining stuff you guys and your audience probably understand pretty deeply, which is misinformation and disinformation. And often, you know, when we say a narrative attack, we can look at, say, some of the cyber attacks that targeted casinos, or some of the disinformation that followed Taylor Swift around through the course of the NFL season, through the Super Bowl. I'm sure we're all familiar with the AI generated deepfakes and the other kind of toxic content that really was a big part of the Taylor Swift, again, narrative, right. And so we track the bad actors or the threat actors, we track the narratives and how they spread. I will share something with you guys later on YouTube that kind of helps visualize what we do, but you can literally see the pinpoints, the influencers, the narrative threads and how they spread and metastasize. You know, we can look at, say, the information environment in Ukraine and kind of get a sense of who the players are, and the narrative threads that they're lobbing at each other.
There is always an information war when there's a kinetic war. So anyway, that's kind of the fundamentals of what we do. And through the fall and winter, we've been building this product called Compass, which is pretty neat. I know this is my own company.
So it sounds like I'm logrolling a little bit, but I truly use Gen AI products, and I have long before I worked at Blackbird, and Compass is great. It is what we call a context check. So any claim, anything you hear, you know that social media is full of claims, we don't try to be a fact checker. But what we do say is that we will provide context. So a great example would be, say, you know, is the earth flat?
That's kind of a test case, but you can pipe that in and get all kinds of really fascinating stuff. One of the examples that we list on the website is, I don't know if you remember the Eiffel Tower meme that happened a couple weeks ago. You can drop in a Twitter link, a YouTube link, a Threads link, a blog link, and it will check the veracity of it and give you context along with citations. So just speaking as a journalist, a writer, reporter, and producer, this is a super useful tool. Especially if you spend a lot of your time on social media sourcing stories, it's really, really useful.
And we hope that in this year where, you know, there's 65 elections or something like that happening across the globe this year, including the United States general election. This might be a tool that helps provide some clarity by giving essential context to media content.
So Dan, when I saw the tool, I got all happy, which I'm sure you'd expect.
Awesome. Yeah, I would imagine for you and your students, it might be pretty useful.
Yeah, but even more than that, I mean, Blackbird is aimed at dealing with and understanding misinformation and disinformation. And I've been involved in that world ever since 2016, raised a bunch of money to support things and so on and so forth. I've come to see more and more that those who think we can knock out this disinformation, like a game of whack-a-mole, are fooling themselves. And we need to shift our attention to that which is credible and authoritative and expert and so on.
And so looking at Compass, it was just so great because that's where it went. Yes, it'll tell you something is wrong. I put in there, did Donald Trump win the 2020 election, and it came back with a straightforward answer and said no, and explained that. But it's also about going to trusted, authoritative sources to give you good information rather than trying to knock down the bad information: giving you good information. So how does it do the sourcing? That's what I'm really curious about, because I think it's so valuable to try to identify those who know what the hell they're talking about.
Yeah, so in fact, I sat down this morning, I was in the office and talking with Vanya, one of our AI leads who helped build this. This was, Compass was like a pet project of his through last summer.
And it looks like that. It looks like, we know all this stuff, we could also go play with it. Like Google News was originally.
Yeah, that's exactly right. It was kind of skunkworks, and then he built it, and then Paul, our other AI lead, kind of joined on. And then the product teams got interested, and we all were like, oh, that's really cool.
And just me as a journalist, I was egging everyone on like, yeah, yeah, we should do it. So the short answer, Jeff, is that it requires a long answer. What Vanya explained to me today is that it works, it's obviously an update of this, but kind of like Google's PageRank, if you remember that, right? It's artificial intelligence, and it constantly shifts and changes, but it determines authority based on signals similar to the ones PageRank used, which counted the number of inbound links and ranked sites higher when they had more inbound links. So this, again, changes over time. It's pretty dynamic, and I think they're constantly updating the model, but that's how it's based on authority, Jeff. It's how authoritative sites talk to each other.
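To make that concrete, here is a minimal, illustrative sketch of the PageRank-style idea Dan describes. It is an assumption for illustration only, not Blackbird's actual Raven Risk model, and the site names in the link graph are hypothetical: authority flows iteratively from linking sites to the sites they link to.

```python
# Illustrative sketch only: a PageRank-style authority score, NOT Blackbird's model.
# A site's authority is computed iteratively from the authority of sites linking to it.

def authority_scores(links, damping=0.85, iterations=50):
    """links maps each site to the list of sites it links out to."""
    # Collect every site that appears as a source or a target of a link.
    sites = set(links) | {t for targets in links.values() for t in targets}
    n = len(sites)
    scores = {site: 1.0 / n for site in sites}  # start with uniform authority
    for _ in range(iterations):
        # Each site keeps a small baseline; the rest flows in via inbound links.
        new = {site: (1.0 - damping) / n for site in sites}
        for source, targets in links.items():
            if not targets:
                continue  # a site with no outbound links distributes nothing here
            share = damping * scores[source] / len(targets)
            for target in targets:
                new[target] += share  # inbound links confer authority
        scores = new
    return scores

# Hypothetical link graph: who cites whom among a few made-up outlets.
graph = {
    "wire-service.example": ["newspaper.example"],
    "newspaper.example": ["wire-service.example"],
    "random-blog.example": ["wire-service.example", "newspaper.example"],
}
for site, score in sorted(authority_scores(graph).items(), key=lambda kv: -kv[1]):
    print(f"{site}: {score:.3f}")
```

In this toy graph the blog links out but earns no inbound links, so it ends up with the lowest score, while the two outlets that cite each other accumulate the most authority, which is the "authoritative sites talking to each other" dynamic Dan mentions.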
I should mention two things to you. Go ahead, Jason.
Well, I was just going to say, who is this for? Like, if you had to pick the target audience for a tool like this. Obviously everyone could benefit from this, because it allows us to flex and to learn the skill of being critical in our thinking when we are presented with information, not just taking it at face value, and actually verifying things. So everyone could benefit from this. But I see it as a tool that might have very specific uses for certain industries, journalism, obviously, as we're talking about. Is that kind of the idea behind a tool like this, that it will be useful to everyone, but particularly used as a tool by certain areas, certain groups?
Yeah, I think that's probably the idea, although I would imagine this kind of shifts and changes over time from a just a pure business perspective. And I don't run the business, so I can't really claim to speak for that component of our company.
But I would imagine it's a great white label product, right? Like, think about trust and safety at social media companies; it would be really easy to slide it in. I think one way we're trying to frame it is, this is kind of like the context layer of the internet, right? Like, you could build this into a lot of different sites and it would add additional context through interacting with the site, kind of like Community Notes on the older version of Twitter, the Jack Dorsey version of Twitter; there was a crowdsourced Community Notes, and this could fill a similar function. I would also imagine tons of comms agencies, and definitely cybersecurity and threat intelligence companies. I think from a business perspective there's probably a lot of uses, but really, Jeff, you hit it on the head. This was kind of a skunkworks project, and, kind of jokingly, the team and Naushad, our CTO, were, you know, genuinely into this as being a helpful product.
Just like you said, Jeff, it is kind of a game of whack-a-mole to look for instances of disinfo and point them out, and a better way to deal with it is, at least, reliable information and context. And I think that Naushad was like, yeah, we're doing it for the love of democracy. Like, we actually love the fundamentals and the things that we can all kind of agree on. I'm trying hard not to say real or facts or true, because that stuff changes too, but things where we can find common ground.
Verified, yeah, right, reliable.
Yeah. So a lot of things come to mind. And I'm going to do a journalism geek-out for a minute. Yeah, please bring it. Bill Adair, who founded PolitiFact, which is an important resource, he does a lot of work. When 2016 happened, the association of fact checkers got all involved. And there were efforts, I like that you used the word claim, because there was a lot of effort to catalog claims out there that were made in journalism, in political speeches, wherever, so that when they had been fact checked, you could link from the claim to that in a database.
Right, yeah. And so it strikes me that this connection with Bill and company and all the fact checkers would be really valuable: when they've checked something, that's a source for you, but also the existence of the claim says this is something you should check out with a click. And so I could see news sites putting this in, hyperlinking a claim by a politician or by anybody and saying, you know, look it up.
And there it is. Similarly, on the source side, somebody I worked with, Connie Moon Sehat, who's now doing some work at the Center for News Technology and Innovation, I funded some of her work earlier. She did work most recently with Wikipedians, because they go through a lot of choosing the right sources. Yeah, and they have tremendous discussion across the whole network.
So she was working on trying to bring that work together. I don't know what the latest piece of it is. But there's other interesting work. You're right, this is not fact checking per se, but it's a cousin of it.
Yeah. It's a complex they're building. Yeah. And there's a structure, I think. I mean, if you guys know these people already, fine; if you don't, I'd be happy to introduce you, because I think there's a structure into which this fits. Yeah. I'm glad you're thinking that way, because that was kind of the fundamentals that this was built on. I think when you say structure, maybe you're saying something similar to when we say the context layer of the internet. Like, there's a framework around this, right? It's context, it's a cousin of fact checking, and they can get along very well. But it also, you know, it's built on our API, and it can be a framework as well as a product.
Given, we'll talk about the story later, I'm sure, the woke Gemini moment: how politically sensitive, I don't mean within company politics, I mean national politics, is a product like this? Or are you already having to deal with that anyway, because you're already saying this is false, or this is disinformation, right? You kind of have the thick skin as a company to be able to deal with what to some people are delicate questions. Not delicate.
What am I trying to say? Controversial? Yeah, that's such a fantastic question, Jeff. Sorry, I've got a little bit of the daycare sniffles. So yeah, thick skin, definitely. We're doing stuff that, as a reporter, it's political season, I'm like, knee jerk.
Oh, a minor thing happened, somebody sneezed, let us write stories about this and try to pick it apart. But really what we're doing is sitting back and creating, well, not creating content, but we are gathering data around particular events. And if there is something that we think is really noteworthy and we have to comment on it, we will. But our response to this particular political cycle, and the many dozens of elections that are happening this year, was to build Compass. It was so that we didn't have to write a blog post and pick out: that was a deepfake, and that was disinfo, and these were Russians, and so on. When you're doing that, you are playing whack-a-mole. So instead, I think we tried to build a tool and an API, a framework, that could structurally handle those issues on a broader scale.
Sometimes it feels like we're in a sea of these attempts at disinformation, whether intentional or not, I guess by definition that would be intentional. But when we're surrounded by all of these pieces, these attempts at disinformation, how does Compass declare or decide which of these to spend the time on?
Is it a systematic algorithmic approach, or is this something that a team of experts are monitoring and saying, all right, this is important, we need to be sure that we have some sort of coverage or some sort of information on this particular claim?
Well, we're trying not to put our thumb, to have humans put their thumb, on the scale. Obviously, humans write code, and in this case, our humans did write the code. But it is algorithmic, like you said, and it is dynamic and changing.
Now, we're constantly looking at it. Well, let me say, I'm not building the AI. Our AI experts, who are far smarter than I am, are looking at the code and kind of adjusting things as they go. But I think it is far more like PageRank.
It shifts all the time, and the authority of something changes depending on how the internet, and the authority on the internet, changes. Yeah, that sounds a little like, hey, it's a black box, just trust me on that, just trust me. But that is, the short answer is a much longer answer, and more technical.
Mm hmm. Cool. This is built on a large language model?
Yeah, yeah. So first, we built the, we're calling it the Raven Risk LLM because it looks for risk signals.
So first, we... did that just go out for you too, Jeff? Yes.
Oh yeah, that might be right. Yes. Yeah. So pick up from "so first." Yeah, I bumped my XLR. That's okay. So first, we built what we called the Raven Risk LLM. It's just a product name for it, but we built an LLM that this can rest on, and then APIs based on that. And this is a product based on those APIs.
And it's also calling to current information?
Yeah, I think it's pretty current, yeah. I used to put it, so my nickname for this was Lazy Checker. Before we had a product name, I would put this right next to Google News. And in the morning, I get up at 4:35 and I exercise, take the dog out, drink coffee. And like, I'm a news junkie. I want news in the morning. So I've got Google News next to the New York Times, 404 Media, Platformer.
And I also had Compass there. And anytime I would see a claim, you can either go to Google News or Google and, like, look it up and spend some time, or I can just put it into Compass and then drink my coffee. And it is slower. You will notice, like, Compass doesn't instantly give you results. It's because it is looking through the shifting and kind of dynamic range of authority. But by the time I took a couple sips of coffee, it would have a bunch of news and information for me that provided context to a story.
I know that sounds like I'm kind of hyping our own product, but I just called it lazy checker because I could just sit here and be kind of lazy. And I didn't have to like find those links. This found those links and I could still go and read the articles.
Yeah, it freed you up to drink your coffee. I mean, it was a lazy checker.
Yes, I think it's really, really promising. Do you have any idea when it is going to be made public? Now that Jason and I are privileged to be in it.
I don't. Although I would say, like, anyone, if you're listening to this, I don't use social media heavily, but I have accounts everywhere. So just at me: go to compass.blackbird.ai and just reserve a seat, and then at me on a social platform and I'll talk with our team and get you approved. Sweet.
Cool. Yeah, it's really cool. Everybody should definitely do that. Look for Dan Patterson and check it out for yourself. It's really neat. I call it a tool, but I realize it's a service, not a tool. I suppose there's a difference there, but it can be used as a tool for you.
There you go. Yes. Yeah, there was my lazy checker tool. There you go. It's the lazy checker, also known as Compass. Well, Dan, that's excellent. Thank you for telling us all about Blackbird's Compass at compass.blackbird.ai.
And you're going to stick around for this kind of second half of the show to talk a little bit about the news. Because, as you mentioned earlier, Jeff, Google's Gemini. This has been the week of Gemini. It seems like every week there's a particular thing; the previous week it was Sora and the ripple effect that that's had. This week has been all about Gemini, and things got really political and interesting around this. I don't know, give us your thoughts, because you actually wrote about this, Jeff. Yeah, it was
weird to watch, because the problem was that Gemini basically couldn't be forced to show white people. When asked to show, for example, the founding fathers of America, there was a black person at the table, and it was asked to show popes in the past.
It showed a woman pope and a Native American pope. And clearly what it was doing was being told in its guardrails: watch out for bias. And when you're asked for a doctor, don't make them all white men. And it did the same thing with popes and founding fathers. And there was some really weird stuff, like black Nazis, which of course makes no sense. But as ever, let's point out that large language models and generative AI have no sense of, they don't know what a Nazi is.
They don't know what black and white and red and blue are. Anyway, there was a furor. And this time it came from the right, with all kinds of people writing, oh my God, it's woke Gemini, and it's anti-white. And much more fury came out as a result.
Google pulled it down, the image part. And then people found other things in the text. So I'm watching this as a spectator, as we do. But a few things occurred to me, and I just did some tweets as I went.
And then I turned that into a Medium post real quickly. Number one, again, unlike Compass, which does deal with facts and does find credible sources, large language models on their own, as the model, unless they are given a certain corpus of data to draw from, they don't know facts. They don't do facts.
And so what occurs to me is, I've said this before in the show, I almost wish that generative AI had been presented as a creativity machine. None of this is real. Have fun with it. Make stuff with it. Based on everything that everybody has done before, see where society is all wrong and screwed up based on what we've done before. But that's not the way it was presented. It was presented as part of search engines.
You put in a query, you're going to get something back that's going to be reliable. That's not true, folks, but that's part of the, I think, fraudulent presentation of them. So that was one point. The other point that struck me, and I've talked about this before too, but this really became an illustration of it, is the impossibility of guardrails. People expect, we've talked about this a little bit on the show before, they expect the model maker to prevent bad things being done with them.
But if it is a general tool, then it could be asked to do anything. And there is no way to predict everything bad that everyone could ever do. But because that pressure was on, that's why Google, inside, said: don't show too many white people, make sure to be diverse.
Don't just presume the bias of society that white men are doctors and white women are nurses or whatever. And so, again, it applied that in a way that was seen as problematic. But I also think it should make us stand back and say it's just a machine. It's just a tool. It does what we ask it to do.
Including those guardrails because people demand it of them. And every decision like that has implications. So I think that we're in a real simplistic stupid period when it comes to AI.
One more point, then I'll shut up. Julia Angwin, who started The Markup and before that was at the Wall Street Journal, who I've had complaints about because I think she started the entire moral panic around cookies when she was at the Journal. And she attacked Section 230 in the Times recently, which doesn't make me happy.
So she's now started a new venture. And today she had a story, which we didn't put in the rundown, saying that they cataloged all the mistakes that these models make. Well, we know that. There's no news in that; we know they're wrong. If you ask 100 more questions and they're wrong 20% of the time, then you'll get 20 more wrong answers.
Or if you ask a thousand, you'll get 200 wrong answers. We know that. And I don't think it does much to say it's wrong, because we know that. There are other things we think AI can do, like summarize and translate and transcribe, and I wish people would put effort into testing that and seeing whether that's reliable. But we're in the stage of moral panic around AI right now. So any weakness, like being woke, is going to cause furors, and there'll be congressmen who go nuts about this.
And when I testified before the Senate, Marsha Blackburn went after me because she could get a poem about Joe Biden being wonderful, but couldn't get ChatGPT to make a poem about Donald Trump being nice. That's the guardrails. And so I found this to be a fascinating little episode. Sorry to go on so long, but there was a lot of stuff happening in this.
Yeah, Dan, I'm curious to know what you think about this, about the argument around, you know, guardrails. I guess it's not as easy to say guardrails good or bad. Like in my head, when I think of guardrails around this technology, I'm, and I think I've said this on the show before.
I'm of two minds about it, because on one hand, I completely agree with you, Jeff, that maybe that's the wrong place to focus the attention at this time. At the same time, people want, you know, want technology companies to answer for mistakes that do happen. And sometimes those mistakes can be damaging. So the effort should still be there, even though it shouldn't be the only thing that exists in this conversation. But what are your thoughts, Dan?
You know, if ever there was a time to have patience and try to better understand the nuance, it is now. It's, I think, exactly like you said: this isn't good or bad.
It's not a binary, one or the other. And right, we do risk having kind of a moral panic and an outrage over something that we just have to understand is an emerging, very powerful technology. Love it or hate it, it's a huge component of our economy and our culture now. And we should have patience and try to better understand the nuances, whether we are technologists or not. It's also totally unreasonable for me to expect people to have patience and nuance.
And, you know, it might be kind of privileged of me to say that too. A lot of people don't have, you can't just sit here and say, I'm going to contemplate technology. They are impacted by technology. I think it's a challenging time. Yeah, it's really hard to educate about this stuff. I have to learn about it every day. I've got to figure out these implications, but it's easy to oversimplify it. Yeah. Yeah, indeed.
Well, that is just one of, although that was probably one of the bigger stories that I think was reverberating in the world of AI over the last week. I thought this was interesting: companies making deals with their valuable data to allow for the training of AI with that data. And there's a couple of examples here, one that I put in here, one that you put in, Jeff, and I'm happy that you did, because I wasn't aware of it before you did. But the one that I had recognized was the Reddit partnership with Google announced last week, giving Google access to Reddit's data API stream, its historical data as well as its real-time content. And, you know, it's interesting, because if you remember, last year Reddit implemented new API changes to really pinch off its data being used for things like training others' AI models, and they said back then that it was an opportunity to really explore new pathways for generating revenue for the company. And what do you know, here in 2024, here we are. So Reddit is sharing its data stream with Google, and in that partnership Reddit also actually gains access to Vertex AI, which is Google's service that aids other companies in improving their search result quality.
So Reddit is getting something out of this as well. And then, Jeff, you put in here that Tumblr and WordPress, this is not announced, this is not official, 404 Media has a source, at least, that says there's a deal in the works: Tumblr and WordPress, separately, planning to sell data to both Midjourney and OpenAI. And like I said, that's unannounced, but we're starting to see a little bit of opening up, I suppose, on the part of the companies who, for the last year at least, many companies have been saying, hey, wait a minute, that's our data, you can't just take it and do what you want with it.
And now there's a little bit of a payday happening and kind of, I don't know, feels like a little bit of an about face, or is this just kind of expected like maybe their point all along was hey, you can't have our data, unless you give us something for it.
And this was the subject of my testimony in the Senate: that they want to try to get paid for even freely available, searchable content. Everything's copyrighted. We got to a point some years ago where you no longer had to file to copyright something, so anything's copyrighted. And so: you must pay us for it just to read it.
And we'll get to this in the next story about the New York Times in a minute. Is that the case? Is that a necessary right? I'm not sure. This raises a few issues. One is, it sets a precedent, which worries me a little bit, that everybody out there, every little blog, is going to think they're making a fortune from teaching models, which don't really need that much to teach them when you get down to it.
It's about learning grammar and words and historical associations of words and so on. Second is the users of Reddit and Tumblr and WordPress, and I've had a WordPress blog, who I would think wouldn't necessarily be very happy right now, on the one hand, because these companies are enriching themselves with our stuff. On the other hand, WordPress is cheap and Reddit is free, and how else are they going to make some money? So, gee, okay, so they get to make some money on our backs and it helps make the service free or cheap.
Okay, it's probably hidden somewhere in the terms of service anyways that most people don't read.
Exactly. But there's no negotiation of that. At some point, do you need a council, do you need a governance structure for WordPress? Do you need a governance structure for Reddit?
Every subreddit in a sense has that. But does Reddit as a whole have a voice in these negotiations? And right now they don't. And that's, I think, an issue. At the same time, Meta is learning from everything on Facebook and Instagram. Let's be clear.
Yeah, for sure. So just so happens. Yes, exactly. It just so happens they have their own AI models that they're feeding their own data into, which I guess to a certain degree was kind of what I thought Reddit was going to do. They've got this sea of data, this really rich source of data and information. I just thought, well, everybody's got their own AI model that they need data to train, like, why wouldn't Reddit? Also, Reddit just announced its IPO.
Yeah. So getting a new source of $60 million is very helpful for that.
That's true. That's a really good point.
Yeah, ding ding. Yeah. Indeed. Jeff, you want to set up this OpenAI lawsuit? So this was OpenAI pushing back, anyways. We've talked about the New York Times suing OpenAI for stealing its stuff. Allegedly. It says in the openings of the suit that news has always been protected by copyright, which is false.
It wasn't until 1909 that it was. It contends that OpenAI absorbed whole articles and then spat them back out. But even at the time, people looked at this and said, one of them was a review of Guy Fieri's restaurant, and the paragraphs that were quoted, because they were so nasty, were quoted all over. So there were tons of places that OpenAI could have gotten it. And then, as people also speculated, they said, well, you know, there's ways that basically the Times trapped OpenAI to the point where the only next word could have been the one from the articles.
As we understand it. Well, so OpenAI responded with a filing on the suit. And I thought it was well done.
And I said outright that I'm on team OpenAI on this one. They said the allegations of the Times complaint do not meet its famously rigorous journalistic standards. Isn't that a lovely little dig? The truth will come out in the course of the case. But what they say is that the Times hacked OpenAI.
Probably not a great use of the verb. But their point was to say, A, that a company they hired had to make thousands of attempts to get it to do what was bad. B, it was a violation of OpenAI's terms of service. C, nobody uses it this way, so you're suing as if it's a violation that's going to happen.
Nobody else is going to be able to do this. And D, yeah, it was a bug and we fixed it. So they talked about hired guns. And they said the real question here, which is right, is whether it is fair use under copyright law, I'm quoting, to use publicly accessible content to train generative AI models to learn about language, grammar, and syntax, and to understand the facts that constitute humans' collective knowledge.
That is to say, our comments. On the one hand, oh, OpenAI is worth billions because of this. On the other hand, it's feeding us back to us in useful ways. And the Times, I saw a story somewhere that some large number of sites have opted out of being scraped by AI companies. Well, that's going to affect the quality of the AI that we're going to end up using.
I'll be curious to hear whether this is discussed often at Dan's company. But at some point, the courts do need to decide fair use. Fair use, famously, has no explicit definition. That's the point: you negotiate it. Larry Lessig said fair use is the right to hire an attorney. And that's where we are with the Times. I thought OpenAI did a good job.
I think they made a pretty strong case. What do you think, Dan?
You know how we were talking about nuance a few minutes ago. Yeah, I think this is just challenging. This is really one of those cases where I can understand the Times' argument and I can understand OpenAI's argument. And I'm a journalist, but I think I probably lean towards being convinced by OpenAI. So I really don't know that I have any insights here, other than I think this is fascinating and it's just worth watching. Well, we live in an interesting era, in interesting times.
You know, Dan, does your company do its own, my friends at Common Crawl don't like to call it scraping, its own crawling of sources out there, or do you use things like Common Crawl? Do you know how you're navigating these questions of using material on the public web?
Yeah, I know that we're navigating it by having ongoing conversations, and I think pretty open ongoing conversations. But I also know that we, like every startup, make data sharing agreements; we buy the data that we use. And we buy it under very traditional and very legal, very sound, like, best practices.
So I don't know the totality of what we do, but I think what we try to adhere to is as legally, but also morally and ethically, sound a practice as possible.
I imagine at some point it becomes a challenge for every company, if the notion of AI gets cooties and they're all bad, we're not supposed to touch it. Then you affect things like Compass and its ability to check news and to use that news as sourcing to underlie the credibility of news. There's a benefit to being included.
To be fair, we're not calling it fact check, on purpose, because we want to make sure that it's context; facts can evolve and change depending on who you talk to. At one point it was a fact that the Earth was the center of the universe. Air quotes, fact. Right. You know, Jeff, what I'm really interested in right now is this idea that models need these massive training datasets, and at some point they will have to just recursively train on AI generated content.
It seems as though we are kind of in this. Did it drop? No, no, no, no, I'm agreeing with you entirely.
Like, yes. There's probably, in the short term, going to be some sort of cold war between model companies, AI companies, in trying to secure these deals, whether it's with the New York Times or Reddit or Automattic or whomever. There certainly will be this short time where everybody gets a small advantage by acquiring better data or more data. But at some point they'll run out. There will be the entire corpus of human knowledge stuck into these systems. And then what do they train on?
Yeah, themselves. My friend Matthew Kirschenbaum, from the University of Maryland, an English professor there, wrote a piece in The Atlantic, which I've mentioned often. He talks about the textpocalypse: that when that happens, when it's recursively training on itself, it ends up in a gray goo. And it strikes me, at the conference Common Crawl and I are holding at the end of April, as we were talking about this, one of the sessions that I want to have is, we talk so much about what to take away from AI, what to take away from the web, what to put behind paywalls, whether that paywall is for readers or for machines. But the truth is, the web is missing all kinds of stuff. What we should be talking about is what to add to the web: communities that aren't represented there, languages that aren't represented there, knowledge that isn't represented there.
You know, the Wikipedia ethic of saying what are we missing and let's go get it and add to it. And so I think we're in a phase of self-destructive subtraction where we should be in a phase of generous addition. I like that optimism or maybe that's pessimism.
I try.
Well, that's all you can do. Shifting gears just a little bit. I saw this story earlier, and I guess this would qualify as pessimism versus optimism. Tyler Perry, you know, famous actor, director, I mean, commands an entire studio of content over the last couple of decades, had been planning a major studio expansion in Atlanta. And I'm talking like to the tune of $800 million or more; over the past four years he's been working on this. And he said in an interview with The Hollywood Reporter last week that the project is now on hold. This comes after Sora was announced, that's the video generation service that we talked about a couple of weeks ago from OpenAI. In the interview he stressed the emergence of AI like OpenAI's Sora as a big reason for the postponement of his plans. At the same time, he also expressed that he's not opposed to AI; in fact, he's relied upon it in some of the production tasks on upcoming projects that haven't been released yet. But about Sora he said: it makes me worry so much about all the people in the business, because as I was looking at it, I immediately started thinking of everyone in the industry who would be affected by this, actors, grips, electric, transportation, sound, editors. And looking at this,
I'm thinking this will touch every corner of our industry. Which, I mean, I think he's probably not wrong. But I don't know if it necessarily means that it all goes away. Maybe it just changes things. But I mean, this is a really big shift with a lot of money, you know, behind it. Some people have thought, well, maybe he had plans to postpone and this was just a very convenient reason to do so. I don't know, that's all conjecture. But what did you think of this story? Dan, what did you think?
I think, you know, the most cynical take would be that, but I also think that it could be pretty straightforward. Yeah, look, I think he's a very well liked guy, and he's obviously smart, a genius and brilliant business person.
And I think he probably is smart enough to just do a cost-benefit analysis and say, you know, look, a lot of what I would provide as a studio, AI threatens that business. It's maybe a sad but smart decision. Yeah.
Yeah, I think you're right.
Yeah. So I'm working on a book about the Linotype. Just a little test here. Dan, do you have any idea what a Linotype was?
No. This is why I didn't learn on a typewriter.
This is why you need to read. Yeah, this is why you need a book about the Linotype.
Yeah. I read it. So, what I'm holding up, for those of you listening, is a piece of type. All type in all languages was set one letter at a time until the Linotype,
named for the manufacturer, set a line at a time. Now, the typesetters who were doing this a letter at a time did say, uh oh, that's us. But the interesting thing that happened was that the union said, it's inevitable. It's coming. So we must take charge of it. We are the ones who understand how to set type, we understand the rules of grammar and typesetting and aesthetics, and you must give us authority over these machines. And they lost some jobs for a while, but they recognized that publishing would explode.
And with it, they would get more jobs than they ever had. That's what happened. And so I wonder whether the electricians, I don't know, though they may be putting in servers somewhere.
I wonder whether some of these trades can in fact expand, because there can be more creativity; more things can be made at less cost, with more talent, more solutions. Yeah, and I just think that's the way to try to look at it. And I know it's easier said than done. And it's a power game and a money game. But I think leaders in these fields, I hope, should be looking at ways to expand with this stuff. Because what else, what choice do they have?
Yeah, but it's a big gamble, right? When you've got $800 million to sink into a project and suddenly you recognize, as a business person: oh, wait a minute, here is a technology, and at least right now I get the feeling that this technology is big enough and powerful enough, that it could do a lot of the things that I'm building out this infrastructure and building up this team and paying people lots of money to do, at a fraction of the cost. Is it wise? Is it smart to invest the money this way right now, given what I think I know about the future? And so it's a total gamble, right? Because he could be completely wrong. He could also be completely right. You just don't know at this point, but it's a lot of money on the line. And I think, you know, if that is truly the reason that he's shutting down the project, that really says a lot about the capabilities of what we're seeing with this technology right now.
Stanford University's human-centered AI research center is out with a paper that analyzed the use and need of open-source AI models, things like Llama 2, Stable Diffusion XL. This was your story that you put in here, Jeff.
Yeah, I haven't read the whole paper yet. But when I went to the World Economic Forum's AI governance summit in San Francisco some months ago, there was a lot of discussion. I would say the greatest area of disagreement in the room was around open source AI. Andrew Ng, folks like that, were saying, I can't believe we're having a debate about this, we've got to have open source AI. But other people in the room were saying, oh, it's dangerous, because once these models are put out there, whatever guardrails, see our earlier discussion, can be circumvented.
And oh my God, what dangers can happen, you can't control it. Well, A, I don't think you can really control it at the model level anyway, but B, there's argument about the benefits of open sourcing. So this paper, from like 25 scientists, including people I really respect like Rumman Chowdhury, went through it. There's a summary in the Axios report, and there's also a summary on the Stanford site, where they look at issues of what's the existing risk, what are the existing defenses, what's the marginal risk, what are the uncertainties. But then they also came around and looked at the benefits of open source: that it increases innovation; it enables scientific research, because you're not just in the hands of the big companies, academics and startups can use it; and you're enabling transparency, which open source is so important about, accountability and transparency in technology.
You're mitigating monoculture and market concentration, which is to say that OpenAI and the AI boys end up in charge of everything. So, you know, they come around and look at, and again, I'm looking at the summary rather than the paper, how to try to deal with these issues going forward. I think it's an important issue to understand. We've talked about it on the show before: the desire to have these points of control, the desire to have the guardrails, the desire to have regulation, and how much we can and can't count on that, and being open-eyed about it as we go. So that's as much as I know from the paper. I just thought it was worth mentioning as something to keep an eye on, and I'll find the actual link to the paper so you can put it in the rundown, Jason.
Excellent. Yeah, fascinating stuff, I think. In my reading through, at least, the Axios article, because I also didn't read through the full research, the researchers kind of point out, and I've heard you say this many times, Jeff, that the risks, like disinformation, like scams, and bioterrorism was another thing they called out, all that stuff existed prior to generative AI; it wasn't created with generative AI. And, you know, they recognize that AI amplifies and potentially accelerates those risks, but it's not the creator of them.
Well, it takes us back to Blackbird and Compass, I think, in that what I hear all the time is people saying, oh my God, deepfakes, AI, we're going to have a worse disinformation problem. And I was talking to a Norwegian podcaster this morning about this, and I said, okay, I can lie today without AI. I don't need a machine to lie. I can tell you now that vaccines are dangerous, which is not true, folks.
And yeah, I can make a pretty video about it easier, and I can make more copies of something easier, but disinformation and lying and ignorance have always been there. And so it's interesting, when Dan talks about what Blackbird does, it's looking for those vectors. Yeah. Distribution. That's the, it's not a volume game, it's a spread game. Am I wrong, Dan?
Yeah, I mean, I would say it's probably also an amplification game, but what type of amplification, by what actors, where, for what purpose. And you're absolutely correct, you know, you could lie before; all of these tools can enable bad behavior that existed long before the tool itself.
You've used the two most important words. I think what it took me a long time to learn is the ABC framework: actors, behaviors, content. Yeah. And the discussion tends to go after the content. We've got to kill bad content.
No, you'll never win, number one. And number two, people use actual facts, but they mislead with them. It's the actors and their behaviors that matter.
So what I hear Blackbird is doing is recognizing those actors and their behaviors and how they spread and are amplified. And that's the smarter path to try to deal with this problem.
I think it is. I mean, that's nice of you to say, but I think it is also, like, when you spend time thinking about the challenges, and then again the nuances of the challenges, you probably, Jeff, come right back to that conclusion, and I would imagine that's where a lot of smart people land. Yep, I agree.
Well, Dan, pleasure having you on the show today. compass.blackbird.ai is the site of the new service, the Compass service that we were talking about earlier, if people want to go there. You can click that button right there in the middle. It says reserve your seat, or, as Dan said, you could find him on socials and who knows, maybe he'll hook you up. By the way, I'm not going to make any promises, but
reserve your seat is a nicer way to say join the waitlist.
Yeah, I think we used to say join the waitlist, but it sounds, we didn't want it to sound effete.
Yeah, I like this. There's a certain generosity to it.
Yeah, I mean, it really was going back to the beginning of the conversation like this really was a passion project and we want the language of the site to reflect the fact that we care about this. It's not just some cynical tool to join the generative AI crowd. Blackbird has been in AI for a long time. This is just something that everyone was really excited about building. Nice.
Well, thank you for coming on today and sharing that, and your thoughts about the news stories, and just hanging out, because I also consider you a friend. So it's really great to have you back. Yeah, it's really good to see you. Thank you, Dan. Jeff, I think I know what you want me to plug, or what you want to plug, on the show. It's gutenbergparenthesis.com.
Yes, you can get both The Gutenberg Parenthesis and Magazine with a discount code still. So thank you very much.
Nice. And you've got more on the way. You're a busy guy.
The Web We Weave, coming out this fall from Basic Books.
Dang, I don't know how you do it. Honestly, I really don't. I don't know. You don't know either. You just do it at this point. It's just part of your life. Well, I'm really looking forward to what you have coming down the pipeline. Everybody should check it out: gutenbergparenthesis.com. As for me, I'll just plug yellowgoldstudios.com, which, as you may or may not know, AI Inside is a product of Yellow Gold Studios. And if you want to watch us do this live, you can. If you go to that URL, it actually redirects you to the Yellow Gold Studios channel on YouTube, where we do stream this show live every week.
And there's other stuff hitting there as well. I'm doing a comparison of the OnePlus Watch 2, the sequel that just came out earlier this week, to the original, to kind of see what the differences are. I've got a lot more plans. So, yellowgoldstudios.com, appreciate your support there. Everybody, thank you so much for watching and listening to this episode of AI Inside. Like I said, we record usually every Wednesday, 11 am Pacific, 2 pm Eastern, for those of you who go to the channel to watch it live. If you just want to subscribe, you know what, that's what most people do: aiinside.show is the web address to subscribe to the show. If you want to support us directly, well, you can do that as well. As I said earlier, patreon.com slash AI Inside Show. That support comes to us directly to really fund the creation of this show on a weekly basis. We really appreciate you.
And then outside of that, just follow the show on all major socials at AI Inside Show. And really, that's all we got for this week. Thank you so much for watching and listening once again. We'll see you next time on AI Inside. Bye, everybody.