In this episode, Sven Størmer Thaulow, EVP/Chief Data and Technology Officer at Schibsted, joins Jason Howell and Jeff Jarvis to discuss the creation of a Large Language Model (LLM) for media in Norway, the importance of aligning AI models with cultural values, and the potential for American media companies to collaborate on creating English language models. The conversation also explores the use of AI in generating article titles, providing a different user interface, and experimenting with an AI agent for tech reviews.
INTERVIEW
- Introduction to the guest, Sven Størmer Thaulow, EVP/Chief Data and Technology Officer at Schibsted.
- The work Sven and his team are doing at Schibsted.
- Background on Schibsted and its success in the news industry.
- Schibsted's successful subscriptions in Norway and their high-quality news products.
- The creation of an LLM in the Norwegian language and Schibsted's role in it.
- Different attitudes towards AI and language models in Norway and the US.
- Schibsted's AI strategy and their work with AI in the recommendation system space.
- The importance of data and language data for AI and language models.
- Efforts to build Norwegian and Northern Germanic language LLMs.
- The need to align AI models with cultural values.
- Potential for American media companies to collaborate on creating English language models.
- Sven's experiment with using their LLM to generate article titles at Schibsted.
- The use of AI to provide a different user interface and relationship with content.
- Experiment with an AI agent for tech reviews.
NEWS BITES
- Google's new Lumiere AI video generator can create stunning clips by modeling space and time together.
- Microsoft's changes to its AI text-to-image generation tool in response to reports of people using it to create nonconsensual sexual images of Taylor Swift.
- OpenAI and Common Sense Media have partnered to address concerns about the impacts of AI on children and teenagers.
- Keep It Shot is a tool that allows Mac users to rename, organize, and quickly find their screenshots using AI.
Hosted on Acast. See acast.com/privacy for more information.
Hello, everybody, and welcome to episode two of AI Inside, our new podcast on all things artificial intelligence. In this world, over time, we will talk about everything. We will leave no stone unturned.
I think it's going to take us a while to get there. Who is us? It's me, Jason Howell, joined as always by my friend, Jeff Jarvis. Good to see you, Jeff. Hey, boss, how are you? Good to see you. I'm doing well. It's great to see you too. I'm really happy.
Like, you know, episode one went off without a hitch. We've received a ton of really great feedback. Yep. Thank you all out there. Yeah, thank you to everyone who is really helping us kind of kick this show off just real quick related to that.
If you like this show, you know, we are in the early stages of this show. So spread the word. Give us a review on Apple Podcasts or wherever you can review your podcasts. It really does help a lot, especially in the beginning, as we're kind of building the momentum. And also, you can support us directly if you want to. Instead of just subscribing to the feed, go to patreon.com slash AI Inside show. We have some extra perks for people who want to contribute directly to us in the production of the show. And we've got some really great ideas for that.
So check that out. That's the housekeeping. We got it out of the way.
It didn't take very long. Why don't we get to our amazing guest this week? Sven Størmer Thaulow is the EVP and Chief Data and Technology Officer at Schibsted. And Sven is here with us to talk all about creating an LLM for media in Norway. Sven, it's really nice to meet you. Nice to have you on the show today.
Thank you. Really great to be here. Looking forward to this hour. Yeah, yeah, we've got some really great stuff to talk about. I mean, last week we had Rich Skrenta on from the Common Crawl Foundation. That's right.
Common Crawl Foundation. And we talked, you know, it kind of tied into Jeff's visit. Jeff went to DC a couple of weeks ago, and you were kind of part of that soupy mix as well. And Jeff, I got to say, you made a very strong case for Sven, because Sven and his team are doing some really cool work at Schibsted. Jeff, tell us a little bit about your interest in this story. You know, let me give you a little media background here. Schibsted is — and I do not exaggerate — the most admired news company, and the cause of the most jealousy, in the news industry in the world, because they've been successful and they figured out the internet better than any other company I know of online.
And they've done that in various ways. They have successful subscriptions in Norway, very high penetration, very high subscription rate versus other countries. They have very good news products.
They've built other news companies. So I watch Schibsted regularly. A lot of news people do, and they want to go visit Schibsted all the time. Finally, Schibsted said: enough with all of you visiting.
That's it. When I was in San Francisco for the World Economic Forum AI Governance Summit, I was standing at a cocktail party where I knew no one, talking to a very nice Norwegian executive. And we were talking about LLMs. And he said, oh, you know, Norway being Norway, all of our stuff is digitized, and we're building an LLM in the Norwegian language. And Schibsted is leading the way in getting other publishers to join in.
I said, aren't they all complaining like they are in America? He said no. So I looked it up, and I saw Sven's address to the Nordic Media Conference doing just that: encouraging publishers to join in, at least in the research phase, on building a Norwegian-language LLM. And it shows such a different attitude from what we see in the US, where publishers are trying to clamp down, expand copyright, and shrink fair use, and they're getting all hostile with each other. It's just different in Norway. So I wrote a post about this, and I ended it by saying: why can't we just be like Norway? And Jason, when you saw that, you said, can we get Sven on? Yeah. So Sven is here.
Thank goodness. We can talk about the LLM and Schibsted's AI strategy and more. If I got anything right or wrong, Sven, by all means say so. No, I think it was pretty, pretty accurate. Absolutely. So let's dig into it.
Yeah, let's dig into it. So first of all, when did you join the company? That was what, five years ago? And when you joined, was this a glimmer in your eye at that stage, or how did this all take place? If we look at the timeline of the five or so years since you've been there, how did this all progress?
How did it begin? So, I mean, Schibsted has been working with AI in the larger sense for about 10 years, right? Not on the generative side, naturally. But since we live off the written word, language models and the related technologies have always been a focus for us. I joined in about 2019, and quite soon, around 2021, we started a strategy project with the Future Today Institute — Amy Webb's team — trying to look a bit further out at the horizon of the major trends, using their proprietary way of doing foresight work, which was very helpful for us, actually. And we looked not only at technologies, but at a more generic level of trends — macroeconomic trends, politics, technology, and so on. And many of the trends — not just those pointing directly at AI, which we'd been working with for a long time, particularly in the recommendation-system space, but also consumer attitudes and other things we thought were important for our company — had AI as the underlying enabling technology. So we decided to double down and create our own central team for AI and future technologies, which we didn't really have at that time. We had actually shut it down, because we didn't really succeed; it was hard to get these experts to really interlock with our business.
But now we restarted it. And that was 2021, 2022. And then we started to fumble around with language models, and we went into research collaborations with universities — which, traditionally, Schibsted as a media company, even though Jeff says we really are at the forefront among media companies, had never really had. And we went very, very targeted into AI research centers that were funded by the Norwegian state. So that was kind of the beginning of it, I guess. And then — so the reason I called you was because of the effort to build the Norwegian and Northern Germanic language LLMs.
If you could explain, as you did to me before, that process — from the research phase to the commercial phase — and what you said to fellow publishers and how they've reacted. Yes. So there were two research labs, and one of them was in Trondheim, at our largest technical university, where the head of the research lab is Jon Atle Gulla, a professor in computer science and AI. But he's also into linguistics, which is quite normal within the generative AI space. When we started — and it's like an eight-year run to start a lab like that — I was the chair of this new lab, and I asked him, as the head of the lab: what kind of dent can we, small as we are, make in this large universe within that time? And he said, well, we need to make a Norwegian language model.
And I said, well, at least let's do that. I mean, we do a lot of other things — Norway is really into energy and windmills and, you know, petroleum and stuff like that as well in that AI lab. But he really went into getting that done. And of course, that was before OpenAI and ChatGPT and so on. And after a while, we understood that the data is really key for us, and the language data is really key.
And you start really figuring out: how do we get hold of that data, and how do we get the rights to use it? And one of the keys, in a small country like ours — it's a small language, it's not English — is to get hold of Norwegian data of high quality. And one of the sources is the media companies. So what we wanted to do first was to get the media companies to give us the rights to use the data for research purposes, to actually test our hypothesis that these models can perform better in our native language than the big frontier models. And that's what we did. And that's what you watched, Jeff, in Bergen.
I think that was last year, when we tried to really argue to the other media companies that this is important for our culture — because language is an important part of any culture, and in small countries in particular, it's super important. Secondly, it's super important to align these models according to our values.
And our values are different from other countries' values, at least in our region, the Nordics. You know, Jeff, you mentioned before we started that you knew the editor of Aftenposten back in the day — he was the guy who actually confronted Facebook when they deleted the image of the napalm girl. Do you recall that? Yes — Espen Egil Hansen put it up. It had been taken down from a writer's post, and then he put it up to kind of dare Facebook. And Facebook took it down, and he couldn't get anybody at Facebook to say: this is journalism, this is not pornography. Yeah, right.
He was a leader in the world in trying to change the relationship of news and Facebook. So in that respect, you can see the language models as the same thing, right? Who decides what kind of values the language model has? And when you think about language models as infrastructure — for example, in educational tooling and textbooks — how important is it really to align those models with the values of the country?
So that was another argument. And the third one, which is a bit more long-term — we don't really know — is: how important an infrastructure are LLMs going to be for countries? Let's say a country uses an LLM from somewhere that is not an ally, and it's instrumental to all services in society, even the services of the state. Is it OK that all these things run just in a cloud somewhere? Or are there particular use cases where you would like to have more control over that infrastructure? And I'm not talking about the massive frontier models.
I'm talking about more foundational models that are used for your own language and for certain purposes. And then, in general strokes, I think we all said: this is part of our responsibility as media companies in a small country like ours. We live off the written word. We generate a lot of content. And it's really among our responsibilities to make sure that our language thrives in the new digital space — which the generative space actually is. A big, long story.
But that's at least how we approached it at the time. So I've got tons of questions. But you also explained to me, when we talked before my testimony, that you're in the process of making a deal for the purposes of research — which is going to be separate from commercial use — and then probably revisiting the publishers, and revisiting the business model, at that time. One of the discussions we're having in the US is about the use of content for the purposes of training a model versus for the purposes of output.
And a lot of us see training as fair use because it is transformative — because it's just the right to read and learn and be taught. And no, I don't think the machine is anthropomorphic; I'm just using that in a general way. The company, or the institution, that runs the machine has that right. And that's different from the machine being asked to quote something, and quoting it, when it does not have the proper rights. So do you see that separation between training and output?
There are various words being used around that. Is the interpretation of your fellow publishers that the use of their content for this research is, in essence, use for training, and the commercial side and the output will come later? The two-step model we've been thinking about is that first we release the content, which we think is really our property, for trying to build this LLM and seeing how it performs in our language, for research purposes. If we prove that this copyrighted content is important for making good-quality LLMs in Norwegian, or in the Nordics for that sake, then we need to figure out the commercial terms.
So I think that's a two-step model, instead of going straight to the really difficult part, which is the commercial part. But I'd like to make a couple of notes on that. First of all, we believe, of course, that the content we create is our property, and it shouldn't be taken by anyone without asking us. However, I'm not really concerned about being paid for that historic data in large amounts.
We don't think that's a business model in itself. What we are more concerned about is having access to the results of the training on fair terms, particularly when we are contributing the data. So what we would be angry about is, let's say, if we were scraped — all our content taken, and probably quite a bit of it has been taken by the big frontier models — and then they come back and ask us to pay shitloads of dollars to use that language model.
That is just not fair. So there needs to be some kind of trade, where we contribute to something and we get something back. And we are also really into open source. I mean, we're small companies in the larger sense of the world, so we would like a very, very thriving open source community.
So we are more leaning towards open sourcing things, getting things back to us, instead of buying these on a proprietary basis. Then I would say that after we talked, Jeff, a lot has been happening in the Norwegian government, which is quite interesting. The Ministry of Culture gave a task to the National Library, which has digitized about 90% of all Norwegian content. That's quite impressive, right?
Just very impressive. So it's like the whole corpus of everything, right? And then, together with the research institution that I'm the chair of, and together with the University of Oslo, they will now train models with the publishers' content as well — the books of Norway that are copyrighted — to really see if those models perform way better when we add long text to the LLMs. And if that is the case, then we need to discuss with the state, the publishers, the media companies, whether those rights should be bought out, for example. So that's kind of the sentiment in Norwegian society.
And our government, I would say, is quite forward-leaning on this now. The question that comes to mind for me is about convincing fellow publishers of this — the challenges there. Obviously, I have a very US-centric view, and what we're dealing with right now are publications like The New York Times flipping their lid over this and saying: no, absolutely not, this is not OK. And cutting off the siphon, because they want to protect their work — and it is their work, so I can understand their desire to do that.
But how do you go about convincing publishers of this? Did you run into any interference or any hesitation around this, the way we've seen in so many other examples? Well, there is hesitation about commercial models. I don't think anyone has really figured out the compensation to the writers of the content.
And so that's still a bit unsolved. But I think the mentality is that none of our companies — the media companies in the Nordics — is large enough, and we're the largest one, to do this by ourselves. So we need to do it together. And secondly, we can just benefit from doing it together, because we need these foundational models to make better products for our users and to have better productivity in our companies.
So unless we get these trained models — which we can then further train with our, I would say, very important data that we don't share with anyone, for specific purposes in our companies — we are going to fall behind in this development. I think this is one of those big moments in the media industry. Unless we really embrace this, and think about how we get our jobs done for our customers in a different way with this technology, this might be the last battle of the media industry against the big tech giants.
So what would your advice be to American media companies, given what you've watched happening here, so different from what's happening under your leadership in Norway? I think they should think about collaborating on making language models — foundation language models, not the frontier models, right? — language models in English that they probably can build together, which can be a basis for their product development, without having to pay a lot to other companies. I think they shouldn't be too optimistic about how much that content is worth. I mean, The New York Times is a special case for making language models — but what is the willingness to pay for a delta in a corpus that's already enormous? It would have to have extremely high value.
And I'd be willing to go into reasonable trades, if you want to do that with commercial players in addition. You told me that you've experimented already with putting... A, you told me that the model you've built is performing better — so far you're proving the hypothesis. Tell me more about that. But also that at Schibsted, you did an experiment doing what I want to see American publishers do, which is to find new uses for generative AI. And one that you talked about was putting your LLM in front of Schibsted content, so that readers could enter into a different user interface and relationship with that content. Tell us about that.
Yeah, I love that. Yeah, I can tell you a little bit about what we're doing, right? So we're also training our own model. It's not based on the one we're building in Trondheim; it's based on one of the National Library's early models, which we are continuing to pre-train. And we tested it on, well, the simple use case a lot of media companies try, right?
I want to generate a title out of this article the journalist wrote, so the desk can get different proposals for headlines. And we tested it against the OpenAI version and a couple of other models. And the specific model we're talking about here just outperformed all the other models by far.
So it's a much smaller foundation model, but it's specific to the Norwegian language, and it performs way better. So now we have articles where the title is generated by an AI. There's a human actually approving it, right? But still, the desk gets four different proposals in their content management tool — these are the four proposals we have — and they just click on one and push it out there. So that's one of the cases we've done.
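The desk workflow Sven describes — a model proposing several candidate headlines, a human picking one — could be sketched roughly like this. The `propose_titles` function is a hypothetical stand-in for the real inference call to the Norwegian model; only the filtering and selection logic is meant literally.

```python
# Sketch of the desk-side headline workflow described above:
# a model proposes candidate titles, a human on the desk picks one.
# propose_titles is a stand-in; a real system would call an inference
# endpoint for the newsroom's own foundation model.

def propose_titles(article_text, n=4):
    """Hypothetical model call: return n candidate headlines."""
    # Stand-in logic: in production this would be an LLM request.
    first_sentence = article_text.strip().split(".")[0]
    return [f"{first_sentence} ({i + 1})" for i in range(n)]

def headline_options(article_text, n=4, max_len=80):
    """Filter model proposals down to what the CMS shows the desk:
    non-empty, unique, and short enough for the front page."""
    seen, options = set(), []
    for title in propose_titles(article_text, n):
        title = title.strip()
        if title and len(title) <= max_len and title not in seen:
            seen.add(title)
            options.append(title)
    return options

if __name__ == "__main__":
    article = ("Schibsted trains a Norwegian language model. "
               "It outperforms larger general models on local tasks.")
    for i, title in enumerate(headline_options(article), 1):
        print(f"{i}. {title}")
```

The point of the sketch is the human-in-the-loop shape: the model only ever produces options, and publishing stays a manual click in the CMS.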
Then the second question you asked me was about that agent, wasn't it, Jeff? Regarding tech reviews and stuff like that? Yes, exactly. So what we really want to do is experiment with how users can... We have a very strong destination company, both in our marketplaces business and in our media business. People come to our sites.
They don't go through Facebook or Google to find our news or to find items on our classifieds sites. But we would like to experiment with how users can interact with us in a different way in this conversational interface. So what we did was say: OK, let's just try it out. Make an agent on top of a TechCrunch kind of publication with lots of reviews of consumer electronics and stuff like that. And make a bot where you can say: hey, I've got this small living room, it's like four meters from the sofa to the TV, it's a bit bright, I don't have more than $1,200, but I'd like to buy a TV. Could you recommend one?
And then it pops back: yeah, based on our tests and what you wrote just now, we would recommend you buy this and this TV. We just shipped it out there. It was hacked together by some of our own people within a couple of hours. They tested it back and forth, ran it for two or three hours, and then took it down.
And then they learned shitloads, and they're probably going to launch stuff again. So that's just the way we try to operate: experiment, and figure out how we can do this in the best possible way. Yeah, it seems to me... No, I was just going to say: as I'm hearing you talk about the benefits for a publication or media company of creating an LLM around their own content — take The New York Times — I could see The New York Times being open to something like that. But in a very controlled environment, a very controlled situation. Taking all their content and saying: this is our value-add. This is how we allow people who are fans of our content, who are dedicated to The New York Times, to lean into their trust of what we've created, while we feel like we have some control and can add value in new and different directions. But I guess that goes counter to this idea of all the different publications working together on one giant pile of collaborative data. Isn't that kind of the input and output, Jason? With the training of the model, everyone shares in the benefit. Yeah, it should be.
But if you want to serve your content to your readers, then you have the rights to do that as a separate operation. Yeah. Is that kind of where you're headed? Yeah, but is it all content, or is it some content? Right? I mean, for a news company, there's lots of content — breaking news — that just has value for hours, right? And then it's off.
It doesn't have any more value. So you can take historical data that doesn't really have much value anymore and get really good output by using it to make a foundation model, which is the basis on which you build those specialized LLMs for your purpose, right? But let's take — I'm not The New York Times, but they have lots of food recipes, right? That's data that holds its value for a very long time. So it's not a given that they want to give that away, right?
That's fair. I mean, I think they'd probably be a bit more nuanced. They could take base models, like we do now with Mistral — a French company that's building base models it has open sourced — and then train on top of that, to make the model better at the Norwegian language, particularly with data from Schibsted.
So that's the approach The New York Times should probably look into, because they need those kinds of language models, that's for sure. They can't base themselves on OpenAI or any of those. It's just too costly.
I'd like to brainstorm a few possible uses journalism should be making of this. You've already gone through a couple. I have tons. I have tons.
Let me mention two, and then I'd love to hear yours. One is whether the news industry, besides building an LLM, should build an API to its news — so that models that are out of date, as they are, can call on current content: you have a key, you have a business deal, and you make it more of a service to the AI industry rather than being hostile to that industry. Let's create a service for them. That's one, and Aimee Rinehart, who is at the Associated Press, is working on building that API for the industry.
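Jeff's API idea — keyed, licensed access to current articles as a service for AI companies whose models are out of date — reduces, in miniature, to a keyed lookup over recent stories. Everything here is invented for illustration (the key, the articles, the function names); a real service would add HTTP plumbing, billing, and licensing terms.

```python
from datetime import datetime, timedelta, timezone

# Miniature sketch of a keyed news API for AI clients: a valid key
# gets recent articles; anything else is refused. Keys, articles, and
# names are invented; a real service adds HTTP, billing, licensing.

API_KEYS = {"demo-key-123"}  # hypothetical issued keys

ARTICLES = [
    {"title": "Election results certified",
     "published": datetime.now(timezone.utc)},
    {"title": "Archive piece from 2019",
     "published": datetime(2019, 1, 1, tzinfo=timezone.utc)},
]

def recent_articles(api_key, max_age_days=7):
    """Return recent articles for a licensed key, or raise."""
    if api_key not in API_KEYS:
        raise PermissionError("unknown or unlicensed API key")
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [a for a in ARTICLES if a["published"] >= cutoff]

if __name__ == "__main__":
    for article in recent_articles("demo-key-123"):
        print(article["title"])
```

The design point is that the business deal lives in the key: current content becomes a metered service to model providers rather than something scraped for free.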
She's doing that in the executive program I taught in. I also think there are other possible interfaces between AI companies and news, but I'll throw that one out first. What do you think of that? I think it's quite an interesting idea, because it's kind of an open-innovation idea, where you say: OK, let's be honest, folks. Since the internet came, what we did was connect the paper newspaper to electricity. That's more or less what we've done.
The artifact, the article, is still there. There are barely any companies with recommendation systems that are 10% of what Netflix can do. That's the kind of innovation that's been done.
Let's be a bit blunt about this. To challenge ourselves and say: OK, we can't do all this innovation ourselves — let's go into an open-innovation mode. I like that idea.
I think it's quite interesting. You have to have some kind of payment mechanism. Yes. And you need to be a bit careful not to get pushed too far down the value chain, right? Because the real holy grail of the news business is having the end user there. But you might not get innovation if you just stick with that all the time. The other idea I've had is for newsroom use. I talked to an editor of a not-for-profit newsroom in a large US state and said: imagine if you had your readers go out and record 100 school board meetings, and you come back and the machine can now very easily transcribe all of that.
Then you could query that data. You could say: what are school boards across America looking at right now? Are they banning books, or worrying about restrooms? I saw today that one school took all the mirrors out of the school because kids were using the mirrors to make TikToks. What are they talking about at the school boards? You could do something that no single journalist or newsroom could have done before, but now could do in collaboration with the public as gatherers of data.
What does that sound like? I think it's super interesting. It's like citizen journalism, in a way; I think we've tried that out in different companies in Norway before. But the really big game changer is the ability to crunch all of that data to find the anomalies. The investigative journalist — the really, really good journalism — where our journalists spend a whole year gathering data from different municipalities or from courts or whatever it is, to figure out what's going on. They get that data, and then they have to spend loads of time, and data scientists, to find those anomalies — to say: in this municipality and this municipality, they're doing this, this and that.
That's the story. Now you can probably just parse those documents or spreadsheets or videos or whatever it is to find those things with these tools. I'm really looking forward to that. One of the jobs to be done that we have is to hold the ones with power accountable for what they do, and that's going to be way easier with these tools in the hands of journalists. So people abusing their power: be aware. We had a minister who had to leave her office last week. The reason was that one guy figured out — she was the minister of education, by the way — that the state had taken a student to the Supreme Court for copying some stuff in their master's thesis, being super strict with the student. The student had lost, and the case went all the way to the Supreme Court. This guy said: well, let me look at the master's theses of these ministers. He took her thesis and put it into one of the AI tools they use in schools to find out whether students have copied stuff. She was busted. She left one hour afterwards. It's all on Twitter: this is her copied stuff, flagged as plagiarism.
She's out. What are some of the other uses you're making of AI, the general technology, at Schibsted? I'm not going to use all our time to talk about the usual things we do — transcription and stuff. What I think we really need to do is think about how we can move away from just putting electricity on an old artifact.
How can we serve people, in the job to be done that we are there to solve for them, in a different way with AI? Let's take one idea that we are funding — or that we're just talking about: are we going to make, I would say, an infinite article? Just a live thing that evolves around the story, almost like it is on TV, right? Something that's continuously describing the domain, not "this is an updated article."
That's the way we do it right now. Instead, you can zoom out of the article to a domain perspective and look at the story in a bigger view. And you can zoom in and understand one of the sections in more detail.
Practically, you can see that this becomes like a cognitive map of a whole domain. Can you do that? It's fully possible to do, but you really need these tools to be able to do it.
With that, you can practically reach the holy grail of getting the job done. You, Jeff, and you, Jason — you have different opinions on what is good for you. We've known that since the internet came along, but has the news industry done anything about it?
No. We don't have enough journalists, so we can't produce the content. We're crap at using technology for recommendations, and we're also a bit careful because we don't want to make echo chambers. This is really where we need to skate — that's where the puck is going — and we just need to embrace it and try it out.
I love that too. What does that look like in practical terms? Maybe this is the big challenge: okay, we know this would be an amazing use of the information, to broaden out what we know through journalism and how we stay up to date on a topic or a current news item that's building.
The previous paradigm was to tack on new information as we got it. Now this is a living, breathing document. But what is the interface, the approach, through which a user might interact with it? Are they chatting with the LLM about this particular thing, or is the LLM creating that living environment around the story?
I'm just curious if you have ideas around that. I think the LLM, generative AI, can synthesize, truly produce, content from journalism in this kind of live stream, a live medium of something. I think that's fully possible to do. In my mind, it's almost like mind maps. I love mind maps. You guys use mind maps?
Yeah, right? So you can view them in a 3D way: you have this 2D thing, you can click into something, and then you go into some deeper understanding of it. Whether that's conveyed in text or in images or video or audio, for that sake, I think that will just develop. That's multimedia, in a way. But I don't have the view that, you know, this is the interface.
I don't know; I'm not the UXer. Yeah, yeah. I like that a lot, Sven. I was part of a startup years ago called Daylife.
Rest in peace, it's gone. But we speculated about how to present news in different ways. One was to get our heads around the article versus the story. The story is something that goes on and on and on, and our articles are just snapshots in it, right?
Which our production mechanisms made us do. But if you stand back and say, this is the larger story of book banning and censorship in schools, then you have the opportunity, you're right, I think, to present this in a mind-map way. But you also have the opportunity to take large amounts of data and let the machine summarize it, find themes in it, and allow the user to query it. So a reporter might come back in the future and not just write a story, but put in all the transcripts of all the interviews and all of the documents, and even let the public query this with their questions and what they want to know from it.
If you still write a story, it might come later in the process. I think it's a perfect way of doing it. This is kind of the agent model, right?
I mean, you can have an article, and then you can say, this is part of a story, and you want to talk to the story. Yes, exactly. Or do you want to talk to the archives of Schibsted about this topic?
Absolutely. You want to ask what happened 20 years ago and where patterns are. You have the opportunity to get across this in a way that goes far beyond search.
Let me ask another question. You said earlier that if we don't get our act together, this could be the last battle. So for a moment, be dystopian and pessimistic. And if we don't do what we're supposed to do in the news industry, if we don't follow your good leadership standard, what happens?
Well, I think we are going to be disintermediated. I think people in general are lazy in terms of what level of friction they're willing to go through to get information, right? We started off as paper newspapers, trying to catch different kinds of stories, synthesize them, and deliver them to the masses.
Then we tried to do it a bit more individually, still on the internet, but the way we've done it is electricity on paper. Now it's going to be even easier for people to access this. And I personally am a big fan of audio interfaces. I think they're going to be extremely powerful. I think we're a bit disappointed based on Siri and its like, but if you start to combine the gen AI tools that are out there now with audio, it's going to be super powerful.
You're going to talk to computers in natural language; you're going to have a conversation about things. And I think we just have to throw away this, oh, it's hallucinating, it's not working. Well, it's going to work. So if media companies aren't able to be part of that experience that users are expecting, lowering the friction in access to information that's relevant for you, then we are going to be in trouble. I don't know how it's going to play out, but the most important part is that a lot of the newspapers of the world on the internet are still very, very advertising-based.
Right. We have a quite healthy subscription business, but it's tough even in the Nordics, where the willingness to pay for newspapers is probably record high in the world. If you lose eyeballs, and people start accessing this information in a different way, it doesn't take long before margin goes from 10 to 12 percent at best, and then to minus 10.
And then it's trouble. The picture you're painting is also one of using AI as a relevance machine, to make this mass we created for the masses more relevant. Jason, you've got another question?
I do, sort of on topic. Along the lines of what you're talking about, setting out this dystopian view of where things could possibly go, what would you identify as probably the biggest challenge we face in avoiding that dystopia? I think it is the lack of willingness to invest and take risk.
Yeah, yeah, risk. I think, you know, the media industry has shown itself to be a slow-moving industry ever since the transition of the early 2000s. It has a couple of lighthouses around the world that are trying to do things differently, but it's quite conservative. It's also protective, and in some countries, like the US, very threatened, right? They probably feel this even more than we do in some of the Nordic countries. So if they approach this in the way we've seen from some of our peers in Europe as well, like, this is bad, it's going to kill us, we don't want them to take our content, we don't really want to be part of this,
if you stick your head into the sand and wish it away, that's the worst enemy. That's the worst enemy. So let me ask a related question. So I lied; I probably have two questions.
I always do. I'm working on a new degree program that I hope to start at the university, on what I call, for now, internet studies or internet humanities: bringing other disciplines into education for tomorrow's leaders of the internet and society. So I wonder, looking at Schibsted, besides hiring technologists, besides hiring people who are going to do coding, if we even need coders in the future, one could argue: what are the larger skills in a company like Schibsted, in the newsroom but also in sales and management and marketing and so on? Are there different skills you think we should be training students for, and that you'll be looking for, in the near future? They just need to understand how to use the tools. I think that's the most important. I mean, the statement is not mine:
you're not replaced by AI, but by somebody using AI. What we do is just be very mindful of this. It's not only about journalists or tech people; it's everyone, and we're training them now. I think three days ago we had employee number 1,000, out of our 6,000 people, go through very thorough courses in how to use AI and prompt, whether you're a developer or a journalist or in sales or whatever you are.
People go through different kinds of tracks. So I think that's really the essence. I'd add, my daughter is studying computer science up in NT, and I said, you know, the only thing you need to do is use whatever AI tool you find. Just use it, because it's like when the web came along. It's the same thing happening, and you just need to test out everything and learn how to use those tools.
So I think that's the most important, Jeff. It's not that you need to train them so much as you need to be very open that these tools are here to stay. It's like if you're using a manual typewriter and then a computer comes along,
and you say, I don't want to use a computer. Yeah. So, I see the guitar behind you.
Jason is a musician. So I'm curious what you see happening with AI in areas besides journalism, let's say in media and entertainment. Are you seeing development in the Nordics there in interesting ways? Well, the Nordics are very different kinds of societies in many ways.
In Norway, you know, it's a lot about B2B business and industries. And particularly in the heavy industries, they haven't moved a lot; they're working more with predictive analytics and AI in that way. But on gen AI, I haven't seen that much in the entertainment space. I guess the musicians are probably going to use it quite a bit.
Whether they are going to be threatened, I'm not convinced yet, to be honest, even though I've seen lots of examples of good songs being made by these different services. I've made a couple myself, which are quite impressive, but you can't ship them and make money on them.
Can you? No, no, doesn't seem that way. It's really interesting and very timely, because last weekend I was in Anaheim, down in Southern California, for a music merchants convention called NAMM, probably one of the biggest ones. I went there wondering if I was going to see a lot about artificial intelligence, just based on the moment we're at, and hearing about people getting up in arms about copyright around music, cloning and that sort of stuff with artificial intelligence. And I didn't see much while I was there. But what I did pick up is that AI was a terminology used almost in whispered terms, like, well, they used AI.
It was almost like it was a bad word, and no one wanted to admit they were using it, because they didn't want any sort of scrutiny around what that actually means. So I'm looking forward to South by Southwest this year, to see what's happening there. Gen AI is probably going to be all over it. I think you're right. Absolutely. That's probably the better place for that combination of technology and music, whereas where I went felt very much like the traditional world of making music.
There was a lot of technology, but anyway, I didn't see nearly as much around AI as I had hoped. We do have some news items to get to, but I don't want to end the interview until we're actually at the end of it. Jeff, do you have any final burning questions? No, I just think, when there's more time, I would love to hear more about the training, but we don't have time for that. Well, just one quick question: what's the essence of the training that you're putting people through, journalists or not? Well, for the tech people, it's really about how they use AI and build AI tooling, right? That's the key thing.
For product people, it's what problems can you solve with AI, and which ones should you not try to solve with AI, because sometimes you try to solve something with AI just for the heck of it, and that's not what we want to happen. Then we have a more general course that explains what it is and what tools we have, and now we're making a prompting course. We're going to drive that through all our employees, and then we're going to make it more specific for certain disciplines. We haven't really chosen which ones yet, but most likely sales should have one, journalists should have another, and so on.
So, more like a generic how-do-you-use-these-tools, and then branch out into different disciplines, in a way. Real quick: I'm just taken by how empowered I feel hearing you talk about your and your company's approach to this. It's a very empowering position. A lot of people come to this from a point of fear, of retraction: pull back, hold things close so that things don't change. This entire interview has been an example of what you can get if you open up to this inevitable reality, which is that AI is here, and these are tools to be used. If you open yourself up to them, you can create some really amazing things. I'm really inspired by your approach. Good to hear.
Good to hear. That's what Schibsted has been doing for 183 years: empowering people in their daily lives, and embracing new technologies, from the printing press, practically. So that's what we do. I'm going to make this conversation required listening for various of my people in the business. Thank you so much for doing this. I'm grateful. Should we do some news, Jason?
Yeah. And Sven did say he'd stick around a little bit, because we just have a couple of news items. We have maybe 10 more minutes in the show before we have to end, but a couple of news items caught our attention this week. Mine: I continue to be fascinated by things like video generation, creativity morphing and crossing that Venn diagram with artificial intelligence. Google had an announcement about Lumiere, their AI video generator, which does things a little differently from other video generators, from my understanding. If you've ever used a video generation AI, you often see that, frame to frame, certain things aren't constant; the frames change and evolve, but so do certain realities about what you're seeing. If you take the first frame and then go to the last frame, what used to be a forehead may have evolved into a wispy hairdo, or something along those lines.
Lumiere essentially does this differently: it handles motion and location in tandem, so with space and time together it analyzes where things should be placed and also how they should move, and it does that in a single run-through process. That allows for a smoother motion output, and things stay constant for the most part. It also does inpainting, so you can do the silly things like, hey, change that shirt from a t-shirt to a button-up, or whatever the case may be. There are other competitors out there, like Runway and Pika Labs, which I've had some interaction with and which are doing really cool things. This isn't a product you can use right now; it's a research project. So either we're going to get the ability to use it somewhere down the line, or Google's going to do what they often do, which is integrate it into some other part of their Googleverse. That is, before they get bored of it and decide to kill it.
That's true. I found this interesting, Jason. Gary Marcus, who writes a lot about AI, put up a post today about the mistakes AI makes, because it knows the close relationship of one pixel to the next, or one word to the next, but not the larger picture playing out. He put up an AI-generated photo of a man hugging a unicorn, and the unicorn's horn goes right through his head, apparently with no harm and no blood. It just shows that it doesn't understand that larger context yet, which is kind of fascinating.
Oh man. I don't know what the point would be, but it looks like it's making some sort of a point in and of itself. It's kind of beautiful to look at, even though it's totally wrong. So if Google can find coherence within a video, that's fascinating. Sven, do you see much use for video at Schibsted? Not at this point in time. Not at this point in time.
No. So, another story that I found interesting, and I'm eager to hear Sven's view on this one: you probably all saw that the videos were put up, unfortunately, of Taylor Swift, who's the subject of every kind of bad social thing. On Twitter, they didn't know what to do, so they just stopped all searches for Taylor Swift, which is kind of a blunt way to deal with moderation.
And Microsoft is making changes to its tool so that it can't be used to do this. I've been arguing for some time that people who believe we can build guardrails on foundation models to prevent them from doing bad things, that's as foolish as saying Gutenberg could have made the press so that it couldn't print certain things. It's up to the humans who use it, who can get around any rule and will use it however they choose. And they'll find bad ways to use it, especially if we put up guardrails saying, oh, we've got you, you can't do anything bad here; people are going to do it.
Kevin Roose convinced ChatGPT to fall in love with him and to give him dark visions of life. I think we're going to see a difficult moment where people realize that this is a general-use tool, like a printing press, like a camera, and it can capture and create bad things and good things. And no, we can't make the company anticipate every possible bad use someone could imagine and build protection against it. I think you're perfectly right.
I think you just have to see this as any other tool. I think, though, that the large services, the ones relating to hundreds of millions of consumers, most likely have to implement guardrails, because they just don't want this content to be created. But to have that as a regulation is super, super difficult.
It's not realistic at all. To take one example of what we've done: we were just flooded with all kinds of fake photos, particularly from Ukraine. So we gathered, collaborating between the different media companies, and made a group of people that fact-check all these images and videos for the different media companies, rather than each doing it ourselves, just because it's so damn difficult. And you can't expect OpenAI or any of these to make systems that say, OK, we're going to put a watermark on everything that's generated from our tools. I don't believe it. There's just going to be too much open source out there.
Yeah, I think so. And that's one of the arguments people make against open source: oh my God, they can get around the guardrails then. But that just entrenches the big old companies. You wouldn't be able to use Mistral if open source were outlawed in Europe, as some regulators were discussing.
So I think, yeah, that's a fool's errand. Jason, you have one more, and then I know Sven had a story he mentioned before, so I want to get to that as well. Yeah, well, this one is essentially OpenAI and Common Sense Media. As a parent of two young daughters, I've used Common Sense Media over the years so many times to check up on a movie and see, are there moments in this movie that might be inappropriate to show my kids. It was, and still is, a really nice tool. OpenAI and Common Sense Media have entered into a partnership aimed at helping teens avoid misuse and harm in AI tools. This is part of something Common Sense Media had already been doing.
They've already been reviewing AI assistants in recent months, but part of this agreement apparently has something to do with OpenAI's GPTs, and potentially Common Sense Media creating family-friendly GPTs, which I think is interesting. I have no idea what that even means or what it turns into. But the reason I wanted to put this in here is because I was in my 10-year-old daughter's classroom yesterday, helping the teacher do some work. While I was there, the art teacher was up at the head of the class, going over their artwork, and one kid shot his hand up and said, can I use AI for this assignment? At 10 years old, they're realizing, and they're connecting the dots. What was the teacher's answer? Well, her answer was no, you can't. And I'm sure she would be able to detect the work of a 10-year-old versus the work of an AI, but it's just really interesting, this moment that we're in, especially around education.
I know that's a topic I would love to get into on a future episode. Yes. Yeah. So Sven, you mentioned something that happened that I didn't even know about before we got on the air, so to speak. Yeah, there was a colleague of mine, one of our brainy guys on AI, Iron Vendement, who just made the note to me that ChatGPT now has this new function where you can pull any GPT into a conversation just by using the @ name of the GPT, right? And he had a very interesting reflection: how long is it before we make a mini-me out of yourself, with all the content that you produce, that can be pulled into any dialogue? Or you can even have multiple individuals that are actually GPTs talking to each other.
So where is this going to end? It may do a good job. Yeah, I have a book coming out in the fall about the internet; it takes them forever, but the copyedit is done. It's called The Web We Weave: Why We Must Reclaim the Internet from Moguls, Misanthropes and Moral Panic. The title's almost as long as the book. I took the entire manuscript, put it into Google's NotebookLM, and asked it to summarize it. In seconds, it did, and it did a good job, which might mean that I wrote a simplistic book.
But it was pretty awe-inspiring. And I've asked LLMs, you know, the obvious questions about me, and they get things wrong; we know that. But in summarizing content, they're pretty amazing at being able to grab onto the essence of a speaker and mimic them. Yeah, that could be. Yeah, mini Jeff, the Jeff GPT. More people want to shut me up than have me speak more. Yeah, Jeff GPT with a mute button. That's the premium version.
Yes, you've got to pay extra for that. All right. Well, this has been so great. Oh, and you had one last one.
Did you want to mention that one? I saw the first really useful GPT application. I haven't used it.
But it was a great idea: a GPT that renames your screenshots, called Keep It Shot, on my Mac. When I take screenshots for the shows, there are just hundreds of them, and I have no idea what they are. I can't find them; I can't get thumbnails. This is just the simplest little thing that's useful.
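As a rough sketch of how a tool like that might work (this is not Keep It Shot's actual code; the function names are hypothetical, and the vision-model call is stubbed out): ask a model to describe each image, turn the description into a safe filename, and rename the file.

```python
import re
from pathlib import Path

def describe_image(path: Path) -> str:
    """Placeholder for the AI step: a real tool would send the image
    to a vision model and get back a short description."""
    return "Invoice from ACME, March 2024!"

def slugify(text: str, max_len: int = 60) -> str:
    """Turn a free-text description into a safe, searchable filename stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    return slug[:max_len] or "screenshot"

def descriptive_name(path: Path) -> Path:
    """Compute a new, descriptive path for a screenshot
    (same folder, same file extension)."""
    return path.with_name(slugify(describe_image(path)) + path.suffix)

# With the stubbed description above, "Screenshot 2024-01-31.png"
# would become "invoice-from-acme-march-2024.png".
print(descriptive_name(Path("Screenshot 2024-01-31.png")))
```

An actual renamer would then call `path.rename(descriptive_name(path))` over a screenshots folder; the slugifying step is what makes the results findable in Finder or Spotlight.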
So I just wanted to mention it. Yeah, we do a lot of those as well. We just added, you know, all the privacy documents we have that people have to relate to; it's super complex. It's called Privacy GPT, and anyone in the company can just ask it. Or we put the AI Act into a GPT, and you can just ask the AI Act about stuff. All these kinds of things that you never access, right? Just put them into a GPT and open it up to people. That makes me think about terms of service: everybody is presented with them in a million different directions, in all the things they use, and most people rarely, if ever, read a single lick of it, partially because it's so darn long and exhausting, and you've got to comprehend all that stuff.
So it's not always that easy. A custom GPT might actually be a service that a particular organization or site could offer, to give you as a user a direct way to check on certain things. A terms-and-conditions GPT: am I being screwed? Yeah, yeah, right.
The GPT that looks out for you. Oh yeah, sure. Yes. Well, Sven, this has been such an honor and a privilege, to have you on for the last hour, and as late as it is where you are right now, I think it's 11 pm, so I apologize for that. We really appreciate you joining us. I think this was a really inspiring conversation. It shows what can happen if we don't turn toward the fear, uncertainty, and doubt, the FUD that's out there around this stuff, and instead look at it as an opportunity to maybe create something better for everyone. That's something to be totally respected, and I appreciate that you guys are doing that awesome work.
Thank you so much for being here. Yeah, absolutely. And then it's Schibsted.com. Anything else you want to point people to this is kind of your opportunity to direct them anywhere you want them to go.
Yeah, you can just write me Sven.thaulow@Schibsted.com. Simple as that. Right on. Well, Sven, it's been an honor. Thank you so much. Thank you so much, Sven.
Yeah, for joining us. And Jeff, what do you want to plug as we wrap things out? Right now, just the usual: Gutenbergparenthesis.com. There are discount codes, which weren't working but are working again, for my books, The Gutenberg Parenthesis and Magazine.
Excellent: Gutenbergparenthesis.com. As for me, aiinside.show for this show, of course, but yellowgoldstudios.com will take you to, at least, the YouTube channel we have right now for this particular show and other things that I'm working on. Ultimately, I'm going to have a website up, but I'm not quite there.
I'm trying to do all the things myself right now and let me tell you it is not easy, but I'm doing my best. This show, AI Inside normally publishes every Wednesday and it will again this week, but we recorded it a little early. So sorry to throw you off. Sometimes we have to do that. We have to time shift. Normally, we record every Wednesday at 11 am Pacific 2pm Eastern. Like I said, if you go to YouTube or Twitch, you can search for yellowgoldstudios and you will find this show as one of the shows on the network. Subscribe by going to aiinside.show. You can support us directly by going to our Patreon at patreon.com/aiinsideshow and thank you so much to those who are patrons of this show supporting us directly. We've heard from so many of you and it's just really cool to know that we have you in our corner.
Got our back, and we really appreciate it. Look for us on all the major socials; searching our names or AI Inside Show will get you to us as well. And if you have any feedback, thoughts about any of these stories, ideas for a future episode, interview possibilities, any of that stuff: contact@aiinside.show. Thank you so much for watching and listening, everybody.
We will see you next week on another episode of AI Inside. Bye everybody.