
80,000 Hours Podcast

Rob, Luisa, and the 80000 Hours team

Available episodes

5 of 307
  • Inside the Mind of a Scheming AI — Marius Hobbhahn (CEO of Apollo Research)
    We often worry about AI models “hallucinating” or making honest mistakes. But what happens when a model knows the truth, but decides to deceive you anyway to achieve a goal of its own? This isn’t sci-fi — it’s happening regularly in deployment today. Marius Hobbhahn, CEO of the world’s top research organisation focused on AI deception (Apollo Research), has been collaborating with OpenAI to figure out what causes OpenAI’s reasoning models to “scheme” against users.

    Links to learn more, video, and full transcript: https://80k.info/mh25

    In a striking case uncovered by Apollo, when many AI models were told they would have capabilities removed if they performed too well on a test, they successfully “sandbagged” — intentionally answering questions incorrectly to appear less capable than they were, while also being careful not to perform so poorly it would arouse suspicion.

    These models had somehow developed a preference to preserve their own capabilities, despite never being trained in that goal or assigned a task that called for it.

    This doesn’t cause significant risk now, but as AI models become more general, superhuman in more areas, and are given more decision-making power, it could become outright dangerous.

    In today’s episode, Marius details his recent collaboration with OpenAI to train o3 to follow principles like “never lie,” even when placed in “high-pressure” situations where lying would otherwise make sense.

    The good news: They reduced “covert rule violations” (scheming) by about 97%.

    The bad news: In the remaining 3% of cases, the models sometimes became more sophisticated — making up new principles to justify their lying, or realising they were in a test environment and deciding to play along until the coast was clear.

    Marius argues that while we can patch specific behaviours, we might be entering a “cat-and-mouse game” where models are becoming more situationally aware — that is, aware of when they’re being evaluated — faster than we are getting better at testing.

    Even if models can’t tell they’re being tested, they can produce hundreds of pages of reasoning before giving answers and include strange internal dialects humans can’t make sense of, making it much harder to tell whether models are scheming or to train them to stop.

    Marius and host Rob Wiblin discuss:
      • Why models pretending to be dumb is a rational survival strategy
      • The Replit AI agent that deleted a production database and then lied about it
      • Why rewarding AIs for achieving outcomes might lead to them becoming better liars
      • The weird new language models are using in their internal chain-of-thought

    This episode was recorded on September 19, 2025.

    Chapters:
      • Cold open (00:00:00)
      • Who’s Marius Hobbhahn? (00:01:20)
      • Top three examples of scheming and deception (00:02:11)
      • Scheming is a natural path for AI models (and people) (00:15:56)
      • How enthusiastic to lie are the models? (00:28:18)
      • Does eliminating deception fix our fears about rogue AI? (00:35:04)
      • Apollo’s collaboration with OpenAI to stop o3 lying (00:38:24)
      • They reduced lying a lot, but the problem is mostly unsolved (00:52:07)
      • Detecting situational awareness with thought injections (01:02:18)
      • Chains of thought becoming less human understandable (01:16:09)
      • Why can’t we use LLMs to make realistic test environments? (01:28:06)
      • Is the window to address scheming closing? (01:33:58)
      • Would anything still work with superintelligent systems? (01:45:48)
      • Companies’ incentives and most promising regulation options (01:54:56)
      • 'Internal deployment' is a core risk we mostly ignore (02:09:19)
      • Catastrophe through chaos (02:28:10)
      • Careers in AI scheming research (02:43:21)
      • Marius's key takeaways for listeners (03:01:48)

    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Music: CORBIT
    Camera operator: Mateo Villanueva Brandt
    Coordination, transcripts, and web: Katy Moore
    --------  
    3:03:18
  • Rob & Luisa chat kids, the 2016 fertility crash, and how the 50s invented parenting that makes us miserable
    Global fertility rates aren’t just falling: the rate of decline is accelerating. From 2006 to 2016, fertility dropped gradually, but since 2016 the rate of decline has increased 4.5-fold. In many wealthy countries, fertility is now below 1.5. While we don’t notice it yet, in time that will mean the population halves every 60 years.

    Rob Wiblin is already a parent and Luisa Rodriguez is about to be, which prompted the two hosts of the show to get together to chat about all things parenting — including why it is that far fewer people want to join them in raising kids than in the past.

    Links to learn more, video, and full transcript: https://80k.info/lrrw

    While “kids are too expensive” is the most common explanation, Rob argues that money can’t be the main driver of the change: richer people don’t have many more children now, and we see fertility rates crashing even in countries where people are getting much richer.

    Instead, Rob points to a massive rise in the opportunity cost of time, increasing expectations parents have of themselves, and a global collapse in socialising and coupling up. In the EU, the rate of people aged 25–35 in relationships has dropped by 20% since 1990, which he thinks will “mechanically reduce the number of children.” The overall picture is a big shift in priorities: in the US in 1993, 61% of young people said parenting was an important part of a flourishing life for them, vs just 26% today.

    That leads Rob and Luisa to discuss what they might do to make the burden of parenting more manageable and attractive to people, including themselves.

    In this atypical episode, we take a break from the usual heavy topics to discuss the personal side of bringing new humans into the world, including:
      • Rob’s updated list of suggested purchases for new parents
      • How parents could try to feel comfortable doing less
      • How beliefs about childhood play have changed so radically
      • What matters and doesn’t in childhood safety
      • Why the decline in fertility might be impractical to reverse
      • Whether we should care about a population crash in a world of AI automation

    This episode was recorded on September 12, 2025.

    Chapters:
      • Cold open (00:00:00)
      • We're hiring (00:01:26)
      • Why did Luisa decide to have kids? (00:02:10)
      • Ups and downs of pregnancy (00:04:15)
      • Rob’s experience for the first couple years of parenthood (00:09:39)
      • Fertility rates are massively declining (00:21:25)
      • Why do fewer people want children? (00:29:20)
      • Is parenting way harder now than it used to be? (00:38:56)
      • Feeling guilty for not playing enough with our kids (00:48:07)
      • Options for increasing fertility rates globally (01:00:03)
      • Rob’s transition back to work after parental leave (01:12:07)
      • AI and parenting (01:29:22)
      • Screen time (01:42:49)
      • Ways to screw up your kids (01:47:40)
      • Highs and lows of parenting (01:49:55)
      • Recommendations for babies or young kids (01:51:37)

    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Music: CORBIT
    Camera operator: Jeremy Chevillotte
    Coordination, transcripts, and web: Katy Moore
    --------  
    1:59:09
  • #228 – Eileen Yam on how we're completely out of touch with what the public thinks about AI
    If you work in AI, you probably think it’s going to boost productivity, create wealth, advance science, and improve your life. If you’re a member of the American public, you probably strongly disagree.

    In three major reports released over the last year, the Pew Research Center surveyed over 5,000 US adults and 1,000 AI experts. They found that the general public holds many beliefs about AI that are virtually nonexistent in Silicon Valley, and that the tech industry’s pitch about the likely benefits of their work has thus far failed to convince many people at all. AI is, in fact, a rare topic that mostly unites Americans — regardless of politics, race, age, or gender.

    Links to learn more, video, and full transcript: https://80k.info/ey

    Today’s guest, Eileen Yam, director of science and society research at Pew, walks us through some of the eye-watering gaps in perception:
      • Jobs: 73% of AI experts see a positive impact on how people do their jobs. Only 23% of the public agrees.
      • Productivity: 74% of experts say AI is very likely to make humans more productive. Just 17% of the public agrees.
      • Personal benefit: 76% of experts expect AI to benefit them personally. Only 24% of the public expects the same (while 43% expect it to harm them).
      • Happiness: 22% of experts think AI is very likely to make humans happier, which is already surprisingly low — but a mere 6% of the public expects the same.

    For the experts building these systems, the vision is one of human empowerment and efficiency. But outside the Silicon Valley bubble, the mood is more one of anxiety — not only about Terminator scenarios, but about AI denying their children “curiosity, problem-solving skills, critical thinking skills and creativity,” while they themselves are replaced and devalued:
      • 53% of Americans say AI will worsen people’s ability to think creatively.
      • 50% believe it will hurt our ability to form meaningful relationships.
      • 38% think it will worsen our ability to solve problems.

    Open-ended responses to the surveys reveal a poignant fear: that by offloading cognitive work to algorithms we are changing childhood to the point that we no longer know what kind of adults will result. As one teacher quoted in the study noted, we risk raising a generation that relies on AI so much it never “grows its own curiosity, problem-solving skills, critical thinking skills and creativity.”

    If the people building the future are this out of sync with the people living in it, the impending “techlash” might be more severe than industry anticipates.

    In this episode, Eileen and host Rob Wiblin break down the data on where these groups disagree, where they actually align (nobody trusts the government or companies to regulate this), and why the “digital natives” might actually be the most worried of all.

    This episode was recorded on September 25, 2025.

    Chapters:
      • Cold open (00:00:00)
      • Who’s Eileen Yam? (00:01:30)
      • Is it premature to care what the public says about AI? (00:02:26)
      • The top few feelings the US public has about AI (00:06:34)
      • The public and AI insiders disagree enormously on some things (00:16:25)
      • Fear #1: Erosion of human abilities and connections (00:20:03)
      • Fear #2: Loss of control of AI (00:28:50)
      • Americans don't want AI in their personal lives (00:33:13)
      • AI at work and job loss (00:40:56)
      • Does the public always feel this way about new things? (00:44:52)
      • The public doesn't think AI is overhyped (00:51:49)
      • The AI industry seems on a collision course with the public (00:58:16)
      • Is the survey methodology good? (01:05:26)
      • Where people are positive about AI: saving time, policing, and science (01:12:51)
      • Biggest gaps between experts and the general public, and where they agree (01:18:44)
      • Demographic groups agree to a surprising degree (01:28:58)
      • Eileen’s favourite bits of the survey and what Pew will ask next (01:37:29)

    Video and audio editing: Dominic Armstrong, Milo McGuire, Luke Monsour, and Simon Monsour
    Music: CORBIT
    Coordination, transcripts, and web: Katy Moore
    --------  
    1:43:24
  • OpenAI: The nonprofit refuses to be killed (with Tyler Whitmer)
    Last December, the OpenAI business put forward a plan to completely sideline its nonprofit board. But two state attorneys general have now blocked that effort and kept that board very much alive and kicking.

    The for-profit’s trouble was that the entire operation was founded on the premise of — and legally pledged to — the purpose of ensuring that “artificial general intelligence benefits all of humanity.” So to get its restructure past regulators, the business entity has had to agree to 20 serious requirements designed to ensure it continues to serve that goal.

    Attorney Tyler Whitmer, as part of his work with Legal Advocates for Safe Science and Technology, has been a vocal critic of OpenAI’s original restructure plan. In today’s conversation, he lays out all the changes and whether they will ultimately matter.

    Full transcript, video, and links to learn more: https://80k.info/tw2

    After months of public pressure and scrutiny from the attorneys general (AGs) of California and Delaware, the December proposal itself was sidelined — and what replaced it is far more complex and goes a fair way towards protecting the original mission:
      • The nonprofit’s charitable purpose — “ensure that artificial general intelligence benefits all of humanity” — now legally controls all safety and security decisions at the company. The four people appointed to the new Safety and Security Committee can block model releases worth tens of billions.
      • The AGs retain ongoing oversight, meeting quarterly with staff and requiring advance notice of any changes that might undermine their authority.
      • OpenAI’s original charter, including the remarkable “stop and assist” commitment, remains binding.

    But significant concessions were made. The nonprofit lost exclusive control of AGI once developed — Microsoft can commercialise it through 2032. And transforming from complete control to this hybrid model represents, as Tyler puts it, “a bad deal compared to what OpenAI should have been.”

    The real question now: will the Safety and Security Committee use its powers? It currently has four part-time volunteer members and no permanent staff, yet they’re expected to oversee a company racing to build AGI while managing commercial pressures in the hundreds of billions.

    Tyler calls on OpenAI to prove they’re serious about following the agreement:
      • Hire management for the SSC.
      • Add more independent directors with AI safety expertise.
      • Maximise transparency about mission compliance.

    “There’s a real opportunity for this to go well. A lot … depends on the boards, so I really hope that they … step into this role … and do a great job. … I will hope for the best and prepare for the worst, and stay vigilant throughout.”

    Chapters:
      • We’re hiring (00:00:00)
      • Cold open (00:00:40)
      • Tyler Whitmer is back to explain the latest OpenAI developments (00:01:46)
      • The original radical plan (00:02:39)
      • What the AGs forced on the for-profit (00:05:47)
      • Scrappy resistance probably worked (00:37:24)
      • The Safety and Security Committee has teeth — will it use them? (00:41:48)
      • Overall, is this a good deal or a bad deal? (00:52:06)
      • The nonprofit and PBC boards are almost the same. Is that good or bad or what? (01:13:29)
      • Board members’ “independence” (01:19:40)
      • Could the deal still be challenged? (01:25:32)
      • Will the deal satisfy OpenAI investors? (01:31:41)
      • The SSC and philanthropy need serious staff (01:33:13)
      • Outside advocacy on this issue, and the impact of LASST (01:38:09)
      • What to track to tell if it's working out (01:44:28)

    This episode was recorded on November 4, 2025.

    Video editing: Milo McGuire, Dominic Armstrong, and Simon Monsour
    Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong
    Music: CORBIT
    Coordination, transcriptions, and web: Katy Moore
    --------  
    1:56:06
  • #227 – Helen Toner on the geopolitics of AGI in China and the Middle East
    With the US racing to develop AGI and superintelligence ahead of China, you might expect the two countries to be negotiating how they’ll deploy AI, including in the military, without coming to blows. But according to Helen Toner, director of the Center for Security and Emerging Technology in DC, “the US and Chinese governments are barely talking at all.”

    Links to learn more, video, and full transcript: https://80k.info/ht25

    In her role as a founder, and now leader, of DC’s top think tank focused on the geopolitical and military implications of AI, Helen has been closely tracking the US’s AI diplomacy since 2019.

    “Over the last couple of years there have been some direct [US–China] talks on some small number of issues, but they’ve also often been completely suspended.” China knows the US wants to talk more, so “that becomes a bargaining chip for China to say, ‘We don’t want to talk to you. We’re not going to do these military-to-military talks about extremely sensitive, important issues, because we’re mad.’”

    Helen isn’t sure the groundwork exists for productive dialogue in any case. “At the government level, [there’s] very little agreement” on what AGI is, whether it’s possible soon, whether it poses major risks. Without shared understanding of the problem, negotiating solutions is very difficult.

    Another issue is that so far the Chinese Communist Party doesn’t seem especially “AGI-pilled.” While a few Chinese companies like DeepSeek are betting on scaling, she sees little evidence Chinese leadership shares Silicon Valley’s conviction that AGI will arrive any minute now, and export controls have made it very difficult for them to access compute to match US competitors.

    When DeepSeek released R1 just three months after OpenAI’s o1, observers declared the US–China gap on AI had all but disappeared. But Helen notes OpenAI has since scaled to o3 and o4, with nothing to match on the Chinese side. “We’re now at something like a nine-month gap, and that might be longer.”

    To find a properly AGI-pilled autocracy, we might need to look at nominal US allies. The US has approved massive data centres in the UAE and Saudi Arabia with “hundreds of thousands of next-generation Nvidia chips” — delivering colossal levels of computing power.

    When OpenAI announced this deal with the UAE, they celebrated that it was “rooted in democratic values,” and would advance “democratic AI rails” and provide “a clear alternative to authoritarian versions of AI.”

    But the UAE scores 18 out of 100 on Freedom House’s democracy index. “This is really not a country that respects rule of law,” Helen observes. Political parties are banned, elections are fake, dissidents are persecuted.

    If AI access really determines future national power, handing world-class supercomputers to Gulf autocracies seems pretty questionable. The justification is typically that “if we don’t sell it, China will” — a transparently false claim, given severe Chinese production constraints. It also raises eyebrows that Gulf countries conduct joint military exercises with China and their rulers have “very tight personal and commercial relationships with Chinese political leaders and business leaders.”

    In today’s episode, host Rob Wiblin and Helen discuss all that and more.

    This episode was recorded on September 25, 2025.

    CSET is hiring a frontier AI research fellow! https://80k.info/cset-role
    Check out its careers page for current roles: https://cset.georgetown.edu/careers/

    Chapters:
      • Cold open (00:00:00)
      • Who’s Helen Toner? (00:01:02)
      • Helen’s role on the OpenAI board, and what happened with Sam Altman (00:01:31)
      • The Center for Security and Emerging Technology (CSET) (00:07:35)
      • CSET’s role in export controls against China (00:10:43)
      • Does it matter if the world uses US AI models? (00:21:24)
      • Is China actually racing to build AGI? (00:27:10)
      • Could China easily steal AI model weights from US companies? (00:38:14)
      • The next big thing is probably robotics (00:46:42)
      • Why is the Trump administration sabotaging the US high-tech sector? (00:48:17)
      • Are data centres in the UAE “good for democracy”? (00:51:31)
      • Will AI inevitably concentrate power? (01:06:20)
      • “Adaptation buffers” vs non-proliferation (01:28:16)
      • Will the military use AI for decision-making? (01:36:09)
      • “Alignment” is (usually) a terrible term (01:42:51)
      • Is Congress starting to take superintelligence seriously? (01:45:19)
      • AI progress isn't actually slowing down (01:47:44)
      • What's legit vs not about OpenAI’s restructure (01:55:28)
      • Is Helen unusually “normal”? (01:58:57)
      • How to keep up with rapid changes in AI and geopolitics (02:02:42)
      • What CSET can uniquely add to the DC policy world (02:05:51)
      • Talent bottlenecks in DC (02:13:26)
      • What evidence, if any, could settle how worried we should be about AI risk? (02:16:28)
      • Is CSET hiring? (02:18:22)

    Video editing: Luke Monsour and Simon Monsour
    Audio engineering: Milo McGuire, Simon Monsour, and Dominic Armstrong
    Music: CORBIT
    Coordination, transcriptions, and web: Katy Moore
    --------  
    2:20:02

About 80,000 Hours Podcast

Unusually in-depth conversations about the world's most pressing problems and what you can do to solve them. Subscribe by searching for '80000 Hours' wherever you get podcasts. Hosted by Rob Wiblin and Luisa Rodriguez.
Generated: 12/3/2025 - 9:04:22 PM