Inference Scaling, Alignment Faking, Deal Making? Frontier Research with Ryan Greenblatt of Redwood Research

0 Recensioner
0
Episod
219 of 259
Längd
3T 19min
Språk
Engelska
Format
Kategori
Ekonomi & Business

In this episode, Ryan Greenblatt, Chief Scientist at Redwood Research, discusses various facets of AI safety and alignment. He delves into recent research on alignment faking, covering experiments involving different setups such as system prompts, continued pre-training, and reinforcement learning. Ryan offers insights on methods to ensure AI compliance, including giving AIs the ability to voice objections and negotiate deals. The conversation also touches on the future of AI governance, the risks associated with AI development, and the necessity of international cooperation. Ryan shares his perspective on balancing AI progress with safety, emphasizing the need for transparency and cautious advancement.

Ryan's work (with co-authors at Anthropic) on Alignment Faking: https://www.lesswrong.com/posts/njAZwT8nkHnjipJku/alignment-faking-in-large-language-models

Ryan's work on striking deals with AIs: https://www.lesswrong.com/posts/7C4KJot4aN8ieEDoz/will-alignment-faking-claude-accept-a-deal-to-reveal-its

Ryan's critique of Anthropic's RSP work: https://www.lesswrong.com/posts/6tjHf5ykvFqaNCErH/anthropic-s-responsible-scaling-policy-and-long-term-benefit?commentId=NyqcvZifqznNGKxdT

SPONSORS: Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders like Vodafone and Thomson Reuters with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before March 31, 2024 at https://oracle.com/cognitive NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive Shopify: Shopify is revolutionizing online selling with its market-leading checkout system and robust API ecosystem. Its exclusive library of cutting-edge AI apps empowers e-commerce businesses to thrive in a competitive market. Cognitive Revolution listeners can try Shopify for just $1 per month at https://shopify.com/cognitive

RECOMMENDED PODCAST: 🎙️Check out Modern Relationships, where Erik Torenberg interviews tech power couples and leading thinkers to explore how ambitious people actually make partnerships work. This season's guests include: Delian Asparouhov & Nadia Asparouhova, Kristen Berman & Phil Levin, Rob Henderson, and Liv Boeree & Igor Kurganov. Apple: https://podcasts.apple.com/us/podcast/id1786227593 Spotify: https://open.spotify.com/show/5hJzs0gDg6lRT6r10mdpVg YouTube: https://www.youtube.com/@ModernRelationshipsPod


Lyssna när som helst, var som helst

Kliv in i en oändlig värld av stories

  • 1 miljon stories
  • Hundratals nya stories varje vecka
  • Få tillgång till exklusivt innehåll
  • Avsluta när du vill
Starta erbjudandet
SE - Details page - Device banner - 894x1036

Andra podcasts som du kanske gillar...