Deep Papers | Listen to the podcast online for free

Available episodes

5 of 43

How DeepSeek is Pushing the Boundaries of AI Development
This week, we dive into DeepSeek. SallyAnn DeLucia, Product Manager at Arize, and Nick Luzio, a Solutions Engineer, break down key insights on a model that has been dominating headlines for its significant breakthrough in inference speed over other models. What’s next for AI (and open source)? From training strategies to real-world performance, here’s what you need to know. Read a summary: https://arize.com/blog/how-deepseek-is-pushing-the-boundaries-of-ai-development/ Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
--------
29:54
Multiagent Finetuning: A Conversation with Researcher Yilun Du
We talk to Google DeepMind Senior Research Scientist (and incoming Assistant Professor at Harvard) Yilun Du about his latest paper, "Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains." This paper introduces a multiagent finetuning framework that enhances the performance and diversity of language models by employing a society of agents with distinct roles, improving feedback mechanisms and overall output quality. The method enables autonomous self-improvement through iterative finetuning, achieving significant performance gains across various reasoning tasks. It's versatile, applicable to both open-source and proprietary LLMs, and can integrate with human-feedback-based methods like RLHF or DPO, paving the way for future advancements in language model development. Read an overview on the blog. Watch the full discussion. Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
--------
30:03
Training Large Language Models to Reason in Continuous Latent Space
LLMs have typically been restricted to reasoning in the "language space," where chain-of-thought (CoT) is used to solve complex reasoning problems. But a new paper argues that language space may not always be the best for reasoning. In this paper read, we cover an exciting new technique from a team at Meta called Chain of Continuous Thought, also known as "Coconut." The paper, "Training Large Language Models to Reason in a Continuous Latent Space," explores the potential of allowing LLMs to reason in an unrestricted latent space instead of being constrained by natural language tokens. Read a full breakdown of Coconut on our blog. Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
--------
24:58
LLMs as Judges: A Comprehensive Survey on LLM-Based Evaluation Methods
We discuss a major survey of work and research on LLM-as-Judge from the last few years. "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods" systematically examines the LLMs-as-Judge framework across five dimensions: functionality, methodology, applications, meta-evaluation, and limitations. This survey gives us a bird's-eye view of the approach's advantages and limitations, along with methods for evaluating its effectiveness. Read a breakdown on our blog: https://arize.com/blog/llm-as-judge-survey-paper/ Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
--------
28:57
Merge, Ensemble, and Cooperate! A Survey on Collaborative LLM Strategies
LLMs have revolutionized natural language processing, showcasing remarkable versatility and capabilities. But individual LLMs often exhibit distinct strengths and weaknesses, influenced by differences in their training corpora. This diversity poses a challenge: how can we maximize the efficiency and utility of LLMs? A new paper, "Merge, Ensemble, and Cooperate: A Survey on Collaborative Strategies in the Era of Large Language Models," highlights collaborative strategies to address this challenge. In this week's episode, we summarize key insights from this paper and discuss practical implications of LLM collaboration strategies across three main approaches: merging, ensemble, and cooperation. We also review some new open source models we're excited about. Learn more about AI observability and evaluation in our course, join the Arize AI Slack community, or get the latest on LinkedIn and X.
--------
28:47

About Deep Papers

Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.

Podcast website

