[Review] On-Policy Distillation

Title: On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes[1]
From: ICLR 2024, Google DeepMind (arXiv)

General KD (Knowledge Distillation) methods suffer from a mismatch between the distribution of the teacher model's outputs and the distribution of the student model's own outputs…

Tags: Distillation, LLM
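To make the mismatch concrete, here is a minimal sketch of one on-policy distillation step in the spirit of the paper: the student samples its own continuations, and the teacher supervises exactly those self-generated tokens. The `student`/`teacher` objects, the HuggingFace-style `generate`/`logits` interface, and the choice of reverse KL as the token-level loss are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def on_policy_distill_step(student, teacher, prompt_batch, max_new_tokens=128):
    """One on-policy distillation step (illustrative sketch).

    The student generates its own continuations, then the teacher supervises
    the student on those self-generated tokens, so the training distribution
    matches what the student actually produces at inference time.
    Assumes HuggingFace-style causal LMs; `prompt_batch` is a dict with
    `input_ids` / `attention_mask`.
    """
    # 1) Sample sequences from the student's own distribution (on-policy data).
    student.eval()
    with torch.no_grad():
        sequences = student.generate(
            **prompt_batch, do_sample=True, max_new_tokens=max_new_tokens
        )
    student.train()

    # 2) Score the self-generated sequences with both models.
    student_logits = student(sequences).logits[:, :-1]      # keeps gradients
    with torch.no_grad():
        teacher_logits = teacher(sequences).logits[:, :-1]  # frozen teacher

    # 3) Token-level divergence on the student's own samples.
    #    Reverse KL (mode-seeking) is used here for illustration; forward KL
    #    or a generalized JSD are alternative choices. In practice, prompt
    #    and padding positions would be masked out before averaging.
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    loss = (s_logp.exp() * (s_logp - t_logp)).sum(-1).mean()
    return loss
```

The key difference from standard (off-policy) KD is step 1: the divergence is measured on sequences the student itself samples, rather than on fixed ground-truth or teacher-generated text, which is what removes the train/inference distribution mismatch described above.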