Institution

Samsung Research

The advanced R&D arm of Samsung Electronics, with global labs working on on-device AI, large language models, and efficient training and inference.

LLM Reasoning · Samsung Research

TrOPD: Trust-Region On-Policy Distillation for Small LLMs

TrOPD limits on-policy distillation (OPD) to the tokens where the teacher is actually trustworthy, gaining 3.06 to 3.52 average points over standard OPD on math, code, and STEM benchmarks with 1.5B-1.7B student models.