Self-distillation (SSD) enables large language models to improve their code generation using their own raw outputs, without a verifier or teacher model: candidate solutions are sampled with specific temperature and truncation settings, and the model is then fine-tuned on them.
arxiv.org
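The sampling step described above can be sketched in a minimal form. This is an illustrative implementation of temperature scaling plus top-k truncation over a logit vector, not the paper's exact procedure; the function name and default settings are assumptions for the example.

```python
import math
import random


def sample_with_truncation(logits, temperature=0.8, top_k=20, rng=None):
    """Sample a token index from `logits` (hypothetical helper).

    Applies temperature scaling, keeps only the top_k highest logits
    (truncation), renormalizes with a softmax, and draws one index.
    """
    rng = rng or random.Random(0)
    # Scale logits by temperature; lower temperature sharpens the distribution.
    scaled = [(i, l / temperature) for i, l in enumerate(logits)]
    # Truncate to the top_k candidates by scaled logit.
    scaled.sort(key=lambda pair: pair[1], reverse=True)
    kept = scaled[:top_k]
    # Numerically stable softmax over the kept candidates.
    m = max(v for _, v in kept)
    weights = [math.exp(v - m) for _, v in kept]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Draw one index according to the renormalized probabilities.
    r = rng.random()
    cum = 0.0
    for (i, _), p in zip(kept, probs):
        cum += p
        if r <= cum:
            return i
    return kept[-1][0]
```

In the self-distillation loop, a sampler like this would generate multiple solutions per prompt, which then become the fine-tuning data for the same model.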