📰 rlhf

A Little Bit of Reinforcement Learning from Human Feedback

A short introduction to RLHF and post-training focused on language models.

weeklyfoo #71 / 2025-02-10