What is RLHF and what problem does it solve in LLM training?
Reinforcement Learning from Human Feedback — it uses human preference ratings to steer the model toward more helpful and less harmful outputs
Rapid Language Hyperparameter Fitting — it automatically optimises temperature and sampling settings
Recurrent Layer Hidden Features — it adds memory to Transformer layers
Retrieval-Linked Hallucination Fix — it reduces hallucination by grounding answers in retrieved documents