CosmicDumpling
Student

What do people actually do as an RLHF Specialist? And how?

11mo ago
FloatingMuffin
Google

11mo ago

You mean reinforcement learning from human feedback? I use it at work (R&D): we often have an RL agent trained with an algorithm like PPO, combined with an LLM for shaping rewards.
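
To make "an LLM for shaping rewards" a bit more concrete, here is a minimal sketch of the idea, not an actual production setup. It assumes the shaped reward is just the environment reward plus a weighted score from a judge model. The names `shaped_reward`, `toy_scorer`, and `beta` are hypothetical, and `toy_scorer` is a keyword check standing in for a real LLM call.

```python
from typing import Callable


def shaped_reward(
    env_reward: float,
    transcript: str,
    scorer: Callable[[str], float],
    beta: float = 0.5,
) -> float:
    """Add a weighted judge score to the raw environment reward.

    `scorer` is assumed to return a value in [0, 1]; `beta` controls how
    much the judge's opinion counts relative to the environment reward.
    """
    return env_reward + beta * scorer(transcript)


def toy_scorer(transcript: str) -> float:
    # Hypothetical stand-in for an LLM judge: a real system would call a
    # model here instead of matching a keyword.
    return 1.0 if "thanks" in transcript.lower() else 0.0


if __name__ == "__main__":
    print(shaped_reward(1.0, "Thanks, that helped!", toy_scorer))
```

In a real pipeline the shaped value would be what the PPO update sees as the reward for that step or episode; everything else about training stays the same.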
