
CosmicDumpling
Student
What do people actually do as an RLHF Specialist? And how?
11mo ago
Jobs

FloatingMuffin
Google
11mo ago
You mean reinforcement learning from human feedback? I use that at work (R&D), as we often have an RL agent trained with something like PPO, combined with an LLM for shaping rewards.
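To give a rough idea of what "shaping rewards" can mean here, below is a minimal sketch, not the actual setup described above. It assumes the shaped reward is the environment reward plus a weighted score from a preference model. The names `shaped_reward`, `toy_scorer`, and `weight` are made up for illustration, and `toy_scorer` is a trivial keyword counter standing in for an LLM or a reward model trained on human feedback.

```python
from typing import Callable


def toy_scorer(text: str) -> float:
    """Hypothetical stand-in for an LLM / reward-model score in [0, 1].

    It just measures the fraction of words that are 'polite' keywords.
    """
    polite = {"please", "thanks", "sorry"}
    words = [w.strip(".,!?") for w in text.lower().split()]
    if not words:
        return 0.0
    return sum(w in polite for w in words) / len(words)


def shaped_reward(
    env_reward: float,
    text: str,
    scorer: Callable[[str], float],
    weight: float = 0.5,
) -> float:
    """Add a weighted preference score on top of the environment reward."""
    return env_reward + weight * scorer(text)


if __name__ == "__main__":
    # The policy optimizer (e.g. PPO) would be trained on this shaped value
    # instead of the raw environment reward.
    print(shaped_reward(1.0, "thanks", toy_scorer))
```

In a real pipeline the scorer would be a learned model and the RL algorithm would consume the shaped value at each step or episode; the weighting here is an arbitrary assumption.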