By an ML Engineer who’s always learning, across software, finance, consulting, and marketing
I’ve worn many hats in my career — from writing code at a software startup to crunching numbers in finance, advising clients as a consultant, and even dabbling in marketing analytics. Through it all, one thing has been constant: the need to keep up with the breakneck pace of technology.
As an ML engineer, you’ve likely noticed that fine-tuning is popping up everywhere these days. Training large language models (LLMs) to perform specific tasks or align with human preferences has become a hot topic, and terms like RLHF and PPO keep coming up alongside libraries such as Hugging Face’s TRL, which many see as the next big thing.
Let’s explore, in plain English, what fine-tuning is, how it is done, and how new approaches like Reinforcement Learning from Human Feedback (RLHF) are changing the game. I’ll also introduce Hugging Face’s TRL library, which has helped me tremendously in making these sophisticated fine-tuning techniques more approachable. My goal is to provide a concise summary, without technical jargon, of why this emerging field is important for engineers to understand.