UCLA Researchers Introduce Group Preference Optimization (GPO): A Machine Learning-based Alignment Framework that Steers Language Models to Preferences of Individual Groups in a Few-Shot Manner
Large Language Models (LLMs) are increasingly used across a range of domains, with use cases including creative writing, chatbots, and semantic search. Many of these applications are inherently subjective and require generations that cater to different demographics, cultural and societal norms, or individual preferences. Through their large-scale training, current language models are exposed to diverse data that allows…