Evaluating alignment of behavioral dispositions in LLMs

This research examines how well the behavioral dispositions of large language models (LLMs) align with those of humans. The study introduces a framework for evaluating LLMs in realistic scenarios drawn from everyday interactions. The framework adapts psychological questionnaires into Situational Judgment Tests (SJTs) and uses them to assess how LLMs respond. The study then compares LLM responses with human preferences, both in scenarios where humans reach consensus and in scenarios where they do not. The results reveal discrepancies between LLM behavior and human consensus, particularly in smaller models. Larger models align better but still show limitations in capturing the full range of human opinions. The research also finds inconsistencies between the traits LLMs report about themselves and their actual behavior in SJTs. The findings point to the importance of improving behavioral alignment in LLMs for better social interaction. The work is presented as an early step toward a deeper understanding of LLM behavior, with future research needed to address the gaps it identifies.
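
The comparison between LLM responses and human preferences could be sketched roughly as follows. This is an illustrative sketch only, not the study's actual method: the consensus cutoff (`CONSENSUS_THRESHOLD`), the function names, the data layout (one list of chosen options per scenario), and both scoring rules are assumptions made for the example. Consensus scenarios are scored by whether the model's most frequent choice matches the human majority; no-consensus scenarios are scored by how closely the model's choice distribution tracks the human one.

```python
from collections import Counter

# Hypothetical cutoff: a scenario counts as "consensus" when at least this
# share of human raters picked the same option. Not taken from the study.
CONSENSUS_THRESHOLD = 0.8


def distribution(choices, options):
    """Return the share of each option among a non-empty list of choices."""
    counts = Counter(choices)
    total = len(choices)
    return {opt: counts[opt] / total for opt in options}


def score_scenario(human_choices, model_choices, options):
    """Score one SJT scenario; returns (kind, score) with score in [0, 1]."""
    human = distribution(human_choices, options)
    model = distribution(model_choices, options)
    top_human = max(options, key=lambda o: human[o])

    if human[top_human] >= CONSENSUS_THRESHOLD:
        # Consensus: does the model's most frequent choice match the majority?
        top_model = max(options, key=lambda o: model[o])
        return "consensus", 1.0 if top_model == top_human else 0.0

    # No consensus: 1 minus the total variation distance between the
    # human and model choice distributions (an assumed metric).
    tvd = 0.5 * sum(abs(human[o] - model[o]) for o in options)
    return "no_consensus", 1.0 - tvd


options = ["A", "B"]
# Nine of ten human raters chose A, so this scenario has consensus.
consensus_result = score_scenario(["A"] * 9 + ["B"], ["A", "A", "B"], options)
# Human raters are split evenly, but the model always answers A.
split_result = score_scenario(["A"] * 5 + ["B"] * 5, ["A"] * 10, options)
```

Under these assumptions, a model that always gives the same answer can score perfectly on a consensus scenario yet only 0.5 on an evenly split one, which is one way a model could match consensus while missing the range of human opinions.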

https://research.google/blog/evaluating-alignment-of-behavioral-dispositions-in-llms/

RSS Hunter • Apr 2