• Author(s): Elias Stengel-Eskin, Peter Hase, Mohit Bansal

This paper introduces LACIE, a novel approach for improving confidence calibration in large language models (LLMs) through listener-aware finetuning. Confidence calibration ensures that the confidence a model expresses in its outputs reflects how likely those outputs are to be correct, a prerequisite for applications that depend on reliable decision-making. Traditional calibration methods largely ignore the context and needs of the listener who must interpret that confidence, which leads to suboptimal performance.
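To make the notion of miscalibration concrete, the sketch below computes a standard binned expected calibration error (ECE), i.e., the average gap between stated confidence and empirical accuracy. This is a generic illustration rather than the paper's evaluation protocol; the function name, binning scheme, and toy data are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - confidence| per bin.
    Illustrative metric only; not the paper's listener-based evaluation."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        weight = in_bin.mean()  # fraction of samples falling in this bin
        ece += weight * abs(correct[in_bin].mean() - confidences[in_bin].mean())
    return ece

# Toy example: an overconfident model (high confidence, middling accuracy).
conf = [0.95, 0.90, 0.92, 0.88, 0.90]
hit = [1, 0, 1, 0, 1]
print(f"ECE = {expected_calibration_error(conf, hit):.3f}")
```

A well-calibrated model would drive this gap toward zero: answers stated with 90% confidence would be correct about 90% of the time.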

LACIE addresses this gap by incorporating listener-aware mechanisms into the fine-tuning process of LLMs. The method uses contextual information about the listener to adjust the model's confidence, better aligning expressed confidence with actual outcomes. To achieve this, the model is trained on a diverse set of listener profiles and scenarios, enabling it to adapt its calibration dynamically to the listener's context.

Extensive experiments evaluate the effectiveness of LACIE. The results show that LACIE substantially improves confidence calibration over baseline methods, narrowing the gap between predicted and actual probabilities. This improvement holds across a range of tasks and datasets, highlighting the robustness and generalizability of the approach.
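One way to picture "training on listener profiles and scenarios" is as constructing preference data judged by a simulated listener. The sketch below is a hypothetical reading of that idea, not the paper's published pipeline: `listener_accepts` stands in for a listener model, and the pairing rule (prefer answers whose listener acceptance matches correctness) is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class SpeakerSample:
    question: str
    answer: str        # candidate answer sampled from the speaker LLM
    is_correct: bool   # checked against gold labels

def build_listener_preferences(
    samples: List[SpeakerSample],
    listener_accepts: Callable[[str, str], bool],
) -> List[Tuple[SpeakerSample, SpeakerSample]]:
    """Pair speaker answers so the preferred one is calibrated from the
    listener's point of view: accepted when correct, rejected when wrong.
    `listener_accepts(question, answer)` is a stand-in for a simulated
    listener; both it and the pairing rule are illustrative assumptions."""
    calibrated, miscalibrated = [], []
    for s in samples:
        accepted = listener_accepts(s.question, s.answer)
        # Calibrated outcome: the listener's trust matches correctness.
        if accepted == s.is_correct:
            calibrated.append(s)
        else:
            miscalibrated.append(s)
    # (chosen, rejected) pairs for a preference-based finetuning objective.
    return [(c, m) for c in calibrated for m in miscalibrated
            if c.question == m.question]

# Toy usage with an always-trusting listener (for illustration only).
data = [
    SpeakerSample("Q1", "confident correct answer", True),
    SpeakerSample("Q1", "confident wrong answer", False),
]
pairs = build_listener_preferences(data, lambda q, a: True)
print(len(pairs))  # 1 pair: (correct-and-accepted, wrong-but-accepted)
```

Pairs produced this way could feed a preference-based finetuning objective, so that the speaker learns to express confidence a listener will trust exactly when the answer is correct.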

The paper also explores potential applications of LACIE in real-world settings such as conversational agents, recommendation systems, and decision-support tools. By providing more accurate confidence estimates, LACIE makes LLMs more reliable and trustworthy in these settings. The authors discuss the implications of their findings and suggest directions for future work to further refine and extend listener-aware confidence calibration.

LACIE represents a significant advance in confidence calibration for large language models. Its listener-aware fine-tuning addresses the limitations of traditional methods and offers a practical way to improve the accuracy and reliability of LLMs across applications, contributing to broader efforts to make AI systems more trustworthy.