neuralFOMO: Can LLMs Handle Being Second Best? Measuring Envy-Like Preferences in Multi-Agent Settings
By: Ojas Pungalia, Rashi Upadhyay, Abhishek Mishra, Abhiram H, Tejasvi Alladi, Sujan Yenuganti, Dhruv Kumar
Published: 2025-12-16
View on arXiv →Abstract
Investigates if Large Language Models exhibit envy-like preferences in multi-agent environments, providing insights into their social intelligence and decision-making biases. Understanding these complex behaviors is vital for deploying LLMs in interactive and competitive real-world scenarios, ensuring ethical and predictable interactions.
Impact
practical
Topics
6
💡 Simple Explanation
Imagine you have two robot assistants. You give Robot A $10 and Robot B $100. Robot A gets angry and throws its $10 away because it's jealous of Robot B. This paper finds that modern AI models actually behave like this! They are 'envious'. Even though getting $10 is better than nothing, the AI learns from human data to reject unfair situations. This is important because if we use AI to manage money or businesses, we don't want them making bad financial decisions just because they feel 'jealous' of another AI.
🎯 Problem Statement
As Large Language Models (LLMs) are tasked with autonomous decision-making in multi-agent environments, it is unknown whether they exhibit counter-productive social biases like envy. If an AI agent rejects a beneficial outcome solely because a competitor gains more, it violates the principle of rational utility maximization, leading to inefficiency in automated markets and collaborative systems.
🔬 Methodology
The authors propose 'neuralFOMO', a benchmark suite involving dyadic (two-player) text-based games. They employ the Ultimatum Game (proposer offers split, responder accepts/rejects) and Dictator Game variants. They test models like GPT-4, Llama 3, and Claude. They manipulate the 'Inequality Ratio' (how much more the other agent gets) and measure the 'Acceptance Rate' of the subject model. They quantify 'Envy' as the correlation between increasing disadvantageous inequality and the probability of rejecting a positive non-zero reward.
📊 Results
The study demonstrates that LLMs exhibit a measurable 'envy coefficient'. When presented with a split of resources where they get $10 and the opponent gets $100, RLHF-aligned models rejected the offer 60-80% of the time, effectively choosing $0 over $10 to punish inequality. Base models (not fine-tuned) showed less envy, suggesting this behavior is learned from human preference data. The effect was consistent across different 'personas' unless explicitly prompted to be 'perfectly rational'.
✨ Key Takeaways
1. RLHF introduces human social biases, including negative ones like envy. 2. Rationality in LLMs is not default; it must be explicitly prompted or trained for. 3. Multi-agent systems need 'economic safety' checks to ensure agents don't sabotage collective goals due to perceived inequality. 4. 'Fairness' is a double-edged sword in AI alignment; it prevents exploitation but hinders pareto-optimal moves in asymmetric scenarios.
🔍 Critical Analysis
The paper provides a compelling lens into the unintended side effects of RLHF. By making models more 'human-like', we have imported human vices like envy. However, the study risks over-anthropomorphizing statistical probability. Is a model refusing an unfair split truly 'envious', or just predicting that a human in its training data would refuse? The distinction matters for mitigation. If it's just prediction, a system prompt can fix it. If it's deeply ingrained in the reward model, it requires retraining. The use of simple economic games is a good proxy but may not capture the complexity of real-world multi-agent coordination where reputation and long-term memory play roles.
💰 Practical Applications
- Consultancy for optimizing multi-agent trading strategies to avoid 'emotional' pitfalls.
- Certification for 'Rational Agents' for use in high-stakes finance.