Adversarial Moral Stress Testing of Large Language Models

By: Saeid Jamshidi, Foutse Khomh, Arghavan Moradi Dakhel, Amin Nikanjam, Mohammad Hamdaqa, Kawser Wazed Nafi

Published: 2026-04-02

#cs.AI

Abstract

This paper investigates adversarial moral stress testing of large language models, aiming to identify vulnerabilities and biases in their ethical decision-making under challenging conditions. Such testing is essential for deploying ethical and robust AI systems.
