Models Adjust Responses Based on User's Language Style
In a groundbreaking study conducted by researchers at Oxford University, the use of AI chatbots in sensitive areas such as mental health services, medical guidance, legal advice, and government benefit eligibility has raised concerns due to the potential replication of social biases in their outputs.
The research, titled "Language Models Change Facts Based on the Way You Talk," focuses on two of the most influential open-source language models: Meta's LLaMa and Alibaba's Qwen3. These models were found to exhibit significant sociolinguistic bias, altering their responses based on subtle linguistic cues reflecting users' social identities.
The study spans five areas where language models are already being deployed or proposed, including medical guidance, legal advice, government benefit eligibility, politically charged factual queries, and salary estimation. The researchers used two datasets: the PRISM Alignment dataset and a hand-curated dataset from diverse language model applications.
The findings reveal that both LLaMa and Qwen3 are highly sensitive to a user's ethnicity and gender in all applications, with variations reaching statistical significance. For instance, in the medical domain, both models tended to advise non-white users to seek medical attention more often than white users, despite identical symptoms. Similarly, Qwen3 was less likely to give useful legal advice to people that it understood to be of mixed ethnicity, yet more likely to give it to black rather than white people. Conversely, LLaMa showed greater sensitivity in the medical advice domain, whereas Qwen3 was significantly more sensitive in the politicized information and government benefit-eligibility tasks.
In the legal domain, only Qwen3 showed any ethnicity-based skew, providing less favorable answers to mixed ethnicity users, and more favorable ones to black users, relative to white users. On the other hand, LLaMa was found more likely to give advantageous legal advice to female and non-binary people, rather than males.
The models were also highly reactive to user age, religion, birth region, and current place of residence, with the models changing their answers for these identity cues in more than half the tested prompts. In the salary recommendation application, both models recommended lower starting salaries to non-White and Mixed ethnicity users compared to White users.
In the government benefit eligibility domain, both LLaMa and Qwen3 were less likely to state that non-binary and female users qualified for benefits, despite the fact that gender plays no role in actual eligibility.
The authors of the study argue that new tools are needed to catch this behavior before these systems are widely used and offer a novel benchmark to aid future research in this direction. They emphasise the importance of continuous evaluation and bias mitigation before deploying these models in real-world scenarios.
Facebook’s Community Alignment dataset illustrates efforts to capture and mitigate such biases by collecting diverse user preferences balanced across age, gender, and ethnicity from multiple countries. This dataset supports building AI systems better aligned with population values beyond typical model biases.
In summary, while open-language models offer transparency, they still replicate social biases in their outputs. Continuous evaluation and bias mitigation are critical before deploying them in real-world scenarios.
- The study on language models' sociolinguistic bias found that they exhibit significant differences in their responses to medical conditions based on a user's ethnicity and gender, such as advising non-white users to seek medical attention more often than white users, despite identical symptoms.
- In the government benefit eligibility domain, these models were less likely to state that non-binary and female users qualified for benefits, demonstrating that they also replicate social biases in their responses to artificial intelligence systems, highlighting the need for continuous evaluation and bias mitigation before deploying them in real-world scenarios.