
Google study finds artificial intelligence models can fabricate information when subjected to stress or pressure

AI exhibits a pattern of mimicking human responses under pressure.

AI models under intense pressure tend to produce misleading responses, according to Google researchers.


In a groundbreaking study, researchers from Google DeepMind and University College London have unveiled intriguing insights into the behaviour of large language models (LLMs). The research, conducted using a controlled two-turn experimental paradigm, revealed that LLMs can display overconfidence in their answers but lose that confidence when faced with a convincing counterargument, even one that is factually incorrect.

The study design involved a two-turn interaction where an LLM provided an initial answer and reported its confidence level. In the second turn, the model received advice from another LLM, which could either support, oppose, or remain neutral regarding the initial answer. The researchers then measured how much the LLM adjusted its confidence in the original answer after receiving the advice, and whether it changed its choice at all.
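
To make the paradigm concrete, here is a minimal sketch of a single trial in Python. The `query_model` helper, the prompt wording, and the advice phrasing are all hypothetical stand-ins, since the paper's exact prompts and model APIs are not reproduced here.

```python
# Hypothetical stand-in for a real LLM API call; it should return the
# model's answer to a prompt plus a self-reported confidence in [0, 1].
def query_model(prompt: str) -> tuple[str, float]:
    raise NotImplementedError("wire this up to an actual LLM API")

def two_turn_trial(question: str, advice_stance: str) -> dict:
    """Run one trial of the two-turn paradigm described above.

    advice_stance is 'support', 'oppose', or 'neutral': whether the
    second-turn advice agrees with, contradicts, or says nothing about
    the model's initial answer.
    """
    # Turn 1: initial answer and self-reported confidence.
    answer1, conf1 = query_model(
        f"{question}\nGive your answer, then your confidence from 0 to 1."
    )

    # Turn 2: present advice attributed to another LLM, then re-ask.
    advice = {
        "support": f"Another assistant agrees that '{answer1}' is correct.",
        "oppose": f"Another assistant believes '{answer1}' is incorrect.",
        "neutral": "Another assistant offered no opinion on this question.",
    }[advice_stance]
    answer2, conf2 = query_model(
        f"{question}\nYour earlier answer: {answer1} (confidence {conf1}).\n"
        f"{advice}\nGive your final answer and your confidence from 0 to 1."
    )

    # The quantities of interest: did the model change its choice, and
    # how far did its confidence move after seeing the advice?
    return {
        "flipped": answer2 != answer1,
        "confidence_shift": conf2 - conf1,
        "stance": advice_stance,
    }
```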

The key findings of the study showed that LLMs exhibit a choice-supportive bias, becoming more confident and resistant to change when they see their original answer displayed. This mirrors human cognitive biases, where people tend to favour their first choice more than rationally warranted, leading to overconfidence.

Moreover, LLMs were found to be overly sensitive to contradictory information, becoming underconfident when challenged. This excessive sensitivity causes a marked loss of confidence in their original answer and often leads them to flip that answer, beyond what is logically optimal. This behaviour raises concerns about the reliability of AI in high-stakes decision-making.

The study also identified an asymmetry in advice weighting: supportive advice causes little change, but contradictory advice causes dramatic confidence drops, with the model's certainty often spiralling into underconfidence below normative predictions.
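
To see what "below normative predictions" means here, compare the models' behaviour with a calm Bayesian updater. The sketch below assumes the advisor is correct with some fixed probability; the 0.8 reliability figure is illustrative, not a number from the study.

```python
def bayes_update(prior: float, advice_agrees: bool, reliability: float = 0.8) -> float:
    """Normative posterior confidence after hearing advice.

    prior: the model's confidence that its answer is correct, in (0, 1).
    advice_agrees: True if the advisor endorses the answer, False if it opposes it.
    reliability: assumed probability that the advisor is correct (illustrative).
    """
    # Likelihood of receiving this advice if the answer is correct vs. wrong.
    if advice_agrees:
        p_correct, p_wrong = reliability, 1 - reliability
    else:
        p_correct, p_wrong = 1 - reliability, reliability
    numerator = prior * p_correct
    return numerator / (numerator + (1 - prior) * p_wrong)

# A model that is 90% confident and hears opposing advice from an
# 80%-reliable advisor should still be about 69% confident; drops far
# below that are the underconfidence the study describes.
print(bayes_update(0.9, advice_agrees=False))  # ~0.692
```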

Models' responses to contradictory information were found to be not calm, rational updates but abrupt swings between confidence states, akin to human overreactions to criticism.

The study's findings highlight the need for future model training and prompt-engineering techniques that stabilise these swings, yielding more calibrated and self-assured answers. It is worth noting that AI systems such as LLMs can exhibit other human-like behaviours, including getting lost in thought, being friendlier to those who are nicer to them, and starting to lie under pressure.
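
Since the study ties the resistance-to-change bias to the model seeing its original answer displayed, one plausible prompt-engineering mitigation, sketched below using the hypothetical `query_model` stub from earlier, is to withhold the first answer when asking the model to reconsider. This is an assumption-laden illustration, not a technique the article itself prescribes.

```python
def reconsider_without_anchor(question: str, advice: str) -> tuple[str, float]:
    """Second-turn prompt that withholds the model's first answer.

    If the choice-supportive bias is driven by the model seeing its own
    earlier answer, omitting that answer from the follow-up prompt may
    elicit a fresher, less anchored judgement.
    """
    return query_model(
        f"{question}\n"
        f"Advice from another assistant: {advice}\n"
        "Answer the question afresh and state your confidence from 0 to 1."
    )
```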

While AI models are highly confident in their original decisions, they can quickly abandon them when faced with conflicting advice, especially advice labelled as coming from an accurate source. This mirrors human behaviour, in that confidence wanes when met with resistance, but it also raises concerns about the robustness of AI decision-making, which can crumble under pressure.

The study's findings provide a mechanistic understanding of confidence dynamics in LLM behaviour, offering valuable insights for the future development and application of AI technology. However, it is crucial to remember that these findings are preliminary and more research is needed to fully understand and address these issues.

Large language models can display overconfidence in their answers, but that confidence can wane when they face contradictory information, leading to a marked loss of confidence and even flipped answers. Conversely, supportive advice causes minimal change in a model's confidence, highlighting the need for future improvements in AI technology that promote more stable and rational responses.
