[Summary] Mitigating Hallucinations in Multimodal LLMs With Attention Causal Decoding

TL;DR: Hallucinations in multimodal LLMs fall into two categories: initial hallucinations, caused by insufficient model knowledge, and snowball hallucinations, where prior errors are reinforced for consistency. FarSight tackles both by redesigning information propagation: (i) sink tokens absorb uninformative signals to prevent downstream pollution and (ii) attention decay grounds the model in early generation tokens, curbing the snowball effect.

Motivation: Two key observations drive this work:

- Attention collapse: models disproportionately attend to low-information tokens (e....
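To make the two mechanisms concrete, below is a minimal, self-contained sketch of how a sink slot and an attention-decay bias could be wired into causal attention. The function name, the `decay_rate` and `sink_logit` parameters, and the linear decay form are illustrative assumptions, not FarSight's published formulation.

```python
# Toy sketch (assumptions, not FarSight's actual code): single-head causal
# attention with (i) an extra "sink" logit column that can absorb attention
# mass otherwise dumped on low-information tokens, and (ii) a linear penalty
# on later keys so queries stay anchored to early, better-grounded tokens.
import torch
import torch.nn.functional as F

def causal_attention_with_sink_and_decay(q, k, v, decay_rate=0.05, sink_logit=0.0):
    """q, k, v: (seq_len, d) tensors for one head; returns (seq_len, d)."""
    seq_len, d = q.shape
    scores = (q @ k.T) / d ** 0.5                      # (seq_len, seq_len)

    # Causal mask: position t may only attend to positions <= t.
    future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))

    # Attention decay: down-weight later keys so early tokens (prompt / visual
    # context) keep influence and recent errors are less likely to snowball.
    key_pos = torch.arange(seq_len, dtype=scores.dtype)
    scores = scores - decay_rate * key_pos             # broadcast over queries

    # Sink column: a fixed logit prepended to every row; softmax mass that
    # lands here is simply discarded instead of polluting the output.
    sink = torch.full((seq_len, 1), sink_logit, dtype=scores.dtype)
    probs = F.softmax(torch.cat([sink, scores], dim=1), dim=-1)

    return probs[:, 1:] @ v                            # sink contributes no value

# Example: 8 generated tokens, 16-dim head.
q = k = v = torch.randn(8, 16)
out = causal_attention_with_sink_and_decay(q, k, v)    # (8, 16)
```

In this sketch, the sink column mirrors the attention-collapse observation: it gives the softmax somewhere harmless to park mass it would otherwise spend on low-information tokens, while the per-key penalty tilts the remaining mass toward early, better-grounded positions.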

February 21, 2026 · 3 min · 521 words