Information fusion is used widely to improve document classification by the integration of multiple data sources (multimodal) or representations (multiview). However, the field lacks a unified framework, a quantitative synthesis of its effectiveness, and clear guidance for practitioners. This systematic review addresses these gaps by analysing 139 primary studies. It introduces a formal framework to structure the field, presents the results of a qualitative analysis to identify key trends, and performs a random-effects meta-analysis (to our knowledge, the first focused on document classification) to quantify performance gains. Our meta-analysis reveals that multimodal fusion improves accuracy (mean gain of +5.28 percentage points, p = .0016) significantly—the F1-score effect is directionally positive but statistically non-significant in our primary model. Multiview fusion provides consistent but modest gains for accuracy (+4.67%), F1-score (+3.08%), and recall (all p < .05). Critically, our qualitative synthesis uncovers challenges in reproducibility in methodological rigour: only 11.8% (multimodal) and 23.3% (multiview) of the studies use statistical tests to validate their findings, which undermines the reliability of many of their results. This review’s primary contributions are a unifying framework, the first quantitative evidence base, and data-driven guidelines. This review concludes that successful information fusion depends not on algorithmic complexity, but on the strategic alignment of the fusion method with the task context and a commitment to more rigorous validation.
Czytelnik może znaleźć więcej informacji w wersji angielskiej wpisu lub bezpośrednio w artykule.