Language Models Meet Anomaly Detection for Better Interpretability and Generalizability

Bibliographic Details
Title: Language Models Meet Anomaly Detection for Better Interpretability and Generalizability
Authors: Li, Jun; Kim, Su Hwan; Müller, Philip; Felsner, Lina; Rueckert, Daniel; Wiestler, Benedikt; Schnabel, Julia A.; Bercea, Cosmin I.
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Description: This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for multi-image visual question-answering on brain magnetic resonance images encompassing multiple conditions. We propose KQ-Former (Knowledge Querying Transformer), which is designed to optimally align visual and textual information in limited-sample contexts. Our model achieves 60.81% accuracy on closed questions, covering disease classification and severity across 15 different classes. For open questions, KQ-Former demonstrates a 70% improvement over the baseline with a BLEU-4 score of 0.41, and achieves the highest entailment ratios (up to 71.9%) and lowest contradiction ratios (down to 10.0%) among various natural language inference models. Furthermore, integrating anomaly maps results in an 18% accuracy increase in detecting open-set anomalies, thereby enhancing the language model's generalizability to previously unseen medical conditions. The code and dataset are available at https://github.com/compai-lab/miccai-2024-junli?tab=readme-ov-file
Comment: 13 pages, 7 figures. 5th International Workshop on Multiscale Multimodal Medical Imaging (MMMI 2024)
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2404.07622
Accession Number: edsarx.2404.07622
Database: arXiv