R&D result detail

Original title

Depression detection using deep learning and large language models from multimodalities

Type

Web of Science article

Original abstract

Depression is a complex psychiatric disorder that affects neural functioning, cognition, emotion, and behavior, making objective assessment a persistent clinical challenge. Traditional diagnostic methods depend on subjective interpretation, whereas recent advances in deep learning have enabled automated, data-driven detection across physiological and behavioral modalities. Among unimodal approaches, electroencephalography (EEG) remains the most widely used due to its sensitivity to depression-related neurophysiological alterations. However, EEG models often rely on small, homogeneous datasets and controlled laboratory conditions, limiting their generalizability. Multimodal architectures that integrate speech, facial expression, and EEG features provide richer representations and consistently outperform single-modality systems. Transformer-based fusion mechanisms and attention-guided models effectively capture complementary cross-modal cues, achieving 90%–95% accuracy on controlled laboratory datasets such as SEED-IV, while yielding more conservative F1-scores of approximately 0.80–0.90 on ecologically valid community datasets such as DAIC-WOZ. The emergence of Large Language Models (LLMs) represents a further methodological shift, offering cross-modal alignment, contextual inference, and data-efficient adaptation through unified embedding spaces and few-shot capabilities. This mini-review synthesizes recent advances in EEG-based, multimodal, and LLM-driven depression detection. It evaluates how modality diversity and architectural sophistication enhance performance while critically examining persisting limitations in dataset diversity, standardization, interpretability, and clinical validation. The convergence of multimodal deep learning with LLM reasoning signals a promising direction toward scalable, explainable, and clinically deployable AI systems for the objective assessment of depression.

Keywords

Multimodal Depression Detection, Deep Learning Architectures, EEG-based Classification, Large Language Models, Affective Computing

Authors

HUSSAIN, Y.; ZAHEER, M.; KHAN, A.; MALIK, A.

Published

9 March 2026

Publisher

Frontiers Media SA

Journal

Frontiers in Digital Health

Volume

8

Issue

8

Country

Swiss Confederation

Number of pages

10

URL

https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2026.1759857/full

BibTeX

@article{BUT201699,
  author="Yasir {Hussain} and Muhammad Asad {Zaheer} and A. {Khan} and Aamir Saeed {Malik}",
  title="Depression detection using deep learning and large language models from multimodalities",
  journal="Frontiers in Digital Health",
  year="2026",
  volume="8",
  number="8",
  pages="10",
  doi="10.3389/fdgth.2026.1759857",
  url="https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2026.1759857/full"
}
