Common depression assessment tools differ widely in the symptoms they ask about. How do they differ and what does this mean for depression research?
According to the latest WHO estimates, about 300 million people worldwide live with depression. Yet although this makes depression one of the leading causes of global ill-health and disability, both its diagnosis and its treatment still pose huge challenges.
Clinical diagnosis of depression is traditionally carried out according to the standardized classification systems DSM-5 and ICD-11. Beyond these, however, there are myriad screening questionnaires that are commonly used to determine the possible presence of depression and to assess the severity of depressive symptoms. They are used both in the clinic and by researchers who are searching for biomarkers or exploring alternatives to the often-criticized categorical approach to diagnosis offered by the DSM.
See related post The Challenges of Mental Health Diagnosis
280 and counting…
With estimates in 2006 suggesting that 280 depression questionnaires had been developed over the previous eight decades (Santor et al., 2006), there is no shortage of assessment tools to choose from, although some have stood the test of time better than others. But given that depression is essentially a heterogeneous classification of symptoms, do the major depression questionnaires commonly used in research today ask about roughly the same symptoms?
Knowing this is important when deciding which depression questionnaire(s) to use in a research study, because the choice can significantly affect results that relate symptom severity scores to other behavioral or neuroimaging measures you might be collecting (Newson & Thiagarajan, 2019). It also helps to explain (in)consistencies in results across studies and to interpret epidemiological findings (past, present and future).
15 most commonly used depression questionnaires
Here we’ve compared 15 questionnaires commonly used to screen for depression in adults or children, selected based on the number of citations in the literature. The list is below.
|Adult questionnaires|Child and adolescent questionnaires|
|---|---|
|Beck Depression Inventory-II (BDI-II)|Mood and Feelings Questionnaire (MFQ)|
|Hamilton Depression Rating Scale (HAM-D-21)|Children’s Depression Inventory (CDI2)|
|Patient Health Questionnaire-9 (PHQ-9)|Reynolds Adolescent Depression Scale (RADS2)|
|Center for Epidemiologic Studies Depression Scale Revised (CES-D-R)|Center for Epidemiological Studies Depression Scale for Children (CES-DC)|
|Depression Anxiety Stress Scales (DASS-42)||
|Quick Inventory of Depressive Symptomatology (QIDS SR-16)||
|Inventory of Depressive Symptomatology (IDS-SR)||
|Montgomery-Åsberg Depression Rating Scale (MADRS)||
|Zung Self-Rating Depression Scale (ZDS)||
|Geriatric Depression Scale (GDS-LF)||
|Edinburgh Postnatal Depression Scale (EPDS)||
How inconsistent are these depression questionnaires?
The questionnaires were compared by looking at the percentage of questions asking about each symptom category. To do this, each question within each questionnaire was coded according to the specific symptom(s) it referred to (e.g. sleep difficulties, low mood). This coding was based on a semantic assessment of the question (e.g. ‘sleep difficulties’ and ‘night wakings’ both refer to sleep problems). Similar symptoms were then collapsed into 43 major categories (e.g. low mood, feeling depressed and negative outlook were put into a category called “Mood & Outlook”). There are other ways to categorize symptoms of depression (see Fried, 2017), but our categorization took into consideration symptoms across 8 common disorders.
As might be expected from our general sense of depression, all questionnaires asked at least some questions relating to mood & outlook and to confidence & self judgement. However, even here some questionnaires leaned far more heavily on one category than the other. For example, the QIDS SR-16 and MFQ dedicated less than 7% of their questions to ‘Mood & Outlook’, while the MADRS, GDS-LF and CES-DC dedicated 30% or more to this category. The QIDS SR-16 instead focuses more on sleep and appetite, while the MFQ focuses more on self-image. The other categories were distributed even more unevenly across the questionnaires, and in many cases were represented in some questionnaires but not in others. What does this tell us about depression and what it is?
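The percentage calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the questionnaire names and the tags attached to each question are hypothetical placeholders, not the actual coding used for the figures, and a question tagged with more than one category is assumed to count once towards each of them (so the percentages for a questionnaire can add up to more than 100).

```python
from collections import Counter

# Hypothetical coding: one list of category tags per question.
# Names and tags are placeholders, not the real coding behind the figures.
coded_questions = {
    "questionnaire_a": [
        ["Mood & Outlook"],
        ["Sleep"],
        ["Sleep"],
        ["Appetite"],
    ],
    "questionnaire_b": [
        ["Mood & Outlook"],
        ["Mood & Outlook"],
        ["Confidence & Self Judgement"],
        ["Sleep"],
        ["Mood & Outlook", "Confidence & Self Judgement"],
    ],
}


def category_percentages(questions):
    """Return the percentage of questions that touch each category."""
    counts = Counter()
    for tags in questions:
        # set() ensures a question counts at most once per category.
        for tag in set(tags):
            counts[tag] += 1
    total = len(questions)
    return {category: 100.0 * n / total for category, n in counts.items()}


for name, questions in coded_questions.items():
    print(name, category_percentages(questions))
```

With real data, each list would hold the categories assigned to one question during the semantic coding step.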
Figure 1: The percentage breakdown of symptoms being assessed across different questionnaires.
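One way to put a number on how much two questionnaires overlap is to compare the sets of symptom categories each one covers. The sketch below uses the Jaccard index (shared categories divided by all categories covered by either questionnaire). Choosing this measure is an assumption made for illustration and is not the method behind the figure, and the questionnaire names and category sets are hypothetical.

```python
from itertools import combinations

# Hypothetical sets of symptom categories covered by each questionnaire.
covered = {
    "questionnaire_a": {"Mood & Outlook", "Sleep", "Appetite"},
    "questionnaire_b": {"Mood & Outlook", "Confidence & Self Judgement", "Sleep"},
    "questionnaire_c": {"Mood & Outlook", "Confidence & Self Judgement"},
}


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Pairwise overlap for every pair of questionnaires (names in sorted order).
overlaps = {
    (x, y): jaccard(covered[x], covered[y])
    for x, y in combinations(sorted(covered), 2)
}

for pair, score in overlaps.items():
    print(pair, round(score, 2))
```

A score of 1 would mean two questionnaires cover exactly the same categories, and a score of 0 would mean they share none. Note that this ignores how many questions each questionnaire devotes to a category.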
Feelings, thoughts or actions?
We can also look more generally at whether a question asks about the way someone feels (emotion), the way someone thinks (cognitive), the way someone acts (behavior), a physical symptom (physical), or a particular trigger (trigger) or consequence (consequence) of a symptom. This gives a further impression, shown in Figure 2, of how depression as an overall disorder is being assessed.
Figure 2: The percentage breakdown of symptom types being assessed across different questionnaires.
Here we see that questions about depression predominantly focus on the patient’s feelings and physical symptoms, with much less emphasis on cognitive functioning or the patient’s behavior. However, we again find that the bias (e.g. emotion vs cognitive, behavior vs physical) varies from one questionnaire to another. For example, the HAM-D-21, QIDS SR-16 and IDS-SR stand out for their much stronger focus on physical symptoms compared with the other questionnaires. This raises a question: although we typically think of depression as an internalizing disorder, is it right that these depression questionnaires hardly ask about the actions or behavior of the patient (although see Kanter et al., 2006)?
A heterogeneous assessment of a heterogeneous disorder.
Where does that leave the understanding of depression? In a bit of a mess, really. Given the heterogeneity of depression itself, it is perhaps not that surprising that different questionnaires ask about different symptoms. However, since the underlying physiology of depression is still unknown, a standardized symptom set is essential for research that seeks links to depression and its many manifestations.
Take the example of trying to correlate the severity of depressive symptoms with an EEG effect that you’ve found in your data. Your particular choice of depression questionnaire could make a big difference to the results you find, especially when you consider that the patients with depression in your study will also vary in the particular array of symptoms they present. With so much variability, it is perhaps not surprising that researchers have struggled for decades to make sizeable progress in treating depression.
See related posts EEG Frequency Bands across Mental Health Disorders and Frontal Asymmetry in Depression
Despite these concerns, it is still relatively rare for researchers to justify the use of one assessment questionnaire over another in their publications, or to give more weight to the specific symptoms of the patient than to the disorder label (although that is changing to some degree with initiatives like RDoC). Careful consideration of which questionnaire you choose is nevertheless merited, especially as the research community moves away from seeing depression as a one-size-fits-all label for a loosely connected cluster of symptoms.
Santor, D., Gregus, M., & Welch, A. (2006). FOCUS ARTICLE: Eight Decades of Measurement in Depression. Measurement: Interdisciplinary Research & Perspective, 4(3), 135-155. doi: 10.1207/s15366359mea0403_1
Newson, J. J., & Thiagarajan, T. C. (2019). EEG Frequency Bands in Psychiatric Disorders: A Review of Resting State Studies. Frontiers in Human Neuroscience, 12, 521. doi: 10.3389/fnhum.2018.00521
Fried, E. (2017). The 52 symptoms of major depression: Lack of content overlap among seven common depression scales. Journal of Affective Disorders, 208, 191-197. doi: 10.1016/j.jad.2016.10.019
Kanter, J., Mulick, P., Busch, A., Berlin, K., & Martell, C. (2006). The Behavioral Activation for Depression Scale (BADS): Psychometric Properties and Factor Structure. Journal of Psychopathology and Behavioral Assessment, 29(3), 191-202. doi: 10.1007/s10862-006-9038-5