Mental Health Prediction

Predicting Mental Wellbeing from Life Circumstance

How much of our mental wellbeing is circumstantial – the impact of our lifestyle, adversities and traumas rather than something intrinsic to our brain function?

Here we provide a preliminary view of using lifestyle factors and adversities to predict whether an individual falls in the clinical or at-risk category for a mental health disorder or in the positive range. The analysis uses data from 50,000 people in the Mental Health Million Project. Their answers to questions about life adversities and traumas, occupation, sleep, social interaction and exercise were used to predict whether each person was in the clinical/at-risk category or not.

The clinical/at risk category in the MHQ

Mental health disorders are essentially groupings of symptoms, where symptoms can include emotions, behaviors, cognitive challenges or even physical symptoms like fatigue. The MHQ collects responses to 47 elements that map to the symptoms of 10 major mental health disorders and also cover positive assets of mental wellbeing and elements from the NIMH Research Domain Criteria. Each element is rated on a 9-point life impact scale. For a problem, for example, a rating of 8 or 9 for the severity of its negative impact on life is considered a clinical ‘symptom’.

Based on these ratings, the MHQ positions individuals on a spectrum from ‘clinical’ to ‘thriving’, with possible scores ranging from −100 to +200, where negative scores indicate clinical risk. Importantly, the MHQ score is not a simple average of question ratings. Instead, each individual rating is nonlinearly transformed so that someone whose symptoms map to one or more clinical disorders is scored in the negative range of clinical risk. The thresholds between negative and positive are optimized and calibrated such that <1% of those in the positive range have severe problems that map to any clinical disorder and >99% of those in the clinical category map to at least one disorder. Thus someone with three very significant issues that have a significant negative impact on their life and meet diagnostic criteria based on symptom severity would be classified as clinical, even though they may have a high average rating on other dimensions. Conversely, someone with a low average rating overall but no individual item meeting the severity threshold for clinical diagnosis would not be classified as clinical but would fall in the normal, positive range.
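The difference between averaging and threshold-based classification can be illustrated with a toy sketch. This is not the MHQ scoring algorithm, whose nonlinear transform is not described here. It assumes, for simplicity, that all 47 elements are problem-severity ratings from 1 to 9, uses the cutoff of 8 mentioned above, and borrows the "three very significant issues" example as a hypothetical minimum symptom count. All names and values are illustrative.

```python
# Toy illustration only: NOT the actual MHQ scoring algorithm.
SYMPTOM_CUTOFF = 8   # ratings of 8 or 9 on the 9-point scale count as a clinical 'symptom'
MIN_SYMPTOMS = 3     # hypothetical number of severe items needed to flag clinical risk


def mean_severity(ratings):
    """Simple average of severity ratings (higher means worse)."""
    return sum(ratings) / len(ratings)


def flagged_by_threshold(ratings, cutoff=SYMPTOM_CUTOFF, min_symptoms=MIN_SYMPTOMS):
    """Flag a person when enough individual items reach the severity cutoff."""
    severe_items = sum(1 for r in ratings if r >= cutoff)
    return severe_items >= min_symptoms


# Person A: three severe problems, everything else minimal (47 items in total).
person_a = [9, 9, 8] + [1] * 44
# Person B: moderate problems on every item, none of them severe.
person_b = [5] * 47

for name, ratings in [("A", person_a), ("B", person_b)]:
    print(name, round(mean_severity(ratings), 2), flagged_by_threshold(ratings))
```

In this sketch person A looks better than person B on average severity yet is the only one flagged, which is the behavior a simple average would miss.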

The prediction is thus meant to classify whether someone is likely to be in the negative range, i.e. to map to at least one disorder or to have multiple clinical symptoms.


How well did the prediction work?

Most prediction methods, from logistic regression to random forests to gradient boosting classifiers, worked reasonably well and performed similarly, so we’ll focus on logistic regression for the sake of simplicity.

Overall accuracy was ~80% and precision was ~60 to 65%. This means that 80% of people were correctly classified as being in either the positive (normal) or negative (clinical) range, and that of those classified as negative or clinical, roughly 60 to 65% actually were negative or clinical.
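To make the two metrics concrete, here is a minimal sketch that computes them from a confusion matrix. The counts are hypothetical, chosen only to land in the same range as the figures quoted above; they are not taken from the Mental Health Million Project data. "Positive" in the code means the class being detected, i.e. the negative/clinical MHQ range.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all people who were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of people predicted clinical who actually were clinical."""
    return tp / (tp + fp)


# Hypothetical counts for 1,000 people (illustration only, not study data).
tp = 150  # predicted clinical, actually clinical
fp = 100  # predicted clinical, actually in the positive range
fn = 100  # predicted positive range, actually clinical
tn = 650  # predicted positive range, actually in the positive range

print(f"accuracy:  {accuracy(tp, fp, fn, tn):.2f}")
print(f"precision: {precision(tp, fp):.2f}")
```

Accuracy counts correct calls in both classes, so it can stay high even when a sizeable share of the people flagged as clinical are false alarms, which is what the lower precision reflects.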

What were the key variables that contributed to prediction?

Based on information gain, the top 10 variables contributing to the logistic regression classification were: an occupation of retired, an occupation of student, rarely getting adequate sleep, rarely having social interaction, experiencing a life-threatening illness or injury, losing a loved one suddenly or prematurely, being witness to war, experiencing divorce or a family break-up, having cancer and, on the other hand, having no life trauma or adversity. These variables contributed to both exclusion from and inclusion in the negative classification. Of these, being retired and never having experienced any adversity or trauma increased the probability of a positive MHQ score, while most of the others increased the probability of a negative or clinical classification.

We note that other models may order these differently, but most of these ten factors consistently appear in the top ten.
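For readers unfamiliar with information gain, the sketch below shows one common way to compute it for a yes/no variable: the entropy of the outcome minus the entropy that remains once the variable is known. It is an illustration on made-up toy data and may differ from the exact calculation used in this analysis. The variable names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    remaining = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remaining += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remaining


# Toy data (illustration only): 1 = clinical/at-risk, 0 = positive range.
clinical = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "perfect_marker": [1, 1, 1, 1, 0, 0, 0, 0],  # splits the classes exactly
    "rarely_sleeps":  [1, 1, 1, 0, 1, 0, 0, 0],  # partly informative
    "coin_flip":      [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the outcome
}

# Rank variables from most to least informative.
ranked = sorted(features, key=lambda name: information_gain(features[name], clinical), reverse=True)
for name in ranked:
    print(name, round(information_gain(features[name], clinical), 3))
```

A ranking of this kind says how much a variable helps separate the two classes, not in which direction it pushes, which is why the direction of each variable's effect is described separately above.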

What does this mean?

Essentially, this tells us that major adversities, traumas and life circumstances substantially, though not completely, determine our mental state and whether we will be ‘diagnosed’ with a clinical mental health disorder. By delving into more specific profiles of clinical risk, such predictions could help us separate those whose mental conditions are circumstantial from those whose challenges arise from internal factors. Of course, the two are not completely independent, but delving deeper will help us determine which symptoms and ‘disorder’ diagnoses are more likely to arise from circumstantial challenges and which are not. This can be useful for determining the appropriate course of treatment.


Further, it will be interesting to understand the physiological differences between those who have experienced significant adversity and trauma but have high positive MHQ scores and those who fall into the negative clinical range.
