PHQ-9 Score
Patient Health Questionnaire-9 — a validated nine-item screening and monitoring tool for major depressive disorder. Scores range from 0 to 27, mapping to five severity categories from minimal to severe depression, with established treatment action thresholds.
Calculate PHQ-9 Score
For each of the nine items, select the response that best describes how often the patient has been bothered by the problem over the past two weeks. Item 9 screens for suicidal ideation and requires immediate clinical assessment if positive.
⚠ Item 9 Positive: Suicidal Ideation Disclosed.
If the patient has endorsed thoughts of self-harm or of being better off dead, immediate clinical assessment is required regardless of the total PHQ-9 score. Conduct a structured suicide risk assessment that evaluates intent, plan, means, and protective factors. Ensure the patient’s safety before concluding the consultation, and document the assessment and management plan.
Item 9 screens for suicidal ideation. Any positive response to item 9 (≥1) requires immediate further assessment regardless of the total score. The PHQ-9 is not a suicide risk assessment tool — a positive item 9 is a prompt to conduct a structured safety evaluation, not a substitute for one.
Understanding the PHQ-9
The Patient Health Questionnaire-9 (PHQ-9) was developed by Drs Robert Spitzer, Janet Williams, and Kurt Kroenke and first published in 2001 as part of the broader PHQ suite of screening instruments. It was designed specifically for primary care, where depression is common but frequently underdiagnosed. The nine items correspond directly to the nine symptom criteria for major depressive disorder in the DSM-IV (and subsequently the DSM-5).
Unlike many depression scales that were originally developed in psychiatric populations, the PHQ-9 was designed and validated in primary care from the outset. It serves a dual purpose: it can be used as a screening tool to identify patients who may have depression, and as a severity measure to monitor symptom change over time — making it particularly valuable for tracking treatment response.
Scoring Method
Each of the 9 items is scored 0–3:
0 = Not at all
1 = Several days
2 = More than half the days
3 = Nearly every day
Total score range: 0–27
The recall period is the past two weeks, aligning with the DSM-5 criterion of a two-week symptom duration for a major depressive episode.
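The scoring rule above is a plain sum, which can be sketched in a few lines of Python. This is an illustrative sketch, not a validated clinical implementation: the names `PHQ9Result` and `score_phq9` are hypothetical, and the input is assumed to be the nine responses in questionnaire order, already coded 0 to 3.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PHQ9Result:
    total: int            # sum of the nine item scores, 0-27
    item9_positive: bool  # True when item 9 (suicidal ideation) is scored >= 1


def score_phq9(items: Sequence[int]) -> PHQ9Result:
    """Sum nine item responses (each 0-3) and flag a positive item 9."""
    if len(items) != 9:
        raise ValueError(f"expected 9 item responses, got {len(items)}")
    for position, value in enumerate(items, start=1):
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 3:
            raise ValueError(f"item {position} must be an integer from 0 to 3")
    return PHQ9Result(total=sum(items), item9_positive=items[8] >= 1)


result = score_phq9([2, 2, 1, 1, 0, 1, 1, 0, 1])
print(result.total, result.item9_positive)
```

The item 9 flag is returned alongside the total so that a caller cannot read the score without also seeing whether item 9 was endorsed.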
Diagnostic Performance
At a cutoff of ≥10 for major depression:
Sensitivity: 88%
Specificity: 88%
The PHQ-9 has been validated in over 40 languages and across diverse clinical populations including primary care, inpatient, obstetric, oncology, and chronic disease settings. A score change of ≥5 points is considered a clinically meaningful difference in treatment monitoring.
The PHQ-9 measures symptom severity, not diagnosis. While it maps to DSM-5 criteria for major depressive disorder, it is a dimensional measure of depressive symptoms, not a categorical diagnostic instrument. A score of 15 does not mean “the patient has major depression” — it means the patient is reporting a moderately severe burden of depressive symptoms that warrants clinical assessment and likely treatment.
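One way to see why a positive screen is not a diagnosis is to convert the sensitivity and specificity quoted above into a positive predictive value. The sketch below applies Bayes' rule; the 10% prevalence is an assumed, purely illustrative figure that does not come from the source, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a patient who screens positive truly has the condition."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    if true_positives + false_positives == 0.0:
        raise ValueError("no positive screens are possible with these inputs")
    return true_positives / (true_positives + false_positives)


# Sensitivity and specificity of 88% at a cutoff of >= 10 (from the text);
# prevalence of 10% is an assumption chosen only for illustration.
ppv = positive_predictive_value(0.88, 0.88, 0.10)
print(f"PPV at 10% prevalence: {ppv:.1%}")
```

Under that assumed prevalence, fewer than half of positive screens would correspond to major depression, which is why a score at or above the cutoff calls for clinical assessment rather than serving as a diagnosis.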
Score Interpretation & Treatment Thresholds
The PHQ-9 severity categories guide treatment decisions based on the original validation study by Kroenke, Spitzer, and Williams (2001). Each threshold was associated with distinct levels of functional impairment, disability days, and healthcare utilisation.
| Score | Severity | Proposed Treatment Action |
|---|---|---|
| 0–4 | Minimal or none | No treatment required for depression. Re-screen if clinical suspicion persists or at routine intervals. A score of 0–4 does not exclude depression if the patient is already on treatment (may indicate response/remission). |
| 5–9 | Mild | Watchful waiting with repeat PHQ-9 at follow-up. Consider brief supportive interventions, psychoeducation, lifestyle recommendations (exercise, sleep hygiene). Initiate treatment if symptoms are persistent (≥2 months), functionally impairing, or if the patient has a history of more severe episodes. |
| 10–14 | Moderate | Treatment is recommended. Options include antidepressant pharmacotherapy, evidence-based psychotherapy (CBT, IPT, behavioural activation), or a combination. Patient preference should guide choice. Set a follow-up plan with repeat PHQ-9 to monitor response. |
| 15–19 | Moderately severe | Active treatment with antidepressants, psychotherapy, or both is strongly recommended. Consider psychiatric referral if there is complexity, comorbidity, or an inadequate initial treatment response. Monitor closely with PHQ-9 at 4–6 week intervals. |
| 20–27 | Severe | Immediate initiation of antidepressant treatment is recommended, often combined with psychotherapy. Psychiatric referral is strongly advised. Assess for suicidal ideation (item 9), psychotic features, and safety. Functional capacity and social support should be evaluated. Close follow-up is essential. |
A PHQ-9 score of ≥10 is the most widely used threshold for identifying clinically significant depression warranting treatment. However, treatment decisions should integrate the score with functional impairment (the 10th question), patient preference, episode duration, past psychiatric history, and comorbid conditions. A patient with a score of 8 and severe functional impairment may need treatment just as much as a patient with a score of 12 and preserved function.
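The severity bands in the table can be expressed as a simple lookup. This is a minimal sketch with a hypothetical function name; it returns only the severity label, because the treatment decision depends on the additional clinical factors described above.

```python
# Upper bound of each band (inclusive) paired with its severity label,
# taken from the score interpretation table.
SEVERITY_BANDS = (
    (4, "minimal or none"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
)


def phq9_severity(total: int) -> str:
    """Map a PHQ-9 total score (0-27) to its severity category."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    for upper_bound, label in SEVERITY_BANDS:
        if total <= upper_bound:
            return label
    raise AssertionError("unreachable: every valid total falls in a band")


for score in (3, 8, 12, 15, 22):
    print(score, phq9_severity(score))
```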
Using the PHQ-9 for Monitoring
The PHQ-9 is one of the few depression screening tools explicitly validated for monitoring treatment response over time. Key monitoring benchmarks include: a score reduction of ≥50% from baseline suggests treatment response; a final score of <5 suggests remission; and a change of ≥5 points is considered the minimum clinically important difference. Failure to achieve a ≥5-point reduction after 6–8 weeks of adequate treatment should prompt reassessment of the diagnosis, treatment adherence, and treatment plan.
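The three benchmarks can be checked together when a baseline and a follow-up score are available. The sketch below is illustrative only; `MonitoringStatus` and `assess_change` are hypothetical names, and the ≥5-point change is evaluated in either direction so that worsening is also visible through the sign of `change`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringStatus:
    change: int               # baseline minus current; positive means improvement
    percent_reduction: float  # reduction as a percentage of the baseline score
    response: bool            # reduction of at least 50% from baseline
    remission: bool           # current score below 5
    meaningful_change: bool   # change of at least 5 points in either direction


def assess_change(baseline: int, current: int) -> MonitoringStatus:
    """Compare a follow-up PHQ-9 total with the baseline total."""
    for name, value in (("baseline", baseline), ("current", current)):
        if not 0 <= value <= 27:
            raise ValueError(f"{name} must be between 0 and 27")
    change = baseline - current
    percent = 100.0 * change / baseline if baseline > 0 else 0.0
    return MonitoringStatus(
        change=change,
        percent_reduction=percent,
        # 2 * current <= baseline is the >= 50% rule in integer arithmetic
        response=baseline > 0 and 2 * current <= baseline,
        remission=current < 5,
        meaningful_change=abs(change) >= 5,
    )


print(assess_change(baseline=18, current=9))
```

With a baseline of 18 and a follow-up score of 9, the reduction is exactly 50%, so the response criterion is met, but the score is not below 5, so remission is not.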
Clinical Use & Comparison With Other Tools
The PHQ-9 occupies a unique position among depression instruments — it is brief enough for routine clinical use, yet comprehensive enough for severity grading and treatment monitoring. The following sections detail its clinical applications, diagnostic algorithm, and how it compares with alternative screening instruments.
In addition to the severity score, the PHQ-9 can be used as a provisional diagnostic tool using a categorical algorithm. A “provisional diagnosis” of major depressive disorder can be considered if:
- At least 5 of the 9 items are positive, where items 1–8 count when scored ≥2 (“more than half the days”) and item 9 counts when scored ≥1 (“several days” or more), AND
- At least one of the positive items is item 1 (anhedonia) or item 2 (depressed mood)
This mirrors the DSM-5 diagnostic criteria requiring five or more symptoms (including depressed mood or anhedonia) present for at least two weeks. However, the diagnostic algorithm has lower sensitivity than the ≥10 cutoff score and is not recommended as the sole method of screening. It is most useful when a clinician wants to check whether the PHQ-9 symptom pattern is consistent with a formal DSM-5 diagnosis.
Note: Item 9 (suicidal ideation) counts towards the criterion count if it is present at all. Under the original scoring rule (Kroenke et al., 2001), a score of ≥1 on item 9 counts as a positive criterion, whereas the other eight items require a score of ≥2. Item 9 should always be assessed clinically regardless of its contribution to the algorithm.
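The categorical algorithm can be sketched as a small function. This is an illustrative sketch with a hypothetical function name, not a diagnostic tool. The item 9 threshold is exposed as a parameter; its default of 1 is an assumption that follows the scoring rule in Kroenke et al. (2001), under which item 9 counts if present at all.

```python
from typing import Sequence


def provisional_mdd(items: Sequence[int], item9_threshold: int = 1) -> bool:
    """Apply the PHQ-9 categorical algorithm to nine item scores (each 0-3)."""
    if len(items) != 9:
        raise ValueError(f"expected 9 item responses, got {len(items)}")
    if any(not 0 <= score <= 3 for score in items):
        raise ValueError("each item must be scored from 0 to 3")
    # Items 1-8 count as positive at "more than half the days" (score >= 2).
    positive = [score >= 2 for score in items[:8]]
    # Item 9 uses its own threshold (assumed default: any endorsement).
    positive.append(items[8] >= item9_threshold)
    # At least one positive item must be item 1 (anhedonia) or item 2 (depressed mood).
    has_core_symptom = positive[0] or positive[1]
    return sum(positive) >= 5 and has_core_symptom


print(provisional_mdd([2, 2, 2, 2, 0, 0, 0, 0, 1]))
print(provisional_mdd([0, 0, 2, 2, 2, 2, 2, 0, 0]))
```

The second call has five positive items but neither anhedonia nor depressed mood, so it does not satisfy the algorithm. A negative result says nothing about item 9, which needs clinical assessment whenever it is endorsed.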
The PHQ-2 consists of just the first two items of the PHQ-9 — anhedonia and depressed mood — scored identically (0–3 each, total 0–6). At a cutoff of ≥3, it has a sensitivity of approximately 83% and specificity of 92% for major depression, making it a useful rapid pre-screen.
A common clinical workflow is to administer the PHQ-2 first and then proceed to the full PHQ-9 only if the PHQ-2 score is ≥3. This two-stage approach reduces screening burden while maintaining acceptable diagnostic accuracy. The PHQ-2 is recommended by several guidelines (including the USPSTF) as an initial screening step in primary care.
However, the PHQ-2 alone cannot grade severity, monitor treatment response, or identify the specific symptom domains affected — tasks for which the full PHQ-9 is needed.
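The two-stage workflow reduces to one comparison on the sum of the first two items. A minimal sketch, with hypothetical function names and the ≥3 cutoff from the text as the default:

```python
def phq2_score(item1: int, item2: int) -> int:
    """Sum the first two PHQ-9 items (anhedonia and depressed mood), each 0-3."""
    for name, value in (("item1", item1), ("item2", item2)):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must be between 0 and 3")
    return item1 + item2


def needs_full_phq9(item1: int, item2: int, cutoff: int = 3) -> bool:
    """Return True when the PHQ-2 score reaches the cutoff for the full PHQ-9."""
    return phq2_score(item1, item2) >= cutoff


print(needs_full_phq9(1, 1))  # PHQ-2 score of 2
print(needs_full_phq9(2, 1))  # PHQ-2 score of 3
```

Keeping the cutoff as a parameter makes the threshold explicit at the call site, since services may choose a different trade-off between sensitivity and specificity.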
Several validated tools exist for depression screening and severity assessment. The choice depends on the clinical setting, purpose, and available time:
- BDI-II (Beck Depression Inventory-II): 21 items, self-report. Well validated and widely used in research and clinical psychology. More comprehensive than the PHQ-9 but longer to administer. It is copyrighted and requires purchase, unlike the PHQ-9, which is freely available.
- HAM-D (Hamilton Depression Rating Scale): 17–21 items, clinician-rated. Considered a reference standard in clinical trials. Requires training to administer reliably. Not practical for routine primary care use but important in psychiatric and research settings.
- GAD-7 (Generalised Anxiety Disorder-7): Not a depression tool, but highly relevant because anxiety and depression frequently co-occur. The PHQ-9 and GAD-7 are often administered together for a combined assessment of the two most common mental health conditions in primary care.
The PHQ-9’s advantages are its brevity, free availability, dual use for screening and monitoring, direct correspondence to DSM criteria, and extensive cross-cultural validation. Its main limitation is that it does not assess bipolar disorder, psychotic features, or substance use-related mood symptoms.
The PHQ-9 is central to measurement-based care (MBC), a systematic approach to psychiatric treatment in which validated rating scales are administered at each visit to track symptom trajectory and inform treatment decisions. The STAR*D trial applied MBC at each treatment visit, and its results are widely cited in support of the approach, although the trial did not itself compare MBC with standard care.
Key monitoring targets:
- Response: ≥50% reduction from baseline score (e.g. baseline 18 → ≤9)
- Remission: score <5
- Clinically meaningful change: ≥5-point change from previous score
- Monitoring interval: every 2–4 weeks during acute treatment, extending to 4–8 weeks during maintenance
If the PHQ-9 score has not decreased by ≥5 points after 6–8 weeks of adequate-dose antidepressant therapy, clinicians should reassess the diagnosis, check adherence, and consider dose optimisation, augmentation, switching, or adding psychotherapy. Persistent high scores despite adequate treatment should prompt consideration of treatment-resistant depression and specialist referral.
The 10th question — “How difficult have these problems made it for you to do your work, take care of things at home, or get along with other people?” — is not included in the PHQ-9 score but provides essential clinical context. A patient may score 12 on the PHQ-9 (moderate) but report “not at all difficult” functional impairment, suggesting reasonable coping despite symptoms. Conversely, a patient scoring 8 (mild) who reports “extremely difficult” impairment may require more active intervention than the score alone would suggest.
Functional impairment is a core criterion for major depressive disorder in the DSM-5 — symptoms must cause “clinically significant distress or impairment in social, occupational, or other important areas of functioning.” The functional impairment question helps bridge the gap between symptom count and clinical significance.
The PHQ-9 is most powerful when used repeatedly over time, not as a one-off snapshot. A single score tells you where the patient is now; serial scores tell you whether treatment is working. Track the trajectory: a patient whose score drops from 22 to 14 is improving, having moved from the “severe” to the “moderate” range, even though the score is still above the treatment threshold of 10.
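A series of scores can be summarised in a few lines. The sketch below is illustrative, with hypothetical names; it labels the overall trend using the ≥5-point change described earlier as clinically meaningful, which is an assumption about how a trend might reasonably be labelled and not an established rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Trajectory:
    visit_to_visit: List[int]  # change between consecutive visits; negative means improvement
    net_change: int            # last score minus first score
    trend: str                 # label based on the net change


def summarise_trajectory(scores: Sequence[int], threshold: int = 5) -> Trajectory:
    """Summarise serial PHQ-9 totals, listed from earliest to latest visit."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to describe a trajectory")
    if any(not 0 <= score <= 27 for score in scores):
        raise ValueError("each PHQ-9 total must be between 0 and 27")
    steps = [later - earlier for earlier, later in zip(scores, scores[1:])]
    net = scores[-1] - scores[0]
    if net <= -threshold:
        trend = "improving"
    elif net >= threshold:
        trend = "worsening"
    else:
        trend = "no clinically meaningful change"
    return Trajectory(visit_to_visit=steps, net_change=net, trend=trend)


print(summarise_trajectory([22, 19, 16, 14]))
```

In this example no single visit shows a 5-point drop, yet the net change of 8 points across the series is clinically meaningful, which is the kind of pattern a one-off score cannot reveal.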
Special Populations & Considerations
Cross-cultural considerations: The PHQ-9 has been translated into over 40 languages and validated across diverse ethnic and cultural populations. However, the expression and experience of depressive symptoms vary across cultures: some populations may emphasise somatic symptoms over emotional ones, or may not endorse “feeling guilty” or “self-harm” items due to cultural or religious contexts. Validated translations should be used where available, and cultural context should inform interpretation.
Common Pitfalls & Limitations
A PHQ-9 score of 15 does not mean the patient has major depressive disorder. It means the patient reports a moderately severe burden of depressive symptoms over the past two weeks. These symptoms could reflect major depression, an adjustment disorder, grief, bipolar depression, depression secondary to a medical condition, medication side effects, or substance use. A positive screen is a starting point for clinical assessment, not an endpoint.
Always follow a positive PHQ-9 screen with a clinical interview that evaluates symptom duration, functional impairment, psychiatric history, medical causes, substance use, and differential diagnosis. The PHQ-9 score informs the conversation — it does not replace it.
Item 9 asks about thoughts of being better off dead or of self-harm. Any positive response (≥1) requires immediate clinical follow-up — regardless of the total score. A patient with a PHQ-9 total of 6 (mild) who endorses item 9 at a frequency of “several days” has disclosed suicidal ideation and needs a safety assessment just as urgently as a patient scoring 25.
The PHQ-9 is not a suicide risk assessment tool. A positive item 9 should trigger a structured evaluation of suicidal ideation (frequency, intensity, duration), intent, plan, access to means, protective factors, and history of prior attempts. Document the assessment and management plan, and ensure the patient’s safety before the consultation ends.
The PHQ-9 measures depressive symptoms but does not screen for mania or hypomania. Patients with bipolar depression may score highly on the PHQ-9 and appear indistinguishable from unipolar depression on the questionnaire alone. Starting an antidepressant without a mood stabiliser in a patient with unrecognised bipolar disorder can precipitate a manic episode.
Before initiating antidepressant treatment based on a positive PHQ-9, always ask about: previous episodes of elevated mood, decreased need for sleep with increased energy, grandiosity, pressured speech, impulsive behaviour, family history of bipolar disorder, and previous adverse reactions to antidepressants (e.g. “feeling wired” or switching to mania). Consider using the Mood Disorder Questionnaire (MDQ) as a screening adjunct.
The functional impairment question is frequently overlooked because it does not contribute to the total score. However, it is one of the most clinically useful components of the PHQ-9. DSM-5 requires clinically significant distress or functional impairment for a diagnosis of major depressive disorder, so a patient with five depressive symptoms but neither significant distress nor functional impact does not meet diagnostic criteria.
Functional impairment also predicts treatment need and prognosis. A patient reporting “extremely difficult” impairment at a score of 10 likely needs more aggressive treatment than a patient at the same score reporting “not at all difficult.” Always review this question alongside the total score.
Four of the nine PHQ-9 items assess somatic symptoms — sleep disturbance (item 3), fatigue (item 4), appetite change (item 5), and psychomotor change (item 8). In patients with chronic medical conditions, these symptoms may be attributable to the medical illness rather than depression, potentially inflating the PHQ-9 score.
There is no universally accepted correction for somatic overlap. Some clinicians use the cognitive-affective subscale (items 1, 2, 6, 7, 9) as a more specific depression indicator in medically complex patients. Others interpret the total score in context, recognising that even if somatic symptoms are partly medical, the patient’s overall symptom burden still affects wellbeing and may benefit from treatment.
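The split between somatic and cognitive-affective items can be computed directly from the nine responses. This is a sketch with hypothetical names; the item groupings are the ones listed above, and no cutoff is applied to either subscale because the text describes none.

```python
from typing import Dict, Sequence

# Item numbers (1-based) as grouped in the text above.
COGNITIVE_AFFECTIVE_ITEMS = (1, 2, 6, 7, 9)
SOMATIC_ITEMS = (3, 4, 5, 8)


def subscale_scores(items: Sequence[int]) -> Dict[str, int]:
    """Split a set of nine PHQ-9 responses into the two subscale sums."""
    if len(items) != 9:
        raise ValueError(f"expected 9 item responses, got {len(items)}")
    if any(not 0 <= score <= 3 for score in items):
        raise ValueError("each item must be scored from 0 to 3")
    return {
        "cognitive_affective": sum(items[n - 1] for n in COGNITIVE_AFFECTIVE_ITEMS),
        "somatic": sum(items[n - 1] for n in SOMATIC_ITEMS),
        "total": sum(items),
    }


# A total of 13 driven mostly by sleep, fatigue, appetite, and psychomotor items.
print(subscale_scores([1, 1, 3, 3, 3, 0, 0, 2, 0]))
```

A profile like this one, where most of the total comes from the somatic items, is the situation in which a clinician might weigh the contribution of a co-existing medical illness.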
Quick Reference Summary
| Score Range | Severity | Action |
|---|---|---|
| 0–4 | Minimal | Routine re-screening |
| 5–9 | Mild | Watchful waiting, lifestyle, repeat PHQ-9 |
| 10–14 | Moderate | Treatment recommended (antidepressant or therapy) |
| 15–19 | Moderately severe | Active treatment; consider referral |
| 20–27 | Severe | Immediate treatment; psychiatric referral advised |
The Golden Rule: A single PHQ-9 score is a snapshot. Serial scores are a trajectory. Use the PHQ-9 at every visit to track whether treatment is moving the needle — and always assess item 9 regardless of the total score.
Disclaimer & References
For Educational Purposes Only. This calculator and the accompanying clinical information are intended as educational tools for healthcare professionals. They do not replace clinical judgement. Results should be interpreted in the full clinical context. Drug dosages should be confirmed against current prescribing information.
References
- Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–613. DOI: 10.1046/j.1525-1497.2001.016009606.x
- Spitzer RL, Kroenke K, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. JAMA. 1999;282(18):1737–1744. DOI: 10.1001/jama.282.18.1737
- Löwe B, Kroenke K, Herzog W, Gräfe K. Measuring depression outcome with a brief self-report instrument: sensitivity to change of the Patient Health Questionnaire (PHQ-9). J Affect Disord. 2004;81(1):61–66. DOI: 10.1016/S0165-0327(03)00198-8
- Manea L, Gilbody S, McMillan D. Optimal cut-off score for diagnosing depression with the Patient Health Questionnaire (PHQ-9): a meta-analysis. CMAJ. 2012;184(3):E191–E196. DOI: 10.1503/cmaj.110829
- Kroenke K, Spitzer RL. The PHQ-9: a new depression diagnostic and severity measure. Psychiatr Ann. 2002;32(9):509–515. DOI: 10.3928/0048-5713-20020901-06
- Levis B, Benedetti A, Thombs BD; DEPRESsion Screening Data (DEPRESSD) Collaboration. Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis. BMJ. 2019;365:l1476. DOI: 10.1136/bmj.l1476
- Siu AL; US Preventive Services Task Force. Screening for depression in adults: US Preventive Services Task Force recommendation statement. JAMA. 2016;315(4):380–387. DOI: 10.1001/jama.2015.18392
- Trivedi MH, Rush AJ, Wisniewski SR, et al. Evaluation of outcomes with citalopram for depression using measurement-based care in STAR*D: implications for clinical practice. Am J Psychiatry. 2006;163(1):28–40. DOI: 10.1176/appi.ajp.163.1.28
- Gilbody S, Richards D, Brealey S, Hewitt C. Screening for depression in medical settings with the Patient Health Questionnaire (PHQ): a diagnostic meta-analysis. J Gen Intern Med. 2007;22(11):1596–1602. DOI: 10.1007/s11606-007-0333-y
- Richardson LP, McCauley E, Grossman DC, et al. Evaluation of the Patient Health Questionnaire-9 Item for detecting major depression among adolescents. Pediatrics. 2010;126(6):1117–1123. DOI: 10.1542/peds.2010-0852