Heuristic Evaluation: Nielsen's 10 Heuristics with Examples

What heuristic evaluation is

Heuristic evaluation is an expert review method. A senior practitioner walks through a product and rates it against a set of established usability principles. The output is a list of findings, each citing the heuristic it violates, each scored for severity, each paired with a recommended fix.

Jakob Nielsen and Rolf Molich defined the method in the early 1990s, and in 1994 Nielsen refined the list into the ten heuristics used today. The Nielsen Norman Group still maintains the canonical list. Three decades on, it remains the single most useful framework in UX because it covers the structural failure modes of digital products with very little overlap and very few gaps.

Heuristic evaluation is the cheapest credible way to find out what's wrong with a product. The usability test comes after, to find the things the heuristics don't see.

When to use it

These are the right moments to run a heuristic evaluation, in rough order of return on time invested:

Before a usability test. Catch the obvious failures so the test isn't dominated by them.
Before a redesign or replatforming. Baseline the original to avoid regressing the working parts.
After a conversion drop. Heuristic review surfaces the qualitative failures analytics can't see.
As a quarterly product health check. Bake it into the team's cadence. Every quarter, a senior practitioner walks the core flows against the heuristics.
As part of a wider UX audit. Heuristic review is the methodological spine of an audit, and most audits include it as the first pass. See the full UX audit guide for context.

What follows is the canonical ten, in Nielsen's original order. Each entry covers the principle, the failure patterns we see most often, an example, and how to score it.

1. Visibility of system status

The system should always keep users informed about what is going on, through appropriate feedback within a reasonable time.

Failure patterns

Loading states absent or misleading.
Submitting forms with no acknowledgement.
Multi-step flows that don't show progress.
Async background actions (file uploads, payments) with no status surface.
"Saved" indicators that never appear or never disappear.

Example

A banking app takes seven seconds to confirm a payment and shows no indicator during that time. Users tap twice. Some pay twice. Diagnosed as a Visibility of system status failure, it was fixed with a single inline progress indicator.

2. Match between system and the real world

The system should speak the user's language, with words, phrases and concepts familiar to the user. Follow real-world conventions, and present information in a natural and logical order.

Failure patterns

Internal jargon leaking into the UI ("entity", "object", "instance").
Date formats that don't match the user's locale.
Icons that mean something in the design system but nothing to the user.
Information architectures organised by the company's org chart rather than the user's mental model.
Error codes shown without translation.

The most common version of this failure is shipping the engineering team's vocabulary unchanged. Audit fix: a UX writing pass on every user-facing label the system generates.

3. User control and freedom

Users often choose system functions by mistake. They need a clearly marked emergency exit to leave the unwanted state without going through an extended dialogue. Support undo and redo.

Failure patterns

Destructive actions with no confirmation step.
Multi-step flows with no obvious way to back out.
Modal dialogs that trap focus and offer only one path forward.
No undo on delete, cancel, or unsubscribe actions.
Saved drafts that can't be recovered after navigating away.

This heuristic overlaps with WCAG success criterion 3.3.4, Error Prevention (Legal, Financial, Data). Where they overlap, cite the WCAG criterion in the audit; it carries more weight than the heuristic alone.

4. Consistency and standards

Users should not have to wonder whether different words, situations or actions mean the same thing. Follow platform and industry conventions.

Failure patterns

Two different button styles for the same action across the product.
Save buttons in different positions in different flows.
Drift between the design system and the production code.
Custom controls that mimic but don't behave like native ones (dropdowns that don't open on space, sliders that don't move on arrow keys).
Different terminology for the same concept across pages.

5. Error prevention

Better than good error messages is a careful design that prevents a problem from occurring in the first place.

Failure patterns

Free-text inputs where a constrained input would do (postcodes, phone numbers, dates).
No inline validation; errors only surface on submit.
Identical or near-identical buttons placed close together ("Save" next to "Delete").
Time-bounded actions with no warning before timeout.
Auto-correcting input that silently changes user data without confirmation.

6. Recognition rather than recall

Minimise the user's memory load by making objects, actions and options visible. The user should not have to remember information from one part of the dialogue to another.

Failure patterns

Multi-step flows that surface key context only on the first step.
Form fields where the label disappears when the user starts typing.
Sub-menus that require the user to remember the path to find them again.
Reference codes shown briefly then never again.
Search results that strip the original query from the page.

7. Flexibility and efficiency of use

Accelerators, unseen by the novice user, may speed up interaction for the expert user. Allow users to tailor frequent actions.

Failure patterns

No keyboard shortcuts for repetitive actions.
Bulk actions absent in lists that obviously need them (selecting many items, sending many invites).
No way to save common configurations as templates or favourites.
Workflows that force novice steps on expert users with no opt-out.

This heuristic is the one most often deprioritised in audits because expert users are a smaller cohort. It is still worth scoring honestly: the lifetime value of expert users is disproportionate.

8. Aesthetic and minimalist design

Dialogues should not contain information that is irrelevant or rarely needed. Every extra unit of information competes with the relevant units and diminishes their relative visibility.

Failure patterns

Form fields that ask for information the company won't use.
Dashboards with twenty metrics where four would do.
Marketing chrome around transactional moments (signup, checkout, password reset).
Notifications, banners and cookie modals stacked on the same page.
Decorative animation that competes with the primary action.

"Aesthetic" here doesn't mean prettier; it means quieter. The audit fix is almost always to remove, not to redesign.

9. Help users recognise, diagnose and recover from errors

Error messages should be expressed in plain language, precisely indicate the problem, and constructively suggest a solution.

Failure patterns

"Something went wrong" with no specific guidance.
Validation messages that name the rule violated but not the fix ("Invalid input").
Error codes shown without explanation.
Blame-laden language ("You entered an incorrect password" rather than "That password didn't match").
Errors that disappear before the user can read them.

This is the heuristic most directly improved by good UX writing. The UX writing generator produces compliant variants for any error context.
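The principle above (plain language, a precise problem, a constructive fix) can be turned into a rough review check for error copy. This is a minimal sketch, not an established tool: `ErrorCopy`, `review`, the `GENERIC` phrase list and the `RAW_CODE` pattern are all hypothetical names and assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Phrases treated as too generic to help the user; this list is an assumption.
GENERIC = {"something went wrong", "invalid input", "an error occurred"}
# Rough pattern for raw codes such as ERR_042 or 0x1F; also an assumption.
RAW_CODE = re.compile(r"\b(?:[A-Z]{2,}[-_]\d+|0x[0-9A-Fa-f]+)\b")


@dataclass(frozen=True)
class ErrorCopy:
    problem: str  # what happened, in plain language
    fix: str      # what the user can do next

    def render(self) -> str:
        """Join the two parts into the message shown to the user."""
        return f"{self.problem} {self.fix}".strip()


def review(message: ErrorCopy) -> list[str]:
    """Return the issues found in one error message, in a fixed order."""
    issues = []
    if message.problem.strip().rstrip(".!").lower() in GENERIC:
        issues.append("generic problem statement")
    if RAW_CODE.search(message.problem):
        issues.append("raw error code shown")
    if not message.fix.strip():
        issues.append("no suggested fix")
    return issues
```

A check like this only catches the mechanical failures. Blame-laden wording and messages that vanish too quickly still need a human reviewer.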

10. Help and documentation

Even though it is better if the system can be used without documentation, it may be necessary to provide help. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.

Failure patterns

Help links that lead to generic marketing pages.
Documentation organised by feature rather than by user task.
"Contact us" as the only available support path.
In-product help that opens a new tab and loses the user's context.
Tooltips and microcopy missing on complex controls.

Severity scoring

Findings without severity scores get ignored. Use a 0 to 4 scale, applied honestly. Many teams default to scoring everything as severity 3 because it feels safe; this dilutes the prioritisation, and the roadmap becomes a flat list.

Severity rubric · 0–4 scale

How to score each finding

0 — Not a usability problem. The finding was an opinion, not an issue against the heuristic. Drop it from the report.
1 — Cosmetic. Need not be fixed unless extra time is available. Visual polish, minor inconsistencies.
2 — Minor. Fix if time allows. Users encounter the problem occasionally and work around it.
3 — Major. Must be fixed before the next release. Users encounter the problem frequently and it slows or blocks them.
4 — Catastrophe. Imperative to fix. Users cannot complete the task, or the failure causes data loss, financial harm, or accessibility exclusion.

Severity is a function of three factors: frequency (how often the problem occurs), impact (how badly it affects the user), and persistence (whether users can get past it once they know about it, or are bothered by it every time). Score honestly; the roadmap is the document the team will actually act on.
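One way to keep scores consistent across evaluators is to rate the three factors separately and derive the severity from them. The sketch below is illustrative only: the 0 to 2 factor ratings, the `Factors` type, the `blocks_task` flag and the cut-offs are assumptions, not part of the rubric above, and a team would need to calibrate them against findings it has already scored by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Hypothetical 0-2 ratings for one finding (0 = low, 2 = high)."""
    frequency: int    # how often users hit the problem
    impact: int       # how badly it affects them when they do
    persistence: int  # 0 = overcome once known, 2 = bothers users every time


def severity(f: Factors, blocks_task: bool = False) -> int:
    """Map factor ratings to the 0-4 scale. The mapping is an assumption."""
    for name in ("frequency", "impact", "persistence"):
        value = getattr(f, name)
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2, got {value}")
    if blocks_task:
        return 4  # task cannot be completed, or real harm results
    total = f.frequency + f.impact + f.persistence  # ranges from 0 to 6
    # Assumed cut-offs: 0 -> 0, 1-2 -> 1, 3-4 -> 2, 5-6 -> 3.
    return (total + 1) // 2
```

Severity 4 is kept as an explicit flag rather than a sum, because a finding that blocks the task or causes harm is a catastrophe however rarely it occurs.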

Operational workflow

A defensible heuristic evaluation runs in five steps. Same shape for a one-person review and a five-person panel.

Five-step workflow

Running the evaluation

Define the scope. Which screens, which flows, which user roles. Write it down in a single paragraph.
Two passes per evaluator. First pass: get a feel for the product. Second pass: methodically check each screen against the ten heuristics. The two-pass rule materially improves catch rate.
Capture findings as they appear. Each finding: screenshot, heuristic violated, severity score, recommended fix, estimated effort. Don't wait until the end to write them up.
Consolidate across evaluators. If you used more than one evaluator, merge duplicates, reconcile severity scores, and rank.
Present, don't just send. Walk the team through the top findings live. The deliverable is behaviour change, not a document.
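The capture and consolidation steps above can be sketched as a small data structure and a merge function. This is a minimal sketch under stated assumptions: the `Finding` fields mirror the list in the capture step, but the field names, the effort unit, the duplicate rule (same screen and same heuristic) and the median-based reconciliation are all hypothetical choices, not a prescribed method.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Finding:
    evaluator: str
    screen: str           # screen or flow step where the problem was seen
    heuristic: int        # 1-10, in Nielsen's order
    severity: int         # 0-4
    fix: str              # recommended fix
    effort_days: float    # estimated effort; the unit is an assumption
    screenshot: str = ""  # path or link to the capture


def consolidate(findings: list[Finding]) -> list[dict]:
    """Merge duplicates, reconcile severity scores, and rank the result."""
    groups = defaultdict(list)
    for f in findings:
        if f.severity > 0:  # severity 0 is not a usability problem; drop it
            # Assumed duplicate rule: same screen and same heuristic.
            groups[(f.screen, f.heuristic)].append(f)

    merged = []
    for (screen, heuristic), group in groups.items():
        merged.append({
            "screen": screen,
            "heuristic": heuristic,
            # Assumed rule: median score, rounded up when evaluators split.
            "severity": math.ceil(median(f.severity for f in group)),
            "evaluators": sorted({f.evaluator for f in group}),
            "fixes": sorted({f.fix for f in group}),
            "effort_days": max(f.effort_days for f in group),
        })

    # Rank: highest severity first, then findings more evaluators agreed on.
    merged.sort(key=lambda m: (-m["severity"], -len(m["evaluators"]), m["screen"]))
    return merged
```

Grouping by screen and heuristic is deliberately crude. Two distinct problems on the same screen can violate the same heuristic, so the merged list still needs a human pass before it is presented.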

Three to five evaluators is the empirical sweet spot. A single evaluator catches around 35 percent of issues in the typical product; three evaluators catch around 60 percent; five catch around 75 percent. Beyond five, the return diminishes sharply. If you're auditing alone, score everything more conservatively — your blind spots are larger than you think.
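The diminishing return is easier to see as arithmetic on the figures quoted above. The short sketch below uses only those three data points; the `CATCH_RATE` table and the function name are illustrative, and the straight-line averaging between quoted points is an assumption.

```python
# Catch rates quoted above, as {number of evaluators: share of issues found}.
CATCH_RATE = {1: 0.35, 3: 0.60, 5: 0.75}


def marginal_gain_per_evaluator(rates: dict[int, float]) -> list[tuple[int, int, float]]:
    """Average extra share of issues found per evaluator added between quoted points."""
    points = sorted(rates.items())
    gains = []
    for (n0, r0), (n1, r1) in zip(points, points[1:]):
        gains.append((n0, n1, round((r1 - r0) / (n1 - n0), 4)))
    return gains


for n0, n1, gain in marginal_gain_per_evaluator(CATCH_RATE):
    print(f"{n0} -> {n1} evaluators: +{gain:.1%} of issues per added evaluator")
```

On these figures, the first evaluator finds 35 percent of issues, each of the next two adds about 12.5 percentage points, and each of the two after that adds about 7.5.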

Frequently asked questions

What is heuristic evaluation in UX?

An expert review method where a senior practitioner rates a product against established usability principles. Nielsen's 10 heuristics are the most common. The output is prioritised, severity-scored findings.

What are Nielsen's 10 heuristics?

Visibility of system status; match between system and the real world; user control and freedom; consistency and standards; error prevention; recognition rather than recall; flexibility and efficiency of use; aesthetic and minimalist design; help users recognise, diagnose and recover from errors; help and documentation.

How many evaluators do I need?

Three to five. A single evaluator catches around 35 percent of issues; three catch around 60 percent; five catch around 75 percent. Beyond five, the return diminishes.

How do I score severity?

Use a 0 to 4 scale. 0 not a problem; 1 cosmetic; 2 minor; 3 major; 4 catastrophe. Severity is a function of frequency, impact and persistence. Score honestly.

Heuristic evaluation vs usability testing — which first?

Heuristic evaluation first. It catches the known failure modes cheaply so the usability test can focus on unknowns. Reversing the order means paying users to discover problems a senior practitioner would have flagged in a morning.

Jamie Pow

Associate Director, Experience Design at JD.com · Previously Head of UX at Selfridges & Co · Building UX Companion
