PDEC

Methodology

How risk scores, severity, and privacy are calculated.


Overall Risk Score (Email Checks)

For an email address, the overall risk score is a deterministic function of three factors. Each factor is computed independently and summed, then capped at 100:

riskScore = min(frequency + recency + sensitivity, 100)

frequency (max 40)
min(numBreaches × 8, 40)

Number of distinct breaches the email appears in.

recency (max 30)
< 6 months → 30 · < 1 year → 20 · < 2 years → 10 · older → 5

Based on the most recent breach date.

sensitivity (max 30)
(avg(min(sensitivityWeight(breach.dataClasses), 15)) ÷ 15) × 30

Average sensitivity of exposed data classes across all breaches, with each breach's weight capped at 15 before averaging.

The qualitative bucket is then derived from the numeric score:

Bucket    Score
None      0
Low       1 – 29
Medium    30 – 59
High      60 – 100
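
The three factors and the bucket mapping above can be sketched in Python as follows. This is a minimal illustration, not the project's actual code: the function names, the approximation of "6 months" as 183 days, the rounding of the final sum to an integer, and the zero score for an address with no breaches are all assumptions.

```python
from datetime import date
from typing import Sequence


def recency_points(most_recent: date, today: date) -> int:
    """Age-bucketed points for the most recent breach date."""
    age_days = (today - most_recent).days
    if age_days < 183:   # assumption: "6 months" approximated as 183 days
        return 30
    if age_days < 365:
        return 20
    if age_days < 730:
        return 10
    return 5


def email_risk_score(breach_weights: Sequence[float],
                     most_recent: date, today: date) -> int:
    """breach_weights holds one raw sensitivity sum per breach."""
    if not breach_weights:
        return 0  # assumption: no breaches maps to the "None" bucket
    frequency = min(len(breach_weights) * 8, 40)
    recency = recency_points(most_recent, today)
    capped = [min(w, 15) for w in breach_weights]  # per-breach cap of 15
    sensitivity = (sum(capped) / len(capped)) / 15 * 30
    # assumption: the summed score is rounded to the nearest integer
    return min(round(frequency + recency + sensitivity), 100)


def bucket(score: int) -> str:
    """Map a numeric score to its qualitative bucket."""
    if score == 0:
        return "None"
    if score < 30:
        return "Low"
    if score < 60:
        return "Medium"
    return "High"
```

For example, two breaches with raw weights 6.0 and 20.0, the most recent about two months old, give frequency 16, recency 30, and sensitivity 21, for a total of 67 (High).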
Password Risk Score

Password scoring is based on how often the password has been seen in breach corpora collected by Have I Been Pwned (Pwned Passwords). It is a single-dimension score, so the factor breakdown is hidden in the UI (the API still returns a factors object with all three contributions equal to zero, for schema consistency).

Independent strength estimate (zxcvbn). Alongside the breach lookup, a second, independent signal is shown: a local estimate of password strength produced by zxcvbn (Wheeler, 2016). zxcvbn returns an integer score from 0 (Very Weak) to 4 (Very Strong) based on dictionary matches, keyboard patterns, and entropy, and yields an estimated offline crack-time. This computation runs entirely in the browser — the password is never sent to the server for strength analysis (the breach lookup separately uses k-anonymity, sending only a 5-char SHA-1 prefix). A password can be Strong by zxcvbn but still appear in HIBP, or Weak yet not (yet) appear in any leak — the two signals answer different questions and are best read together.

Times seen     Score   Bucket
≥ 100,000      95      High
≥ 10,000       80      High
≥ 1,000        60      Medium
≥ 100          40      Medium
1 – 99         25      Low
0              0       None
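
As a rough illustration, the lookup table above can be written as an ordered threshold scan. The function name is hypothetical, and the bucket labels are copied from the table as given rather than derived from the email bucket ranges.

```python
from typing import Tuple

# (minimum times seen, score, bucket), checked from highest to lowest
PASSWORD_THRESHOLDS = (
    (100_000, 95, "High"),
    (10_000, 80, "High"),
    (1_000, 60, "Medium"),
    (100, 40, "Medium"),
    (1, 25, "Low"),
)


def password_score(times_seen: int) -> Tuple[int, str]:
    """Return (score, bucket) for a Pwned Passwords occurrence count."""
    for minimum, score, label in PASSWORD_THRESHOLDS:
        if times_seen >= minimum:
            return score, label
    return 0, "None"  # never seen in the corpus
```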
Per-Breach Severity Score

Each individual breach also receives its own 0–100 severity score so you can prioritise which breach to act on first. The formula is:

severity = min(sensitivity + recency + scale, 100)
  where:
    sensitivity = (sensitivityWeight(dataClasses) / 15) × 50    // 0 – 50
    recency     = age-bucketed points (see "recency" above)     // 0 – 30
    scale       = PwnCount-bucketed points                      // 0 – 20
Accounts in breach (PwnCount)   Scale points
≥ 100,000,000                   20
≥ 10,000,000                    15
≥ 1,000,000                     10
≥ 10,000                        5
< 10,000                        0
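
A minimal sketch of the per-breach formula, assuming the sensitivity weight and the recency points have already been computed elsewhere. The function names and the rounding to an integer are assumptions; the cap of 15 on the weight follows the per-breach cap described for the sensitivity sum.

```python
def scale_points(pwn_count: int) -> int:
    """PwnCount-bucketed points (0 to 20)."""
    thresholds = ((100_000_000, 20), (10_000_000, 15),
                  (1_000_000, 10), (10_000, 5))
    for minimum, points in thresholds:
        if pwn_count >= minimum:
            return points
    return 0


def breach_severity(sensitivity_weight: float, recency: int,
                    pwn_count: int) -> int:
    """Combine the three components into a 0-100 severity score."""
    sensitivity = min(sensitivity_weight, 15) / 15 * 50  # 0 to 50
    total = sensitivity + recency + scale_points(pwn_count)
    return min(round(total), 100)  # assumption: rounded to an integer
```

For instance, a breach with sensitivity weight 9.0, 20 recency points, and 5,000,000 affected accounts scores 30 + 20 + 10 = 60.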
Data-Class Sensitivity Weights

Every data class exposed in a breach contributes a weight to that breach's sensitivity sum (capped at 15 per breach so a single breach cannot dominate the average):

High (×3.0)
  • Passwords
  • Password hints
  • Security questions and answers
  • Credit cards
  • Banking details
  • Payment histories
  • Social security numbers
  • Government issued IDs
Medium (×1.5)
  • Phone numbers
  • Physical addresses
  • Dates of birth
  • Financial data
  • Health & fitness data
  • Medical records
  • Bank account numbers
  • IP addresses
Low (×0.5)
  • Everything else (e.g. usernames, email addresses, names)
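
The weight tiers above can be sketched as a lookup with a default. The names are hypothetical, and exact, case-sensitive matching of data-class names is an assumption; a real implementation might normalise the strings first.

```python
from typing import Iterable

HIGH_CLASSES = {
    "Passwords", "Password hints", "Security questions and answers",
    "Credit cards", "Banking details", "Payment histories",
    "Social security numbers", "Government issued IDs",
}
MEDIUM_CLASSES = {
    "Phone numbers", "Physical addresses", "Dates of birth",
    "Financial data", "Health & fitness data", "Medical records",
    "Bank account numbers", "IP addresses",
}


def sensitivity_weight(data_classes: Iterable[str]) -> float:
    """Sum the per-class weights for one breach, capped at 15."""
    total = 0.0
    for name in data_classes:
        if name in HIGH_CLASSES:
            total += 3.0
        elif name in MEDIUM_CLASSES:
            total += 1.5
        else:
            total += 0.5  # everything else
    return min(total, 15.0)
```

A breach exposing email addresses, passwords, and phone numbers would weigh 0.5 + 3.0 + 1.5 = 5.0.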
k-Anonymity for Password Checks

Password lookups never transmit the raw password — or even its full hash — to any external service. Following HIBP's published k-anonymity protocol, only the first 5 characters of the SHA-1 hash leave the server. The remote service returns the list of all hash suffixes that share that prefix (typically ~800 entries per range). The match check happens locally:

Figure: k-Anonymity password lookup flow. The browser sends the raw password to PDEC over TLS. PDEC computes SHA-1 locally, transmits only the first 5 hex characters of the hash to the HIBP range API, then matches the returned hash suffixes locally. The remote service never sees the password or its full hash.
  1. The browser submits the password to PDEC over TLS; it never reaches a third party.
  2. PDEC computes SHA-1(password) locally (40 hex characters).
  3. Only the first 5 hex characters of the hash are sent to HIBP (GET /range/AABBC).
  4. HIBP returns ~800 hash suffixes that share that prefix, each with a count.
  5. PDEC compares the suffix locally: if it matches, the count is returned to the user; otherwise no exposure is reported.

This guarantees that the breach API never sees the user's password or its full hash, and cannot tell which specific password was checked from any single request — only that someone checked something whose hash begins with that 5-character prefix.
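
The hashing and local matching steps can be sketched as below. This is an illustration under stated assumptions rather than PDEC's actual code: the function names are hypothetical, the range response is assumed to be plain text with one SUFFIX:COUNT pair per line, and the HTTP request itself is left to the caller.

```python
import hashlib
from typing import Tuple


def split_hash(password: str) -> Tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(range_body: str, suffix: str) -> int:
    """Scan a range response locally; return the count for suffix, or 0."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not present: no known exposure


# Only `prefix` would ever be sent upstream; `suffix` stays local.
prefix, suffix = split_hash("password")
```

Because the comparison runs after the response arrives, the upstream service learns only the prefix, never which of the returned suffixes (if any) matched.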

Data Sources
  • HIBP Public Breach List (/api/v3/breaches) — curated metadata for all known breaches: name, date, affected account count, exposed data classes. Cached server-side for 1 hour to minimise upstream load.
  • XposedOrNot (/v1/check-email/<email>) — free per-email lookup that returns the names of breaches the address appears in. Combined with the HIBP metadata above to produce the breach details shown.
  • HIBP Pwned Passwords (/range/<5-char prefix>) — k-anonymity range lookup over a corpus of more than 850 million previously breached passwords.

All external HTTP calls have an 8–10 second timeout to fail fast under upstream degradation.
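
The one-hour server-side cache for the breach list could look roughly like the following. The class name, the loader callback, and the injectable clock are assumptions made for illustration; the loader stands in for whatever function performs the upstream request with its own timeout.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TtlCache(Generic[T]):
    """Hold one value and reload it once the time-to-live has elapsed."""

    def __init__(self, loader: Callable[[], T], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First call or expired entry: hit the upstream loader again.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

Passing the clock as a parameter keeps the expiry logic testable without waiting for real time to pass.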

Privacy Guarantees
  • No identifier (email or password) is persisted to disk or any database.
  • Server logs explicitly redact request bodies and identifiers — only method, URL path, status, and duration are recorded.
  • For password checks, only the first 5 characters of a SHA-1 hash are transmitted externally.
  • The frontend masks the checked identifier after the lookup completes (passwords become the literal string "a password" in client memory).
  • Helmet sets safe HTTP headers; per-IP rate limiting protects the upstream APIs and the user from abuse.
References
  1. Ali, J. (2018). Validating leaked passwords with k-Anonymity. blog.cloudflare.com/validating-leaked-passwords-with-k-anonymity
  2. Hunt, T. Have I Been Pwned API v3 documentation. haveibeenpwned.com/API/v3
  3. National Institute of Standards and Technology (2017). NIST Special Publication 800-63B — Digital Identity Guidelines: Authentication and Lifecycle Management. pages.nist.gov/800-63-3/sp800-63b.html
  4. European Parliament & Council (2016). Regulation (EU) 2016/679 (General Data Protection Regulation), Article 32 — Security of processing. gdpr-info.eu/art-32-gdpr

Academic Final Year Project • Personal Data Exposure Checker

Passwords use k-anonymity - only a partial hash is transmitted. No identifier is stored or logged.
