Why credentials fail, what actually predicts job performance, and how structured proof of work changes the signal-to-noise ratio in hiring decisions.
Degrees, job titles, and CV bullet points share a fundamental flaw: they measure proximity to capability, not capability itself.
A degree from a prestigious university tells you that someone was admitted, paid tuition, and completed the required coursework — not that they can do the work you're hiring them to do. A job title tells you what someone was called, not what they built. A CV bullet point claiming "increased revenue by 40%" is unverifiable, uncontextualised, and entirely self-reported.
The hiring research literature has been clear on this for decades. A landmark meta-analysis by Schmidt and Hunter (1998), covering 85 years of selection research, found that years of education and years of job experience, the two things a CV mostly records, are among the weakest predictors of job performance, and that unstructured interviews predict markedly worse than structured ones. Yet these remain the dominant hiring tools in most organizations.
A validity coefficient is the correlation between a selection method and later job performance, on a scale where 1.0 would be perfect prediction. Schmidt and Hunter report roughly 0.10 for years of education, 0.18 for years of job experience, and 0.38 for unstructured interviews. By comparison, work sample tests score 0.54 and structured interviews reach 0.51. Much CV-based screening still relies on inputs in the 0.10-0.20 range.
The problem is compounded by reference inflation: candidates supply only hand-picked referees, who give almost uniformly positive assessments. Research puts the predictive validity of reference checks at around 0.26, and this drops sharply when references are self-selected (which they almost always are in standard hiring).
Studies show 56-75% of CVs contain at least one significant inaccuracy. There is no standard for what "led", "owned", or "drove" means.
Candidates choose their own referees. The system is structurally incapable of providing independent verification.
A degree proves institutional access, not capability. The correlation between degree prestige and job performance is consistently weak across research.
Knowing how to answer a case study is not the same as having shipped under real constraints. Abstract tests miss applied capability entirely.
The research is unambiguous: the closer a selection method is to actual work, the better it predicts performance.
In the Schmidt & Hunter hierarchy of predictive validity, work sample tests are the strongest single method (validity ~0.54), followed by structured behavioral interviews (~0.51). Combining a cognitive ability test with a structured assessment raises validity further, to ~0.63. What all of these have in common is that they require candidates to demonstrate capability rather than describe it.
The implication for hiring is direct: the best evidence of future performance is structured past performance — specific work, in specific contexts, with measurable outcomes, verified by people who were there.
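To make the coefficients quoted above more concrete, here is a minimal sketch of two common ways to read a validity coefficient r: as the expected performance (in standard deviations) of a candidate who scores a given distance from the mean on the method, and as the share of performance variance the method accounts for (r squared). The function and dictionary names are illustrative, not part of any Veryfy tooling.

```python
# Validity coefficients as quoted in the text above.
VALIDITY = {
    "reference checks": 0.26,
    "structured interview": 0.51,
    "work sample test": 0.54,
    "cognitive ability + structured assessment": 0.63,
}


def expected_performance_z(r: float, predictor_z: float) -> float:
    """Predicted job performance, in standard deviations from the mean, for a
    candidate scoring predictor_z standard deviations from the mean on the method."""
    return r * predictor_z


def variance_explained(r: float) -> float:
    """Share of the variance in job performance that the method accounts for."""
    return r ** 2


for method, r in VALIDITY.items():
    print(
        f"{method}: r={r:.2f}, "
        f"a +1 SD scorer is expected at {expected_performance_z(r, 1.0):+.2f} SD, "
        f"variance explained {variance_explained(r):.0%}"
    )
```

Read either way, the gap between a method near 0.26 and one near 0.54 is much larger than the raw numbers suggest at first glance.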
| Selection Method | Predictive Validity (r) | Used in Veryfy |
|---|---|---|
| CV screening (unstructured) | ~0.18 (low) | - |
| Reference checks | ~0.26 (moderate; lower when self-selected) | - |
| Unstructured interview | ~0.38 (moderate) | - |
| Structured interview | ~0.51 (strong) | - |
| Work sample test | ~0.54 (strong) | - |
| Structured past performance (Veryfy) | ~0.55–0.62 (estimated) | ✓ Core model |
Veryfy's model is designed around structured past performance — not CV claims, but structured work entries with defined context, contribution, measurable outcome, and multi-source verification. This is the only selection input that combines the predictive validity of work samples with the breadth of a full career history.
Each work entry in a Veryfy Passport is structured around five mandatory fields. This structure is what makes entries comparable, searchable, and verifiable.
Context: Company, team, time period, and scope of role. Places the work in a verifiable situation.
Contribution: The candidate's specific role in the work, separating what they personally did from what the team did.
Artifacts: Linked evidence such as repos, designs, documents, and platform integrations.
Outcome: The measurable result, tagged to outcome categories for searchability.
Verification: Third-party signals confirming that the entry is accurate.
This 5-layer structure means that every entry in a Veryfy Passport answers the same set of questions. It's not a portfolio — it's structured evidence. A hiring manager reviewing 20 passports is comparing apples to apples, not sifting through 20 different formats of self-promotion.
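The five-field structure can be pictured as a small record type. The sketch below is an assumption about shape only: the class name, field names, and the `missing_fields` helper are hypothetical, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkEntry:
    """One structured work entry; every field is mandatory in the model described above."""
    context: str = ""        # company, team, time period, scope of role
    contribution: str = ""   # what the candidate personally did
    artifacts: list[str] = field(default_factory=list)      # links to repos, designs, documents
    outcome: str = ""        # measurable result
    verifications: list[str] = field(default_factory=list)  # third-party confirmations

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]


# Invented sample entry: the outcome and verifications have not been filled in yet.
entry = WorkEntry(
    context="Example Co, growth team, 2022-2023",
    contribution="Designed and shipped the new checkout flow",
    artifacts=["https://example.com/checkout-redesign"],
)
print(entry.missing_fields())
```

Because every entry exposes the same fields, a reviewer (or a script) can check completeness and compare entries without interpreting free-form text.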
The outcome tagging layer is particularly powerful. Tags like +18pp conversion, 0-to-1 product, or team of 8 are machine-readable, enabling semantic search across the full candidate pool without requiring a human to read every entry.
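As a rough illustration of why machine-readable tags help, the sketch below parses tags shaped like `+18pp conversion` and filters on them. It is a plain structured filter rather than semantic search, and the tag grammar, class, and function names are assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional


class MetricTag(NamedTuple):
    metric: str
    delta: float
    unit: str  # "pp" (percentage points) or "%"


# Assumed tag shape: signed number, unit, then the metric name.
_TAG = re.compile(r"^([+-]?\d+(?:\.\d+)?)(pp|%)\s+(.+)$")


def parse_metric_tag(tag: str) -> Optional[MetricTag]:
    """Parse tags like '+18pp conversion'; return None for tags of another shape."""
    match = _TAG.match(tag.strip())
    if match is None:
        return None
    return MetricTag(metric=match.group(3).lower(),
                     delta=float(match.group(1)),
                     unit=match.group(2))


def has_uplift(tags: list[str], metric: str, min_delta: float) -> bool:
    """True if any tag reports a change of at least min_delta on the given metric."""
    parsed = (parse_metric_tag(t) for t in tags)
    return any(p is not None and p.metric == metric and p.delta >= min_delta
               for p in parsed)


tags = ["+18pp conversion", "0-to-1 product", "team of 8"]
print(parse_metric_tag(tags[0]))
print(has_uplift(tags, "conversion", 10))
```

Tags that do not fit the numeric pattern, such as `0-to-1 product`, are simply skipped here; a real system would need its own handling for categorical tags.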
Not all verification is equal. Veryfy weights signals according to their structural independence from the candidate.
The core principle is simple: the less the candidate can influence or select the verifier, the higher the trust weight of that signal. Platform Pull — automated data from GitHub, Figma, or analytics platforms — carries the highest weight because the candidate cannot fabricate a commit history or inflate a conversion rate retroactively.
Platform Pull: Automated verification from integrated platforms (GitHub, Figma, Jira, analytics). Cannot be coached or staged.
Manager confirmation: A named manager confirms the role, contribution, and outcome with one click. The manager's own trust score modulates the weight.
Peer confirmation: Collaborators on the same project confirm involvement. Cross-reference checks prevent multiple people from claiming sole ownership.
Self-declaration: The baseline layer, the candidate's own structured claim. Carries low weight alone, but is the foundation all other signals build on.
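The ordering of these four signal types can be sketched as a weight table. Everything numeric below is assumed for illustration: the source does not publish weights, and the noisy-OR rule used to combine signals is one plausible choice, not Veryfy's documented method.

```python
from dataclasses import dataclass

# Hypothetical base weights, ordered by independence from the candidate.
BASE_WEIGHT = {"platform": 0.9, "manager": 0.6, "peer": 0.4, "self": 0.1}


@dataclass
class Signal:
    source: str                  # one of the BASE_WEIGHT keys
    verifier_trust: float = 1.0  # 0..1, e.g. the confirming manager's own trust score


def signal_weight(signal: Signal) -> float:
    """Base weight of the source, scaled by how much the verifier is trusted."""
    return BASE_WEIGHT[signal.source] * signal.verifier_trust


def entry_trust(signals: list[Signal]) -> float:
    """Combine signals with a noisy-OR: each signal removes part of the remaining doubt."""
    doubt = 1.0
    for signal in signals:
        doubt *= 1.0 - signal_weight(signal)
    return 1.0 - doubt


self_only = [Signal("self")]
with_manager = self_only + [Signal("manager", verifier_trust=0.5)]
with_platform = self_only + [Signal("platform")]
print(entry_trust(self_only), entry_trust(with_manager), entry_trust(with_platform))
```

Under these assumed numbers a self-declared entry stays near the bottom of the scale until a signal the candidate cannot control is attached to it.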
Signal weights are not static. They decay over time (older verifications from people who have since left the company carry less weight), and they are subject to anomaly detection that flags sudden, coordinated spikes in verification. The system is designed to be expensive to game and cheap to use honestly.
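Both mechanisms can be sketched in a few lines. The example assumes exponential decay with a two-year half-life and a simple sliding-window count for spike detection; the half-life, window size, and threshold are invented parameters, and the sketch decays by age only, ignoring whether the verifier has left the company.

```python
from datetime import date, timedelta

HALF_LIFE_DAYS = 730  # assumed: a verification loses half its weight every two years


def decayed_weight(base_weight: float, verified_on: date, today: date) -> float:
    """Exponentially decay a verification's weight by its age in days."""
    age_days = max((today - verified_on).days, 0)
    return base_weight * 0.5 ** (age_days / HALF_LIFE_DAYS)


def has_verification_spike(dates: list[date], window_days: int = 7,
                           max_in_window: int = 5) -> bool:
    """Flag a profile if more than max_in_window verifications fall within
    any span shorter than window_days."""
    ordered = sorted(dates)
    start = 0
    for end, current in enumerate(ordered):
        # Shrink the window from the left until it spans fewer than window_days.
        while (current - ordered[start]).days >= window_days:
            start += 1
        if end - start + 1 > max_in_window:
            return True
    return False


today = date(2024, 1, 1)
print(decayed_weight(1.0, today - timedelta(days=730), today))
burst = [today - timedelta(days=d) for d in range(6)]       # six in six days
steady = [today - timedelta(days=30 * d) for d in range(6)]  # six over five months
print(has_verification_spike(burst), has_verification_spike(steady))
```

A production system would presumably also look at who the verifiers are and how they relate to each other; the count-based check here only captures the "sudden" part of a coordinated spike.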
The LeaDe Score is not a rating — it's a weighted composite of five distinct capability dimensions, each independently measured.
Depth captures the breadth and quality of work in a candidate's strongest domain, weighted by the significance of outcomes. Recency applies temporal decay: a profile active within the last 90 days scores higher than one showing identical work done three years ago, reflecting the reality that skills and context change. Verification Density measures what percentage of entries are independently verified rather than only self-declared.
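Two of these dimensions lend themselves to a short worked example. In the sketch below, Verification Density is the fraction of entries with at least one independent confirmation, and Recency is an exponential decay on the age of the newest entry. The `Entry` type, the one-year half-life, and the 0-to-1 scaling are assumptions; the source does not specify the actual formulas.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Entry:
    completed_on: date
    independently_verified: bool  # True if at least one non-self signal confirms it


def verification_density(entries: list[Entry]) -> float:
    """Fraction of entries that are independently verified (0.0 for an empty profile)."""
    if not entries:
        return 0.0
    return sum(e.independently_verified for e in entries) / len(entries)


def recency(entries: list[Entry], today: date, half_life_days: int = 365) -> float:
    """1.0 for work completed today, halving every half_life_days (assumed decay curve)."""
    if not entries:
        return 0.0
    newest = max(e.completed_on for e in entries)
    age_days = max((today - newest).days, 0)
    return 0.5 ** (age_days / half_life_days)


today = date(2024, 6, 1)
profile = [
    Entry(today - timedelta(days=365), True),
    Entry(today - timedelta(days=800), True),
    Entry(today - timedelta(days=1200), False),
    Entry(today - timedelta(days=1500), False),
]
print(verification_density(profile), recency(profile, today))
```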
The score is designed to be informative, not decisive. A score of 84 vs. 86 is not meaningful; a score of 84 vs. 54 is. The real value is in the composition of the score — a candidate with high Depth but low Recency tells a very different story than one with high Recency but low Verification Density.
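The point about composition can be shown with a small weighted-average sketch. The equal weights, the 0-100 scale per dimension, and the 10-point threshold for a "meaningful" gap are all assumptions, and only the three dimensions named in the text are used.

```python
def composite_score(dimensions: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 0-100 dimension scores, over the dimensions supplied."""
    total_weight = sum(weights[name] for name in dimensions)
    return sum(score * weights[name] for name, score in dimensions.items()) / total_weight


def meaningfully_different(a: float, b: float, min_gap: float = 10.0) -> bool:
    """Treat small gaps as noise; min_gap is an assumed threshold."""
    return abs(a - b) >= min_gap


# Hypothetical equal weights over the three dimensions named in the text.
WEIGHTS = {"depth": 1.0, "recency": 1.0, "verification_density": 1.0}

specialist = {"depth": 90.0, "recency": 40.0, "verification_density": 80.0}
newcomer = {"depth": 60.0, "recency": 90.0, "verification_density": 60.0}

for name, dims in (("specialist", specialist), ("newcomer", newcomer)):
    print(name, composite_score(dims, WEIGHTS), dims)

print(meaningfully_different(84, 86), meaningfully_different(84, 54))
```

Both invented profiles land on the same composite, which is why the breakdown beneath the number carries more information than the number itself.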
Veryfy explicitly discourages using the score as a filter threshold. The score is a navigation tool, not a gate. It directs attention to the structured evidence beneath it — the actual work entries, artifacts, and verification signals that form the real basis of a hiring decision.