Methodology
Every index, formula, input, source, assumption, and caveat — written to withstand review by economists, statisticians, labor and policy researchers, and AI researchers. The computation is open and deterministic.
What this is — and is not
What this is. Cognitive Coefficient is a composite-indicator and scenario platform. It measures, as transparently as possible, the access to, utilization of, and leverage from cognitive infrastructure — AI systems, compute, data, education, digital skills, innovation capacity, and knowledge networks — across economies and occupations.
What this is not. It does not measure intelligence, human worth, or destiny. It does not claim causality. It does not predict outcomes. Every index is a model construct over curated data, and every projection is a scenario with explicit assumptions and uncertainty — not a forecast of actual outcomes.
Measurement vs. prediction. We deliberately separate:
- Measurement — present-day composite indices computed from observed/curated indicators (CC, CDI, CII, CSI, AIR, AIX, CRI).
- Exposure & capability — structural characteristics (e.g. occupational AI exposure for OFI).
- Projections & scenarios — bounded-growth and diffusion projections under alternative assumptions (CV/CM are forward signals; the scenario engine and OFI projections are explicitly scenario-based).
Standing disclaimer. Throughout the platform: these are model outputs and scenarios, not forecasts of actual outcomes.
Data & provenance
v1 data is curated and source-tagged. Each indicator value carries a source, a year, and an estimated flag. Where a precise current value was unavailable, a best-calibrated estimate is used and flagged estimated: true. Anchor indicators (internet use, R&D/GDP, tertiary enrollment, GDP) draw on well-documented public statistics (World Bank, ITU, UNESCO, OECD, IMF); AI-specific indicators draw on the Stanford AI Index, Tortoise Global AI Index, Oxford Insights Government AI Readiness, WIPO, and Top500. This is an illustrative-but-anchored v1 panel, not an official statistical release. A live ETL layer (documented under Architecture) is the planned route to continuously ingested official data.
Normalization. Every indicator is min-max normalized across the country panel to a 5–100 range (a 5-point floor avoids a single worst performer collapsing to a degenerate zero). Heavy-tailed indicators (GDP, GDP per capita, private AI investment, patents, researchers, scientific publications, Top500 systems, data-center MW, cloud regions, notable models, AI publication share) are log-transformed (log1p) before normalization. All indicators are oriented so that higher = more cognitive infrastructure. Missing cells are excluded from their pillar's mean rather than imputed.
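The normalization step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the platform's code: the function name normalize, the use of None for a missing cell, and the treatment of a constant column are all hypothetical choices made for the example.

```python
import math

def normalize(values, log_scale=False, floor=5.0, ceiling=100.0):
    """Min-max normalize one indicator across the panel to [floor, ceiling].

    None marks a missing cell and is passed through untouched.
    """
    transform = math.log1p if log_scale else (lambda v: v)
    observed = [transform(v) for v in values if v is not None]
    if not observed:
        return list(values)
    lo, hi = min(observed), max(observed)
    span = hi - lo
    result = []
    for v in values:
        if v is None:
            result.append(None)
        elif span == 0:
            # Assumption: a constant column maps to the ceiling (not specified in the source).
            result.append(ceiling)
        else:
            result.append(floor + (ceiling - floor) * (transform(v) - lo) / span)
    return result

# Hypothetical raw values: a linear indicator with one missing cell,
# and a heavy-tailed indicator that is log1p-transformed first.
print(normalize([0.0, 50.0, None, 100.0]))
print(normalize([10.0, 1_000.0, 100_000.0], log_scale=True))
```

The sketch assumes the indicator is already oriented so that higher means more cognitive infrastructure; an indicator with the opposite orientation would need to be inverted before this step.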
Cross-sectional, single-vintage. The v1 panel is a single cross-section. There is no historical time series, so the temporal indices (CV, CM, CEV) are forward heuristics derived from present-day structure — not estimated trends. This is stated explicitly in each temporal index below and in the code's own docstrings.
CC · Cognitive Coefficient
Measures: Overall access to, utilization of, and leverage from cognitive infrastructure for an economy. Scale: 0–100.
CC is the arithmetic mean of 8 normalized pillars: AI Access, AI Adoption, Compute, Data, Education, Digital Literacy, Innovation, and Knowledge Infrastructure.
Each pillar is a weighted arithmetic mean of its member indicators (member weights shown in the data layer; e.g. R&D/GDP and data-center capacity carry weight 1.5). Each indicator is first min-max normalized across the panel to 5–100 (log1p-transformed first for heavy-tailed indicators). The pillars are then combined with equal weight:
CC = (1/8) · Σ pillar_p
Missing indicators are dropped from their pillar's mean, and missing pillars are dropped from CC, so the divisor becomes the number of available pillars rather than 8.
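A minimal Python sketch of this two-level aggregation follows. It assumes indicators are already normalized to 5–100 and uses None for a missing value; the function names and the sample numbers are hypothetical, not taken from the platform's code or data.

```python
def weighted_mean(members):
    """Weighted mean over (value, weight) pairs, skipping missing (None) values."""
    present = [(v, w) for v, w in members if v is not None]
    if not present:
        return None
    total_weight = sum(w for _, w in present)
    return sum(v * w for v, w in present) / total_weight

def cognitive_coefficient(pillars):
    """Equal-weight mean of the available pillar scores.

    pillars maps a pillar name to a list of (normalized value, member weight).
    """
    scores = [weighted_mean(members) for members in pillars.values()]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None

# Hypothetical three-pillar example; a real CC uses all 8 pillars.
example = {
    "Innovation": [(80.0, 1.5), (60.0, 1.0)],   # weighted mean = 72
    "Education": [(70.0, 1.0), (None, 1.0)],    # missing member dropped, mean = 70
    "Compute": [(None, 1.5)],                   # whole pillar missing, dropped from CC
}
print(cognitive_coefficient(example))
```

In this example the Compute pillar has no data, so CC is averaged over the two remaining pillars.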
Inputs: AI Access (internet, broadband, cloud regions), AI Adoption (AI-skill penetration, national AI capacity, AI talent), Compute (Top500, data-center MW, semiconductor), Data (data governance, sci. publications, digital skills), Education (tertiary, schooling, human capital, STEM share), Digital Literacy (digital skills, AI-skill penetration, internet), Innovation (R&D/GDP, researchers, patents, GII, AI investment), Knowledge Infrastructure (publications, notable models, AI-pub share, gov AI readiness)
Sources: World Bank WDI, ITU, UNESCO UIS, OECD, WIPO GII, Stanford AI Index, Tortoise Global AI Index, Oxford Insights, Top500
Assumptions
- Equal weighting across the 8 pillars is a transparent default, not an optimized weighting.
- Min-max (5–100) normalization makes CC a relative, panel-dependent measure.
- Log transforms tame heavy-tailed indicators before normalization.
Caveats
- CC is relative to the 48-economy panel, not an absolute capacity.
- Sensitivity to weighting choices is not yet published.
- Curated v1 data includes flagged estimates.
CDI · Cognitive Development Index
Measures: Balanced cognitive development, penalizing imbalance across dimensions (HDI-style). Scale: 0–1.
CDI is the geometric mean of 5 normalized dimension indices, each scaled to [0,1]:
CDI = (D_ai · D_digital · D_education · D_innovation · D_knowledge)^(1/5)
Each dimension is the mean of its normalized (0–100) components ÷ 100. The geometric mean (as in the HDI since 2010) penalizes uneven development: a low score in any one dimension drags the whole index down. A component floor of 0.001 prevents zero-collapse.
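To show how geometric aggregation penalizes imbalance, here is a short Python sketch. It assumes the 0.001 floor is applied to each dimension index before aggregation and that a dimension with no data is dropped; both are readings of the text rather than confirmed implementation details, and the function name cdi is hypothetical.

```python
import math

def cdi(dimensions, floor=0.001):
    """Geometric mean of dimension indices, each the mean of its components / 100.

    dimensions maps a dimension name to a list of normalized component scores.
    """
    indices = []
    for components in dimensions.values():
        present = [c for c in components if c is not None]
        if not present:
            continue  # assumption: a dimension with no data is dropped
        index = sum(present) / len(present) / 100.0
        indices.append(max(index, floor))  # assumption: floor applied per dimension
    if not indices:
        return None
    # Geometric mean computed via logs for numerical stability.
    return math.exp(sum(math.log(d) for d in indices) / len(indices))

# Hypothetical profiles: one balanced, one with a single weak dimension.
balanced = {name: [74.0] for name in ("ai", "digital", "education", "innovation", "knowledge")}
uneven = {"ai": [10.0], "digital": [90.0], "education": [90.0],
          "innovation": [90.0], "knowledge": [90.0]}
print(cdi(balanced), cdi(uneven))
```

Both hypothetical profiles have the same arithmetic mean (0.74), but the uneven one scores noticeably lower under the geometric mean.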
Inputs: AI readiness (gov AI readiness, national AI capacity, AI governance), Digital capability (internet, digital skills, mobile broadband), Education (tertiary, mean schooling, human capital), Innovation (R&D/GDP, patents, GII), Knowledge production (sci. publications, AI-pub share, notable models)
Sources: UNDP HDI (method), World Bank, UNESCO, OECD, Stanford AI Index
Assumptions
- Five equally-weighted dimensions.
- Geometric aggregation across dimensions; arithmetic within a dimension.
Caveats
- Panel-relative (0–1 is relative to observed min/max).
- Missing dimensions are dropped, which can affect cross-country comparability.
CII · Cognitive Inequality Index
Measures: Inequality in access to cognitive infrastructure across the world's people. Scale: 0–1.
CII is the population-weighted Gini coefficient of the cross-country CC distribution. Each economy contributes its CC value weighted by population:
Gini = Σᵢ Σⱼ wᵢ wⱼ |CCᵢ − CCⱼ| / (2 · W² · μ_w)
where wᵢ is population, W = Σ wᵢ, and μ_w is the population-weighted mean CC. We also report the Theil-T index and the Lorenz curve (cumulative share of people vs. cumulative share of cognitive capital).
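The Gini formula above translates directly into code. The sketch below is illustrative only: the function names are hypothetical, and the Theil-T expression is the standard population-weighted form, which is assumed here rather than confirmed as the platform's exact variant.

```python
import math

def weighted_gini(values, weights):
    """Population-weighted Gini: sum_i sum_j w_i w_j |x_i - x_j| / (2 W^2 mu_w)."""
    total_w = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_w
    pairs = list(zip(values, weights))
    abs_diff = sum(wi * wj * abs(vi - vj) for vi, wi in pairs for vj, wj in pairs)
    return abs_diff / (2 * total_w ** 2 * mean)

def theil_t(values, weights):
    """Population-weighted Theil-T: sum_i (w_i / W) (x_i / mu) ln(x_i / mu).

    Requires strictly positive values, which the 5-point normalization floor implies for CC.
    """
    total_w = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_w
    return sum((w / total_w) * (v / mean) * math.log(v / mean)
               for v, w in zip(values, weights))

# Hypothetical CC values and populations (in millions) for three economies.
cc = [30.0, 55.0, 80.0]
population = [200.0, 100.0, 50.0]
print(weighted_gini(cc, population), theil_t(cc, population))
```

The double sum is quadratic in the number of economies, which is negligible for a panel of this size.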
Inputs: CC per economy, Population per economy
Sources: Gini (1912), Theil (1967), Lorenz (1905)
Assumptions
- Each economy's CC is treated as the cognitive-capital level of its people (no within-country variation in v1).
Caveats
- Measures inequality BETWEEN economies, not within them.
- Sensitive to the set of economies included.
CSI · Cognitive Separation Index
Measures: How far the cognitively richest pull ahead of the rest. Unit: ratio (e.g. 2.9×).
CSI is the ratio of the population-weighted mean CC of the top decile of people to that of the bottom half:
CSI = mean_CC(top 10% of people) / mean_CC(bottom 50% of people)
Economies are sorted by CC; population is accumulated to form the slices (a partial country is split proportionally), and the top and bottom slices are drawn from disjoint ends so no person is double-counted. We also report top-1% ÷ bottom-50% and top-10% ÷ bottom-40%.
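The slicing procedure can be sketched as follows. This is one possible implementation under the description above, with hypothetical function names and made-up numbers; it treats the population as a continuous line sorted by CC and takes the overlap of each economy with the requested slice, which is what splits a partial country proportionally.

```python
def slice_mean(sorted_rows, total_pop, start_frac, end_frac):
    """Population-weighted mean CC of people between two cumulative-population fractions.

    sorted_rows holds (cc, population) tuples in ascending CC order.
    """
    lo, hi = start_frac * total_pop, end_frac * total_pop
    cumulative = 0.0
    weighted_sum = 0.0
    covered = 0.0
    for cc, pop in sorted_rows:
        # People of this economy who fall inside the slice (proportional split).
        overlap = min(cumulative + pop, hi) - max(cumulative, lo)
        if overlap > 0:
            weighted_sum += cc * overlap
            covered += overlap
        cumulative += pop
    return weighted_sum / covered

def csi(rows, top=0.10, bottom=0.50):
    """Ratio of mean CC in the top share of people to mean CC in the bottom share."""
    ordered = sorted(rows)
    total_pop = sum(pop for _, pop in ordered)
    top_mean = slice_mean(ordered, total_pop, 1.0 - top, 1.0)
    bottom_mean = slice_mean(ordered, total_pop, 0.0, bottom)
    return top_mean / bottom_mean

# Hypothetical panel: (CC, population). The middle economy straddles the bottom-50% cut.
panel = [(40.0, 45.0), (20.0, 45.0), (80.0, 10.0)]
print(csi(panel))
```

Because the two slices are taken from opposite ends of the sorted population, they stay disjoint whenever the top and bottom shares sum to at most 100%. The other reported ratios correspond to csi(panel, top=0.01, bottom=0.50) and csi(panel, top=0.10, bottom=0.40).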
Inputs: CC per economy, Population per economy
Sources: Top/bottom share ratios (income-distribution literature)
Assumptions
- People within an economy share its CC.
Caveats
- A ratio, not a 0–1 index.
- Cross-country only.
CV · Cognitive Velocity
Measures: A forward heuristic for an economy's rate of cognitive-infrastructure improvement. Unit: %/yr (model heuristic).
CV is a forward heuristic, not a measured trend (v1 has no historical panel). It scales a baseline rate by momentum and catch-up headroom:
CV = 2.0 · (0.5 + 1.6·CM/100) · (1 + 0.9·(1 − CC/100))
Higher Cognitive Momentum (CM) raises the rate; lower CC adds catch-up headroom. The code's own docstring states this is not a measured time series.
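A direct transcription of the CV formula into Python, with a worked value, is given below. The function name and the example inputs are hypothetical; the constants are those in the formula above.

```python
def cognitive_velocity(cm, cc, base_rate=2.0):
    """CV heuristic in %/yr from Cognitive Momentum (cm) and CC, both on a 0-100 scale."""
    momentum_factor = 0.5 + 1.6 * cm / 100.0
    headroom_factor = 1.0 + 0.9 * (1.0 - cc / 100.0)
    return base_rate * momentum_factor * headroom_factor

# Hypothetical mid-panel economy: CM = 50, CC = 50.
# 2.0 * (0.5 + 0.8) * (1 + 0.45) = 2.0 * 1.3 * 1.45 = 3.77
print(cognitive_velocity(50.0, 50.0))
```

Holding CM fixed, a lower CC always yields a higher CV, which is the catch-up headroom effect described above.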
Inputs: Cognitive Momentum (CM), CC (for headroom)
Sources: Internal heuristic
Assumptions
- Momentum and headroom proxy near-term improvement in lieu of a fitted trend.
Caveats
- Not estimated from history; no standard error.
- A structural heuristic, not a forecast.
CM · Cognitive Momentum
Measures: Forward signal that improvement is likely to be sustained or accelerate. Scale: 0–100.
CM is the mean of four normalized forward signals:
- Investment intensity — private AI investment ÷ GDP (log-scaled, normalized)
- Education pipeline — tertiary enrollment, STEM share, AI-skill penetration
- Research intensity — R&D/GDP, researchers per million
- Infrastructure pipeline — data-center capacity, cloud regions
CM = mean(invest, pipeline, research, infrastructure)
Inputs: AI investment ÷ GDP, Tertiary / STEM / AI-skill, R&D/GDP, researchers, Data-center, cloud regions
Sources: Stanford AI Index, OECD, UNESCO
Assumptions
- Equal weight across the four signals.
Caveats
- A present-day composite of forward-leaning inputs, not a fitted acceleration.
AIR · AI Readiness Index
Measures: Institutional and infrastructural preparedness to deploy AI. Scale: 0–100.
AIR averages a governance/readiness block with an infrastructure block:
- Governance/readiness = weighted mean of gov AI readiness (×1.5), AI governance framework, data governance, AI talent.
- Infrastructure = mean of normalized Top500, data-center MW, semiconductor capability.
AIR = mean(governance_readiness, infrastructure)
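The two-block structure can be sketched as below. Only the ×1.5 weight on government AI readiness is stated above; the sketch assumes the other three governance inputs carry weight 1.0. The function and parameter names are hypothetical, all inputs are assumed present and already normalized, and missing-value handling is omitted.

```python
def ai_readiness(gov_readiness, ai_governance, data_governance, ai_talent,
                 top500, datacenter_mw, semiconductor):
    """AIR as the equal-weight mean of a governance block and an infrastructure block."""
    # Assumption: weights other than the stated 1.5 are 1.0.
    governance = (1.5 * gov_readiness + ai_governance + data_governance + ai_talent) / 4.5
    infrastructure = (top500 + datacenter_mw + semiconductor) / 3.0
    return (governance + infrastructure) / 2.0

# Hypothetical normalized inputs: governance block = 70, infrastructure block = 50.
print(ai_readiness(90.0, 60.0, 60.0, 60.0, 40.0, 50.0, 60.0))
```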
Inputs: Gov AI readiness, AI governance framework, Data governance, AI talent, Top500 / data-center / semiconductor
Sources: Oxford Insights, OECD.AI, Top500
Assumptions
- Readiness and compute infrastructure weighted equally at the block level.
Caveats
- 'Governance framework' scores maturity, not restrictiveness.
AIX · Augmentation Index
Measures: Actual utilization of AI (augmentation depth) rather than mere access. Scale: 0–100.
AIX is a weighted mean of utilization proxies (NOT an O*NET task pipeline — that is used for OFI):
AIX = wmean( AI-skill penetration ×2, AI talent ×1, national AI capacity ×1 )
Weighting AI-skill penetration most heavily reflects that AIX is about use, not readiness or access.
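As a small worked example of the 2:1:1 weighting, the following Python sketch computes AIX from three normalized inputs. The function name and the sample values are hypothetical.

```python
def augmentation_index(ai_skill_penetration, ai_talent, national_ai_capacity):
    """AIX as a weighted mean with weights 2, 1, 1 over normalized inputs."""
    weighted_sum = 2.0 * ai_skill_penetration + 1.0 * ai_talent + 1.0 * national_ai_capacity
    return weighted_sum / 4.0

# Hypothetical inputs: (2*80 + 60 + 40) / 4 = 65
print(augmentation_index(80.0, 60.0, 40.0))
```

With these weights, AI-skill penetration alone accounts for half of the index.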
Inputs: AI-skill penetration, AI talent, National AI capacity score
Sources: LinkedIn / Stanford AI Index, Tortoise
Assumptions
- Workforce AI-skill penetration is the best available proxy for utilization depth in v1.
Caveats
- A proxy for utilization; direct per-firm adoption telemetry is a roadmap item.
- Distinct from AIR (readiness) by construction.
CRI · Cognitive Resilience Index
Measures: Resilience to cognitive concentration, i.e. self-reliance plus a broad capability base. Scale: 0–100.
CRI blends domestic capability with breadth:
CRI = 0.6 · domestic_capability + 0.4 · breadth
- domestic_capability = mean normalized [national AI capacity, notable models, semiconductor, AI talent]
- breadth = mean normalized [tertiary enrollment, human capital, data governance, internet]
Higher CRI = more able to rely on home-grown capability across a broad base, hence less fragile to externally-concentrated frontier infrastructure. (There is no Herfindahl-Hirschman Index (HHI) concentration term and no shared Monte-Carlo machinery; CRI is this deterministic blend.)
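The blend can be written out as a short Python sketch. It assumes all eight inputs are present and already normalized; the function name and the sample values are hypothetical.

```python
def cognitive_resilience(domestic_inputs, breadth_inputs):
    """CRI = 0.6 * mean(domestic capability inputs) + 0.4 * mean(breadth inputs)."""
    domestic_capability = sum(domestic_inputs) / len(domestic_inputs)
    breadth = sum(breadth_inputs) / len(breadth_inputs)
    return 0.6 * domestic_capability + 0.4 * breadth

# Hypothetical normalized inputs.
domestic = [80.0, 60.0, 40.0, 60.0]  # national AI capacity, notable models, semiconductor, AI talent
breadth = [70.0, 90.0, 50.0, 70.0]   # tertiary enrollment, human capital, data governance, internet
# 0.6 * 60 + 0.4 * 70 = 64
print(cognitive_resilience(domestic, breadth))
```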
Inputs: National AI capacity, notable models, semiconductor, AI talent, Tertiary, human capital, data governance, internet
Sources: Stanford AI Index, World Bank HCI, OECD
Assumptions
- Self-reliance (60%) and breadth (40%) jointly proxy resilience.
Caveats
- A structural proxy; does not model specific supply-chain or vendor dependencies.
CEV · Cognitive Escape Velocity
Measures: An economy's trajectory relative to the frontier (closing the relative gap, holding, or slipping). Output: category + signed %.
CEV classifies on proportional growth pace vs. the frontier (the highest-CC economy):
ratio = CV(country) / CV(frontier)
- Frontier — the highest-CC economy itself
- Accelerating — ratio ≥ 1.05
- Advancing — ratio ≥ 0.97
- Maintaining — ratio ≥ 0.82
- Falling behind — ratio < 0.82
The reported score is (ratio − 1)·100 — percent pace relative to the frontier. Because CV is a heuristic, CEV is a relative-trajectory classification, not a probability.
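The classification can be sketched as a threshold cascade evaluated from the highest band down, so each label covers the interval between its threshold and the one above it. The function name, the is_frontier flag, and the example CV values are hypothetical.

```python
def classify_cev(cv_country, cv_frontier, is_frontier=False):
    """Return (category, score) where score = (ratio - 1) * 100."""
    if is_frontier:
        # The frontier's ratio to itself is 1, so its score is 0.
        return "Frontier", 0.0
    ratio = cv_country / cv_frontier
    score = (ratio - 1.0) * 100.0
    if ratio >= 1.05:
        category = "Accelerating"
    elif ratio >= 0.97:
        category = "Advancing"
    elif ratio >= 0.82:
        category = "Maintaining"
    else:
        category = "Falling behind"
    return category, score

# Hypothetical CV values in %/yr against a frontier CV of 4.0.
for cv in (4.4, 4.0, 3.6, 3.0):
    print(cv, classify_cev(cv, 4.0))
```

A negative score means the economy's heuristic pace is below the frontier's, even if it falls in the Advancing band (ratio between 0.97 and 1.0).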
Inputs: CV(country), CV(frontier)
Sources: Internal (derived from CV)
Assumptions
- An economy's relative position improves only if its CV (%/yr) exceeds the frontier's.
Caveats
- Inherits CV's heuristic nature; not fitted to history.
OFI · Occupational Future Index
Measures: Task-transformation pressure on an occupation from AI, not job loss. Scale: 0–100.
For each occupation:
pressure = exposure · (0.4·adoption_risk + 0.3·economic_pressure + 0.3·agentic_exposure)
friction = (1 − 0.5·regulatory_friction) · (1 − 0.45·trust_requirement)
OFI = 100 · clamp₀¹(pressure · friction)
Task shares (automate / augment / human) come from the occupation panel and are projected to 2030/35/40 via Bass diffusion of the agentic-automation ceiling, damped by regulatory/trust friction. Scores are grounded in published exposure frameworks.
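The OFI score and the diffusion curve it relies on can be sketched in Python as follows. The sketch assumes every input lies in [0, 1]. The Bass function is the standard cumulative-adoption curve; the coefficients p and q used in the example are placeholder values, and how the platform maps this curve onto the agentic-automation ceiling and the friction damping is not reproduced here.

```python
import math

def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))

def occupational_future_index(exposure, adoption_risk, economic_pressure,
                              agentic_exposure, regulatory_friction, trust_requirement):
    """OFI = 100 * clamp01(pressure * friction), with all inputs assumed in [0, 1]."""
    pressure = exposure * (0.4 * adoption_risk
                           + 0.3 * economic_pressure
                           + 0.3 * agentic_exposure)
    friction = (1.0 - 0.5 * regulatory_friction) * (1.0 - 0.45 * trust_requirement)
    return 100.0 * clamp01(pressure * friction)

def bass_cumulative(t, p, q):
    """Standard Bass cumulative adoption fraction F(t).

    p is the innovation coefficient and q the imitation coefficient (p > 0).
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)

# Hypothetical occupation: pressure = 0.8 * 0.5 = 0.4, friction = 0.9 * 0.82 = 0.738.
print(occupational_future_index(0.8, 0.5, 0.5, 0.5, 0.2, 0.4))

# Placeholder Bass coefficients (assumed, not the platform's), evaluated 5, 10, and 15 years out.
print([bass_cumulative(t, p=0.03, q=0.38) for t in (5, 10, 15)])
```

Because both friction factors are at most 1, higher regulatory friction or trust requirements can only lower the score, never raise it.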
Inputs: exposure, adoption_risk, economic_pressure, agentic_exposure, regulatory_friction, trust_requirement, task_shares
Sources: Felten–Raj–Seamans AIOE, Eloundou et al. 2023 (GPTs are GPTs), OECD AI exposure, O*NET
Assumptions
- Exposure measures task overlap with AI capability, not realized automation.
- Bass diffusion governs adoption of the agentic ceiling.
Caveats
- Models task transformation, NOT employment outcomes.
- Occupation-level, US-anchored employment/wage context.
Limitations & ethics
Relative, not absolute. CC, AIR, AIX, CRI and the pillars are min-max normalized within the current panel. A country's score is its standing relative to the other 47 economies, not an absolute physical quantity. Adding or removing economies can shift scores. CDI (geometric mean of normalized dimensions, 0–1) is likewise panel-relative.
Equal weights are a choice. Pillar weights within CC are an explicit equal-weight default; member-indicator weights within a pillar are not uniform and are shown per pillar. We do not claim weights are empirically optimal or derived from PCA/expert elicitation. Sensitivity to weighting is a known limitation; a published weight-sensitivity analysis is on the roadmap, not in v1.
Temporal indices are heuristics. CV, CM and CEV are computed from a single cross-section. They encode plausible forward structure (momentum, headroom, pace vs. the frontier) but are not fitted to historical data and carry no standard errors. Historical trajectories shown in charts are explicitly labelled illustrative reconstructions.
Scenarios are not probabilities. The 8 scenarios are parameterized assumption-sets. The Monte-Carlo bands quantify parameter uncertainty within a scenario; they are not calibrated probabilities of real-world futures, and the scenarios are not assigned likelihoods.
Occupational estimates are structural, not predictive. OFI and its task-share projections are grounded in published AI-exposure frameworks (AIOE; Eloundou et al. 2023; OECD), but exposure is not job loss. Adoption depends on cost, workflow, regulation, and trust — which we model coarsely.
Aggregation hides within-country inequality. CII/CSI use cross-country, population-weighted distributions of CC. They capture inequality between economies, not within them (v1 lacks subnational microdata).
Ethics. The platform exists to make cognitive-infrastructure inequality visible so it can be addressed. The "cognitive aristocracy" framing names a risk to monitor, not a desired or predicted outcome. Personal assessments are reflective instruments over self-reported inputs, computed in-session with no storage — never a judgement of a person.