ACI · Decision Track · DT-006 · Domain D-3 · D-4 · Open Log
HDCI Pilot · Pohjois-Savo
Empirical validation of the Health Data Continuity Index — preparatory phase
Version 0.7 · 20 April 2026 · Preparatory · Basis: WP-016
Tracking window: 2026–2027 · Updated upon licensing milestones
v0.7: HDCI v1 three-component core (RKI+IAI+RVI). DCI and PAI removed from composite. H4 topological hypothesis added. §8 system behaviour framing added. Pre-Findata licensing document: no registry data has been accessed. The document serves as an internal planning instrument and as the basis for the Findata licence application and stakeholder engagement. It will be updated to v0.8 upon licence submission, v0.9 upon data access, and v1.0 upon preliminary results.
Core rationale: WP-016 establishes HDCI as a framework for measuring health data integration failure. Its primary vulnerability is that weights are theoretical and the calculation remains unvalidated against outcome data. DT-006 is the empirical response: one welfare area, existing registry data, Findata licensing pathway. The objective is not to fix the system — it is to direct scarce public health resources toward the integration gaps where they produce the most measurable effect.

1 · Pilot Rationale

Finnish health registries are among the most comprehensive in the world. The integration failure documented in WP-016 is not a data absence problem — it is an allocation problem. HDCI is a diagnostic instrument for identifying where integration failure is highest, enabling resource allocation decisions under conditions of fiscal constraint.

Pohjois-Savo is selected as the pilot welfare area for three structural reasons:

Reason | Justification
Registry coverage | Kuopio University Hospital (KYS) serves as the tertiary referral centre for the region. THL and Kela registry coverage is comprehensive and documented.
Multi-morbidity profile | Pohjois-Savo has one of Finland's highest age-standardised morbidity indices (Sotkanet ind. 5642). Elevated multi-morbidity makes integration failure structurally more detectable.
Institutional readiness | Pohjois-Savo welfare area is operational. KYS has existing research infrastructure and THL collaboration pathways.

2 · Data Access Pathway

All required data exists within Finnish health registries. Patient-level access requires formal secondary use licensing through Findata (the Health and Social Data Permit Authority).

Required datasets

Dataset | Controller | HDCI use
THL Avohilmo | THL | DCI (expected entries), IAI (specialty contacts), PAI (preventive codes), RVI (finding timestamps)
THL Terveys-Hilmo | THL | IAI (care episodes), RVI (procedure timestamps), validation: preventable hospitalisations
Kela lääkekorvausrekisteri (medicine reimbursement register) | Kela | RKI (prescription data, prescriber source)
Kanta care plan register | Kela / THL | RKI (shared care plan entry verification)
Sotkanet ind. 5642 | THL | DCI expected entry baseline (open data, no licence required)

Licensing timeline

Step | Description | Estimated duration
1. Pre-application consultation | Contact Findata to confirm dataset availability and application requirements | 2–4 weeks
2. Research plan finalisation | Cohort definition, variables, linkage method, data protection protocol | Concurrent with step 1
3. Submit Findata application | Formal secondary use licence with THL and Kela as data controllers | 1–2 weeks preparation
4. Findata processing | Statutory processing time | 2–3 months
5. Data delivery | Secure research environment access (Kapseli or equivalent) | 1–2 months post-approval
Total estimated | | 4–6 months

Current status: Step 1 not yet initiated. This document is preparatory material for the pre-application consultation.

3 · Cohort Definition

Inclusion criteria

Criterion | Definition
Residence | Pohjois-Savo welfare area municipalities
Multi-morbidity | ≥2 chronic conditions from THL chronic disease classification (diabetes, cardiovascular, respiratory, mental health, musculoskeletal, neurological)
Age | ≥18 years at index date
Observation period | 1 January 2024 – 31 December 2025 (24 months)

Exclusion criteria

Criterion | Rationale
Palliative care (Z51.5) | Integration needs differ structurally; distinct care coordination protocols apply
Active oncology (C00–C97, treatment within 6 months) | Protocolised through oncology services; integration failure manifests differently
End-stage renal disease (N18.5–N18.6 with dialysis) | Care concentrated in nephrology; generalisable integration metrics less applicable
Long-term care facility (>90 days) | Data capture patterns differ; coordination occurs within facility

Expected cohort size: 30,000–40,000 patients (based on ~250,000 population, ~15–20% adult multi-morbidity prevalence).
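
The inclusion and exclusion criteria above can be read as a single eligibility filter. The following is a minimal sketch only: the record layout and field names (resident_pohjois_savo, age_at_index, chronic_groups, exclusion_flags) are hypothetical, and it assumes the diagnosis-code and time-window logic behind each exclusion has already been resolved into simple flags.

```python
def is_eligible(patient):
    """Apply the inclusion and exclusion criteria to one patient record.

    Field names are illustrative assumptions, not registry column names.
    """
    return (
        patient["resident_pohjois_savo"]                 # residence criterion
        and patient["age_at_index"] >= 18                # adult at index date
        and len(set(patient["chronic_groups"])) >= 2     # multi-morbidity: >=2 condition groups
        and not patient["exclusion_flags"]               # no exclusion criterion applies
    )


# Invented example records, for illustration only.
sample = [
    {"id": 1, "resident_pohjois_savo": True, "age_at_index": 67,
     "chronic_groups": ["diabetes", "cardiovascular"], "exclusion_flags": []},
    {"id": 2, "resident_pohjois_savo": True, "age_at_index": 45,
     "chronic_groups": ["respiratory"], "exclusion_flags": []},
    {"id": 3, "resident_pohjois_savo": True, "age_at_index": 72,
     "chronic_groups": ["diabetes", "neurological"], "exclusion_flags": ["long_term_care"]},
    {"id": 4, "resident_pohjois_savo": True, "age_at_index": 17,
     "chronic_groups": ["diabetes", "mental_health"], "exclusion_flags": []},
]

cohort_ids = [p["id"] for p in sample if is_eligible(p)]
```

Only the first record passes: the others fail on multi-morbidity, an exclusion flag, and age respectively.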

4 · Calculation Protocol

DCI — Data Capture Index

DCI_i = actual_Kanta_entries_i / expected_Kanta_entries_i
expected = f(morbidity_index, age, sex)
Component value: mean DCI_i across cohort

IAI — Integration Achievement Index

IAI = Σ(multi_specialty_episodes) / Σ(total_episodes)
Episode window: 30 days from index contact
Multi-specialty: ≥2 distinct THL specialty codes within episode
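
As an illustration of the IAI definition above, the sketch below groups contacts into episodes and counts the multi-specialty share. It rests on assumptions not stated in the protocol: contacts arrive as (patient_id, date, specialty_code) tuples, a contact later than window_days after the index contact opens a new episode, and the specialty codes shown are placeholders.

```python
from datetime import date


def build_episodes(contacts, window_days=30):
    """Group contacts into episodes; each episode is the set of specialty codes seen."""
    by_patient = {}
    for patient_id, day, specialty in contacts:
        by_patient.setdefault(patient_id, []).append((day, specialty))

    episodes = []
    for visits in by_patient.values():
        visits.sort()
        index_day = None
        for day, specialty in visits:
            # Assumption: a contact outside the window starts a new episode.
            if index_day is None or (day - index_day).days > window_days:
                index_day = day
                episodes.append(set())
            episodes[-1].add(specialty)
    return episodes


def iai(contacts, window_days=30):
    """Share of episodes with at least two distinct specialty codes."""
    episodes = build_episodes(contacts, window_days)
    if not episodes:
        return float("nan")
    multi = sum(1 for specialties in episodes if len(specialties) >= 2)
    return multi / len(episodes)


# Invented contacts; "10" and "20" are placeholder specialty codes.
sample_contacts = [
    ("p1", date(2024, 1, 1), "10"),
    ("p1", date(2024, 1, 15), "20"),
    ("p1", date(2024, 3, 1), "10"),
    ("p2", date(2024, 2, 1), "10"),
]

iai_30 = iai(sample_contacts)       # three episodes, one multi-specialty
iai_90 = iai(sample_contacts, 90)   # two episodes, one multi-specialty
```

The window is a parameter so that the same function can be rerun at other window lengths.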

RVI — Response Velocity Index

RVI_i = 1 − (latency_days_i / 365)
latency = first_care_action_date − first_recorded_finding_date
Capped: 0 (latency ≥365 days) to 1 (same-day response)
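
A minimal sketch of the per-patient RVI formula with its cap, assuming both dates are available as calendar dates. Clipping a negative latency (an action recorded before the finding) to 1 is an assumption of this sketch; the protocol does not say how such records are handled.

```python
from datetime import date, timedelta


def rvi(first_finding, first_action, horizon_days=365):
    """RVI_i = 1 - latency_days / 365, clipped to the range [0, 1]."""
    latency_days = (first_action - first_finding).days
    return max(0.0, min(1.0, 1 - latency_days / horizon_days))


finding = date(2024, 1, 10)
same_day = rvi(finding, finding)                                # same-day response
after_73_days = rvi(finding, finding + timedelta(days=73))      # 1 - 73/365
over_a_year = rvi(finding, finding + timedelta(days=400))       # capped at 0
```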

RKI — Risk Accumulation Index

RKI = N_risk / N_poly
N_poly: patients with ≥5 medications (ATC level 5) in any 6-month window
N_risk: N_poly with ≥2 prescriber sources AND no shared Kanta care plan within 30 days
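
The RKI ratio can be sketched as two nested filters. This example assumes the window logic has already been applied upstream, so each record carries three hypothetical precomputed fields: max_meds_6m (largest count of distinct ATC level 5 medications in any 6-month window), prescriber_sources, and care_plan_within_30d.

```python
def rki(patients):
    """RKI = N_risk / N_poly over a list of patient records (hypothetical fields)."""
    poly = [p for p in patients if p["max_meds_6m"] >= 5]
    risk = [
        p for p in poly
        if p["prescriber_sources"] >= 2 and not p["care_plan_within_30d"]
    ]
    return len(risk) / len(poly) if poly else float("nan")


# Invented records, for illustration only.
sample_patients = [
    {"max_meds_6m": 6, "prescriber_sources": 2, "care_plan_within_30d": False},  # at risk
    {"max_meds_6m": 5, "prescriber_sources": 3, "care_plan_within_30d": True},   # plan present
    {"max_meds_6m": 7, "prescriber_sources": 1, "care_plan_within_30d": False},  # one prescriber
    {"max_meds_6m": 3, "prescriber_sources": 2, "care_plan_within_30d": False},  # not polypharmacy
]

rki_raw = rki(sample_patients)  # one at-risk patient among three polypharmacy patients
```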

PAI — Prevention Activity Index

PAI = Σ(preventive_contacts) / Σ(total_contacts)
Preventive: screenings, risk assessments, lifestyle counselling (THL SPAT codes)

HDCI v1 — Core Index (v0.7)

HDCI_v1 = 0.40·RKI_adjusted + 0.35·IAI + 0.25·RVI

Three-component core. DCI and PAI removed from composite (see §6, Measurement Validity and Identifiability, for rationale).
RKI_adjusted = RKI_raw / Kanta_care_plan_adoption_rate
Weights: Bayesian priors subject to outcome-constrained re-estimation.
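
To make the composite concrete, the sketch below applies the v0.7 formula literally with the prior weights. The input values are invented. Two points are assumptions of the sketch rather than statements of the protocol: RKI_adjusted is not clipped (it can exceed 1 when adoption is low), and RKI is not inverted before weighting, since the source formula does not specify either step.

```python
WEIGHTS = {"rki_adjusted": 0.40, "iai": 0.35, "rvi": 0.25}  # prior weights from the formula


def rki_adjusted(rki_raw, adoption_rate):
    """RKI_adjusted = RKI_raw / Kanta care plan adoption rate."""
    if adoption_rate <= 0:
        raise ValueError("adoption_rate must be positive")
    return rki_raw / adoption_rate


def hdci_v1(rki_raw, adoption_rate, iai, rvi):
    """Weighted sum of the three core components, applied as written."""
    components = {
        "rki_adjusted": rki_adjusted(rki_raw, adoption_rate),
        "iai": iai,
        "rvi": rvi,
    }
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Invented inputs: 0.40*0.25 + 0.35*0.50 + 0.25*0.60
example = hdci_v1(rki_raw=0.20, adoption_rate=0.80, iai=0.50, rvi=0.60)
```

Keeping the weights in one mapping makes it straightforward to swap in re-estimated values once outcome data are available.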

DCI — Diagnostic side-indicator (not in composite)

DCI = actual_Kanta_entries / expected_Kanta_entries
expected = f(morbidity_index, age, sex, Avohilmo_contact_rate)
Reported separately as documentation completeness indicator. Excluded from HDCI_v1 due to endogeneity.

PAI — Dashboard metadata (not in composite)

PAI = Σ(preventive_contacts) / Σ(total_contacts)
Input-effort indicator only. Reported as system activity metadata. Not a performance signal.

5 · Validation Hypotheses

Hypothesis | Outcome measure | Source
H1: Higher HDCI_v1 predicts lower preventable hospitalisation rate | Ambulatory care sensitive conditions / 1,000 patient-years (THL definition) | THL Terveys-Hilmo
H2: Higher RKI_adjusted predicts higher adverse medication event rate | Adverse medication events / 1,000 patient-years (ICD-10 Y40–Y59, T36–T50) | THL Terveys-Hilmo
H3: HDCI_v1 varies across municipalities within Pohjois-Savo | Municipality-level HDCI vs. morbidity index | Calculated HDCI
H4 (Topological): RKI–IAI–RVI form stable clusters in state space; the high-RKI/high-IAI/low-RVI cluster predicts elevated harm outcomes not captured by conventional metrics | Cluster membership (k-means / latent class analysis) vs H1 and H2 outcomes AND vs conventional metrics (cost per patient, queue length, morbidity index). H4 predicts: cluster 3 elevated on H1/H2 but not distinguishable from cluster 2 on conventional metrics. | THL Terveys-Hilmo · Kela · Calculated HDCI
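
The H1 and H2 outcome measures are both expressed per 1,000 patient-years. A minimal sketch of that conversion follows, assuming follow-up is recorded in days and using 365.25 days per year; both the convention and the numbers are illustrative assumptions.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def rate_per_1000_patient_years(events, patient_days):
    """Event count divided by total follow-up time, scaled to 1,000 patient-years."""
    patient_years = patient_days / DAYS_PER_YEAR
    return 1000 * events / patient_years


# Invented example: 12 events over 400 patient-years of follow-up.
example_rate = rate_per_1000_patient_years(events=12, patient_days=146100)
```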

H4 is the primary research contribution of this pilot. If supported, it demonstrates that HDCI reveals a high-risk patient state that conventional metrics systematically miss. If unsupported, it constrains the interpretation of HDCI to a measurement instrument without topological properties — which remains valid and useful.

Statistical methods: H1–H3: multilevel regression, patient-level predictors, outcome measures as dependent variables, controlling for age, sex, morbidity index. H4: k-means clustering (k=3) and latent class analysis in RKI–IAI–RVI space; cluster outcome comparison via ANOVA and logistic regression.
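
For orientation, the k-means step of H4 can be sketched in a few lines of standard-library Python. This is a plain Lloyd iteration with fixed starting centroids, meant only to show the mechanics; the pilot would presumably use an established statistical package, and latent class analysis is not shown. The six (RKI, IAI, RVI) points are invented.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    labels = [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]
    return centroids, labels


# Invented (RKI, IAI, RVI) points forming three visibly separate groups.
points = [
    (0.10, 0.20, 0.90), (0.15, 0.25, 0.85),   # low RKI, low IAI, high RVI
    (0.20, 0.80, 0.80), (0.25, 0.75, 0.85),   # low RKI, high IAI, high RVI
    (0.80, 0.80, 0.20), (0.85, 0.75, 0.15),   # high RKI, high IAI, low RVI
]
start = [points[0], points[2], points[4]]      # fixed start keeps the run deterministic
centroids, labels = kmeans(points, start)
```

The last group corresponds to the high-RKI/high-IAI/low-RVI state that H4 singles out; comparing outcomes across the labels is a separate step.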

6 · Analogy: WEM → HIM

DT-006 pilot results will populate a conceptual dashboard structurally analogous to the Winter Endurance Monitor (WEM). The Health Integration Monitor (HIM) is a design target for post-pilot development.

WEM component | HIM equivalent
EPP: system endurance pressure | HDCI composite: integration pressure index
SP: stress persistence fraction | RKI: polypharmacy risk fraction
FS(p): probabilistic firm capacity | IAI: integration achievement fraction
NVE hydro_RF: external buffer | Morbidity index: external demand pressure
TRR: transmission realisation rate | RVI: response velocity (institutional throughput)

6 · Measurement Validity and Identifiability — v0.6

The following validity threats are identified in advance of data access. Each is addressed in the pilot design. This section is a prerequisite for a credible Findata application.

DCI — Endogeneity and exogenous instrument

DCI has a structural endogeneity problem: morbidity drives care contacts, which drive Kanta entries — the same causal structure produces both predictor and outcome. Without correction, DCI risks measuring documentation intensity rather than data capture quality. The pilot adds Avohilmo contact rate as an exogenous instrument:

expected_entries = f(morbidity_index, age, sex, Avohilmo_contact_rate)

This separates documentation completeness (Kanta entries per contact) from care frequency (contacts per patient). The expected-entry model is fitted by negative binomial regression, with sensitivity analyses against Poisson regression and stratum-mean estimates.
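
The stratum-mean variant named above is the simplest of the three to illustrate. In this sketch the expected entry count for a patient is the mean entry count in that patient's stratum, and DCI is then averaged per area. The area labels, strata, and counts are invented, and the regression models are not shown.

```python
from collections import defaultdict
from statistics import mean


def dci_by_area(records):
    """records: (area, stratum, actual_entries). Returns mean DCI_i per area."""
    by_stratum = defaultdict(list)
    for _, stratum, entries in records:
        by_stratum[stratum].append(entries)
    # Expected entries: cohort-wide mean within each stratum.
    expected = {stratum: mean(values) for stratum, values in by_stratum.items()}

    by_area = defaultdict(list)
    for area, stratum, entries in records:
        by_area[area].append(entries / expected[stratum])
    return {area: mean(values) for area, values in by_area.items()}


# Invented records: two areas, two strata (e.g. age band and sex combined).
records = [
    ("A", "s1", 10), ("A", "s1", 14), ("B", "s1", 6),
    ("B", "s2", 4), ("A", "s2", 6),
]
dci = dci_by_area(records)  # stratum means are 10 (s1) and 5 (s2)
```

When expected values are estimated from the same cohort, the mean DCI within each stratum is 1 by construction, so the indicator is informative only when compared across sub-groups such as areas.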

IAI — Structural proxy, not clinical truth-metric

IAI is a structural proxy for integration, not a clinically invariant measure. The 30-day episode window is a heuristic that treats all care episodes identically, which does not hold clinically across diabetes, orthopaedics, and psychiatry. IAI measures whether the system produces multi-specialty contact within a defined window: a system-level observation, not a patient-level truth claim. Sensitivity analysis uses 15-, 30-, and 90-day windows. If rankings are unstable across windows, disease-stratified analysis is required.
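
One way to operationalise the ranking-stability check is sketched below. It assumes the units being ranked are municipalities and that IAI has already been computed per unit for each window; the labels and values are invented, and requiring identical orderings is a deliberately strict, hypothetical criterion.

```python
def ranking(scores):
    """Unit labels ordered from highest to lowest IAI."""
    return sorted(scores, key=scores.get, reverse=True)


def rankings_stable(iai_by_window):
    """True if every window length produces the same ordering of units."""
    orders = [ranking(scores) for scores in iai_by_window.values()]
    return all(order == orders[0] for order in orders)


# Invented IAI values keyed by window length in days.
stable_case = {
    15: {"A": 0.30, "B": 0.20, "C": 0.10},
    30: {"A": 0.42, "B": 0.35, "C": 0.20},
    90: {"A": 0.60, "B": 0.50, "C": 0.40},
}
unstable_case = {
    15: {"A": 0.30, "B": 0.20},
    30: {"A": 0.35, "B": 0.40},
    90: {"A": 0.55, "B": 0.60},
}

needs_stratification = not rankings_stable(unstable_case)
```

A rank correlation with a tolerance would be a softer alternative to exact equality.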

RVI — Institutional throughput, not clinical appropriateness

RVI measures institutional response velocity — how quickly the system converts a recorded finding into a recorded action. It does not measure clinical appropriateness. "First finding" is not neutral: it depends on testing frequency and coding practices. RVI is a system-level throughput indicator; this scope must be stated explicitly in all publications.

RKI — Adjusted for Kanta adoption rate (v0.6 revision)

RKI is the strongest HDCI component. v0.6 introduces an explicit adjusted measure:

RKI_raw = N_risk / N_poly
RKI_adjusted = RKI_raw / Kanta_care_plan_adoption_rate

RKI_adjusted isolates coordination failure from variation in documentation infrastructure. The composite HDCI uses RKI_adjusted; both the raw and the adjusted values are reported, and RKI_adjusted is cross-validated against H2 (adverse medication events).

Weight revision — v0.6

HDCI = 0.15·DCI + 0.30·IAI + 0.25·RVI + 0.25·RKI_adjusted + 0.05·PAI

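RKI: 0.20 → 0.25 (strongest component, direct outcome validation). PAI: 0.10 → 0.05 (input-effort indicator only; does not measure prevention effectiveness). All weights remain Bayesian priors subject to outcome-constrained re-estimation. This five-component v0.6 weighting is superseded in v0.7 by the three-component HDCI_v1 defined in §4, which removes DCI and PAI from the composite.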

HDCI as composite proxy index

HDCI does not model "integration" as a unitary latent variable. Integration is multidimensional — information flow, responsibility coordination, and clinical decision consistency are distinct constructs. HDCI is a composite proxy index: a diagnostic signal about where integration-relevant observables are weakest, not a measurement of an underlying integration quantity.

Causal structure — DAG

EXOGENOUS
  Morbidity · Age/Sex · Avohilmo contact rate · Kanta adoption rate

  Morbidity + Age/Sex → Care Contacts ← Avohilmo rate (instrument)
  Care Contacts → Kanta entries (DCI) · Multi-spec episodes (IAI)
                → First finding → First action (RVI)
  Prescriber multiplicity + Kanta adoption → RKI_raw → RKI_adjusted
  Preventive contacts / total → PAI

  HDCI composite → H1 Preventable hospitalisations
  RKI_adjusted  → H2 Adverse medication events
  HDCI components → H3 Inter-municipality variation
Findata note: Each variable is justified by direct correspondence to an HDCI component or validation hypothesis. No exploratory variables requested. Data minimisation: cohort restricted to multi-morbid patients, single welfare area, 24-month observation window. Full application requires: pseudonymisation protocol, column-level variable list per registry, necessity test per variable, DPIA.

8 · From Topology to System Behaviour

HDCI does not prescribe interventions. It reveals where the system's own coordination logic produces elevated risk without corresponding visibility in current metrics. This section describes what the RKI–IAI–RVI space shows about system behaviour — not what should be done about it.

8.1 Core principle: HDCI is a friction map, not a control system

The three-dimensional RKI–IAI–RVI space identifies:

HDCI does not tell the system what to do. It tells the system where its own coordination logic is producing the highest unobserved friction.

8.2 Intervention framing: transitions, not patient classification

Incorrect framing | Correct framing
"Group 3 is the target" | Transitions between groups are the signal
"Fix Group 3" | Prevent drift into Group 3
Patient classification | System dynamics observation

The diagnostic value of HDCI lies not in labelling patients but in identifying the transition zones where patients move from coordinated care into high-risk invisibility. Any interventions, if undertaken, address the drift mechanism — not individual patients.

8.3 Three observable system behaviours

1. RKI → IAI coupling — responsibility diffusion before fragmentation. When RKI rises without IAI decrease, medication responsibility diffuses across prescribers before the care pathway fragments into multiple uncoordinated specialties. The system accumulates polypharmacy risk silently. Minimal feedback: a Kela + Kanta medication consolidation view — not a new workflow, only increased visibility of prescriber multiplicity.

2. IAI → RVI coupling — coordination without decision. When IAI is high but RVI remains low, the system makes contact across multiple specialties, records findings, but does not convert findings into timely action. The system sees everything but does not react. Minimal feedback: organisational-level latency reporting — an escalation log, not automated decision support.

3. RKI × RVI risk zone — silent high-risk drift. When RKI is high and RVI is low simultaneously, responsibility is maximally diffused and response velocity is minimal. No current metric (cost, queue, morbidity) flags this state. Minimal feedback: a review flag — not a clinical decision, only a signal that the coordination profile warrants attention.
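
The three behaviours above can be expressed as simple conditions on area-level values from two consecutive periods. The sketch is illustrative only: the HIGH and LOW cut-offs are hypothetical and not taken from this document, and the inputs are aggregates rather than individual patients, in line with §8.2.

```python
HIGH, LOW = 0.6, 0.4  # hypothetical cut-offs, chosen only for this example


def observed_behaviours(prev, curr):
    """prev, curr: dicts of area-level 'rki', 'iai', 'rvi' for two periods."""
    flags = []
    # 1. RKI rises while IAI does not decrease.
    if curr["rki"] > prev["rki"] and curr["iai"] >= prev["iai"]:
        flags.append("RKI-IAI: responsibility diffusion")
    # 2. High IAI with low RVI.
    if curr["iai"] >= HIGH and curr["rvi"] <= LOW:
        flags.append("IAI-RVI: coordination without decision")
    # 3. High RKI with low RVI.
    if curr["rki"] >= HIGH and curr["rvi"] <= LOW:
        flags.append("RKI x RVI: silent high-risk drift")
    return flags


# Invented values for one area in two periods.
previous = {"rki": 0.5, "iai": 0.7, "rvi": 0.5}
current = {"rki": 0.7, "iai": 0.7, "rvi": 0.3}
flags = observed_behaviours(previous, current)

steady = {"rki": 0.2, "iai": 0.3, "rvi": 0.9}
no_flags = observed_behaviours(steady, steady)
```

The function returns descriptive labels only; it attaches no recommendation to any of them.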

8.4 Why this framing is necessary

If HDCI is presented as "Group 3 requires intervention," it becomes a clinical decision model — which it is not — and loses standing as a measurement framework. Presented as "HDCI reveals structural friction zones that current metrics cannot see," it remains a diagnostic instrument. This distinction determines whether DT-006 remains a research design or becomes an unimplementable policy proposal.

8.5 Institutional acceptability: five principles

Principle | Implementation
No public reporting at the individual level | Clusters are reported at population level, by municipality and by region. No individual patient is classified publicly.
No "bad / good" value judgements | Groups are described as structural states: linear, managed complexity, slow fragmentation.
No direct intervention recommendations | HDCI reports what is visible, not what should be done. Interventions are decisions for the welfare area and THL.
Open methodology | All calculation formulas, clustering methods, and validation results are published. Transparency is the guarantee of acceptability.
Peer-reviewed publication before operational use | The methodology and first results are published in a peer-reviewed venue before HDCI is offered as a steering tool.

Core message to institutions: HDCI does not tell you what you must do. It tells you where your system stops functioning as a coordinated whole and starts producing parallel, uncoordinated processes without this showing in current metrics. The decision about what to do with that observation is yours.

7 · Milestones and Status

Step | Target | Status
Pre-application consultation with Findata | Q2 2026 | Not initiated
Contact Pohjois-Savo welfare authority | Q2 2026 | Not initiated
Contact THL / Kela research units | Q2 2026 | Not initiated
Finalise research plan and data protection protocol | Q2 2026 | Not initiated
Submit Findata licence application | Q3 2026 | Not initiated
Data access established | Q4 2026 | Pending licence
Preliminary HDCI results for Pohjois-Savo | Q1–Q2 2027 | Pending data access

Document versioning

Version | Trigger
v0.7 (current) | Preparatory. Pre-licensing.
v0.8 | Licence application submitted. Research plan finalised.
v0.9 | Licence granted. Data access established.
v1.0 | Preliminary results. HDCI calculated for Pohjois-Savo cohort.
ACI · Decision Track · DT-006 · Version 0.7 · 20 April 2026 · Preparatory
Basis: WP-016 · Status: Pre-licensing · Next update: upon Findata pre-application consultation
This document is an internal planning instrument. It does not imply current access to registry data.
Aether Continuity Institute · aethercontinuity.org