Czech Democracy Index

Data sources

Where the index pulls from: eight Czech newsrooms, open data from parliament and the courts, watchdog organisations, and international news. The tables are generated from config/sources.yaml.

The Czech Democracy Index draws on two independent data layers: a structural layer (slow; the underlying indices are published annually and the baseline is refreshed quarterly) and a weekly layer (fast, event-driven). This page maps both.

The source of truth is config/sources.yaml. The tables below are generated from it at build time — they aren’t edited by hand, so they can’t drift.
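As a rough illustration of what generating the tables from config could involve, here is a minimal sketch. The `SourceEntry` shape and the function names are assumptions made for this example; the real schema of config/sources.yaml and the actual build code are not shown on this page.

```typescript
// Hypothetical shape of one parsed entry from config/sources.yaml.
// Field names are assumptions; the real schema may differ.
type SourceStatus = "active" | "not_wired_up";

interface SourceEntry {
  name: string;
  group: string;
  type: string;
  status: SourceStatus;
  note?: string;
}

// Build a heading such as "Watchdog (1/3 active)" by counting the entries,
// so the numbers cannot disagree with the rows underneath.
function groupHeading(group: string, entries: SourceEntry[]): string {
  const inGroup = entries.filter((e) => e.group === group);
  const active = inGroup.filter((e) => e.status === "active").length;
  return `${group} (${active}/${inGroup.length} active)`;
}

// Render one pipe-separated row per entry in the given group.
function renderRows(group: string, entries: SourceEntry[]): string[] {
  return entries
    .filter((e) => e.group === group)
    .map((e) => {
      const status = e.status === "active" ? "✓ active" : "⏸ not wired up";
      return [status, e.name, e.type, e.note || ""].join(" | ");
    });
}

const watchdog: SourceEntry[] = [
  { name: "Transparency International ČR", group: "Watchdog", type: "RSS feed", status: "active" },
  { name: "Rekonstrukce státu", group: "Watchdog", type: "HTML scraper", status: "not_wired_up" },
  { name: "Frank Bold", group: "Watchdog", type: "HTML scraper", status: "not_wired_up" },
];

console.log(groupHeading("Watchdog", watchdog));
renderRows("Watchdog", watchdog).forEach((row) => console.log(row));
```

Deriving the "n/m active" counts from the same entries that produce the rows is what keeps headings and tables consistent without manual edits.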

Structural baseline (quarterly)

The structural score is built on six established international indices, mapped onto the project’s six pillars (elections, governance, judiciary, media, civil rights, corruption):

| Index | Publisher | Frequency | Pillar(s) |
|---|---|---|---|
| V-Dem Democracy Report | University of Gothenburg | annual (spring) | overall + electoral + civil |
| EIU Democracy Index | Economist Intelligence Unit | annual | overall |
| Freedom in the World | Freedom House | annual (March) | electoral + civil |
| RSF World Press Freedom Index | Reporters Without Borders | annual (May) | media |
| TI Corruption Perceptions Index | Transparency International | annual (Jan/Feb) | corruption |
| WJP Rule of Law Index | World Justice Project | annual (Oct/Nov) | judicial + governance |

The detail of how the six indices combine into 0–100 pillar scores is in structural mapping. The current baseline is in data/structural/2026-Q2.json in the repo.
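The actual combination rule lives in the structural mapping document and is not reproduced here. Purely to illustrate the general idea of bringing indices with different native scales onto a common 0–100 range, the sketch below rescales each reading linearly and takes an unweighted mean. Both the unweighted mean and the sample values are assumptions for illustration, not the project's method or real scores.

```typescript
interface IndexReading {
  index: string;
  value: number; // reading on the index's native scale
  min: number;   // lower bound of the native scale
  max: number;   // upper bound of the native scale
}

// Linearly rescale a reading from its native range onto 0–100.
function rescale(value: number, min: number, max: number): number {
  if (max <= min) throw new RangeError("max must be greater than min");
  const clamped = Math.min(Math.max(value, min), max);
  return ((clamped - min) / (max - min)) * 100;
}

// ASSUMPTION: unweighted mean of the rescaled readings.
// The project's real weights are defined in its structural mapping.
function pillarScore(readings: IndexReading[]): number {
  if (readings.length === 0) throw new Error("at least one reading is required");
  const total = readings.reduce((sum, r) => sum + rescale(r.value, r.min, r.max), 0);
  return total / readings.length;
}

// Made-up readings on a 0–1 and a 0–100 scale, for illustration only.
const electoral = pillarScore([
  { index: "hypothetical index A", value: 0.75, min: 0, max: 1 },
  { index: "hypothetical index B", value: 85, min: 0, max: 100 },
]);
console.log(electoral.toFixed(1));
```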

Weekly sources (event monitoring)

The weekly pipeline (the weekly-pipeline GitHub Actions workflow, run on Mondays at 06:00 UTC) goes through every active source, pre-filters articles with Claude Haiku 4.5, classifies them with Claude Sonnet 4.6, and proposes events with reasoning per the severity rubric. "Active" means the source has a working adapter: either an RSS feed (read via rss-parser) or a dedicated TypeScript adapter in src/lib/. "Not wired up" is a placeholder: the source is registered in the yaml for a future adapter but is not currently read.
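The two-stage flow described above (cheap pre-filter first, costlier classification second) can be sketched as follows. All type and function names are hypothetical, and the stages are injected as plain synchronous functions so the example runs without network access; in the real pipeline they would be asynchronous model calls.

```typescript
interface Article {
  source: string;
  title: string;
  lede: string;
}

// Hypothetical event shape; the real severity scale is defined by the rubric.
interface ProposedEvent {
  article: Article;
  pillar: string;
  severity: number;
  reasoning: string;
}

// Stand-ins for the two model stages.
type PreFilter = (article: Article) => boolean;
type Classifier = (article: Article) => ProposedEvent | null;

// Run the cheap filter on everything, and classify only what survives it.
function proposeEvents(
  articles: Article[],
  preFilter: PreFilter,
  classify: Classifier
): ProposedEvent[] {
  const events: ProposedEvent[] = [];
  for (const article of articles) {
    if (!preFilter(article)) continue;
    const event = classify(article);
    if (event !== null) events.push(event);
  }
  return events;
}

// Toy stages based on keywords, standing in for the model calls.
const keywordFilter: PreFilter = (a) => /court|parliament/i.test(a.title);
const toyClassifier: Classifier = (a) =>
  /court/i.test(a.title)
    ? { article: a, pillar: "judiciary", severity: 1, reasoning: "Toy rule: title mentions a court." }
    : null;

const sample: Article[] = [
  { source: "example-feed", title: "Constitutional Court strikes down amendment", lede: "" },
  { source: "example-feed", title: "Parliament debates budget", lede: "" },
  { source: "example-feed", title: "Weather warning issued", lede: "" },
];

const proposed = proposeEvents(sample, keywordFilter, toyClassifier);
console.log(proposed.map((e) => `${e.pillar}: ${e.article.title}`));
```

Putting the cheaper stage first means the more expensive classifier only sees articles that already look relevant.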

Czech media (8/8 active)

| Status | Source | Type | Note |
|---|---|---|---|
| ✓ active | Deník N | RSS feed | Some articles are paywalled; the pipeline works with the headline and lede. |
| ✓ active | iROZHLAS | RSS feed | |
| ✓ active | ČT24 | RSS feed | Verify the URL; Czech Television occasionally moves the feed. |
| ✓ active | Hospodářské noviny | RSS feed | |
| ✓ active | Aktuálně.cz | RSS feed | |
| ✓ active | Investigace.cz | RSS feed | Investigative journalism, slower cadence but very high per-item relevance. |
| ✓ active | A2larm | RSS feed | |
| ✓ active | Seznam Zprávy | RSS feed | |

Open data (5/8 active)

| Status | Source | Type | Note |
|---|---|---|---|
| ✓ active | Hlídač státu — sponzoring | API | Adapter in src/lib/hlidac.ts (fetchPartyDonationsAsArticles). |
| ✓ active | Hlídač státu — smlouvy s issues | API | Adapter in src/lib/hlidac.ts (fetchWatchlistSmlouvyAsArticles). |
| ✓ active | Hlídač státu — dotace pro watchlist | API | Adapter in src/lib/hlidac.ts (fetchWatchlistDotaceAsArticles). |
| ✓ active | Poslanecká sněmovna PČR | HTML scraper | The Chamber of Deputies has no RSS feed (probed 2026-04-28). |
| ⏸ not wired up | Senát PČR | HTML scraper | |
| ✓ active | Ústavní soud ČR | RSS feed | Official RSS feed (undocumented but stable). Returns the 30 most recent items: rulings, plenum and senate session overviews, press releases. |
| ⏸ not wired up | Nejvyšší soud | HTML scraper | |
| ⏸ not wired up | Nejvyšší správní soud | HTML scraper | |

Watchdog (1/3 active)

| Status | Source | Type | Note |
|---|---|---|---|
| ✓ active | Transparency International ČR | RSS feed | |
| ⏸ not wired up | Rekonstrukce státu | HTML scraper | |
| ⏸ not wired up | Frank Bold | HTML scraper | |

International (5/9 active)

| Status | Source | Type | Note |
|---|---|---|---|
| ✓ active | POLITICO Europe | RSS feed | Joint venture of US Politico and Axel Springer (DE). |
| ✓ active | BBC News Europe | RSS feed | Public broadcaster (UK). |
| ✓ active | Euronews | RSS feed | Pan-European TV/web outlet (owned by NBC + Naguib Sawiris and Mediaset). |
| ✓ active | Visegrad Insight | RSS feed | Niche policy outlet focused exclusively on V4 (CZ + SK + PL + HU), published by the Res Publica Foundation (PL). |
| ✓ active | Brno Daily | RSS feed | English-language news about Czechia (founded in Brno, gradually expanding to the whole country). |
| ⏸ not wired up | GRECO (Council of Europe) | HTML scraper | |
| ⏸ not wired up | Venice Commission | HTML scraper | |
| ⏸ not wired up | European Commission Rule of Law Report | HTML scraper | |
| ⏸ not wired up | European Court of Human Rights — ČR | HTML scraper | |

Why these sources

Czech media. The selection deliberately spans a broad ideological spectrum: from the more left-leaning A2larm through the centrist Deník N to the more conservative Hospodářské noviny. Investigace.cz has a slower cadence but very high per-item relevance. The public-service ČT24 and iROZHLAS serve as an independent reference point. The diversity guards against bias: no single outlet may account for more than 50 % of weekly events.
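The 50 % rule can be checked mechanically. The sketch below is one possible way to flag an outlet that exceeds the cap in a given week; the `WeeklyEvent` shape and the function name are assumptions, and what the pipeline actually does when the cap is exceeded is not specified on this page.

```typescript
// Hypothetical minimal shape of a weekly event for this check.
interface WeeklyEvent {
  source: string;
  title: string;
}

// Return the outlets whose share of the week's events is strictly above the cap.
function dominantSources(events: WeeklyEvent[], cap: number = 0.5): string[] {
  const counts: Record<string, number> = {};
  for (const e of events) {
    counts[e.source] = (counts[e.source] || 0) + 1;
  }
  return Object.keys(counts).filter((s) => counts[s] / events.length > cap);
}

// Illustrative week: three of four events come from one outlet.
const week: WeeklyEvent[] = [
  { source: "Deník N", title: "event 1" },
  { source: "Deník N", title: "event 2" },
  { source: "Deník N", title: "event 3" },
  { source: "iROZHLAS", title: "event 4" },
];

console.log(dominantSources(week));
```

With at most one outlet able to exceed 50 %, the result holds either zero or one name; a share of exactly 50 % is treated here as still within the limit.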

Open state data. Structural events (an interrupted Chamber of Deputies session, a Constitutional Court ruling, a paid sponsorship contract) are more valuable than media commentary because they can be verified directly at the source. Hlídač státu (free after registration at hlidacstatu.cz/api) opens up databases of party sponsorship, contracts flagged with issues, and subsidies. The Chamber of Deputies has no RSS feed, so its session overview is read via an HTML scraper. The Constitutional Court has an undocumented but stable RSS feed.

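Watchdog organisations. Transparency International ČR and Frank Bold are domestic experts on corruption and rule of law. They serve primarily as a sanity check for the corruption and judicial pillars. Of the three registered organisations, only Transparency International ČR is currently read; Rekonstrukce státu and Frank Bold await adapters.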

International sources. Five editorial outlets (POLITICO Europe, BBC News Europe, Euronews, Visegrad Insight, Brno Daily) plus four institutions (GRECO, Venice Commission, EC Rule of Law Report, ECtHR). The institutional sources are not yet wired up and await adapter implementation. The aim is an outside-in perspective, often emphasising the CEE context that local media sometimes overlook. Visegrad Insight has the highest per-item Czech relevance among the foreign sources.

How sources change

Adding a new source is a two-minute commit to config/sources.yaml. If the source is an RSS feed, no code changes are needed: register it in the yaml and add it to the --sources default in weekly-pipeline.yml. A non-RSS source (HTML scrape or API) needs a dedicated adapter written in src/lib/ and registered in src/pipeline/fetch-sources.ts.
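For the non-RSS case, an adapter for an HTML listing page might look roughly like the sketch below. Everything here is hypothetical: the `Article` shape, the `parseListing` helper, the `adapters` registry and the example.org URL are illustrative stand-ins, not the project's actual adapter interface or the real registration code in src/pipeline/fetch-sources.ts. The regex-based parsing is deliberately simplistic; a production scraper would more likely use a proper HTML parser.

```typescript
interface Article {
  source: string;
  title: string;
  url: string;
}

// Hypothetical helper: turn the anchor tags of a listing page into articles.
// Relative links are resolved by simple concatenation with baseUrl.
function parseListing(source: string, baseUrl: string, html: string): Article[] {
  const articles: Article[] = [];
  const anchor = /<a\s+[^>]*href="([^"]+)"[^>]*>([^<]+)<\/a>/g;
  let match: RegExpExecArray | null;
  while ((match = anchor.exec(html)) !== null) {
    const href = match[1];
    const url = /^https?:\/\//.test(href) ? href : baseUrl + href;
    articles.push({ source, title: match[2].trim(), url });
  }
  return articles;
}

// Hypothetical registry keyed by source id, standing in for adapter registration.
const adapters: Record<string, (html: string) => Article[]> = {
  "example-scraper": (html) => parseListing("example-scraper", "https://example.org", html),
};

const page =
  '<ul><li><a href="/schuze/1">Session 1</a></li>' +
  '<li><a class="x" href="https://example.org/schuze/2"> Session 2 </a></li></ul>';

console.log(adapters["example-scraper"](page));
```

Keeping the parsing step a pure function of the HTML string makes the adapter testable against saved pages, without hitting the live site.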

A source is removed once its feed stops working (typically HTTP 5xx for more than two weeks). Example: Iuridicum Remedium was dropped on 2026-04-29 after a persistent HTTP 500. The entry stays in the yaml as a comment so that it can be restored quickly once the feed is fixed.
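The "5xx for more than two weeks" rule of thumb can be expressed as a small check over probe history. This is only a sketch: the `ProbeResult` shape and the function name are assumptions, the dates are invented for the example, and the page does not say whether removal is automated or decided by hand.

```typescript
// Hypothetical record of one health probe of a feed.
interface ProbeResult {
  at: Date;
  status: number; // HTTP status code returned by the feed
}

const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;

// True when the latest unbroken run of 5xx responses began more than two weeks ago.
function isRemovalCandidate(probes: ProbeResult[], now: Date): boolean {
  const sorted = probes.slice().sort((a, b) => a.at.getTime() - b.at.getTime());
  let failingSince: Date | null = null;
  for (const p of sorted) {
    if (p.status >= 500 && p.status <= 599) {
      if (failingSince === null) failingSince = p.at;
    } else {
      failingSince = null; // any non-5xx response resets the streak
    }
  }
  return failingSince !== null && now.getTime() - failingSince.getTime() > TWO_WEEKS_MS;
}

// Invented probe history: failing continuously since 1 January.
const history: ProbeResult[] = [
  { at: new Date("2026-01-01T06:00:00Z"), status: 500 },
  { at: new Date("2026-01-10T06:00:00Z"), status: 503 },
];

console.log(isRemovalCandidate(history, new Date("2026-01-20T06:00:00Z")));
```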

Source changes are not logged in the methodology CHANGELOG, which is reserved for methodology adjustments (weights, rubric, pillars, governance). For source history, run git log -- config/sources.yaml.