Methodology

How ClearMarket selects, sources, and grades prediction-market data across Kalshi and Polymarket.

This document is the source for the public /methodology page and the canonical internal record. Last updated 2026-07-03.

1 · UNIVERSE & INSTITUTIONAL FILTER

ClearMarket ingests the full Kalshi + Polymarket universe and filters to the institutionally relevant set.

Three filter inputs

Category whitelist — 9 institutional categories IN: economics, financials, crypto, companies, technology, health, politics, geopolitics, climate. OUT: sports, entertainment, mentions.
Volume threshold — venue-specific minimum (tuned empirically).
Resolution horizon — resolves within 24 months.
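
The three filter inputs can be sketched as a single predicate. This is a minimal illustration, not the production filter: the `Market` shape, the `MIN_VOLUME` figures, and the helper names are assumptions made for the example (the real thresholds are tuned empirically and are not stated here).

```python
import calendar
from dataclasses import dataclass
from datetime import date

# The nine institutional categories listed above.
INSTITUTIONAL_CATEGORIES = {
    "economics", "financials", "crypto", "companies", "technology",
    "health", "politics", "geopolitics", "climate",
}

# Hypothetical per-venue volume floors, for illustration only.
MIN_VOLUME = {"kalshi": 1_000.0, "polymarket": 10_000.0}

MAX_HORIZON_MONTHS = 24


@dataclass
class Market:
    venue: str
    category: str
    volume: float
    resolves_on: date


def horizon_cutoff(today: date, months: int = MAX_HORIZON_MONTHS) -> date:
    """Return the date `months` calendar months after `today`, clamping the day."""
    index = today.year * 12 + (today.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    day = min(today.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def passes_filter(market: Market, today: date) -> bool:
    """Apply the category whitelist, volume threshold, and resolution horizon."""
    floor = MIN_VOLUME.get(market.venue)
    if floor is None:  # unknown venue: no threshold defined, so exclude
        return False
    return (
        market.category in INSTITUTIONAL_CATEGORIES
        and market.volume >= floor
        and market.resolves_on <= horizon_cutoff(today)
    )
```

A market must clear all three inputs; failing any one of them removes it from the institutional set.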

Category derivation differs by venue. Kalshi ships a clean category field, so we map it directly. Polymarket has no category field, so we derive it from tags[]: a deterministic map keyed on top-level tags (weighting Polymarket's forceShow flag), with a constrained LLM fallback for events whose tags carry no category signal.
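
A rough sketch of the Polymarket derivation follows. The tag-to-category entries, the `label` field name, the weight given to forceShow, and the fallback signature are all assumptions for illustration; only the overall shape (deterministic map first, constrained fallback second) comes from the description above.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

# Categories the fallback is allowed to return (whitelist plus excluded set).
KNOWN_CATEGORIES = {
    "economics", "financials", "crypto", "companies", "technology",
    "health", "politics", "geopolitics", "climate",
    "sports", "entertainment", "mentions",
}

# Hypothetical tag map; the production map is not reproduced here.
TAG_TO_CATEGORY = {
    "fed": "economics",
    "inflation": "economics",
    "bitcoin": "crypto",
    "elections": "politics",
    "earnings": "companies",
}

FORCE_SHOW_WEIGHT = 2  # assumed: a forceShow tag counts double


def derive_category(
    tags: List[Dict],
    fallback: Optional[Callable[[List[str]], str]] = None,
) -> Optional[str]:
    """Map tags to a category deterministically, else ask a constrained fallback."""
    scores: Counter = Counter()
    for tag in tags:
        category = TAG_TO_CATEGORY.get(tag["label"].lower())
        if category is not None:
            scores[category] += FORCE_SHOW_WEIGHT if tag.get("forceShow") else 1
    if scores:
        # Highest score wins; ties break alphabetically so the result is stable.
        return min(scores, key=lambda c: (-scores[c], c))
    if fallback is not None:
        guess = fallback([tag["label"] for tag in tags])
        # The fallback is constrained: anything outside the vocabulary is dropped.
        return guess if guess in KNOWN_CATEGORIES else None
    return None
```

In this sketch the fallback stands in for the LLM call; rejecting any answer outside `KNOWN_CATEGORIES` is one way to keep it constrained.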

2 · VERIFIED-SOURCE LAYER

Every market's resolution source is captured from platform-provided data, never invented by a model. The mechanism differs by venue.

Kalshi — structured (direct)

The resolution source and URL live in a structured API field (series.settlement_sources). We read it directly; no extraction. Across 734 unique series (2026-06-12 vintage), Kalshi self-tags 98% "authoritative" — but we do not take the venue's word for it. Our independent commitment classification treats a multi-outlet "credible-reporting menu" (a list of general-news outlets — ABC, Fox News, MSNBC, WSJ — with no single controlling authority) as an uncommitted placeholder, exactly as the commitment axis below requires, even though Kalshi self-tags it authoritative. A menu is committed only when the list also names a real non-news authority (SEC, EIA, CF Benchmarks) — in which case that authority is the source of record.

~86% committed (series-level, independent) — a single named authority (Federal Reserve, BLS, BEA, NWS, ICE, CF Benchmarks), or a menu that also names a real non-news authority.
~14% uncommitted — multi-outlet news menus Kalshi self-tags authoritative (88 series) plus loose placeholders ("For example, Google Finance", 18 series).
0% missing.
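
The menu rule above can be illustrated with a lookup-table sketch. The production classification is a rubric-based judgment, not a table; the two name sets below contain only the examples quoted in this section, and the function name and return shape are assumptions.

```python
from typing import List, Optional, Tuple

# Example names taken from the text above; both sets are illustrative, not exhaustive.
NEWS_OUTLETS = {"ABC", "Fox News", "MSNBC", "WSJ"}
NON_NEWS_AUTHORITIES = {
    "SEC", "EIA", "CF Benchmarks", "Federal Reserve", "BLS", "BEA", "NWS", "ICE",
}


def classify_sources(names: List[str]) -> Tuple[str, Optional[str]]:
    """Return (commitment, source_of_record) for a series' settlement sources."""
    if not names:
        return ("missing", None)
    authorities = [name for name in names if name in NON_NEWS_AUTHORITIES]
    if authorities:
        # A menu that also names a real non-news authority is committed,
        # and that authority becomes the source of record.
        return ("committed", authorities[0])
    # News-only menus and loose placeholders are uncommitted,
    # whatever label the venue applies to them.
    return ("uncommitted", None)
```

As a consistency check on the figures above, 88 + 18 = 106 uncommitted series out of 734 is about 14.4%, which matches the ~14% / ~86% split.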

Polymarket — deterministic capture, gated judgment

The sources are embedded in the market description free-text. We capture them without ever letting a model mint an identifier:

Regex extracts every candidate URL in the description — including bare-domain citations ("as posted on tesla.com") — and ALL candidates are kept in the market's resolution_source_list (the deterministic floor; nothing is collapsed away).
One per-event commitment judgment (an LLM read of the full source list plus the rules prose, against the versioned rubric below) surfaces authorities the venue names in words with no URL ("the official CME settlement price") and selects the controlling URL by number among the extracted candidates.
Verbatim gate: every identifier the judgment emits — a prose-surfaced source name, the source of record — must appear in the venue's own text as a whole token (word-boundary match), or it is discarded. A "named" classification whose evidence does not survive the gate is demoted to uncommitted, never shipped uncapped. Two extraction rules keep the gate honest in both directions: unfilled venue template tokens ("official representatives of <person>") are never accepted as evidence, and URL-only source entries are named by their domain (federalreserve.gov) so a real authority cited by link alone is visible to the gate.
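
One possible shape for the verbatim gate is sketched below. The function names are hypothetical, and the case-sensitive match is an assumption (the text says only "whole token, word-boundary match").

```python
import re
from typing import List, Tuple
from urllib.parse import urlparse

# Unfilled venue template tokens such as "<person>".
TEMPLATE_TOKEN = re.compile(r"<[^<>]+>")


def name_from_url(url: str) -> str:
    """Name a URL-only source by its domain, e.g. federalreserve.gov."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def appears_verbatim(identifier: str, venue_text: str) -> bool:
    """True only if the identifier occurs in the venue text as a whole token."""
    if not identifier.strip() or TEMPLATE_TOKEN.search(identifier):
        return False  # empty strings and template tokens are never evidence
    # Lookarounds instead of \b so identifiers ending in punctuation still work.
    pattern = r"(?<!\w)" + re.escape(identifier) + r"(?!\w)"
    return re.search(pattern, venue_text) is not None


def gate_judgment(
    classification: str, evidence: List[str], venue_text: str
) -> Tuple[str, List[str]]:
    """Discard unverifiable identifiers; demote 'named' if nothing survives."""
    surviving = [item for item in evidence if appears_verbatim(item, venue_text)]
    if classification == "named" and not surviving:
        classification = "uncommitted"
    return classification, surviving
```

The gate runs after the judgment and never adds anything: it can only remove identifiers or lower a classification.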

The zero-hallucinated-URL guarantee remains structural: URLs enter the dataset only via regex extraction, and the judgment can only pick among them by index. Prose-surfaced names are tagged clearmarket_editorial and rendered as such. Extraction-mix statistics will be republished with the first enrichment under this pipeline.
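
The structural guarantee can be sketched as follows: candidates come only from regular expressions, and the judgment returns an index into that list. The patterns, the short TLD list, and the zero-based index are assumptions for illustration; the production patterns are not reproduced here.

```python
import re
from typing import List, Optional

# Full URLs: scheme, then anything up to whitespace or a bracket.
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+", re.IGNORECASE)

# Bare-domain citations such as "tesla.com"; the TLD list is illustrative only.
BARE_DOMAIN_RE = re.compile(
    r"(?<![\w@.-])(?:[a-z0-9-]+\.)+(?:com|org|gov|net|io)\b", re.IGNORECASE
)


def extract_candidates(description: str) -> List[str]:
    """Return every candidate source in order of appearance, without duplicates."""
    candidates: List[str] = []

    def add(candidate: str) -> None:
        if candidate and candidate not in candidates:
            candidates.append(candidate)

    for match in URL_RE.finditer(description):
        add(match.group(0).rstrip(".,;:"))  # drop sentence punctuation
    # Blank out full URLs so their hosts are not re-captured as bare domains.
    remainder = URL_RE.sub(" ", description)
    for match in BARE_DOMAIN_RE.finditer(remainder):
        add(match.group(0).lower())
    return candidates


def select_source_of_record(
    candidates: List[str], chosen_index: Optional[int]
) -> Optional[str]:
    """The judgment supplies only an index; an out-of-range index selects nothing."""
    if chosen_index is None or not 0 <= chosen_index < len(candidates):
        return None
    return candidates[chosen_index]
```

Because `select_source_of_record` can return only an element of `candidates`, a model that emits a bad index yields no source of record, never a new URL.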

Provenance tiers

Every value carries its origin, in one of three tiers:

direct: platform API, structured field, or verbatim-extracted URL.
editorial: LLM-drafted interpretation, e.g. underlying_reference.
subjective: resolution by consensus, no named data source.

The LLM is confined to the editorial tier; any verifiable identifier (source name, URL) traces to platform data.
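
A minimal data-structure sketch of the tiers, assuming a hypothetical `ProvenancedValue` record and an `llm_drafted` flag (neither name is from the actual schema). It enforces the one rule stated above: LLM-drafted values may sit only in the editorial tier.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    DIRECT = "direct"          # platform API, structured field, verbatim-extracted URL
    EDITORIAL = "editorial"    # LLM-drafted interpretation
    SUBJECTIVE = "subjective"  # resolution by consensus, no named data source


@dataclass(frozen=True)
class ProvenancedValue:
    field_name: str
    value: str
    tier: Tier
    llm_drafted: bool = False  # hypothetical flag for this sketch

    def __post_init__(self) -> None:
        # The LLM is confined to the editorial tier.
        if self.llm_drafted and self.tier is not Tier.EDITORIAL:
            raise ValueError(
                f"{self.field_name}: LLM-drafted values must be tagged editorial"
            )
```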

Source commitment

Capturing a source is not the same as the venue committing to one. Our editorial engine classifies every market's resolution source on a commitment axis, independent of the venue's own quality label, because a hedge, a placeholder, or a menu of interchangeable outlets is not a definitive source. The classification reads the venue's full source list and rules against a published, version-stamped rubric. Every judgment records the rubric version that produced it, so a grade change between vintages is always attributable to either a rubric change or a venue-text change. The classes are:

named — a concrete, resolvable authority (Federal Reserve, BLS, the NYC Rent Guidelines Board, a deep-link .gov / official-electoral citation), or a defined mechanism over multiple sources: a stated order of precedence, or a quorum ("all three sources must call it") — recorded as source_mechanism. A primary source with only an optional alternative ("a consensus of credible reporting may also be used"), with no rule for which controls on conflict, is NOT a mechanism — that is the Venezuela failure shape, classified placeholder. No grade cap.
committed_secondhand — the venue commits to exactly one concrete, checkable source that is not an authority for the quantity: a commercial data aggregator (Fiscal.ai, Google Finance) or a single news/sports outlet as sole controlling source. A real commitment — but the source re-publishes numbers whose authority lies elsewhere, with no published settlement methodology and no rule for what controls if the two disagree. The line: a data provider is an authority only when it is the designated administrator or calculator of the quantity itself with a published methodology (CF Benchmarks); a platform transcribing someone else's numbers is not. Grade capped C.
uncommitted — a source is gestured at but not committed to. Two sub-cases: illustrative ("For example, Google Finance" — a candidate, not a commitment) and placeholder ("a consensus of credible reporting" — names no concrete authority).
none — no source language at all.

The principle: "for example" means the venue did not commit to a definitive source. Source commitment is a separate axis from outcome objectivity — an objective outcome resolved against an uncommitted source still carries resolution risk, because which report decides is left unspecified. This is the gap the CFTC comment documents: regulated venues commit to a named authority; unregulated markets frequently do not. The commitment classification feeds a grade ceiling (see below), and the venue's literal source field is shown verbatim on each event page — including when it does not match the resolution rules — so the reader judges the source in the venue's own words, not ours.

3 · RESOLUTION CLARITY GRADE (RCG) · v2

A per-market A/B/C grade of how unambiguously a market will resolve. Per market, not per event: one event can hold a Kalshi market that resolves clearly and a Polymarket market on the same question that resolves subjectively. The grade is a weighted score (0–100) banded into A/B/C and then capped; it is not a single rule. It rolls up seven factors grounded in documented resolution failures (a16z's analysis of prediction-market resolution failures + the CFTC ANPRM comment): the failures were not absent sources but specific structural defects, so the grade scores those defects directly.

Seven factors (weights sum to 100)

Factor	Weight	What it measures
Trigger objectivity	28	Objective measurable trigger vs. discretionary criterion (the Zelensky-"suit" defect). Heaviest because it stays hard after venues clean up citations.
Contested reality	22	Whether the underlying fact is controlled or disputed by an interested party (Venezuela, Ukraine). Most saturation-resistant.
Source clarity	18	Authoritative named source + usable link. Necessary but commoditizes fastest.
Arbiter incentive	12	Dispute resolver's capture risk: regulated/automated vs. permissionless oracle (the UMA flip).
Source-conflict rule	8	If 2+ genuinely competing sources (independent authorities that could give contradictory answers) are named, is there an explicit precedence/fallback rule? A named primary with an informal fallback is not a conflict (Venezuela). Situational — the only one.
Temporal precision	7	Controlling timestamp vs. source-update lag (OPM shutdown).
Source mutability	5	Editable/tamperable source without a snapshot rule (Ukraine map, Météo-France sensor).

Scoring rules

Always-on factors (trigger, contested, source clarity, arbiter, temporal, mutability — 92 of 100) score every market. The one situational factor (source-conflict rule) scores only when 2+ distinct sources are named and is excluded from the denominator otherwise (score re-normalized over applicable factors).
Score = 100 × earned / applicable, banded A ≥ 80, B 55–79, C ≤ 54.
Hard caps (ceilings, applied after banding): source uncommitted-illustrative goes to B; committed-secondhand, uncommitted-placeholder or none goes to C (section 2 — a hedge is not a committed source; the binding constraint on most unregulated markets); 2+ genuinely competing sources with no precedence rule goes to C; discretionary trigger with no named source goes to C; permissionless oracle + discretionary trigger goes to C; adversarial ground truth (contested reality) goes to B. Caps only ever lower a grade, never raise it.
The applicable-factor count is surfaced with the grade ("B — scored on 6 of 7 factors") for comparability across markets of different complexity.
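
The score and banding arithmetic can be sketched as below. Two things are assumptions of the sketch rather than statements of the spec: each factor value is taken to be the fraction of its weight earned (0.0 to 1.0, with None meaning not applicable), and fractional scores are banded by threshold (A at 80 or above, B at 55 or above, otherwise C).

```python
from typing import Dict, Optional, Tuple

# Weights from the factor table above (they sum to 100).
WEIGHTS = {
    "trigger_objectivity": 28,
    "contested_reality": 22,
    "source_clarity": 18,
    "arbiter_incentive": 12,
    "source_conflict_rule": 8,  # the one situational factor
    "temporal_precision": 7,
    "source_mutability": 5,
}


def score(factors: Dict[str, Optional[float]]) -> Tuple[float, int]:
    """Return (score out of 100, number of applicable factors)."""
    earned = 0.0
    applicable = 0
    count = 0
    for name, weight in WEIGHTS.items():
        value = factors.get(name)
        if value is None:  # not applicable: excluded from the denominator
            continue
        earned += weight * value
        applicable += weight
        count += 1
    return 100.0 * earned / applicable, count


def band(points: float) -> str:
    """Band a score into A/B/C before any caps are applied."""
    if points >= 80:
        return "A"
    if points >= 55:
        return "B"
    return "C"


def describe(factors: Dict[str, Optional[float]]) -> str:
    points, count = score(factors)
    return f"{band(points)} (scored on {count} of {len(WEIGHTS)} factors)"
```

For example, a market with a fully discretionary trigger (0.0), full marks on the other always-on factors, and no competing sources earns 64 of 92 applicable points, which re-normalizes to about 69.6 and bands as B before caps.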

Five LLM-judged factors — trigger objectivity, contested reality, source-conflict rule, temporal precision, source mutability — are scored in one per-event call (enhance.llm_rcg_factors), which reads the full source list and count. Source commitment is a second per-event LLM judgment (section 2), and source clarity derives deterministically from its stamped result — one judgment, no second-guessing. Arbiter incentive is purely deterministic (it reads the venue's arbitration model). The final score and every cap are deterministic arithmetic over the stamped factor values, so any grade re-derives from its stored audit object. The full weighted model, scoring rules, and caps are in rcg-scoring-model.csv (repo root) — the canonical spec for this section.
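
Because the caps are deterministic, a banded grade can be re-capped from a stored audit object. The sketch below assumes a hypothetical audit shape (the key names are invented for the example) and reads "no named source" as any commitment class other than named.

```python
from typing import Dict, List

SEVERITY = {"A": 0, "B": 1, "C": 2}  # higher means a worse grade


def apply_caps(banded: str, audit: Dict) -> str:
    """Apply the hard caps to a banded grade; caps only ever lower it."""
    ceilings: List[str] = []
    commitment = audit["commitment"]
    discretionary = audit["discretionary_trigger"]

    if commitment == "uncommitted_illustrative":
        ceilings.append("B")
    if commitment in {"committed_secondhand", "uncommitted_placeholder", "none"}:
        ceilings.append("C")
    if audit["competing_sources_without_precedence"]:
        ceilings.append("C")
    if discretionary and commitment != "named":
        ceilings.append("C")
    if discretionary and audit["permissionless_oracle"]:
        ceilings.append("C")
    if audit["adversarial_ground_truth"]:
        ceilings.append("B")

    # The final grade is the worst of the banded grade and every ceiling hit.
    return max([banded] + ceilings, key=lambda grade: SEVERITY[grade])
```

Taking the worst of the band and all triggered ceilings is what guarantees that a cap can lower a grade but never raise it.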

The source-conflict rule was deterministic (URL-count + regex) through 2026-05-26; it moved to the LLM on 2026-05-27 because URL counting mis-read one source across multiple pages as "multiple sources" (false C) and missed prose-named sources entirely (false na). It is now a reading-comprehension judgment.

Validation

The model reproduces the four prominent prediction-market resolution failures of the past 24 months as C / C / B / C — each a real dispute that traced to a rule naming a source category instead of a specific authority:

Venezuela election — C. The rule named two sources that conflicted ("official information from Venezuela" vs. "a consensus of credible reporting"). On election night the Maduro-controlled authority and the opposition's tallies disagreed, and nothing said which controlled.
Zelensky suit — C. Over $200M traded on whether his NATO-summit outfit was a "suit," with no named source to decide. It resolved YES, then UMA flipped it to NO — the dispute was about whose interpretation controlled, not what happened.
OPM shutdown — B. Resolved against the OPM website's status; funding was signed November 12 but the site did not update until November 13, so traders right on the event lost on the source's lag.
Ukraine map — C. Resolved against a particular online map, with no rule for what happens if it is edited — and it was allegedly modified mid-contract.

Beyond these four tail cases, the grade has been tested at scale against the public dispute record: across 7,166 Polymarket markets with public dispute records, C-rated contracts were formally disputed at 20 times the rate of A- and B-rated contracts (1.59% vs 0.08%); the three challenges to A-rated contracts all failed against their committed sources. See the full resolution-clarity validation backtest.
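
The "20 times" figure is the ratio of the two published dispute rates. A quick check, using only the rounded percentages quoted above (so the ratio is itself approximate):

```python
# Published dispute rates, in percent, as quoted in the text.
c_rate = 1.59   # C-rated contracts
ab_rate = 0.08  # A- and B-rated contracts

ratio = c_rate / ab_rate
print(f"C-rated contracts were disputed at about {ratio:.1f}x the A/B rate")
```

The result is a little under 20, which the text rounds to 20.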

Full-universe distribution is pending the per-event LLM rater's calibration run against a sample of ordinary markets — four tail failures are not a calibration set.

The cross-venue resolution-governance contrast (objective, named-authority Kalshi markets vs. the subjective-consensus share on Polymarket) is the signal the venues themselves do not surface.

4 · EDITORIAL ENRICHMENT

On top of the verified-source layer, ClearMarket adds editorial fields (editorial_notes, underlying_reference, tags, canonical question) drafted by an LLM and tagged editorial. These interpret and gloss the verified data — e.g. naming the authoritative publisher (S&P Dow Jones Indices) where a venue cites only a placeholder (Google Finance) — but never serve as the source of record. Enrichment follows an event→market inheritance model (the option-chain class→series pattern): the resolution rule and source are written once at the event as a generic, subject-free ontology (resolution_reference) and inherited by every child, while each child's underlying_reference is composed from that ontology plus its own subject (group_item_title). A multi-outcome bundle therefore names each candidate or company on its own market and never broadcasts one child's subject across its siblings.
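
The inheritance model can be sketched with two small records. The `{subject}` placeholder convention and the class names are assumptions for the example; only the field names resolution_reference, underlying_reference, and group_item_title come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    event_id: str
    # Subject-free rule written once at the event; "{subject}" is an assumed convention.
    resolution_reference: str


@dataclass
class ChildMarket:
    market_id: str
    group_item_title: str  # the child's own subject


def underlying_reference(event: Event, child: ChildMarket) -> str:
    """Compose a child's reference from the inherited ontology plus its own subject."""
    return event.resolution_reference.format(subject=child.group_item_title)


def enrich(event: Event, children: List[ChildMarket]) -> Dict[str, str]:
    """Return market_id -> underlying_reference for every child of the event."""
    return {child.market_id: underlying_reference(event, child) for child in children}
```

Since the event-level text never contains a subject, one child's company or candidate cannot leak onto its siblings: each reference is built from the shared rule and that child's own title.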

5 · CROSS-VENUE LINKING

The same bet often trades on both venues under different wording. ClearMarket links them at the market grain, not the event grain, because the two are different things.

Three grains

market_id (CM-MKT-…) — one contract on one venue. The atomic unit.
event_id (CM-EVT-…) — a venue's bundle of one or more markets under a canonical question. Every event is single-venue, and one event can hold many distinct questions (a price ladder, a basket).
question_id — the venue-independent identity of the bet. Markets sharing a question_id are the same question across Kalshi and Polymarket, and across events within a venue.

Because every event is single-venue and the cross-venue groups span different event_ids, the link cannot be the event — it is the question_id. The matching key is subject + threshold + settlement_style + window; polarity is normalized, so an "above" and a "below" on the same threshold are the same question with complements flipped.
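
A sketch of the matching key follows. The normalization steps, the hashed identifier format, and the function names are assumptions (the text does not specify how question_id is encoded); the four key components and the polarity flip are as described above.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class QuestionKey:
    subject: str
    threshold: float
    settlement_style: str
    window: str


def normalize(
    subject: str, threshold: float, direction: str, settlement_style: str, window: str
) -> Tuple[QuestionKey, bool]:
    """Return (key, flipped). Direction is not part of the key; it sets the flip."""
    if direction not in {"above", "below"}:
        raise ValueError(f"unknown direction: {direction!r}")
    key = QuestionKey(
        subject.strip().lower(),
        float(threshold),
        settlement_style.strip().lower(),
        window.strip(),
    )
    return key, direction == "below"


def question_id(key: QuestionKey) -> str:
    """Hypothetical identifier: a short hash of the normalized key."""
    raw = f"{key.subject}|{key.threshold!r}|{key.settlement_style}|{key.window}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def canonical_yes_price(yes_price: float, flipped: bool) -> float:
    """Express a price in the 'above' polarity; a 'below' market is the complement."""
    return 1.0 - yes_price if flipped else yes_price
```

With polarity kept out of the key, an "above" market and a "below" market on the same threshold share one question_id, and the flag records which side needs its price complemented.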

also_on — the cross-venue surface

Each market carries an also_on field: an array of { venue, market_id, price } for the same question priced on other venues, or null when the question trades on only one. It is the only field that asserts a cross-venue twin — a market advertises one only when it genuinely has one. The linker is deliberately high-precision, low-recall: a missing link is preferred to a wrong one.
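
Given markets that already carry a question_id, the also_on field could be assembled roughly as follows. The input record shape and the function name are assumptions for the sketch, and it presumes the prices have already been polarity-normalized.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_also_on(markets: List[Dict]) -> Dict[str, Optional[List[Dict]]]:
    """Return market_id -> also_on (list of cross-venue twins, or None)."""
    by_question = defaultdict(list)
    for market in markets:
        if market.get("question_id") is not None:
            by_question[market["question_id"]].append(market)

    also_on: Dict[str, Optional[List[Dict]]] = {}
    for market in markets:
        twins = [
            {"venue": other["venue"], "market_id": other["market_id"], "price": other["price"]}
            for other in by_question.get(market.get("question_id"), [])
            if other["venue"] != market["venue"]  # same-venue siblings are not twins
        ]
        # None, not an empty list, when the question trades on only one venue.
        also_on[market["market_id"]] = twins or None
    return also_on
```

A market with no question_id, or whose question appears on a single venue, gets None, in line with the preference for a missing link over a wrong one.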

6 · ELIGIBILITY SCREENS

A separate layer on top of the reference data: parameterized rule-sets screened per contract, stamping every market eligible / review / not_eligible against a named distribution rule-set. Statuses describe fit against that one rule-set; they are not a measure of market quality (quality is the Resolution Clarity Grade above). First regime: CIRO Administrative Bulletin 26-0076 (Canadian dealer distribution); index at /screens/. The statuses are screening data supporting a dealer’s own determination, not a compliance opinion.
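
The stamping mechanism can be illustrated generically. Everything in this sketch is hypothetical: the rule-set name, its two parameters, and the hard/soft precedence are invented for the example and do not represent the contents of the CIRO bulletin.

```python
from typing import Dict, Set


def make_rule_set(name: str, excluded_categories: Set[str], review_below_volume: float) -> Dict:
    """Build a parameterized rule-set; each rule returns True when the market passes."""
    return {
        "name": name,
        # Failing a hard rule means not_eligible.
        "hard": [lambda m: m["category"] not in excluded_categories],
        # Failing a soft rule means the market needs human review.
        "soft": [lambda m: m["volume"] >= review_below_volume],
    }


def screen(market: Dict, rule_set: Dict) -> Dict:
    """Stamp one market against one named rule-set."""
    if not all(rule(market) for rule in rule_set["hard"]):
        status = "not_eligible"
    elif not all(rule(market) for rule in rule_set["soft"]):
        status = "review"
    else:
        status = "eligible"
    return {"market_id": market["market_id"], "rule_set": rule_set["name"], "status": status}
```

The stamp always carries the rule-set name, so a status is read as fit against that rule-set and nothing broader.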