SummerEyes — Full Documentation

Formal uncertainty reasoning engine. Auditable verdicts from conflicting sources.

Built by Upside Down Research


Overview

SummerEyes is a reasoning engine that analyzes claims from multiple sources to detect contradictions, assess confidence, evaluate timelines, and identify what evidence is missing. It produces auditable verdicts using formal mathematics: source weighting, temporal decay, opinion fusion, and conflict resolution.

The engine is available as an MCP tool for AI agents and as a REST API.


Connecting

MCP (Model Context Protocol)

REST API

References


How the Engine Works

1. Source Weighting

Each source (actor) gets an effective credibility score: its base_reliability multiplied by its competence for the claim's predicate (falling back to default_competence for unlisted predicates).

Guidelines for base_reliability:

Credibility floor: When effective reliability (base_reliability × competence) falls below ~0.15, the claim is acknowledged but its evidence counts are zeroed in fusion.
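A minimal sketch of the weighting step above, using the multiplication and ~0.15 floor stated in this document. The function names are illustrative, not the engine's API.

```python
def effective_reliability(base_reliability: float,
                          competence: dict[str, float],
                          predicate: str,
                          default_competence: float = 1.0) -> float:
    """Effective credibility for one actor on one predicate."""
    c = competence.get(predicate, default_competence)
    return base_reliability * c

def evidence_counts_zeroed(eff: float, floor: float = 0.15) -> bool:
    """Below the credibility floor, the claim is kept but contributes no evidence."""
    return eff < floor

# An Expert (0.92) with weak competence on this predicate (0.1) falls below the floor:
eff = effective_reliability(0.92, {"revenue": 0.1}, "revenue")
print(round(eff, 3), evidence_counts_zeroed(eff))  # 0.092 True
```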

2. Temporal Decay

Older claims lose weight automatically. The decay rate is computed from:

Corroboration resets the clock: Independent corroboration (different citation chain) fully resets temporal decay. Dependent-source corroboration provides partial refresh controlled by dependent_refresh_factor (default 0.5).

Supersession: When a newer claim replaces an older one (via supersedes_claim_id), the older claim's epistemic status is automatically set to "superseded" and decays 4x faster.
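The behaviour above can be sketched as follows, assuming standard exponential half-life decay (weight halves every decay_half_life_days). The 4x supersession multiplier and dependent_refresh_factor default come from this document; the refresh interpolation is one plausible interpretation, and all names are illustrative.

```python
def decayed_weight(age_days: float,
                   half_life_days: float,
                   superseded: bool = False) -> float:
    """Claim weight after age_days, halving every half_life_days."""
    if superseded:
        half_life_days /= 4.0  # superseded claims decay 4x faster
    return 0.5 ** (age_days / half_life_days)

def refreshed_age(age_days: float,
                  days_since_corroboration: float,
                  independent: bool,
                  dependent_refresh_factor: float = 0.5) -> float:
    """Independent corroboration fully resets the clock; dependent
    corroboration moves the effective age only part of the way back."""
    if independent:
        return days_since_corroboration
    return age_days - dependent_refresh_factor * (age_days - days_since_corroboration)

print(decayed_weight(30, 30))                    # 0.5
print(decayed_weight(30, 30, superseded=True))   # 0.0625
```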

3. Opinion Fusion

Claims are fused at three levels:

  1. Within-source: Deduplicate — one source making the same claim twice doesn't double-count
  2. Within-group: Combine claims from sources in the same citation chain
  3. Cross-group: Fuse independent source groups — corroboration from separate sources strengthens confidence

The result is a three-part opinion: belief + disbelief + uncertainty = 1.0
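The fusion operator itself isn't specified in this document. As a concrete illustration, the sketch below uses the cumulative fusion rule from subjective logic, a standard operator for three-part opinions of this form; the function name is illustrative.

```python
def cumulative_fuse(o1, o2):
    """Fuse two (belief, disbelief, uncertainty) opinions, each summing to 1.0.
    Undefined when both uncertainties are zero."""
    b1, d1, u1 = o1
    b2, d2, u2 = o2
    k = u1 + u2 - u1 * u2
    return ((b1 * u2 + b2 * u1) / k,
            (d1 * u2 + d2 * u1) / k,
            (u1 * u2) / k)

# Two independent source groups agreeing reduces uncertainty:
fused = cumulative_fuse((0.6, 0.1, 0.3), (0.5, 0.2, 0.3))
print(tuple(round(x, 3) for x in fused))  # uncertainty drops below 0.3
assert abs(sum(fused) - 1.0) < 1e-9
```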

4. Conflict Resolution

When claims contradict each other, the engine:

  1. Builds a formal argumentation framework with attack and support edges
  2. Computes all coherent interpretations — each is a consistent subset of claims that survives all challenges
  3. Scores each interpretation for coherence (0–1) based on evidential coverage, internal support density, and mean credibility
  4. Falls back to a single canonical interpretation when a framework exceeds 25 arguments, because enumerating every coherent interpretation requires exhaustive search
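The document names three ingredients of the 0-1 coherence score but not how they are combined; the sketch below simply averages them as one plausible choice, with an illustrative function name.

```python
def coherence(evidential_coverage: float,
              support_density: float,
              mean_credibility: float) -> float:
    """One plausible 0-1 coherence score: the mean of the three ingredients."""
    parts = (evidential_coverage, support_density, mean_credibility)
    assert all(0.0 <= p <= 1.0 for p in parts), "each ingredient is a 0-1 score"
    return sum(parts) / len(parts)

# A well-evidenced, internally supported, credible interpretation scores high:
print(round(coherence(0.9, 0.8, 0.85), 3))
```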

Scope handling: Claims with different scope values on the same predicate (e.g., "global" vs "US" revenue) are NOT treated as contradictions.

Numeric proximity: Financial values within 5% of each other are not treated as contradictions.
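The two carve-outs above can be sketched as a contradiction check: differing scope values and financial values within 5% of each other are never contradictions. The function names and the K/M/B/T suffix parser are illustrative, not the engine's internals.

```python
def parse_amount(value: str):
    """Parse '100M'-style financial values; returns None if non-numeric."""
    suffixes = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}
    v = value.strip().upper()
    mult = suffixes.get(v[-1:])
    try:
        return float(v[:-1]) * mult if mult else float(v)
    except ValueError:
        return None

def contradicts(a: dict, b: dict) -> bool:
    """a, b: claim dicts with predicate, value, scope, valence."""
    if a["predicate"] != b["predicate"]:
        return False
    if a.get("scope") != b.get("scope"):
        return False  # different scopes measure different things
    x, y = parse_amount(a["value"]), parse_amount(b["value"])
    if x is not None and y is not None:
        return abs(x - y) / max(abs(x), abs(y)) > 0.05  # 5% proximity window
    return a["value"] != b["value"] or a["valence"] != b["valence"]

print(contradicts(
    {"predicate": "revenue", "value": "100M", "scope": "global", "valence": "Supports"},
    {"predicate": "revenue", "value": "103M", "scope": "global", "valence": "Supports"},
))  # False: within 5% numeric proximity
```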


Input Schema

The submit_investigation tool accepts the following JSON structure:

{
  "research_question": "string (required) — The main question driving the investigation",
  "domain": "string (optional) — Finance | News | Technology | Geopolitics | Medicine | Science | Legal | General",
  "decay_half_life_days": "number (optional) — Explicit temporal half-life override in days",
  "dependent_refresh_factor": "number (optional, 0.0-1.0, default 0.5) — Controls dependent-source corroboration refresh",
  "summary_mode": "boolean (optional, default false) — Compact output: top findings, contested claims, evidence gaps",
  "actors": [
    {
      "id": "string (required)",
      "name": "string (required)",
      "source_type": "string (required) — Analyst | Journalist | Expert | Insider | Regulator | Institutional | Anonymous | SocialMedia | Troll",
      "base_reliability": "number (optional, 0.01-0.99) — Clamped to range. Inconsistent values produce warnings",
      "competence": "object (optional) — Map of predicate name to expertise score (0-1)",
      "default_competence": "number (optional, 0-1, default 1.0) — Fallback for unlisted predicates"
    }
  ],
  "subjects": [
    {
      "id": "string (required)",
      "name": "string (required)",
      "subject_type": "string (required) — Freeform label (e.g., Company, Person, Financial Instrument)"
    }
  ],
  "claims": [
    {
      "id": "string (required)",
      "actor_id": "string (required) — Must match an actor id",
      "subject_id": "string (optional) — Must match a subject id (omit for meta-claims)",
      "target_claim_id": "string (optional) — For meta-claims about other claims",
      "predicate": "string (required) — What aspect this claim is about (e.g., revenue, solvency)",
      "value": "string (required) — The claimed value (e.g., fabricated, legitimate, 100M)",
      "content": "string (required) — Full text of the claim",
      "valence": "string (required) — Supports | Refutes | Neutral — CRITICAL for contradiction detection",
      "claim_type": "string (optional) — Factual | Predictive | Evaluative | Causal | Procedural | Attribution | Methodological",
      "assertion_time": "string (optional, ISO 8601) — When the claim was made. Always provide this",
      "event_start_time": "string (optional, ISO 8601) — Start of the event window. Must be before event_end_time",
      "event_end_time": "string (optional, ISO 8601) — End of the event window",
      "decay_half_life_days": "number (optional) — Per-claim half-life override",
      "epistemic_status": "string (optional) — conjecture | hypothesis | theory | law | superseded | retracted",
      "scope": "string (optional) — Disambiguates same-predicate claims (e.g., global, US, nominal)",
      "supersedes_claim_id": "string (optional) — ID of claim this one replaces",
      "corroboration_events": [
        {
          "time": "string (required, ISO 8601)",
          "actor_id": "string (required) — Must differ from the claim's actor_id",
          "description": "string (optional)"
        }
      ],
      "relationships": [
        {
          "relationship_type": "string (required) — complements | refines | qualifies | supersedes | challenges_methodology",
          "target_claim_id": "string (required)"
        }
      ]
    }
  ],
  "evidence": [
    {
      "id": "string (required)",
      "claim_id": "string (required) — Must match a claim id",
      "content": "string (required) — Description of the evidence",
      "valence": "string (required) — Supports | Refutes | Neutral",
      "weight": "number (optional, default 1.0) — Evidence strength multiplier"
    }
  ]
}

Required fields

Only research_question, actors, subjects, and claims are required at the top level. Within each claim, id, actor_id, predicate, value, content, and valence are required.
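Under those rules, the smallest valid payload looks like the sketch below (built in Python for clarity). Only the field names come from the schema above; the scenario content is invented for illustration, and no official client library is implied.

```python
import json

# Minimal submit_investigation payload: only required fields, plus
# assertion_time, which the docs say to always provide.
payload = {
    "research_question": "Did AcmeCorp overstate Q3 revenue?",
    "actors": [
        {"id": "a1", "name": "Jane Analyst", "source_type": "Analyst"}
    ],
    "subjects": [
        {"id": "s1", "name": "AcmeCorp", "subject_type": "Company"}
    ],
    "claims": [
        {
            "id": "c1", "actor_id": "a1", "subject_id": "s1",
            "predicate": "revenue", "value": "overstated",
            "content": "Q3 revenue appears overstated relative to filings.",
            "valence": "Supports",
            "assertion_time": "2026-03-01T00:00:00Z"
        }
    ]
}

print(json.dumps(payload, indent=2)[:60])
```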


Output Schema

subject_results

Per-subject verdicts:

ranked_claims

All claims sorted by net evidence strength (belief - disbelief). Five-level gradient:

conflict_analysis

Per subject+predicate pair, a full argumentation analysis with interpretation trees. Each InterpretationTree shows:

claim_analyses

Per-claim breakdown:

sensitivity_analysis

Per-subject evidence gaps ordered by potential impact:

temporal_analysis

warnings

Input validation issues: source-type/reliability mismatches, self-corroboration, pre-assertion corroboration. These are soft warnings — the engine never hard-rejects input. Address them to improve analysis quality.


Worked Example: Sherlock Holmes Fraud Investigation

Scenario

Moriarty Enterprises Ltd is a London-based consulting firm run by Professor James Moriarty. It also manages the Reichenbach Investment Fund. Multiple sources have raised concerns about fabricated revenue and fund insolvency.

Research question: "Is Moriarty Enterprises engaged in financial fraud?"

Input

{
  "research_question": "Is Moriarty Enterprises engaged in financial fraud?",
  "domain": "Finance",
  "dependent_refresh_factor": 0.5,
  "actors": [
    { "id": "holmes", "name": "Sherlock Holmes", "source_type": "Expert", "base_reliability": 0.92 },
    { "id": "watson", "name": "Dr. John Watson", "source_type": "Journalist", "base_reliability": 0.75 },
    { "id": "lestrade", "name": "Inspector Lestrade", "source_type": "Regulator", "base_reliability": 0.80 },
    { "id": "adler", "name": "Irene Adler", "source_type": "Insider", "base_reliability": 0.65 },
    { "id": "moriarty", "name": "Prof. James Moriarty", "source_type": "Institutional", "base_reliability": 0.30 },
    { "id": "times", "name": "The Times of London", "source_type": "Journalist", "base_reliability": 0.70 },
    { "id": "telegraph", "name": "The Daily Telegraph", "source_type": "Journalist", "base_reliability": 0.65 }
  ],
  "subjects": [
    { "id": "moriarty-ent", "name": "Moriarty Enterprises Ltd", "subject_type": "Company" },
    { "id": "reichenbach-fund", "name": "Reichenbach Investment Fund", "subject_type": "Financial Instrument" }
  ],
  "claims": [
    {
      "id": "c1", "actor_id": "holmes", "subject_id": "moriarty-ent",
      "predicate": "revenue", "value": "fabricated",
      "content": "After examining the ledgers obtained from the Diogenes Club, the reported revenue of Moriarty Enterprises bears no correlation to actual client payments.",
      "valence": "Supports", "claim_type": "Factual", "epistemic_status": "theory",
      "assertion_time": "2026-02-15T10:00:00Z",
      "corroboration_events": [
        { "time": "2026-02-16T14:00:00Z", "actor_id": "watson" },
        { "time": "2026-02-28T06:00:00Z", "actor_id": "times" }
      ]
    }
  ],
  "evidence": [
    { "id": "e1", "claim_id": "c1", "content": "Ledger photocopies from the Diogenes Club", "valence": "Supports", "weight": 1.2 }
  ]
}

See /llms-full.txt for the complete worked example with all 11 claims and 4 evidence items.

Expected Output Summary

Subject: Moriarty Enterprises — revenue
High belief that revenue is fabricated. Holmes (Expert, 0.92) and Watson (Journalist, 0.75) both support this. Lestrade (Regulator, 0.80) confirms an active investigation. Moriarty's denial carries little weight at 0.30. Truth status: Both.

Subject: Reichenbach Fund — solvency
Very high belief the fund is insolvent. Adler (Insider, 0.65) provides firsthand testimony. Holmes (Expert, 0.92) corroborates with court-ordered bank records. Lestrade confirms the fund is frozen. Truth status: Both.

Ranked claims (top to bottom)

  1. Strong — Holmes: Reichenbach Fund is a Ponzi scheme (bank statements)
  2. Strong — Holmes: Revenue is fabricated (ledger photocopies)
  3. Strong — Lestrade: FCA has frozen the Reichenbach Fund
  4. Strong — Adler: Fund assets overstated by 60%
  5. Strong — Lestrade: Scotland Yard inquiry is active
  6. Moderate — Watson: Circular billing pattern consistent with fraud
  7. Moderate — The Times: Revenue difficult to reconcile with known engagements
  8. Moderate — The Telegraph: Investors unable to withdraw since November
  9. Moderate — The Telegraph: Former associates distancing
  10. Weak — Moriarty: Milverton & Associates audit found no irregularities
  11. Refuted — Moriarty: Revenue is legitimate
  12. Refuted — Moriarty: Fund returned 18% and maintains reserves

Best Practices for LLM Agents

  1. Always set valence on every claim. Without it, the engine cannot distinguish affirmation from denial.
  2. Always provide assertion_time. Temporal analysis requires knowing when claims were made.
  3. Don't duplicate claims from the same actor. Duplicates are dropped to prevent evidence amplification.
  4. Use consistent numeric formats with K/M/B/T suffixes for financial values.
  5. Set scope when claims measure different things on the same predicate (e.g., "global" vs "US").
  6. Use competence scores when a source has uneven expertise across topics.
  7. Use corroboration_events for long-running investigations where early claims might otherwise appear stale.
  8. Use summary_mode: true when you need a quick assessment rather than full analysis.
  9. Read the warnings in the response — they tell you how to improve your input.
  10. Use sensitivity_analysis to decide what to investigate next — it tells you which missing evidence would most change the conclusions.

Links