How We Keep UFC Cards in Sync Without Shipping Bad Data
Project: Ultimate Fight IQ (UFIQ) Link: https://ultimatefightiq.com
How We Keep UFC Cards in Sync Without Shipping Bad Data
Project: Ultimate Fight IQ (UFIQ) Link: https://ultimatefightiq.com
Case study type: Product build The task: Ingest UFC cards from the web, detect lineup changes, and apply updates without corrupting member picks, locked schedules, or published event data. What we learned: Automate discovery and diffing, but route uncertain or destructive changes through a human approval queue with evidence quotes before anything touches production. Last updated: June 2026
Case study at a glance
| The task | Build a pipeline that scrapes UFC.com and UFCStats, extracts structured card data, and keeps Postgres in sync through the fight week lifecycle |
| Who it was for | UFIQ admins and members who need accurate cards, lock times, and live results |
| Main constraint | UFC cards change constantly; blind overwrites break picks, schedules, and trust |
| What we built | UFIQ Card Ingestion Pipeline: Firecrawl scrape agents, Lovable AI extraction, candidate_updates review queue, Mission Control inbox, and candidate-update-action approval |
| Outcome | High-confidence scalars auto-apply with field locks; fighter swaps, fight add/remove, and low-confidence diffs wait for admin approval with cited evidence |
Background
Ultimate Fight IQ runs on event data: fights, fighters, lock times, odds, and results. Early on, card maintenance was manual. That does not scale across a UFC season where cards swap fighters, move times, cancel bouts, and finish live on staggered clocks.
We needed automation, but automation alone is dangerous. A scraper that deletes a fight row breaks pick foreign keys. A time change after members lock picks creates support nightmares. A false-positive fighter swap from a stale page would be worse than no update at all.
The product already had pg_cron and a scheduler (ufc-scheduler). The missing piece was a governed ingestion layer: scrape and extract with AI, diff against Postgres, auto-apply only what we trust, and queue everything else for a human who can read the evidence.
The task
Give the platform a repeatable way to:
- Discover upcoming UFC events and ingest card structure before fight week.
- Refresh existing cards when UFC.com changes matchups, times, or bout order.
- Poll live results during fight night and run heavy verification after the card.
- Never DELETE fights on removal; preserve pick integrity.
- Show admins a review queue with headlines, summaries, and verbatim evidence quotes from the source page.
One sentence version: turn UFC card maintenance into scrape → diff → approve, not scrape → overwrite.
Constraints
- Secrets required. Fetch and refresh agents need
FIRECRAWL_API_KEYandLOVABLE_API_KEY. - Content locks. When
content_locked_atis set, initial fetch skips fight sync; refresh funnels all fight diffs to candidates. - Schedule locks. When
schedule_locked_atis set or times were manually set within 30 days, time fields never auto-apply. - No fight DELETE on removal.
fight_removedsetsstatus = 'cancelled'to preserve pick references. - Live vs heavy agents. While
status = 'live', the scheduler runs leanufc-agent-fetch-live-results(~3 min) and skips heavyufc-agent-fetch-results(max one per tick on fight day/final). - Evidence gating. Refresh agent caps confidence when AI
evidence_quoteis missing or not found in scraped markdown.
Our approach
We split card maintenance into four agent roles plus one approval gate:
flowchart TD
A[pg_cron → ufc-scheduler] --> B{Event phase}
B -->|scheduled / fight_week| C[ufc-agent-fetch-event]
B -->|fight_day / final| D[ufc-agent-fetch-results]
B -->|live + auto| E[ufc-agent-fetch-live-results]
C --> F[Firecrawl scrape ufc.com]
D --> G[Firecrawl scrape ufcstats]
F --> H[Lovable AI extraction]
H --> I{Confidence + field locks}
I -->|high + unlocked| J[Direct apply + field_locks]
I -->|medium / low or locked| K[candidate_updates pending]
L[Mission Control / Run / Agent Ops] --> M[candidate-update-action]
M -->|approve| N[Apply to events / fights]
M -->|approve| O[event_change_articles + content_tasks]
- Discovery agent.
ufc-agent-fetch-eventscrapesufc.com/events, picks an event URL, extracts structure, upserts fighters, and syncs fights viasyncFightsForEvent. - Refresh agent.
ufc-agent-refresh-eventre-scrapes an existing card, matches fights by fighter-pair key, and queues structural diffs. - Results agents. Heavy results from UFCStats with tiered auto-apply; lean live poll from UFC.com with optional auto-apply when
live_results_auto = true. - Shared diff helper.
applyProposals()in_shared/diff.tsauto-applies high-confidence fields unless locked; everything else insertscandidate_updates. - Human gate.
candidate-update-actionapplies approved changes, writesfield_locks, and insertsevent_change_articles.
How we solved it
Step 1: Seed source lists and cap tracked events
What we did: Seeded source_lists with slug ufc-default pointing at UFC.com and UFCStats URLs. Fetch agent defaults to tracking up to two upcoming events (upcoming_cap).
Decision: Cap discovery instead of ingesting every URL on the index.
Why: Fight week ops need depth on the next cards, not a warehouse of stale pages.
Step 2: Scrape with Firecrawl, extract with AI tool calling
What we did: Both fetch and refresh agents call Firecrawl /scrape for markdown plus links, then Lovable AI returns structured events and fights via a report_ufc_event schema with per-field confidence.
Decision: Keep extraction in the agent, not hand-written parsers for every UFC.com layout shift.
Why: UFC pages change markup often. AI extraction with a strict schema beats brittle CSS selectors.
Step 3: Sync fights transactionally on initial ingest
What we did: syncFightsForEvent upserts fights by bout_order and prunes removed orders on first ingest. New events land with review_state: 'draft'.
Decision: Direct sync on first ingest; defer ongoing fight mutations to refresh + candidates when content is locked.
Why: Admins need a complete card quickly on discovery. After publish, destructive sync must not bypass review.
Step 4: Diff refresh cards into proposals with evidence
What we did: Refresh agent detects event scalars, per-fight scalars, fighter_swap, fight_added, and fight_removed. AI attaches change_notes with evidence_quote. evidenceMatches() requires quotes ≥8 chars and substring-found in markdown (case-insensitive). Unverified swaps get "Unverified:" headlines and downgraded confidence.
Decision: All fight-level refresh diffs go to candidate_updates; no auto-apply on fight fields in refresh.
Why: Swaps and removals affect member picks. Admins must see the quote before approving.
Step 5: Tier auto-apply for results with field locks
What we did: applyProposals() auto-applies high-confidence result fields (winner_fighter_id, method, verified bundles) and writes field_locks with source auto_high_confidence. Medium-confidence live progress fields can auto-apply during heavy results. Locked fields skip re-scrape churn.
Decision: Lock fields after apply so agents stop re-proposing the same winner.
Why: Live polling runs every few minutes. Without locks, the queue fills with duplicate proposals.
Step 6: Build Mission Control as the admin inbox
What we did: useMissions.ts loads pending candidate_updates and builds priority cards: time disagreement → fighter swap → fight added → other. Approve calls candidate-update-action; stale refresh missions can re-fire ufc-agent-refresh-event.
Decision: One inbox across Mission Control, Run tab (PendingChangesQueue), and Agent Ops (LiveResultsPanel).
Why: Fight week is time-sensitive. Admins should not hunt three UIs for the same swap proposal.
Step 7: Approve with downstream articles and content tasks
What we did: On approve, candidate-update-action applies the mutation, writes field_locks (admin_approval), inserts event_change_articles when notes.headline is set, and queues content_tasks for structural changes (swap, add, remove).
Decision: Tie approval to member-visible card updates and social content presets.
Why: When the card changes, picks, news strip, and content pipeline should move together.
What we built
| Piece | Role |
|---|---|
ufc-scheduler | pg_cron tick, lifecycle derivation, agent dispatch |
ufc-agent-fetch-event | Discovery ingest + syncFightsForEvent |
ufc-agent-refresh-event | Non-destructive diff + evidence quotes |
ufc-agent-fetch-results | Heavy UFCStats verification |
ufc-agent-fetch-live-results | Lean live poll with markdown hash short-circuit |
candidate_updates | Pending proposal queue with notes jsonb |
field_locks | Per-field seal after auto-apply or admin approval |
candidate-update-action | Approve/reject mutations |
| Mission Control + Run + Agent Ops | Admin review surfaces |
event_change_articles | Evidence-backed card update strip on event pages |
End-to-end flow:
- Scheduler or admin fires fetch for missing upcoming cards.
- Firecrawl + AI writes draft events and fight structure.
- Refresh agent queues uncertain diffs with evidence.
- Admin approves in Mission Control.
- Live agent polls during
status = 'live'; heavy agent verifies after the card. - Approved changes publish to event pages and content tasks.
Results
Before
- Card updates were manual or all-or-nothing scrapes.
- Fight removals risked breaking pick foreign keys.
- No evidence trail when a fighter swapped overnight.
- Live results and post-card verification shared one blunt path.
After
- Discovery, refresh, live poll, and heavy verify are separate agents with clear dispatch rules.
- Structural removals cancel fights instead of deleting rows.
- Admins approve swaps with verbatim quotes from UFC.com markdown.
field_locksstop duplicate proposals after apply.- Approved changes feed
event_change_articlesand social content tasks.
How we know it worked
- Refresh agent suppresses swaps when both old fighters still appear in source markdown.
content_locked_atforces refresh-only diffs through candidates.- Live agent returns early when
last_results_md_hashis unchanged (no LLM spend). - Scheduler excludes heavy results while status is
live. - Mission Control priority ordering matches fight-week urgency.
What you can learn
- Automate discovery, gate mutation. Scraping is cheap; corrupting production data is expensive.
- Evidence quotes are your audit trail. Require substring matches in source markdown before high confidence.
- Never DELETE domain rows users reference. Cancel, void, lock, and preserve foreign keys.
- Split live poll from heavy verify. Fight night needs speed; post-card needs accuracy.
- Field locks beat queue spam. Seal approved fields so agents stop re-proposing winners.
- One inbox, many surfaces. Same
candidate_updatestable powers Mission Control, Run, and Agent Ops.
Next step
Open /admin/mission-control during fight week and work the pending queue. For a stale card, trigger refresh from Quick Actions and compare proposals in the Run tab Before panel. Read docs/guides/mission-control.md for mission types and approve flows.
For developers extending the pipeline: add diff types in ufc-agent-refresh-event, never bypass candidate-update-action for fight mutations on locked content, and keep evidence gating aligned with evidenceMatches().