Ground-up replacement console

WebScraper Console v2

A CRM-focused extraction cockpit for turning pages into ranked, explainable, normalized profile data without falling back to legacy drilldowns.

Checking API: waiting for localhost scanner status...

Product recommendations from the prompt

Build direction, not legacy inheritance

Automated extraction first: Scan, render, extract, normalize, rank, and export from the UI with no manual code parsing.

CRM-ranked profiles: Show which profiles match, why they rank, and how criteria changed the outcome for any timeframe.

Schema on command: Let operators conjure fields with natural language, then promote stable fields into backend schemas.

Diagram-like pipelines: Make stages editable, reorderable, and visible so failures can be remediated and rerun.

BI and scheduled reports: Dashboards, exports, scan history, scheduled workloads, and report readiness belong in the same product shell.

Reliability over nostalgia: Legacy status conflicts, stale drilldowns, and duplicate state sources should be hidden, healed, or removed.

Scan command

Launch and observe

State: unknown
Rows: 0
Started: -
Finished: -
Sites: -
Mode: -

No scan outcome loaded yet.

Ranking criteria

Fully editable scoring workbench

Must include
Exclude
Timeframe

Require at least one include term
Exclude rows when exclude terms match
Hide duplicate profile repeats, keep first seen

Include term weight
Phone signal weight
Source link weight
Fresh today weight
Recent 3-day weight
Exclude term penalty
Duplicate penalty
Missing-date penalty
Minimum score
Result limit

Import / export JSON

Ranking criteria are fully editable here; the old ranking drilldown is no longer required for operator scoring changes.
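The scoring workbench fields above could combine roughly as in the sketch below. It is a minimal illustration, not the console's actual scoring code: the Criteria dataclass, the row keys (text, phone, source_url, date, profile_id), and every default weight are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Criteria:
    # Hypothetical names and defaults that mirror the workbench labels.
    include_terms: tuple = ()
    exclude_terms: tuple = ()
    require_include: bool = False
    drop_on_exclude: bool = False
    hide_duplicates: bool = True
    include_weight: float = 10.0
    phone_weight: float = 5.0
    source_link_weight: float = 2.0
    fresh_today_weight: float = 8.0
    recent_3day_weight: float = 4.0
    exclude_penalty: float = 15.0
    duplicate_penalty: float = 5.0
    missing_date_penalty: float = 3.0
    min_score: float = 0.0
    result_limit: int = 50


def score_row(row, c, today, is_duplicate=False):
    """Return a score for one row, or None when a hard filter removes it."""
    text = row.get("text", "").lower()
    included = [t for t in c.include_terms if t.lower() in text]
    excluded = [t for t in c.exclude_terms if t.lower() in text]
    if c.require_include and not included:
        return None
    if c.drop_on_exclude and excluded:
        return None
    score = c.include_weight * len(included) - c.exclude_penalty * len(excluded)
    if row.get("phone"):
        score += c.phone_weight
    if row.get("source_url"):
        score += c.source_link_weight
    seen = row.get("date")
    if seen is None:
        score -= c.missing_date_penalty
    else:
        age_days = (today - seen).days
        if age_days == 0:
            score += c.fresh_today_weight
        elif 0 < age_days <= 3:
            score += c.recent_3day_weight
    if is_duplicate:
        score -= c.duplicate_penalty
    return score


def rank(rows, c, today):
    """Score rows, hide or penalize repeats, apply the minimum score and limit."""
    seen_ids = set()
    scored = []
    for row in rows:
        key = row.get("profile_id")
        is_dup = key is not None and key in seen_ids
        if key is not None:
            seen_ids.add(key)
        if is_dup and c.hide_duplicates:
            continue  # keep only the first-seen copy of a profile
        score = score_row(row, c, today, is_dup)
        if score is not None and score >= c.min_score:
            scored.append((score, row))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[: c.result_limit]
```

Keeping each weight as a separate field means an exported criteria JSON would map one-to-one onto the form, which is one plausible reason to structure it this way.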

Recovery and version history: Ranking criteria changes can be viewed, restored, or deleted from this panel.

Product completeness

All prompt capabilities, visible and accountable

Accuracy Control Plane

100% auditable UX/UI control, not impossible accuracy claims

Evidence, replay, quality gates, regression tests, drift detection, review queues, lineage, SLOs, compliance, schema, AI governance, integrations, and audit logs are tracked here with honest statuses.

Pipeline designer

Editable stages and drag order

Drag stages to reorder the workflow. Connections are persisted locally until the backend pipeline adapter is promoted.

Recovery and version history: Pipeline stage order has local recovery points with warning-gated restore and delete.

Extraction Fix Lab

Diagnose missed fields from the UI

Available from CRM cards and Data Studio so profile fields, phone, photo, social, and classification misses can be fixed without code.

I reviewed the repair JSON and want to promote safe selectors to the site schema.

Paste a URL, choose expected fields, then diagnose. The console will show extracted values, missing fields, remediation steps, and schema repair actions.

Field evidence ledger

Explain every diagnosed field

Shows source URL, capture timestamp, aliases/selectors, confidence, normalization trace, and adapter evidence when the diagnosis payload provides it.

Partial

No diagnosis yet. Run Diagnose URL to populate field-level evidence. Missing adapter provenance is labeled honestly instead of marked Live.

Replay / trace viewer

Replay the diagnosis path

Shows the operator action, job status, extraction decisions, payload evidence, and missing browser-trace artifacts for the latest diagnosis.

Partial

No replay trace yet. Run Diagnose URL to capture the local action trace. Full screenshots, DOM snapshots, network events, and console errors require a promoted replay manifest adapter.

Regression tests

Golden fixture lab

Run known-good extraction fixtures before promoting repairs. This local runner validates field aliases and extracted values while backend fixture promotion remains pending.

No fixture run yet. Use this before promoting schema or selector changes. Backend golden-dataset enforcement is still required before this becomes Live.

Canary / Shadow

Local promotion comparison

Compares fixture baselines and the latest diagnosis payload without creating production canary jobs.

No shadow comparison yet. Backend canary jobs, sampled production shadow runs, and persisted diff payloads remain required before Live.

Local canary/shadow checks use browser-held fixture and diagnosis evidence only.

Drift detection

DOM drift signals

Flags missing selected fields, absent selector/DOM provenance, and fallback extraction so operators know when a site likely drifted.

No drift check yet. Run a diagnosis first, then check drift signals. Backend DOM baselines and alert policy are still required before this becomes Live.
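The three drift flags described above (missing selected fields, absent selector provenance, fallback extraction) can be derived from a diagnosis payload along these lines. This is a sketch only; the payload shape (a fields mapping whose entries carry value, selector, and fallback keys) is an assumption, not the console's documented format.

```python
def drift_signals(diagnosis, selected_fields):
    """Return (field, signal) pairs suggesting that a site may have drifted."""
    signals = []
    fields = diagnosis.get("fields", {})  # hypothetical payload key
    for name in selected_fields:
        entry = fields.get(name)
        if entry is None or entry.get("value") in (None, ""):
            signals.append((name, "missing_field"))
            continue
        if not entry.get("selector"):
            # The value was extracted, but nothing records where it came from.
            signals.append((name, "no_selector_provenance"))
        if entry.get("fallback"):
            signals.append((name, "fallback_extraction"))
    return signals
```

An empty result only means that no local signal fired; per the text above, backend DOM baselines are still needed before that can be read as "no drift".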

Visual drift detection

Screenshot and media signals

Checks latest diagnosis screenshots, recovered media, and screenshot crop evidence before a repair is promoted.

No visual drift check yet. Run a diagnosis first, then check visual evidence. Screenshot diff baselines remain required before this becomes Live.

Volume and anomaly detection

Run volume signals

Checks loaded rows, scan history, duplicate pressure, missing critical fields, and latest diagnosis completeness before export or promotion.

No volume anomaly check yet. Load scan results or run a diagnosis, then check local anomaly signals. Time-series thresholds and backend anomaly scoring remain required before this becomes Live.

Human review queue

Hold risky changes for review

Queue diagnosis gaps and export gate blockers locally before selectors, schemas, repairs, or exports are promoted.

Local review queue is empty. Backend queue persistence, assignments, and resolution audit remain required before Live.

No local review items. Queue diagnosis gaps or export gate blockers before promoting repairs, schema changes, or exports.

Data lineage

Source to field to export trace

Connects source URL, extracted fields, normalization aliases, quality gate state, and export readiness from the current local evidence.

No lineage evidence loaded. Run Diagnose URL or load scan results to build a local lineage trace. Backend lineage graph payloads and export manifests remain required before Live.

Local lineage uses browser-held diagnosis, loaded rows, field aliases, quality gates, and export state only.

Accuracy scorecard

Auditability score, not a 100% accuracy claim

Summarizes local completeness, confidence, regression fixture, review, drift, and quality-gate signals so operators know what is safe to promote.

No scorecard evidence loaded. Run Diagnose URL or load scan results to build a local accuracy-control scorecard. Backend metric history, labeled ground truth, and promotion gates remain required before Live.

Local scorecard uses browser-held evidence only and must not be treated as guaranteed scraping accuracy.

Repair JSON preview

Recovery and version history: Promoted repair recipes are recoverable before a selector change becomes a dead end.

Schema studio

Conjure fields without code

Natural language field command

Schema Studio fields are editable, removable, versioned, and warning-gated for recovery.

Recovery and version history: Schema field edits and deletes can be inspected, restored, or removed from history.

Export center

JSON, CSV, spreadsheet output

Exports are governed by editable quality gates so low-evidence datasets do not leave the workspace unnoticed.

Export quality gate: Waiting for ranked results.

Enforce gate

Minimum rows
Source URL %
Date %
Phone %
Photo %
Block duplicate repeats

Gate status will update when results are loaded or thresholds change.

Exports use the currently loaded ranked results. Run Reload Results first to export the freshest data.
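A quality gate over the thresholds listed above might be evaluated as in the following sketch. The GateThresholds dataclass, the row keys, and the default values are all hypothetical; only the threshold names come from the form.

```python
from dataclasses import dataclass


@dataclass
class GateThresholds:
    # Hypothetical defaults; the real gate values are edited in the UI.
    min_rows: int = 1
    source_url_pct: float = 90.0
    date_pct: float = 50.0
    phone_pct: float = 0.0
    photo_pct: float = 0.0
    block_duplicates: bool = True


def coverage_pct(rows, field):
    """Percentage of rows with a non-empty value for `field`."""
    if not rows:
        return 0.0
    return 100.0 * sum(1 for r in rows if r.get(field)) / len(rows)


def evaluate_gate(rows, t):
    """Return a list of blocker messages; an empty list means the gate passes."""
    blockers = []
    if len(rows) < t.min_rows:
        blockers.append(f"rows {len(rows)} < {t.min_rows}")
    checks = (
        ("source_url", t.source_url_pct),
        ("date", t.date_pct),
        ("phone", t.phone_pct),
        ("photo", t.photo_pct),
    )
    for field, minimum in checks:
        pct = coverage_pct(rows, field)
        if pct < minimum:
            blockers.append(f"{field} coverage {pct:.0f}% < {minimum:.0f}%")
    if t.block_duplicates:
        ids = [r["profile_id"] for r in rows if r.get("profile_id") is not None]
        if len(ids) != len(set(ids)):
            blockers.append("duplicate profile repeats present")
    return blockers
```

Returning messages rather than a bare boolean lets the gate status line say why an export is held, which fits the "does not leave the workspace unnoticed" intent.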

Workspace recovery

Version every editable object

Ranking criteria, pipeline stages, schema fields, extraction repairs, and reports are recoverable here. Use View to inspect exactly which fields, schema, or settings a recovery point contains; Restore and Delete both require warning confirmation before anything changes.
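The View / Restore / Delete behavior described above can be modeled with a small history object. This is an illustrative sketch under assumed names (RecoveryHistory, the confirmed flag); it does not describe how the console actually stores its recovery points.

```python
import copy


class RecoveryHistory:
    """Local recovery points for one editable object (hypothetical model)."""

    def __init__(self):
        self._points = []

    def save(self, label, state):
        # Deep-copy so later edits to `state` cannot alter the saved point.
        self._points.append({"label": label, "state": copy.deepcopy(state)})
        return len(self._points) - 1

    def view(self, index):
        """Inspect a point without changing anything."""
        return copy.deepcopy(self._points[index]["state"])

    def restore(self, index, confirmed=False):
        if not confirmed:
            raise PermissionError("restore requires warning confirmation")
        return self.view(index)

    def delete(self, index, confirmed=False):
        if not confirmed:
            raise PermissionError("delete requires warning confirmation")
        del self._points[index]

    def __len__(self):
        return len(self._points)
```

View is read-only and needs no confirmation, while the two operations that change state refuse to run unless the caller passes the confirmation explicitly.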

Saved operator views

Return to repeat workflows

View name

Saved views capture the current page, ranking criteria, timeframe, table heights, and report settings locally.

BI dashboard

Operational metrics

Automation and reports

Schedules, retention, remediation

Report Center

Scheduled report definitions

Report name
Cadence
Hour UTC
Minute UTC
Export format
Timeframe days
Result limit
Include scan health
Include ranked profiles
Include failures/remediation
Start scan before report
Request enabled schedule

I reviewed the payload and want to create a local schedule job.

Report definitions are local, versioned, and safe to preview before a schedule job is created.
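As a rough illustration of previewing a schedule payload from a report definition, the sketch below computes the next run time for a daily cadence and refuses to build the payload until it has been reviewed. The payload keys, the definition keys, the fallback defaults, and the "daily" cadence value are all assumptions; the console's real payload format is not shown in this document.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now, hour_utc, minute_utc):
    """First UTC datetime strictly after `now` at the given hour and minute."""
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(
        hour=hour_utc, minute=minute_utc, second=0, microsecond=0
    )
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate


def build_schedule_payload(definition, now, reviewed):
    """Build a preview payload (hypothetical shape) for a reviewed definition."""
    if not reviewed:
        raise ValueError("payload must be reviewed before a schedule job is created")
    if definition["cadence"] != "daily":
        raise ValueError("this sketch only handles a daily cadence")
    run_at = next_daily_run(now, definition["hour_utc"], definition["minute_utc"])
    return {
        "report_name": definition["name"],
        "cadence": definition["cadence"],
        "next_run_utc": run_at.isoformat(),
        "export_format": definition.get("export_format", "json"),
        "timeframe_days": definition.get("timeframe_days", 7),
        "result_limit": definition.get("result_limit", 50),
        "include_scan_health": definition.get("include_scan_health", True),
        "include_ranked_profiles": definition.get("include_ranked_profiles", True),
        "include_failures": definition.get("include_failures", True),
        "start_scan_first": definition.get("start_scan_first", False),
        "enabled": definition.get("enabled", False),
    }
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the preview deterministic and easy to check.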

Recovery and version history: Report definition CRUD has local recovery points with View, Restore, and Delete warnings.

Report output

Definitions, jobs, and payload

Schedule payload preview

Day 1 imports

Use data assets, not old architecture

Sites list: Waiting for adapter... Imported through GET /api/scraper/sites.

Credential manager (reference boundary): Live secrets stay in the credential manager. This app should use inventory/health adapters only.

GV / SMS / Twitch dial info (reference boundary): Allowed as workflow reference. No legacy dialer UI or transport code carries over.

SLO / Error Budget

Reliability gate signals

Local SLO signals use loaded scan history, rows, and adapter probes. Backend burn-rate windows remain required before this can be Live.

Audit Log

Local activity trail

Local activity rows summarize browser-held recovery, review, saved-view, report, and scan-history evidence. Append-only backend audit storage and search remain required before Live.

Site access

Configured sources and recovery

Add or edit source sites here. Every save is versioned so the site list can be restored if needed.

Domain	Name	URL	Status	Pages	Action
Loading sites...

Scan history

Previous runs and outcomes

Started	Status	Rows	Duration	Sites	Failure / Remediation	Log
Loading scan history...

Remediation recipes

Validate URLs and decide the next automated fix

Audit scope
Target limit
Probe URLs now

Run an audit to see reachability, missing snapshot/media obligations, duplicate groups, and the remediation recipe order for new or broken sites.

Capability coverage

What is visible now

Visible now: Scan launch, scan status, scan history, site add/edit/delete with version recovery, ranking criteria, profile ranking, normalized dates, exports, pipeline ordering, schema command surface, BI metrics, and adapter health.

Next build lane: Promote adapter-required cards into write-capable workflows: scheduled report creation, credential unlock UX, CAPTCHA remediation, and communication actions.

Reference only: Old GV/SMS/Twitch dial info and credential manager internals are data/reference boundaries, not inherited UI tech.

Profile intelligence

Ranked extracted records