1. Anomaly Detection & Deduplication
Summary: AI helps reviewers catch bad data before Proof of Verification. It highlights implausible values, strange patterns, and duplicates, then routes windows to pass, quarantine, or reject. The PoV Gate still decides; AI is an assistive layer that makes verification faster and cheaper.
What this module looks at
It reads the already-canonical payload and a small history around it. The inputs are the canonical JSON fields device_id, start_ts, end_ts, quantity_wh, batch_id, and nonce, plus derived fields like duration_s and rate_wh_per_s. It also sees the computed evidenceHash, the last N windows for the same device_id, and (optionally) peer devices on the same grid_node.
The core signals it computes
1) Value sanity — basic checks: non-negative quantity_wh, reasonable duration_s, and device-specific upper bounds. If any check fails outright, return REJECT (for example NEGATIVE_QUANTITY, OUT_OF_BOUNDS).
2) Rate-of-change — compare rate_wh_per_s to a rolling baseline. Sudden jumps (for example > 150% of the recent median) are quarantine candidates, not immediate rejects (see the sketch after this list).
3) Robust outliers — compute a robust score (median/MAD) across recent windows. If |z| is high (for example > 5), flag QUARANTINED for attestor review.
4) Temporal pathologies — detect flatlines (zero output across many windows), impossible sawtooths, or “staircases” inconsistent with the declared method; these usually become QUARANTINED with a reason code.
5) Cross-device correlation — if many devices on the same grid_node disagree with this reading (for example a site-wide drop that only one meter missed), downgrade or upgrade the risk accordingly.
6) Deduplication — enforce that two payloads with the same batch_id must have identical bytes; any reuse of nonce for the same device_id is replay; and the tuple (device_id, start_ts, end_ts) cannot be submitted twice unless bytes match exactly.
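As a sketch of the rate-of-change check in signal 2, assuming a rolling history of recent per-window rates is available; the function name and history shape are illustrative, not part of the spec:

```python
import statistics

def rate_jump_outcome(quantity_wh: float, duration_s: float,
                      recent_rates: list[float]) -> str:
    """Quarantine sudden jumps relative to the rolling median; never hard-reject here."""
    rate_wh_per_s = quantity_wh / duration_s
    baseline = statistics.median(recent_rates)
    if baseline > 0 and (rate_wh_per_s > baseline * 1.5
                         or rate_wh_per_s < baseline / 1.5):
        return "QUARANTINED"  # reason code SPIKE_RATE
    return "PASS"
```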
Clear outcomes
If everything looks normal, return PASS and proceed toward attestation. If it looks suspicious but not certainly wrong, return QUARANTINED with a reason like SPIKE_RATE, FLATLINE, or BASELINE_MISMATCH; attestors will see the evidence and either approve or fail the window. If it’s clearly invalid, return REJECT with a precise code such as OVERLAPPING_WINDOW, DUPLICATE_BATCH, REPLAY_NONCE, NON_CANONICAL_JSON, or TIMESTAMP_SKEW.
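One way to keep the three outcomes machine-checkable is a small tagged result type; the type and field names below are illustrative, not part of the spec:

```python
from dataclasses import dataclass
from typing import Literal, Optional

Outcome = Literal["PASS", "QUARANTINED", "REJECT"]

@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    reason: Optional[str] = None         # e.g. "SPIKE_RATE", "DUPLICATE_BATCH"
    evidence_hash: Optional[str] = None  # links the decision to the canonical payload

flagged = Decision("QUARANTINED", reason="FLATLINE", evidence_hash="<evidenceHash>")
```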
Simple rules that catch most issues
Prefer a few transparent rules to black-box models. Teams and regulators should be able to understand why a window was quarantined.
Bounds — reject if quantity_wh < 0; reject if quantity_wh > rated_wh_per_window × 1.15.
Rate jump — compute rate_wh_per_s = quantity_wh / duration_s; quarantine if rate_wh_per_s exceeds median_rate × 1.5 (or drops below median_rate / 1.5) over a sliding horizon.
Robust outlier — compute median m and MAD over the last K windows; define z = 0.6745 × (rate_wh_per_s − m) / (MAD + ε). Quarantine if |z| > 5; toughen to reject only with corroborating evidence (for example a device alarm).
Flatline — quarantine if quantity_wh = 0 across N consecutive windows during expected solar hours.
Dedup — reject if batch_id is seen with different bytes; reject if nonce repeats for the same device_id; reject a second submission of (device_id, start_ts, end_ts) unless bytes match exactly.
Reference snippets
Python (robust outlier on rate):
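A minimal sketch, assuming a rolling buffer of the last K per-window rates is available; the function names and the ε guard value are illustrative:

```python
import statistics

def robust_z(rate_wh_per_s: float, recent_rates: list[float],
             eps: float = 1e-9) -> float:
    """Robust z-score of the current rate against the last K windows (median/MAD)."""
    m = statistics.median(recent_rates)
    mad = statistics.median(abs(r - m) for r in recent_rates)
    return 0.6745 * (rate_wh_per_s - m) / (mad + eps)

def outlier_outcome(rate_wh_per_s: float, recent_rates: list[float]) -> str:
    """Quarantine when |z| > 5; rejection needs corroborating evidence."""
    z = robust_z(rate_wh_per_s, recent_rates)
    return "QUARANTINED" if abs(z) > 5 else "PASS"
```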
TypeScript (dedup and tuple idempotency):
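A minimal sketch with in-memory stores; a production system would back these with a durable key-value store, and the field and function names are illustrative:

```typescript
type Payload = {
  device_id: string;
  start_ts: number;
  end_ts: number;
  batch_id: string;
  nonce: string;
  payloadHash: string; // hash of the canonical bytes
};

const batchBytes = new Map<string, string>(); // batch_id -> payload hash
const seenNonces = new Set<string>();         // `${device_id}:${nonce}`
const tupleBytes = new Map<string, string>(); // tupleKey -> payload hash

function dedup(p: Payload): { decision: "PASS" | "REJECT"; reason?: string } {
  // Same batch_id must carry identical bytes.
  const prevBatch = batchBytes.get(p.batch_id);
  if (prevBatch !== undefined && prevBatch !== p.payloadHash) {
    return { decision: "REJECT", reason: "DUPLICATE_BATCH" };
  }
  // Any nonce reuse for the same device_id is a replay.
  const nonceKey = `${p.device_id}:${p.nonce}`;
  if (seenNonces.has(nonceKey)) {
    return { decision: "REJECT", reason: "REPLAY_NONCE" };
  }
  // The window tuple may repeat only if bytes match exactly (idempotent resubmit).
  const tupleKey = `${p.device_id}:${p.start_ts}:${p.end_ts}`;
  const prevTuple = tupleBytes.get(tupleKey);
  if (prevTuple !== undefined && prevTuple !== p.payloadHash) {
    return { decision: "REJECT", reason: "OVERLAPPING_WINDOW" };
  }
  batchBytes.set(p.batch_id, p.payloadHash);
  seenNonces.add(nonceKey);
  tupleBytes.set(tupleKey, p.payloadHash);
  return { decision: "PASS" };
}
```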
In practice the tupleKey is a stable encoding of (device_id, start_ts, end_ts), and you also track per-device nonces to block replays.
How this feeds PoV and ESG
Windows marked PASS move forward to attestations and the PoV Gate. QUARANTINED windows are shown first to attestors with clear reasons and context (sparkline of recent rates, cross-device comparison), which speeds up reviews and reduces false negatives. Because ESG reports are rendered from on-chain lineage, fewer quarantines and clearer reasons mean cleaner ESG outputs and fewer audit back-and-forths.
Conformance (simple and clear)
Compute the signals on the canonical JSON. Keep the rules readable (document thresholds like median×1.5 and |z|>5). Return one of three outcomes with a precise reason: PASS, QUARANTINED (SPIKE_RATE, FLATLINE, BASELINE_MISMATCH…), or REJECT (DUPLICATE_BATCH, REPLAY_NONCE, OVERLAPPING_WINDOW…). Store the decision alongside evidenceHash so reviewers and auditors always see why it happened.
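A minimal sketch of an audit record persisted next to evidenceHash; the field names and threshold keys are illustrative:

```python
import json
import time

def decision_record(evidence_hash: str, outcome: str, reason: str,
                    thresholds: dict) -> str:
    """Serialize the decision so reviewers and auditors can always see why it happened."""
    return json.dumps({
        "evidenceHash": evidence_hash,
        "outcome": outcome,        # PASS | QUARANTINED | REJECT
        "reason": reason,          # e.g. "SPIKE_RATE"
        "thresholds": thresholds,  # e.g. {"rate_jump": 1.5, "z_max": 5}
        "decided_at": int(time.time()),
    }, sort_keys=True)
```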
Bottom line: a few transparent signals—bounds, rate-of-change, robust outliers, and dedup—catch most problems. They make verification faster, reduce attestor load, and keep PoV focused on evidence, not guesswork.