DA failures

What a DA failure is—and what it cannot do

A data availability (DA) failure means either that the rollup cannot get its batch data into L1 blobs or that readers cannot fetch those blobs in a timely way. That affects anchoring and reconstructability, not the law of settlement. Contracts still enforce: PoV PASS → EMT → Locked EDSD flips to Unlocked → fee → 50% burn. No batch or blob issue can mint an EMT, flip funds early, bypass One-Claim, or skip burns.

1. Risk surface

  • Posting failure: sequencer cannot post a batch’s EIP-4844 blob; L1 congestion, pricing spikes, misconfiguration

  • Retrieval failure: blobs posted but temporarily unfetchable; client/provider outage, indexing lag

  • Extended lag: batches post, but anchoring is delayed far beyond normal; minutes to hours

  • Blob expiry risk: a catastrophic delay pushes anchoring or replay up against the blob retention window; EIP-4844 blobs are pruned after a limited time (roughly 18 days on Ethereum mainnet)

  • Split-brain clients: readers see inconsistent DA endpoints; one mirrors a blob, another lags

  • Operator error: bad upgrade or credentials prevent blob posting; wrong fee settings starve inclusion

What these do: delay economic finality and audit replay. What they don’t: create money, destroy lineage, or let unproven facts release funds.

2. Guardrails already in place

  • Optimistic rollup with blobs: batches include PoV hashes, EMT ids, Locked→Unlocked EDSD deltas, fee/burn lines, One-Claim updates

  • Forced inclusion path: if the sequencer stalls or censors, anyone can push critical transactions through the L1 inbox; the next batch must include them

  • Permissionless exits: EDM/EDSD can be withdrawn to L1 after the challenge window even if the sequencer is offline

  • Application brakes: Gate PASS + EMT + One-Claim + must-fund block early releases regardless of DA

  • Status & SLOs: we publish block delay, batch lag, blob fetch SLO, and forced-inclusion events; receipts are marked “awaiting L1 anchor” until their batch lands

3. Operating modes

  • Normal (batch lag under 10 min): operate normally; receipts show burn hash; Explorer links L1 blob once posted

  • Degraded (10–60 min): operate normally, badge receipts as “awaiting L1 anchor”; alerts fire; forced-inclusion offered in UI/API for critical releases

  • Anchor-guarded (batch lag over 60 min, governed threshold): one of three governed responses applies. New EMT→release calls auto-route through forced inclusion; or opening new gates pauses until at least one batch lands; or only capped releases per order are allowed until DA resumes (governed cap)

We announce the mode on status pages and webhooks. Why: keep business moving within minutes, but guard against extended DA outage risk with clear rails.
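The three modes reduce to a threshold check on batch lag. The sketch below is illustrative only: the `Mode` enum and `operating_mode` function are hypothetical names, and it assumes the boundaries are read as "under 10 min", "10 to 60 min inclusive", and "over the governed threshold".

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    DEGRADED = "degraded"
    ANCHOR_GUARDED = "anchor-guarded"


# Thresholds taken from the mode list above; the guarded one is governed.
DEGRADED_AFTER_MIN = 10.0
DEFAULT_GUARDED_AFTER_MIN = 60.0


def operating_mode(batch_lag_min: float,
                   guarded_after_min: float = DEFAULT_GUARDED_AFTER_MIN) -> Mode:
    """Map the current batch lag (minutes) to an operating mode."""
    if batch_lag_min < 0:
        raise ValueError("batch lag cannot be negative")
    if batch_lag_min > guarded_after_min:
        return Mode.ANCHOR_GUARDED
    if batch_lag_min >= DEGRADED_AFTER_MIN:
        return Mode.DEGRADED
    return Mode.NORMAL
```

Passing `guarded_after_min` explicitly models a governance change to the threshold without touching the rest of the logic.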

4. Detection → response → recovery

  • Detection: alerts fire when infra.batch_lag exceeds 10 min (the anchor-guarded threshold defaults to 60 min), infra.block_delay shows L2 p95 above 5 s, infra.blob_fetch_slo shows fetch failures above the SLO, or the infra.inbox_forced_inclusion count spikes

  • Response: apps and ops retry with idempotency; if lag exceeds the threshold or a critical release is needed, they trigger forced inclusion, and receipts are flagged “awaiting L1 anchor”. The sequencer adjusts blob fee parameters, fails over to a hot standby, or routes the forced-inclusion queue, and publishes an incident ETA. Governance or the Sec Council enters anchor-guarded mode if needed; the change is public, and exiting the mode is timelocked

  • Recovery: when blobs land, receipts auto-link L1 blob indices and badges clear. A post-mortem shows cause and fix and confirms that no application-level replay bypassed proof
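The detection rules can be sketched as a pure function over a metrics snapshot. This is a rough illustration, not the production alerting config: the `Snapshot` fields, the `fired_alerts` function, and the `spike_factor` used to define a forced-inclusion "spike" are all assumptions, and the 99.5% figure is borrowed from the blob fetch SLO stated elsewhere in this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    batch_lag_min: float               # minutes since the last anchored batch
    block_delay_p95_s: float           # L2 block delay, p95, in seconds
    blob_fetch_success_ratio: float    # share of blob fetches meeting the SLO
    forced_inclusions_last_hour: int
    forced_inclusions_baseline: float  # typical hourly count (assumed input)


def fired_alerts(s: Snapshot, spike_factor: float = 3.0) -> List[str]:
    """Return the names of the alerts that fire for this snapshot."""
    alerts = []
    if s.batch_lag_min > 10:
        alerts.append("infra.batch_lag")
    if s.block_delay_p95_s > 5:
        alerts.append("infra.block_delay")
    if s.blob_fetch_success_ratio < 0.995:
        alerts.append("infra.blob_fetch_slo")
    # "Spike" is not defined in the text; a multiple of baseline is assumed.
    if s.forced_inclusions_last_hour > spike_factor * max(s.forced_inclusions_baseline, 1.0):
        alerts.append("infra.inbox_forced_inclusion")
    return alerts
```

Keeping the rules side-effect free makes it easy to replay an incident's metrics during a post-mortem and confirm which alerts should have fired.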

5. Business impact matrix

| Failure mode | What you feel | What does not change | What to do |
| --- | --- | --- | --- |
| Posting failure (short) | Receipts show “awaiting L1 anchor” | PoV/EMT/One-Claim, Locked→Unlocked, 50% burn | Continue; consider forced inclusion for critical releases |
| Posting failure (extended) | Anchor-guarded mode | Same brakes; funds remain safe | Use forced inclusion; expect gate-open pauses or capped releases until blob posts |
| Retrieval outage | Explorer link delayed | Receipts/burns issued on L2 | Pull blob later; auditors can reconcile after provider recovers |
| Blob expiry risk | Ops banner with countdown | No early release | Governance raises blob fee or prioritizes batches; forced inclusion for critical queue |
| Operator misconfig | Temporary lag | Settlement law intact | Ops failover; config hot-fix; public ETA |

6. KPIs & SLOs

  • Batch lag median/p95: less than or equal to 5 min / 15 min

  • EMT→release latency p50/p95: less than or equal to 5s / 15s (unchanged)

  • Blob fetch SLO: greater than or equal to 99.5% within 5 min

  • Forced-inclusion resolution: included in less than or equal to 30 min

  • Anchor-guarded time/month: 0 (target); alert on any non-zero value

  • Blob expiry headroom: more than 72 h between oldest unanchored event and pruning horizon

KPIs are on the status page; misses ship with root cause and remediation.
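As one way to check the batch lag KPI, the sketch below computes the median and p95 of a window of lag samples and compares them with the 5 min / 15 min targets. The function names are hypothetical, and the nearest-rank percentile method is an assumption; the page does not say how p95 is computed.

```python
import statistics
from typing import Dict, Sequence

MEDIAN_TARGET_MIN = 5.0
P95_TARGET_MIN = 15.0


def nearest_rank_p95(samples: Sequence[float]) -> float:
    """95th percentile by the nearest-rank method (an assumed convention)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def batch_lag_kpi(lags_min: Sequence[float]) -> Dict[str, object]:
    """Summarize batch lag samples (minutes) against the stated targets."""
    if not lags_min:
        raise ValueError("need at least one lag sample")
    median = statistics.median(lags_min)
    p95 = nearest_rank_p95(lags_min)
    return {
        "median_min": median,
        "p95_min": p95,
        "median_ok": median <= MEDIAN_TARGET_MIN,
        "p95_ok": p95 <= P95_TARGET_MIN,
    }
```

A single long outlier in a small window can fail the p95 target while the median stays healthy, which is why both figures are tracked.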

7. Governance knobs

  • Anchor-guarded threshold: when to switch to forced-inclusion/pauses; default 60 min; band 30–120 min

  • Blob fee policy: max fee multipliers for DA posting under congestion

  • Forced-inclusion auto-routing: enable after N minutes for critical endpoints

  • Release caps in guarded mode: per-order/per-hour caps to spread risk

  • Status transparency: require real-time batch/lag charts and inbox depth

These knobs cannot allow money to release without EMT/PASS, reuse evidence, or discount the 50% burn.
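A governed parameter set might be validated along these lines. Only the 30 to 120 min band and the 60 min default come from the text; the `GuardedModeKnobs` type, its other field names, and the positivity checks are hypothetical. The sketch deliberately has no field for the burn share or for bypassing EMT/PASS, since those are outside governance.

```python
from dataclasses import dataclass
from typing import List

THRESHOLD_BAND_MIN = (30, 120)  # governed band for the anchor-guarded threshold


@dataclass(frozen=True)
class GuardedModeKnobs:
    max_blob_fee_multiplier: float   # hypothetical name for the blob fee policy
    auto_route_after_min: int        # the "N minutes" for forced-inclusion routing
    per_order_release_cap: int       # hypothetical unit: releases per order per hour
    anchor_guarded_threshold_min: int = 60


def validate(knobs: GuardedModeKnobs) -> List[str]:
    """Return a list of human-readable violations; empty means acceptable."""
    errors = []
    low, high = THRESHOLD_BAND_MIN
    if not low <= knobs.anchor_guarded_threshold_min <= high:
        errors.append(f"anchor-guarded threshold must be within {low}-{high} min")
    if knobs.max_blob_fee_multiplier < 1:
        errors.append("blob fee multiplier must be at least 1")
    if knobs.auto_route_after_min <= 0:
        errors.append("auto-routing delay must be positive")
    if knobs.per_order_release_cap <= 0:
        errors.append("release cap must be positive")
    return errors
```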

8. Hardening techniques we use

  • Multi-provider DA reads: fetch blobs from multiple archival providers; verify root against L1

  • Pre-funded blob gas buffer: governed reserve for DA surges; alert when drawn

  • Batch sizing/timing: adaptive target sizes to avoid blob overflows; shorter bursts under load

  • Auto-fallback to inbox: queue critical calls for forced inclusion when lag exceeds threshold

  • Replica explorers: mirrors of proof pages so audits don’t depend on one front-end
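The multi-provider read can be sketched as "try each provider, keep the first response that matches what L1 committed to". The versioned-hash derivation below follows EIP-4844 (version byte 0x01 followed by the last 31 bytes of the SHA-256 of the KZG commitment). The provider interface and `fetch_blob` are hypothetical, and the sketch only checks the commitment against the L1 versioned hash; proving that the blob bytes match the commitment needs a KZG library and is out of scope here.

```python
import hashlib
from typing import Callable, Iterable, Optional, Tuple

VERSIONED_HASH_VERSION_KZG = b"\x01"

# Hypothetical provider interface: versioned hash -> (blob, kzg_commitment).
Provider = Callable[[bytes], Tuple[bytes, bytes]]


def kzg_to_versioned_hash(commitment: bytes) -> bytes:
    """Derive the versioned hash that L1 stores for a blob commitment."""
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(commitment).digest()[1:]


def fetch_blob(versioned_hash: bytes,
               providers: Iterable[Provider]) -> Optional[Tuple[bytes, bytes]]:
    """Return the first (blob, commitment) whose commitment matches L1."""
    for provider in providers:
        try:
            blob, commitment = provider(versioned_hash)
        except Exception:
            continue  # provider outage or lag: try the next mirror
        if kzg_to_versioned_hash(commitment) == versioned_hash:
            return blob, commitment
    return None  # nobody could serve a matching blob yet
```

Returning `None` rather than raising lets callers treat "not yet fetchable" as a retrieval delay, which matches how the page classifies it.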

9. Operator checklist

  • If you see “awaiting L1 anchor”: proceed; receipts remain valid. Note the badge for audit

  • For time-critical releases during lag: use forced inclusion in UI/API (idempotent)

  • Don’t resubmit without an Idempotency-Key: with the key, duplicates are no-ops; without it, they clog queues

  • Plan cross-chain withdrawals around the challenge window: DA lag doesn’t affect on-rail cash safety

  • Reconcile later: when the blob lands, your receipt auto-links the L1 index
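To illustrate why keyed resubmits are harmless, here is a toy in-memory model of an idempotent forced-inclusion queue. The `ForcedInclusionQueue` class and its receipt shape are hypothetical and stand in for whatever the real UI/API does; only the Idempotency-Key behaviour and the “awaiting L1 anchor” badge come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ForcedInclusionQueue:
    _by_key: Dict[str, dict] = field(default_factory=dict)
    pending: List[str] = field(default_factory=list)

    def submit(self, idempotency_key: str, release_call: dict) -> Tuple[dict, bool]:
        """Queue a release call once per key; return (receipt, was_new)."""
        if not idempotency_key:
            raise ValueError("an Idempotency-Key is required")
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing, False  # duplicate: no-op, same receipt back
        receipt = {
            "key": idempotency_key,
            "call": release_call,
            "status": "awaiting L1 anchor",
        }
        self._by_key[idempotency_key] = receipt
        self.pending.append(idempotency_key)
        return receipt, True
```

A retry with the same key gets the original receipt back and adds nothing to `pending`, so clients can retry freely during lag without growing the queue.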


Plain recap

DA failures delay anchoring, not admissibility. When blobs lag, settlement law still holds—PoV PASS → EMT → Locked→Unlocked EDSD → fee → 50% burn—and forced inclusion keeps critical flows alive. If a lag becomes extended, we switch to an anchor-guarded mode (forced inclusion/pauses/caps) until blobs land. Worst case, you wait with receipts—you never pay early or lose the trail. No EMT, no funds.
