For Wissal — pipeline status & Wasabi buckets

One page. Where each pipeline is, what each bucket holds, what is yours this week.

Pipelines

| Pipeline | State | Next move |
| --- | --- | --- |
| Databento (XNAS ITCH) | ✅ ingested. 1.43 TiB on maqi-databento, 3 lots, vendor SHA-256 verified per lot. | None; the corpus is canonical. |
| S&P Global (Xpressfeed) | 🟡 streaming. SFTP sftp2.spglobal.com \(\to\) Wasabi via rclone copy. 10.5 MiB landed, ~3.75 TiB expected. Buckets created. | Continue the stream; document Products/ packages once landed. |
| GDELT | ✅ ingested. 47.78 GiB on maqi-gdelt, vendor manifests with MD5 and file sizes present. | None. |
| CausalityLink | ✅ ingested as a single dated snapshot (2021-08-13). 186.91 GiB on maqi-causalitylink. No vendor checksum, so size-only reconciliation. | None. |
| RavenPack | ✅ ingested. 249.21 GiB on maqi-ravenpack, one zip per year. 2020 missing at source. | None. |
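
The SHA-256 and MD5 checks listed in the table can be reproduced locally with a sketch along these lines. It assumes the expected hex digest for each file has already been read from the vendor manifest; the helper names `file_digest` and `verify_file` are illustrative and are not taken from the project's scripts.

```python
import hashlib


def file_digest(path, algo="sha256", chunk_size=1 << 20):
    """Stream a file through hashlib and return its hex digest.

    Reading in chunks keeps memory flat even for multi-GiB objects.
    `algo` is any name accepted by hashlib.new, e.g. "sha256" or "md5".
    """
    h = hashlib.new(algo)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_file(path, expected_hex, algo="sha256"):
    """Return True when the local digest matches the manifest value."""
    return file_digest(path, algo) == expected_hex.strip().lower()
```

For a bucket without vendor checksums (the CausalityLink case), this check does not apply and only sizes can be compared.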

Wasabi buckets — operational state

Snapshot 2026-04-14, region eu-central-1 (Frankfurt). Source: docs/wasabi/state.md.

| Bucket | Size | Objects | State |
| --- | --- | --- | --- |
| maqi-databento | 1.430 TiB | 3 042 | full, SHA-256 reconciled |
| maqi-spglobal | 10.528 MiB | 6 | preview, stream in flight |
| maqi-ravenpack | 249.214 GiB | 14 | full, size-only reconciliation |
| maqi-causalitylink | 186.909 GiB | 21 860 | snapshot 2021-08-13, frozen |
| maqi-gdelt | 47.781 GiB | 4 658 | full, MD5 reconciled |
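
Where the table says "size-only reconciliation", the check amounts to comparing expected byte counts against what the bucket listing reports. A minimal sketch, assuming both sides are already available as `{object key: size in bytes}` mappings; the function name and the example keys are hypothetical.

```python
def reconcile_sizes(expected, observed):
    """Compare two {key: size_in_bytes} mappings.

    Returns sorted lists of keys that are missing from the bucket,
    present but not expected, or present with a different size.
    """
    missing = sorted(k for k in expected if k not in observed)
    unexpected = sorted(k for k in observed if k not in expected)
    mismatched = sorted(
        k for k in expected if k in observed and expected[k] != observed[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Hypothetical keys and sizes, for illustration only.
report = reconcile_sizes(
    expected={"2018.zip": 100, "2019.zip": 200, "2021.zip": 300},
    observed={"2018.zip": 100, "2019.zip": 250},
)
print(report)
```

A size match does not prove content integrity; it only catches truncated or absent objects.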

Total \(\approx\) 5.7 TiB on six buckets (the test bucket maqi is not counted).
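
As a sanity check on the total, the arithmetic can be redone from the snapshot table. The five listed rows sum to roughly 1.90 TiB; the 5.7 TiB figure is consistent with counting the ~3.75 TiB expected from S&P instead of the 10.5 MiB landed so far. That reading is an assumption, not something state.md is known to say.

```python
UNITS = {"MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_bytes(size):
    """Convert a string such as '249.214 GiB' to a byte count."""
    value, unit = size.split()
    return float(value) * UNITS[unit]


# Sizes copied from the snapshot table above.
landed = {
    "maqi-databento": "1.430 TiB",
    "maqi-spglobal": "10.528 MiB",
    "maqi-ravenpack": "249.214 GiB",
    "maqi-causalitylink": "186.909 GiB",
    "maqi-gdelt": "47.781 GiB",
}

landed_tib = sum(to_bytes(s) for s in landed.values()) / UNITS["TiB"]

# Assumption: swap the S&P preview for the ~3.75 TiB expected at completion.
projected_tib = (
    landed_tib
    - to_bytes(landed["maqi-spglobal"]) / UNITS["TiB"]
    + to_bytes("3.75 TiB") / UNITS["TiB"]
)

print(f"landed:    {landed_tib:.2f} TiB")     # landed:    1.90 TiB
print(f"projected: {projected_tib:.2f} TiB")  # projected: 5.65 TiB
```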

Three next actions on Wissal's side

  1. Document the S&P Products/ packages as they land: inspect the schema of each .xffmt.zip (manifest <table>.cnt plus pipe-delimited <table>.txt). Output goes to a per-package sheet under docs/providers/.
  2. Re-run scripts/wasabi-state.sh once the S&P stream is complete and commit the diff. The totals table is auto-generated; do not edit it by hand.
  3. Validate use-case scope in scenario-matrix §1: the 5 use cases (UCs) are a projection of anticipated usage, not a pedagogical contract. Push back if any UC understates real demand.
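
For the first action, schema inspection of a package could start from a sketch like the one below. It relies only on what is stated above (a zip holding `<table>.cnt` and pipe-delimited `<table>.txt`). The internal layout of the .cnt manifest is not described here, so it is returned as raw text. UTF-8 encoding and the absence of quoted fields are assumptions, and `inspect_package` is a hypothetical helper name.

```python
import csv
import io
import zipfile
from itertools import islice


def inspect_package(zip_source, sample_rows=3, encoding="utf-8"):
    """Summarise each <table>.txt found in an .xffmt.zip package.

    zip_source may be a path or a binary file-like object.
    Returns {table: {"cnt": raw manifest text or None,
                     "n_columns": width of the first row,
                     "sample": first `sample_rows` rows}}.
    """
    summary = {}
    with zipfile.ZipFile(zip_source) as zf:
        names = set(zf.namelist())
        for name in sorted(names):
            if not name.endswith(".txt"):
                continue
            table = name[: -len(".txt")]
            with zf.open(name) as raw:
                text = io.TextIOWrapper(
                    raw, encoding=encoding, errors="replace", newline=""
                )
                # Assumption: every '|' is a separator, no quoting.
                reader = csv.reader(text, delimiter="|", quoting=csv.QUOTE_NONE)
                rows = list(islice(reader, sample_rows))
            cnt_name = table + ".cnt"
            cnt_text = None
            if cnt_name in names:
                cnt_text = zf.read(cnt_name).decode(encoding, errors="replace").strip()
            summary[table] = {
                "cnt": cnt_text,
                "n_columns": len(rows[0]) if rows else 0,
                "sample": rows,
            }
    return summary
```

Only the first few rows of each table are read, so the function stays cheap on large packages; the returned summary can then be written into the per-package document under docs/providers/.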

Reading the buckets

The notebook of record is notebooks/maqi-data-demo.ipynb — end-to-end examples per bucket (Avro, DBN+Zstandard, ZIP+CSV, ZIP+XFFMT). Engine matrix: docs/wasabi/state.md.

Anomalies log

Vendor-side defects worth knowing before a student lab opens a notebook are detailed in docs/wasabi/anomalies.md.
