refactor(datasets-raw): consolidate raw dataset type across extractors #1883

Merged
LNSD merged 1 commit into main from lnsd/feat-datasets-raw-dataset-type on Mar 2, 2026
@LNSD LNSD commented Mar 2, 2026

Replace per-extractor Dataset structs with the canonical datasets_raw::dataset::Dataset, eliminating duplicate implementations and centralizing raw dataset identity.

  • Delete dataset.rs from evm-rpc, firehose, and solana extractors; their dataset() fns now return RawDataset
  • Add DatasetStore::get_raw_dataset() with downcast to Arc<RawDataset> and GetRawDatasetError
  • Simplify resolve_raw_dataset_from_dependencies to return Arc<RawDataset> directly
  • Pass Arc<RawDataset> into amp_worker_datasets_raw::dump instead of resolving inside
  • Assert in RawDataset::new that all table networks match the dataset network
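
The downcast and the network assertion described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the field layout of `RawDataset`, the `RawTable` and `DerivedDataset` types, the `downcast_raw` helper, and the use of `Arc<dyn Any + Send + Sync>` as the stored dataset handle are all assumptions made for the example. Only the names `RawDataset`, `GetRawDatasetError`, and `NotARawDataset` come from the conversation.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical table shape; the real table type is not shown in this PR page.
#[derive(Debug)]
pub struct RawTable {
    pub name: String,
    pub network: String,
}

/// Hypothetical stand-in for the canonical raw dataset type.
#[derive(Debug)]
pub struct RawDataset {
    network: String,
    tables: Vec<RawTable>,
}

impl RawDataset {
    /// Builds a raw dataset, panicking if any table's network differs
    /// from the dataset's network.
    pub fn new(network: impl Into<String>, tables: Vec<RawTable>) -> Self {
        let network = network.into();
        assert!(
            tables.iter().all(|t| t.network == network),
            "all table networks must match the dataset network `{}`",
            network
        );
        Self { network, tables }
    }

    pub fn network(&self) -> &str {
        &self.network
    }

    pub fn tables(&self) -> &[RawTable] {
        &self.tables
    }
}

/// Hypothetical non-raw dataset kind, used only to exercise the error path.
#[derive(Debug)]
pub struct DerivedDataset;

/// Only the `NotARawDataset` variant is named in the conversation;
/// any other variants of the real error type are not modeled here.
#[derive(Debug, PartialEq, Eq)]
pub enum GetRawDatasetError {
    NotARawDataset,
}

/// Assumed helper: downcast a type-erased dataset handle to `Arc<RawDataset>`.
pub fn downcast_raw(
    dataset: Arc<dyn Any + Send + Sync>,
) -> Result<Arc<RawDataset>, GetRawDatasetError> {
    dataset
        .downcast::<RawDataset>()
        .map_err(|_| GetRawDatasetError::NotARawDataset)
}
```

Under these assumptions, a caller that already holds a type-erased handle gets back a shared `Arc<RawDataset>` without cloning the dataset, and a handle of any other concrete type maps to `GetRawDatasetError::NotARawDataset`.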

@LNSD LNSD requested review from mitchhs12, shiyasmohd and sistemd March 2, 2026 14:59
@LNSD LNSD self-assigned this Mar 2, 2026
@LNSD LNSD changed the title refactor(datasets-raw): consolidate raw dataset type across extractors feat(datasets-raw): consolidate raw dataset type across extractors Mar 2, 2026
@LNSD LNSD changed the title feat(datasets-raw): consolidate raw dataset type across extractors refactor(datasets-raw): consolidate raw dataset type across extractors Mar 2, 2026
@LNSD LNSD force-pushed the lnsd/feat-datasets-raw-dataset-type branch from 06619af to 78a67bb Compare March 2, 2026 15:15
@shiyasmohd shiyasmohd left a comment

Overall LGTM.
The only issue is that GetRawDatasetError::NotARawDataset is handled as retryable.
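
One way to read this concern, sketched under assumptions: a dataset that is not a raw dataset will not become one on a later attempt, so retrying cannot help. The `Store` variant, the `is_retryable` method, and the `run_with_retry` helper below are hypothetical and are not taken from this PR; only `GetRawDatasetError::NotARawDataset` is named in the conversation.

```rust
/// Hypothetical error shape for illustration only.
#[derive(Debug)]
pub enum GetRawDatasetError {
    /// Assumed variant: the store lookup itself failed, possibly transiently.
    Store(String),
    /// The dataset was found but is not a raw dataset.
    NotARawDataset,
}

impl GetRawDatasetError {
    /// Assumed classification: a type mismatch is permanent, so it is
    /// treated as fatal rather than retryable.
    pub fn is_retryable(&self) -> bool {
        match self {
            Self::Store(_) => true,
            Self::NotARawDataset => false,
        }
    }
}

/// Hypothetical retry loop: retries only retryable errors, and makes at
/// most `max_attempts` calls to `op`.
pub fn run_with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, GetRawDatasetError>,
) -> Result<T, GetRawDatasetError> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retryable failure with attempts left: try again.
            Err(e) if e.is_retryable() && attempt < max_attempts => attempt += 1,
            // Fatal failure, or attempts exhausted: surface the error.
            Err(e) => return Err(e),
        }
    }
}
```

With this classification, a `NotARawDataset` failure surfaces after the first call, while a `Store` failure is retried until the attempt budget runs out.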

Replace per-extractor `Dataset` structs with the canonical `datasets_raw::dataset::Dataset`, eliminating duplicate implementations and centralizing raw dataset identity.

- Delete `dataset.rs` from evm-rpc, firehose, and solana extractors; their `dataset()` fns now return `RawDataset`
- Inline dataset resolution into raw `dump`, matching derived `dump` pattern: both now take `&HashReference` and resolve internally via `DatasetStore::get_dataset`
- Simplify `resolve_raw_dataset_from_dependencies` to downcast `Arc<RawDataset>` directly without async store call
- Assert in `RawDataset::new` that all table networks match the dataset network

Signed-off-by: Lorenzo Delgado <lorenzo@edgeandnode.com>
@LNSD LNSD force-pushed the lnsd/feat-datasets-raw-dataset-type branch from 78a67bb to a86ac08 Compare March 2, 2026 15:53
@LNSD LNSD merged commit 9d1a5d5 into main Mar 2, 2026
9 checks passed
@LNSD LNSD deleted the lnsd/feat-datasets-raw-dataset-type branch March 2, 2026 15:57

2 participants