Skip to content

Conversation

@tac0turtle
Copy link
Contributor

@tac0turtle tac0turtle commented Dec 16, 2025

Closes: #2936

Summary

  • Use DA layer header timestamps instead of time.Now() for deterministic blob retrieval responses
  • Add timeout protection to header API calls to prevent indefinite hangs
  • Integrate header RPC namespace into the DA client

Changes

  • Add HeaderAPI to the DA client for fetching block timestamps from the DA layer
  • Replace time.Now() with header-derived timestamps in Retrieve responses for determinism
  • Add getBlockTimestamp helper with proper timeout handling
  • Extend jsonrpc.Client to connect to both blob and header namespaces
  • Add HeaderModule interface and mock for testing

@github-actions
Copy link
Contributor

github-actions bot commented Dec 16, 2025

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedDec 18, 2025, 8:48 PM

@claude
Copy link
Contributor

claude bot commented Dec 16, 2025

Claude finished @tac0turtle's task in 3m 14s —— View job


Code Review Complete

I've completed a comprehensive review of this PR. The implementation successfully addresses issue #2936 by replacing time.Now() with DA header timestamps for deterministic blob retrieval. Overall, the changes are well-structured and the approach is sound. Below are my findings organized by focus area.


1. Code Quality & Architecture

Strengths

  • Clean separation of concerns: The new HeaderAPI mirrors the existing BlobAPI pattern, maintaining consistency with the codebase architecture
  • Proper interface design: HeaderModule interface in pkg/da/jsonrpc/module.go enables testability and follows Go best practices
  • Minimal coupling: The getBlockTimestamp helper (client.go:186-196) is well-encapsulated and reusable
  • Idiomatic Go: Code follows standard Go conventions with proper error wrapping using %w

⚠️ Issues

Critical: Fallback to time.Now() breaks determinism guarantee (client.go:227, 265)

The core purpose of this PR is to ensure deterministic timestamps, but the fallback logic defeats this goal:

blockTime, err := c.getBlockTimestamp(ctx, height)
if err != nil {
    c.logger.Error().Uint64("height", height).Err(err).Msg("failed to get block timestamp")
    blockTime = time.Now()  // ❌ Non-deterministic fallback
    // TODO: we should retry fetching the timestamp. Current time may mess block time consistency for based sequencers.
}

Impact: When the header API fails, different nodes will record different timestamps for the same DA height, causing state divergence in based sequencers.

Recommendation: As @julienrbrt correctly noted in the review, return time.Time{} (zero time) instead and fail fast. This forces callers to handle the error explicitly rather than silently introducing non-determinism:

blockTime, err := c.getBlockTimestamp(ctx, height)
if err != nil {
    c.logger.Error().Uint64("height", height).Err(err).Msg("failed to get block timestamp")
    return datypes.ResultRetrieve{
        BaseResult: datypes.BaseResult{
            Code:      datypes.StatusError,
            Message:   fmt.Sprintf("failed to get deterministic block timestamp: %s", err.Error()),
            Height:    height,
            Timestamp: time.Time{}, // Zero time signals error
        },
    }
}

Minor: Inconsistent variable naming (client.go:214-215)

blobCtx, blobCancel := context.WithTimeout(ctx, c.defaultTimeout)
defer blobCancel()

While this is an improvement over the previous getCtx, consider using consistent naming throughout. The pattern opCtx (e.g., blobCtx, headerCtx) is good, but cancel without prefix is equally clear and more common in Go codebases.


2. Error Handling & Resilience

⚠️ Issues

Critical: No retry mechanism for transient failures

The TODO comment acknowledges this (client.go:228, 266):

// TODO: we should retry fetching the timestamp. Current time may mess block time consistency for based sequencers.

Impact: Transient network failures, temporary celestia-node unavailability, or rate limiting will cause permanent failures even though the header data exists and is retrievable.

Recommendation: Implement exponential backoff retry logic:

func (c *client) getBlockTimestamp(ctx context.Context, height uint64) (time.Time, error) {
    var lastErr error
    maxRetries := 3
    
    for attempt := 0; attempt < maxRetries; attempt++ {
        if attempt > 0 {
            backoff := time.Duration(1<<uint(attempt-1)) * 100 * time.Millisecond
            select {
            case <-time.After(backoff):
            case <-ctx.Done():
                return time.Time{}, ctx.Err()
            }
        }
        
        headerCtx, cancel := context.WithTimeout(ctx, c.defaultTimeout)
        header, err := c.headerAPI.GetByHeight(headerCtx, height)
        cancel()
        
        if err == nil {
            return header.Time(), nil
        }
        lastErr = err
        
        // Don't retry on context cancellation
        if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
            break
        }
    }
    
    return time.Time{}, fmt.Errorf("failed to get header timestamp for block %d after %d attempts: %w", height, maxRetries, lastErr)
}

Moderate: Parent context used for header fetch may timeout prematurely

The blob fetch uses a timeout context (client.go:214), but the header fetch uses the parent context (client.go:224, 262). If the parent context has a tight deadline, the header fetch might fail even though it should succeed.

Recommendation: Consider whether getBlockTimestamp should use its own timeout or respect the parent context's deadline. The current implementation creates its own timeout, which is good, but document this behavior.

Minor: Error messages don't distinguish failure modes

When GetByHeight fails, we don't know if it's due to:

  • Network failure (retryable)
  • Height doesn't exist yet (retryable with backoff)
  • Authentication failure (non-retryable)
  • Malformed request (non-retryable)

Recommendation: Inspect error types and add appropriate error handling for different failure modes.


3. Security

No Critical Issues Found

  • No credential exposure in logs
  • Proper context propagation prevents goroutine leaks
  • Authorization headers handled securely in jsonrpc/client.go:33
  • No untrusted input directly used in RPC calls without validation

💡 Minor Observations

Defense in depth: Header timestamp validation

While celestia-node is trusted, consider validating that header.Time() returns a reasonable timestamp:

blockTime := header.Time()
if blockTime.IsZero() {
    return time.Time{}, fmt.Errorf("DA header for block %d returned zero timestamp", height)
}
// Optional: Check timestamp is not in the future by more than clock skew tolerance
if blockTime.After(time.Now().Add(5 * time.Minute)) {
    c.logger.Warn().Uint64("height", height).Time("timestamp", blockTime).Msg("DA header timestamp is in the future")
}

4. Performance & Resource Efficiency

Good Practices

  • Proper use of timeouts prevents resource exhaustion
  • Contexts are properly canceled with defer cancel()
  • No unnecessary allocations in hot paths

⚠️ Issues

Moderate: Sequential header fetches add latency

Currently, for every Retrieve call that succeeds or returns "not found", we make a separate GetByHeight RPC call. This adds network round-trip latency.

Measurement: Assuming 20ms RTT to celestia-node:

  • Before: 1 RPC call (GetAll) = 20ms
  • After: 2 RPC calls (GetAll + GetByHeight) = 40ms
  • Impact: 2x latency for all DA retrievals

Recommendation: Consider batching or caching strategies:

  1. Header caching (simple, high impact):
type client struct {
    // ... existing fields
    headerCache *lru.Cache // Cache recently fetched headers
    cacheMu     sync.RWMutex
}
  1. Parallel fetching (moderate complexity):
// Fetch header and blobs concurrently
var (
    blobs   []*blobrpc.Blob
    header  *blobrpc.Header
    blobErr error
    hdrErr  error
    wg      sync.WaitGroup
)

wg.Add(2)
go func() {
    defer wg.Done()
    blobCtx, cancel := context.WithTimeout(ctx, c.defaultTimeout)
    defer cancel()
    blobs, blobErr = c.blobAPI.GetAll(blobCtx, height, []share.Namespace{ns})
}()
go func() {
    defer wg.Done()
    headerCtx, cancel := context.WithTimeout(ctx, c.defaultTimeout)
    defer cancel()
    header, hdrErr = c.headerAPI.GetByHeight(headerCtx, height)
}()
wg.Wait()

Minor: Timeout constant duplicated

The c.defaultTimeout is used consistently, but consider making timeouts configurable per operation type (blob fetch vs header fetch) for finer control.


5. Testing & Reliability

Strengths

  • Good test coverage with table-driven tests
  • Mock infrastructure properly generated via mockery
  • Tests verify timestamp determinism (client_test.go:129, 155)
  • Edge cases covered (invalid namespace, not found, errors)

⚠️ Issues

Critical: Missing test for header API failure

There's no test covering the critical path where GetByHeight fails. This is the exact scenario where the non-deterministic fallback would be triggered.

Recommendation: Add test case:

func TestClient_Retrieve_HeaderFailureFallback(t *testing.T) {
    ns := share.MustNewV0Namespace([]byte("ns")).Bytes()
    blobModule := mocks.NewMockBlobModule(t)
    b, _ := blobrpc.NewBlobV0(share.MustNewV0Namespace([]byte("ns")), []byte("data"))
    blobModule.On("GetAll", mock.Anything, uint64(10), mock.Anything).Return([]*blobrpc.Blob{b}, nil)
    
    headerModule := mocks.NewMockHeaderModule(t)
    headerModule.On("GetByHeight", mock.Anything, uint64(10)).Return(nil, errors.New("header service unavailable"))
    
    cl := NewClient(Config{
        DA:            makeBlobRPCClient(blobModule, headerModule),
        Logger:        zerolog.Nop(),
        Namespace:     "ns",
        DataNamespace: "ns",
    })
    
    res := cl.Retrieve(context.Background(), 10, ns)
    
    // With current implementation, this succeeds with time.Now() - bad!
    // After fix, should return StatusError
    require.Equal(t, datypes.StatusError, res.Code)
    require.True(t, res.Timestamp.IsZero())
}

Moderate: Test coverage regression

Codecov shows patch coverage dropped to 12.10% with 138 lines missing coverage. While generated mock files account for 116 lines (acceptable), the remaining gaps should be addressed:

  • pkg/da/jsonrpc/header.go: 0% coverage (10 lines) - New file needs tests
  • block/internal/da/client.go: 65% coverage - Missing header failure paths

Recommendation: Add unit tests for:

  1. HeaderAPI.GetByHeight, LocalHead, NetworkHead methods
  2. Header.Time() method with both Header.Time and BlockTime scenarios
  3. Header API timeout behavior
  4. Concurrent blob + header fetches

Minor: Fixed timestamp in tests may hide bugs

The test helper uses a fixed time (client_test.go:43):

fixedTime := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

While this is good for determinism verification, consider also testing with varied timestamps to ensure no implicit assumptions about time values.


6. Documentation & Developer Experience

Strengths

  • Clear function documentation for getBlockTimestamp (client.go:185)
  • Updated Retrieve comment mentions determinism (client.go:200)
  • TODO comments acknowledge technical debt (client.go:228, 266)

⚠️ Issues

Moderate: Missing package-level documentation for header API

The new pkg/da/jsonrpc/header.go file lacks package comments explaining:

  • Purpose of the header API integration
  • Relationship to celestia-node's header module
  • When/why developers would use these types

Recommendation: Add package comment:

// Package jsonrpc provides a client for celestia-node's JSON-RPC API.
// 
// The HeaderAPI provides access to block header information from the DA layer,
// primarily used to retrieve deterministic block timestamps for blob operations.
// This ensures that based sequencers maintain consensus on block timing.

Minor: Error message clarity

Error messages like "failed to get header timestamp for block %d" (client.go:192) could be more actionable. Consider:

return time.Time{}, fmt.Errorf("failed to fetch DA header timestamp at height %d (ensure celestia-node header service is available): %w", height, err)

Minor: Configuration documentation

The Config struct (client.go:19-27) doesn't document that DA.Header must be non-nil for the client to function correctly. Add field comment:

DA *blobrpc.Client // Required: Must have both Blob and Header APIs initialized

Summary & Recommendations

Critical Issues (Must Fix Before Merge)

  1. Remove time.Now() fallback (client.go:227, 265) - Return time.Time{} and error status instead to maintain determinism guarantee
  2. Add test for header API failure - Verify error handling when GetByHeight fails
  3. Implement retry logic - Add exponential backoff for transient failures

High Priority (Should Fix)

  1. Improve test coverage - Add unit tests for new header API methods
  2. Add header timestamp validation - Ensure returned timestamp is non-zero and reasonable
  3. Consider performance optimization - Evaluate header caching or parallel fetching to reduce latency

Nice to Have

  1. Better error messages - Make errors more actionable for operators
  2. Package documentation - Add context about header API integration
  3. Varied timestamp tests - Test with different time values, not just fixed time

Architectural Trade-offs

The dual-namespace RPC client design (client.go:38-58) is solid, but introduces coupling to celestia-node's API surface. Consider:

  • Pro: Mirrors celestia-node's architecture, making the integration obvious
  • Con: Two RPC connections per client, two namespaces to maintain
  • Alternative: Could use a single connection with dynamic namespace routing, but current approach is clearer

The decision to fetch headers separately rather than extending blob responses is correct - it maintains backward compatibility and separation of concerns.


Alignment with Based Sequencing Requirements

The PR correctly identifies that based sequencers require deterministic timestamps for DA epochs (see issue #2936 and sequencers/based/sequencer.go:104-110). However, the fallback to time.Now() undermines this requirement. Once the fallback is removed, this PR will properly enable deterministic DA timestamp handling.


Code Quality Score: 7/10

Breakdown:

  • Architecture & Design: 8/10 (clean design, minor fallback issue)
  • Error Handling: 6/10 (missing retries, problematic fallback)
  • Testing: 6/10 (good coverage of happy paths, missing failure cases)
  • Documentation: 7/10 (adequate, could be more comprehensive)
  • Performance: 7/10 (functional, could optimize latency)
  • Security: 9/10 (no significant issues)

This is a solid foundation that addresses the core requirement. Fixing the determinism fallback and adding retry logic will bring it to production readiness.


@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @tac0turtle, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Data Availability (DA) client by integrating with the DA layer's header module to retrieve accurate and deterministic block timestamps. Previously, DA operations like blob retrieval relied on the local system's time.Now(), which could lead to inconsistencies. By introducing a dedicated HeaderAPI and updating the RPC client to connect to the "header" namespace, the system can now fetch the canonical block timestamp directly from the DA layer, improving the reliability and determinism of DA responses. This change also includes necessary mocking and test updates to ensure the new functionality is robust.

Highlights

  • Deterministic Timestamps: The Data Availability (DA) client now fetches block timestamps directly from the DA layer's header module instead of relying on time.Now(), ensuring deterministic timestamps in Retrieve operation results.
  • Header Module Integration: A new HeaderAPI and HeaderModule interface have been introduced to facilitate interaction with the celestia-node's header JSON-RPC namespace.
  • Enhanced RPC Client: The DA JSON-RPC client (pkg/da/jsonrpc/client.go) has been updated to establish connections to both the "blob" and "header" namespaces simultaneously, managing their respective closers.
  • Robust Timestamp Retrieval Logic: A getBlockTimestamp helper function was added to safely retrieve block timestamps, including fallback mechanisms to time.Now() if the header API is unavailable, fetching fails, or the returned timestamp is zero.
  • Comprehensive Testing: Unit tests for the DA client (block/internal/da/client_test.go) were updated to include mocking for the new HeaderModule and verify the correctness and determinism of the timestamps in Retrieve operations.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the use of block header timestamps to ensure deterministic responses, which is a great improvement. The changes are well-implemented across the DA client and JSON-RPC layers, including adding a new header API client and associated mocks and tests. The fallback logic in getBlockTimestamp to use time.Now() is robust. I have one suggestion to improve the robustness of the new getBlockTimestamp function by adding a timeout to the API call.

return time.Now()
}

header, err := c.headerAPI.GetByHeight(ctx, height)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The call to c.headerAPI.GetByHeight uses the original context ctx directly. This could lead to long waits if the header API is unresponsive and the parent context doesn't have a timeout. It's a good practice to use a timeout for this network call, similar to how c.blobAPI.GetAll is called with a timeout in the Retrieve function. Consider creating a new context with c.defaultTimeout for this call to prevent potential hangs.

	headerCtx, cancel := context.WithTimeout(ctx, c.defaultTimeout)
	defer cancel()
	header, err := c.headerAPI.GetByHeight(headerCtx, height)

@codecov
Copy link

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 11.94969% with 140 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.23%. Comparing base (392959b) to head (7c38c2e).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/da/jsonrpc/mocks/header_module_mock.go 0.00% 116 Missing ⚠️
pkg/da/jsonrpc/header.go 0.00% 10 Missing ⚠️
block/internal/da/client.go 59.09% 6 Missing and 3 partials ⚠️
pkg/da/jsonrpc/client.go 54.54% 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2939      +/-   ##
==========================================
- Coverage   60.14%   59.23%   -0.91%     
==========================================
  Files          88       90       +2     
  Lines        8427     8575     +148     
==========================================
+ Hits         5068     5079      +11     
- Misses       2787     2920     +133     
- Partials      572      576       +4     
Flag Coverage Δ
combined 59.23% <11.94%> (-0.91%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tac0turtle tac0turtle marked this pull request as ready for review December 18, 2025 16:57
@tac0turtle tac0turtle changed the title feat: use header timestamp feat: use DA timestamp Dec 18, 2025
julienrbrt
julienrbrt previously approved these changes Dec 18, 2025
Copy link
Member

@julienrbrt julienrbrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! The fallback to time.Now() isn't great however and you may fail on the next correct fetch with a quite undescriptive error.

Copy link
Member

@julienrbrt julienrbrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@julienrbrt julienrbrt merged commit a406515 into main Dec 18, 2025
18 of 20 checks passed
@julienrbrt julienrbrt deleted the marko/header_timestamp branch December 18, 2025 20:48
alpe added a commit that referenced this pull request Dec 19, 2025
* main:
  feat: use DA timestamp (#2939)
  chore: improve code comments clarity (#2943)
  build(deps): bump libp2p (#2937)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make da retrieve timestamp based on DA block time

3 participants