Skip to content

add test cases to build robustness#48

Open
tac0turtle wants to merge 3 commits intomainfrom
marko/robustness
Open

add test cases to build robustness#48
tac0turtle wants to merge 3 commits intomainfrom
marko/robustness

Conversation

@tac0turtle
Copy link
Contributor

@tac0turtle tac0turtle commented Mar 13, 2026

Overview

Summary by CodeRabbit

Release Notes

  • New Features

    • Added in-flight submission controls to limit concurrent transactions.
    • Implemented per-request sequencing for proper transaction ordering.
    • Added pending transaction reconciliation after restarts.
  • Bug Fixes

    • Improved sequence mismatch error handling and recovery.
    • Enhanced resilience during external sequence conflicts.
  • Tests

    • Expanded test coverage with comprehensive edge case scenarios.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

The PR enhances the DirectSubmitter with concurrent submission controls, per-signer sequence management, and recovery logic for pending transactions. It introduces in-flight limits, sequence reconciliation after restarts, and comprehensive test coverage for edge cases including sequence mismatches and recovery scenarios.

Changes

Cohort / File(s) Summary
In-Flight Submission & Sequence Management
pkg/submit/direct.go
Added in-flight submission controls (inFlight, maxInFlight), per-signer sequencing state (accountNumber, nextSequence, sequenceReady, pendingSequences), and reworked submission flow via broadcastTx with confirmation handling. Introduced sequence mismatch recovery using nextAccountLocked, reconcilePendingLocked, rememberPendingLocked, and clearPending helpers. Added expectedSequenceFromMismatchText parser and lifecycle management routines (startSubmission, finishSubmission, invalidateSequenceLocked).
Comprehensive Test Coverage
pkg/submit/direct_test.go, go.mod
Added extensive test scenarios covering sequence-spamming, recovery-after-restart, persisted-pending reconciliation, in-flight limits, and signer recovery. Introduced mock client implementations and helper utilities (decodeInnerTx, decodeSequenceFromBlobTx, waitForBroadcast). Updated existing test assertions and added transaction decoding helpers using protowire and protobuf structures.

Sequence Diagram

sequenceDiagram
    participant Client as Submit Client
    participant Submitter as DirectSubmitter
    participant Blockchain as Blockchain (BroadcastTx)
    participant Confirmer as Confirmation Handler

    Client->>Submitter: Submit(tx)
    activate Submitter
    Submitter->>Submitter: startSubmission (increment inFlight)
    Submitter->>Submitter: Fetch next sequence
    
    rect rgba(100, 150, 200, 0.5)
    Note over Submitter,Blockchain: Broadcast Path
    Submitter->>Blockchain: BroadcastTx(tx with nonce)
    Blockchain-->>Submitter: TxHash or Error
    end
    
    alt Sequence Mismatch Error
        Submitter->>Submitter: Extract expected sequence from error
        Submitter->>Submitter: reconcilePendingLocked()
        Submitter->>Submitter: invalidateSequenceLocked()
        Submitter->>Submitter: rememberPendingLocked(txHash)
    else Success
        Submitter->>Confirmer: Wait for confirmation
        Confirmer-->>Submitter: Confirmation received
        Submitter->>Submitter: clearPending()
    end
    
    Submitter->>Submitter: finishSubmission (decrement inFlight)
    Submitter-->>Client: Result
    deactivate Submitter
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • add tx submission support #45: Directly related as this PR extends and enhances the DirectSubmitter submission flow that was introduced in the prior PR.

Poem

🐰 In-flight hops with sequence grace,
Pending txs find their place,
Nonces mend when restarts come,
Broadcast spins till all is done! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'add test cases to build robustness' is vague and generic, using non-descriptive terms that don't convey specific information about what test cases were added or what robustness improvements target. Replace with a more specific title that describes the actual test scenarios added, such as 'add direct submission sequence and recovery tests' or 'add robustness tests for sequence management and in-flight submissions'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch marko/robustness
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tac0turtle tac0turtle marked this pull request as ready for review March 14, 2026 15:42
@tac0turtle
Copy link
Contributor Author

@codex review

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/submit/direct.go (1)

49-55: Consider documenting the maxInFlight default behavior.

When maxInFlight is 0 (the default, since it's not initialized in NewDirectSubmitter), the in-flight limit is effectively disabled due to the check s.maxInFlight > 0 && s.inFlight >= s.maxInFlight. This is reasonable for backward compatibility, but a comment would clarify the intent.

📝 Suggested documentation
 	mu                  sync.Mutex
 	inFlight            int
 	accountNumber       uint64
 	nextSequence        uint64
 	sequenceReady       bool
 	pendingSequences    map[string]uint64
+	// maxInFlight limits concurrent submissions; 0 means unlimited.
 	maxInFlight         int
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/submit/direct.go` around lines 49 - 55, Document the default behavior of
the maxInFlight field: add a brief comment on the maxInFlight field (or in
NewDirectSubmitter) stating that a value of 0 disables the in-flight limit for
backward compatibility and that the code enforces the limit only when
s.maxInFlight > 0 (see the check s.maxInFlight > 0 && s.inFlight >=
s.maxInFlight). This clarifies intent for future maintainers and callers of
DirectSubmitter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/submit/direct.go`:
- Around line 243-266: The reconcilePendingLocked loop can leave entries in the
pendingSequences map forever when GetTx returns NotFound; update the data model
and logic: change pendingSequences to store a struct with the original sequence
and a timestamp (e.g., pendingEntry{seq int64, added time.Time}), and in
reconcilePendingLocked (and wherever entries are added) remove entries whose
added timestamp is older than a configurable threshold (or enforce a max map
size by evicting oldest entries) instead of keeping them indefinitely; reference
the reconcilePendingLocked method, pendingSequences map, GetTx call and
isTxNotFound branch to implement the time-based cleanup or size eviction and
update s.nextSequence logic accordingly.

---

Nitpick comments:
In `@pkg/submit/direct.go`:
- Around line 49-55: Document the default behavior of the maxInFlight field: add
a brief comment on the maxInFlight field (or in NewDirectSubmitter) stating that
a value of 0 disables the in-flight limit for backward compatibility and that
the code enforces the limit only when s.maxInFlight > 0 (see the check
s.maxInFlight > 0 && s.inFlight >= s.maxInFlight). This clarifies intent for
future maintainers and callers of DirectSubmitter.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c7ecbbc3-76eb-4e0a-9469-cd70408cc776

📥 Commits

Reviewing files that changed from the base of the PR and between a788ced and 5693353.

📒 Files selected for processing (2)
  • pkg/submit/direct.go
  • pkg/submit/direct_test.go

Comment on lines +243 to +266
func (s *DirectSubmitter) reconcilePendingLocked(ctx context.Context) error {
if len(s.pendingSequences) == 0 {
return nil
}

nextSequence := s.nextSequence
for hash, sequence := range s.pendingSequences {
_, err := s.app.GetTx(ctx, hash)
if err == nil {
delete(s.pendingSequences, hash)
continue
}
return nil, fmt.Errorf("broadcast blob tx: %w", err)
if isTxNotFound(err) {
if sequence >= nextSequence {
nextSequence = sequence + 1
}
continue
}
return fmt.Errorf("reconcile pending blob tx %s: %w", hash, err)
}
if err := checkTxStatus("broadcast", broadcast); err != nil {
return nil, err

s.nextSequence = nextSequence
return nil
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential unbounded growth of pendingSequences map.

When GetTx returns NotFound, the entry is not deleted from pendingSequences. This is correct for in-flight transactions that haven't been indexed yet. However, if a transaction is dropped from the mempool or evicted, the entry will remain indefinitely, leading to slow memory growth over time.

Consider adding a cleanup mechanism, such as removing entries older than a certain age or limiting the map size.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/submit/direct.go` around lines 243 - 266, The reconcilePendingLocked loop
can leave entries in the pendingSequences map forever when GetTx returns
NotFound; update the data model and logic: change pendingSequences to store a
struct with the original sequence and a timestamp (e.g., pendingEntry{seq int64,
added time.Time}), and in reconcilePendingLocked (and wherever entries are
added) remove entries whose added timestamp is older than a configurable
threshold (or enforce a max map size by evicting oldest entries) instead of
keeping them indefinitely; reference the reconcilePendingLocked method,
pendingSequences map, GetTx call and isTxNotFound branch to implement the
time-based cleanup or size eviction and update s.nextSequence logic accordingly.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5693353a33

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +192 to +194
s.sequenceReady = true
if err := s.reconcilePendingLocked(ctx); err != nil {
return nil, err

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep sequence cache invalid when pending reconcile fails

nextAccountLocked sets sequenceReady to true before running reconcilePendingLocked, so a transient GetTx error during reconciliation returns from Submit while leaving the cache marked ready. After that, later submissions skip both account refresh and reconciliation, continuing with potentially stale nextSequence/pending state instead of retrying reconciliation on the next request.

Useful? React with 👍 / 👎.

for {
tx, err := s.app.GetTx(ctx, hash)
if err == nil {
s.clearPending(hash)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid mutex-coupling confirmation completion to broadcasts

Calling s.clearPending(hash) here can block a confirmed submission behind another goroutine that is still inside broadcastTx, because clearPending and broadcastTx share s.mu and broadcastTx holds it across RPC calls. In that situation, this request has already observed confirmation but cannot return until the unrelated broadcast finishes, which can add unbounded latency and bypass the intended confirmation-timeout behavior.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant