fix: collect sigshares synchronously in dispatcher for parallel verification #7225
Conversation
@coderabbitai review
✅ Actions performed: Review triggered.
This pull request has conflicts, please rebase.
✅ No Merge Conflicts Detected: This PR currently has no conflicts with other open PRs.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review; this behavior is configurable in the settings. Use the following commands to manage reviews, or the checkboxes below for quick actions.
Walkthrough

The patch changes NetSigning to collect pending signature shares into bounded batches (up to a configured limit) synchronously and dispatch each batch once to a worker for batched verification and recovery. NetSigning::ProcessPendingSigShares changed from a parameterless bool-returning method to void, and now accepts rvalue-referenced maps for sig shares grouped by node and for quorums. The dispatcher now submits one batch per cycle and sleeps briefly between rounds instead of continuously spawning workers based on IsAnyPendingProcessing.

Sequence Diagram(s)

sequenceDiagram
participant Dispatcher as Dispatcher
participant Collector as Collector
participant WorkerPool as WorkerPool
participant Worker as Worker
participant Verifier as Verifier
participant PeerManager as PeerManager
Dispatcher->>Collector: CollectPendingSigSharesToVerify(maxBatchSize)
Collector-->>Dispatcher: sigSharesByNodes, quorums (batch or empty)
alt batch non-empty
Dispatcher->>WorkerPool: submit ProcessPendingSigShares(sigSharesByNodes, quorums)
WorkerPool-->>Worker: run with moved batch
Worker->>Verifier: verify batched sig shares
Verifier-->>Worker: verification results + recovered sigs
Worker->>PeerManager: flag/ban peers with invalid shares
Worker-->>Dispatcher: push recovered sigs for further processing
else no batch
Collector-->>Dispatcher: no work
end
loop dispatcher rounds
Dispatcher->>Collector: CollectPendingSigSharesToVerify(maxBatchSize)
Note right of Dispatcher: short sleep between rounds until empty
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning), ✅ 2 passed
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llmq/net_signing.cpp`:
- Around line 311-330: processing_worker_scheduled can remain true if a queued
processing task is cleared before it runs, causing future Start() to never
schedule workers; fix by explicitly resetting the latch on lifecycle
transitions: in NetSigning::Start() set processing_worker_scheduled.store(false)
before starting/queuing any workers, and in NetSigning::Stop() set
processing_worker_scheduled.store(false) before/after calling
worker_pool.clear_queue() and worker_pool.stop(true) so any leftover
queued-but-not-run tasks cannot leave the latch set; no other logic changes to
ProcessPendingSigShares(), m_shares_manager->IsAnyPendingProcessing(), or the
worker spawn lambda are necessary.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: bb00efad-c569-42b0-b566-f1db55baacc7
📒 Files selected for processing (2)
src/llmq/net_signing.cpp, src/llmq/net_signing.h
Force-pushed b42f102 to 3f0975f (Compare)
if (m_shares_manager->IsAnyPendingProcessing()) {
    // If there's processing work, spawn a helper worker
    worker_pool.push([this](int) {
The total number of workers is limited to 4. If there is work to be done, it should be pulled out and pushed to the queue as fast as possible, even after 10ms.
The problem: if a new worker is created faster than the old worker pulls from the queue, a new worker will be spawned to process the same queue, but there's nothing in the queue and that burns CPU cycles.
This implementation with the atomic processing_worker_scheduled looks legit and probably works [I haven't reviewed in detail / haven't tested], but it won't make 2 calls of ProcessPendingSigShares run in parallel. So several instances of SignAndProcessSingleShare could be spawned, but only one instance of ProcessPendingSigShares can run at the same time.
I think a much better solution would be splitting ProcessPendingSigShares into 2 steps: CollectPendingSigSharesToVerify, which is called synchronously without a background worker (it should be fast enough, as it doesn't do any BLS validation inside), and after that the actual call of ProcessPendingSigShares over the prepared queue of sig shares, processed asynchronously using worker_pool. In this case, more than one worker could be created for ProcessPendingSigShares if there are many of them (more than 32, per nMaxBatchSize).
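The proposed two-step split can be sketched roughly as follows. This is a simplified model with stand-in types (NodeId/SigShare here are plain ints), not the actual NetSigning code; the real collection operates on CSigShare objects and quorums under the cs lock, and step 2 would push each batch into ctpl::thread_pool:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for the real types (hypothetical; the actual code uses
// NodeId, CSigShare, CQuorumCPtr, and ctpl::thread_pool).
using NodeId = int;
using SigShare = int;

struct SharesManager {
    std::deque<std::pair<NodeId, SigShare>> pending;

    // Step 1: cheap, synchronous collection. No BLS work happens here; it just
    // dequeues up to maxBatchSize shares and reports whether more remain.
    bool CollectPendingSigSharesToVerify(std::size_t maxBatchSize,
                                         std::unordered_map<NodeId, std::vector<SigShare>>& byNode)
    {
        for (std::size_t i = 0; i < maxBatchSize && !pending.empty(); ++i) {
            const auto [node, share] = pending.front();
            pending.pop_front();
            byNode[node].push_back(share);
        }
        return !pending.empty();
    }
};

// Step 2 would normally push each collected batch into worker_pool for async
// BLS verification; here we only count how many batches would be dispatched.
int CountDispatchedBatches(SharesManager& mgr, std::size_t maxBatchSize)
{
    int batches = 0;
    while (true) {
        std::unordered_map<NodeId, std::vector<SigShare>> byNode;
        const bool more = mgr.CollectPendingSigSharesToVerify(maxBatchSize, byNode);
        if (byNode.empty()) break;
        ++batches; // would be: worker_pool.push([batch = std::move(byNode)](int) mutable { ... });
        if (!more) break;
    }
    return batches;
}
```

With 70 pending shares and a batch size of 32, the dispatcher would hand out three batches, so up to three workers could verify in parallel.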
Good point — you're right that the atomic guard serializes ProcessPendingSigShares to a single concurrent instance.
The current fix was scoped narrowly: prevent the unbounded queue growth from redundant task spawns (the old workers would find nothing and immediately exit, so processing was effectively serial already — just with wasted allocations/queue churn).
Your suggestion to hoist CollectPendingSigSharesToVerify into the dispatcher and fan out verification workers per batch is a better architecture for actual parallelism. The collection step is cheap (no BLS ops), and the verification is where the real CPU time goes — batching those across workers would scale much better under load.
Happy to refactor this PR to that approach, or land this as a conservative fix and follow up with the parallel design. What's your preference?
Good point on the parallelism limitation — you're right that the current approach serialises all ProcessPendingSigShares calls.
To be transparent: the atomic flag approach was the minimal fix for the immediate problem (unbounded queue growth from firing a task every 10ms regardless of whether a worker was already running). That bug was real and concrete, so the goal here was the smallest correct fix.
Your suggestion — call CollectPendingSigSharesToVerify synchronously in the dispatcher, then push the collected batch data into a worker for BLS verification — is architecturally cleaner and would allow multiple BLS verification batches to run in parallel. The collect step holds cs only briefly and does no BLS work, so it's safe to call synchronously.
I think this is the right direction but it's a larger refactor: the current ProcessPendingSigShares in net_signing.cpp owns the entire collect→verify→ProcessPendingSigShares pipeline, so pulling collect out requires restructuring what data flows between the dispatcher thread and the workers.
A few questions before I take a stab at it:
- Should the collect step move entirely into the dispatcher loop (replacing IsAnyPendingProcessing + the current guard), or should it be a separate method called from the dispatcher?
- For the collected batch, should the worker call the existing ProcessPendingSigShares(sigSharesByNodes, quorums) overload directly, or does that need to change too?
Happy to implement if you think the change is worth doing now vs. a follow-up.
queue growth from firing a task every 10ms regardless
It's limited to 4 workers at once, isn't it? So it won't actually "grow" in reality, given the very small threshold, will it?
A few questions before I take a stab at it:
Just pull this code out of NetSigning::ProcessPendingSigShares into WorkThreadDispatcher, call it outside the worker spawning, and when creating the worker for ProcessPendingSigShares pass sigSharesByNodes and quorums as arguments.
std::unordered_map<NodeId, std::vector<CSigShare>> sigSharesByNodes;
std::unordered_map<std::pair<Consensus::LLMQType, uint256>, CQuorumCPtr, StaticSaltedHasher> quorums;
const size_t nMaxBatchSize{32};
bool more_work = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
if (sigSharesByNodes.empty()) {
return false;
}
As a follow-up, consider further improvement by refactoring CollectPendingSigSharesToVerify: init sigSharesByNodes inside the dispatcher, but quorums inside the NetSigning::ProcessPendingSigShares call.
Implemented — force-pushed with the approach you suggested.
The dispatcher now calls CollectPendingSigSharesToVerify synchronously in a loop, moving each batch into its own worker for parallel BLS verification:
while (!workInterrupt) {
// ... collect synchronously ...
bool more_work = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
if (sigSharesByNodes.empty()) break;
worker_pool.push([this, sigSharesByNodes = std::move(sigSharesByNodes),
quorums = std::move(quorums)](int) mutable {
ProcessPendingSigShares(std::move(sigSharesByNodes), std::move(quorums));
});
if (!more_work) break;
}

ProcessPendingSigShares now takes sigSharesByNodes and quorums as parameters (by value, moved from the lambda capture). The atomic guard is removed; it is no longer needed, since each Collect call dequeues unique data.
Left the quorums lookup inside ProcessPendingSigShares for now per your suggestion to consider that as a follow-up refactor.
You're right — the pool size (4 workers max) bounds the concurrency, so the queue doesn't grow truly unboundedly. The old behavior was more 'wasteful' than 'dangerous' — redundant tasks that find nothing to process.
The updated implementation addresses both concerns: no wasted allocations AND parallel BLS verification when multiple batches are available.
Force-pushed 3f0975f to 6fe8214 (Compare)
Force-pushed 6fe8214 to e1bd350 (Compare)
std::thread shares_cleaning_thread;
std::thread shares_dispatcher_thread;
mutable ctpl::thread_pool worker_pool;
Restored the blank line — you're right, that was an unrelated cosmetic change.
Force-pushed e1bd350 to 0e2ea5a (Compare)
@thepastaclaw rebase the branch on top of develop to remove the 'need rebase' tag
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e2ea5a234
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
    }
}
// Collect pending sig shares synchronously and dispatch each batch to a worker for parallel BLS verification
while (!workInterrupt) {
Bound sigshare collection loop to avoid dispatcher starvation
This new inner while (!workInterrupt) loop can run indefinitely under sustained inbound sigshare traffic because it only breaks when a CollectPendingSigSharesToVerify pass observes no remaining work; when producers keep the queue non-empty, the dispatcher never returns to the outer loop and therefore stops calling DispatchPendingSigns(). In that scenario, locally queued signing work can be delayed indefinitely even though worker threads are available.
The inner loop is bounded by the pending queue itself — each CollectPendingSigSharesToVerify call dequeues up to 32 shares from the pending queue, so the loop terminates once the queue is drained. Under sustained traffic, the queue is refilled between outer-loop iterations (after the 10ms sleep), not during the inner loop. DispatchPendingSigns() runs before this block on every outer iteration regardless.
worker_pool.push([this, sigSharesByNodes = std::move(sigSharesByNodes), quorums = std::move(quorums)](int) mutable {
    ProcessPendingSigShares(std::move(sigSharesByNodes), std::move(quorums));
});
Cap queued verification batches before pushing to thread pool
Each collected batch is moved into a separate worker_pool.push(...) task, and this loop keeps enqueuing until pending shares are drained. Because the ctpl queue is unbounded, a large backlog can be converted into a large number of queued tasks (each holding copied share/quorum data) much faster than 1-4 workers can consume them, which can spike memory usage and increase processing latency under load.
The number of queued tasks is bounded by ceil(pending_shares / nMaxBatchSize) — each Collect call removes its batch from the pending queue, so tasks correspond 1:1 with actual data. With a pool of 4 workers, in practice there are at most a handful of queued tasks at any time. The data ownership is moved (not copied) into each lambda, so memory per task is proportional to its batch content.
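The ceil(pending_shares / nMaxBatchSize) bound mentioned above is plain ceiling division; a tiny illustrative helper (the name is hypothetical, not from the PR):

```cpp
#include <cassert>
#include <cstddef>

// Upper bound on queued verification tasks: one task per collected batch,
// i.e. ceil(pendingShares / maxBatchSize), computed with integer arithmetic.
constexpr std::size_t MaxQueuedTasks(std::size_t pendingShares, std::size_t maxBatchSize)
{
    return (pendingShares + maxBatchSize - 1) / maxBatchSize;
}
```

For example, 70 pending shares with a batch size of 32 yield at most 3 queued tasks.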
Force-pushed 0e2ea5a to 0ebeb14 (Compare)
Rebased on develop. Force-pushed.
…ication

The dispatcher thread in WorkThreadDispatcher() unconditionally enqueued a ProcessPendingSigShares worker into the thread pool every 10ms whenever any pendingIncomingSigShares existed, leading to redundant task allocations.

Instead of guarding with a single-worker atomic latch (which would serialise all BLS verification to one thread), move the collection step (CollectPendingSigSharesToVerify) into the dispatcher loop where it runs synchronously; it holds cs only briefly and does no BLS work. Each collected batch is then moved into its own worker for parallel BLS verification.

This allows multiple verification batches to run concurrently across the thread pool (up to 4 workers), while naturally preventing unbounded queue growth: each Collect call dequeues its batch from the pending queue, so only as many tasks are enqueued as there are actual batches of work.

Co-authored-by: knst <[email protected]>
Force-pushed 0ebeb14 to 018298b (Compare)
Addressed: changed both parameters to rvalue references (
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 018298b51f
while (!workInterrupt) {
    std::unordered_map<NodeId, std::vector<CSigShare>> sigSharesByNodes;
    std::unordered_map<std::pair<Consensus::LLMQType, uint256>, CQuorumCPtr, StaticSaltedHasher> quorums;

    const size_t nMaxBatchSize{32};
    bool more_work = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
Limit dispatcher sigshare drain loop per tick
Draining verification batches inside this inner while (!workInterrupt) loop means the dispatcher does not return to DispatchPendingSigns() until CollectPendingSigSharesToVerify observes no remaining work, so under sustained inbound sigshare traffic local signing jobs can be postponed indefinitely even though worker threads are available. Fresh evidence: sigshares are appended concurrently by message-processing paths (ProcessMessageSigShare inserts into pendingIncomingSigShares), so the loop is not bounded to a fixed snapshot of the queue.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0bfbc7423b
src/llmq/net_signing.cpp
Outdated
const size_t nMaxBatchSize{32};
m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
Remove fixed per-tick cap on sigshare dequeueing
Dequeuing only one nMaxBatchSize batch per dispatcher iteration (CollectPendingSigSharesToVerify is called once here, then the dispatcher sleeps 10ms) introduces a hard intake ceiling of about 3,200 sigshares/sec regardless of available worker threads. In environments where a 32-share verification batch finishes in under 10ms, workers will idle while inbound shares keep accumulating, which increases end-to-end latency and can grow memory under sustained traffic (notably, pendingIncomingSigShares has no explicit per-node bound in src/llmq/signing_shares.h:331-337).
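The quoted ~3,200 sigshares/sec ceiling follows directly from the tick parameters: one batch of nMaxBatchSize shares per 10ms dispatcher tick. A minimal arithmetic sketch (the function name is illustrative, not from the code; it assumes the sleep dominates the tick):

```cpp
#include <cassert>

// With one batch of maxBatchSize shares dequeued per dispatcher tick and a
// sleep of tickMs between ticks, the intake ceiling in shares per second is
// maxBatchSize multiplied by the number of ticks per second.
constexpr int IntakeCeilingPerSecond(int maxBatchSize, int tickMs)
{
    return maxBatchSize * (1000 / tickMs);
}
```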
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llmq/net_signing.cpp`:
- Around line 317-322: Run clang-format on the modified block around
m_shares_manager->CollectPendingSigSharesToVerify so spacing/line breaks match
project style (fix the formatting of the if and lambda capture lines), and
explicitly discard the boolean return value to show intent (e.g., cast to void
or assign to [[maybe_unused]] variable) when calling
CollectPendingSigSharesToVerify; ensure the lambda still captures
sigSharesByNodes and quorums with std::move and calls ProcessPendingSigShares
via worker_pool.push exactly as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d43e7598-f256-46bf-9994-fca9eea54780
📒 Files selected for processing (1)
src/llmq/net_signing.cpp
src/llmq/net_signing.cpp
Outdated
m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);

if (!sigSharesByNodes.empty()) {
    worker_pool.push([this, sigSharesByNodes = std::move(sigSharesByNodes), quorums = std::move(quorums)](int) mutable {
        ProcessPendingSigShares(std::move(sigSharesByNodes), std::move(quorums));
    });
Fix clang-format violations to pass CI.
The pipeline reports clang-format differences on lines 317-320. Run clang-format-diff.py or your local formatter to resolve.
Additionally, CollectPendingSigSharesToVerify returns a bool indicating whether more work remains. While ignoring it is intentional for the one-batch-per-cycle design, consider an explicit discard to signal intent:
- m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
+ std::ignore = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);

🧰 Tools
🪛 GitHub Actions: Clang Diff Format Check
[error] 317-320: Clang format differences detected. The code formatting differs from Clang-Format. Run the clang-format-diff.py step again to apply formatting changes (the CI step reported differences and exited with code 1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/llmq/net_signing.cpp` around lines 317 - 322, Run clang-format on the
modified block around m_shares_manager->CollectPendingSigSharesToVerify so
spacing/line breaks match project style (fix the formatting of the if and lambda
capture lines), and explicitly discard the boolean return value to show intent
(e.g., cast to void or assign to [[maybe_unused]] variable) when calling
CollectPendingSigSharesToVerify; ensure the lambda still captures
sigSharesByNodes and quorums with std::move and calls ProcessPendingSigShares
via worker_pool.push exactly as before.
Force-pushed 018298b to 0d38897 (Compare)
thepastaclaw left a comment
Code Review
Clean refactoring that separates sig share collection from BLS verification by making data flow explicit through function parameters. The dispatcher loop structure (including the redundant-worker pattern flagged by both agents) is identical to the pre-existing code on develop and is not introduced by this PR. The PR description's 'natural queue bounding' language reflects the abandoned commit 1 approach, not the final code, but this is a documentation nuance, not a code defect.
Reviewed commit: 4721f4e
🟡 1 suggestion(s)
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `src/llmq/net_signing.cpp`:
- [SUGGESTION] lines 309-333: Dispatcher accumulates redundant workers in the pool queue (pre-existing)
The dispatcher pushes a new looping worker every 10ms whenever `IsAnyPendingProcessing()` returns true, regardless of how many workers are already actively processing in the pool. While the pool size (max 4 threads) bounds actual concurrency, excess tasks queue up — each starts, finds nothing (work already drained by an earlier worker), and exits immediately.
Verified against develop: this pattern is **identical** to the pre-existing code and is NOT introduced by this PR. The `if (m_shares_manager->IsAnyPendingProcessing()) { worker_pool.push(...); }` block with 10ms sleep is unchanged from develop.
Note: the PR description mentions ‘natural queue bounding’ which applied to the intermediate commit 1 approach (dispatcher-side collection, one task per batch) but not to the final commit 2 approach. The description could be updated to reflect the final approach.
A lightweight guard (e.g., `std::atomic<int>` counting active processing workers, skip push when >= pool size) would eliminate the waste without affecting throughput. Worth considering as a follow-up.
src/llmq/net_signing.cpp
Outdated
if (m_shares_manager->IsAnyPendingProcessing()) {
    // If there's processing work, spawn a helper worker
    // Spawn a helper worker to collect and process pending sig shares.
    // Collection and verification run together in the worker to avoid lock contention
    // between the dispatcher thread and workers competing for the cs mutex.
    worker_pool.push([this](int) {
        while (!workInterrupt) {
            bool moreWork = ProcessPendingSigShares();
            std::unordered_map<NodeId, std::vector<CSigShare>> sigSharesByNodes;
            std::unordered_map<std::pair<Consensus::LLMQType, uint256>, CQuorumCPtr, StaticSaltedHasher> quorums;

            if (!moreWork) {
            const size_t nMaxBatchSize{32};
            bool more_work = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes,
                                                                               quorums);

            if (sigSharesByNodes.empty()) {
                return; // No work found, exit immediately
            }

            ProcessPendingSigShares(std::move(sigSharesByNodes), std::move(quorums));

            if (!more_work) {
                return;
            }
        }
    });
}
🟡 Suggestion: Dispatcher accumulates redundant workers in the pool queue (pre-existing)
The dispatcher pushes a new looping worker every 10ms whenever IsAnyPendingProcessing() returns true, regardless of how many workers are already actively processing in the pool. While the pool size (max 4 threads) bounds actual concurrency, excess tasks queue up — each starts, finds nothing (work already drained by an earlier worker), and exits immediately.
Verified against develop: this pattern is identical to the pre-existing code and is NOT introduced by this PR. The if (m_shares_manager->IsAnyPendingProcessing()) { worker_pool.push(...); } block with 10ms sleep is unchanged from develop.
Note: the PR description mentions ‘natural queue bounding’ which applied to the intermediate commit 1 approach (dispatcher-side collection, one task per batch) but not to the final commit 2 approach. The description could be updated to reflect the final approach.
A lightweight guard (e.g., std::atomic<int> counting active processing workers, skip push when >= pool size) would eliminate the waste without affecting throughput. Worth considering as a follow-up.
source: ['claude', 'codex']
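The suggested counting guard could look roughly like this. This is a sketch with hypothetical names, not the actual NetSigning members; kPoolSize mirrors the 4-thread pool:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical guard for the suggested follow-up: count in-flight processing
// workers and skip the worker_pool.push when the pool is already saturated.
class ActiveWorkerGuard
{
public:
    static constexpr int kPoolSize = 4; // mirrors the thread pool size

    // Returns true if a new worker may be scheduled. A successful TryAcquire()
    // must be paired with Release() when that worker finishes.
    bool TryAcquire()
    {
        int cur = m_active.load();
        while (cur < kPoolSize) {
            // On CAS failure, cur is refreshed and the bound is re-checked.
            if (m_active.compare_exchange_weak(cur, cur + 1)) return true;
        }
        return false;
    }

    void Release() { m_active.fetch_sub(1); }

private:
    std::atomic<int> m_active{0};
};
```

The dispatcher would call TryAcquire() before each push and have the worker call Release() on exit, so at most kPoolSize processing tasks ever sit in the queue.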
Force-pushed 4721f4e to 018298b (Compare)
Force-pushed 018298b to a6a340f (Compare)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a6a340fafe
std::unordered_map<std::pair<Consensus::LLMQType, uint256>, CQuorumCPtr, StaticSaltedHasher> quorums;

const size_t nMaxBatchSize{32};
bool more_work = m_shares_manager->CollectPendingSigSharesToVerify(nMaxBatchSize, sigSharesByNodes, quorums);
Prevent pre-draining sigshare queue before dedupe state updates
Calling CollectPendingSigSharesToVerify in this tight dispatcher loop and immediately enqueueing each batch means collection can outrun verification, so many batches are removed before workers update dedupe state. Collection dedupes only against sigShares.Has(...) in CSigSharesManager::CollectPendingSigSharesToVerify, but that set is updated later in ProcessSigShare, so under duplicate-share floods this change can enqueue and verify the same key far more times than before, creating avoidable CPU amplification and delayed recovery progress.
Force-pushed a6a340f to 018298b (Compare)
Summary

The dispatcher thread in WorkThreadDispatcher() unconditionally enqueued a ProcessPendingSigShares worker into the thread pool every 10ms whenever any pendingIncomingSigShares existed. Since the workers would then find nothing to do and immediately exit, this wasted allocations and queue operations.

This was flagged during review of #7004 by CodeRabbit as a nitpick but was not addressed at the time.

Fix

Move CollectPendingSigSharesToVerify out of ProcessPendingSigShares and into the dispatcher loop, where it runs synchronously (it holds cs only briefly and does no BLS work). Each collected batch is moved into its own worker for parallel BLS verification.

This approach (suggested by knst) has two advantages over the initial atomic-guard implementation:

- Multiple BLS verification batches can run concurrently across the thread pool (up to 4 workers), instead of being serialised to a single worker.
- Each Collect call dequeues its batch from the pending queue, so only as many tasks are enqueued as there are actual batches of work; no redundant empty-exit tasks.

Performance impact: slight improvement. Multiple BLS verification batches can now run in parallel when there are more than 32 pending sig shares (nMaxBatchSize). The collection step is cheap (no BLS ops).

Validation

- Preserves the existing drain behavior.
- Each CollectPendingSigSharesToVerify call removes its batch from the pending queue, preventing duplicate work across workers.