Skip to content

fix: handle resetter channel errors#22

Open
0x416e746f6e wants to merge 6 commits intomainfrom
fix/lagging-resetter-channel
Open

fix: handle resetter channel errors#22
0x416e746f6e wants to merge 6 commits intomainfrom
fix/lagging-resetter-channel

Conversation

@0x416e746f6e
Copy link
Member

reset signal channels didn't check for errors on receive, and would only stop the subproxy if receive was Ok. however, it's possible that receiver lags, in which case .recv() would yield Err, which would cause the resetter task to exit w/o stopping the web-service.

consequently, this can cause one of the sub-proxies to remain active, while 2 others would stop, and the whole proxy would endlessly wait in the join_all.

this PR fixes above condition, improves error handling in general, and adds basic unit test.

@akundaz
Copy link
Contributor

akundaz commented Feb 10, 2026

Have you tried using cancel tokens for this? It would completely eliminate the need to account for weird receiver behavior

@0x416e746f6e
Copy link
Member Author

Have you tried using cancel tokens for this? It would completely eliminate the need to account for weird receiver behavior

Cancel token is used to initiate graceful shutdown. Problem with it is that it's a one-time thing (which makes sense for graceful-shutdown), while circuit-breaker should be able to send the reset signal multiple times.

@akundaz
Copy link
Contributor

akundaz commented Feb 18, 2026

Have you tried using cancel tokens for this? It would completely eliminate the need to account for weird receiver behavior

Cancel token is used to initiate graceful shutdown. Problem with it is that it's a one-time thing (which makes sense for graceful-shutdown), while circuit-breaker should be able to send the reset signal multiple times.

But handler.stop can only be called once, right? So after you shut down the http proxy task, you'll need to start a new one, at which point it makes sense to tie it to a new cancel token.

@0x416e746f6e
Copy link
Member Author

Have you tried using cancel tokens for this? It would completely eliminate the need to account for weird receiver behavior

Cancel token is used to initiate graceful shutdown. Problem with it is that it's a one-time thing (which makes sense for graceful-shutdown), while circuit-breaker should be able to send the reset signal multiple times.

But handler.stop can only be called once, right? So after you shut down the http proxy task, you'll need to start a new one, at which point it makes sense to tie it to a new cancel token.

canceller stops everything (e.g. metrics server). resetter only proxies.

@akundaz
Copy link
Contributor

akundaz commented Feb 19, 2026

canceller stops everything (e.g. metrics server). resetter only proxies.

Sorry I wasn't being too descriptive. Cancel tokens only stop the tasks that listen to them. So you should create separate tokens and reset tasks for each proxy you're running, since they should be able to reset independently. If you just want to keep on reset task for now that's fine too.

To tie it to the other cancel token for the shutdown you can use the .child_token() fn. So you'd have something like this

let shutdown_signal: CancellationToken = Server::wait_for_shutdown_signal();
let reset_signal: CancellationToken = Server::wait_for_reset_signal(shutdown_signal.clone());

and in Server::wait_for_reset_signal:

fn wait_for_reset_signal(shutdown_signal: CancellationToken) -> CancellationToken {
    let reset_signal = shutdown_signal.child_token();

    tokio::spawn({
        let reset_signal = reset_signal.clone();
        let shutdown_signal = shutdown_signal.clone();
        async move {
            let mut hangup =
                signal(SignalKind::hangup()).expect("failed to install sighup handler");

            loop {
                tokio::select! {
                    _ = shutdown_signal.cancelled() => break,
                    _ = hangup.recv() => {
                        info!("Hangup signal received, resetting...");
                        reset_signal.cancel();
                    }
                }
            }
        }
    });

    reset_signal
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments