-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Introduce a 10s connect timeout #17733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
See #17697 for discussion. Introduce a 10s connect timeout, configurable with `UV_HTTP_CONNECT_TIMEOUT`. The existing HTTP timeout gets the more precise description as HTTP read timeout.
EliteTK
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. But why not UV_CONNECT_HTTP_TIMEOUT for consistency with the upload one?
|
|
|
Though maybe it should be |
|
I have no opinion on the order of words :) I can update in either direction. |
Okay, that makes sense. Then I think the code shouldn't rename |
crates/uv-settings/src/lib.rs
Outdated
| pub http_timeout: Duration, | ||
| pub http_read_timeout: Duration, | ||
| pub http_connect_timeout: Duration, | ||
| pub http_upload_timeout: Duration, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems confusing to rename. This is the http read timeout for uv publish file uploads. We should probably call it http_read_timeout_upload or something?
| /// Handle a specific `reqwest` error, and convert it to [`io::Error`]. | ||
| fn handle_response_errors(&self, err: reqwest::Error) -> io::Error { | ||
| if err.is_timeout() { | ||
| // Assumption: The connect timeout with the 10s default is not the culprit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we validate this assumption based on the duration of the actual timeout that occurred or the type of the error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Or, why do you think this assumption is fine?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assumption is that if you're experiencing a network failure, the (default) UV_CONNECT_TIMEOUT is not the culprit and we don't want to guide users towards it. I'm assuming that if you can't connect to a server in 10s and 3 retries, your problem isn't solved by increasing the timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'm not sure about whether anyone actually needs to bump UV_HTTP_TIMEOUT either (i'm not aware of cases where nothing happened for 30s but it recovered after that), now that it's a read timeout instead of a request timeout (trying to confirm with the one major user: apache/airflow#61205). This error should be hardly reachable now, and most likely indicates that either the server is down, or you're on a network so unreliable that the connection broke 4 times in ways that reqwest/hyper couldn't recover from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the DNS is down or something won't we then recommend increasing the read timeout erroneously though?
now that it's a read timeout instead of a request timeout
When did this change again?
This error should be hardly reachable now, and most likely indicates that either the server is down, or you're on a network so unreliable that the connection broke 4 times in ways that reqwest/hyper couldn't recover from.
This seems contrary to the claim that users won't hit the connect timeout here :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was imprecise, all the basic cases we handle properly.
If you hit this timeout something is down in a weird way, think a server starting to respond but then pulling from an upstream that never responds. Testing this I also found deactivating and reactivating network in under the timeout on my linux machine causes a timeout with 3 retries, which sounds like a bug in reqwest? (but doesn't rely on the timeout)
Reproducing the timeout requires a response far enough that uv doesn't reject it immediately, but then still hangs without sending more data.
The examples below for how we handle the basic failure cases use #17735 (stacked on top of this).
Using the wrong URL:
time uv pip install tqdm --index-url https://thisdoesnotexist.nope
⠦ Resolving dependencies... error: Request failed after 3 retries
Caused by: Failed to fetch: `https://thisdoesnotexist.nope/tqdm/`
Caused by: error sending request for url (https://thisdoesnotexist.nope/tqdm/)
Caused by: client error (Connect)
Caused by: dns error
Caused by: failed to lookup address information: Name or service not known
real 0m2,932s
user 0m0,043s
sys 0m0,032s
A server that is down (index):
$ time uv-debug pip install tqdm --index-url https://example.com:81
⠧ Resolving dependencies... error: Request failed after 3 retries
Caused by: Failed to fetch: `https://example.com:81/tqdm/`
Caused by: error sending request for url (https://example.com:81/tqdm/)
Caused by: client error (Connect)
Caused by: operation timed out
real 0m46,469s
user 0m0,084s
sys 0m0,040s
A server that is down (streaming):
$ time uv-debug pip install https://example.com:81/tqdm-0.1-py3-none-any.whl
× Failed to download `tqdm @ https://example.com:81/tqdm-0.1-py3-none-any.whl`
├─▶ Request failed after 3 retries
├─▶ Failed to fetch: `https://example.com:81/tqdm-0.1-py3-none-any.whl`
├─▶ error sending request for url (https://example.com:81/tqdm-0.1-py3-none-any.whl)
├─▶ client error (Connect)
╰─▶ operation timed out
real 0m44,384s
user 0m0,085s
sys 0m0,048s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the basic cases, the branch and 0.9.28 show the same messages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reported upstream: seanmonstar/reqwest#2956
See #17697 for discussion.
Introduce a 10s connect timeout, configurable with
UV_HTTP_CONNECT_TIMEOUT. The existing HTTP timeout gets the more precise description as HTTP read timeout.