Description
Summary
The zyte-api CLI and the AsyncZyteAPI / ZyteAPI Python clients fail with a
ClientConnectorDNSError in any environment where outbound HTTP traffic is routed
through a system proxy, even when HTTP_PROXY / HTTPS_PROXY environment variables
are correctly set. The root cause is that aiohttp.ClientSession is created without
trust_env=True, so it bypasses the proxy entirely and attempts direct DNS resolution,
which fails.
Affected environments
Any environment where internet access goes through an HTTP/HTTPS proxy, including:
- Corporate networks with SSL inspection proxies
- Docker containers without host networking (proxy injected via env vars)
- CI/CD runners (GitHub Actions, GitLab CI, etc.) in restricted networks
- AI agent sandboxes (e.g. Claude.ai computer use, similar agentic platforms)
- Cloud VPC environments with NAT gateways or egress proxies
Steps to reproduce
- Set up an environment with HTTPS_PROXY / HTTP_PROXY pointing to a working proxy.
- Confirm outbound connectivity works via curl (which respects proxy env vars by default).
- Run the CLI:
    zyte-api urls.txt --api-key YOUR_KEY --output results.jsonl
Expected: requests route through the proxy and reach api.zyte.com successfully.
Actual:
ERROR:zyte_api:Cannot connect to host api.zyte.com:443 ssl:default
[Temporary failure in name resolution]
Successful URLs: 0 of N
Success ratio: 0.0%
Exception types: [(<class 'aiohttp.client_exceptions.ClientConnectorDNSError'>, N)]
Note: --api-url https://api.zyte.com/v1/ does not fix this — the issue is not
the URL but aiohttp's failure to route through the proxy.
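To check which proxy settings the environment actually exposes to Python, a small diagnostic along these lines may help. It is only a sketch: it assumes that a session created with trust_env=True discovers proxies through the same standard-library lookup (urllib.request.getproxies), and the helper name env_proxies and the address proxy.example:3128 are placeholders, not part of zyte-api.

```python
import os
from unittest import mock
from urllib.request import getproxies


def env_proxies() -> dict[str, str]:
    """Return the http/https proxies the standard library finds.

    Assumption: a session with trust_env=True consults this same lookup,
    while a session without it never looks at the environment at all.
    """
    found = getproxies()
    return {scheme: url for scheme, url in found.items() if scheme in ("http", "https")}


# Simulate a proxy-only environment; the proxy address is a placeholder.
fake_env = {"HTTPS_PROXY": "http://proxy.example:3128"}
with mock.patch.dict(os.environ, fake_env, clear=True):
    detected = env_proxies()

print(detected)
```

If this prints a non-empty mapping in the failing environment while the CLI still reports a DNS error, the proxy variables are visible to Python and simply not being used.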
Root cause
In zyte_api/_utils.py, create_session() creates the aiohttp.ClientSession without
trust_env=True:
def create_session(
    connection_pool_size: int = 100, **kwargs: Any
) -> aiohttp.ClientSession:
    kwargs.setdefault("timeout", _AIO_API_TIMEOUT)
    if "connector" not in kwargs:
        kwargs["connector"] = TCPConnector(limit=connection_pool_size, force_close=True)
    return aiohttp.ClientSession(**kwargs)  # <-- trust_env not set
aiohttp does support reading proxy env vars, but only when trust_env=True is
explicitly passed. Without it, it opens TCP connections directly, performing its own
local DNS resolution, which fails in proxy-only environments.
This is confirmed by the aiohttp docs:
https://docs.aiohttp.org/en/stable/client_advanced.html#proxy-support
"To use proxy env variables, pass trust_env=True to ClientSession"
Confirmation
Monkey-patching create_session to inject trust_env=True resolves the issue
completely:
import zyte_api._utils as _u
_orig = _u.create_session
def _patched(*a, **kw):
    kw['trust_env'] = True
    return _orig(*a, **kw)
_u.create_session = _patched
After this patch, the CLI runs successfully (2/2 URLs, 100% success rate) in an
environment where it previously failed with 0% success.
Proposed fix
1: Set trust_env=True
In zyte_api/_utils.py:
def create_session(
    connection_pool_size: int = 100, **kwargs: Any
) -> aiohttp.ClientSession:
    kwargs.setdefault("timeout", _AIO_API_TIMEOUT)
    kwargs.setdefault("trust_env", True)  # <-- add this line
    if "connector" not in kwargs:
        kwargs["connector"] = TCPConnector(limit=connection_pool_size, force_close=True)
    return aiohttp.ClientSession(**kwargs)
Using setdefault means callers who explicitly pass trust_env=False retain control.
This is the minimal, low-risk change.
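The opt-out behaviour of setdefault can be checked without aiohttp at all. The sketch below uses a hypothetical stand-in, session_kwargs, that mirrors only the keyword handling of create_session(); it is not the library's code.

```python
from typing import Any


def session_kwargs(**kwargs: Any) -> dict[str, Any]:
    """Hypothetical stand-in for the keyword handling in create_session()."""
    # setdefault only fills the key when the caller did not supply it.
    kwargs.setdefault("trust_env", True)
    return kwargs


default_kwargs = session_kwargs()                # caller says nothing
opted_out = session_kwargs(trust_env=False)      # caller opts out explicitly

print(default_kwargs["trust_env"], opted_out["trust_env"])
```

A regression test for the real function could follow the same shape, asserting on the trust_env value of the returned session.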
or
2: Expose as a CLI flag
Add --trust-env / --no-trust-env to the CLI argument parser, defaulting to True.
This gives users explicit control and makes the behaviour discoverable. This could be
combined with option 1 (default on, opt-out via flag).
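One way such a flag pair could be declared is with argparse.BooleanOptionalAction (Python 3.9+). This is a sketch with a hypothetical, stripped-down parser; the real zyte-api CLI defines its own arguments, and how the value would be forwarded to the session is left out here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the proposed flag."""
    parser = argparse.ArgumentParser(prog="zyte-api")
    parser.add_argument(
        "--trust-env",
        action=argparse.BooleanOptionalAction,  # also generates --no-trust-env
        default=True,
        help="Read proxy settings from HTTP_PROXY / HTTPS_PROXY.",
    )
    return parser


parser = build_parser()
default_args = parser.parse_args([])                  # flag omitted
opt_out_args = parser.parse_args(["--no-trust-env"])  # explicit opt-out

print(default_args.trust_env, opt_out_args.trust_env)
```

The parsed args.trust_env value would then be passed through to create_session(trust_env=...), which already accepts arbitrary keyword arguments.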
or
3: Auto-detect proxy env vars
Check whether HTTP_PROXY / HTTPS_PROXY / ALL_PROXY are set in the environment
and automatically enable trust_env=True only in that case. Slightly more complex but
makes the behaviour fully transparent.
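The detection step could look roughly like the following. The function name should_trust_env is hypothetical, and checking the lowercase spellings is an added assumption based on common tool conventions; whether aiohttp honours every one of these variables is not verified here.

```python
import os
from collections.abc import Mapping

# Variables named in this proposal.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def should_trust_env(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when any proxy variable is set to a non-empty value."""
    # Assumption: lowercase variants are checked too, as many tools accept them.
    names = PROXY_VARS + tuple(name.lower() for name in PROXY_VARS)
    return any(environ.get(name, "").strip() for name in names)


print(should_trust_env({}))
print(should_trust_env({"https_proxy": "http://proxy.example:3128"}))
```

Passing the environment as a parameter keeps the check easy to test; create_session() could then call kwargs.setdefault("trust_env", should_trust_env()).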
Recommendation
Option 1 is the right default. trust_env=True has no downside in environments without
a proxy: aiohttp simply finds no proxy vars and behaves identically to today. The
current default of False actively breaks users in proxy environments with no obvious
error message pointing to the cause.
Environment
- zyte-api version: 0.8.1
- Python: 3.12
- aiohttp version: (whatever ships with 0.8.1)
- OS: Linux (Ubuntu 24, sandboxed container)
- Proxy: HTTP CONNECT proxy via HTTPS_PROXY env var