Moved tests to using testcontainers #232
Walkthrough
The PR adds three optional dev dependencies in …
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Hey - I've found 1 issue, and left some high level feedback:
- The new `expdb_test` fixture no longer uses `automatic_rollback`, so test cases may now leak state into each other and depend on execution order; consider wrapping the Testcontainers-backed connection in the same rollback helper to preserve isolation.
- Overriding `MySqlContainer._connect` globally in an autouse fixture tightly couples tests to a private implementation detail of testcontainers; if possible, prefer configuring an official wait strategy on the container instance instead of monkey-patching a private method at module scope.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `expdb_test` fixture no longer uses `automatic_rollback`, so test cases may now leak state into each other and depend on execution order; consider wrapping the Testcontainers-backed connection in the same rollback helper to preserve isolation.
- Overriding `MySqlContainer._connect` globally in an autouse fixture tightly couples tests to a private implementation detail of testcontainers; if possible, prefer configuring an official wait strategy on the container instance instead of monkey-patching a private method at module scope.
## Individual Comments
### Comment 1
<location> `tests/conftest.py:67-74` </location>
<code_context>
+ container.stop()
+
+
+@pytest.fixture
+def expdb_test(mysql_container: MySqlContainer) -> Connection:
+ url = mysql_container.get_connection_url()
+ url = url.replace("mysql://", "mysql+pymysql://")
+
+ engine = sqlalchemy.create_engine(url)
+ with engine.begin() as connection:
+ yield connection
+
+
</code_context>
<issue_to_address>
**issue (testing):** The new `expdb_test` fixture no longer wraps connections in `automatic_rollback`, so test data will persist across tests and may break isolation.
With `engine.begin()` the transaction is committed on context exit, so writes from one test will persist into the next and can cause order-dependent, flaky tests when sharing the same database. To preserve isolation, either wrap the dynamic `engine` with `automatic_rollback`, explicitly manage a transaction and roll it back per test, or add a fixture that truncates/cleans the DB between tests while still using Testcontainers.
</issue_to_address>
tests/conftest.py (outdated)

    @pytest.fixture
    def expdb_test(mysql_container: MySqlContainer) -> Connection:
        url = mysql_container.get_connection_url()
        url = url.replace("mysql://", "mysql+pymysql://")

        engine = sqlalchemy.create_engine(url)
        with engine.begin() as connection:
            yield connection

**issue (testing):** The new `expdb_test` fixture no longer wraps connections in `automatic_rollback`, so test data will persist across tests and may break isolation.
With `engine.begin()` the transaction is committed on context exit, so writes from one test will persist into the next and can cause order-dependent, flaky tests when sharing the same database. To preserve isolation, either wrap the dynamic `engine` with `automatic_rollback`, explicitly manage a transaction and roll it back per test, or add a fixture that truncates/cleans the DB between tests while still using Testcontainers.
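The difference between commit-on-exit and rollback-on-exit can be seen in a small, library-independent sketch. It uses the standard-library `sqlite3` module as a stand-in for the MySQL/SQLAlchemy stack, and `rollback_on_exit`, `count_rows`, and the `flow` table are hypothetical names for illustration; this is not the project's `automatic_rollback` helper.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def rollback_on_exit(connection: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Yield the connection and always roll back, even if the body succeeds."""
    try:
        yield connection
    finally:
        connection.rollback()


def count_rows(connection: sqlite3.Connection) -> int:
    """Count rows in the hypothetical `flow` table."""
    return connection.execute("SELECT COUNT(*) FROM flow").fetchone()[0]


database = sqlite3.connect(":memory:")
database.execute("CREATE TABLE flow (name TEXT)")
database.commit()

# Commit-on-exit: `with connection:` commits when the block succeeds,
# so the row written here is visible to everything that runs afterwards.
with database:
    database.execute("INSERT INTO flow VALUES ('leaked')")
rows_after_commit = count_rows(database)

# Rollback-on-exit: the row is visible inside the block but discarded after it.
with rollback_on_exit(database) as connection:
    connection.execute("INSERT INTO flow VALUES ('discarded')")
    rows_inside = count_rows(connection)
rows_after_rollback = count_rows(database)
```

In a test suite, the second pattern is what keeps one test's writes from being observed by the next test that shares the same database.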
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@tests/conftest.py`:
- Line 45: Remove the unused lint suppression by deleting the trailing " #
noqa: SLF001" on the assignment MySqlContainer._connect = _connect so the line
reads only the assignment; this drops the unnecessary directive and resolves the
Ruff unused-directive warning while leaving the monkeypatch intact.
- Around line 48-64: The mysql_container fixture currently calls
container.start() unguarded and will hard-fail when Docker isn't reachable; wrap
the container.start() call in a try/except that catches
docker.errors.DockerException and calls pytest.skip(...) to skip the session
tests when Docker is unavailable (ensure you import pytest and docker.errors if
not present), then proceed to yield container and keep the existing finally
container.stop() behavior so cleanup still runs if start succeeded.
- Around line 29-44: The fixture override_testcontainers_connect is
monkey-patching the private MySqlContainer._connect; remove that override and
instead use the public waiting utilities—either call wait_for_logs(container,
r"ready for connections", timeout=60, interval=1) after container startup (use
the testcontainers.core.waiting_utils.wait_for_logs function) or create a
subclass (e.g., MySqlContainerWithWait) and implement _connect decorated with
`@wait_container_is_ready` to handle retry logic; in short: stop overriding
_connect directly in override_testcontainers_connect and switch to wait_for_logs
or a decorated subclass of MySqlContainer for log-based readiness.
♻️ Duplicate comments (1)
tests/conftest.py (1)
67-74: `engine.begin()` commits on exit; tests may leak data across cases.
Line 73 uses `engine.begin()`, which commits on context exit. That breaks isolation and can make tests order-dependent. Prefer `automatic_rollback(engine)` or explicit rollback per test.
✅ Safer rollback-based fixture

      engine = sqlalchemy.create_engine(url)
    - with engine.begin() as connection:
    + with automatic_rollback(engine) as connection:
          yield connection

In SQLAlchemy, does `engine.begin()` commit on context exit, and what is the recommended pattern for rollback-based test fixtures?
🧹 Nitpick comments (1)
pyproject.toml (1)
35-37: Confirm and constrain new dev dependency versions.
Lines 35–37 add unpinned dev deps; to keep CI reproducible and avoid breaking changes, consider adding compatible version ranges (or a constraints file) after confirming they support Python 3.12 and your SQLAlchemy/MySQL stack.
What are the latest stable versions of testcontainers, pymysql, and cryptography, and do they support Python 3.12? Any breaking changes that could affect SQLAlchemy or PyMySQL usage?

    @pytest.fixture(scope="session", autouse=True)
    def override_testcontainers_connect() -> None:
        """
        Override MySqlContainer._connect once per test session.
        Applied automatically everywhere.
        """

        def _connect(self: MySqlContainer) -> None:
            wait_strategy = LogMessageWaitStrategy(
                re.compile(
                    r".*: ready for connections",
                    flags=re.DOTALL | re.MULTILINE,
                )
            )
            wait_strategy.wait_until_ready(self)

🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🌐 Web query:
testcontainers-python MySqlContainer wait strategy customize public API
💡 Result:
In testcontainers-python, MySqlContainer doesn’t expose a Java-style “wait strategy” hook on the class itself; the public API for customizing readiness is via the waiting utilities (log-based waiting) and/or by subclassing and overriding the connection check.
1) Log-based waiting (public API)
Use testcontainers.core.waiting_utils.wait_for_logs(container, predicate, timeout=None, interval=1) after the container is started (or inside a custom container’s startup flow). The predicate can be a regex string or a callable. The function supports timeout and interval. [1]

    from testcontainers.mysql import MySqlContainer
    from testcontainers.core.waiting_utils import wait_for_logs

    with MySqlContainer("mysql:8.0") as mysql:
        wait_for_logs(mysql, r"ready for connections", timeout=60, interval=1)

MySqlContainer itself is the standard MySQL wrapper you’re already using. [2]
2) Connection-based waiting via @wait_container_is_ready (public API)
If you want “wait until I can connect” semantics (common for DBs), subclass MySqlContainer and decorate your _connect() with wait_container_is_ready(...) so failures are retried until success/timeout. [3]

    import mysql.connector
    from testcontainers.mysql import MySqlContainer
    from testcontainers.core.waiting_utils import wait_container_is_ready

    class MyReadyMySql(MySqlContainer):
        @wait_container_is_ready(mysql.connector.Error)
        def _connect(self):
            conn = mysql.connector.connect(
                host=self.get_container_host_ip(),
                port=int(self.get_exposed_port(3306)),
                user=self.username,
                password=self.password,
                database=self.dbname,
            )
            conn.close()

    with MyReadyMySql("mysql:8.0") as mysql:
        ...

3) Tuning retry/timeout behavior
Retry limits/sleep/timeout for readiness waiting are controlled via testcontainers_config (global configuration). [3]

    from testcontainers.core.config import testcontainers_config

    testcontainers_config.timeout = 180
    testcontainers_config.max_tries = 180
    testcontainers_config.sleep_time = 1

Sources: [1] [2] [3]
Use wait_for_logs() or subclass with @wait_container_is_ready() instead of monkey-patching _connect.
Line 36 overrides a private method, which breaks on testcontainers-python upgrades if internals change. testcontainers-python provides public APIs:
- `wait_for_logs()`: Call after container startup with a regex predicate for log-based readiness
- Subclass + `@wait_container_is_ready()` decorator: Override `_connect` in a subclass with the decorator to handle retry logic safely
Example using wait_for_logs:

    from testcontainers.mysql import MySqlContainer
    from testcontainers.core.waiting_utils import wait_for_logs

    with MySqlContainer("mysql:8.0") as mysql:
        wait_for_logs(mysql, r"ready for connections", timeout=60, interval=1)

🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 29 - 44, The fixture
override_testcontainers_connect is monkey-patching the private
MySqlContainer._connect; remove that override and instead use the public waiting
utilities—either call wait_for_logs(container, r"ready for connections",
timeout=60, interval=1) after container startup (use the
testcontainers.core.waiting_utils.wait_for_logs function) or create a subclass
(e.g., MySqlContainerWithWait) and implement _connect decorated with
`@wait_container_is_ready` to handle retry logic; in short: stop overriding
_connect directly in override_testcontainers_connect and switch to wait_for_logs
or a decorated subclass of MySqlContainer for log-based readiness.
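To make the idea of log-based readiness concrete, here is a minimal, library-independent sketch of polling a log source for a regex with a timeout and interval. It is not the testcontainers implementation; `wait_for_pattern`, `read_fake_logs`, and the fake log lines are hypothetical names and data used only for illustration.

```python
import re
import time
from collections.abc import Callable


def wait_for_pattern(
    read_logs: Callable[[], str],
    pattern: str,
    timeout: float = 60.0,
    interval: float = 1.0,
) -> float:
    """Poll `read_logs` until `pattern` matches; return the seconds waited."""
    compiled = re.compile(pattern, flags=re.MULTILINE)
    started = time.monotonic()
    while True:
        waited = time.monotonic() - started
        if compiled.search(read_logs()):
            return waited
        if waited > timeout:
            raise TimeoutError(f"no match for {pattern!r} within {timeout:.1f}s")
        time.sleep(interval)


# Hypothetical log source: each poll sees one more line of made-up output.
log_snapshots = iter(
    [
        "[Note] Initializing database",
        "[Note] Initializing database\n[Note] mysqld: ready for connections.",
    ]
)


def read_fake_logs() -> str:
    return next(log_snapshots)


waited = wait_for_pattern(
    read_fake_logs, r"ready for connections", timeout=5, interval=0.01
)
```

The point of the sketch is that readiness is decided by the container's own output, with an explicit upper bound on how long the test session waits.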

                )
                wait_strategy.wait_until_ready(self)

        MySqlContainer._connect = _connect  # noqa: SLF001

Remove unused # noqa: SLF001.
Ruff reports this directive as unused at Line 45; drop it to keep lint clean.
🧹 Proposed cleanup

    - MySqlContainer._connect = _connect # noqa: SLF001
    + MySqlContainer._connect = _connect

🧰 Tools
🪛 Ruff (0.14.13)
45-45: Unused noqa directive (non-enabled: SLF001)
Remove unused noqa directive
(RUF100)
🤖 Prompt for AI Agents
In `@tests/conftest.py` at line 45, Remove the unused lint suppression by deleting
the trailing " # noqa: SLF001" on the assignment MySqlContainer._connect =
_connect so the line reads only the assignment; this drops the unnecessary
directive and resolves the Ruff unused-directive warning while leaving the
monkeypatch intact.

    @pytest.fixture(scope="session")
    def mysql_container() -> MySqlContainer:
        container = MySqlContainer(
            os.environ.get(
                "OPENML_DATABASES_OPENML_URL",
                "openml/test-database:20240105",
            ),
            username=os.environ.get("OPENML_DATABASES_OPENML_USERNAME", ""),
            password=os.environ.get("OPENML_DATABASES_OPENML_PASSWORD", ""),
            dbname="openml_expdb",
        )

        container.start()
        try:
            yield container
        finally:
            container.stop()

🧩 Analysis chain
🌐 Web query:
testcontainers-python MySqlContainer.start() Docker unavailable exception type
💡 Result:
In testcontainers-python, MySqlContainer.start() ultimately initializes the Docker SDK client (docker / “docker-py”). If Docker isn’t reachable (daemon not running, wrong DOCKER_HOST, etc.), the failure you’ll see is typically:
docker.errors.DockerException(often with a message like “Error while fetching server API version …”) [1][2]
So, the exception type to catch for “Docker unavailable” is generally docker.errors.DockerException. [1][2]
🏁 Script executed:
# Check if there are existing patterns for Docker error handling in the codebase
rg -i "docker|container" tests/conftest.py -A 3 -B 3
Repository: openml/server-api
Length of output: 1397
🏁 Script executed:
# Search for any existing pytest.skip usage patterns related to Docker or external services
rg "pytest.skip" --type py -B 2 -A 2
Repository: openml/server-api
Length of output: 2038
🏁 Script executed:
# Check imports in conftest.py to understand what's currently imported
head -30 tests/conftest.py
Repository: openml/server-api
Length of output: 840
🏁 Script executed:
# Check if there are any other exception handling patterns for testcontainers in the codebase
rg "testcontainers|MySqlContainer" --type py
Repository: openml/server-api
Length of output: 622
🏁 Script executed:
# Verify the fixture is used as a dependency in tests to understand impact scope
rg "mysql_container" tests/ --type py
Repository: openml/server-api
Length of output: 265
Skip tests gracefully when Docker isn't available (CI currently fails).
The pipeline reports a Docker connection error; container.start() will hard-fail the suite when Docker isn't reachable. Wrap the start call in a try-except to catch docker.errors.DockerException and skip tests in non-Docker environments.
🛠️ Proposed fix

    +from docker.errors import DockerException
    +
     @pytest.fixture(scope="session")
     def mysql_container() -> MySqlContainer:
         container = MySqlContainer(
             os.environ.get(
                 "OPENML_DATABASES_OPENML_URL",
                 "openml/test-database:20240105",
             ),
             username=os.environ.get("OPENML_DATABASES_OPENML_USERNAME", ""),
             password=os.environ.get("OPENML_DATABASES_OPENML_PASSWORD", ""),
             dbname="openml_expdb",
         )
    -    container.start()
    +    started = False
    +    try:
    +        container.start()
    +        started = True
    +    except DockerException as exc:
    +        pytest.skip(f"Docker unavailable: {exc}")
         try:
             yield container
         finally:
    -        container.stop()
    +        if started:
    +            container.stop()

🧰 Tools
🪛 GitHub Actions: Tests
[error] 50-50: Failed to initialize MySqlContainer due to docker connection error
🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 48 - 64, The mysql_container fixture
currently calls container.start() unguarded and will hard-fail when Docker isn't
reachable; wrap the container.start() call in a try/except that catches
docker.errors.DockerException and calls pytest.skip(...) to skip the session
tests when Docker is unavailable (ensure you import pytest and docker.errors if
not present), then proceed to yield container and keep the existing finally
container.stop() behavior so cleanup still runs if start succeeded.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tests/conftest.py`:
- Around line 65-73: The expdb_test fixture currently uses engine.begin() which
commits on exit and persists test data; replace that with the same
automatic_rollback pattern used by user_test to ensure test isolation.
Specifically, after creating engine in expdb_test (and keeping mysql_container
and url logic), open the connection using the automatic_rollback context/utility
(the same automatic_rollback used by user_test) and yield that connection
instead of using engine.begin(), so tests roll back automatically and avoid
double-commit interactions with persisted_flow.
- Around line 48-52: Rename the misleading environment variable
OPENML_DATABASES_OPENML_URL to a name that reflects it holds a Docker image
(suggested OPENML_TEST_DATABASE_IMAGE) wherever it's used in tests/conftest.py
and related test configuration; update the call that passes its value into
MySqlContainer to read os.environ.get("OPENML_TEST_DATABASE_IMAGE",
"openml/test-database:20240105") instead of the old name, and search/replace
other references (tests, CI configs, and docs) to the old variable so callers
are updated and the naming is consistent with OPENML_DATABASES_OPENML_USERNAME /
OPENML_DATABASES_OPENML_PASSWORD.
🧹 Nitpick comments (2)
tests/conftest.py (2)
67-68: Fragile URL scheme replacement.
`url.replace("mysql://", "mysql+pymysql://")` matches the substring `mysql://` anywhere in the URL, not only the scheme. A URL that already starts with `mysql+pymysql://` contains that substring, so it would become `mysql+pymysql+pymysql://`. This same pattern is repeated in `user_test`. Prefer a more targeted replacement, or use `urllib.parse` to swap only the scheme component.
Safer alternative

    - url = mysql_container.get_connection_url()
    - url = url.replace("mysql://", "mysql+pymysql://")
    + url = mysql_container.get_connection_url()
    + if url.startswith("mysql://"):
    +     url = "mysql+pymysql://" + url[len("mysql://"):]

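The `urllib.parse` option mentioned in this comment could look roughly like the sketch below. It only touches the scheme component and leaves every other part of the URL alone. `with_pymysql_driver` is a hypothetical helper name, and the URLs are made-up examples rather than values from the project.

```python
from urllib.parse import urlsplit, urlunsplit


def with_pymysql_driver(url: str) -> str:
    """Swap a bare `mysql` scheme for `mysql+pymysql`; leave other URLs untouched."""
    parts = urlsplit(url)
    if parts.scheme == "mysql":
        parts = parts._replace(scheme="mysql+pymysql")
    return urlunsplit(parts)


# Hypothetical connection URLs for illustration only.
plain = "mysql://root:secret@localhost:3306/openml_expdb"
already_prefixed = "mysql+pymysql://root:secret@localhost:3306/openml_expdb"

converted = with_pymysql_driver(plain)
unchanged = with_pymysql_driver(already_prefixed)

# The plain substring replacement rewrites the already-prefixed URL a second time.
naive = already_prefixed.replace("mysql://", "mysql+pymysql://")
```

Because the check is on the parsed scheme, calling the helper twice is harmless, which is not true of the substring replacement.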
46-62: New `Engine` created on every test invocation; consider session or module scope.
Both `expdb_test` and `user_test` call `sqlalchemy.create_engine()` on every test. Engine creation involves connection pool setup. Since the container URL doesn't change, consider creating the engine once at session scope and only yielding fresh connections per test.
Sketch

    @pytest.fixture(scope="session")
    def expdb_engine(mysql_container: MySqlContainer) -> Engine:
        url = mysql_container.get_connection_url()
        if url.startswith("mysql://"):
            url = "mysql+pymysql://" + url[len("mysql://"):]
        return sqlalchemy.create_engine(url)

    @pytest.fixture
    def expdb_test(expdb_engine: Engine) -> Connection:
        with automatic_rollback(expdb_engine) as connection:
            yield connection

Also applies to: 65-73, 85-93

        container = MySqlContainer(
            os.environ.get(
                "OPENML_DATABASES_OPENML_URL",
                "openml/test-database:20240105",
            ),

🧩 Analysis chain
🏁 Script executed:
rg -n "OPENML_DATABASES_OPENML_URL" --type py -C 3
Repository: openml/server-api
Length of output: 476
🏁 Script executed:
# Search for MySqlContainer definition and imports
rg -n "MySqlContainer" --type py -C 5
Repository: openml/server-api
Length of output: 3255
🏁 Script executed:
# Check the testcontainers library documentation or type hints
# Also search for any .md files that might document environment variables
find . -type f -name "*.md" | head -20
Repository: openml/server-api
Length of output: 429
🏁 Script executed:
# Check documentation files for env var documentation
cat docs/contributing/tests.md
Repository: openml/server-api
Length of output: 4491
🏁 Script executed:
# Check docker and installation documentation
cat docker/mysql/README.md
Repository: openml/server-api
Length of output: 2404
🏁 Script executed:
# Check the complete context of the mysql_container fixture
cat -n tests/conftest.py | sed -n '45,65p'
Repository: openml/server-api
Length of output: 732
🏁 Script executed:
# Search for any documentation or comments about OPENML_DATABASES env vars
rg -n "OPENML_DATABASES" --type py --type md -C 2
Repository: openml/server-api
Length of output: 1828
Rename OPENML_DATABASES_OPENML_URL to clarify it's a Docker image name, not a database connection URL.
The env var name OPENML_DATABASES_OPENML_URL misleadingly suggests a database connection string (e.g., mysql://host:3306/db), but it's actually passed as the Docker image name to MySqlContainer. This could cause confusion if someone sets it to an actual database URL, which would fail silently or with a confusing error when the container tries to start. The naming pattern is also inconsistent—adjacent env vars are clearly named USERNAME and PASSWORD. Consider renaming to OPENML_TEST_DATABASE_IMAGE to match its actual purpose.
🧰 Tools
🪛 GitHub Actions: Tests
[error] 48-48: FileNotFoundError during attempting to connect to Docker daemon. Docker API not reachable.
🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 48 - 52, Rename the misleading environment
variable OPENML_DATABASES_OPENML_URL to a name that reflects it holds a Docker
image (suggested OPENML_TEST_DATABASE_IMAGE) wherever it's used in
tests/conftest.py and related test configuration; update the call that passes
its value into MySqlContainer to read
os.environ.get("OPENML_TEST_DATABASE_IMAGE", "openml/test-database:20240105")
instead of the old name, and search/replace other references (tests, CI configs,
and docs) to the old variable so callers are updated and the naming is
consistent with OPENML_DATABASES_OPENML_USERNAME /
OPENML_DATABASES_OPENML_PASSWORD.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/conftest.py (1)
136-167: ⚠️ Potential issue | 🟠 Major
`persisted_flow` interacts subtly with `expdb_test`'s transaction management.
`persisted_flow` explicitly calls `expdb_test.commit()` (line 152) and later `expdb_test.rollback()` (line 156). If `expdb_test` wraps the connection in `engine.begin()`, these explicit transaction controls interact with the context manager's auto-commit/rollback in unpredictable ways. Switching `expdb_test` to `automatic_rollback` (as suggested above) would make these interactions clearer and safer.
🧹 Nitpick comments (2)
tests/conftest.py (2)
86-95: Fragile URL manipulation via chained `str.replace`.
Lines 89–91 use three chained `str.replace()` calls to transform the connection URL. The `replace("openml_expdb", "openml")` on line 91 is particularly risky: if `openml_expdb` appeared in any other part of the URL (e.g., a password or hostname), it would silently corrupt the connection string. Consider using `sqlalchemy.engine.make_url()` to parse and safely modify the database name:
Suggested approach

    from sqlalchemy.engine import make_url

    raw = mysql_container.get_connection_url()
    parsed = make_url(raw).set(drivername="mysql+pymysql", database="openml")
    engine = sqlalchemy.create_engine(parsed)

This same concern applies to `expdb_test` (line 67); `make_url` would be safer there too.
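The corruption risk described here is easy to reproduce without a database. The sketch below uses the standard-library `urllib.parse` as a stand-in for `make_url`, purely so it runs without SQLAlchemy installed; `with_database` is a hypothetical helper, and the URL, including its password, is invented for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def with_database(url: str, database: str) -> str:
    """Return `url` with only its path (the database name) replaced."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=f"/{database}"))


# Hypothetical URL whose password happens to contain the database name.
url = "mysql+pymysql://root:openml_expdb_pw@localhost:3306/openml_expdb"

# The substring replacement rewrites the password as well as the database name.
naive = url.replace("openml_expdb", "openml")

# Replacing only the parsed path leaves the credentials intact.
safe = with_database(url, "openml")
```

Parsing first and editing a single component is the same idea `make_url(...).set(database=...)` expresses in the suggestion above.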
104-113: Fixture looks correct; minor type annotation nit.
The dependency overrides and cleanup logic are sound. The return type `Generator[TestClient, None, None]` works, but `Iterator[TestClient]` would be more idiomatic for a pytest fixture that only yields (no `send()`).
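As a small illustration of the annotation nit, the two generator functions below behave identically at runtime; only the declared return type differs. `client_verbose`, `client_concise`, and the `events` list are hypothetical names, and plain strings stand in for a `TestClient`.

```python
from collections.abc import Generator, Iterator

events: list[str] = []


def client_verbose() -> Generator[str, None, None]:
    """Spells out yield, send, and return types even though only yield is used."""
    events.append("setup")
    yield "client"
    events.append("teardown")


def client_concise() -> Iterator[str]:
    """Same behavior; the annotation names only the yielded type."""
    events.append("setup")
    yield "client"
    events.append("teardown")


# Drive both generators the way a fixture runner would: setup, use, teardown.
for fixture in (client_verbose, client_concise):
    for client in fixture():
        events.append(f"use {client}")
```

Either annotation is accepted by type checkers for a function that only yields, so the choice is about readability rather than correctness.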
Moved tests to using testcontainers. It does seem to work but it might be good to verify if this is what was intended. :)