feat(medcat-trainer): Add OTEL tracing. Add docker build cache by alhendrickson · Pull Request #364 · CogStack/cogstack-nlp

alhendrickson · 2026-03-09T18:43:25Z

What does this do?

This adds optional tracing to medcat-trainer

It makes a span for every API call recieved, and every request out that is made. Can optionally enable all the sql calls but it's super noisy.

I've also added more spans/detail/events for calls to SOLR, Medcat library, and medcat-service calls

Example

See calls to medcat-service. The trace propagates through so can see the underlying call to cat.get_entities is actually coming from the remote service

Call to SOLR - showing solr returns 404, but our code has a JSONDecodeError

…-service calls

mart-r

Looks good to me.

A comment about all the dependencies. I.e since I don't think they're main dependencies, we may want to group them together as an optional extra. But not sure its worth it.

mart-r · 2026-03-10T13:06:22Z

medcat-trainer/webapp/pyproject.toml

    "django-health-check>=3.22",
    "requests>=2.31",
+    "opentelemetry-sdk>=1.40.0",
+    "opentelemetry-distro[otlp]>=0.61b0",


Can these perhaps be set up as an optional extra? I.e deep-telemetry or something? And if that's enabled, you get all those along with all the detailed telemetry instead of the basic ones?

That would - of course - involve setting up another env var for this, reading it, and changing the install command based on it in the Dockerfile - so may not be worth it.

Hey I get this idea, but I don't think it's really worth it it in this exact scenario

I 100% agree with doing this if this if this was a library. However here I kind of don't want to support trainer being installed in any other way outside of the docker image we build, and I'd rather not make different docker images for these deps. Looking around most things just bake it into the image (eg keycloak, solr, prefect), they all just have it ready to be turned on in the image). The libraries promise to be small and noop though

In future I'd like to add tracing into the medcat library itself, so when that happens then I'd definitely go with your idea and make them an optional dependency there. May go with langfuse & would really not want to bring that dep in for the 99.999% of people that dont need it.

My main idea was to split into multiple images (with and without these extras). But you're right, it doesn't really make a lot of sense for us to manage multiple different types of images for the same thing (at least not at this time or for this specific reason).

With that said, we could still separate these to an optional extra. And since we control the way the image is built, we can just always include them.
Doesn't really change much other than just separating a bunch of these dependencies into a separate list. While this might be a little confusing as well, to me it would clear up the visible dependencies of the project. I.e the project doesn't necessarily depend on all of these. They're nice to haves, not necessities. We'll bake them in anyway, but it might be nice to separate the true dependencies (i.e something this wouldn't work without) from the nice-to-haves.

But this is still a matter of opinion. Having a separate list for packages that always get included anyway might also become a burden (i.e "which list do I add this new thing to?").

Cool! Yeah I see your point here, it makes a lot of sense to separate these a bit

I've updated here - I've put them into a different group (noting to include the deps directly imported), but still hardcoded in the image to include the extra ones. I'm thinking this is the mid path you're looking for

I've tried to follow https://opentelemetry.io/docs/concepts/instrumentation/libraries/#libraries-should-only-use-the-opentelemetry-api with the required deps. It says to bring in only opentelemetry-api at version 1.0.0 or higher. To note, this is likely to be the pattern followed if we were to also do this inside medcat, the 2 deps would probably be added for all users but they promise that they do nothing unless the opentelemetry-sdk is included too.

I've tried to follow https://opentelemetry.io/docs/concepts/instrumentation/libraries/#libraries-should-only-use-the-opentelemetry-api with the required deps. It says to bring in only opentelemetry-api at version 1.0.0 or higher. To note, this is likely to be the pattern followed if we were to also do this inside medcat, the 2 deps would probably be added for all users but they promise that they do nothing unless the opentelemetry-sdk is included too.

This might not be as relevant here as this is trainer - it's not a library someone else will install. But certainly will be for future MedCAT use cases.

Looks good to me overall!

…-service calls - cleanup pyproject.toml

…-service calls - cleanup pyproject.toml -uv lock

…-service calls - split deps into observability optional group

…semconv

alhendrickson · 2026-03-10T16:31:48Z

medcat-trainer/webapp/pyproject.toml

+torch = { index = "pytorch-cpu" }
+
+[[tool.uv.index]]
+name = "pytorch-cpu"


For future reference - This was added to fix this error. Open to this being the wrong approach, but this does fix it at least. Can't see a wheel being used in medcat-v2 to best understand what the right pattern is for it - but it feels reasonable to have to opt in to using that index, instead of eveyrthing trying to use it by default

(medcattrainer-webapp) vscode ➜ .../CogStack/cogstack-nlp/medcat-trainer/webapp (main) $ uv add opentelemetry-sdk Resolved 105 packages in 2.68s error: Distribution `markupsafe==3.0.3 @ registry+https://download.pytorch.org/whl/cpu/` can't be installed because it doesn't have a source distribution or wheel for the current platform hint: You're using CPython 3.13 (`cp313`), but `markupsafe` (v3.0.3) only has wheels with the following Python implementation tag: `cp314`

I think trainer is unlikely to be running in GPU enabled situations (at least for now). Plus, this will also make the image smaller since it won't be installing all the CUDA stuff.

With that said, it's one of those things I've seen before. Where uv picks a version for you to use that's not actually usable in your environment. If I recall, this is because uv (in order to be as quick as it is at resolving dependencies) will skip some (many? not sure) checks regarding specific versions that it picks for usage.

To me, it feels like a lot of newer python stuff (like uv) is built around people who are always on the latest python version. E.g when you build a new website from scratch that depends on very few things other than your web framework. And for those this will not be an issue. But because we depend on medcat, which in turn depends on a dozen other things, which brings in a lot of transitive dependencies as well, we end up a few versions behind on python.

alhendrickson added 3 commits March 9, 2026 18:42

feat(medcat-trainer): Add OTEL tracing. Add docker build cache

3fc8e9a

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

88a2f72

…-service calls

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

05daa01

…-service calls

alhendrickson marked this pull request as ready for review March 10, 2026 11:27

alhendrickson added 2 commits March 10, 2026 11:27

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

3578429

…-service calls

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

f637965

…-service calls

mart-r approved these changes Mar 10, 2026

View reviewed changes

alhendrickson added 6 commits March 10, 2026 15:24

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

3125241

…-service calls - cleanup pyproject.toml

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

efe6bab

…-service calls - cleanup pyproject.toml

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

f0fe268

…-service calls - cleanup pyproject.toml -uv lock

feat(medcat-trainer): Add telemetry spans for Solr, medcat and medcat…

93233aa

…-service calls - split deps into observability optional group

feat(medcat-trainer): Update pyproject.toml with optional deps

024af74

feat(medcat-trainer): Update pyproject.toml with optional deps - add …

488604f

…semconv

alhendrickson commented Mar 10, 2026

View reviewed changes

alhendrickson merged commit b387581 into main Mar 10, 2026
10 checks passed

alhendrickson deleted the feat/medcat-trainer/add-otel-tracing branch March 10, 2026 16:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(medcat-trainer): Add OTEL tracing. Add docker build cache#364

feat(medcat-trainer): Add OTEL tracing. Add docker build cache#364
alhendrickson merged 11 commits intomainfrom
feat/medcat-trainer/add-otel-tracing

alhendrickson commented Mar 9, 2026 •

edited

Loading

Uh oh!

mart-r left a comment

Uh oh!

mart-r Mar 10, 2026

Uh oh!

alhendrickson Mar 10, 2026 •

edited

Loading

Uh oh!

mart-r Mar 10, 2026 •

edited

Loading

Uh oh!

alhendrickson Mar 10, 2026 •

edited

Loading

Uh oh!

mart-r Mar 10, 2026

Uh oh!

alhendrickson Mar 10, 2026 •

edited

Loading

Uh oh!

mart-r Mar 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

alhendrickson commented Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this do?

Example

Uh oh!

mart-r left a comment

Choose a reason for hiding this comment

Uh oh!

mart-r Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

alhendrickson Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mart-r Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alhendrickson Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mart-r Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

alhendrickson Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mart-r Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alhendrickson commented Mar 9, 2026 •

edited

Loading

alhendrickson Mar 10, 2026 •

edited

Loading

mart-r Mar 10, 2026 •

edited

Loading

alhendrickson Mar 10, 2026 •

edited

Loading

alhendrickson Mar 10, 2026 •

edited

Loading