
feat: Added cuda version selection to uv build. #433

Open
BlueCrescent wants to merge 8 commits into main from uv_support_for_different_cuda_versions

Conversation


BlueCrescent (Member) commented Feb 18, 2026

What does this PR do?

Support different CUDA versions in uv installation.

General Changes

  • Added optional dependency variants for PyTorch that use different index URLs when installed via uv.
  • Updated README.
  • Also added B200 to MFU calculator.

Checklist before submitting final PR

  • My PR is minimal and addresses one issue in isolation
  • I have merged the latest version of the target branch into this feature branch
  • I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
  • I have run a sample config for model training
  • I have checked that all tests run through (python tests/tests.py)
  • I have updated the internal changelog (CHANGELOG_DEV.md)


Copilot AI left a comment


Pull request overview

This pull request adds support for different CUDA versions in uv installation by moving PyTorch from a required dependency to an optional dependency with multiple CUDA-specific variants (cpu, cu126, cu128, cu130). The PR configures uv to use different PyTorch index URLs based on the selected CUDA variant.

Changes:

  • Moved torch from required dependencies to optional dependencies with CUDA version variants
  • Added uv-specific configuration for CUDA version conflicts and PyTorch index sources
  • Updated README installation instructions to reflect the new CUDA version selection
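
Concretely, the extras appear in the diff as identically pinned torch requirements, one per CUDA variant. A sketch of the relevant pyproject.toml section, reconstructed from the excerpts quoted in this thread (the full entries may contain additional packages such as torchvision and flash-attn):

```toml
[project.optional-dependencies]
# All variants pin the same torch range; only the wheel index they
# resolve against differs, via uv-specific configuration.
cpu   = ["torch>=2.10,<2.11.0"]
cu126 = ["torch>=2.10,<2.11.0"]
cu128 = ["torch>=2.10,<2.11.0"]
cu130 = ["torch>=2.10,<2.11.0"]
```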

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
pyproject.toml Removed torch from dependencies, added CUDA-specific torch extras (cpu, cu126, cu128, cu130), and configured uv for handling PyTorch installation from different index URLs
README.md Updated installation commands to include CUDA version selection syntax for uv


pyproject.toml Outdated
Comment on lines 46 to 49
cpu = ["torch>=2.10,<2.11.0"]
cu126 = ["torch>=2.10,<2.11.0"]
cu128 = ["torch>=2.10,<2.11.0"]
cu130 = ["torch>=2.10,<2.11.0"]

Copilot AI Feb 18, 2026


The torch dependency has been moved from the main dependencies to optional dependencies, which means torch is no longer installed by default. Users who install the package without specifying one of the CUDA extras (cpu, cu126, cu128, cu130) will not get PyTorch installed. This is a breaking change that should be documented in the CHANGELOG_DEV.md (which the PR description indicates as incomplete).

Consider whether this is the intended behavior. If users must explicitly choose a CUDA version, this should be clearly communicated in the documentation and changelog.

curl -LsSf https://astral.sh/uv/install.sh | sh

uv sync
uv sync --extra [cpu|cu126|cu128|cu130] # Get CUDA version via nvidia-smi
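
One way to make the choice explicit is a small helper that maps the CUDA version reported by nvidia-smi to the matching extra. This is an illustrative sketch, not part of the PR; the `cuda_extra` helper is hypothetical:

```shell
# Hypothetical helper: map a CUDA version string (e.g. "12.8")
# to the matching optional-dependency extra from this PR.
cuda_extra() {
  case "$1" in
    12.6*) echo "cu126" ;;
    12.8*) echo "cu128" ;;
    13.0*) echo "cu130" ;;
    *)     echo "cpu" ;;   # no GPU / unknown version: fall back to CPU wheels
  esac
}

cuda_extra "12.8"   # → cu128
```

Version detection itself is environment-specific; one common approach is reading the `CUDA Version:` field from `nvidia-smi` output and passing the result to `uv sync --extra`.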

Copilot AI Feb 18, 2026


The README shows installation commands using bracket notation [cpu|cu126|cu128|cu130], but this is not valid shell syntax. Users cannot literally type this command and expect it to work. The command should be clarified to show that users need to choose ONE option, for example:

  • uv sync --extra cpu
  • uv sync --extra cu126
  • uv sync --extra cu128
  • uv sync --extra cu130

Consider using a different notation or providing explicit examples rather than shell pipe syntax inside brackets.


# For developers: use [tests,linting] and install pre-commit hooks
uv sync --extra tests --extra linting
uv sync --extra [cpu|cu126|cu128|cu130] --extra tests --extra linting

Copilot AI Feb 18, 2026


The README shows installation commands using bracket notation [cpu|cu126|cu128|cu130], but this is not valid shell syntax. Users cannot literally type this command and expect it to work. The command should be clarified to show that users need to choose ONE option, for example:

  • uv sync --extra cpu --extra tests --extra linting
  • uv sync --extra cu126 --extra tests --extra linting

Consider using a different notation or providing explicit examples rather than shell pipe syntax inside brackets.

rrutmann (Collaborator) left a comment


In general the PR looks good, but I don't understand why we have the same requirements for all CUDA versions and why we differentiate them in this case. Only the index URLs are different, correct?

#
# NOTE: These values are valid for fp16 and bf16 only
PEAK_PERFORMANCE = {"A100": 312e12, "H100": 989e12, "GH200": 989e12}
# FIXME: For B200, is 2.25 PFLOPS correct or should it be 4.5 PFLOPS?
Collaborator

Did you verify this value? Why are you considering different values?


"torchvision",
"flash-attn==2.8.3; platform_system != 'Darwin' and platform_machine != 'aarch64'"
]
cu130 = [
Collaborator

Aren't the dependencies for all 3 cu the same? Why do we differentiate here?

Member Author

Sadly, it seems there is no way around this; it is the recommended approach with uv: https://docs.astral.sh/uv/guides/integration/pytorch/#configuring-accelerators-with-optional-dependencies
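
For reference, the pattern from that uv guide looks roughly like the following sketch. The layout (mutually exclusive extras plus per-extra index sources) follows the guide; the index names here are illustrative, and the PR's actual pyproject.toml may differ:

```toml
[tool.uv]
# The extras are mutually exclusive: uv must never resolve two variants at once.
conflicts = [
    [
        { extra = "cpu" },
        { extra = "cu126" },
        { extra = "cu128" },
        { extra = "cu130" },
    ],
]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", extra = "cpu" },
    { index = "pytorch-cu126", extra = "cu126" },
    { index = "pytorch-cu128", extra = "cu128" },
    { index = "pytorch-cu130", extra = "cu130" },
]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu126"
url = "https://download.pytorch.org/whl/cu126"
explicit = true

# ... analogous [[tool.uv.index]] entries for cu128 and cu130.
```

This is why the extras look redundant: the requirement strings are identical, and the differentiation only exists so that each extra can be routed to a different wheel index.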

README.md Outdated

# Install PyTorch, psutil, Ninja and Flash Attention
pip install "torch<2.11.0"
pip install "torch<2.11.0" # Or appropriate version for your CUDA setup.
Collaborator

Suggested change
pip install "torch<2.11.0" # Or appropriate version for your CUDA setup.
pip install "torch>=2.10,<2.11.0" # Or appropriate version for your CUDA setup.
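
For pip users, the matching wheel index can be selected with `--index-url`. A sketch: the `torch_index` helper is hypothetical, but the `download.pytorch.org/whl/<tag>` layout is PyTorch's standard wheel-index scheme:

```shell
# Hypothetical helper: build the PyTorch wheel-index URL for a CUDA tag.
torch_index() { echo "https://download.pytorch.org/whl/$1"; }

torch_index cu126   # → https://download.pytorch.org/whl/cu126
```

Used, for example, as: `pip install "torch>=2.10,<2.11.0" --index-url "$(torch_index cu126)"`.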

Collaborator

I don't understand the comment. The PyTorch requirements are the same for all CUDA versions, aren't they?

Member Author

The comment is about selecting the correct PyTorch index for the installed CUDA version. I changed the comment to point that out clearly.

BlueCrescent and others added 3 commits February 27, 2026 22:27
Co-authored-by: Richard Rutmann <97447451+rrutmann@users.noreply.github.com>
- also included torchvision in pip install instructions
2.25 seems to be the correct value for dense operations on B200. 4.5 is for sparse operations.
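
The thread's conclusion (2.25 PFLOPS dense for B200, with 4.5 PFLOPS being the sparse figure) can be illustrated with a small MFU sketch. The `mfu` helper below is hypothetical, not the repository's actual calculator:

```python
# Peak fp16/bf16 dense throughput in FLOP/s, per the values discussed above.
# B200 uses the dense figure (2.25 PFLOPS); 4.5 PFLOPS is the sparse figure.
PEAK_PERFORMANCE = {
    "A100": 312e12,
    "H100": 989e12,
    "GH200": 989e12,
    "B200": 2.25e15,
}

def mfu(achieved_flops_per_s: float, device: str) -> float:
    """Model FLOPs Utilization: achieved throughput divided by peak."""
    return achieved_flops_per_s / PEAK_PERFORMANCE[device]

# A job sustaining 1.125 PFLOP/s on a B200 runs at 50% MFU.
print(mfu(1.125e15, "B200"))  # → 0.5
```

Using the sparse 4.5 PFLOPS figure instead would silently halve every reported B200 MFU, which is why the dense value matters here.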