Add Triton installation for Windows CUDA, Linux ROCm/XPU #31
cmdr2 merged 3 commits into easydiffusion:main from
Conversation
Hi @iwr-redmond, since you mentioned ROCm support in Issue #5, could you help test whether this installation logic works on your AMD setup?
Thanks @godnight10061! You can also ask on the … You may also want to provide a simple script file that they can run to test.
I'll give it a try with Windows 11 + WSL2 (Ubuntu) soon.
My RTX 4070 is as useful for testing ROCm as the second buggy in a one-horse town.
Hi @cmdr2, this PR is ready for your review. I have implemented the automatic Triton installation logic for Windows (CUDA), Linux (ROCm), and Linux (XPU). To make verification easier for the community, I also added a built-in self-test command in the latest commit: `python -m torchruntime test compile`. This automatically verifies that Triton is correctly installed and functional with `torch.compile`. I have successfully smoke-tested this on Windows with an RTX 3060 Ti. Since I lack AMD and Intel hardware, I've also reached out on Discord to find testers for the ROCm and XPU paths.
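A minimal standalone version of such a self-test might look like the following. This is only a sketch; the actual `test compile` subcommand in torchruntime may behave differently, and the function names here are hypothetical.

```python
import importlib.util


def compile_smoke_test() -> str:
    """Tiny torch.compile check; skips cleanly when torch is absent.

    Sketch of a Triton/torch.compile self-test: compile a trivial
    function and verify its output, reporting a short status string.
    """
    if importlib.util.find_spec("torch") is None:
        return "skipped: torch not installed"
    import torch

    @torch.compile
    def add(x, y):
        return x + y

    try:
        out = add(torch.ones(4), torch.ones(4))
    except Exception as exc:  # e.g. missing compiler backend or Triton
        return f"failed: {exc}"
    return "ok" if bool((out == 2).all()) else "failed: wrong result"
```

On a machine with a working Triton install, the compiled function routes through the inductor backend; on a broken install the exception message usually points at the missing piece.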
Quick update: a community member just verified this on Linux Mint 22.1 (Python 3.9). The `test compile` command passed on their system, confirming that the ROCm-based Triton installation logic works as expected on Debian-based Linux distributions.
Merged, thanks!
Released in v2.0.0 |
Summary
- Windows + CUDA (`cu*`/`nightly/cu*`): installs `triton-windows`
- Linux + ROCm: installs `pytorch-triton-rocm` (from https://download.pytorch.org/whl)
- Linux + XPU: installs `pytorch-triton-xpu` (from https://download.pytorch.org/whl)
- Also installs `packaging` (required by `torchruntime.platform_detection`).

Refs: #5
Why
`torch.compile` (and many third-party kernels) requires Triton. On some platforms Torch bundles it (e.g. Linux CUDA), but on others users end up without Triton even after installing a GPU build of Torch.

Implementation
`torchruntime/installer.py` appends an extra pip install command for the platform-specific Triton package.

Testing
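The selection logic described above can be sketched roughly as a mapping from platform to extra pip arguments. This is a hypothetical illustration, not the actual code in `torchruntime/installer.py`; the function name and signature are invented for clarity.

```python
# Hypothetical sketch of platform-specific Triton package selection.
PYTORCH_INDEX = "https://download.pytorch.org/whl"


def triton_install_args(os_name: str, gpu_platform: str):
    """Return extra 'pip install' arguments for the given OS/GPU platform,
    or None when Torch already bundles Triton (e.g. Linux CUDA)."""
    if os_name == "windows" and gpu_platform.startswith("cu"):
        return ["triton-windows"]
    if os_name == "linux" and gpu_platform.startswith("rocm"):
        return ["pytorch-triton-rocm", "--extra-index-url", PYTORCH_INDEX]
    if os_name == "linux" and gpu_platform == "xpu":
        return ["pytorch-triton-xpu", "--extra-index-url", PYTORCH_INDEX]
    return None  # Triton is bundled or unsupported on this platform
```

For example, `triton_install_args("windows", "cu128")` would yield `["triton-windows"]`, while Linux CUDA returns `None` because Torch's Linux CUDA wheels already bundle Triton.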
- `python -m pytest -q` passes.
- Manual test (Windows + CUDA): installed `torch`/`torchvision`/`torchaudio` from https://download.pytorch.org/whl/cu128; confirmed `import triton` fails; ran `python -m torchruntime install` -> installs `triton-windows`; `torch.compile` CUDA smoke test -> OK.

Request For Testing (hardware help wanted)
If you have one of these setups, please try:
Run `python -m torchruntime install`, then verify `import triton` and run a small `torch.compile` test.

Also welcome: Windows CUDA users on different GPUs/Python versions.
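For testers, the `import triton` check can be scripted with a small helper like this (a hypothetical snippet, not part of torchruntime):

```python
import importlib


def verify_triton() -> str:
    """Report whether Triton can be imported, and its version if so."""
    try:
        triton = importlib.import_module("triton")
    except ImportError as exc:
        return f"missing: {exc}"
    version = getattr(triton, "__version__", "unknown")
    return f"importable, version {version}"


if __name__ == "__main__":
    print(verify_triton())
```

Running it after `python -m torchruntime install` should report an importable Triton; pasting the output into this thread would help confirm the ROCm and XPU paths.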