
Update range of gpu arch #23309

Open

yf711 wants to merge 3 commits into main
Conversation

@yf711 (Contributor) commented Jan 9, 2025

Description

Remove deprecated GPU archs to reduce the NuGet/Python package size (the latest TensorRT supports sm75/Turing and newer archs).

Package size measured on the pkg CI (Python-cuda12 and Nuget-cuda12):

|        | Python-cuda12 (Linux) | Python-cuda12 (Win) | Nuget-cuda12 (Linux) | Nuget-cuda12 (Win) |
|--------|-----------------------|---------------------|----------------------|--------------------|
| Before | 279 MB                | 267 MB              | 247 MB               | 235 MB             |
| After  | 174 MB                | 162 MB              | 168 MB               | 156 MB             |
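Since the trimmed arch list means the slimmer packages only carry code for sm75 (Turing) and newer, a consumer can quickly check whether the local GPU clears that floor. Below is a minimal sketch using the standard CUDA runtime API; the 7.5 minimum is taken from this description and the file name is only illustrative:

```cuda
// check_arch.cu -- minimal sketch: check that the local GPU meets the sm75
// (Turing) floor implied by the trimmed arch list in this PR.
// Build: nvcc check_arch.cu -o check_arch
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess || device_count == 0) {
        std::printf("No CUDA device visible.\n");
        return 1;
    }
    for (int i = 0; i < device_count; ++i) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, i);
        int sm = prop.major * 10 + prop.minor;  // e.g. 75 = Turing, 90 = Hopper
        std::printf("Device %d: %s (sm%d) -> %s\n", i, prop.name, sm,
                    sm >= 75 ? "covered by the trimmed packages"
                             : "needs a build that still includes older archs");
    }
    return 0;
}
```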

Motivation and Context

snnn previously approved these changes Jan 9, 2025
@tianleiwu (Contributor)

If we drop the older archs, shall we also drop the ORT package for CUDA 11.8 in the next release?

@snnn (Member) commented Jan 9, 2025

> If we drop the older archs, shall we also drop the ORT package for CUDA 11.8 in the next release?

I highly recommend doing so. Now we only have two people working on build pipelines. We should focus more on the main targets.

@yf711 requested a review from jywu-msft on January 10, 2025 at 00:32
@snnn (Member) commented Jan 10, 2025

/azp run Win_TRT_Minimal_CUDA_Test_CI


Azure Pipelines successfully started running 1 pipeline(s).

snnn previously approved these changes Jan 10, 2025
@yf711 (Contributor, Author) commented Jan 11, 2025

After testing, adding sm90 to the build arch list causes issues for the CUDA 11.8 + cuDNN 8 alt package build on Windows, likely because cuDNN 8 is deprecated by Blackwell. The CUDA 12 package build is not affected.

To support sm90, we can either support CUDA 12 only, or update the current CUDA 11.8 environment to cuDNN 9.
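As a side note, here is a minimal environment-check sketch (assuming a CUDA toolchain with the cuDNN headers and libraries installed; the file name and the CUDA 12 + cuDNN 9 pairing follow this discussion and are not part of the PR) that reports which toolchain a build is actually compiled and run against:

```cuda
// env_check.cu -- minimal sketch: report compile-time vs. runtime CUDA/cuDNN
// versions, to confirm the environment pairs CUDA 12.x with cuDNN 9.x.
// Build: nvcc env_check.cu -lcudnn -o env_check
#include <cstdio>
#include <cuda_runtime.h>
#include <cudnn.h>

int main() {
    // Versions the binary was compiled against (from the headers).
    std::printf("Compile-time: CUDA %d, cuDNN %d.%d.%d\n",
                CUDART_VERSION, CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL);

    // Versions of the libraries actually loaded at runtime.
    int cuda_rt = 0;
    cudaRuntimeGetVersion(&cuda_rt);
    std::printf("Runtime: CUDA %d, cuDNN %zu\n", cuda_rt, cudnnGetVersion());

    // CUDART_VERSION encodes 1000*major + 10*minor, e.g. 12040 for CUDA 12.4.
    const bool ok = (CUDART_VERSION >= 12000) && (CUDNN_MAJOR >= 9);
    std::printf("%s\n", ok ? "CUDA 12 + cuDNN 9: consistent with an sm90 build"
                           : "Toolchain does not match the CUDA 12 + cuDNN 9 combination");
    return ok ? 0 : 1;
}
```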

@snnn (Member) commented Jan 11, 2025

CUDA 11.8 with cudnn9 doesn't work. I tried.

I hit the following compilation error when compiling cudnn_flash_attention.cu

/build/Release/_deps/cudnn_frontend-src/include/cudnn_frontend/graph_interface.h:519:27:   required from here
/build/Release/_deps/cudnn_frontend-src/include/cudnn_frontend/thirdparty/nlohmann/json.hpp:9132:68: error: static assertion failed: Missing/invalid function: bool boolean(bool)
 9132 |     static_assert(is_detected_exact<bool, boolean_function_t, SAX>::value,

Therefore, I suggest giving up on that.
