
Conversation

@justincdavis

Summary

Add the backend kernel for the ToDtype transform using CV-CUDA

How to use

import cvcuda
import torch
import torchvision.transforms.v2.functional as F

cvc_tensor = cvcuda.Tensor((1, 224, 224, 3), cvcuda.Type.U8, cvcuda.TensorLayout.NHWC)
# Dispatches to F.to_dtype_cvcuda
cvc_fp32_tensor = F.to_dtype(cvc_tensor, torch.float32)
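For readers unfamiliar with the dispatch path, a minimal sketch of what such a backend kernel could look like follows. The kernel name, the dtype mapping, and the use of cvcuda.convertto are illustrative assumptions for this sketch, not the actual code in this PR.

import cvcuda
import torch

# Hypothetical mapping from torch dtypes to CV-CUDA element types (assumption).
_TORCH_TO_CVCUDA = {torch.uint8: cvcuda.Type.U8, torch.float32: cvcuda.Type.F32}


def _to_dtype_image_cvcuda_sketch(image, dtype=torch.float32, scale=False):
    # cvcuda.convertto casts on the GPU; the scale factor below only sketches the
    # uint8 -> float direction and would need to mirror torchvision's to_dtype
    # semantics for every supported dtype pair.
    factor = 1.0 / 255.0 if (scale and dtype.is_floating_point) else 1.0
    return cvcuda.convertto(image, _TORCH_TO_CVCUDA[dtype], scale=factor)

In the PR itself, F.to_dtype dispatches to the CV-CUDA kernel automatically when it receives a cvcuda.Tensor, as in the usage example above.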

@pytorch-bot

pytorch-bot bot commented Nov 19, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/9278

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla

meta-cla bot commented Nov 19, 2025

Hi @justincdavis!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Member

@AntoineSimoulin left a comment


Hey @justincdavis, thanks a lot for the PR. I left some comments and questions as a first review. Let me know what you think!

@AntoineSimoulin
Member

@justincdavis could you complete the missing Contributor License Agreement (cf. the earlier comment from the meta-cla bot)?

Member

@AntoineSimoulin left a comment


Hey @justincdavis, thanks for addressing my first round of comments. I had another pass. Would it be possible to have another iteration on the PR based on my new comments? Thanks a lot for your time here!

meta-cla bot added the cla signed label on Dec 2, 2025
Contributor

@zy1git left a comment


Hi,

This is just a light pass of the review. Let me know what you think.

@justincdavis
Author

Hi @zy1git, thanks for the first pass! I have updated this PR to reflect the conventions of the flip PR; LMK what you think!

Member

@NicolasHug left a comment


Thanks a lot for the PR @justincdavis, I left a first pass.

        make_image_cvcuda,
        marks=pytest.mark.skipif(not CVCUDA_AVAILABLE, reason="CVCUDA is not available"),
    ),
    pytest.param(make_image_cvcuda, marks=CV_CUDA_TEST),
Member


Just a note that you should be able to remove these changes once #9305 lands.

def test_functional_signature(self, kernel, input_type):
    if kernel is F._misc._to_dtype_image_cvcuda:
        input_type = _import_cvcuda().Tensor
    check_functional_kernel_signature_match(F.to_dtype, kernel=kernel, input_type=input_type)
Member


Thanks for adding this test!

Comment on lines 2702 to 2705
if is_uint16_to_uint8:
    atol = 255
elif is_uint8_to_uint16 and not scale:
    atol = 255
Member


IIUC, this 255 tol is needed because in torch, when scale is False, we're doing a brutal .to(dtype) which is going to cause a lot of overflows, whereas in CVCUDA you either cap the result or always scale?

I'm hoping we can simplify this a bit, potentially by dropping support for uint8 <-> uint16 conversions when scale is False on CV-CUDA. I feel like that's not a really valid conversion to support anyway. The general idea is that for all transforms, we'll want the CVCUDA backend to have very close results to the existing tensor backend. A difference of 255 is too large.

BTW, we should be able to set atol to 0 or 1 when is_uint16_to_uint8 and scale is True?
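To make the overflow concern concrete, here is a small illustration (a sketch assuming plain modular wrap-around for the unscaled cast; the exact rounding conventions in either backend may differ):

import torch

x = torch.tensor([65535, 300, 256], dtype=torch.int32)

# Unscaled cast: uint16 values wrap modulo 256, so the gap vs. the scaled
# result can be as large as 255.
unscaled = (x % 256).to(torch.uint8)  # tensor([255, 44, 0], dtype=torch.uint8)

# Scaled conversion: map the uint16 range onto uint8 (divide by 257), which
# keeps nearby inputs close together instead of wrapping.
scaled = (x // 257).to(torch.uint8)  # tensor([255, 1, 0], dtype=torch.uint8)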

Author


I am looking into whether we can reduce this further; if not, I will update the PR to drop support for uint16<->uint8 through CV-CUDA.

Author


@NicolasHug I made some changes to the atol calculations and dropped uint16->uint8 with scale=False from the CV-CUDA version. All atol values are <=1 now for all supported use cases. LMK if you want to see more changes/verification from this.
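As a rough sanity check of the <=1 bound, here is a sketch assuming the tensor backend right-shifts by 8 bits for a scaled uint16 -> uint8 downcast and the CV-CUDA path multiplies by 255/65535 and rounds; both conventions are assumptions for illustration, not taken from the PR:

import torch

x = torch.arange(0, 65536, dtype=torch.int64)

shifted = x >> 8                              # assumed tensor-backend convention
rounded = torch.round(x * (255.0 / 65535.0))  # assumed CV-CUDA-style convention

# Largest disagreement between the two conventions over the full uint16 range.
print((shifted - rounded.to(torch.int64)).abs().max())  # tensor(1)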
