Image and Video Denoiser on CUDA

Image and video denoisers are used in many applications. We have developed CUDA-accelerated denoiser kernels that run on existing NVIDIA GPUs. The kernels remove both luma and chroma noise and achieve high performance for both image and video processing.


CUDA Denoiser Library Features

  • Input format: 8/10/12/14/16-bit per channel input data array from CPU or GPU memory
  • Output format: 24/48-bit output data array in CPU or GPU memory
  • Denoising with 16/32-bit accuracy
  • Denoising algorithms
    • Wavelet denoiser (RAW and RGB): CDF 5/3 and CDF 9/7 wavelets with hard, soft or garrote thresholding
    • Bilateral denoiser
    • NLM denoiser
  • Compatibility with FastVCR software for machine vision cameras
  • Timing and performance measurements
  • Compatibility with Windows-10, Linux Ubuntu and L4T (Jetson)
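
The three thresholding modes named in the feature list can be illustrated with their textbook definitions. The sketch below is plain Python rather than CUDA and is not taken from the library: the function names are hypothetical, and it assumes the standard formulas for hard, soft and non-negative garrote thresholding applied to a single wavelet coefficient `w` with threshold `t`.

```python
import math


def hard_threshold(w, t):
    """Keep the coefficient unchanged if its magnitude exceeds t, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w, t):
    """Shrink the magnitude by t; coefficients inside [-t, t] become zero."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def garrote_threshold(w, t):
    """Non-negative garrote: w - t^2 / w above the threshold, zero otherwise."""
    return w - (t * t) / w if abs(w) > t else 0.0


if __name__ == "__main__":
    t = 80.0  # illustrative threshold value
    for w in (-200.0, -50.0, 100.0, 400.0):
        print(w, hard_threshold(w, t), soft_threshold(w, t), garrote_threshold(w, t))
```

Under these definitions, hard thresholding leaves large coefficients untouched, soft thresholding shrinks every surviving coefficient by `t`, and garrote sits between the two: it shrinks coefficients just above the threshold strongly and large coefficients only slightly.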

Benchmarks for fast image and video denoiser on CUDA

Image resolution: 4112×2176 (8.9 MPix), 16-bit per channel, RGB

Test description: all data in GPU memory, timing includes GPU computations only

Wavelet transform: CDF 9/7
Number of DWT resolutions: up to 7
DWT thresholds for YCbCr: 80;150;150
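
To give a sense of what seven DWT resolutions mean for the test image, the following Python sketch lists the size of the low-pass band after each level. It assumes a conventional dyadic decomposition in which the low-pass band of an n-sample row or column holds ceil(n / 2) samples; the source does not state how the library handles odd sizes, and `dwt_level_sizes` is a hypothetical helper.

```python
def dwt_level_sizes(width, height, levels):
    """Return the low-pass band size (width, height) after each DWT level.

    Assumption: dyadic decomposition where the low-pass band of an
    n-sample row or column holds ceil(n / 2) samples.
    """
    sizes = []
    w, h = width, height
    for _ in range(levels):
        w = (w + 1) // 2  # integer form of ceil(w / 2)
        h = (h + 1) // 2
        sizes.append((w, h))
    return sizes


if __name__ == "__main__":
    for level, (w, h) in enumerate(dwt_level_sizes(4112, 2176, 7), start=1):
        print(f"level {level}: {w} x {h}")
```

Under that assumption, the coarsest of the seven levels for a 4112×2176 image is 33×17 samples.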

NLM denoiser parameters: windows 3×3 and 5×5, strength 800
Bilateral denoiser parameters: 3×3, sigmaColor 5, sigmaSpace 500
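
For readers unfamiliar with the two bilateral parameters, here is a minimal single-channel reference sketch in plain Python. It assumes the conventional Gaussian form of the bilateral filter, in which `sigma_space` controls the spatial weight and `sigma_color` the intensity (range) weight; the exact formula, border handling and parameter scaling used by the CUDA kernels are not described in the source, so every name below is illustrative.

```python
import math


def bilateral_filter(img, radius, sigma_color, sigma_space):
    """Bilateral filter for a 2D list of numbers (one channel).

    radius=1 corresponds to a 3x3 window. Borders are handled by
    clamping coordinates to the image (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            acc = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    value = img[yy][xx]
                    # Spatial weight: falls off with distance from the center.
                    ws = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
                    # Range weight: falls off with intensity difference.
                    wr = math.exp(-((value - center) ** 2) / (2.0 * sigma_color ** 2))
                    acc += ws * wr * value
                    norm += ws * wr
            out[y][x] = acc / norm
    return out


if __name__ == "__main__":
    noisy = [[100, 100, 100], [100, 104, 100], [100, 100, 100]]
    print(bilateral_filter(noisy, 1, 5.0, 500.0)[1][1])
```

With this formula, a `sigma_space` of 500 over a 3×3 window makes the spatial weight almost exactly 1 for every neighbour, so the result is governed by `sigma_color`: small intensity differences are averaged out, while a large step edge is left intact.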

Software: OS Windows-10, CUDA-12.3
Hardware: NVIDIA GeForce RTX 4090

  • RAW DWT denoiser – 1.8 ms (4.9 GPix/s)
  • RGB DWT denoiser – 3.05 ms (2.9 GPix/s)
  • NLM denoiser (RGB) – 1.92 ms (4.6 GPix/s)
  • Bilateral denoiser (RGB) – 1.21 ms (7.3 GPix/s)

The above results are comparable to the processing time of our best MG debayer algorithm, which takes around 1.05 ms (8.5 GPix/s) for the same image on the same GPU.
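
The throughput figures quoted above follow from the image size and the measured time. The short Python sketch below reproduces that arithmetic; `throughput_gpix_per_s` is a hypothetical helper, and it assumes 1 GPix = 10^9 pixels.

```python
def throughput_gpix_per_s(width, height, time_ms):
    """Pixels processed per second, in GPix/s (1 GPix = 1e9 pixels)."""
    pixels = width * height
    seconds = time_ms * 1e-3
    return pixels / seconds / 1e9


if __name__ == "__main__":
    timings_ms = {
        "RAW DWT denoiser": 1.8,
        "RGB DWT denoiser": 3.05,
        "NLM denoiser (RGB)": 1.92,
        "Bilateral denoiser (RGB)": 1.21,
        "MG debayer": 1.05,
    }
    for name, ms in timings_ms.items():
        print(f"{name}: {throughput_gpix_per_s(4112, 2176, ms):.2f} GPix/s")
```

The computed values agree with the listed ones to within rounding; the listed figures appear to be truncated to one decimal place.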

We designed this software as part of our GPU Image & Video Processing SDK. Our customers can now use the GPU-accelerated denoiser in their applications as part of a general image processing pipeline.

Testing

To test our CUDA denoisers, please download the FastVCR software.

CUDA-based denoising roadmap

  • Acceleration of NLM and Bilateral denoisers - in progress
  • Temporal denoiser on CUDA - in progress
  • Denoising algorithm based on a camera noise profile and a variance-stabilizing transform (VST) - in progress
  • Total variation denoising (total variation regularization)
