Image and Video Denoiser on CUDA

Image and video denoisers are used in many applications. We have developed CUDA-accelerated denoiser kernels that run on existing CUDA-capable NVIDIA GPUs. The kernels remove both luma and chroma noise and deliver high performance for both image and video processing.

CUDA Denoiser Library Features

  • Input format: 8/10/12/14/16-bit per channel input data array from CPU or GPU memory
  • Output format: 24/48-bit output data array in CPU or GPU memory
  • High quality and high speed denoising algorithms
  • GUI to show processed data via OpenGL with minimum latency
  • Timing and performance measurements
  • Compatibility with Windows-7/8/10 and Linux Ubuntu/CentOS

Benchmarks for fast image and video denoiser on CUDA

Images: 2K image (1920×1080, 24-bit) and 4K image (3840×2160, 24-bit)
Wavelet transform: CDF 9/7
Number of DWT resolutions: 7
DWT thresholds for YCbCr: 10;10;10
Test description: all data in GPU memory, timing includes GPU computations only
Software: OS Windows-10 (64-bit), CUDA-9.2
Hardware: CPU Intel Core i7-5930K (Haswell-E, 6 cores, 3.5–3.7 GHz), NVIDIA GeForce GTX 1080
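
The settings above give one threshold per YCbCr channel but do not say how it is applied to the wavelet coefficients. As an illustration only, the sketch below assumes soft thresholding, a common choice in wavelet denoising; it is not necessarily what the library does. The function name and the sample coefficients are hypothetical; only the threshold values (10;10;10) come from the benchmark settings.

```python
def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude does not exceed the threshold become zero.
    Assumption: soft thresholding; the source does not name the rule used.
    """
    out = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            out.append(0.0)
        else:
            out.append(magnitude if c > 0 else -magnitude)
    return out


# Thresholds taken from the benchmark settings (Y; Cb; Cr).
THRESHOLDS = {"Y": 10, "Cb": 10, "Cr": 10}

# Hypothetical detail coefficients for one subband of each channel.
subbands = {
    "Y": [-25.0, 4.0, 12.5],
    "Cb": [9.0, -10.0, 31.0],
    "Cr": [0.0, -14.0, 7.5],
}

denoised = {ch: soft_threshold(subbands[ch], THRESHOLDS[ch]) for ch in subbands}
print(denoised)
```

Small coefficients, which mostly carry noise, are zeroed, while large ones are kept with reduced magnitude. A larger threshold removes more noise at the cost of fine detail.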

  • 2K denoising time – 1.78 ms (3.3 GByte/s)
  • 4K denoising time – 5.84 ms (4.0 GByte/s)
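
The throughput figures can be cross-checked from the image size and the measured time. The sketch below assumes that throughput counts the 24-bit input image once (3 bytes per pixel) and that 1 GByte means 2^30 bytes; the page states neither explicitly, but the reported numbers are consistent with both. The function name is hypothetical.

```python
def throughput_gib_per_s(width, height, bytes_per_pixel, time_ms):
    """Return throughput in GiB/s (1 GiB = 2**30 bytes) for one processed image."""
    total_bytes = width * height * bytes_per_pixel
    seconds = time_ms / 1000.0
    return total_bytes / seconds / 2**30


# Image sizes and timings from the benchmark; 24-bit input means 3 bytes per pixel.
for label, width, height, time_ms in (
    ("2K", 1920, 1080, 1.78),
    ("4K", 3840, 2160, 5.84),
):
    rate = throughput_gib_per_s(width, height, 3, time_ms)
    print(f"{label}: {rate:.2f} GiB/s")
```

Under these assumptions the results are about 3.25 GiB/s for 2K and 3.97 GiB/s for 4K, which round to the 3.3 and 4.0 GByte/s listed above.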

We designed this software as part of our CUDA image processing SDK. Our customers can now use the CUDA-accelerated denoiser in their applications as part of a general image processing pipeline.


To test our CUDA-based denoiser, please download the Fast CinemaDNG Processor software from the download page. We have implemented two types of denoisers: one applied before demosaicing and one applied after demosaicing. The software currently works with DNG image series, and a sample set of DNG images is available on the download page as well.
