The source code used here can be found at: https://github.com/ravkum/openclDataAndKernelParallelExecution

This blog assumes that the reader has an understanding of OpenCL and is familiar with host and device programs. I use AMD devices as the example, but the concept applies to all OpenCL devices.

There are two kinds of GPU devices:

1) APU, where the GPU is integrated with the CPU (and so is called an iGPU). An example is the AMD Ryzen series of APUs. Here the CPU and GPU share the virtual memory address space, so no memory transfer is required. Zero-copy buffers should be used in this case. …

Ravi Kumar

GPGPU programmer with a focus on AI. An avid reader.