Difference between revisions of page "PC Lab 6"
Cbira (talk | contribs)
Revision as of 26 April 2018, 16:31
Session 6
Task: run matrix-column normalization using OpenCL (https://www.khronos.org/opencl)
Matrix-column normalization means that, after the operation, the sum of the squared elements in each column is 1 (each column is scaled to unit Euclidean norm).
Example: Assuming the matrix is
[ 1, 2 ]
[ 3, 4 ]
the result of normalization is:
[ 0.3162 0.4472 ]
[ 0.9487 0.8944 ]
That is: 0.3162 * 0.3162 + 0.9487 * 0.9487 ≈ 1, and the ratio 0.3162 / 0.9487 is kept at 1 / 3.
Likewise: 0.4472 * 0.4472 + 0.8944 * 0.8944 ≈ 1, and the ratio 0.4472 / 0.8944 is kept at 2 / 4.
- Install the OpenCL drivers for your platform.
- Check which OpenCL-capable devices are available with the clinfo command.
- Run the VectorAddOpenCL app [[1]] to check that everything works.
- Implement the normalization operation on the CPU, for reference.
- Implement the normalization operation using 1 OpenCL thread of a single device. Check the result.
- Implement the normalization operation across multiple OpenCL threads of the same device. Check the result.
- How much faster is the OpenCL operation when run on all threads compared with 1 thread on the same OpenCL device?
Note: To use the ACS GPGPU Cluster, see Using ACS Cluster.
GP-GPU Programming guide: https://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf
Please read chapters 1, 2, 4 and 5 (skip chapter 3, which is very specific to CUDA).
Points (out of 10) vs. expected performance: