Difference between revisions of page "PC Lab 6"

Current revision as of 26 April 2018, 17:43

Session 6

Task: run matrix-column normalization using OpenCL (https://www.khronos.org/opencl)

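Matrix-column normalization means that, at the end of the process, the sum of the squared elements of each column is 1 (every column has unit Euclidean norm), while the ratios between the elements of a column are preserved.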

Example: assuming the matrix is

[ 1, 2 ] 
[ 3, 4 ]

the result of normalization is:

[ 0.3162     0.4472 ]
[ 0.9487     0.8944 ]

First column: 0.3162 * 0.3162 + 0.9487 * 0.9487 = 1 (up to rounding), and the ratio 0.3162 / 0.9487 is still 1 / 3.
Second column: 0.4472 * 0.4472 + 0.8944 * 0.8944 = 1 (up to rounding), and the ratio 0.4472 / 0.8944 is still 2 / 4.
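
The example above can be reproduced with a short CPU routine. The following is only a minimal sketch: the function name normalize_columns_cpu, the row-major std::vector<float> storage, and the choice to leave all-zero columns untouched are assumptions made for illustration, not something the lab prescribes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the lab material): normalizes every column
// of a row-major rows x cols matrix in place, so that the sum of squared
// elements of each column becomes 1. All-zero columns are left unchanged.
void normalize_columns_cpu(std::vector<float>& m, std::size_t rows, std::size_t cols) {
    for (std::size_t c = 0; c < cols; ++c) {
        // Accumulate in double to limit rounding error on tall columns.
        double sum_sq = 0.0;
        for (std::size_t r = 0; r < rows; ++r) {
            const double v = m[r * cols + c];
            sum_sq += v * v;
        }
        if (sum_sq == 0.0) continue;  // nothing to scale, avoid dividing by zero

        // Scale the column by 1 / sqrt(sum of squares).
        const double inv_norm = 1.0 / std::sqrt(sum_sq);
        for (std::size_t r = 0; r < rows; ++r) {
            m[r * cols + c] = static_cast<float>(m[r * cols + c] * inv_norm);
        }
    }
}
```

Calling normalize_columns_cpu on {1, 2, 3, 4} with rows = 2 and cols = 2 gives approximately {0.3162, 0.4472, 0.9487, 0.8944}, matching the matrix shown above.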

Install the OpenCL drivers for your platform.
Check which OpenCL-capable devices are available with the command clinfo.
Run the VectorAddOpenCL app (http://wiki.dcae.pub.ro/images/4/4a/VectorAddOpenCL.cpp) to verify that everything works.
Implement the normalization operation on a CPU, for reference purposes.
Implement the normalization operation on 1 OpenCL thread of a single device. Check the result against the CPU.
Implement the normalization operation across multiple OpenCL threads of the same device. Check the result against the CPU.
How much faster is the OpenCL operation when run on all threads vs. 1 thread of the same OpenCL device?
Send an e-mail to the teacher with the subject PAO_Lab_6, stating your x86 CPU configuration (e.g. i7-2670QM 4C/8T @ 2.2 GHz) and GPU configuration (e.g. nVidia GT 540M / 96 CudaCores @ 1344 MHz).
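
One possible shape for the single-thread and multi-thread variants in the steps above is sketched below. Everything named here is hypothetical: the kernel name normalize_columns, its argument list, and the helpers emulate_work_item, emulate_ndrange, max_abs_diff and time_ms are illustrative assumptions. The kernel is given as OpenCL C source in a string; the host setup (context, buffers, clEnqueueNDRangeKernel) is omitted and can follow the VectorAddOpenCL sample. The C++ functions mirror the kernel body on the host so that the indexing can be checked without a device.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// OpenCL C source of a hypothetical kernel. Each work-item handles columns
// gid, gid + gsize, gid + 2 * gsize, ... so a global size of 1 gives the
// single-thread variant and a larger global size the multi-thread variant.
const char* const kNormalizeKernelSrc = R"CLC(
__kernel void normalize_columns(__global float* m, const int rows, const int cols) {
    const int gsize = (int)get_global_size(0);
    for (int c = (int)get_global_id(0); c < cols; c += gsize) {
        float sum_sq = 0.0f;
        for (int r = 0; r < rows; ++r) {
            const float v = m[r * cols + c];
            sum_sq += v * v;
        }
        if (sum_sq > 0.0f) {
            const float inv_norm = 1.0f / sqrt(sum_sq);
            for (int r = 0; r < rows; ++r) m[r * cols + c] *= inv_norm;
        }
    }
}
)CLC";

// Host-side mirror of one work-item of the kernel above (same indexing,
// same float arithmetic), useful for testing without an OpenCL device.
void emulate_work_item(std::vector<float>& m, int rows, int cols, int gid, int gsize) {
    for (int c = gid; c < cols; c += gsize) {
        float sum_sq = 0.0f;
        for (int r = 0; r < rows; ++r) {
            const float v = m[r * cols + c];
            sum_sq += v * v;
        }
        if (sum_sq > 0.0f) {
            const float inv_norm = 1.0f / std::sqrt(sum_sq);
            for (int r = 0; r < rows; ++r) m[r * cols + c] *= inv_norm;
        }
    }
}

// Runs gsize emulated work-items one after another; a real device would
// run them in parallel. Columns never overlap between work-items.
void emulate_ndrange(std::vector<float>& m, int rows, int cols, int gsize) {
    for (int gid = 0; gid < gsize; ++gid) emulate_work_item(m, rows, cols, gid, gsize);
}

// Largest absolute element-wise difference, for checking a device result
// against the CPU reference. Returns infinity if the sizes differ.
float max_abs_diff(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size()) return std::numeric_limits<float>::infinity();
    float d = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) d = std::max(d, std::fabs(a[i] - b[i]));
    return d;
}

// Wall-clock time of a callable, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Because the kernel accumulates in single precision, a comparison against a CPU reference is probably better done with a small tolerance (for instance max_abs_diff(gpu, cpu) < 1e-5f) than with exact equality. For the speedup question, one option is to time the enqueue plus a clFinish call with time_ms, once with a global size of 1 and once with the full global size, and report the ratio of the two times.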

Note: To use the ACS GPGPU Cluster, see Using ACS Cluster.

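Please read chapters 1, 2, 4 and 5 of the GP-GPU Programming guide (https://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf); skip chapter 3, which is very specific to CUDA.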

Points (out of 10) vs. expected performance:


@@ Line 9: / Line 9: @@
   [ 1, 2 ]
   [ 3, 4 ]
 the result of normalization is :
+ [ 0.3162     0.4472 ]
+ [ 0.9487     0.8944 ]
+* That is: 0.3162 * 0.3162 + 0.9487 * 0.9487 = 1 and of course, 0.3162 / 0.9487 is kept as 1 / 3 ratio
+* That is: 0.4472 * 0.4472 + 0.8944 * 0.8944 = 1 and of course, 0.4472 / 0.8944 is kept as 2 / 4 ratio
 # Install opencl drivers for your platform
-# Check opencl capable devices with command clinfo
+# Check what opencl-capable devices with command '''clinfo'''
-# Run the VectorAddOpenCL app [[http://wiki.dcae.pub.ro/index.php/VectorAddOpenCL.cpp]]
+# Run the VectorAddOpenCL app [[http://wiki.dcae.pub.ro/images/4/4a/VectorAddOpenCL.cpp]] to see that all works ok.
+# Implement the normalization operation on a CPU, for reference purposes.
+# Implement the normalization operation across 1 OpenCL thread of a single device. Check the result against CPU.
+# Implement the normalization operation across multiple OpenCL threads of the same device. Check the result against CPU.
+# How much faster is the OpenCL op performed on all threads vs. 1 thread on the same Open CL device ?
+# Send e-mail to the teacher, with subject PAO_Lab_6, x86 CPU configuration (eg. i7-2670QM 4C/8T @ 2.2 GHz) and GPU configuration (nVidia GT 540M / 96 CudaCores @ 1344 MHz)
+'''Note''' In order to use the ACS GPGPU Cluster see [[Using ACS Cluster]]
+Please read chapters 1,2, 4,5 (skip Ch3 which is very particular to CUDA)
+['''GP-GPU Programming guide''' [https://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf]]
