Difference between revisions of the page "PC Lab 5"

Current revision as of 19 April 2018, 16:04

Session 5

Task: run a profiler (the open-source valgrind and gprof, or Visual Studio) and improve the performance of keypoint extraction in the ASIFT C++ code

1. Download ASIFT project from here: http://www.ipol.im/pub/art/2011/my-asift/

2. Run demo_ASIFT with the two included "Adam" images from the Sistine Chapel as input. The horizontal result should look like this: (image: Fișier:Hadam.png)

3. Modify the code so that it only runs "compute_asift_keypoints" (matching is not of interest here, since it was covered in the previous session)

4. Run the valgrind profiler

E.g., for a dummy program:

g++ -std=c++11 dummy.cpp -o dummy (compile the program dummy; adding -g gives callgrind source-level information)

valgrind --tool=callgrind ./dummy (run the program with callgrind; this generates a file such as callgrind.out.12345 that can be viewed with kcachegrind)

kcachegrind callgrind.out.12345 (open the generated profile with kcachegrind; the numeric suffix is the process ID and differs on every run)

5. Look over the report and propose 3 leaf functions (functions that do not call other functions) for offloading to a coprocessor. For each one, write the reason for choosing it and how much time is gained by offloading it. Assume the coprocessor runs at an infinite clock rate, but data is transferred at 200 GB/s. Hint: use the call graph (this requires installing the graphviz package). Keep a snapshot of the analysis report as proof. Send the results, comments and snapshot(s) by e-mail to the teacher.

Note: Valgrind is also great for checking memory leaks:

valgrind --leak-check=full <path>

valgrind --tool=memcheck <path>
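The dummy program used in the step 4 commands is not given on this page. A minimal sketch of what dummy.cpp could contain is shown below: one leaf function that does all the work, and a driver that calls it repeatedly so it dominates the profile. The names dot and run_workload are hypothetical. Raw pointers are used in the leaf on purpose, since without optimization std::vector element access would itself appear as function calls in the call graph.

```cpp
#include <cstddef>
#include <vector>

// Leaf function: plain loop over raw pointers, no calls to other functions.
double dot(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// Non-leaf driver: allocates two vectors and calls dot() `repeats` times.
double run_workload(std::size_t n, int repeats) {
    std::vector<double> a(n, 1.0);
    std::vector<double> b(n, 2.0);
    double total = 0.0;
    for (int r = 0; r < repeats; ++r) {
        total += dot(a.data(), b.data(), n);
    }
    return total;
}
```

To build it with the commands from step 4, save it as dummy.cpp and add an entry point such as `int main() { return run_workload(1000, 1000) > 0.0 ? 0 : 1; }`. Because all memory is owned by std::vector, the leak checks above should not report leaks for it.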

Points (out of 10) vs. expected performance:

10 points for identifying the 3 heaviest leaf functions and a correct (within 10%) computation of the offloading impact. DESCRIBE_INSTR.

9 points for identifying the 3 heaviest leaf functions and an acceptable (within 20%) computation of the offloading impact. DESCRIBE_INSTR.

8 points for identifying the 2 heaviest leaf functions and a correct (within 10%) computation of the offloading impact. DESCRIBE_INSTR.

7 points for identifying the 2 heaviest leaf functions and a reasonable (within 30%) computation of the offloading impact. DESCRIBE_INSTR.

6 points for identifying the heaviest leaf function and a reasonable (within 30%) computation of the offloading impact. DESCRIBE_INSTR.

5 points for identifying the heaviest leaf function and a coarse (within 50%) computation of the offloading impact. DESCRIBE_INSTR.

DESCRIBE_INSTR = Write each proposed instruction as a function prototype giving the result, the name, the number of operands, and their types and sizes (similar to the Intel Intrinsics Guide). Add a natural-language description of its behaviour (or, alternatively, a formal description as in the Intel Intrinsics Guide).
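As an illustration of the DESCRIBE_INSTR format, the sketch below documents one made-up coprocessor instruction in the synopsis / description / operation layout used by the Intel Intrinsics Guide, followed by a plain C++ reference implementation. The instruction name cop_mul_add_f32x8 and the type vec8f are hypothetical and are not taken from ASIFT or from any real instruction set. The instructions actually proposed should follow from the leaf functions found in the profile.

```cpp
#include <array>
#include <cstddef>

// Hypothetical operand type: 8 packed single-precision floats (256 bits).
using vec8f = std::array<float, 8>;

// Synopsis (hypothetical instruction):
//   vec8f cop_mul_add_f32x8(vec8f a, vec8f b, vec8f c)
//   Result:   1 value,  256 bits (8 x 32-bit float)
//   Operands: 3 values, 256 bits each (8 x 32-bit float)
//
// Description:
//   Multiply the packed single-precision elements of a and b, add the
//   corresponding element of c, and return the 8 results.
//
// Operation:
//   FOR j := 0 to 7
//       dst[j] := a[j] * b[j] + c[j]
//   ENDFOR
vec8f cop_mul_add_f32x8(const vec8f& a, const vec8f& b, const vec8f& c) {
    vec8f dst{};
    for (std::size_t j = 0; j < dst.size(); ++j) {
        dst[j] = a[j] * b[j] + c[j];
    }
    return dst;
}
```

The operand sizes in such a description also give the bytes moved per call (here 3 x 32 bytes in and 32 bytes out), which feeds directly into the transfer-time estimate for the offloading impact.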

Fișier:Callgrind.out.20485.zip

