Difference between revisions of "PC Lab 1"

From WikiLabs
Current revision as of 8 March 2018, 13:53

Session 1, x86 optimization:

C/C++: increase the execution speed of the code that reverses the bit order of each of 100 M unsigned 32-bit samples (e.g. 10111...11 -> 11...11101)

Expected/Presented techniques:

Grade 7/10: first implementation ~ 10 seconds
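A first implementation could look like the minimal sketch below. It is an assumption about what the baseline is, not the lab's reference code, and the names `reverse_bits_naive` and `reverse_all_naive` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the 32 bits of x one bit at a time (baseline, no tricks). */
uint32_t reverse_bits_naive(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);  /* append the lowest bit of x to r */
        x >>= 1;                  /* move on to the next bit of x */
    }
    return r;
}

/* Apply the reversal in place to every sample in the buffer. */
void reverse_all_naive(uint32_t *samples, size_t count)
{
    for (size_t i = 0; i < count; i++)
        samples[i] = reverse_bits_naive(samples[i]);
}
```

The later techniques replace only the per-sample function; the outer loop over the samples stays the same.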

Grade 8/10: compiler optimized ~ 5 seconds

Grade 9/10: loop unrolling (> 1 s)

Grade 9/10: optimizing variables into registers (> 1 s)
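The page does not say which loop is unrolled. One plausible reading, sketched here as an assumption, unrolls the per-bit loop by four, so that the loop test and counter update run 8 times per sample instead of 32. The function name `reverse_bits_unrolled` is hypothetical. Both working values are plain locals whose address is never taken, so an optimizing compiler is free to keep them in registers.

```c
#include <stdint.h>

/* Reverse the 32 bits of x, handling four bits per loop iteration. */
uint32_t reverse_bits_unrolled(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i += 4) {
        /* Four copies of the single-bit step, written out by hand. */
        r = (r << 1) | (x & 1u);
        r = (r << 1) | ((x >> 1) & 1u);
        r = (r << 1) | ((x >> 2) & 1u);
        r = (r << 1) | ((x >> 3) & 1u);
        x >>= 4;  /* drop the four bits just consumed */
    }
    return r;
}
```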

Grade 10: bit tricks ~ 500 ms
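The page does not name the bit tricks it has in mind. A commonly used candidate is the mask-and-swap reversal, sketched below under that assumption: it swaps adjacent bits, then pairs, nibbles, bytes and finally the two 16-bit halves, using five steps and no loop. The name `reverse_bits_swap` is hypothetical.

```c
#include <stdint.h>

/* Reverse the 32 bits of x by swapping progressively larger groups. */
uint32_t reverse_bits_swap(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* bit pairs */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* nibbles */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* bytes */
    x = (x >> 16) | (x << 16);                                /* 16-bit halves */
    return x;
}
```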

Grade 10: 8-bit table of 256 entries (1 KB) ~ 400 ms
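The 1 KB figure matches 256 entries of 4 bytes each, so the sketch below assumes the table is indexed by one byte and stores each reversed byte in a `uint32_t`. The names `rev8`, `init_rev8` and `reverse_bits_table8` are hypothetical, and `init_rev8` must be called once before any lookup.

```c
#include <stdint.h>

/* rev8[b] holds the bit-reversed value of byte b (256 x 4 bytes = 1 KB). */
static uint32_t rev8[256];

/* Fill the table once, before processing any samples. */
void init_rev8(void)
{
    for (uint32_t b = 0; b < 256; b++) {
        uint32_t r = 0;
        for (int i = 0; i < 8; i++)
            if (b & (1u << i))
                r |= 0x80u >> i;  /* bit i of b becomes bit 7-i of r */
        rev8[b] = r;
    }
}

/* Reverse x with four lookups: each byte is reversed, then the bytes swap places. */
uint32_t reverse_bits_table8(uint32_t x)
{
    return (rev8[x & 0xFFu] << 24)
         | (rev8[(x >> 8) & 0xFFu] << 16)
         | (rev8[(x >> 16) & 0xFFu] << 8)
         |  rev8[x >> 24];
}
```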

Grade 10: 16-bit table of 64 k entries (256 KB) ~ 200 ms
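With a 16-bit index the lookup count drops from four to two per sample. The sketch below assumes 4-byte entries, which is what makes 64 k entries add up to 256 KB. The names `rev16`, `init_rev16` and `reverse_bits_table16` are hypothetical. The table is built incrementally: the reversal of `i` is the reversal of `i >> 1` shifted right by one, with the lowest bit of `i` placed at bit 15.

```c
#include <stdint.h>

/* rev16[h] holds the bit-reversed value of the 16-bit half-word h
   (65536 x 4 bytes = 256 KB). Entry 0 is already 0 (static storage). */
static uint32_t rev16[65536];

/* Fill the table once, reusing entries that were computed earlier. */
void init_rev16(void)
{
    for (uint32_t i = 1; i < 65536; i++)
        rev16[i] = (rev16[i >> 1] >> 1) | ((i & 1u) << 15);
}

/* Reverse x with two lookups: reverse each half, then swap the halves. */
uint32_t reverse_bits_table16(uint32_t x)
{
    return (rev16[x & 0xFFFFu] << 16) | rev16[x >> 16];
}
```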

32-bit table of 4 G entries (16 GB) ~ 66 ms