Interesting experiments on hyperthreading

Started by Biterider, August 07, 2021, 07:11:58 PM

Biterider

Hi
This video https://www.youtube.com/watch?v=1i34ZOQ0OKo shows some cool experiments that reveal performance differences based on the code being executed.
Well worth watching.

Biterider

LiaoMi

Hi Biterider,

An application that tests what can make a program run slower when hyperthreading is enabled - https://github.com/tamas-p/ht
Sample output:
LLC Cache = 8192 KB
Number of physical cores = 4
Number of logical cores = 8
Hyperthreading is enabled: There are 2 logical cores per a physical core.
---------------------------------------------------------------------------
Thread-0 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-3 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-2 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-4 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-1 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-5 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-6 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-7 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-0: Completed operations = 12500
Thread-3: Completed operations = 12500
Thread-2: Completed operations = 12500
Thread-5: Completed operations = 12500
Thread-7: Completed operations = 12500
Thread-4: Completed operations = 12500
Thread-1: Completed operations = 12500
Thread-6: Completed operations = 12500
All completed operations = 100000
Elapsed time = 8133.088000 ms
Thread-0 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-1 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-3 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-2 started doing memcpy operations (buffer length = cache_size / num_pcores / 2 = 1048576)
Thread-0: Completed operations = 25000
Thread-1: Completed operations = 25000
Thread-2: Completed operations = 25000
Thread-3: Completed operations = 25000
All completed operations = 100000
Elapsed time = 3007.567000 ms
lcores_elapsedtime = 8133.088000
pcores_elapsedtime = 3007.567000
lcores_elapsedtime / pcores_elapsedtime = 2.704208
---------------------------------------------------------------------------
Thread-1 started doing FPU operations
Thread-3 started doing FPU operations
Thread-0 started doing FPU operations
Thread-4 started doing FPU operations
Thread-2 started doing FPU operations
Thread-5 started doing FPU operations
Thread-7 started doing FPU operations
Thread-6 started doing FPU operations
Thread-5: Completed operations = 12500
Thread-0: Completed operations = 12500
Thread-1: Completed operations = 12500
Thread-4: Completed operations = 12500
Thread-3: Completed operations = 12500
Thread-2: Completed operations = 12500
Thread-6: Completed operations = 12500
Thread-7: Completed operations = 12500
All completed operations = 100000
Elapsed time = 2538.762000 ms
Thread-0 started doing FPU operations
Thread-2 started doing FPU operations
Thread-3 started doing FPU operations
Thread-1 started doing FPU operations
Thread-2: Completed operations = 25000
Thread-0: Completed operations = 25000
Thread-3: Completed operations = 25000
Thread-1: Completed operations = 25000
All completed operations = 100000
Elapsed time = 4079.953000 ms
lcores_elapsedtime = 2538.762000
pcores_elapsedtime = 4079.953000
lcores_elapsedtime / pcores_elapsedtime = 0.622253
---------------------------------------------------------------------------
Thread-0 started doing integer operations
Thread-1 started doing integer operations
Thread-5 started doing integer operations
Thread-4 started doing integer operations
Thread-7 started doing integer operations
Thread-6 started doing integer operations
Thread-3 started doing integer operations
Thread-2 started doing integer operations
Thread-4: Completed operations = 12500
Thread-1: Completed operations = 12500
Thread-0: Completed operations = 12500
Thread-6: Completed operations = 12500
Thread-3: Completed operations = 12500
Thread-7: Completed operations = 12500
Thread-5: Completed operations = 12500
Thread-2: Completed operations = 12500
All completed operations = 100000
Elapsed time = 8807.504000 ms
Thread-0 started doing integer operations
Thread-1 started doing integer operations
Thread-3 started doing integer operations
Thread-2 started doing integer operations
Thread-3: Completed operations = 25000
Thread-0: Completed operations = 25000
Thread-1: Completed operations = 25000
Thread-2: Completed operations = 25000
All completed operations = 100000
Elapsed time = 7686.427000 ms
lcores_elapsedtime = 8807.504000
pcores_elapsedtime = 7686.427000
lcores_elapsedtime / pcores_elapsedtime = 1.145852
---------------------------------------------------------------------------
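The ratios in the output above can be rechecked by hand. Here is a minimal Python sketch that uses only the numbers printed in the log; the variable and function names are my own and are not part of the ht tool. As I read the log, lcores_elapsedtime is the run with 8 threads (one per logical core) and pcores_elapsedtime is the run with 4 threads (one per physical core), so a ratio above 1 means the 8-thread run took longer.

```python
# Numbers copied from the ht output above; all names are illustrative only.
LLC_CACHE_KB = 8192
PHYSICAL_CORES = 4

# buffer length = cache_size / num_pcores / 2, in bytes
buffer_length = LLC_CACHE_KB * 1024 // PHYSICAL_CORES // 2

# (elapsed ms with 8 threads, elapsed ms with 4 threads) for each test
elapsed_ms = {
    "memcpy": (8133.088, 3007.567),
    "FPU": (2538.762, 4079.953),
    "integer": (8807.504, 7686.427),
}


def ht_ratio(lcores_ms, pcores_ms):
    """Return lcores_elapsedtime / pcores_elapsedtime, as printed by the tool."""
    return lcores_ms / pcores_ms


ratios = {name: ht_ratio(l, p) for name, (l, p) in elapsed_ms.items()}

if __name__ == "__main__":
    print("buffer length:", buffer_length)
    for name, ratio in ratios.items():
        verdict = "slower" if ratio > 1 else "faster"
        print(f"{name}: {ratio:.6f} ({verdict} with 8 threads)")
```

In this run the memcpy test took about 2.7 times as long with 8 threads as with 4, the FPU test finished faster with 8 threads (ratio about 0.62), and the integer test was roughly 15% slower with 8 threads.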




A project from a sophomore-year Assembly Programming course that tests the efficiency of OpenMP multithreading on an Intel Core machine and how it scales across multiple sizes - https://github.com/erich1510/Research-in-Intel-Core-hyperthreading-on-matrix-vector-multiply

Tests the impact of multithreading on the performance of an application and the effects hyperthreading has on multithreading - https://github.com/djperrone/Hyperthreading

Hyperthreading
https://github.com/djperrone/Hyperthreading/blob/main/Hyperthreading.pdf
https://github.com/djperrone/Hyperthreading/blob/main/test/test/hyperthreading_test.csv

Researched and gave a presentation on multithreading, hyperthreading, and multicore CPUs. By David Perrone and Russel Royality.
The source code tests the impact of multithreading on the performance of an application and the effects hyperthreading has on multithreading.

The source code can be found in test/test/src.

Run in Visual Studio Community 2019 on Windows 10.

ctpl_stl.h is a thread pool library created by Vitaliy Vitsentiy that allows for more efficient use of threads in C++.
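To illustrate the thread pool idea, here is a minimal sketch that uses Python's standard concurrent.futures.ThreadPoolExecutor rather than CTPL itself. The work function and the way the range is chunked are made up for illustration. The point is only the pattern: a fixed set of worker threads is reused for many tasks instead of creating one thread per task. Because of CPython's global interpreter lock, this shows the pooling pattern and not a CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Hypothetical work item: sum the squares of the integers in a range."""
    return sum(i * i for i in chunk)


def run_in_pool(total, workers):
    """Split range(total) into about `workers` chunks and process them on a pool."""
    step = max(1, (total + workers - 1) // workers)  # ceiling division, never 0
    chunks = [range(start, min(start + step, total))
              for start in range(0, total, step)]
    # The executor keeps `workers` threads alive and feeds them the chunks.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


if __name__ == "__main__":
    # Same total work, split across 4 and then 8 workers.
    print(run_in_pool(100_000, 4))
    print(run_in_pool(100_000, 8))
```

Both calls do the same total work and return the same sum; only the number of chunks and workers changes, which mirrors how the tests above split a fixed amount of work across 4 or 8 threads.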

daydreamer

Thanks Biterider, LiaoMi, interesting :thumbsup:
So does this argue for or against the old advice to turn off HT on the first P4s to get better performance in old single-threaded games? And what if you only have a CPU with one physical core plus HT?

My non-asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding