I’m trying to benchmark a multithreaded application on my new HP ProLiant server, which has 2x Opteron 6272 processors and 64 GB of RAM.
When I run the application on a desktop machine (a range of i7s and a Xeon X5675), all cores hit nearly 100% utilization.
When I run the application on my server, no matter how many threads I run, the total CPU utilization of the application hovers around 20-25%. That is, if I’m running with 32 threads, all 32 cores will hang at around 20%; if I run 16 threads, they’ll hang around 40%, and so on.
- At first I suspected this had to do with the operating system, so I
installed Windows 7 on the server so that the desktops and the
server had the same OS.
- Then I suspected it was the hardware, so I changed the power management in the BIOS to High Performance. Even though this did increase the benchmark time, the same 20% utilization problem persists.
- I can get all 32 cores to 100% using the y-cruncher benchmark. My custom benchmark is written in .NET. Could this possibly have anything to do with it?
I’m perplexed by this problem. Anyone have an idea of what could cause this?
If your app is processing a large amount of data, try following the data’s path – if the input data is fed from the network, check for possible latencies, bandwidth limitations or transmission errors. You already checked disk I/O which otherwise would be a likely candidate for a bottleneck.
Last but not least, since it is a highly multithreaded .NET application, you should make sure that server garbage collection is used; otherwise you might see odd load characteristics, as described in this post from Stack Overflow.
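As a rough sketch of what enabling server garbage collection could look like, assuming the benchmark targets the .NET Framework and reads its settings from an App.config file (the question does not say which runtime or configuration mechanism is in use), the `gcServer` element goes in the `runtime` section:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- App.config for the benchmark executable (assumed .NET Framework app) -->
<configuration>
  <runtime>
    <!-- Use the server GC (one heap and GC thread per logical core)
         instead of the default workstation GC -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```

To confirm the setting took effect, the application can log the value of `System.Runtime.GCSettings.IsServerGC` at startup. Whether this removes the 20% ceiling depends on how allocation-heavy the benchmark is, so treat it as something to test rather than a guaranteed fix.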