2639 frames 18.604 seconds 141.85 fps ( 7.05 ms/f) 11.984 fps variability
OS: Ubuntu 14.04 LTS
CPU: i5 6400
GPU: AMD R9 380
8 GB DDR4
ASRock B150M PRO4S
Resolution: 1920x1080
Full screen
Comanglia high quality pc config
2639 frames 7.477 seconds 352.97 fps ( 2.83 ms/f) 25.239 fps variability
CPU: Intel Core i7-5930K 4.6Ghz
RAM: DDR4 CL15 2600mhz 16GB
GPU: Asus 680GTX Top
OS: Windows 7
Driver version: 364.72
Dxlevel: 81
Res: 1920x1080 Fullscreen
FPS configs: FPS Comanglia, nohatsmod
http://abload.de/img/bench1l1ksr.jpg
2639 frames 7.099 seconds 371.75 fps ( 2.69 ms/f) 27.637 fps variability
@ 640x480 Res. :D
2639 frames 20.164 seconds 130.88 fps ( 7.64 ms/f) 11.199 fps variability
CPU and overclock: AMD FX6300 @ Stock (3.5 ghz)
Graphics Card: GeForce GT 730
OS: Windows 10
Driver version: 361.91
dxlevel (default is 90): 81
Resolution: 1280x720
Full-screen or windowed: Full
FPS configs enabled: Chris maxframes
Shadows enabled/disabled: disabled
Additional notes:
No hat mod
No explosion smoke script
One monitor
Believe it or not, my frames were slightly better on 1920x1080:
2639 frames 19.987 seconds 132.03 fps ( 7.57 ms/f) 12.709 fps variability
2639 frames 12.157 seconds 217.08 fps ( 4.61 ms/f) 13.798 fps variability
CPU and overclock: Intel Core i5 6600k @3.5-3.9 GHz (stock)
Graphics Card: Asus R9 270X
Ram: G.Skill 16GB DDR4-2133
OS: Windows 10
Driver version: 16.5.1
dxlevel (default is 90): 81
Resolution: 1280x720
Full-screen or windowed: Full
FPS configs enabled: comanglia high quality pc config
Shadows enabled/disabled: disabled
Additional notes:
single monitor
no hats mod
pvhud
sapper particle explosions
2118 frames 116.556 seconds 18.17 fps (55.03 ms/f) 1.554 fps variability
2639 frames 13.06 seconds 202.07 fps (4.94ms/f) 7.04 fps variability
cpu: fx 8350 @5.0ghz
ram: ddr3 cl6 800mhz 16gb
gpu: msi gtx 960 4g
os: windows 10
dxlevel: 80
res: 1920x1080 fullscreen
fps configs: comanglia high quality pc
x4 760k oc'd to 4.2ghz
r7 370 4gb (slight oc)
dxlevel (default is 90): 81
Resolution: 1920x1080
Full-screen or windowed: Full
FPS configs enabled: Comanglia's stability
Shadows enabled/disabled: disabled
notes: 2 monitors, one 144hz
2639 frames 25.788 seconds 102.33 fps ( 9.77 ms/f) 8.089 fps variability
That can't be right, I use intel hd 530 and get 107fps
i3-4030U @ 1.90GHz
Intel HD Graphics Family (school laptop)
dxlevel: 81
Resolution: 640x480
Fullscreen
Comanglia's cfg
shadows disabled
results average from 3 tries
2639 frames 30.107 seconds 87.71 fps (11.14 ms/f) 5.820 fps variability
panda106: That can't be right, I use intel hd 530 and get 107fps
The AMD Athlon X4 760K isn't a strong CPU, so it's reasonable that the framerates are low: TF2 is CPU-bound rather than GPU-bound.
#344
TF2 kills IPC, only clockrate matters.
Also why are you so salty about him getting a better results than yours?
If you had read the post properly you would've seen that it was at 640x480. He's also using a different cfg. Try running the benchmark at 640x480 and you'll get a better result, although it'll be mostly due to RAM.
Did you even test -threads? Because
1. Where do you think a performance gain from more than 8 threads is supposed to come from on a CPU that can only run 8 threads concurrently?
2. If you had actually tested it you would've seen that there's no benefit beyond 2 threads.
Anyway, the reason I'm posting today is that I looked at TF2 with VTune a while ago and have given up on making a nice post with pictures. I'll just list what I found out.
1. TF2 can indeed use at least 8 threads.
2. Only 3 of these threads actually matter. The execution time is split about 3:1:1 or 60%/20%/20% between those 3 threads. All other threads combined are usually <10%.
3. IPC is pretty bad, the threads that matter are <0.8 (>1.2 CPI). What appears to be the main game loop is properly terrible at <0.15 IPC (>6.7 CPI), but there are some that are even worse than that.
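The IPC and CPI figures in point 3 are reciprocals of each other. A minimal sketch of the conversion, using only the two thresholds quoted above (the function name is made up for illustration):

```python
def cpi_from_ipc(ipc: float) -> float:
    """Cycles per instruction is the reciprocal of instructions per cycle."""
    if ipc <= 0:
        raise ValueError("IPC must be positive")
    return 1.0 / ipc


# Thresholds quoted in the list above; the labels are just descriptions.
for label, ipc in [("threads that matter", 0.8), ("main game loop", 0.15)]:
    print(f"{label}: IPC {ipc} -> CPI {cpi_from_ipc(ipc):.2f}")
```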
Some more details regarding 2.:
The most important thread contains the engine / main game loop. The second and third are rendering and the GPU driver. The others seem to be independent tasks that were easy to split off, like audio. The overhead on those doesn't really seem to be worth it though. Back when I benchmarked, -threads 2 (the GPU driver isn't counted) gave me the best results; beyond that it got steadily worse. It was within margin of error though, which is why I didn't post it originally. I couldn't make sense of it at the time.
This also explains most of my previous results. The 33% or 1/3 gain in fps from enabling "multicore rendering" is in line with 1/4 of the work being split off into a separate thread (1/0.75 = 1.33). The gain in performance when running single threaded or when adding a third core while using -threads 2 (driver + rendering together don't max out one core) can be explained by the cache / branch predictor thrashing that occurs when multiple threads run on the same core (in the case of a single-threaded game + driver).
In summary, Valve didn't parallelize anything; they just went for the low-hanging fruit and split off what was easy to split off. If you were able to split up the rest perfectly (those 60%), we could see 3 times the performance, although you'd need at least 5 cores (more likely 6). A more realistic goal would be to parallelize just a portion of the code, but even that could double the fps.
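The arithmetic behind those numbers can be reproduced with a very rough critical-path model. This is only a sketch under simplifying assumptions that are not in the measurements: the 60/20/20 split is taken as exact, threads overlap perfectly, and synchronization costs nothing. All names are made up for illustration.

```python
import math

# Execution-time shares of the three threads that matter (from the 3:1:1 split).
MAIN, RENDER, DRIVER = 60.0, 20.0, 20.0

# "Multicore rendering": game work (main + render) on one thread vs. two.
# The GPU driver is not counted here, matching the -threads 2 remark.
single_threaded = MAIN + RENDER          # 80 units on one thread
multicore = max(MAIN, RENDER)            # 60 units on the critical path
multicore_gain = single_threaded / multicore


def fps_gain_if_offloaded(fraction: float) -> float:
    """FPS multiplier if `fraction` of the main thread's work moved elsewhere.

    Frame time is assumed to be set by the busiest remaining thread.
    """
    critical = max(MAIN * (1.0 - fraction), RENDER, DRIVER)
    return MAIN / critical


ideal_gain = fps_gain_if_offloaded(1.0)      # main loop split up perfectly
partial_gain = fps_gain_if_offloaded(0.5)    # only half of it split off

# Cores needed so that no core carries more than the 20-unit critical path.
cores_needed = math.ceil((MAIN + RENDER + DRIVER) / max(RENDER, DRIVER))

print(f"multicore rendering gain: {multicore_gain:.2f}x")
print(f"ideal gain: {ideal_gain:.1f}x on {cores_needed} cores")
print(f"half the main loop split off: {partial_gain:.1f}x")
```

In this model 5 cores is a lower bound that only holds with zero overhead, which is consistent with the "more likely 6" estimate.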
It's not too bad though. Comparing with Crysis (same year, multithreading also added later via patch) it's only slightly worse and most of that is due to Crysis also splitting off physics which aren't nearly as big of a deal in TF2.
Regarding 3.:
If I had to guess I'd blame this on Visual Studio. The functions with terrible IPC seem to be mostly bound by the microcode sequencer. In other words, there are some terribly CISCy instructions in there. Crysis suffers just as much from cache and DTLB misses, and its branch mispredicts are even worse than in TF2, but it still gets significantly higher IPC because it's not bound by the microcode sequencer nearly as much (0.5% vs 3%, but most of that is in the main loop, which is what holds TF2 back, so it's worse). There also seem to be some terrible dependency chains, which again probably wouldn't appear if they used a different compiler.
Realistically speaking just switching the compiler could improve performance by 50%.
EDIT:
A certain someone posted this on reddit so I'm copying one of my replies here because it does contain some information why I think a different compiler would help:
diegodamohill: So... I bet there's more to it, if it was as simple as that these problems wouldn't exist anymore, so I see three explanations:
1 - TF2 dev team is just dumb, so they didn't notice/know how to fix it.
2 - The actual code is more complicated and messy than banking software made in Pascal, so it would cost too much (or even be impossible) to correct everything, considering they have only a handful of people
3 - TF2 dev team doesn't care, and we are doomed.
take your bets boys.
Setsul: It's a mix of all 3 (which one in brackets). That version of the source engine was originally single threaded, remember "multicore rendering" got added through an update later. So parallelizing anything that's not completely independent is a massive pain in the ass because the engine isn't built to deal with locks and race conditions (2).
You could probably still find some parallelism or just bite the bullet and do what is essentially rewriting parts of the engine. But anyone who knows (1) or is willing to do that (3) is probably working at Source 2. Since there were rumours of TF2 being ported and Source 2 being especially built to be able to semi-automatically port Source 1 games I'd expect their reasoning to be "It's not worth putting any effort into making multithreading in Source 1 better, just improve Source 2 and get it ready asap so all games can enjoy the benefits" (3). It does make sense and I can't blame them for it, Source 2 being Valve-time late does make it a bit awkward though.
The last thing about the compiler is similar. They are using Visual Studio if I'm not mistaken (pretty much everyone does) and there's nothing wrong with it, but the compiler just performs horribly at times. Most of the time it does a reasonably good job, so it's probably a matter of "if it ain't broken don't fix it" (3). Still when I looked at it the parts that hold performance back are not so much due to stalling but because there is almost no ILP to be found. The best example is what I think might be the main game loop with its terrible 6.7+ CPI. Sure it's stalling 50% of the time which isn't good but during the 50% that it isn't stalling >90% of the cycles only one instruction gets done. I can't help but wonder what would happen if they used the Intel compiler with Visual Studio (which is possible).
EDIT: previous posts:
http://www.teamfortress.tv/post/488391/tf2-benchmarks
http://www.teamfortress.tv/post/530699/tf2-benchmarks
Can anyone do reliable benchmarks before/after update? Curious how much the performance changes actually do
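For before/after comparisons it helps to parse the result lines instead of comparing them by eye. A minimal sketch, assuming the timedemo line format used throughout this thread; the helper names and tolerances are made up for illustration. It cross-checks the redundant fields (fps should be frames / seconds, ms/f should be 1000 / fps) and averages several runs.

```python
import re
from dataclasses import dataclass
from statistics import mean

# Matches e.g. "2639 frames 18.604 seconds 141.85 fps ( 7.05 ms/f) 11.984 fps variability"
TIMEDEMO_RE = re.compile(
    r"(\d+) frames\s+([\d.]+) seconds\s+([\d.]+) fps\s*"
    r"\(\s*([\d.]+)\s*ms/f\)\s+([\d.]+) fps variability"
)


@dataclass
class TimedemoResult:
    frames: int
    seconds: float
    fps: float
    ms_per_frame: float
    variability: float


def parse_timedemo(line: str) -> TimedemoResult:
    """Extract the numeric fields from one timedemo result line."""
    m = TIMEDEMO_RE.search(line)
    if m is None:
        raise ValueError(f"not a timedemo result line: {line!r}")
    frames, seconds, fps, ms, var = m.groups()
    return TimedemoResult(int(frames), float(seconds), float(fps), float(ms), float(var))


def is_consistent(r: TimedemoResult, fps_tol: float = 0.02, ms_tol: float = 0.01) -> bool:
    """Check that the reported fps and ms/f agree with frames and seconds.

    The tolerances are assumptions chosen to absorb rounding in the printed values.
    """
    return (abs(r.frames / r.seconds - r.fps) <= fps_tol
            and abs(1000.0 / r.fps - r.ms_per_frame) <= ms_tol)


def average_fps(lines) -> float:
    """Mean fps over several runs of the same demo."""
    return mean(parse_timedemo(line).fps for line in lines)


sample = "2639 frames 18.604 seconds 141.85 fps ( 7.05 ms/f) 11.984 fps variability"
result = parse_timedemo(sample)
print(result, is_consistent(result))
```

A line that fails the consistency check was probably mistyped or copied from two different runs, so it is worth re-running before drawing conclusions from it.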
Checking in once again after updates.
CPU and overclock: i7 2700k @ 4.6Ghz
Graphics Card: GTX 980ti
Driver version: 368.69
dxlevel: 90
Resolution: 1920x1080
Full-screen or windowed: Windowed Borderless
Config: Personal config emphasizing fps
Extras: Have 2 monitors in total hooked up to the same graphics card
Running Windows 10 (was running 7 in the last two benchmarks before this one).
Result 1 (9 months ago CPU@4.2Ghz, GFX 660ti):
2639 frames 13.608 seconds 193.92 fps ( 5.16 ms/f) 14.320 fps variability
2639 frames 13.599 seconds 194.06 fps ( 5.15 ms/f) 13.640 fps variability
2639 frames 13.610 seconds 193.90 fps ( 5.16 ms/f) 14.000 fps variability
Result 2 (7 months ago CPU@4.3Ghz, GFX 660ti):
2639 frames 13.112 seconds 204.38 fps ( 4.89 ms/f) 14.957 fps variability
2639 frames 13.013 seconds 203.80 fps ( 4.93 ms/f) 14.336 fps variability
2639 frames 13.158 seconds 204.56 fps ( 4.99 ms/f) 14.106 fps variability
Newest result (after MM update CPU@4.6Ghz, GFX 980ti):
2639 frames 12.887 seconds 304.79 fps ( 4.88 ms/f) 22.259 fps variability
2639 frames 12.581 seconds 305.23 fps ( 4.91 ms/f) 22.301 fps variability
2639 frames 12.208 seconds 304.90 fps ( 4.73 ms/f) 22.756 fps variability
That's quite an improvement! The .3ghz increase in CPU speed and any other changes in your setup definitely cloud how much the optimizations are to thank, though.
Using my own stress-test demo (taken from spectating "action" in a 32 player pl server for ~2 minutes), I went from about 70 to 80 average FPS with an i7 4790k at 4.5ghz OC.
I also forgot to add that I switched from an HDD to an SSD and one of my more RAM savvy friends fixed my timings which I hope helped boost performance.
2639 frames 29.505 seconds 89.44 fps (11.18 ms/f) 5.254 fps variability
2639 frames 40.645 seconds 64.93 fps (15.40 ms/f) 4.321 fps variability
Texture detail low vs medium (as locked by mm) thanks valve
CPU and overclock: Intel Core i7-3630QM CPU @ 2.40 GHz
Graphics Card: Intel HD Graphics 4000 (lol)
dxlevel: 90
Resolution: 1024x768
Full-screen or windowed: Full
FPS configs enabled: no
Shadows enabled/disabled: disabled
Additional notes: I basically have everything as low as possible without using a cfg (I had one for a while but I removed it after mym)
2639 frames 21.255 seconds 124.16 fps ( 8.05 ms/f) 11.896 fps variability
CPU and overclock: Intel Pentium G3258 @ 4.0 GHz (OC)
Graphics Card: Asus GTX 750 Ti
dxlevel: 90
Resolution: 1024x768
Full-screen or windowed: Full
FPS configs enabled: maxframes, hrblsnohatsmod.14.05.2016, headsfeet05.04
Shadows enabled/disabled: disabled
Did it for another thread so might as well put it here.
2639 frames 10.686 seconds 246.95 fps ( 4.05 ms/f) 16.403 fps variability
Specs are:-
i5 4670K @3.8GHz
R9 280X Vapor X Tri X (Driver version: 16.200.1035.1001)
16GB G Skill Ripjaws 1600Mhz
465GB Samsung SSD
Windows 10 64bit
Game settings are:-
Dxlevel 81
1920x1080 144hz
Fullscreen
Comanglia's high-end config (with stabby's edits)
Nohatsmod, Nohateffectsmod, No explosion smoke script
245 with picmip -1 at 1080p?
2639 frames 35.905 seconds 73.50 fps (13.61 ms/f) 6.019 fps variability
CPU and overclock: Intel i3-4010U @ 1.7Ghz
Graphics Card: NVidia GeForce 820m 2GB
Driver version: 368.39
dxlevel (default is 90): 81
Resolution: 1366x768
Full-screen or windowed: full
FPS configs enabled: Comanglia's Stability Config
Shadows enabled/disabled: disabled
Additional notes: n/a
2639 frames 14.701 seconds 179.51 fps ( 5.57 ms/f) 10.186 fps variability
CPU and overclock:Intel Xeon E5-2670 v3 ES 12 core @ 2.6ghz HT disabled
Graphics Card: ASUS GTX 780 Ti DirectCU II
Ram:16gb DDR4 2400 dual channel
OS: Windows 10 Enterprise 64 bit
Driver version: 368.81
dxlevel: 98
Resolution: 1920x1080 144hz
Full-screen or windowed: Full
FPS configs enabled: Comanglia's frames config, medium models, very high textures
Shadows enabled/disabled: disabled
Stole it, doesnt seem bad
CPU: Pentium g840 2.8Ghz
GPU: GT440
RAM: 4GB 1333MHZ
OS: WIN7
DRIVER: Latest yay
DX: 80
@1080
2639 frames 22.395 seconds 117.84 fps ( 8.49 ms/f) 13.655 fps variability
2639 frames 28.416 seconds 92.87 fps (10.77 ms/f) 4.944 fps variability
CPU and overclock: Core i5-3350P @ 3.1 GHz
Graphics Card: Radeon R9 280X
Driver version: Crimson 16.7.3
dxlevel (default is 90): 95
Resolution: 1680x1050
Full-screen or windowed: Full
FPS configs enabled: none (settings: http://i.imgur.com/z7fN5kq.png)
Shadows enabled/disabled: enabled
I'll post one with configs in a while.
2639 frames 20.564 seconds 128.33 fps ( 7.79 ms/f) 7.569 fps variability
CPU and overclock: Core i5-3350P @ 3.1 GHz
Graphics Card: Radeon R9 280X
Driver version: Crimson 16.7.3
dxlevel (default is 90): 95
Resolution: 1680x1050
Full-screen or windowed: Full
FPS configs enabled: Comanglia's cfg (w/ facial expressions, gibs and ragdolls enabled; mat_picmip -1), no bullet dust, reduced explosions
Shadows enabled/disabled: disabled
It's either my Radeon being bad at video games, or my CPU somehow not being good enough to keep up the pace...
d to load sound "vo\medic_painsharp07.wav", file probably missing from disk/repository
2639 frames 15.293 seconds 172.57 fps ( 5.79 ms/f) 11.765 fps variability
CPU: i5 3570K @ 4.4GHz
GPU: Nvidia GTX 750Ti
8GB RAM
Res: 1920x1080
FPS config: Comanglia's cfg with mat_picmip 0, reduced explosions.
Do you think those FPS are ok for my setup?
I kinda feel like some people with similar setups get more FPS than that.
TheReduxPL: no bullet dust
How does that work?
2639 frames 14.882 seconds 177.33 fps ( 5.64 ms/f) 11.978 fps variability
i7-4790k at 4.4ghz
msi gtx 960 at stock
16gb of ram
1920x1080
custom fps config (high quality models and textures, low everything else)
dx level 90
2639 frames 18.553 seconds 142.24 fps ( 7.03 ms/f) 10.859 fps variability
CPU: i5-4690k at 4.1 GHz
GPU: R9 270 OC'd to about stock 270x
RAM: Single 8 GB stick
Res: 1920x1080
Config: Slightly modified Comanglia
DX level 90
Note: Two monitors on same gpu.
2639 frames 19.542 seconds 135.05 fps ( 7.40 ms/f) 13.547 fps variability
This is DX level 91
2639 frames 18.729 seconds 140.91 fps ( 7.10 ms/f) 11.823 fps variability
DX level 95
2639 frames 17.263 seconds 152.87 fps ( 6.54 ms/f) 13.340 fps variability
DX level 81
Can someone please help me? I feel like my results are too low, but maybe I don't know what I'm doing or have done something wrong.
2639 frames 43.526 seconds 60.63 fps (16.49 ms/f) 6.347 fps variability
CPU: i7 950 @ 3.07 ghz
GPU: EVGA GeForce GTX 1070 FTW GAMING ACX 3.0
RAM: 16GB
Res 1920x1080
Config: I have a few things turned off.. i use : Chris' dx9frames config
Full Screen
Shadows disabled
Motherboard: Sabertooth x58
I feel like I should get more FPS than this. What could be the culprit?
You need a better CPU.