Transfer time is irrelevant unless your monitor has horrible input lag. In that case you could "improve" your situation and go from horrible input lag to slightly less, but still horrible, input lag plus horrible picture quality.
Progressive scan -> doesn't matter, limited by refresh rate
Buffered -> two options:
1. bad monitor in terms of latency (most likely, since it's using a buffer latency clearly isn't a concern) -> transmits at default speed, transfer time is set by the refresh rate (identical to progressive scan, except you get one frametime of lag added)
2. good monitor in terms of latency that uses the maximum transfer rate but for some reason still uses a buffer, which makes no sense if they care about latency, so those monitors are rare/nonexistent -> normal latency + transfer time, still worse than progressive scan.
Progressive -> transfer time is irrelevant.
1. -> transfer time is irrelevant
2. -> transfer time is in theory relevant. In practice only very bad monitors do that, so you'll probably get another 30ms of response time on top of it and you're already fucked in terms of input lag anyway.
-> You want progressive scan, and in that case transfer time is set by the refresh rate. No cable or transfer protocol is going to change that; the best it could do is reduce the overhead time. In addition, that only works if the monitor's inputs are capable of higher speeds than what the monitor actually needs.
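To put rough numbers on it, here's a minimal sketch of how the refresh rate, not the link, sets the per-frame transfer window with progressive scan (the overhead fraction is an assumed placeholder, not a spec value):

```python
# Minimal sketch, not a measurement: with progressive scan the refresh rate
# paces the transfer, so the time available to push one frame is roughly
# 1/refresh_rate. The overhead fraction below is an assumed placeholder
# for blanking/protocol overhead, not a real spec value.

def transfer_window_ms(refresh_hz: float, overhead_fraction: float = 0.05) -> float:
    """Time per frame actually spent transmitting pixels, under the assumed overhead."""
    frame_time = 1000.0 / refresh_hz
    return frame_time * (1.0 - overhead_fraction)

for hz in (60, 75, 144):
    print(f"{hz} Hz: {1000.0 / hz:.2f} ms per frame, "
          f"~{transfer_window_ms(hz):.2f} ms of it for pixel data")
```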
I'll show you with my awesome MS Paint skills a bit further down.
DX9 does support triple buffering. It can't drop frames, that's the problem. And it's only a problem when the fps are much higher than the refresh rate, so the added input lag is a fraction of the refresh time. The pre-rendered frames queue adds more input lag; reducing its length helps more.
Also let's keep in mind that DX9 is over 12 years old. That's not exactly state of the art anymore.
http://www.gamedev.net/topic/649174-why-directx-doesnt-implement-triple-buffering/
http://gamedev.stackexchange.com/questions/58481/does-directx-implement-triple-buffering
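For a sense of scale, a rough illustration (assumed numbers, not measurements) of why the render-ahead queue matters more than the triple buffering detail:

```python
# Rough illustration only (numbers assumed, not measured): when the GPU is the
# bottleneck, a render-ahead queue of N frames can add up to roughly N frame
# times of input lag, which is why shortening the queue helps more than
# worrying about triple buffering at high fps.

frame_time_ms = 1000.0 / 60.0          # assume 60 fps for the example

for queue_length in (3, 2, 1):         # 3 is a common driver default
    extra_lag = queue_length * frame_time_ms
    print(f"pre-rendered frames = {queue_length}: up to ~{extra_lag:.1f} ms extra input lag")
```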
wareya: "If you send it an extremely small frame (such as 800x600 16kcolors) then it doesn't need to wait for the entire panel worth of data to be transmitted before it has data with which it can write to the bottom row."
1. 15bpp is 32k and 16bpp is 64k.
2. Your eyes are pretty much the worst measurement tool except for random guessing.
3. You've mixed up two cases.
If the scaler waits for a complete frame to be buffered, transfer time matters, but it can't start immediately.
If the scaler starts immediately, transfer time doesn't matter.
The bottom row is the last one to be transmitted. Obviously it has to be transmitted before it can be displayed, and by definition, once the last row has been transmitted the whole frame has been transmitted. This is also exactly what I talked about in the best case: the scaler can start working with the first transmitted line, so only the scaler latency is added. Given sufficient bandwidth the scaler can finish the frame only slightly after the transmission has been completed. You're right in that respect. However, that isn't the limitation. If the display is running at its maximum refresh rate, going from the top to the bottom row during a refresh takes pretty much exactly 1/[refresh rate] seconds.
Example with progressive scan (ignoring response time): 1ms scaler latency, 16.67ms transfer time at 1920x1080, 3.8ms transfer time at 800x600, 16.67ms refresh time.
No scaler: frame transmission starts at 0ms, refresh starts at 0ms. Transmission ends at 16.67ms, refresh ends at 16.67ms. Next refresh from 16.67ms to 33.33ms and so on.
Scaler: Frame transmission starts at 0ms, scaler starts at 0ms. Scaler finishes first line at 1ms, refresh starts at 1ms. Transmission ends at 3.8ms. Scaler finishes last line at 4.8ms. Next transmission starts at 16.67ms, scaler starts at 16.67ms. Refresh ends at 17.67ms, scaler finishes first line at 17.67ms. New refresh starts at 17.67ms.
See the problem? Refresh time is the limit. All it did was add the scaler latency.
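Same thing as a quick sketch, replaying the assumed numbers from the example (1ms scaler latency, 3.8ms transfer at 800x600, 16.67ms refresh):

```python
# Replaying the numbers from the example above (assumed values: 1 ms scaler
# latency, 3.8 ms transfer at 800x600, 16.67 ms refresh period) to show that
# the refresh period, not the transfer time, sets the pace.

SCALER_LATENCY = 1.0      # ms
TRANSFER_SMALL = 3.8      # ms, 800x600 frame
REFRESH_PERIOD = 16.67    # ms, 60 Hz

for frame in range(2):
    t_start = frame * REFRESH_PERIOD              # one transmission per refresh period
    first_line = t_start + SCALER_LATENCY         # first scaled line ready, refresh starts
    last_line = first_line + TRANSFER_SMALL       # last scaled line ready
    refresh_end = first_line + REFRESH_PERIOD     # the panel still needs a full period to scan out
    print(f"frame {frame}: transmit from {t_start:.2f} ms, scaler done {last_line:.2f} ms, "
          f"refresh {first_line:.2f}-{refresh_end:.2f} ms")
```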
Now an example with a buffered scaler where transfer time matters (T=transmission, S=scaler, R=refresh):
T starts at 0ms. T ends at 3.8ms. S starts at 3.8ms. S finishes first line at 4.8ms, R starts at 4.8ms. S finishes last line somewhere between 4.8ms and 8.6ms. New T starts at 16.67ms. T ends at 20.47ms. S starts at 20.47ms. S finishes first line at 21.47ms, R ends at 21.47ms, new R starts at 21.47ms.
Now it's scaler latency + transfer time. This is only an improvement if scaler latency + small res transfer time is less than large res transfer time and if the monitor buffers a frame no matter what instead of using progressive scan for the native resolution. Is it a TV?
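And the buffered case with the same assumed numbers, so you can see where the extra lag comes from:

```python
# Same assumed numbers, but with the buffered scaler from the case above:
# the whole frame has to arrive before scaling starts, so scaler latency plus
# transfer time shows up as extra display lag on every frame.

SCALER_LATENCY = 1.0      # ms
TRANSFER_SMALL = 3.8      # ms, 800x600 frame
REFRESH_PERIOD = 16.67    # ms, 60 Hz

for frame in range(2):
    t_start = frame * REFRESH_PERIOD          # transmission still paced by the refresh period
    t_end = t_start + TRANSFER_SMALL          # whole frame buffered first
    refresh_start = t_end + SCALER_LATENCY    # refresh can only begin now
    print(f"frame {frame}: transmit {t_start:.2f}-{t_end:.2f} ms, "
          f"refresh starts {refresh_start:.2f} ms (lag {refresh_start - t_start:.2f} ms)")
```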
Bottom line: a scaler can't save time. It can't speed up the refresh itself. Even if the refresh physically only takes e.g. 13.33ms (75Hz) on the screen, so that transfer time is the limiting factor, lowering the resolution only improves things marginally. Yes, the average display lag is reduced by 1.167ms, but there will still only be one frame transmitted every 16.67ms. If the next frame has to wait 5ms for its transmission, that won't change at all. What you want to do in that case is overclock that monitor, and that's when transfer time actually matters. If the cable/output/input/protocol/whatever doesn't have sufficient bandwidth to keep the transfer time below the refresh time, you have to drop the resolution. Because even with the 1ms scaler latency, for that 5ms delayed frame you gain another 3.33ms (OC to 75Hz), for a total reduction of 4.5ms.
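Here's a back-of-the-envelope check of that bandwidth condition (it assumes a link sized so that one native 1920x1080 frame takes exactly one 60Hz period; real links differ):

```python
# Back-of-the-envelope check (assumes a link sized so that one native
# 1920x1080 frame takes exactly a full 60 Hz period; real links differ):
# to overclock the refresh rate, the per-frame transfer time has to stay
# below the new refresh period, otherwise the resolution has to drop.

def transfer_time_ms(width: int, height: int) -> float:
    """Transfer time scaled by pixel count from the assumed 1920x1080 @ 16.67 ms baseline."""
    return (1000.0 / 60.0) * (width * height) / (1920 * 1080)

for hz in (60, 75):
    period = 1000.0 / hz
    for w, h in ((1920, 1080), (800, 600)):
        t = transfer_time_ms(w, h)
        verdict = "fits" if t <= period else "too slow -> drop the resolution"
        print(f"{w}x{h} @ {hz} Hz: transfer {t:.2f} ms vs {period:.2f} ms period -> {verdict}")
```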
[talks about triple buffered vsync]
[posts picture about normal vsync]
wareya: "Addemendum"
Did you mean addendum?
http://i.imgur.com/K8mTUek.png
Black: Sampled input with constant vsync
Light Blue: vsync delay + rendering
Dark Green: waiting for display + actual refresh
Yellow: Random input delay (earlier inputs more delay, later less)
Red: Constant delay through rendering + waiting.
Light Green: Random refresh delay (top line appears first, no delay, bottom line last, full delay)
Brown: Same frame displayed twice.
Dark Blue: Rendering
You see the first missed refresh? That one is unavoidable. You'll have to live with one added refresh time of input lag, even with uncapped fps.
The problem is the second one. Because the next frame doesn't start rendering until the refresh starts, you collect input over two full refreshes instead of around one, and all of that data is delayed by another refresh if the frame doesn't render in time. And those extra refreshes worth of lag keep going for as long as the fps are below the refresh rate. Triple buffering avoids that because the next frame already starts rendering while the previous one is waiting (pink) after missing the refresh. That reduces the added input lag to the normal one refresh time that you get whenever the fps are below the refresh rate.
And that holds true for DX triple buffering as well. The only difference occurs when the fps spike to more than twice the refresh rate: DX won't render a third frame until the last refresh finishes, and that adds some input lag. However, it might actually feel smoother, because with DX the average input lag, while higher, stays pretty much the same, whereas with normal triple buffering it drops in the next frame, then spikes up again, then drops again and so on. Well, in the picture it doesn't drop because the fps drop pretty harshly, it just increases less; with more stable fps it would drop, but you get the idea, the variation increases.
You can both get DX to behave more normally (reduce input lag) and get normal triple buffering to behave more like DX (smooth out the input lag) by using an fps cap. In fact, thanks to statistics, DX can actually have less input lag than normal triple buffering.
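To make the double vs. triple buffering difference concrete, here's a toy model (the render time, refresh period and stall behaviour are assumptions for illustration, not a real swap-chain implementation):

```python
# Toy model only (render time and refresh period are assumed numbers, no real
# API involved): with double-buffered vsync the renderer stalls until the swap,
# so an fps just below the refresh rate costs two refreshes per frame; with
# triple buffering the next frame starts rendering right away.

import math

REFRESH = 16.67   # ms per refresh (60 Hz)
RENDER = 20.0     # ms per frame, deliberately slower than one refresh

def next_vsync(t: float) -> float:
    """Time of the next refresh at or after t."""
    return math.ceil(t / REFRESH) * REFRESH

def simulate(triple_buffered: bool, frames: int = 4) -> None:
    t = 0.0                                # render start = moment the input was sampled
    for i in range(frames):
        done = t + RENDER                  # frame finished rendering
        shown = next_vsync(done)           # displayed at the following refresh
        print(f"  frame {i}: render {t:.1f}-{done:.1f} ms, shown {shown:.2f} ms, "
              f"input lag {shown - t:.2f} ms")
        # triple buffering: start the next frame immediately;
        # double buffering: wait for the swap at the vsync first
        t = done if triple_buffered else shown

print("double-buffered vsync:")
simulate(False)
print("triple-buffered:")
simulate(True)
```

With these numbers, double buffering settles at one new frame every two refreshes with roughly constant lag, while triple buffering delivers frames at the render rate and the lag drifts around, which is the variation described above.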
Continued in next post.