Applications of GF100’s Compute Hardware
Last but certainly not least are the changes to gaming afforded by the improved compute/shader hardware. NVIDIA believes that by announcing the compute abilities of GF100 so far ahead of its gaming abilities, it has given potential customers the wrong idea about NVIDIA’s direction. Certainly they’re increasing their focus on the GPGPU market, but as they’re trying their hardest to point out, most of that compute hardware has a use in gaming too.
Much of this is straightforward: all of the compute hardware is what processes the pixel and vertex shader commands, so the additional CUDA cores in the GF100 give it much more shader power than the GT200. We also have DirectCompute, which can use the compute hardware to quickly do some things that couldn’t be done quickly via shader code, such as Self Shadowing Ambient Occlusion in games like Battleforge, or to take an NVIDIA example, the depth-of-field effect in Metro 2033.
Perhaps the single biggest improvement for gaming that comes from NVIDIA’s changes to the compute hardware is the benefit afforded to compute-like tasks in games. PhysX plays a big part here, as along with DirectCompute it’s going to be one of the biggest uses of compute abilities when it comes to gaming.
NVIDIA is heavily promoting the idea that GF100’s concurrent kernels and fast context switching abilities are going to be of significant benefit here. With concurrent kernels, different PhysX simulations can start without waiting for other SMs to complete the previous simulation. With fast context switching, the GPU can switch from rendering to PhysX and back again while wasting less time on the context switch itself. The result is that there’s going to be less overhead in using the compute abilities of GF100 during gaming, be it for PhysX, Bullet Physics, or DirectCompute.
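To make the concurrent kernels argument concrete, here is a toy scheduling model in Python. It is only a sketch under stated assumptions, not a description of how GF100's scheduler actually works: the kernel list, the SM counts each kernel needs, and the durations are all made-up illustrative numbers, and the function names are hypothetical. It compares running kernels strictly one at a time against letting each kernel start as soon as enough SMs are free.

```python
import heapq

def serial_makespan(kernels):
    """Total time when only one kernel may occupy the GPU at a time.

    kernels: list of (sms_needed, duration_ms) tuples (hypothetical values).
    """
    return sum(duration for _, duration in kernels)

def concurrent_makespan(kernels, total_sms):
    """Total time when kernels launch in order as soon as enough SMs are free."""
    free = total_sms
    now = 0.0
    running = []  # min-heap of (finish_time_ms, sms_held)
    for sms, duration in kernels:
        if sms > total_sms:
            raise ValueError("kernel needs more SMs than the GPU has")
        # Wait for earlier kernels to finish until this one fits.
        while free < sms:
            finish, released = heapq.heappop(running)
            now = max(now, finish)
            free += released
        heapq.heappush(running, (now + duration, sms))
        free -= sms
    # The last kernel to finish determines the total time.
    return max((finish for finish, _ in running), default=now)

# Illustrative workload: three small simulations, then one GPU-wide pass.
# All numbers below are assumptions chosen for the example.
example_kernels = [(4, 2.0), (4, 2.0), (8, 1.0), (16, 3.0)]
TOTAL_SMS = 16

print(serial_makespan(example_kernels))                 # 8.0
print(concurrent_makespan(example_kernels, TOTAL_SMS))  # 5.0
```

In this invented workload the three small simulations overlap instead of queuing behind one another, so the batch finishes in 5 ms rather than 8 ms. The real benefit depends entirely on how many SMs each kernel can actually fill, which this model does not attempt to predict.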
NVIDIA is big on pushing specific examples here in order to entice developers into using these abilities, and a number of demo programs will be released along with GF100 cards to showcase them. Most interesting among these is a ray tracing demo that NVIDIA is showing off. Ray tracing is something even G80 could do (albeit slowly), but we find this an interesting way for NVIDIA to go, since promoting ray tracing puts them in direct competition with Intel, who has been showing off ray tracing demos running on CPUs for years. Ray tracing nullifies NVIDIA’s experience in rasterization, so promoting its use is one of the riskier things they can do in the long term.
NVIDIA's car ray tracing demo
At any rate, the demo program they are showing off is a hybrid program that showcases the use of both rasterization and ray tracing for rendering a car. As we already know from the original Fermi introduction, GF100 is supposed to be much faster than GT200 at ray tracing, thanks in large part to the L1 cache architecture of GF100. The demo we saw of a GF100 card next to a GT200 card had the GF100 card performing roughly 3x as well as the GT200 card. This specific demo still runs at less than a frame per second (0.63 on the GF100 card) so it’s by no means true real-time ray tracing, but it’s getting faster all the time. For lower quality ray tracing, certainly this would be doable in real-time.
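To put the demo numbers in perspective, the short Python sketch below works through the arithmetic. The 0.63 fps figure and the roughly 3x ratio come from the demo described above; the 30 fps real-time target is our own assumption for illustration, and the GT200 figure is only an estimate since the 3x ratio is approximate.

```python
GF100_FPS = 0.63            # measured in the demo
SPEEDUP_OVER_GT200 = 3.0    # "roughly 3x", so treat results as estimates
TARGET_FPS = 30.0           # assumed real-time target, not from NVIDIA

# Time the GF100 card spends on a single frame of the demo.
frame_time_s = 1.0 / GF100_FPS

# Implied GT200 frame rate if the 3x ratio were exact.
gt200_fps_estimate = GF100_FPS / SPEEDUP_OVER_GT200

# How much faster the demo would need to run to hit the assumed target.
speedup_needed = TARGET_FPS / GF100_FPS

print(f"GF100 frame time: {frame_time_s:.2f} s")
print(f"Estimated GT200 rate: {gt200_fps_estimate:.2f} fps")
print(f"Speedup needed for {TARGET_FPS:.0f} fps: {speedup_needed:.1f}x")
```

At this quality level each frame takes about 1.6 seconds, the GT200 card would be managing around 0.2 fps, and the demo would need to run nearly 48 times faster to reach 30 fps.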
Dark Void's turbulence in action
NVIDIA is also showing off several other demos of compute for gaming, including a PhysX fluid simulation, the new PhysX APEX turbulence effect on Dark Void, and an AI path finding simulation that we did not have a chance to see. Ultimately PhysX is still NVIDIA’s bigger carrot for consumers, while the rest of this is to entice developers to make use of the compute hardware through whatever means they’d like (PhysX, OpenCL, DirectCompute). Outside of PhysX, heavy use of the GPU compute abilities is still going to be some time off.
115 Comments
Zool - Tuesday, January 19, 2010
There are still plenty of questions. Like how tessellation affects MSAA with increased geometry per pixel. Also the flat stairs in Unigine (and very plastic, realistic after tessellation and displacement mapping), would they work with collision detection as after tessellation, or before, as completely flat and somewhere else in the 3D space? The same with some PhysX effects. The Unigine Heaven is more of a showcase of tessellation and what can be done than a real game engine.
marraco - Monday, January 18, 2010
Far Cry Ranch Small, and all the integrated benchmarks, read constantly from the hard disk, so they are dependent on HD speed.
It's not unfair, since FC2 updates textures from hard disk all the time, making the game freeze constantly, even on the better computers.
I wish to see that benchmark run with and without SSD.
Zool - Monday, January 18, 2010
I also want to note that with the stream of FPS/third-person shooter/RTS/racing games that all look the same, sometimes upgrading the graphics card doesn't make much sense these days.
Can anyone make a game that will use PC hardware and won't end up in running and shooting at each other from first or third person? Dragon Age was quite a weak, overhyped RPG.
Suntan - Monday, January 18, 2010
Agreed. That is one of the main reasons I've lost interest in PC gaming. Ironically though, my favorite console games on the PS3 have been the two Uncharted games...
-Suntan
mark0409mr01 - Monday, January 18, 2010
Does anybody know if Fermi, GF100 or whatever it's going to be called has support for bitstreaming of HD audio codecs?
Also do we know anything else about the video capabilities of the new card? There doesn't really seem to have been much mentioned about this.
Thanks
Slaimus - Monday, January 18, 2010
Seeing how the GF100 chip has no display components at all on-chip (RAMDAC, TMDS, DisplayPort, PureVideo), they will probably be using an NVIO chip like the GT200. Would it not be possible to just put multiple NVIO chips to scale with the number of display outputs?
Ryan Smith - Wednesday, January 20, 2010
If it's possible, NVIDIA is not doing it. I asked them about the limit on display outputs, and their response (which is what brought about the comments in the article) was that by the time they greenlit Surround, GF100 cards were already too late in the design process to add more display outputs.
I don't have more details than that, but the implication is that they need to bake support for more displays into the GPU itself.
Headfoot - Monday, January 18, 2010
Best comment for the entire page, I am wondering the same thing.
Suntan - Monday, January 18, 2010
Looking at the image of the chip on the first page, it looks like a miniature of a vast city complex. Man, when are they going to remake “TRON”…
…although, at the speeds that chips are running now-a-days, the whole movie would be over in a ¼ of a second…
-Suntan
arnavvdesai - Monday, January 18, 2010
In your conclusion you mentioned that the only thing which would matter would be price/performance. However, from the article I wasn't really able to make out a couple of things. When NVIDIA says they can make something look better than the competition, how would you quantify that?
I am a gamer & I love beautiful graphics. It's one of the reasons I still sometimes buy games for PCs instead of consoles. I have a 5870 & a 1080p 24" monitor. I would however consider buying this card if it made my games look better. After a certain number (60fps) I really only care about beautiful graphics. I want no grass to look like paper or jaggies to show on distant objects. Also, will game makers take advantage of this? Unlike previous generations, game manufacturers are very deeply tied to the current console market. They have to make sure the game performs admirably on current day consoles, which are at least 3-5 years behind their PC counterparts, so what incentive do they have to try and advance graphics on the PC when there aren't enough people buying them? I am looking at current games and frankly, just playing them, other than an obvious improvement in framerate, I cannot notice any visual improvements.
Coming back to my question on architecture: will this tech being built by NVIDIA help improve the visual quality of games with little or no additional work from the game studios?