
NVIDIA G80: Architecture and GPU Analysis

Posted: 2006-11-08 02:45pm
by Ace Pace

Posted: 2006-11-08 02:56pm
by Arrow
I'm reading the HardOCP review right now. I'm LOVING this new architecture.

And my EVGA 8800 GTX will be here tomorrow! :twisted:

Posted: 2006-11-08 02:58pm
by Ace Pace
Fuck you, fuck you hard. You WILL bench that thing under realistic conditions.

I'm reading the Beyond3D review first. Discussion of actual benchmarks goes in the other thread.

Posted: 2006-11-08 03:09pm
by Arrow
Ace Pace wrote:Fuck you, fuck you hard. You WILL bench that thing under realistic conditions.
Haha! I KNEW that was going to be your reaction! Don't worry, I plan on gathering some data from BF2142, Oblivion and NWN2 tonight with my GX2, and again tomorrow on the GTX (don't expect a lot, though, I want to play with my new toy).

Also, if there's demand for it, I could do Company of Heroes and Fear timedemos this weekend (but those would be GTX only).

Posted: 2006-11-08 03:10pm
by Ace Pace
FEAR timedemos.
From the Beyond3D review:
Ah, the ROP hardware. In terms of base pixel quality it's able to perform 8x multisampling using rotated or jittered subsamples laid over a 4-bit subpixel grid, looping through the ROP taking 4 multisamples per cycle. It can multisample from all backbuffer formats too, NVIDIA providing full orthogonality, including sampling from pixels maintained in a non-linear colour space or in floating point surface formats. Thus the misunderstood holy grail of "HDR+AA" is achieved by the hardware with no real developer effort. Further, it can natively blend pixels in integer and floating point formats, including FP32, at rates that somewhat diminish with bandwidth available through the ROP (INT8 and FP16 full speed (measured) and FP32 half speed). Each pair of ROPs share a single blender (so 12 blends per cycle) from testing empirically.

Sample rates in the ROP are hugely increased over previous generation NVIDIA GPUs, a G80 ROP able to empirically sample Z 8 times per clock (4x higher than G7x ever could), with that value scaling for every discrete subsample position, per pixel, bandwidth permitting of course. Concluding 'free' 4xMSAA from the ROP with enough bandwidth is therefore an easy stretch of the imagination to make, and the advantages to certain rendering algorithms become very clear. The 24 ROPs in a full G80 are divided into partitions of 4 each, each partition connecting to a 64-bit memory channel out to the DRAM pool for intermediary or final storage.
Am I reading this right? Is this basically free AA as long as the ROPs aren't busy actually writing data to the display?
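Rough sanity check on the numbers in that quote - a back-of-envelope Python sketch. The clocks are assumptions pulled from the 8800 GTX's published stock specs (575MHz core for the ROPs, 900MHz GDDR3 on the 384-bit bus), not from the review itself:

```python
# Back-of-envelope on the quoted G80 ROP figures.
# Clock numbers below are assumptions (published 8800 GTX stock clocks).
ROPS = 24              # full G80, per the review
Z_PER_ROP_CLK = 8      # Z samples per ROP per clock, per the review
CORE_CLK_HZ = 575e6    # ROPs run at the core clock on G80

BUS_BITS = 384         # 6 partitions x 64-bit channels, per the review
MEM_CLK_HZ = 900e6     # GDDR3 command clock, double data rate

z_rate = ROPS * Z_PER_ROP_CLK * CORE_CLK_HZ        # Z samples per second
bandwidth = (BUS_BITS / 8) * 2 * MEM_CLK_HZ        # bytes per second

print(f"Peak Z sample rate : {z_rate / 1e9:.1f} Gsamples/s")   # ~110.4
print(f"Memory bandwidth   : {bandwidth / 1e9:.1f} GB/s")      # ~86.4

# If every 32-bit Z sample hit DRAM uncompressed, peak Z traffic alone
# would be several times the board's bandwidth - which is why the review
# keeps saying "bandwidth permitting" and why Z compression matters.
print(f"Uncompressed Z traffic at peak: {z_rate * 4 / 1e9:.0f} GB/s")
```

The takeaway from the arithmetic: the ROPs can generate samples far faster than raw DRAM could absorb them, so compression and spare bandwidth decide whether the extra samples actually cost anything.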

Posted: 2006-11-08 03:20pm
by Ace Pace
To add, free AF. From the same review:
anisotropic filtering is effectively pretty much free, depending on your game and target resolution of course.

Posted: 2006-11-08 03:47pm
by Arrow
AA's not free. It depends on what else is going on, but it's certainly less of a hit than before.

Now, since Nvidia has the CUDA engine, I need to see if I can get my boss to get one for my work machine for *cough* DSP work *cough*.

Posted: 2006-11-08 04:14pm
by InnocentBystander
So Nvidia brings some cool shit to the table. What has ATI got?

Posted: 2006-11-08 05:35pm
by Arrow
Right now? Nothing.

R600, which should be a February launch, will be unified as well (although supposedly with only 64 shaders), and possibly have a 512-bit bus with 1GB of RAM. If that last part is true, the R600 will probably be an ultra expensive card, and the extra memory and bandwidth probably won't help it much against the G80. On the flip side, no one knew what the G80 was going to be until a month ago, so ATI may well have some surprises in store.
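For scale, a quick bandwidth sketch in Python. The 8800 GTX line uses its published specs; the R600 line is pure placeholder - the 512-bit bus is the rumour from above and the memory clock is just an assumption to show what the wider bus alone would buy:

```python
def bandwidth_gb_s(bus_bits: int, mem_clk_mhz: float, data_rate: int = 2) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes x effective data rate."""
    return bus_bits / 8 * mem_clk_mhz * 1e6 * data_rate / 1e9

# 8800 GTX: 384-bit bus, 900MHz GDDR3 (published spec)
print(f"G80 (8800 GTX) : {bandwidth_gb_s(384, 900):.1f} GB/s")   # ~86.4

# Rumoured R600: 512-bit bus; the 900MHz clock is only an assumed placeholder
print(f"R600 (rumoured): {bandwidth_gb_s(512, 900):.1f} GB/s")   # ~115.2
```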

Posted: 2006-11-08 05:43pm
by Jaepheth
Arrow wrote: [...] no one knew what the G80 was going to be until a month ago, so ATI may well have some surprises in store.
I would suspect ATI and NVidia have enough spies in each other's operations or at the very least a good enough team of reverse engineers to compete with anything the other throws out.

Posted: 2006-11-08 05:57pm
by Fingolfin_Noldor
I'm sorely tempted to get the 8800GTS, but I just upgraded a few months back to a 1900XT 512mb from a 6600GT. So it's really too early to upgrade.

I doubt ATI is that stupid to launch a card that can't match NVidia's best.

Posted: 2006-11-08 08:27pm
by Captain tycho
I really want one of these, but I simply can't justify spending 500+ dollars on a video card when the one I have now does everything I need it to do at a very acceptable speed, and when DX10 cards aren't that far off.

Posted: 2006-11-08 08:40pm
by NRS Guardian
Captain tycho wrote:I really want one of these, but I simply can't justify spending 500+ dollars on a video card when the one I have now does everything I need it to do at a very acceptable speed, and when DX10 cards aren't that far off.
Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.

Posted: 2006-11-08 10:40pm
by phongn
NRS Guardian wrote:Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.
Well, that and a 680 million transistor GPU at 90nm probably has rather low yields. By comparison, the newest x86 CPUs have less than 300 million at 65 nm. RAM for GPUs is also more expensive as well (being specialized, high-performance parts)

Posted: 2006-11-08 11:45pm
by ThatGuyFromThatPlace
I, on the other hand, haven't upgraded since the 6 series and have been waiting and saving for the DX10 cards for months.
Can't wait till February.

Posted: 2006-11-09 12:31am
by Ace Pace
phongn wrote:
NRS Guardian wrote:Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.
Well, that and a 680 million transistor GPU at 90nm probably has rather low yields. By comparison, the newest x86 CPUs have less than 300 million at 65 nm. RAM for GPUs is also more expensive as well (being specialized, high-performance parts)
Architecting such a massive GPU has taken NVIDIA a great deal of time and money, four years and $475M to be exact.
:shock:

Posted: 2006-11-09 08:51am
by Arrow
phongn wrote:
NRS Guardian wrote:Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.
Well, that and a 680 million transistor GPU at 90nm probably has rather low yields. By comparison, the newest x86 CPUs have less than 300 million at 65 nm. RAM for GPUs is also more expensive as well (being specialized, high-performance parts)
The 8800 GTX isn't that much more expensive than the 7800 GTX when it launched (I paid $579 for an OC'ed version when it launched). In fact, you could possibly argue that it's cheaper, since I needed two 7800 GTXs to drive my 24" LCD (1920x1200), while it looks like I'll only need one 8800 GTX for the same display.

Posted: 2006-11-09 09:53am
by Ace Pace
phongn wrote:
NRS Guardian wrote:Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.
Well, that and a 680 million transistor GPU at 90nm probably has rather low yields. By comparison, the newest x86 CPUs have less than 300 million at 65 nm. RAM for GPUs is also more expensive as well (being specialized, high-performance parts)
No kidding.
Nvidia isn't handing out exact die size measurements, but they claim to get about 80 chips gross per wafer.
Eighty. From a 300mm wafer. And that's just gross numbers...

:shock:
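If anyone wants to sanity-check that, here's the standard gross-dies-per-wafer estimate as a Python sketch. The die area is an assumption - NVIDIA wasn't giving out exact measurements, and ~480mm² was the figure commonly floated at the time - and the formula ignores reticle structures, scribe lanes and defects, so it's an optimistic upper bound:

```python
import math

def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """First-order estimate: wafer area / die area, minus a correction
    for the partial dies lost around the wafer edge."""
    d = wafer_diameter_mm
    dies = (math.pi * (d / 2) ** 2) / die_area_mm2 \
           - (math.pi * d) / math.sqrt(2 * die_area_mm2)
    return int(dies)

# ~480mm^2 is an assumption; NVIDIA didn't publish the die size.
print(gross_dies_per_wafer(480))   # roughly 116 candidate dies

# The quoted ~80 gross per wafer is lower still, which gives a feel for how
# much edge exclusion and conservative layout cost on a chip this big -
# and that's before any defective dies are thrown away.
```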

Posted: 2006-11-09 09:54am
by Arrow
For those of you that don't want to go through 15-page review after 15-page review, Ars has a great one page summary of the new architecture and what it means.

Posted: 2006-11-09 11:13am
by phongn
Arrow wrote:The 8800 GTX isn't that much more expensive than the 7800 GTX when it launched (I paid $579 for an OC'ed version when it launched). In fact, you could possibly argue that it's cheaper, since I needed two 7800 GTXs to drive my 24" LCD (1920x1200), while it looks like I'll only need one 8800 GTX for the same display.
That's comparing apples and oranges. We're comparing stock board to stock board, not pre-overclocked variants or SLI.

Posted: 2006-11-09 03:34pm
by The Kernel
Ace Pace wrote:
phongn wrote:
NRS Guardian wrote:Uh... this is a DX10 card, the first DX10 card to be released. That's why it's so expensive.
Well, that and a 680 million transistor GPU at 90nm probably has rather low yields. By comparison, the newest x86 CPUs have less than 300 million at 65 nm. RAM for GPUs is also more expensive as well (being specialized, high-performance parts)
Architecting such a massive GPU has taken NVIDIA a great deal of time and money, four years and $475M to be exact.
:shock:
Apparently this was mostly due to the higher costs associated with doing transistor-level design.

Posted: 2006-11-09 06:41pm
by Arthur_Tuxedo
phongn wrote:
Arrow wrote:The 8800 GTX isn't that much more expensive than the 7800 GTX when it launched (I paid $579 for an OC'ed version when it launched). In fact, you could possibly argue that it's cheaper, since I needed two 7800 GTXs to drive my 24" LCD (1920x1200), while it looks like I'll only need one 8800 GTX for the same display.
That's comparing apples and oranges. We're comparing stock board to stock board, not pre-overclocked variants or SLI.
Still, the 7800 GTX was a somewhat anomalous case because there were so many available at launch and prices dropped by like $75 under MSRP scant weeks after release. Top-end cards have been pretty consistently priced at $600 or so at release, and this is no exception. The GeForce 3 came out at $600, after all.

Posted: 2006-11-09 08:14pm
by Arrow
Ok, to hell with the price debate. I have my card. I'll post pics and screen shots tomorrow (sorry Ace, but this is just too much!). First impressions:

1) This is EVGA's best packaging yet. It reminds me of the packaging used by a vendor I deal with at work for their $25,000 and (way, way) up FPGA boards. Your card will not be damaged during shipping.

2) Easy install, as expected. Only problem was that I managed to unseat my X-Fi.

3) Ok, I think I'm done tormenting Ace.

4) Maybe not! ;)

5) The first thing you notice with this card over all previous Nvidia cards is the image quality. It is literally a night and day difference. When I fired up 3DMark06 (default settings), I was completely blown away. For a second I wasn't sure it was the same program. The screenshots in the reviews simply do not do this new card justice. Everything looks more real and lifelike.

6) The Medieval 2 demo absolutely LOVES this card. Using the same settings as the GX2, the game ran two to three times faster (15-20 FPS up to 35-50 FPS). The G80 has so much power that the pussy French broke and ran without my noble Englishmen so much as lifting a finger! So I maxed everything out (except the AA - I set it to 4x because I want to play with the 8x CSAA override later) and it was still butter smooth at 1920x1200.

Ok, now it's time to try a few other games, but I think I'm going to need to borrow an OMGWTFPWNED image and the Samuel L. Jackson pic to do a proper critique.

Posted: 2006-11-09 10:34pm
by Arrow
Ok, I actually got some benchmarking done.

First up, my system configuration:

[System configuration screenshots]

Here's the 8800GTX 3DMark06 score:
[3DMark06 score screenshot]
Right in line with everyone else's stock single card results.

Ace asked me to do a ShaderMark (v2.1) comparison between the 7950GX2 and the 8800GTX.

Here's the GX2 results

Here's the 8800GTX results

The 8800 pretty much blows the GX2 out of the water on that one.

Ok, now time for Oblivion. I currently have a couple dozen mods installed; the ones that affect the following results are the "Low-Poly Grass" mod, the "Unique Landscapes - Ancient Redwoods 1.3" mod and an LOD texture mod. The low-poly grass looks 99% as good as the stock grass, and it runs better. The Ancient Redwoods is probably the single most demanding area I've ever seen in Oblivion, mod or stock. This test run started at Gottlesfont Priory, and I ran due north until I hit the Black Road, which pretty much goes straight through the heart of the Redwoods mod. I stuck to the same path as much as humanly possible for these runs.

7950GX2: 2xAF, 1/3 grass, 1/4 item, 1/3 shadows, HDR, all else max
Frames: 1973 - Time: 73047ms - Avg: 27.010 - Min: 15 - Max: 44

8800GTX: Default AF, 1/3 grass, 1/4 item, 1/3 shadows, HDR, all else max
Frames: 3179 - Time: 71654ms - Avg: 44.366 - Min: 15 - Max: 63

8800GTX: 8x CSAA, 16xAF, HDR, EVERYTHING maxed!
Frames: 2178 - Time: 71695ms - Avg: 30.379 - Min: 18 - Max: 45

Uh, yeah, I'll play with everything cranked, thankyouverymuch!
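The averages above are just frames divided by seconds; a trivial Python sketch to recompute them from the logged numbers:

```python
# Average FPS = total frames / elapsed seconds, from the runs logged above.
runs = {
    "7950GX2, 2xAF":           (1973, 73.047),
    "8800GTX, default AF":     (3179, 71.654),
    "8800GTX, 8xCSAA + 16xAF": (2178, 71.695),
}
for name, (frames, seconds) in runs.items():
    print(f"{name}: {frames / seconds:.2f} FPS average")
```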

And here's some screenshots of Oblivion with the maxed out settings. Also, you should note that I turned on all reflections in the .ini file, and each screenshot has the FPS stamp in the upper left. 56k download warning applies as well.

Redwoods
Crowd at War

The above screen should be very representative of the worst case with the max settings, and the game was still playable.

Here's a nice HDR shot
Lighting on armor
The lake from the Ancient Yews mod
Swamps near Fort Teleman

And last but not least, Hardware p0rn (all in 7.2 megapixel glory!):
The Box
The Card
The Computer

And fuck, I'm out of webspace...

Posted: 2006-11-09 10:39pm
by Uraniun235
What's that card next to the vid card?