Dec 20, 2003, 07:09 AM
|
#1
|
|
Burned
Join Date: May 2002
Posts: 29,775
|
DH GFX XMAS Review: 5950 v 9800 XT
Over the last few weeks we have been testing the top-end ATI 9800 XT and Nvidia 5950, and have put them head to head in a thorough review showing not only how the cards benchmark but, more importantly, how they cope with a variety of games at various resolutions. You still have time to get one of these babies just before Christmas, but the only turkey you want to end up with is the one on your dinner plate, so read on to see which deserves your hard-earned money!
How did they fare? Is the 9800 XT a vastly superior card, or have Nvidia got the edge at higher resolutions with all the eye candy turned on?
To see the full review head over here. We spent a lot of time with this review and the hardware to make sure all our members got the highest quality, most unbiased review possible, so any comments in our forum are appreciated!
Back in October, when we had our first look at the 9800 XT, its competition was the 5900 Ultra. Two months down the line Nvidia have released their refresh of the 5900 Ultra design: the higher-clocked, differently cooled 5950 Ultra.
|
|
|
Dec 20, 2003, 11:38 AM
|
#2
|
|
It Never.....
Join Date: Nov 2002
Location: Kentucky
Posts: 3,174
Rep Power: 0
|
Kudos!
The usual great work, guys. It is interesting to see that Nvidia has made up some ground, especially with AA and AF on. However, I know it wasn't just me that thought the ATI would still come out ahead by just a bit. Like you said, for that kind of money you expect all of what you can get, and all of the time!
|
|
|
Dec 20, 2003, 11:45 AM
|
#3
|
|
DriverHeaven Lover
Join Date: May 2003
Location: Deep in Martian soil where it's warm and the air is good
Posts: 232
Rep Power: 0
|
Couple of little things that got my attention (it's Saturday morning so forgive me for rambling):
In the "Specs" listing for the nV38, I see that you list:
Pixels Per Clock (peak): 3
I'm really curious about that. Is it a typo, and you meant to write "4"? Or do you have some written documentation from nVidia which states that nV38 does 3 pixels per clock, as opposed to nV35, which does 4 pixels per clock? (Of course the "peak" designation here, which I suspect was supplied by nVidia, is meaningless. There's no "peak" to it--it either does 3 pixels per clock, or 4, at a constant rate.) So, I feel sure that it must be a typo, and you meant to write 4, because the spec listed as:
Fill Rate: 3.8billion tex/sec
is only possible at 475MHz x 4 pixel pipes x 2 texel units per pixel pipe. I.e., the texel units are a part of the pixel pipeline, and you couldn't get the number above with 3 pixels per clock (which would be 3 pixel pipelines x 2 texel units per pipe x 475MHz, which would equal 25% less than 3.8b tex/sec.) So I'm leaning toward the typo theory here, but I'd certainly like to know if nVidia is telling you 3 pixels per clock!... [I hope, I really do, that nVidia's not trying to say it does a "peak" 8 pixels per clock, and that the typo is not that you put in a 3 instead of an "8." nV38 never renders 8 pixels per clock to screen, under any conditions, "peak" included.]
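The arithmetic behind that fill-rate figure is easy to check. What follows is only a sketch of the simple clock x pipes x texel-units model used above; the function name is made up for illustration and nothing here comes from nVidia documentation.

```python
def texel_fill_rate(core_mhz, pixel_pipes, texel_units_per_pipe):
    """Theoretical peak texel fill rate in billions of texels per second."""
    return core_mhz * 1e6 * pixel_pipes * texel_units_per_pipe / 1e9


# 475MHz core, 2 texel units per pipe, as quoted in the post above.
four_pipes = texel_fill_rate(475, 4, 2)   # the 4x2 layout
three_pipes = texel_fill_rate(475, 3, 2)  # what "3 pixels per clock" would imply

print(f"4 pipes: {four_pipes:.2f} billion tex/sec")
print(f"3 pipes: {three_pipes:.2f} billion tex/sec")
print(f"shortfall with 3 pipes: {(1 - three_pipes / four_pipes) * 100:.0f}%")
```

Only the 4-pipe case reproduces the advertised 3.8 billion tex/sec, which is why the "3" reads as a typo rather than a real specification.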
What's always been interesting to me is the fact when 3d-card IHVs advertise "texels per clock" and "pixels per clock" they usually do so separately, so that a person reading the specs thinks that the "fill rate" for texels and pixels are independent of each other, when in fact texels are simply pixel elements. IE, no 3d card renders texels to the screen...  The only things 3d-cards render are pixels, to which texels are attached in the pixel pipeline. So the production of texels and pixels are completely linked and while a 3d card can render a pixel without a texel, it cannot render to screen (make visible) a texel without a pixel. So you can see that it does little good to list separate "fill rates" for texels, unless you understand the relationship between a texel and a pixel.
A comparison of the per-clock pixel pipeline organization between R360 and nV38 is interesting. The organization for R360 is 8x1 (8 pixel pipes, each capable of attaching 1 texel per pixel, per clock.) nV38 is 4x2 (4 pixel pipes, each capable of attaching 2 texels per pixel, per clock.) The behavior of each gpu organization is different, however, under the following software conditions:
Single texturing: In software conditions which require that only 1 texel is applied to each pixel rendered, the per clock organization for R360 is 8 pixels per clock. It is 4 pixels per clock for nV38. In this condition R360 is 2x faster than nV38, per clock, and nV38 effectively "drops back" to the performance of a 4x1 organization.
Multitexturing: In software conditions which require that two texels be attached per pixel, things change for R360. Because R360 has only one texel unit in each of its 8 pixel pipelines, it must effectively "drop back" to 4 pixels per clock, to which 2 texels are applied to each pixel. In this case, essentially, R360's 8x1 becomes 4x2 (this is descriptive and not meant to be technically exact.) In this condition, then, R360 becomes exactly as fast as nV38, per clock, and is producing 4 pixels per clock, with 2 texels applied to each pixel.
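The two cases above can be put into a small model. This is a descriptive sketch only (like the "8x1 becomes 4x2" description itself, it is not meant to be technically exact), and the function and dictionary names are hypothetical.

```python
def pixels_per_clock(pixel_pipes, texel_units_per_pipe, textures_per_pixel):
    """Pixels completed per clock under the simplified pipes x texel-units model."""
    total_texel_units = pixel_pipes * texel_units_per_pipe
    # A pixel is not finished until all of its textures have been applied,
    # and the chip can never output more pixels per clock than it has pixel pipes.
    return min(pixel_pipes, total_texel_units / textures_per_pixel)


# (pixel pipes, texel units per pipe) as described in the post above.
CHIPS = {"R360 (8x1)": (8, 1), "nV38 (4x2)": (4, 2)}

for name, (pipes, units) in CHIPS.items():
    single = pixels_per_clock(pipes, units, 1)
    dual = pixels_per_clock(pipes, units, 2)
    print(f"{name}: {single:g} px/clock single texturing, {dual:g} px/clock with 2 textures")
```

Under this model R360 goes from 8 to 4 pixels per clock when a second texture is required, while nV38 stays at 4 either way.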
These days, there are no 3d games developed which are either 100% single texturing, or 100% multitexturing. Rather, all games are always a mixture of the two, and the single and multitexturing requirements of each game vary from game to game, according to the desires and needs of the developers--there is no fixed ratio for this among games. So the bottom line is that where fill rate is concerned (pixels/texels) R360 is always 2x faster than nV38 when single texturing, per clock, and when multitexturing, R360 is always as fast as nV38, per clock. From a fill rate pixel/texel perspective, there are no conditions in which R360 is ever slower than nV38, per clock.
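For a game that mixes the two, a rough way to picture it is to average the clocks spent per pixel. This is a toy model under loud assumptions: every pixel needs either exactly one or exactly two textures, and memory bandwidth, shaders and everything else are ignored. The function name is invented for the example.

```python
def r360_per_clock_advantage(single_textured_fraction):
    """How many times faster R360 is than nV38, per clock, for a given mix."""
    s = single_textured_fraction
    # R360 (8x1): 1/8 clock per single-textured pixel, 1/4 clock per dual-textured pixel.
    r360_clocks_per_pixel = s / 8 + (1 - s) / 4
    # nV38 (4x2): 4 pixels per clock in either mode.
    nv38_clocks_per_pixel = 1 / 4
    return nv38_clocks_per_pixel / r360_clocks_per_pixel


for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"{s:.0%} single textured: R360 is {r360_per_clock_advantage(s):.2f}x nV38 per clock")
```

Whatever the mix, the result stays between 1x and 2x, which matches the point that, per clock, R360 is never the slower of the two in raw fill rate.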
One reason nV38 keeps as close to the R360 as it does, and in some cases may surpass it in terms of raw pixel/texel fill rate, is that the nV38-based products comparable to the R360-based products are generally clocked higher. (Raw fill rate, it must be noted, is much more important in DX7-8 era games than it is for DX9-type fp & pixel-shader games, where the strength of the competing gpu pixel shader engines is a larger factor, but that's another performance issue completely and beyond the scope of these comments.) At 475MHz, for instance, a 5950U is clocked ~15% higher than a 9800XT (@412MHz.) So when both are single texturing, the 9800XT is not quite exactly 2x faster in pixel/texel fill rate, but rather 2x faster less the difference in clock rate, or roughly 1.7x. When multitexturing, however, the clock disparity leaves the 9800XT roughly 13% slower than the 5950U (the same ~15% clock gap seen from the other side), when both products are clocked at their stock config MHz speeds.
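Putting the stock clocks into the same simple model gives the actual ratios. Again this is a sketch of the raw pixels-per-clock x MHz arithmetic only; the variable names are made up, and the pixels-per-clock figures are the ones described above rather than measured values.

```python
R360_MHZ = 412  # 9800XT stock core clock
NV38_MHZ = 475  # 5950U stock core clock

# How much higher the 5950U is clocked than the 9800XT.
clock_edge = NV38_MHZ / R360_MHZ - 1

# 9800XT pixel rate relative to the 5950U.
single_ratio = (8 * R360_MHZ) / (4 * NV38_MHZ)  # single texturing: 8 vs 4 px/clock
multi_ratio = (4 * R360_MHZ) / (4 * NV38_MHZ)   # multitexturing: 4 vs 4 px/clock

print(f"5950U clock advantage: {clock_edge:.1%}")
print(f"single texturing: 9800XT is {single_ratio:.2f}x the 5950U")
print(f"multitexturing:   9800XT is {multi_ratio:.2f}x the 5950U ({1 - multi_ratio:.1%} slower)")
```

Note that a ~15% higher clock on one card works out to a little over 13% "slower" when stated from the other card's side, because the two percentages use different baselines.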
So it's this disparity between the way some games are programmed with respect to the kinds and amounts of single and multitexturing in them (assuming they are not ps2.0 games of course, in which the dynamics change considerably, as nV38 is relatively weak in ps2.0 compared to R360--but again, that's unrelated here), that causes the results you see when comparing current game performance between these two products.
So, with respect to the spec listed above as "Fill Rate: 3.8billion tex/sec" for the 5950U, it should be noted that this theoretical maximum is only possible while multitexturing. In a single-texturing scenario, the "texel" fill rate drops to 1/2 that rate, or "1.9Billion tex/sec." When single texturing, the second texel unit in each of the 4 pixel pipes in nV38 is simply unused per clock, and the only reason R360 is not always exactly 2x as fast as nV38 under that condition is because of the clock-rate disparity between them as configured in these two products.
To recap:
In terms of raw pixel/texel fill rates, when single texturing the 9800XT is always 2x faster than the 5950U, minus the difference in clock rate. When multitexturing, the 9800XT is always exactly as fast as the 5950U, minus the difference in clock rate. It's just one of those things I think is worth knowing when people look at raw pixel/texel fill-rate specs for competing 3d products...
Second thing, briefly, is just to comment on what you probably already know. When running any standard UT2K3 benchmark, such as the "Fly By," it should be noted that nVidia's drivers cancel out trilinear filtering for the display of detail textures in the game. So, to do an apples-apples comparison of UT2K3 performance between these products when using detail textures in the game, you ought not use the in-game setting for trilinear and AF with the ATi product, but should use the Cat control panel to set up a level of AF in the game. That will cause the ATi driver behavior to mimic nVidia's, and detail textures will be bilineared instead of trilineared, which will put both products on an equal IQ setting for a performance comparison. If you set the Catalysts to "Application Preference," however, and engage trilinear and AF from within UT2K3, the Catalysts will apply trilinear and AF to the detail textures as instructed, while the nVidia drivers when set to "application preference" will not, even if you turn on these features inside the game, and so the ATi product will be running at higher IQ settings, making an equal performance comparison impossible. In reading the review, it wasn't entirely clear to me how you set things up for the Fly-By benchmark, so I thought I'd just make this comment. Also relevant to this benchmark is the fact that because nVidia does not publicize info on its driver "optimizations," it cannot be known whether the nVidia drivers contain optimizations specific to the Fly-By benchmark but not to UT2K3 itself, which is important to consider, I think.
One last little thing... I noticed that while the picture of the 5950U showed plenty of clearance for the board on your motherboard length-wise, I was curious as to what happens width-wise, as to any obstruction of PCI slots that might occur as a result of the 5950U's double-wide design. I haven't looked at the particular Asus motherboard you used here, but from the picture it appeared there were no PCI slots near the AGP slot... Is that the case with this motherboard? Or was the overlap of PCI slots, if any, simply not pictured?
Thanks, Good Review!
|
|
|
Dec 20, 2003, 12:16 PM
|
#4
|
|
Burned
Join Date: May 2002
Posts: 29,775
|
As it's Stuart's review I'll let him answer some of your questions and perhaps detail some comments when he gets back later, but I will address the Pixels Per Clock comment. Great job in spotting that, Walt, and I hold my hand up for the typo involved. According to the nvidia documentation Stu presented me with when I was designing it, it says "8", so it's a typo on my behalf. I have rectified it, and thanks for your detailed response to the review... have you ever thought about taking up reviewing or writing yourself?
|
|
|
Dec 20, 2003, 01:48 PM
|
#5
|
|
Administrator
Join Date: Nov 2002
Location: Cloaked
Posts: 2,853
|
Hi Walt, just dropped by and saw your comments. I'm in the middle of something just now so I'll respond in detail later today.
|
|
|
Dec 20, 2003, 02:03 PM
|
#6
|
|
DriverHeaven Extreme Member
Join Date: May 2002
Location: Nova Scotia
Posts: 4,473
|
Very well put together review, guys... Good to see the ATi card still leads the way, but the gap has closed considerably, it seems... So if ATi puts some serious effort into OpenGL optimizations I think they would take the lead hands down...
|
|
|
Dec 20, 2003, 04:46 PM
|
#7
|
|
Administrator
Join Date: Nov 2002
Location: Cloaked
Posts: 2,853
|
Quote:
Originally posted by WaltC
Lots of stuff...
|
Completely agree with what you're saying on multi/single texturing. The section in the review that you are commenting on was taken from various marketing blurbs across the net... some of it cut and paste, yes. I proofread it all and there didn't seem to be any glaring errors anywhere (but I'm not perfect), and generally you can trust these blurbs because any errors can be classed as false advertising. I'll have a re-read and clear up any "woolly marketing speak" over the next few days to give a clearer picture.
I'd also like anyone who's reading this thread and didn't read Walt's comments to go back and read them, as they give an excellent summation of the theoretical performance of each architecture.
On the UT2003 front, I was aware of the FX issues with bilinear/trilinear; I am also aware of the issue ATI has with some textures in UT2003 being incorrect. It's one of those tough calls you make as a reviewer, and I feel I addressed it in the conclusion. I personally would be hard pushed to tell the difference when playing (this applies to UT only; in other games the difference may be noticeable). The other tough call was whether to use the preset benchmark or FRAPS, because it's very possible that the prerecorded demo can be optimised for specifically (beyond the bilinear/trilinear issues). In the case of UT2003 I felt the built-in benchmark is good because it allows people to compare their system at home to the one in the review. It's also the reason why half of the review used FRAPS.
"It's an optimisation jungle out there!"
Finally, on the clearance issue, one PCI slot is obscured by the 5950 on the SK8V.
Thanks again for the comments, nice to see you enjoyed the review 
|
|
|
Dec 20, 2003, 08:15 PM
|
#8
|
|
Banned
Join Date: Jul 2002
Location: Kingston, Ontario .. Canada
Posts: 2,319
Rep Power: 0
|
Nicely done guys ..
One thing we all must note .. NVIDIA submitted its competition two months after the fact..
Still potent ATI is .. XT is awesome on so many different levels 
|
|
|
Dec 20, 2003, 09:44 PM
|
#9
|
|
At Your Service...
Join Date: May 2002
Location: North Carolina
Posts: 3,725
|
WaltC
Thanks for the tech talk on the pixel/texel relationship (among other things) - I really like that, and for taking the time to study the review (great review too!) carefully enough to be able to make the observations you noted here.
Your posts are interesting, informative, and thoughtful - ramble on!!!
|
|
|
Dec 21, 2003, 12:30 PM
|
#10
|
|
DriverHeaven Lover
Join Date: May 2003
Location: Deep in Martian soil where it's warm and the air is good
Posts: 232
Rep Power: 0
|
Quote:
Originally posted by Zardon
As it's Stuart's review I'll let him answer some of your questions and perhaps detail some comments when he gets back later, but I will address the Pixels Per Clock comment. Great job in spotting that, Walt, and I hold my hand up for the typo involved. According to the nvidia documentation Stu presented me with when I was designing it, it says "8", so it's a typo on my behalf. I have rectified it, and thanks for your detailed response to the review... have you ever thought about taking up reviewing or writing yourself?
|
Zardon,
Thanks to you, Veridian3, and everyone who commented, for the kind words. Heh...:) I barely have time to skim the web these days, and to comment in 3-4 forums every now and then, much less write reviews...! But thanks much for the sentiments. It's far easier for me to make the comments I make off and on than it is for Veridian3 to write a review, and I know that often writing a review is 10% fun and 90% work. Thanks again for the kind comments, though, but I honestly believe that you and Veridian3 deserve the lion's share of them for the things you guys do with DH....:)
Anyway, whenever a person writes in a public manner (whether you write a review or just comment in a forum in your spare time), it's beyond doubt that you'll be misunderstood from time to time, because all of us have slightly different cultural and contextual meanings that we associate with the same words. I think you may have misunderstood what I was saying about nV38 and the "pixels per clock" specification originally typo'ed as "3"....:)
What I wanted to point out was that since the word "peak" was included, I could imagine that nVidia was still trying to sell the "8 pixels per clock" marketing line for nV38 that it originally started with nV30 long before the product shipped out in prototype form to the first web sites which reviewed it. What I wanted to say was that the designation marketed for any nV3x gpu as follows:
Pixels Per Clock (peak): 8
...has never been correct (and of course "3" wasn't, either, which is why I was sure it was just a typo to begin with.) The most pixels per clock any nV3x chip can render to screen is 4, which would be true for nV30, nV35, and the nV38 used in the 5950U Veridian3 reviewed. Evidently I didn't explain myself as clearly as I should have above, and for that I do apologize. You were actually much closer to being accurate with the "3" typo than you are to state nVidia's suggestion of "8"...:)
I really can only speculate as to the reasons nVidia might ever have wanted to represent 4-pixel-per-clock gpus as 8-pixel-per-clock gpus, since of course even when the question has been put directly to nVidia PR people it has never been answered straightforwardly. So, I can only speculate, and that speculation is that when the 8-pixel-per-clock R300 shipped from ATi last year, long before even the first prototype nV30's hit the review circuit, nVidia believed that to market nV30 as anything less than an 8-pixel-per-clock gpu would hurt its sales in the marketplace and cause consumers to see nV30 as non-competitive with R300. And so, nVidia began marketing nV30 as an "8-pixel-per-clock" gpu long before they shipped the first nV30's to review sites and ultimately pulled nV30 from the market, even though it has always been a 4-pixel-per-clock gpu. From what you tell me, they are still misrepresenting this fundamental specification even now, with nV38, after having been called out on it months ago by websites across the Internet.
This caused a lot of head-scratching and confusion within various hardware sites early this year, because when the nV30 prototypes started circulating, nobody could figure out how, since nV30 was an 8-pixel-per-clock gpu just like R300, and nV30 was clocked higher, the R300-based 9700P was clobbering it in performance in so many areas. It just never occurred to people, nor to me, that nVidia might have been misrepresenting the architecture of nV30 in such a fundamental fashion--I've never seen a 3d-gpu company deliberately fib openly about something in its architecture as fundamental as "pixels-per-clock." Nobody else had seen it before, either, and so a lot of people including me were trying on various exotic and complicated theories to try and explain the performance differences that were so evident, and which, according to the specifications released by the manufacturers of the gpus, should not have existed.
Anyway, a few months ago, it all came out in the wash when it was discovered by some enterprising people that no nV3x is capable of rendering 8 pixels per clock, and the maximum for the architecture is 4 pixels per clock. So then the confusion we'd all felt earlier about the performance differences between nV30 and R300 shifted away from that aspect (since now we knew why R300 clobbered nV30 even though being clocked slower) to confusion as to why nVidia believed it could misrepresent something so fundamental and have it go undetected. A lot of people like me are very confused as to the directions nVidia PR has taken this year, as it has been anything but a positive direction for the company to take, and has obviously been unsuccessful.
Tech Report did a story a few months ago in which they posed the following question to Derek Perez (I think it was Perez, though it might have been Kirk):
"Is nV30 an 8-pixel-per-clock rendering architecture?" (paraphrased, because I don't recall the exact wording, but this is what was asked)
nVidia replied: "Yes, we do 8 ops per clock." nVidia then went on to talk further about ops, and *black & white* "z-pixels," and, finally, "4 color pixels per clock."
Important above is to note that TR did not ask about "ops per clock" but as to whether the nV30 *ever* rendered 8 color pixels to screen per clock. Based on the information nVidia itself supplied for the answer, Perez's statement should have been: "No, but we do 8 ops per clock"....*chuckle* That was the only answer congruent with the specifications he listed for nV30 in that interview with TR.
Things like "black and white z-pixels" and "ops per clock" are things that are done internally inside the gpu prior to and as a part of the rendering process involved in rendering each color pixel per clock to screen. Just as you cannot render a "texel" to the screen, you cannot render an "op" or a "black & white z-pixel" to screen *independently* of a color pixel...:) Bottom line is that the max pixels per clock that nV3x can render, under any conditions, is 4. nVidia throws the word "peak" in there just to further confuse things. The entire object here for nVidia, though, is to try and put nV3x on the same "pixel per clock" playing field as R300. The only time that's ever true is when multitexturing, but even in that case both the R3x0 and nV3x are rendering 4 color pixels per clock to the screen. In single texturing, while R3x0 is doing 8 color pixels to screen per clock, nV3x is still doing only 4.
Therefore, nV30/35/38 render 4 pixels per clock, which has always been true...:)
Yes, the issue has been needlessly confused this year--it's not your fault or my fault or anybody else's "fault"--except for nVidia. All this is entirely their fault.
If we really wanted to confuse things we could talk about the fact that R3x0 does 20+ "ops per clock" inside the gpu in the pixel creation process--Heh...:) But it wouldn't mean anything, as the number of ops per clock isn't related to the number of pixels per clock a gpu can render to screen. So, I'm very happy to stick with the traditional meaning of "pixels per clock" as "final color pixels rendered to screen per clock," as that is the only definition of the term that counts.
I hope this is of aid to people, and if I have somehow further confused the topic, I apologize as that is not my intention! Merry Christmas to all, and a happy New Year!...:)
|
|
|
Dec 21, 2003, 01:11 PM
|
#11
|
|
Banned
Join Date: Jul 2002
Location: Kingston, Ontario .. Canada
Posts: 2,319
Rep Power: 0
|
Quote:
Originally posted by swimtech
WaltC
Thanks for the tech talk on the pixel/texel relationship (among other things) - I really like that, and for taking the time to study the review (great review too!) carefully enough to be able to make the observations you noted here.
Your posts are interesting, informative, and thoughtful - ramble on!!!
|
I second that .. I have mentioned before about WaltC's opinions and views ..
Always most welcomed and appreciated here on DH 
|
|
|
Dec 22, 2003, 03:27 AM
|
#12
|
|
watching 1080i
Join Date: Nov 2002
Location: April 13th 2029
Posts: 19,435
Rep Power: 75
|
Pixels-per-clock aside, (which I do believe Nvidia is a big fat liar about- and have thought so for quite some time), Great review.
One thing I can say about the comment-
"On the display quality front there were no noticeable differences between the two cards in 3D...........and I’m quite sure that if you sat most people down in front of a system with each card in it they would be hard pushed to notice any image quality differences. "
I have always had big problems with antialiasing on Nvidia cards, as far as I know they have been using the same method of AA for quite a while and haven't changed it yet-
I think when comparing AA speeds between the 2 cards you should enable 8x on the Nvidia and 4x on the ATI card. Then compare the framerate- below is a good example of this-
[Comparison screenshots not preserved: NV 4xAA, NV 8xsAA, ATI 4xAA and ATI 6xAA, as seen here at Xbit]
I have always thought that ATI's 2xAA looks better than Nvidia's 4xAA- And that Nvidia could not match ATI's 4x or 6xAA at any of their AA settings including 8x. When I went from a ti4200 to a 9700p that was the biggest difference I noticed-
Every review I read on Nvidia video cards with AA enabled, I want to tell everyone reading it that the AA quality is better on ATI cards, so you really can't compare 4xAA speed on ATI to 4xAA speed on Nvidia, because the image quality is not the same at all.
I think the reason for the "problem" is that Nvidia uses ordered grid multisampling while ATI uses rotated grid multisampling as explained HERE. Maybe it is as simple as that, maybe not- But I know what I see on the screen- with ATI, the antialiasing effect looks good to me- the edges seem to all be smooth and there is no "shimmering", when I had my Nvidia card I remember 4xAA looking OK, but I was never satisfied with it, and I could always see the jaggies right through the antialiasing.
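To show why grid layout could matter, here is a small sketch comparing an ordered 2x2 grid with a rotated 4-sample grid on a near-vertical edge. The sample positions are generic textbook patterns chosen for illustration, not the actual patterns used by either vendor's hardware, and the function name is made up.

```python
# Sample positions inside one pixel, as (x, y) in the range 0..1.
ORDERED_GRID = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
ROTATED_GRID = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def coverage_levels(samples, steps=64):
    """Distinct coverage values seen as a vertical edge sweeps across one pixel."""
    levels = set()
    for i in range(steps + 1):
        edge_x = i / steps
        # Samples to the left of the edge count as covered by the polygon.
        covered = sum(1 for x, _ in samples if x < edge_x)
        levels.add(covered / len(samples))
    return sorted(levels)


print("ordered grid:", coverage_levels(ORDERED_GRID))
print("rotated grid:", coverage_levels(ROTATED_GRID))
```

With the same four samples, the ordered grid only produces three shades along such an edge because its samples share x positions, while the rotated grid produces five. That is one plausible reason 4x on one card can look smoother than 4x on another on near-vertical and near-horizontal edges.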
Until Nvidia changes the way they calculate AA, I don't think they are in the same league as ATI in visual quality.
Maybe this is less noticeable at higher resolutions, 1280 x 1024 and above - but with all the crummy console ports coming out lately that will only run at 1024 x 768 (and no higher on some systems) while still remaining playable, this problem really stands out. Just a thought.
Last edited by BWX; Dec 22, 2003 at 03:46 AM.
|
|
|
Dec 22, 2003, 04:22 PM
|
#13
|
|
Administrator
Join Date: Nov 2002
Location: Cloaked
Posts: 2,853
|
I stand by my statement that the display differences are very small, if noticeable at all, during gameplay. When you're hammering through a UT2003 map at 200fps you really can't see any difference.
I'm not debating the fact that if you take a section of image, magnify it and compare between NV and ATI AA, one looks better than the other. However, if you're going to do that then you should also do the same with AF, and you'll see that in many ways the NV image is "truer", some may say better, than the ATI image (in places)... but I'd say again for that (and a good example is NFSU) that when you're playing it on the 2 cards there is no real noticeable IQ difference. Speed difference, yes. IQ, no.
*Note: I haven't had time yet to sit down and compare the latest 5x.xx drivers to the IQ from the 4x.xx drivers on NV cards, but it could be that the IQ has improved with the new compiler and that's what's causing the closer IQ. Not a change to the AA method itself - just a general improvement in IQ overall. What a turnaround that would be from the Aquamark beta drivers if it were to be the case.
|