- Jan 26, 2000
I found this tidbit over at the beyond3d forums....
1) Comparing # of "3D" voices is a meaningless comparison. Do you even know what a 3D voice is? A 3D voice can be anything from a stereo-pan to an HRTF to a full wave trace. # of 3D voices is a meaningless comparison until you know what KIND of 3D voice it is, that is, what the virtualization algorithm is. And they differ *tremendously* in their quality and in the amount of CPU required.
2) The X-Box is not limited to a "MAX" of 64 3D voices any more than the GameCube was originally restricted to a MAX of 64. Both the GC and X-Box have programmable DSPs. The GC has a 16-bit 81MHz DSP coupled to 81MHz slow RAM and only 8kb of working RAM. The X-Box has two 200MHz 24-bit DSPs coupled to an 800Mb/s bus, plus 32 hardware mixers with per-voice equalizers, and a voice processor that does HRTF in hardware. I'm sure NVidia could provide a device driver for the DSP that does more than 64 3D voices using a lightweight HRTF or lower quality 3D virtualization, if they wanted to. You can always trade quality for more voices, especially if you have a 200MHz DSP and 24-bit precision. Do you even realize the quantization errors you will get using 16-bit math, or the performance degradation of emulating higher precision on the 16-bit GC DSP? The more channels you mix, the more errors you have.
3) The X-Box 3D positional audio is advanced HRTF-based with occlusions. The X-Box can simultaneously handle all 64 3D audio streams and an additional 192 2D streams, plus IDLS2 Music synthesis. Can you tell me the 3D virtualization algorithm used by Factor 5's MusyX? I'll bet you that those 100 3D voices are not using a full HRTF.
4) Can the GC do 100 3D channels, plus mix in another 192 2D stereo channels, such as dynamic music synthesis?
5) The X-Box sends the final result to a second DSP that does Dolby Digital encoding. Those 3D channels will be sent digitally through your Dolby Digital Encoder and directly to the positionally located speakers. Does GC have this? (NO)
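To make point 1 concrete, here is a minimal sketch of why "a 3D voice" can mean very different amounts of work. It compares a constant-power stereo pan with an HRTF done as a pair of FIR filters (one impulse response per ear). The function names and the 128-tap filter length are assumptions chosen for illustration; nothing here reflects the actual algorithms used on either console.

```python
import math

def pan_voice(samples, pan):
    """Constant-power stereo pan. pan is in [-1, 1]; costs 2 multiplies per sample."""
    angle = (pan + 1.0) * math.pi / 4.0
    gain_left, gain_right = math.cos(angle), math.sin(angle)
    left = [s * gain_left for s in samples]
    right = [s * gain_right for s in samples]
    return left, right

def fir(samples, taps):
    """Direct-form FIR filter: len(taps) multiply-adds per output sample."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out

def hrtf_voice(samples, hrir_left, hrir_right):
    """HRTF positioning as two FIR filters, one impulse response per ear."""
    return fir(samples, hrir_left), fir(samples, hrir_right)

def macs_per_sample_pan():
    return 2

def macs_per_sample_hrtf(num_taps):
    return 2 * num_taps

# Hypothetical filter length, for illustration only.
ASSUMED_TAPS = 128
print("pan :", macs_per_sample_pan(), "multiply-adds per sample")
print("HRTF:", macs_per_sample_hrtf(ASSUMED_TAPS), "multiply-adds per sample")
```

Under that assumed filter length, one HRTF voice costs as much arithmetic as over a hundred panned voices, which is why a raw voice count says little without the algorithm behind it.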
In addition to the software processing power of the 2 DSPs, the MCP has additional fixed function hardware to do HRTF. HRTFs are notoriously expensive to compute, but they are the only way to get high quality 3D positioning without going to a 7.1 setup with stereo panning. FYI, the equalizer is a 7-band parametric EQ; according to NVidia, it is done in fixed function hardware, but of course it could also be done in software. IDLS2 is most likely done in the programmable part of the DSP. Face it, the GC DSP is way inferior on paper, and will be inferior in reality. Why this troubles you so much, I have no idea. No one said "don't buy a GC because it has inferior audio". We are not talking purchasing decisions. Someone made the claim that the GC has the best audio. It's obviously wrong.
Back to tech talk: because the [GameCube's] DSP is limited to 16-bit operations, it must waste extra cycles to get higher precision. Without higher precision FP, you get quantization errors. This is similar to using a 16-bit framebuffer. The more voices and effects, the greater the error. Plus, the input samples are limited to 16-bit, which means you lose the part of the dynamic range you'd need for the .1 channel frequencies. All this means the Macronix DSP has to spend extra cycles to emulate higher precision arithmetic.
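A rough sketch of the precision argument, under stated assumptions: voices are random 16-bit samples, each scaled by its own random gain, and each scaled sample is rounded to the mixer's word size before being summed. The function names, the gain range, and the trial count are hypothetical; this models plain fixed-point rounding only, not any real DSP. The dynamic range helper uses the standard 20*log10(2^bits) rule, which gives about 96 dB for 16 bits and about 144 dB for 24 bits.

```python
import math
import random

def dynamic_range_db(bits):
    """Theoretical dynamic range of a linear PCM word of the given width."""
    return 20.0 * math.log10(2 ** bits)

def quantize(x, bits):
    """Round x to a signed fixed-point grid with the given bit width."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

def mix_error_rms(num_voices, mixer_bits, trials=2000, seed=1):
    """RMS difference between a fixed-point mix and a float reference mix."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(trials):
        exact = 0.0
        acc = 0.0
        for _ in range(num_voices):
            sample = quantize(rng.uniform(-1.0, 1.0), 16)   # 16-bit input sample
            gain = rng.uniform(0.5, 1.0) / num_voices       # assumed per-voice volume
            exact += gain * sample                          # reference: no rounding
            acc += quantize(gain * sample, mixer_bits)      # round each scaled voice
        total_sq += (acc - exact) ** 2
    return math.sqrt(total_sq / trials)

for bits in (16, 24):
    print(f"{bits}-bit dynamic range: {dynamic_range_db(bits):.1f} dB")
for voices in (4, 64):
    for bits in (16, 24):
        print(f"{voices:3d} voices, {bits}-bit mixer: RMS error {mix_error_rms(voices, bits):.3e}")
```

In this model the error grows with the number of voices at a given word size, and the 24-bit mixer's error is far smaller than the 16-bit mixer's for the same voice count.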
In order for the Macronix DSP to come close to competing, it would have to have significant fixed function hardware to make up for its low clock rate and small data bus. In addition, it would probably need significant FP units plus superscalar issue. I find this highly unlikely, especially since the chip is embedded on the Flipper die. All that clucking you're doing about MHz != MHz would be true in other contexts, but here it's not, and you know it (or maybe you don't).
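To put the clock-rate argument in back-of-the-envelope terms, the sketch below computes how many DSP cycles are available per output sample and how many voices fit at a given per-voice cost. The 81MHz and 200MHz clocks come from the post; the 48 kHz sample rate, the one-operation-per-cycle model, and the per-voice cycle costs are all assumptions for illustration. It deliberately ignores fixed function hardware, memory stalls, and superscalar issue, which are exactly the things that would have to make up the difference.

```python
def cycles_per_sample(clock_hz, sample_rate_hz):
    """Whole DSP cycles available for each output sample."""
    return clock_hz // sample_rate_hz

def max_voices(clock_hz, sample_rate_hz, cycles_per_voice):
    """Voices that fit in the per-sample budget at a given cost per voice."""
    return cycles_per_sample(clock_hz, sample_rate_hz) // cycles_per_voice

SAMPLE_RATE_HZ = 48_000  # assumption, not stated in the post

# Hypothetical per-voice costs in cycles per sample (assumed, not measured).
ASSUMED_COSTS = (("stereo pan", 4), ("light HRTF", 64), ("full HRTF", 256))

for name, clock_hz in (("81MHz DSP", 81_000_000), ("200MHz DSP", 200_000_000)):
    budget = cycles_per_sample(clock_hz, SAMPLE_RATE_HZ)
    print(f"{name}: {budget} cycles per sample")
    for label, cost in ASSUMED_COSTS:
        voices = max_voices(clock_hz, SAMPLE_RATE_HZ, cost)
        print(f"  {label:10s} at {cost:3d} cycles/voice: {voices} voices")
```

The same arithmetic shows the voices-versus-quality tradeoff: halving the per-voice cost doubles the voice count on either chip, so a voice count alone does not tell you which chip is doing more work.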
Another advantage the X-Box has is that the DSPs have enough power to store audio samples in memory compressed and decode them in real time. This saves on system memory, since loading up a humongous amount of sound effects and voice samples wastes precious RAM, and streaming them has too much latency except for music.
The GC could do this too, but it won't be as adept. There's no way it's going to do 64 3D HRTF voices plus decompression, plus equalization, plus occlusion, near-field, and macro effects.
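The memory saving from keeping samples compressed in RAM can be put in rough numbers. The sketch below assumes mono 44.1 kHz audio and a codec that stores 4 bits per sample instead of 16, which is typical of ADPCM-style schemes; the post does not name the codec, so the 4-bit figure and the one-minute clip length are assumptions, and block header overhead is ignored.

```python
def audio_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Storage needed for a clip at a fixed number of bits per sample."""
    return seconds * sample_rate_hz * channels * bits_per_sample // 8

SECONDS = 60            # assumed total length of resident sound effects
SAMPLE_RATE_HZ = 44_100 # assumed sample rate

raw = audio_bytes(SECONDS, SAMPLE_RATE_HZ, 16)      # uncompressed 16-bit PCM
packed = audio_bytes(SECONDS, SAMPLE_RATE_HZ, 4)    # assumed 4-bit compressed form

print(f"16-bit PCM : {raw} bytes")
print(f"4-bit codec: {packed} bytes")
print(f"saved      : {raw - packed} bytes")
```

The catch is that every compressed voice now needs decode work on each sample before it can be mixed, which is why this only pays off when the DSP has cycles to spare.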
Dual 200MHz DSPs with 24-bit precision????
For sound hardware, this is insane!!