The Kernel wrote:HyperionX wrote:
Don't talk smack about shit you know next to nothing about. The PPC in the X360 is NOT a PPC970, nor a POWERx derivative (which have OOOE). In fact, as speculated here, it is the same CPU core as in the PS3, but without the vector units, and there are three of them. You may argue against "speculation," but then you're arguing that there is another chip exactly identical to the CPU core in the PS3 that has never been seen or heard of before. I find that exceedingly unlikely, especially considering that this is the only CPU at IBM with the ability to reach 3GHz+. In short, it is almost certainly the same, which makes it an IOE processor.
That makes little sense considering that the alpha dev kits use PPC 970's.
That's because the alpha dev kits are nothing more than dual core Macs. You won't see those in beta kits.
Even if true, though, you are still wrong about a 50% across-the-board performance hit, as well as about using reported IPC as a metric for determining emulation performance.
That's true, sort of. It won't be 50% on every application, but nearly every application will suffer a hit, and the ones that take >50% hits (the ones that balance out the average) are especially problematic, because an emulator needs to run at full speed or emulation will stall.
Performance of this chip will be highly dependent on the application. In-order chips are easier to keep fed since you don't have to worry about dependencies or execution units being busy.
In other news, 2 + 2 = 5, up is down, black is white.
This is a total load of 100% wrong BS. This is specifically the weakness of IOE: dependencies. An IO processor must wait for dependencies to resolve before it can continue computing, so the pipeline stalls, which is one of its main weaknesses against OOO processors. In general, while there is some variance, there are virtually no apps in which an IO processor outperforms an OOOE processor because of limitations like these, except for multithreaded apps on a machine with a surplus of IO cores. Execution units are rarely busy because the pipeline is stalled so often, which is a BAD thing because you want them to be as busy as possible.
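To put the dependency point in concrete terms, here's a toy model. It's my own sketch, not a simulation of any real chip: both cores are single-issue, the instruction names and latencies are made up, and the "OOO" core gets an unlimited window. The only difference is that the in-order core can't issue anything younger than a stalled instruction, while the OOO core grabs the oldest instruction whose inputs are ready.

```python
# Each instruction: (name, latency_in_cycles, names_of_inputs).
# Hypothetical program: two independent load/use pairs feeding one add.
PROGRAM = [
    ("load_a", 4, []),
    ("use_a", 1, ["load_a"]),
    ("load_b", 4, []),
    ("use_b", 1, ["load_b"]),
    ("add", 1, ["use_a", "use_b"]),
]


def in_order_cycles(program):
    """Single-issue, in-order: an instruction waits for its inputs AND for
    every older instruction to have issued first."""
    done = {}   # name -> cycle at which its result is available
    slot = 0    # earliest cycle the next instruction may issue
    for name, latency, deps in program:
        ready = max([done[d] for d in deps], default=0)
        issue = max(slot, ready)      # stall here if inputs are not ready
        done[name] = issue + latency
        slot = issue + 1
    return max(done.values())


def out_of_order_cycles(program):
    """Single-issue, idealized out-of-order: every cycle, issue the oldest
    instruction whose inputs are ready (unbounded window, an assumption)."""
    done = {}
    pending = list(program)
    cycle = 0
    while pending:
        for i, (name, latency, deps) in enumerate(pending):
            if all(d in done and done[d] <= cycle for d in deps):
                done[name] = cycle + latency
                del pending[i]
                break                 # at most one issue per cycle
        cycle += 1
    return max(done.values())


if __name__ == "__main__":
    print("in-order:    ", in_order_cycles(PROGRAM), "cycles")
    print("out-of-order:", out_of_order_cycles(PROGRAM), "cycles")
```

The in-order core sits behind `use_a` until `load_a` comes back and only then starts `load_b`; the OOO core overlaps the two loads. Real hardware is far messier than this, but the direction of the gap is the point.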
The only real problem with an in-order chip is cache misses, which can lead to stalls while loads are fetched. This isn't nearly as big a problem when you are dealing with an emulator, as the primary execution code could be small enough to fit in a Level 1 data cache.
This is most definitely not the only "real" problem, just another one. Actually, cache hits are worse for IO than for OOO too, because L1 cache has about 2-4 cycles of latency and L2 has 10-20ish cycles, latencies that OOOE processors can mask but an IOE processor cannot. And I call BS on the claim that any significant portion of the code in an emulator could fucking fit in 32KB of L1 cache. Even zsnes (~500KB) doesn't fit in something that small, never mind a fully fledged x86 emulator + Windows kernel and the graphics API. Not even L2 cache could conceivably do it. It's going into main memory here, folks.
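For anyone who wants the arithmetic, here's a back-of-the-envelope sketch. The 32KB L1, the ~500KB zsnes figure, and the L1/L2 latencies are the numbers from this thread (I took the midpoints of the ranges). The main memory latency and both miss rates are placeholders I made up purely to show the formula, so don't quote the result as a measurement.

```python
KB = 1024

L1_BYTES = 32 * KB          # L1 size quoted in the thread
ZSNES_BYTES = 500 * KB      # rough zsnes footprint quoted in the thread

L1_LATENCY = 3              # cycles, midpoint of the 2-4 range above
L2_LATENCY = 15             # cycles, midpoint of the 10-20 range above
MEM_LATENCY = 200           # cycles, ASSUMED placeholder, not from the thread


def footprint_ratio(footprint_bytes, cache_bytes):
    """How many times larger the footprint is than the cache."""
    return footprint_bytes / cache_bytes


def average_access_cycles(l1_miss_rate, l2_miss_rate):
    """Standard average memory access time for a two-level hierarchy:
    every access pays L1, L1 misses also pay L2, L2 misses also pay memory."""
    return L1_LATENCY + l1_miss_rate * (L2_LATENCY + l2_miss_rate * MEM_LATENCY)


if __name__ == "__main__":
    print("zsnes vs L1:", footprint_ratio(ZSNES_BYTES, L1_BYTES), "x")
    # Miss rates below are hypothetical inputs, chosen only for illustration.
    print("avg access:", average_access_cycles(0.10, 0.50), "cycles")
```

An in-order core with nothing independent to run eats that average latency as stall time on every dependent load, while an OOO core can hide at least the L1 and L2 part of it behind other work.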
EPIC doesn't count: while it is IOE, it's also a 6-issue core (compared to 2 for the PPC and 3 for Pentiums) with a huge amount of cache (1.5-9MB L3, 256KB L2), it uses the IA-64 instruction set, which was designed to lessen its IOness, and it still loses to Pentiums and Athlons in integer ops.
Itanium's problems with integer performance center around the design philosophy, not its in-order nature. Integer code contains a number of conditional branches, which makes it harder for an IA-64 compiler to create larger instruction bundles, which means several of the Itanium execution units might be dormant. This is why compiler performance is so important to the Itanium.
Which, by the way, is a total red herring. x86 has the same conditional branches, so does PPC, and so does every other ISA ever made. What you're talking about is the fundamental limits of parallelism and how the IA-64 ISA deals with them. The fact that Itanium simply does not perform that well on integer (given its resources) is a hit against IO processors at integer in general.
Integer performance is the same weakness the much slimmer PPC has to face. For more realistic CPUs, the 50% claim stands. If you dispute the 50% claim, then you dispute the guy who wrote that, not me. Floating point ops will be much better, which is the PPC's strength, but that's useless in emulation, so backwards compatibility is still simply impossible.
The person who wrote that admits himself that the performance differences are highly dependent on the situation. Obviously when you are dealing with software that has a great deal of cache access, you are going to have performance loss using an in-order design, but this does not include every integer-heavy application, especially one custom written for the hardware at a low level the way an emulator is.
The first half is true; the second half came from someplace not credible. Get a link first if you're gonna make wild claims about how emulators don't fall under IOE slowdowns.
Face it, backwards compatibility is, if not impossible, highly unrealistic, barring adding a real x86 chip (plus the proprietary media/IO chip in the Xbox1, but that's another story).