It’s been a while between drinks for DirectX, with the latest release, 11, coming out some 6 years ago. This can be partly attributed to the consolization of PC games putting a damper on the demand for new features, but the biggest factor was Vista’s exclusivity on DirectX 10, which ensured the vast majority of gamers simply didn’t have access to it. Now that the majority of the gaming crowd has caught up and DirectX 11 titles abound, demand for a new graphics pipeline that can make the most of new hardware has started to ramp up, and Microsoft looks ready to deliver on that with DirectX 12. Hot on its heels, however, is Vulkan, the new OpenGL standard that grew out of AMD’s Mantle API, which is shaping up to be a solid competitor.
Underpinning both of these new technologies is a desire to get the API out of the way of game developers by getting them as close to the hardware as possible. Indeed, if you look at the marketing blurb for either DirectX 12 or Vulkan, it’s clear that both want to position their new technology as lightweight, giving developers access to more of the graphical power than they would have had previously. The synthetic benchmarks that are making the rounds seem to confirm this, showing far less time spent sending jobs to the GPUs and thus eking out more performance from the same piece of hardware. However, the one feature that’s really intrigued me, and pretty much everyone else, is the possibility of these new APIs allowing SLI- or CrossFire-like functionality to work across different GPUs, even different brands.
The technology to do this is called Split Frame Rendering (SFR), an alternative way of combining graphics cards. The traditional way of doing SLI/CrossFire is called Alternate Frame Rendering (AFR), which sends odd frames to one card and even frames to the other. This is what necessitates the cards being identical, and it’s the reason why you don’t get a 100% performance boost from using 2 cards. SFR, on the other hand, makes both of the GPUs work in tandem, breaking a scene up into 2 halves and sending one half to each of the graphics cards. Such technology is already available for gamers who have AMD cards in games that make use of the Mantle API, with titles like Civilization: Beyond Earth supporting SFR.
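The difference between the two scheduling strategies described above can be sketched in a few lines. This is purely an illustrative model, not any real driver or engine API; the function names and the representation of frames and GPUs as plain Python values are my own invention.

```python
# Toy model: AFR alternates whole frames between GPUs, while SFR splits
# every frame and gives one part of it to each GPU.

def alternate_frame_rendering(frames, gpus):
    """AFR: frame i is rendered entirely by GPU (i mod number-of-GPUs)."""
    schedule = []
    for i, frame in enumerate(frames):
        schedule.append((gpus[i % len(gpus)], frame))  # one GPU owns the whole frame
    return schedule

def split_frame_rendering(frames, gpus):
    """SFR: every frame is divided and each GPU renders one slice of it."""
    schedule = []
    for frame in frames:
        for part, gpu in enumerate(gpus):
            schedule.append((gpu, (frame, part)))  # all GPUs cooperate on each frame
    return schedule
```

The sketch also hints at why AFR requires identical cards: each GPU must produce complete frames at the same cadence, whereas under SFR a slower card could in principle be handed a smaller slice of each frame.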
For Vulkan and DirectX 12 this technology could be used to send partial frames to 2 distinct types of GPUs, negating the need for special drivers or bridges in order to divvy up frames between them. Of course this then puts the onus on the game developer (or the engine that’s built on top of these APIs) to build in support for this, rather than it sitting with the GPU vendor to develop a solution. I don’t think it will be long before we see the leading game engines support SFR natively, at which point numerous titles would be able to take advantage of this technology without major updates. This is still speculative at this point, however, and we may end up with similar restrictions around SFR to those we currently have for AFR.
There are dozens more features set to come out with this new set of APIs and, whilst we won’t see the results of them for some time to come, the possibilities they open up are quite exciting. I can definitely recall the marked jump in graphical fidelity between DirectX 10 and 11 titles, so hopefully 12 does the same thing when it graces our PCs. I’m interested to see how Vulkan goes: since it’s grown out of the Mantle API, which showed some very significant performance gains for AMD cards that used it, there’s every chance it’ll be able to deliver on the promises it’s making. It really harks back to the old days, when wars between supporters of OpenGL and DirectX were as fervent as those between vi and emacs users.
We all know that vi and DirectX are the superior platform, of course.
The story of AMD’s rise to glory on the back of Intel’s failures is well known. Intel, filled with the hubris that can only come from maintaining a dominant market position as long as they had, thought that the world could be brought into the 64bit era on the back of their brand new platform: Itanium. The cost of adopting this platform was high, however, as it made no attempt to be backwards compatible, forcing you to revamp your entire software stack to take advantage of it (the benefits of which were highly questionable). AMD, seeing the writing on the wall, instead developed their x86-64 architecture, which not only delivered 64bit capability but even went as far as to outclass then current generation Intel processors in 32bit performance. It was then an uphill battle for Intel to play catchup with AMD, but the past few years have seen Intel dominate AMD in almost every metric, with the one exception of performance per dollar at the low end.
That could be set to change however with AMD announcing their new processors, dubbed Kaveri:
On the surface Kaveri doesn’t seem too different from the regular processors you’ll see on the market today, sporting an on-die graphics card alongside the core compute units. As the above picture shows, however, the amount of on-die space dedicated to said GPU is far more than on any other chip currently on the market, and the transistor count, a cool 2.1 billion, is a testament to this. Beyond that it starts to look more and more like a traditional quad core CPU with an integrated graphics chip, something few would get excited about, but the real power of AMD’s new Kaveri chips comes from the architectural changes that underpin this insanely complex piece of silicon.
The integration of GPUs onto CPUs has been the standard for some years now, with 90% of chips being shipped with an on-die graphics processor. For all intents and purposes the distinction between them and discrete units is their location within the computer, as they’re essentially identical at the functional level. There are some advantages gained from being so close to the CPU (usually to do with the latency that’s eliminated by not having to communicate over the PCIe bus) but they’re still typically inferior due to the limited die space that can be dedicated to them. This was especially true of generations previous to the current one, which weren’t much better than the integrated graphics cards that shipped with many motherboards.
Kaveri, however, brings with it something that no other CPU has managed before: a unified memory architecture.
Under the hood of every computer is a whole cornucopia of different styles of memory, each with its own specific purpose. Traditionally the GPU and CPU each have their own discrete pool of memory: the CPU with its own bank of RAM (which is typically what people refer to) and the GPU with similar. Integrated graphics would typically take advantage of the system RAM, reserving a section of it for its own use. In Kaveri the distinction between the CPU’s and GPU’s memory is gone, replaced by a unified view where either processing unit is able to access the other’s. This might not sound particularly impressive, but it’s by far one of the biggest changes to come to computing in recent memory and AMD is undoubtedly the pioneer in this realm.
A GPU’s power comes from its ability to rapidly process highly parallelizable tasks, such as rendering or number crunching. Traditionally, however, GPUs are constrained by how fast they can talk to the more general purpose CPU, which is responsible for giving them tasks and interpreting the results. Such activities usually involve costly copy operations that flow through slow interconnects in your PC, drastically reducing the effectiveness of a GPU’s power. Kaveri CPUs, on the other hand, suffer from no such limitations, allowing for seamless communication between the GPU and the CPU and enabling them both to perform tasks and share results without the traditional overhead.
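The copy overhead described above can be illustrated with a toy model. The functions and copy counts below are hypothetical stand-ins for illustration only; they model the data movement pattern, not any real HSA, OpenCL, or driver API.

```python
# Toy model of the difference between a discrete GPU and unified memory.
# On a discrete card, data must cross the PCIe bus twice per job; on a
# unified-memory chip like Kaveri, CPU and GPU share one buffer in place.

def discrete_gpu_sum(data):
    """Discrete GPU: copy in, compute, copy out."""
    gpu_buffer = list(data)      # copy host -> device over the PCIe bus
    result = sum(gpu_buffer)     # device does the (parallel) work
    host_result = result         # copy device -> host so the CPU can use it
    return host_result, 2        # value, plus the number of copies incurred

def unified_memory_sum(data):
    """Unified memory: the GPU reads the CPU's buffer directly."""
    result = sum(data)           # zero-copy: both units see the same memory
    return result, 0             # value, with no transfers needed
```

Both paths compute the same answer; the difference is the two bus transfers the discrete path pays per job, which is exactly the overhead a unified memory architecture removes.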
The one caveat at this point, however, is that software needs to be explicitly coded to take advantage of this unified architecture. AMD is working extremely hard to get low level tools to support this, meaning that programs should eventually be able to take advantage of it without much hassle, but it does mean that the Kaveri hardware is arriving long before the software will be able to make use of it. It’s sounding a lot like an Itanium moment here, for sure, but as long as AMD makes good on their promises of working with tools developers (whilst retaining the required backwards compatibility) this has the potential to be another coup for AMD.
If the results from the commercial units are anything to go by then Kaveri looks very promising. Sure it’s not a performance powerhouse but it certainly holds its own against the competition and I’m sure once the tools catch up you’ll start to see benchmarks demonstrating the power of a unified memory architecture. That may be a year or two out from now but rest assured this is likely the future for computing and every other chip manufacturer in the world will be rushing to replicate what AMD has created here.
Ever since the first console was released they have always been at arm’s length from the greater world of computing. Initially this was just a difference in inputs, as consoles were primarily games machines and thus did not require a fully fledged keyboard, but over time they grew into purpose built systems. This is something of a double-edged sword: whilst a tightly controlled hardware platform allows developers to code against a set of specifications, it also usually meant that every platform was unique, imposing a learning curve on developers every time a new system came out. Sony was particularly guilty of this as the PlayStation 2 and 3 were both notoriously difficult to code for; the latter especially given its unique combination of linear coprocessors and giant non-linear unit.
There was no real indication that this trend was going to stop either, as all of the current generation of consoles use some non-standard variant of a comparably esoteric processor. Indeed the only console in recent memory to attempt to use a more standard processor, the original Xbox, was succeeded by the PowerPC driven Xbox360, which would make you think that the current industry standard of x86 processors just wasn’t suited to the console environment. Taking into account that the WiiU came out with a PowerPC CPU, it seemed logical that the next generation would continue this trend, but it seems there’s a sea change on the horizon.
Early last year rumours started circulating that the next generation PlayStation, codenamed Orbis, was going to be sporting an x86 based processor but that the next generation Xbox, Durango, was most likely going to be continuing with a PowerPC CPU. As it turns out this isn’t the case and Durango will in fact be sporting an x86 processor (well, if you want to be pedantic, it’s x86-64, or x64). This means that it’s highly likely that code built on the Windows platform will be portable to Durango, making the Xbox the launchpad for the final screen in Microsoft’s Three Screens idea. It also means that nearly all major gaming platforms will share the same coding base, which should make cross platform releases far easier than they have been.
News just in also reveals the specifications of the PlayStation 4 confirming the x86 rumours. It also brings with it some rather interesting news: AMD is looking to be the CPU/GPU manufacturer of choice for the next generation of consoles.
There’s no denying that AMD has had a rough couple of years, with their most recent quarter posting a net loss of $473 million. It’s not unique to them either, as Intel has been dealing with sliding revenue figures as the mobile sector heats up and demand for ARM based processors, which neither of the 2 big chip manufacturers provide, skyrockets. Indeed Intel has stated several times that they’re shifting their strategy to try and capture that sector of the market, with their most recent announcement being that they won’t be building motherboards any more. AMD seems to have lucked out in securing the CPU for the Orbis (and, whilst I can’t find a definitive source, it looks like their processor will be in Durango too) and the GPU for both of them, which will guarantee them a steady stream of income for quite a while to come. Whether or not this will be enough to reinvigorate the chip giant remains to be seen but there’s no denying that it’s a big win for them.
The end result, I believe, will be an extremely fast maturation of the development frameworks available for the next generation of consoles thanks to their x86 base. What this means is that we’re likely to see titles making the most of the hardware much sooner than we have on other platforms, thanks to the ubiquity of their underlying architecture. This will be both a blessing and a curse: whilst the first couple of years will see some really impressive titles, past that point there might not be a whole lot of room left for optimization. This is ignoring the GPU, of course, where there always seem to be better ways of doing things, but it will be quickly outpaced by its newer brethren. Combine this with the availability of the SteamBox and we could see PCs making a comeback as the gaming platform of choice once the consoles start showing their age.