Back in April, graphics specialist NVIDIA (NASDAQ:NVDA) introduced a chip targeted at data centers called the GP100. GP100 packs 3,840 CUDA cores (though only 3,584 are enabled in the first commercial instantiation of GP100, known as the Tesla P100) and is capable of 10.6 teraflops of single-precision floating-point performance and 5.3 teraflops of double-precision performance.
On July 21, NVIDIA announced another chip based on the Pascal architecture with 3,840 CUDA cores, known as GP102. GP102, unlike GP100, doesn't include much in the way of specialized double-precision circuitry, uses GDDR5X memory instead of HBM2 (stacked memory), and likely does not include any of the circuitry that enables connectivity via NVIDIA's proprietary NVLink interconnect.
GP102 is finding its way into a number of products. It powers the recently announced prosumer NVIDIA Titan X, as well as the recently announced Quadro P6000 targeted at professional visualization applications.
I have seen some speculation across the web suggesting that after deploying the GP102 to handle Titan duty, NVIDIA will then bring GP100 out as a gaming part.
Here's why this is extremely unlikely.
For gaming, GP102 is going to be faster
NVIDIA's Titan series of graphics cards is targeted at both ultra-enthusiast gamers and prosumers (folks who produce and consume media) looking for lots of single-precision computing power on the cheap.
For these applications, particularly gaming, a GP102-based product is going to be a superior choice to a GP100-based product. How do we know this? Easy.
NVIDIA actually announced a few add-in boards under its Tesla branding based on the GP100 chip. According to the company, the add-in-board version of GP100, which is limited to 250 watts' worth of power consumption, delivers only 9.3 teraflops of single-precision performance.
In that same power envelope, the GP102-based Titan X can deliver approximately 11 teraflops of single-precision performance.
Since games don't benefit from double-precision compute performance -- just single precision -- the GP102-based Titan X would likely be faster than a GP100-based Titan X for the applications that Titan buyers are going to care about.
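To put rough numbers on that comparison, here is a minimal Python sketch. It assumes the common rule of thumb that peak single-precision throughput equals enabled cores times two floating-point operations per clock (one fused multiply-add) times clock speed. The 3,584-core count for the Titan X comes from NVIDIA's published specifications rather than from the figures quoted above, and the function name is purely illustrative.

```python
# Rule-of-thumb model (an assumption, not NVIDIA's own methodology):
#   peak FLOPS = enabled cores * FLOPs per core per clock * clock rate
# Each CUDA core is assumed to retire one fused multiply-add per clock,
# which is counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CLOCK = 2


def implied_clock_ghz(teraflops: float, cuda_cores: int) -> float:
    """Clock rate in GHz that would yield the quoted peak teraflops."""
    flops = teraflops * 1e12
    return flops / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK) / 1e9


# Quoted peak single-precision figures; all three parts have 3,584 enabled cores.
parts = {
    "Tesla P100 (first version)": (10.6, 3584),
    "Tesla P100 add-in board, 250 W": (9.3, 3584),
    "Titan X (GP102), 250 W": (11.0, 3584),  # core count per NVIDIA's specs
}

for name, (tflops, cores) in parts.items():
    print(f"{name}: ~{implied_clock_ghz(tflops, cores):.2f} GHz implied clock")

# With equal core counts, the throughput ratio is just the clock ratio.
titan_x_advantage = 11.0 / 9.3 - 1
print(f"Titan X advantage at 250 W: ~{titan_x_advantage:.0%}")
```

Under these assumptions, the GP102-based Titan X works out to roughly 18% more peak single-precision throughput than the 250-watt GP100 board, which implies it sustains higher clocks in the same power envelope.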
A fully enabled GP102 might come later, but...
There is some speculation, in light of NVIDIA's previous product launches, that NVIDIA might later release a fully enabled version of GP102 as a next-generation Titan. This wouldn't be too farfetched at first glance, as NVIDIA did something similar in 2013/2014 with the original Titan and then its successor, the Titan Black.
However, I don't think the performance improvement that NVIDIA will be able to deliver with the additional CUDA cores enabled will actually be all that much. Recall that the original Titan X was based on a fully enabled version of a chip known as GM200. Later on, the company introduced a product known as GTX 980 Ti, which had roughly 8% of its CUDA cores disabled (a similar proportion to the new Titan X versus a full implementation of GP102).
Hardware review site AnandTech found that the 980 Ti was just 3% slower than the original Titan X on average. AnandTech's testing showed that, given a fixed power envelope, the 980 Ti with fewer CUDA cores ran at higher frequencies (higher frequencies for a given architecture mean more performance) than the original Titan X, which helped to erase much of the performance deficit that the disabled CUDA cores on the 980 Ti should have caused.
In light of this, I don't think a fully enabled GP102 would be a worthwhile Titan-class product.
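The arithmetic behind that view can be sketched in a few lines of Python. This is a deliberately simplified model that assumes gaming performance scales with enabled cores times average clock speed; real games rarely scale that cleanly. The core counts are taken from NVIDIA's published specifications rather than from the article, and the 3% figure is the AnandTech result cited above.

```python
# Simplified model (an assumption): performance ~ enabled cores * average clock.
# Core counts below are NVIDIA's published specs, not figures from the article.
GM200_FULL_CORES = 3072       # original (Maxwell) Titan X
GTX_980_TI_CORES = 2816
GP102_FULL_CORES = 3840
TITAN_X_PASCAL_CORES = 3584

# "Just 3% slower" on average, per the AnandTech testing cited in the article.
MEASURED_980_TI_RELATIVE_PERF = 0.97


def clock_uplift(core_ratio: float, perf_ratio: float) -> float:
    """Relative average clock the cut-down part needs to reach perf_ratio."""
    return perf_ratio / core_ratio


core_ratio = GTX_980_TI_CORES / GM200_FULL_CORES
uplift = clock_uplift(core_ratio, MEASURED_980_TI_RELATIVE_PERF)

# Best case for a fully enabled GP102: clocks do not drop at all.
max_gain = GP102_FULL_CORES / TITAN_X_PASCAL_CORES - 1

# Gain if the Maxwell pattern repeats and the full chip ends up ~3% ahead.
repeat_gain = 1 / MEASURED_980_TI_RELATIVE_PERF - 1

print(f"980 Ti cores enabled vs. full GM200: {core_ratio:.1%}")
print(f"Implied 980 Ti average clock advantage: {uplift - 1:.1%}")
print(f"Full GP102, best case: +{max_gain:.1%}")
print(f"Full GP102, if Maxwell pattern repeats: +{repeat_gain:.1%}")
```

Under this model, the best case for a fully enabled GP102 is a gain of about 7%, and a repeat of the 980 Ti versus Titan X pattern would leave closer to 3%.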
For NVIDIA to get a big performance boost from here, the graphics specialist is going to need to release a next-generation graphics architecture -- Volta.