During Nvidia’s GTC presentation, CEO Jensen Huang revealed the first GPUs of the Ada Lovelace generation (RTX 40).
An introduction to Ada Lovelace
Setting specific GPUs aside, Ada Lovelace packs on average 70 percent more CUDA cores into the same die area than Ampere, with a more efficient architecture for the compute clusters. As previously hinted, Nvidia is building the new GPUs on TSMC’s 4N process (a 5-nanometer-class node).
Because the new streaming multiprocessors can now dynamically reorder their shading work (‘Shader Execution Reordering’, or SER), Nvidia claims to achieve up to twice the power efficiency. GPUs benefit from this streamlined throughput of complex calculations especially when handling ray tracing: SER automatically optimizes how such work is delivered to the graphics processor.
The new ray tracing clusters feature Nvidia’s third generation of RT cores, with twice the performance in select tasks surrounding light reflection. The matrix-based Tensor cores are also upgraded to their fourth generation; the format has been borrowed from Nvidia’s Hopper processor. The heaviest Ada Lovelace chips can potentially deliver up to 1,300 teraflops in Tensor operations.
Roughly speaking, this means that Ada Lovelace should be twice as fast as Ampere in rasterized applications and four times as fast in ray tracing. That performance was measured at roughly the same wattages, which is why Huang describes the new generation as “incredibly high-efficiency”. In contrast, nothing at all was said about the maximum power draw (TDP or TGP) of the first Ada Lovelace cards.
As expected, the successor to Ampere once again kicks off at the high end of the video card market. In the previous generation, the launch lineup immediately included a GeForce RTX 3070; this time, the RTX xx70 model seems to be taking longer. Select rumors about questionable RTX 4070 specs already predicted something similar.
GeForce RTX 4090
The provisional flagship of the Ada Lovelace generation is again an RTX xx90 model, this time the RTX 4090. The GPU should be three to four times as powerful as the previously released RTX 3090 Ti, while both cards have the same 24 GB of GDDR6X memory (21 Gbps, 384-bit).
The RTX 4090 runs on Nvidia’s heaviest AD102 GPU, with 16,384 CUDA cores at 2,520 MHz. Huang indicated that the GPU is easy to overclock to above 3.0 GHz. Huang himself did not say a word about power consumption, but previous leaks claim a minimum TGP of 450 watts, with peaks of up to approximately 660 watts.
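As a rough sanity check on those figures, a theoretical peak FP32 throughput can be estimated from the core count and the clock speed. The sketch below assumes that each CUDA core completes one fused multiply-add per clock, counted as two floating-point operations, which is the usual convention for such estimates and not a figure from the presentation. The function name `peak_fp32_tflops` is made up for this illustration, and real-world performance will be lower than this upper bound.

```python
def peak_fp32_tflops(cuda_cores: int, clock_mhz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in teraflops.

    Assumption: one fused multiply-add (2 floating-point operations)
    per CUDA core per clock cycle.
    """
    flops = cuda_cores * clock_mhz * 1e6 * flops_per_core_per_clock
    return flops / 1e12


# Figures quoted in the article for the RTX 4090.
rtx_4090_stock = peak_fp32_tflops(16_384, 2_520)
# The overclock mentioned by Huang (3.0 GHz), as a what-if.
rtx_4090_oc = peak_fp32_tflops(16_384, 3_000)

print(f"RTX 4090 at 2,520 MHz: about {rtx_4090_stock:.1f} TFLOPS")
print(f"RTX 4090 at 3,000 MHz: about {rtx_4090_oc:.1f} TFLOPS")
```

Under these assumptions, the estimate works out to roughly 83 teraflops at the quoted clock speed and roughly 98 teraflops at 3.0 GHz.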
The RTX 4090 will be officially released on October 12, with a suggested retail price of 1,949 euros.
GeForce RTX 4080 (12 GB) and RTX 4080 (16 GB)
The GeForce RTX 4080 is split this generation (once again) into two different models. The standard model has 12 GB of GDDR6X (21 Gbps, 192-bit), while a more luxurious edition comes with 16 GB of GDDR6X (22.5 Gbps, 256-bit). Both cards should be roughly two to four times as powerful as the RTX 3080 Ti from early this year.
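The memory figures in parentheses translate directly into peak memory bandwidth: the per-pin data rate multiplied by the bus width, divided by eight bits per byte. The following minimal sketch applies that formula to the three announced cards; the function and variable names are invented for this example, and the results are theoretical peaks, not measured values.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s from per-pin data rate and bus width."""
    return data_rate_gbps * bus_width_bits / 8  # 8 bits per byte


# (data rate in Gbps, bus width in bits), as quoted in the article.
cards = {
    "RTX 4090 (24 GB)": (21.0, 384),
    "RTX 4080 (16 GB)": (22.5, 256),
    "RTX 4080 (12 GB)": (21.0, 192),
}

bandwidth = {name: memory_bandwidth_gb_s(rate, width)
             for name, (rate, width) in cards.items()}

for name, gb_s in bandwidth.items():
    print(f"{name}: {gb_s:.0f} GB/s")
```

By this estimate, the 12 GB model ends up with half the memory bandwidth of the RTX 4090, and about 70 percent of that of its 16 GB sibling.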
The two variants are less alike than previously thought. The heavier model runs on the AD103-300 GPU; the lighter one makes do with the AD104-400. That also means different numbers of streaming multiprocessors (76 versus 60) and CUDA cores (9,728 versus 7,680).
In this case too, Huang did not say anything about the alleged consumption of the cards, although recent leaks point to standard TGPs of 320 watts and 285 watts, considerably less than some rumors previously suggested.
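The two pairs of numbers are consistent with each other. Assuming 128 CUDA cores per streaming multiprocessor (a ratio not stated above, but one that matches both models), the core counts follow directly from the multiprocessor counts. A small sketch, with names chosen purely for illustration:

```python
CORES_PER_SM = 128  # assumption: CUDA cores per streaming multiprocessor


def cuda_cores(sm_count: int) -> int:
    """CUDA core count implied by a number of streaming multiprocessors."""
    return sm_count * CORES_PER_SM


rtx_4080_16gb = cuda_cores(76)
rtx_4080_12gb = cuda_cores(60)

# Working backwards from the 16,384 cores quoted for the RTX 4090.
rtx_4090_sms = 16_384 // CORES_PER_SM

print(rtx_4080_16gb, rtx_4080_12gb, rtx_4090_sms)
```

The same ratio implies 128 streaming multiprocessors for the RTX 4090.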
The RTX 4080 (16 GB) has a suggested retail price of 1,469 euros; the price of the lighter 12 GB model starts from 1,099 euros. Both models should be available by mid-November 2022.
DLSS 3.0 and AV1 Encoding
As always, Nvidia also unveiled new technologies to showcase the new architecture. The new optical flow accelerators in the Tensor cores help make smart upscaling just that little bit smoother, pushing Nvidia’s Deep Learning Super Sampling (DLSS) into a third generation.
DLSS 3.0 promises to generate up to three times as many frames at upscaled 4K resolution as at native 4K in select games. That also applies in combination with Nvidia’s own ray tracing. For example, Microsoft Flight Simulator reaches frame rates of over 110 fps with ‘RTX On’, compared to 54 fps in native 4K without ray tracing.
In addition to a new DLSS, Ada Lovelace also introduces hardware encoding for the new AV1 standard. Using the new Nvidia GPUs, video files and streams can be encoded in AV1, with higher image quality at smaller file sizes (than, for example, H.265). Nvidia is thus following Intel, which already included AV1 encoding in its first Intel Arc GPUs.
RTX Remix, free ray tracing DLC and more
Another new technology is RTX Remix, which allows mod makers to enrich old games (using USD captures) relatively easily with ray tracing, AI-driven upscalers for textures and other new effects. The tool will be offered in Nvidia’s Omniverse (known from the Ampere reveal) shortly after Ada Lovelace launches.
As an example of the new Omniverse capabilities, Nvidia showed Portal RTX, a mod for the original PC version of Portal that bakes ray tracing right into the classic game. Portal RTX will be released in November as free downloadable content for gamers who already own the game. Ada Lovelace cards are probably not required to run the mod.
Apart from the game-focused announcements, Nvidia also seems to be focusing this generation on AI, content creation, the metaverse and robotics. The bulk of the GTC presentation took a closer look at how Nvidia’s most powerful GPUs accelerate everything from augmented reality for surgeons to self-driving cars powered by Nvidia’s new Thor processor, which combines Nvidia’s own Grace, Hopper, and Ada architectures.