Tesla platforms

Built primarily for GPGPU and stream processing, Tesla GPUs were first launched in 2007 on the Tesla microarchitecture, named after the electrical engineer and inventor Nikola Tesla. Although the Tesla C-class GPUs shipped with dual-link DVI outputs, most Tesla cards have no display outputs at all.

The Tesla V100 is based on the NVIDIA Volta architecture. It comes in 16 GB and 32 GB HBM2 memory versions, with a 1,380 MHz boost clock, 640 tensor cores, and 5,120 CUDA cores.

Because Tesla GPUs are built purely to accelerate computational applications, they are most suitable when there are no display requirements and computation is the only priority. Tesla GPUs are therefore aimed squarely at supercomputing.

They are also power efficient and generate little heat relative to the work they do, which makes them well suited to industry-level deployment.

Tesla GPUs can contribute immensely to simulations and large-scale calculations, especially floating-point calculations, where double precision can be a huge advantage. Apart from pure computation, high-end image generation for professional and scientific applications is also possible with CUDA or OpenCL.

Tesla and Quadro GPUs generally offer higher double-precision throughput than consumer-level GeForce GPUs. Double precision makes mathematical computations much more accurate than single precision. GeForce's Titan GPUs also meet this requirement when scientific computing and machine learning matter most. However, many other features are found only in NVIDIA's professional Quadro and Tesla series.
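To make the accuracy difference concrete, here is a minimal CPU-side sketch that needs no GPU. It adds 0.1 to a running total one million times, once with every intermediate result rounded to single precision and once in double precision. The helper names `to_float32` and `accumulate` are illustrative and do not come from any GPU library.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times.

    When `single` is True, the step and every partial sum are rounded to
    single precision, emulating an FP32 accumulator.
    """
    if single:
        step = to_float32(step)
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_float32(total)
    return total


n = 1_000_000
exact = n / 10  # the mathematically exact sum of one million 0.1 steps

err32 = abs(accumulate(0.1, n, single=True) - exact)
err64 = abs(accumulate(0.1, n, single=False) - exact)

print(f"FP32 accumulated error: {err32:.6g}")
print(f"FP64 accumulated error: {err64:.6g}")
```

The single-precision total drifts visibly from 100,000, while the double-precision total stays within a tiny fraction of a unit. Long-running simulations accumulate rounding error in the same way, which is why FP64 throughput matters for scientific workloads.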

To understand the three product lines better, let's compare the specifications of the three GPU models discussed previously.

First, let's see how the Titan V differs from the Quadro RTX 8000 and Tesla V100:

| Specification | GeForce Titan V | Quadro RTX 8000 | Tesla V100 |
| --- | --- | --- | --- |
| Microarchitecture | Volta | Turing | Volta |
| Memory size | 12 GB | 48 GB | 16 GB |
| Memory type | HBM2 | GDDR6 | HBM2 |
| Memory bus | 3,072-bit | 384-bit | 4,096-bit |
| Bandwidth | 652.8 GB/s | 672.0 GB/s | 897.0 GB/s |
| NVLink support | No | Yes | Yes |
| GPUDirect support | Limited | Full | Full |

| Metric | GeForce Titan V | Quadro RTX 8000 | Tesla V100 |
| --- | --- | --- | --- |
| Pixel rate | ~139.7 GPixel/s | ~169.9 GPixel/s | ~176.6 GPixel/s |
| Texture rate | ~465.6 GTexel/s | ~509.8 GTexel/s | ~441.6 GTexel/s |
| FP16 (half) performance | ~27.5 TFLOPS | ~32.6 TFLOPS | ~31.4 TFLOPS |
| FP32 (float) performance | ~14.8 TFLOPS | ~16.3 TFLOPS | ~14.1 TFLOPS |
| FP64 (double) performance | Up to 6.875 TFLOPS | ~0.5 TFLOPS | ~7.8 TFLOPS |
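The FP32 figures above can be approximated from first principles. The sketch below uses the common rule of thumb that each CUDA core completes one fused multiply-add (two floating-point operations) per clock at the boost clock. The function name `peak_fp32_tflops` is illustrative. Only the V100's core count (5,120) is stated in this section; the Titan V and Quadro RTX 8000 core counts are assumptions taken from their public specifications.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes one fused multiply-add (2 FLOPs) per CUDA core per clock.
    """
    return 2 * cuda_cores * boost_mhz * 1e6 / 1e12


# name -> (CUDA cores, boost clock in MHz)
# The 5,120 and 4,608 counts for the Titan V and Quadro RTX 8000 are
# assumptions from public specifications, not values given in this section.
cards = {
    "GeForce Titan V": (5120, 1455),
    "Quadro RTX 8000": (4608, 1770),
    "Tesla V100": (5120, 1380),
}

for name, (cores, mhz) in cards.items():
    print(f"{name}: {peak_fp32_tflops(cores, mhz):.1f} TFLOPS")
```

The results land within rounding of the tabulated FP32 values. These are theoretical peaks, so real workloads will usually achieve less.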

| Clock | GeForce Titan V | Quadro RTX 8000 | Tesla V100 |
| --- | --- | --- | --- |
| GPU clock | 1,200 MHz | 1,395 MHz | 1,246 MHz |
| Boost clock | 1,455 MHz | 1,770 MHz | 1,380 MHz |
| Memory clock | 850 MHz | 1,750 MHz | 876 MHz |
| Effective memory clock | 1,700 MHz | 14,000 MHz | 1,752 MHz |
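The bandwidth figures quoted earlier follow directly from the memory bus width and the effective memory clock: the bus moves (bus width / 8) bytes per transfer, and the effective clock gives the number of transfers per second. A minimal sketch of that arithmetic, with the illustrative function name `memory_bandwidth_gbs`:

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz * 1e6 / 1e9


# name -> (memory bus width in bits, effective memory clock in MHz)
cards = {
    "GeForce Titan V": (3072, 1700),
    "Quadro RTX 8000": (384, 14000),
    "Tesla V100": (4096, 1752),
}

for name, (bus_bits, clock_mhz) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(bus_bits, clock_mhz):.1f} GB/s")
```

This reproduces the 652.8, 672.0, and 897.0 GB/s values. It also shows why the Quadro RTX 8000 keeps pace despite its much narrower bus: GDDR6 runs at a far higher effective clock than HBM2.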

| Metric | GeForce Titan V | Quadro RTX 8000 | Tesla V100 |
| --- | --- | --- | --- |
| Tensor TFLOPS | 110 | 130.5 | 125 |

GPUs without tensor cores can instead be compared by their conventional floating-point operations per second (FLOPS) to gauge their maximum DL performance.

The Tesla V100 comes in four different variants:

| Variant | Double-precision performance (FP64) | Tensor TFLOPS |
| --- | --- | --- |
| Tesla V100 PCI-E 16GB | 7 TFLOPS | 112 |
| Tesla V100 PCI-E 32GB | 7 TFLOPS | 112 |
| Tesla V100 SXM 16GB | 7.8 TFLOPS | 125 |
| Tesla V100 SXM 32GB | 7.8 TFLOPS | 125 |
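The gap between the 112 and 125 tensor TFLOPS figures can be traced to clock speed. The sketch below rests on two assumptions that are not stated in this section: each Volta tensor core performs 64 fused multiply-adds per clock (as NVIDIA describes for Volta), and the SXM module boosts to 1,530 MHz. The 640 tensor cores and the 1,380 MHz boost clock come from the text above. The function name `tensor_tflops` is illustrative.

```python
def tensor_tflops(tensor_cores: int, clock_mhz: float, fma_per_clock: int = 64) -> float:
    """Theoretical peak tensor throughput in TFLOPS.

    Each fused multiply-add counts as 2 floating-point operations.
    `fma_per_clock=64` is an assumption based on NVIDIA's description of Volta.
    """
    return tensor_cores * fma_per_clock * 2 * clock_mhz * 1e6 / 1e12


# 640 tensor cores and the 1,380 MHz boost clock are taken from the text.
pcie = tensor_tflops(640, 1380)

# 1,530 MHz is an assumed SXM boost clock, not a value given in this section.
sxm = tensor_tflops(640, 1530)

print(f"PCI-E estimate: {pcie:.1f} TFLOPS")
print(f"SXM estimate:   {sxm:.1f} TFLOPS")
```

Under these assumptions, the estimates come out close to the 112 and 125 figures listed for the PCI-E and SXM variants, which suggests that the two form factors differ mainly in sustained clock speed rather than in the number of tensor cores.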