Tesla platforms

Built primarily for GPGPU and stream processing, Tesla GPUs were first launched in 2007 on the Tesla microarchitecture, which is named after the electrical engineer and inventor Nikola Tesla. Although the Tesla C-class GPUs were released with dual-link DVI outputs, most Tesla cards have no display output ports at all.

The Tesla V100 is based on the NVIDIA Volta architecture. It is available with 16 GB or 32 GB of HBM2 memory and has a 1,380 MHz boost clock, 640 tensor cores, and 5,120 CUDA cores.
Because Tesla GPUs are built purely to accelerate computational applications, they are the best fit when there are no display requirements and computational tasks are the only priority. They are therefore aimed specifically at supercomputing.

They deliver high performance per watt and so generate less heat for a given workload, which makes them well suited to industry-level deployment.

Tesla GPUs are well suited to simulations and large-scale calculations, especially floating-point calculations, where their double-precision throughput can be a major advantage. Apart from pure computation, high-end image generation for professional and scientific applications is also possible with CUDA or OpenCL.

Tesla and Quadro GPUs generally offer higher double-precision throughput than consumer-level GeForce GPUs, although this varies by model. Double precision (FP64) carries roughly 15 to 16 significant decimal digits, compared with about 7 for single precision (FP32), so mathematical computations are much more accurate. The Titan GPUs also meet this requirement for scientific computing and machine learning workloads. However, many other features are available only on NVIDIA's professional Quadro and Tesla series.
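The accuracy gap between single and double precision is easy to see with a small accumulation test. The following is a minimal sketch that runs on the CPU, not on a GPU: it emulates FP32 by rounding every intermediate result through `ctypes.c_float`, and the helper names (`to_float32`, `accumulate`) are illustrative assumptions rather than part of any NVIDIA API.

```python
import ctypes


def to_float32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return ctypes.c_float(x).value


def accumulate(step: float, count: int, single: bool) -> float:
    """Add `step` to a running total `count` times.

    When `single` is True, the running total is rounded to FP32 after
    every addition, which mimics single-precision arithmetic.
    """
    total = 0.0
    for _ in range(count):
        total += step
        if single:
            total = to_float32(total)
    return total


N = 1_000_000
STEP = 0.1
EXACT = 100_000.0  # mathematically exact value of N * STEP

fp64_total = accumulate(STEP, N, single=False)
fp32_total = accumulate(to_float32(STEP), N, single=True)

fp64_error = abs(fp64_total - EXACT)
fp32_error = abs(fp32_total - EXACT)

print(f"FP64 sum: {fp64_total!r} (error {fp64_error:.3e})")
print(f"FP32 sum: {fp32_total!r} (error {fp32_error:.3e})")
```

The single-precision total drifts far from the exact value, while the double-precision total stays very close to it. Long-running simulations perform billions of such accumulations, which is why FP64 throughput matters for scientific workloads.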

To get a deeper understanding of the three product lines, let's compare the specifications of the three GPU models discussed so far.

First, let's see how the Titan V differs from the Quadro RTX 8000 and Tesla V100:

Memory and features comparison:

	GeForce Titan V	Quadro RTX 8000	Tesla V100
Microarchitecture	Volta	Turing	Volta
Memory size	12 GB	48 GB	16 GB
Memory type	HBM2	GDDR6	HBM2
Memory bus	3,072-bit	384-bit	4,096-bit
Bandwidth	652.8 GB/s	672.0 GB/s	897.0 GB/s
NVLink support	No	Yes	Yes
GPUDirect support	Limited	Full	Full

Performance comparison:

	GeForce Titan V	Quadro RTX 8000	Tesla V100
Pixel Rate	~139.7 GPixel/s	~169.9 GPixel/s	~176.6 GPixel/s
Texture rate	~465.6 GTexel/s	~509.8 GTexel/s	~441.6 GTexel/s
FP16 (half) performance	~27.5 TFLOPS	~32.6 TFLOPS	~31.4 TFLOPS
FP32 (float) performance	~14.8 TFLOPS	~16.3 TFLOPS	~14.1 TFLOPS
FP64 (double) performance	Up to 6.875 TFLOPS	~0.5 TFLOPS	~7.8 TFLOPS
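Peak throughput figures like these can be roughly derived from the core count and clock speed. The sketch below uses the Tesla V100 numbers quoted earlier (5,120 CUDA cores, 1,380 MHz boost clock). It assumes that each CUDA core completes one fused multiply-add per cycle, counted as two floating-point operations, and that FP64 runs at half the FP32 rate on this chip. The function name `peak_tflops` is hypothetical.

```python
def peak_tflops(cores: int, boost_clock_mhz: float,
                flops_per_core_per_cycle: float = 2.0) -> float:
    """Theoretical peak throughput in TFLOPS.

    Assumes every core retires `flops_per_core_per_cycle` operations
    per clock (2.0 models one fused multiply-add per cycle).
    """
    flops = cores * flops_per_core_per_cycle * boost_clock_mhz * 1e6
    return flops / 1e12


# Tesla V100 figures as quoted in the text above.
V100_CUDA_CORES = 5120
V100_BOOST_MHZ = 1380

fp32 = peak_tflops(V100_CUDA_CORES, V100_BOOST_MHZ)
fp64 = fp32 / 2  # assumption: FP64 runs at half the FP32 rate

print(f"V100 FP32 peak: {fp32:.2f} TFLOPS")
print(f"V100 FP64 peak: {fp64:.2f} TFLOPS")
```

Under these assumptions, the FP32 result lines up with the ~14.1 TFLOPS in the table, and the FP64 result is close to the 7 TFLOPS listed for the PCI-E variant. The ~7.8 TFLOPS figure corresponds to the higher-clocked SXM variant.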

Clock speeds:

	GeForce Titan V	Quadro RTX 8000	Tesla V100
GPU clock	1,200 MHz	1,395 MHz	1,246 MHz
Boost clock	1,455 MHz	1,770 MHz	1,380 MHz
Memory clock	850 MHz	1,750 MHz	876 MHz
Effective memory clock	1,700 MHz	14 GHz	1,752 MHz
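The bandwidth figures in the memory table follow directly from the bus width and the effective memory clock: each transfer moves bus-width/8 bytes, and the effective clock gives the number of transfers per second. A minimal sketch of that arithmetic, with the hypothetical helper `memory_bandwidth_gb_s` and the table values entered by hand:

```python
def memory_bandwidth_gb_s(bus_width_bits: int,
                          effective_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = effective_clock_mhz * 1e6
    return bytes_per_transfer * transfers_per_second / 1e9


# (bus width in bits, effective memory clock in MHz), from the tables above.
# The Quadro RTX 8000's 14 GHz effective clock is written as 14,000 MHz.
specs = {
    "GeForce Titan V": (3072, 1700),
    "Quadro RTX 8000": (384, 14000),
    "Tesla V100": (4096, 1752),
}

bandwidths = {name: memory_bandwidth_gb_s(bus, clock)
              for name, (bus, clock) in specs.items()}

for name, bandwidth in bandwidths.items():
    print(f"{name}: {bandwidth:.1f} GB/s")
```

This shows why the HBM2 cards reach high bandwidth despite low memory clocks: their buses are very wide. The GDDR6-based Quadro gets a comparable figure from a narrow bus running at a much higher effective clock.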

Deep learning performance comparison:

	GeForce Titan V	Quadro RTX 8000	Tesla V100
TensorTFLOPS	110	130.5	125

GPUs without tensor cores can instead be compared on their conventional floating-point operations per second (FLOPS), which serves as the equivalent benchmark of their maximum DL performance.

The Tesla V100 comes in four different variants:

	Double precision performance (FP64)	TensorTFLOPS
Tesla V100 PCI-E 16GB	7 TFLOPS	112
Tesla V100 PCI-E 32GB	7 TFLOPS	112
Tesla V100 SXM 16GB	7.8 TFLOPS	125
Tesla V100 SXM 32GB	7.8 TFLOPS	125