Deep learning workloads are very sensitive to the performance of the hardware they run on. Even a slight bottleneck in a system can drastically increase training time or limit throughput.
A common recommendation for getting the most out of a deep learning workstation is to have at least two GPUs.
However, this depends on several factors:
- How many users are there?
- How demanding are your applications?
- Do you need extremely high-performance computing (HPC)?
This article explores how many GPUs you should have for different use cases and scenarios. It also looks at the limitations that virtualization and high-demand workloads can impose.
How Many GPUs Are Required for Deep Learning Training?
The number of GPUs required for deep learning training depends on the model’s complexity, dataset size, and available resources. For larger models, starting with around 4 GPUs can significantly shorten training time.
Deep learning training is the stage in which a model is built from start to finish. It might include collecting data as well as training the neural network. This stage, which you can think of as ‘building the model,’ is the focus of most deep learning research and is the most crucial part of the whole process.
Training is usually an iterative cycle of building, retraining, and validating the model. Retraining could include adjusting the neural network weights so that the model learns from new data.
Validation is when you check, after training, that the model is accurate and working as it should. Generally, training requires far more GPU capacity than inference.
When you’re training a model, the GPUs are working at or near full power almost constantly, shuffling data and feeding the model with examples.
If you’re training on images, the GPUs take the image data and turn it into the model’s learned weights, which can be thought of as the ‘program’ that makes the model work.
Training needs a lot of GPUs for two main reasons:
First, training usually works through much larger datasets than inference does. Second, training requires more GPU compute and memory than inference, because gradients and optimizer state must be held alongside the model weights.
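As a rough illustration of the memory gap between training and inference, the sketch below estimates the memory taken by model state alone. It assumes 32-bit values and an Adam-style optimizer (weights, gradients, and two optimizer moments per parameter); the function name and the 1-billion-parameter figure are hypothetical, and activations and framework overhead are deliberately left out.

```python
def estimate_model_state_gib(num_params, training, bytes_per_value=4):
    """Rough memory for model state only; activations and overhead are excluded."""
    # Inference keeps one copy of the weights.
    # Training with an Adam-style optimizer (an assumption) keeps weights,
    # gradients, and two optimizer moments: four values per parameter.
    values_per_param = 4 if training else 1
    return num_params * values_per_param * bytes_per_value / 1024**3


params = 1_000_000_000  # hypothetical 1-billion-parameter model
print(f"inference: {estimate_model_state_gib(params, training=False):.1f} GiB")
print(f"training:  {estimate_model_state_gib(params, training=True):.1f} GiB")
```

Under these assumptions, training holds about four times as much model state as inference, before any activation memory is counted.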
How Many GPUs Are Required for Deep Learning Inference?
Inference uses an already-trained model to make predictions or draw conclusions about unseen data.
This process, known as ‘running the model,’ is the focus of deep learning inference research. Inference can be much faster than training because you don’t usually need to work with the whole dataset, and you don’t have to build the model from scratch.
For inference, you usually need fewer GPUs than for training. For one thing, most models need less memory at inference time than they do during training. That is the most commonly cited reason, but others are perhaps more important.
In particular, inference is usually not a very bandwidth-intensive process. It handles many small inputs, so it doesn’t require the same bandwidth as training, which has to move vast amounts of data at once.
In fact, the research community has concluded that an essential factor for the success of inference at scale is the reduction of data movement.
Inference at scale is generally accomplished by sharding the data: the data is distributed across several servers, and only a portion of it is sent at a time. It can also be achieved with an in-sourcing technique, which brings the data to the inference engine.
Most inference focuses on fast responses to individual requests rather than sustained full-power computation, so you don’t need as many GPUs working at full power.
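One way to turn this into a number is to size the inference fleet from the expected request rate. The sketch below is only an estimate under stated assumptions: the function name, the `headroom` factor, and the example figures are hypothetical, and the per-GPU throughput would have to be measured on your own model and hardware.

```python
import math


def inference_gpus_needed(requests_per_sec, per_gpu_throughput, headroom=0.7):
    """GPUs needed so that each one runs at no more than `headroom` of its capacity."""
    if per_gpu_throughput <= 0 or not 0 < headroom <= 1:
        raise ValueError("throughput must be positive and headroom in (0, 1]")
    usable_per_gpu = per_gpu_throughput * headroom
    # Always provision at least one GPU, even for very light traffic.
    return max(1, math.ceil(requests_per_sec / usable_per_gpu))


# Hypothetical load: 500 requests/s, 200 requests/s measured per GPU.
print(inference_gpus_needed(500, 200))
```

Keeping some headroom means the GPUs are not pinned at full power, which matches the way inference is usually run.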
Virtualization
A significant factor in determining how many GPUs you need per workstation is the number of virtual machines you want to host on your hardware. It’s common for organizations to use virtualization at scale.
For example, you might have one virtual machine that hosts all your production applications and a separate virtual machine that hosts all your experimental infrastructure.
Virtualization is great, but it has its limitations. When a GPU is passed through to a virtual machine, that physical GPU is typically dedicated to one virtual machine at a time.
So, if you have 10 workstations with 2 GPUs each, each workstation can typically run only 2 GPU-backed virtual machines at once.
Virtualization also adds complexity: it requires additional hardware, management, and maintenance overhead.
It can be a good fit for some organizations, but it is not the right choice for everyone.
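The workstation arithmetic above can be sketched as a small helper. This assumes dedicated passthrough, where each physical GPU serves one virtual machine; the function and the `vms_per_gpu` parameter are hypothetical and exist only to show how a GPU-sharing setup would change the total.

```python
def max_gpu_vms(workstations, gpus_per_workstation, vms_per_gpu=1):
    """Upper bound on GPU-backed VMs; vms_per_gpu=1 models dedicated passthrough."""
    return workstations * gpus_per_workstation * vms_per_gpu


print(max_gpu_vms(1, 2))   # GPU-backed VMs on one 2-GPU workstation
print(max_gpu_vms(10, 2))  # GPU-backed VMs across 10 such workstations
```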
High Demand Workloads
Even the most powerful, best-optimized GPU hardware might struggle under the demands of some workloads.
For example, training models for image recognition usually relies heavily on GPUs. It’s common for a deep learning workstation to include only a single GPU, which is often enough for inference.
However, training large image recognition models with a single GPU is often impractical. Unless the GPU has a large amount of memory, training is simply too slow.
Even using multiple GPUs to improve performance might not be enough to train the models needed for a workload like this.
You’d need enough GPUs to spread the training load across them, but not so many that the rest of the system struggles to keep up with their demand.
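The trade-off between too few and too many GPUs can be pictured with a toy cost model. This is a deliberately simplified sketch, not a description of any real framework: it assumes the compute part of a training step divides evenly across GPUs while a synchronization cost grows linearly with each extra GPU. Both function names and all the numbers are hypothetical.

```python
def step_time(n_gpus, compute_s, sync_s_per_extra_gpu):
    """Toy model: compute splits across GPUs, sync cost grows with GPU count."""
    return compute_s / n_gpus + sync_s_per_extra_gpu * (n_gpus - 1)


def best_gpu_count(compute_s, sync_s_per_extra_gpu, max_gpus=16):
    """GPU count (1..max_gpus) with the lowest modeled step time."""
    return min(
        range(1, max_gpus + 1),
        key=lambda n: step_time(n, compute_s, sync_s_per_extra_gpu),
    )


# Hypothetical step: 1.0 s of compute, 0.01 s of sync per additional GPU.
for n in (1, 4, 10, 16):
    print(n, round(step_time(n, 1.0, 0.01), 3))
print("best:", best_gpu_count(1.0, 0.01))
```

In this model, adding GPUs helps until the synchronization term outweighs the saved compute time, after which each extra GPU makes a step slower. Real communication costs depend on the interconnect and the algorithm used, so treat the shape of the curve, not the numbers, as the takeaway.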
So, How Many GPUs Do You Need?
The answer to this question depends on the tasks that your deep learning workstation needs to perform.
If there are hundreds of thousands of training images or categories, a single GPU may not be able to handle the work alone in a reasonable time. In this case, multiple GPUs can be used together to achieve higher performance than a single GPU would.
However, if only small amounts of data are being processed at once (for example, 1-2 million), a multi-GPU system or server setup may not make sense. It might be better to buy a single high-end GPU instead, such as an NVIDIA GeForce GTX 1080 Ti.
A card like this will outperform many low-end CPUs on deep learning workloads such as training neural networks on large datasets, because it has far more cores than current-generation CPUs from Intel, even though each core runs at a lower clock speed.
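To make the "how many" question concrete, you can work backwards from a target training time. The sketch below is a back-of-the-envelope estimate only: the function name, the `scaling_efficiency` factor, and every number in the example are hypothetical, and the per-GPU throughput should come from a benchmark of your own model.

```python
import math


def training_gpus_needed(num_samples, epochs, samples_per_sec_per_gpu,
                         target_hours, scaling_efficiency=0.9):
    """Estimate the GPUs needed to finish training within target_hours."""
    total_samples = num_samples * epochs
    required_rate = total_samples / (target_hours * 3600)  # samples per second
    # Multi-GPU training rarely scales perfectly, so discount each GPU.
    effective_per_gpu = samples_per_sec_per_gpu * scaling_efficiency
    return max(1, math.ceil(required_rate / effective_per_gpu))


# Hypothetical job: 1M images, 90 epochs, 500 images/s per GPU.
print(training_gpus_needed(1_000_000, 90, 500, target_hours=10))
print(training_gpus_needed(1_000_000, 90, 500, target_hours=100))
```

With a relaxed deadline the estimate drops to a single GPU, which is the situation where one high-end card is the sensible purchase.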
How To Determine What Training or Inference Problem You’re Trying to Solve?
How many GPUs you need depends on what kind of training or inference problem you’re trying to solve with deep learning:
If your problem is small, it might make sense to set up a number of CPU-only machines and use them as personal cloud instances (e.g., running PyTorch). These machines don’t need expensive GPUs, and they can still be fast enough for light workloads while remaining inexpensive compared with dedicated servers from providers like AWS or Google Cloud Platform (GCP).
In fact, very small models tend not to require much memory, sometimes little more than 512MB of RAM per instance, although this varies widely with the model and framework.
5 Tips to Make The Best of Fewer GPUs
To make the best use of the GPUs you have, here are a few suggestions:
- Install your GPUs with the correct orientation. The exhaust should vent out of the case and the intake should draw from the airflow inside the case. Space the cards as far apart as possible to maintain ideal airflow.
- Reduce the power usage of each GPU by lowering the power profile in your OS or by installing a card with low power consumption. Install the latest drivers.
- Close any programs you are not using, whether you are on a laptop or a desktop. This will reduce power consumption and heat generation.
- Use a low-speed fan curve where temperatures allow, or install an additional card with low power consumption.
- If you purchase a new card, choose one with low power consumption, or rely on the most power-efficient card you already own.
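For the power-related tips above, the sketch below shows one way to inspect and cap power draw from Python. It assumes NVIDIA hardware with the `nvidia-smi` command-line tool installed; the helper function names are hypothetical. The limit command is only built, not executed, because applying it normally requires administrator rights and a wattage your specific card supports.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=index,power.draw,power.limit",
         "--format=csv,noheader,nounits"]


def parse_power_rows(text):
    """Parse 'index, draw, limit' CSV rows, skipping rows that are not numeric."""
    rows = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        try:
            rows.append((int(parts[0]), float(parts[1]), float(parts[2])))
        except ValueError:
            continue  # the card does not report this value
    return rows


def read_power():
    """Return [(gpu_index, draw_watts, limit_watts)], or [] without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_power_rows(result.stdout)


def power_limit_command(gpu_index, watts):
    """Build (but do not run) the command that caps one GPU's power draw."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


for index, draw, limit in read_power():
    print(f"GPU {index}: drawing {draw} W of a {limit} W limit")
print(" ".join(power_limit_command(0, 200)))  # 200 W is an arbitrary example
```

Checking the reported draw against the limit before lowering it helps you pick a cap that saves power without throttling the workload more than you intend.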
Deep learning workstations can be very powerful, but you’ll need to consider how many GPUs you’ll need for your specific use case to ensure you have the resources available.
When evaluating whether or not to use GPUs in your deep learning workstation, you must consider how many GPUs you need and how much memory and bandwidth they provide. You can also consider whether you want to go with a single card or multiple cards on one motherboard.
If one workstation turns out not to be enough, you can always scale up with more hardware. Keep in mind, too, that some workloads might not be practical with just a single GPU.