What Really Determines the Speed of Your PyTorch Code? CUDA Benchmarking Guide
Originally published on HackerNoon

Anyone who works with PyTorch model code ends up asking the same questions: Why is this taking so long? How do I make my training loop faster? Whether you’re an ML engineer, a researcher, or someone who decided to play around with a random ML repository over the weekend, you will eventually try to understand how to speed your code up. Before we can do that, however, we need to learn how to measure performance correctly and then draw the right conclusions from those measurements. That is exactly what this article is about: properly benchmarking CUDA or PyTorch code. ...