So first of all, there are TPUs, Tensor Processing Units, like the ones Google sells under its Coral brand (https://coral.ai/ / https://coral.ai/products/), which are more specialised. They’re ASICs.
A tensor processing unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning, particularly using Google’s own TensorFlow software.[1] Google began using TPUs internally in 2015, and in 2018 made them available for third party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.
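Just to make “available as part of its cloud infrastructure” concrete, here’s a minimal sketch of what pointing TensorFlow 2.x at a Cloud TPU looks like. Everything here assumes a TPU runtime is actually attached (a Cloud TPU VM, a Colab TPU session, etc.); the fallback branch is for when it isn’t:

```python
# Rough sketch: using a Cloud TPU from TensorFlow 2.x.
# Assumes a TPU is actually attached (Cloud TPU VM, Colab TPU runtime, etc.);
# otherwise it falls back to the default CPU/GPU strategy.
import tensorflow as tf

try:
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()  # auto-detect the TPU
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
    print("Running on TPU:", resolver.master())
except (ValueError, tf.errors.NotFoundError):
    strategy = tf.distribute.get_strategy()
    print("No TPU found, using default strategy")

with strategy.scope():
    # Anything built inside this scope gets replicated across the TPU cores.
    model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```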
Second, like, you don’t even need physical cards. You can rent a server at Hetzner, or just buy “Compute” on AWS or Google, etc.
So, how to train your dragon?
Gaming cards are not as fast as TPUs, but they’re pretty good for gaming. That’s something to consider too.
“Which graphics card for deep learning?”
The 2019 advice is a bit outdated, though; it predates the latest AMD RX 57*/58* series (e.g. the “RX 580X”).
Latest advice, August 2020:
“AMD Ryzen Threadripper 2950x with 2 x Nvidia RTX 2080 Ti.”
NVIDIA usually has better software support. It’s almost like vi vs. emacs – an eternal battle of the hardware gods to increase FLOPS. AMD vs. NVIDIA, newt vs. snake, red vs. blue.
AMD has 7 nm “Vega” parts (the Radeon VII, for example), so on manufacturing process it’s ahead, for now.
Well, OK, here we go, for AMD: holy moly, $1899 https://www.amd.com/en/graphics/servers-radeon-instinct-mi
A recent TechRadar roundup says:
Best graphics cards at a glance
- AMD Radeon RX 5700
- Nvidia GeForce RTX 2080 Ti
- AMD Radeon RX 5600 XT
- Nvidia GeForce RTX 2070 Super
- Nvidia GeForce GTX 1660 Super
- AMD Radeon VII
- Nvidia GeForce RTX 2080 Super
- Zotac GeForce GTX 1080 Ti Mini
- Gigabyte GeForce GTX 1660 OC 6G
- PNY GeForce GTX 1660 Ti XLR8 Gaming OC
NVIDIA has its own edge AI accelerator, the Jetson Xavier NX: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-xavier-nx/?nvid=nv-int-csfg-78188#cid=gtcev_nv-int-csfg_en-us
JETSON XAVIER NX – 21 TOPS. $399
Tera Operations per Second (TOPS).
Up to 21 TOPS (at 15 W) or 14 TOPS (at 10 W).
Tera is a lot of OPS.
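For a rough sense of what that buys you, here’s a back-of-the-envelope sketch. The ~5 GOPs-per-inference figure is purely an assumption for a smallish detection model, not a measured number, and real-world throughput will sit well below this ceiling:

```python
# Back-of-the-envelope: theoretical ceiling on inferences per second at 21 TOPS.
tops = 21                    # Jetson Xavier NX, 15 W mode, INT8
ops_per_inference = 5e9      # assumed ~5 GOPs per forward pass (ballpark, not measured)
print(tops * 1e12 / ops_per_inference)  # -> 4200.0 inferences/s, best case on paper
```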
Anyway, what to think of all this? Graphics cards are pretty expensive. And there’s a whole new world of IoT edge computing devices, which are more what we’re interested in here anyway.
For graphics cards, about a year ago the GTX 1060 (6 GB) was the best deal, and AMD was out of the race. But then they got the 7 nm process and whipped up some cool-sounding CPUs, in 16- and 32-core versions. So however shitty their software is, they make very efficient, parallelised products, both CPU and GPU, and have historically been the one that follows open standards. NVIDIA is proprietary, but CUDA used to be practically the only game in town.
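In practice the “software support” question boils down to what your framework can actually see. A quick sanity check in PyTorch (and, as far as I understand, the ROCm builds of PyTorch expose the same torch.cuda API, so the same check works on AMD cards too):

```python
# Quick check of which accelerator (if any) PyTorch can see on this machine.
import torch

if torch.cuda.is_available():
    print("GPU visible to PyTorch:", torch.cuda.get_device_name(0))
    print("Device count:", torch.cuda.device_count())
else:
    print("No GPU visible; training falls back to the CPU.")
```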
Anyway, we can just see how long it takes to train the detectron2 chicken and egg image segmentation code.
I can probably just leave my 4 CPU cores training overnight for the things we want to do, or set up the Raspberry Pi to work on something.
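Here’s roughly what that timing experiment could look like with Detectron2. The dataset name and file paths below are placeholders for whatever the chicken-and-egg data actually gets registered as, and MODEL.DEVICE = "cpu" is only there because of the leave-the-CPU-on-overnight plan; drop it if there’s a GPU around:

```python
# Sketch of a short, timed Detectron2 training run on CPU.
# "chicken_and_egg_train" and the file paths are placeholders/assumptions.
import os
import time

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

register_coco_instances("chicken_and_egg_train", {},
                        "annotations/train.json", "images/train")  # placeholder paths

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")  # start from COCO weights
cfg.DATASETS.TRAIN = ("chicken_and_egg_train",)
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2      # chicken, egg
cfg.MODEL.DEVICE = "cpu"                 # the overnight-on-the-CPU plan
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 300                # short run, just to measure time per iteration
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

start = time.time()
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
print(f"300 iterations took {time.time() - start:.0f} s")
```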