Tommy Jones

@thos_jones

17 Tweets · Apr 16, 2020
I spent $3,000+ on computer parts this weekend to build a data science/machine learning box. And I have some thoughts. 1/n
I did a ton of research and lots of it was helpful. Shout out to medium.com/@markpalatucci/how-to-build-a-silent-multi-gpu-water-cooled-deep-learning-rig-for-under-10k-aefcdd1f96a5 and l7.curtisnorthcutt.com and timdettmers.com to name a very short few. 2/n
Also, in fairness, I haven't actually built the computer yet. But that's not what this thread is about. 3/n
I think that a lot of the low-level (i.e. what specific parts to buy) advice is pretty solid but some of the conceptual (e.g. how much RAM do you need, how powerful should your processor be) advice is missing the mark. 4/n
Google anything about "building a computer for machine learning" or "cpu for machine learning" or "memory for machine learning" and it will take you directly to a bunch of "hardware for deep learning" posts. (Note, those *were* my shout outs.) 5/n
The problems here are twofold and related. (1) Machine learning isn't just deep learning and (2) even for deep learning *in the real world* you aren't going to just get a beautiful and perfect dataset handed to you. So much has been written about (1), I'll just focus on (2). 6/n
A lot of the advice around the CPU and RAM (system memory) focuses on *getting data in and out of the GPU*. You get stuff like, "The CPU does little computation when you run your deep nets on a GPU. Mostly it (1) initiates GPU function calls, (2) executes CPU functions." 7/n
Yes, if you hand me a perfect little data set that needs no work, then sure. My CPU only hands things to my GPU. Win all the Kaggle! But in real life, things are different. You don't just need feature engineering, you also need data exploration. 8/n
So, a thing you might do is cluster your data to see if it has a natural grouping. But here's the thing with clustering: you have to calculate a distance matrix. For a moderately-sized data set this is both compute intensive and memory intensive. 9/n
For 'n' rows of your data, you're going to get an n x n matrix back. Each slot takes up 46 bytes. For 20,000 points that's 18.4 GB just sitting there. But if you have 1,000,000 points, that's 46 TB, buckaroo. So maybe buy as much RAM as you can. 10/n
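A quick back-of-the-envelope sketch of those sizes. The ~46 bytes-per-entry figure is the one quoted above; a raw float64 entry is 8 bytes, so treat bytes-per-entry as an assumption that depends on your language and data structures:

```python
# Rough size of a full n x n distance matrix "just sitting there".
# bytes_per_entry is an assumption: 8 for a raw float64 array,
# ~46 is the per-entry figure quoted in this thread (higher-level
# data structures carry extra overhead).
def dist_matrix_bytes(n_points, bytes_per_entry=8):
    return n_points ** 2 * bytes_per_entry

for n in (20_000, 1_000_000):
    for bpe in (8, 46):
        gb = dist_matrix_bytes(n, bpe) / 1e9
        print(f"n={n:,} at {bpe} B/entry: {gb:,.1f} GB")

# n=20,000 at 8 B/entry: 3.2 GB
# n=20,000 at 46 B/entry: 18.4 GB
# n=1,000,000 at 8 B/entry: 8,000.0 GB
# n=1,000,000 at 46 B/entry: 46,000.0 GB   <- the 46 TB from the tweet above
```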
I forgot to mention that that's the size of your distance matrix *just sitting there*. During computation this can get even bigger, depending on your abstraction. The only way to be 100% efficient is to write your distance function from scratch in C or FORTRAN. Feeling up to it? 11/n
How long did that computation take? Well, that depends on your processor and the distance function you chose. But in the case of a simple "dot" product, your best case is O(n^2 / 2). For 100 points (a 100 x 100 matrix) that's 5,000 computations. 12/n
So your CPU better be pretty quick to get through all that, especially if you have those 1,000,000 points. (That's 500 BILLION computations!) 13/n
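The arithmetic behind those two numbers, as a minimal sketch (exact pair count is n*(n-1)/2, which is the n^2/2 above, give or take):

```python
# Number of pairwise distance computations needed to fill the
# distance matrix: one per unique pair, i.e. n*(n-1)/2 ~ n^2/2.
def pairwise_ops(n_points):
    return n_points * (n_points - 1) // 2

print(f"{pairwise_ops(100):,}")        # 4,950  (~the 5,000 above)
print(f"{pairwise_ops(1_000_000):,}")  # 499,999,500,000  (~500 billion)
```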
*IF* you bothered to/are capable of setting up a low-level (i.e. C or FORTRAN) BLAS that takes advantage of natural parallelism, then you can speed up that computation. But you need lots of cores. As many as possible. Far more than you'd need just to feed data to that GPU. 14/n
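As an illustration (not necessarily my setup): if your NumPy is linked against a multithreaded BLAS like OpenBLAS or MKL, you get that parallelism without writing C or FORTRAN yourself. The sketch below computes squared Euclidean distances via the standard Gram-matrix identity, and the X @ X.T matmul is the piece the BLAS spreads across all your cores:

```python
import numpy as np

def squared_euclidean_distances(X):
    """Full n x n matrix of squared Euclidean distances between rows of X.

    The heavy lifting is the Gram matrix X @ X.T, which NumPy hands off
    to whatever BLAS it was built against (OpenBLAS, MKL, ...). A
    multithreaded BLAS spreads that matmul across every core you have.
    """
    sq_norms = np.einsum("ij,ij->i", X, X)      # ||x_i||^2 for each row
    gram = X @ X.T                              # BLAS-backed, runs in parallel
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * gram
    np.maximum(d2, 0.0, out=d2)                 # clamp tiny negatives from round-off
    return d2

rng = np.random.default_rng(0)
X = rng.standard_normal((20_000, 50))   # 20k points, 50 features
D2 = squared_euclidean_distances(X)     # result alone is ~3.2 GB of float64;
print(D2.shape)                         # peak memory during the computation is
                                        # a few times that -- hence the RAM advice
```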
Ok. I'm 15 tweets in and burying the lead. Punchline coming up. 16/n
If you want to build a serious computer for machine learning (all parts of real-life machine learning, failed experiments and all), you need as much CPU and RAM as you can afford. And after that, if you are doing deep learning, you also need some $$ GPUs. n/n
And when you choose that GPU, the two most useful resources to me were this lambdalabs.com post and this timdettmers.com post. (n+1)/n
