I think it’s the other way around? I haven’t run into any ML code that could benefit from multithreading that wasn’t already written in C++, but I have often run into server tasks that could use a polling thread, etc.
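For what it’s worth, here’s a minimal sketch of the kind of polling thread I mean (the poll_upstream function, queue, and one-second interval are made up for illustration). Since waits like time.sleep release the GIL, this pattern already works fine in today’s CPython:

    import queue
    import threading
    import time

    updates: "queue.Queue[str]" = queue.Queue()

    def poll_upstream(stop: threading.Event) -> None:
        # I/O-style waiting releases the GIL, so this daemon thread
        # doesn't block the rest of the server.
        while not stop.is_set():
            updates.put(f"poll at {time.time():.0f}")  # stand-in for a real check
            time.sleep(1.0)

    stop = threading.Event()
    threading.Thread(target=poll_upstream, args=(stop,), daemon=True).start()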
All the ML code is written in lower-level languages, and that’s very unlikely to change, GIL or no.
Yeah, you're right: even though CUDA is async, doing any preprocessing in Python can be harder if you don't have shared memory (the start-up latency of multiprocessing isn't a problem in this context). I've only ever encountered "embarrassingly parallel" data-feeding problems, where the memory overhead of multiprocessing was small, but I could see other situations. Comment retracted.
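Edit: for anyone curious, a rough sketch of the shared-memory workaround I was alluding to, using multiprocessing.shared_memory; the batch shape and the "preprocessing" step are placeholders, not anything from a real pipeline:

    import numpy as np
    from multiprocessing import Process
    from multiprocessing.shared_memory import SharedMemory

    SHAPE, DTYPE = (4, 1024), np.float32  # hypothetical batch dimensions

    def preprocess(shm_name: str) -> None:
        # Attach to the existing block and write the batch in place,
        # so nothing gets pickled or copied between processes.
        shm = SharedMemory(name=shm_name)
        batch = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
        batch[:] = np.random.rand(*SHAPE)  # stand-in for real preprocessing
        shm.close()

    if __name__ == "__main__":
        nbytes = int(np.prod(SHAPE)) * np.dtype(DTYPE).itemsize
        shm = SharedMemory(create=True, size=nbytes)
        worker = Process(target=preprocess, args=(shm.name,))
        worker.start()
        worker.join()
        batch = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
        print(batch.mean())  # consume the batch without copying it back
        shm.close()
        shm.unlink()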