torch.compile is a PyTorch 2.0 feature and has nothing to do with handwritten CUDA kernels
> How easy is it to run on older GPUs
this is a torch cpp extension
https://github.com/HazyResearch/ThunderKittens/blob/8daffc9c...
so you're going to have the same exact issue (whatever issue you're having)
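
if the issue is the GPU being too old, a quick check is to compare the device's compute capability against whatever the kernels were built for. rough sketch below; `MIN_CAPABILITY`, `is_supported` and `current_device_capability` are names i made up for illustration, and the `(8, 0)` threshold is a placeholder assumption, not a documented ThunderKittens requirement. the only real torch calls are `torch.cuda.is_available()` and `torch.cuda.get_device_capability()`.

```python
# Hypothetical threshold: replace with whatever architecture the extension
# was actually compiled for. (8, 0) is an assumption, not a documented value.
MIN_CAPABILITY = (8, 0)


def is_supported(device_capability, min_capability=MIN_CAPABILITY):
    """Return True if (major, minor) meets or exceeds the minimum."""
    # Tuples compare element by element, so (7, 5) < (8, 0) < (9, 0).
    return tuple(device_capability) >= tuple(min_capability)


def current_device_capability():
    """Return (major, minor) for the current CUDA device, or None."""
    try:
        import torch  # optional: only needed to query a real GPU
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = current_device_capability()
    if cap is None:
        print("no CUDA device visible (or torch not installed)")
    else:
        print(f"compute capability {cap}, supported: {is_supported(cap)}")
```

this doesn't tell you whether the extension will build, only whether the card is older than the assumed target.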