Show HN: Run PyTorch locally with a remote GPU backend

I integrated a remote GPU execution backend into PyTorch through the same extension system that custom hardware accelerators use. You can create a remote machine and get a handle to its CUDA device whenever you want to create or move tensors onto the remote GPU.

  import torch
  import mycelya_torch

  machine = mycelya_torch.RemoteMachine("modal", "A100")
  cuda_device = machine.device("cuda")
  x = torch.randn(1000, 1000, device=cuda_device)
  y = torch.randn(1000, 1000).to(cuda_device)
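
Ordinary tensor operations on these tensors then execute on the remote GPU. As a minimal sketch of the round trip (the matmul and the .cpu() fetch below are my illustration, assuming results come back through the standard PyTorch transfer call, not an API documented in the post):

  # Compute runs on the remote A100; only the dispatch happens locally
  z = x @ y
  # Copying the result back to the local machine forces a synchronization
  z_local = z.cpu()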

I made it reasonably performant by dispatching most operations asynchronously whenever possible. For cases where slow performance is unavoidable, such as uploading many GB of weights onto the GPU, I added a decorator that turns a function into a remotely executed one. For the most part, a function behaves the same with or without the decorator; the main difference is whether its code runs locally or remotely.

  import mycelya_torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  # The decorated function's body executes on the remote machine,
  # so the weights load straight onto the remote GPU
  @mycelya_torch.remote
  def load_model(model_name: str):
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoModelForCausalLM.from_pretrained(
          model_name, torch_dtype="auto", device_map="auto"
      )
      return model, tokenizer
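
Calling the decorated function looks the same as calling it undecorated. A hedged usage sketch, reusing cuda_device from the first snippet (the model name and the generate call are illustrative assumptions on my part, not from the project's docs):

  model, tokenizer = load_model("Qwen/Qwen2.5-0.5B-Instruct")

  # Inference inputs move to the remote GPU like any CUDA tensor
  inputs = tokenizer("Hello, world", return_tensors="pt").to(cuda_device)
  outputs = model.generate(**inputs, max_new_tokens=20)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))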

Modal is the only supported cloud provider right now, and you can run it at no cost within their free monthly credits. I'd appreciate any feedback and bug reports :)