Machine Learning Bedroom Lab Setup

Razer Core X with an NVIDIA RTX 4060 Ti 16 GB over a Thunderbolt 3 connection on an Intel MacBook Pro

I've started working on my first ML models, and after sitting through a few dozen training runs, I've started to feel the passing of time more acutely.

And so, I've searched for ways to make the workflow a little bit faster.

My first idea was to work with cloud instances, but I haven't had the best experience with Jupyter notebooks in the cloud, and VMs can get relatively pricey for research purposes. And I'm not even starting the discussion about privacy concerns.

For these reasons, I've started looking into building an ML and data science bedroom lab setup.

Use the dedicated AMD card of Intel-based MacBook Pros with MPS in PyTorch

First, I've tried the internal AMD card with 4 GB of VRAM through Apple's MPS (Metal Performance Shaders) backend. For this, I had to tell PyTorch to use "mps" as the device.

import platform

import torch

# prefer MPS (Apple Metal) when it's built into this PyTorch install,
# otherwise fall back to CUDA, then to the CPU
has_gpu = torch.cuda.is_available()
has_mps = torch.backends.mps.is_built()
device = "mps" if has_mps \
    else "cuda" if has_gpu else "cpu"

print(f"Python Platform: {platform.platform()}")
print(f"PyTorch Version: {torch.__version__}")
print()
print("GPU is", "available" if has_gpu else "NOT AVAILABLE")
print("GPU name: ", torch.cuda.get_device_name(0) if has_gpu else "NOT AVAILABLE")
print("MPS (Apple Metal) is", "AVAILABLE" if has_mps else "NOT AVAILABLE")
print(f"Target device is {device}")
...
def train(model, device, train_loader, optimizer, epoch):
    for batch_idx, (data, target) in enumerate(train_loader):
        # move each training batch to the selected device
        data, target = data.to(device), target.to(device)
...
...
def test(model, device, test_loader):
    with torch.no_grad():
        for data, target in test_loader:
            # evaluation batches go to the same device as the model
            data, target = data.to(device), target.to(device)
...
...
def main():
    # the model itself has to be moved to the target device as well
    model = MLNet().to(device)
    for epoch in range(1, EPOCHS + 1):
        train(model, device, train_loader, optimizer, epoch)
        test(model, device, test_loader)
...
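
Since the whole point of this setup is to make training faster, I plan to time each epoch so I can compare the CPU, MPS, and (later) CUDA runs. This is a small addition of mine, not part of the original script, reusing the train() function and variables from above:

import time

start = time.perf_counter()
train(model, device, train_loader, optimizer, epoch)
elapsed = time.perf_counter() - start
print(f"Epoch {epoch} on {device}: {elapsed:.1f} s")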

Use an eGPU setup: Razer Core X with NVIDIA RTX 4060 Ti 16 GB VRAM

Then, I've read that I can plug an eGPU into my Mac, load the training data into its VRAM, and run the training on the Tensor Cores.

And, so, I had to test this scenario...

Basically, the code above stays the same; the only difference is that the device will be "cuda" instead of "mps". Since my snippet prefers "mps" whenever it's built, I'll have to flip that check or set the device explicitly, as sketched below.
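
Here's a minimal sketch of how I'd flip that selection so the eGPU wins once PyTorch can see it as a CUDA device (assuming the same MLNet model as in the script above; untested until the card arrives):

import torch

# prefer the eGPU when PyTorch sees a CUDA device,
# then fall back to MPS, then to the CPU
has_gpu = torch.cuda.is_available()
has_mps = torch.backends.mps.is_built()
device = "cuda" if has_gpu \
    else "mps" if has_mps else "cpu"

model = MLNet().to(device)  # MLNet as defined in the training script above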

I've just placed an order for an RTX 4060 Ti with 16 GB of VRAM, and I can't wait for the GPU to arrive so I can test its performance.
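
To actually put the Tensor Cores to work, the usual route in PyTorch is mixed precision. Here's a sketch of how I'd adapt the train() step with torch.autocast and a GradScaler; train_amp is just my name for the variant, the cross-entropy loss is my assumption since the full loop is elided above, and none of this is tested until the card arrives:

import torch
import torch.nn.functional as F

scaler = torch.cuda.amp.GradScaler()

def train_amp(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        # run the forward pass in float16 so the Tensor Cores get used
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            output = model(data)
            loss = F.cross_entropy(output, target)  # assumed loss; the original loop is elided
        # scale the loss to avoid float16 underflow, then step and update the scaler
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()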

GPU alternatives to the NVIDIA RTX 4060 Ti 16 GB taken into consideration

I've looked into other options as well:

  • NVIDIA RTX 3060 12 GB - An older architecture than the 4000 series, which would make it slower than a 4000-series card.
  • NVIDIA RTX 3090 Ti 24 GB - Also a previous-generation architecture, but its 24 GB of VRAM would allow me to train and run an LLM locally.

Because I'm nowhere near training LLMs yet, I've decided to go with the 4000 series at a middle-of-the-road VRAM capacity. And so, I've chosen the RTX 4060 Ti 16 GB.

Of course, I could have waited for the 5000 series, but because the Razer Core X runs over Thunderbolt 3 and its power supply tops out at 650 W, a more powerful GPU wouldn't make sense for my workflow, at this moment at least. 😄

I'll update this post when I receive the 4060 Ti GPU.

Stay tuned.