Running Local LLMs at Home: RTX 3060 vs 4090 vs 5090 on Qwen3 and Gemma 4

Wed, 01 Jul 2026 00:00:00 +0000

A handful of consumer GPUs now sit at the center of serious homelab inference. The interesting question is no longer “can I run a 30B-class model” but “which card should I buy for the workload I actually have.” Below is a methods-first look at three tiers (the RTX 3060 12GB, RTX 4090 24GB, and RTX 5090 32GB) running today’s small-active-parameter mixture-of-experts models, Qwen3.5-35B-A3B, its newer sibling Qwen3.6-35B-A3B, and Gemma 4 26B-A4B, alongside the dense Gemma 4 31B.

Homelab on Rishi's Blog
