<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Homelab on Rishi's Blog</title><link>https://blog.bansalai.com/tags/homelab/</link><description>Recent content in Homelab on Rishi's Blog</description><generator>Hugo</generator><language>en-US</language><copyright>&#169; 2026 Rishi Bansal</copyright><lastBuildDate>Wed, 01 Jul 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.bansalai.com/tags/homelab/index.xml" rel="self" type="application/rss+xml"/><item><title>Running Local LLMs at Home: RTX 3060 vs 4090 vs 5090 on Qwen3 and Gemma 4</title><link>https://blog.bansalai.com/posts/homelab-local-llm-gpu-benchmark/</link><pubDate>Wed, 01 Jul 2026 00:00:00 +0000</pubDate><guid>https://blog.bansalai.com/posts/homelab-local-llm-gpu-benchmark/</guid><description>&lt;p&gt;A handful of consumer GPUs now sit at the center of serious homelab inference. The interesting question is no longer &amp;ldquo;can I run a 30B-class model&amp;rdquo; but &amp;ldquo;which card should I buy for the workload I actually have.&amp;rdquo; Below is a method-grounded look at three tiers (the RTX 3060 12GB, RTX 4090 24GB, and RTX 5090 32GB) running today&amp;rsquo;s small-active-parameter mixture-of-experts models: &lt;strong&gt;Qwen3.5-35B-A3B&lt;/strong&gt; and its newer sibling &lt;strong&gt;Qwen3.6-35B-A3B&lt;/strong&gt;, &lt;strong&gt;Gemma 4 26B-A4B&lt;/strong&gt;, and the dense &lt;strong&gt;Gemma 4 31B&lt;/strong&gt;.&lt;/p&gt;</description></item></channel></rss>