Jovan Chan - DEV Community

Jovan Chan

Jul 7

WWDC 2026 Home Lab Verdict: What Apple's Foundation Models, Core AI, and Siri Actually Deliver for Local AI

#apple #wwdc2026 #foundationmodels #ondeviceai

6 min read

Jovan Chan

Jul 6

WSL 3 GPU Passthrough for Local AI on Windows in 2026: Near-Native Ollama, llama.cpp, and PyTorch

#wsl #windows #localllm #ollama

6 min read

Jovan Chan

Jul 6

Why Local LLMs Got Good in 2026: Multi-Token Prediction, Speculative Decoding, and the MoE Efficiency Leap

#localllm #moe #speculativedecoding #multitokenprediction

6 min read

Jovan Chan

Jul 6

RTX PRO 6000 Blackwell for Local AI in 2026: 96GB GDDR7, the 120B+ MoE Threshold, and Whether a Workstation Card Makes Sense for Home Labs

#gpu #rtxpro6000 #blackwell #localai

6 min read

Jovan Chan

Jul 5

Open WebUI Can't Connect to Ollama? Every Fix for the Server Connection Error (2026)

#ollama #openwebui #docker #localllm

6 min read

Jovan Chan

Jul 5

Open-Source LLM Shootout 2026: Qwen3.6 vs Gemma 4 vs Llama 4 vs GLM-5.1 vs DeepSeek V4 — Which Fits Your GPU?

#localllm #qwen3 #gemma #llama4

6 min read

Jovan Chan

Jul 5

Ollama v0.30 on Apple Silicon: What the Stable MLX Release Actually Changed From the Preview

#ollama #applesilicon #mlx #localai

6 min read

Jovan Chan

Jul 4

Ollama Not Using GPU? Fix CPU-Only Inference on Windows, WSL2, and Linux (2026)

#ollama #gpu #localllm #troubleshooting

6 min read

Jovan Chan

Jul 4

Ollama Keeps Reloading the Model? Fix VRAM Unloading, Cold Starts, and Model Swapping (2026)

#ollama #localllm #troubleshooting #vram

5 min read

Jovan Chan

Jul 4

vLLM Won't Start? Every Fix for the Engine Init, CUDA, and OOM Errors (2026)

#vllm #localllm #cuda #gpu

6 min read

Jovan Chan

Jul 3

How to Run a 70B Model on a Single 24GB GPU in 2026 (and When You Shouldn't)

#localllm #gpu #llama #ollama

6 min read

Jovan Chan

Jul 3

Ornith-1.0 for Local AI in 2026: Which GPU Runs DeepReinforce's MIT-Licensed Coding Model?

#ornith #deepreinforce #localllm #coding

6 min read

Jovan Chan

Jul 3

OpenAI's Jalapeño Inference Chip: Does It Change Your Local GPU vs Cloud Math in 2026?

#openai #cloudgpu #rtx3090 #localllm

6 min read

Jovan Chan

Jul 2

Ollama Slow? How to Get More Tokens per Second From the GPU You Already Have (2026)

#ollama #localllm #gpu #performance

6 min read

Jovan Chan

Jul 2

NVIDIA Cosmos 3 Nano for Local AI in 2026: 16B Omnimodel, BF16-Only, and Whether Your Consumer RTX Can Actually Run It

#nvidia #cosmos #physicalai #gpu

6 min read

Jovan Chan

Jul 2

Microsoft Aion 1.0 on Windows 2026: The 14B On-Device Model and What Your Copilot+ PC Actually Needs

#npu #windows #localllm #copilotpluspc

6 min read

Jovan Chan

Jul 1

Open-Source Vision Language Models 2026: Which to Self-Host

#vlm #visionlanguagemodels #qwen3vl #selfhosted

6 min read

Jovan Chan

Jul 1

NVIDIA Cosmos 3 Nano Self-Hosting Guide 2026: vLLM Setup

#nvidia #cosmos #vllm #selfhosted

5 min read

Jovan Chan

Jul 1

Langflow Security Hardening 2026: Patch CVE-2026-5027

#langflow #security #selfhosted #cve

5 min read

Jovan Chan

Jul 1

Claude Code Dynamic Workflows in 2026: Fan Out 1,000 Subagents, What It Costs, and When to Use It Over OpenCode or Cursor

#claudecode #workflow #setupguide #opencode

5 min read

Jovan Chan

Jul 1

Terminal-Bench 2.1 in June 2026: The #1 Model Is One You Can't Use — Here's the Leaderboard That Actually Matters

#terminalbench #claudecode #codexcli #gpt55

6 min read

Jovan Chan

Jul 1

Roo Code Shut Down in 2026: The Best Alternatives and How to Migrate Your Setup

#roocode #cline #kilocode #vscodeextension

6 min read

Jovan Chan

Jun 30

LM Studio "Failed to Load Model"? Decode the Exit Code, Then Fix It (2026)

#lmstudio #localllm #troubleshooting #gpu

6 min read

Jovan Chan

Jun 30

Dell Deskside Agentic AI 2026: GB10, GB300, and the 87% Cloud Savings Claim Examined

#gpu #ai #localllm #hardware

6 min read

Jovan Chan

Jun 30

Apple M7 AI Chips in 2026: Should Home Lab Mac Buyers Wait or Buy M5 Now?

#applesilicon #mac #localllm #hardware

6 min read

Jovan Chan

Jun 30

Kimi K2.7 Code Local Setup 2026: vLLM, SGLang, GGUF

#kimi #llm #moe #selfhosted

5 min read

Jovan Chan

Jun 30

EXO Framework Setup Guide 2026: Pool Devices for Big LLMs

#exo #distributedinference #selfhosted #ai

6 min read

Jovan Chan

Jun 30

DiffusionGemma 26B Review 2026: 4x Faster, At a Cost

#gemma #diffusion #vllm #selfhosted

5 min read

Jovan Chan

Jun 30

Microsoft MAI-Code-1-Flash in GitHub Copilot 2026: 137B/5B MoE, Sub-Second Latency, and Whether It Beats Claude Haiku 4.5 on Your Bill

#githubcopilot #microsoft #review

6 min read

Jovan Chan

Jun 30

Kilo Code vs OpenCode vs Cline 2026: Three Free Open-Source Agents, One Winner

#kilocode #opencode #cline #comparison

5 min read

Jovan Chan

Jun 30

Fable 5 and Mythos 5 Got Pulled by a Government Order: The Fallback Setup Your AI Coding Stack Needs

#claude #localllm #cline #cursor

5 min read

Jovan Chan

Jun 29

Ollama 'llama runner process has terminated'? Read the Exit Code, Then Fix It (2026)

#ollama #troubleshooting #localllm #cuda

6 min read

Jovan Chan

Jun 29

NVIDIA Skipping New Consumer GPUs in 2026: What the GDDR7 Shortage Means for Your Home Lab Budget

#gpu #nvidia #rtx3090 #rtx4090

6 min read

Jovan Chan

Jun 29

NVIDIA Nemotron 3 Ultra for Local AI in 2026: 550B/55B-Active MoE, 1M Context, NVFP4 — Which Consumer GPU Can Actually Run It

#nemotron #nvidia #localllm #moe

6 min read

Jovan Chan

Jun 28

NPU vs Discrete GPU for Local LLMs in 2026: Why Computex Laptops Lose on Tokens/Second Despite the TOPS Claims

#npu #gpu #localllm #rtx3090

6 min read

Jovan Chan

Jun 28

MOSS-TTS in ComfyUI 2026: Zero-Shot Voice Cloning From a 10-Second Clip on Your RTX or Mac

#comfyui #mosstts #voicecloning #tts

6 min read

Jovan Chan

Jun 28

MiniMax M3 Local AI Hardware Guide 2026: The 428B Open-Weight Model You (Probably) Can't Run at Home

#minimaxm3 #localllm #vram #moe

6 min read

Jovan Chan

Jun 27

LM Studio Locally + LM Link 2026: Control Your Home GPU Rig From Your iPhone

#lmstudio #lmlink #locally #iphone

6 min read

Jovan Chan

Jun 27

Kimi K2.7 Code for Local AI in 2026: VRAM Requirements, the 1T-Parameter Reality, and Which GPU Crosses Into Usable Speed

#kimik2 #localllm #moe #hardwareguide

6 min read

Jovan Chan

Jun 27

GLM 5.2 for Local AI in 2026: 744B MoE, MIT License, and Why It's Effectively Cloud-Only at Home

#glm #localllm #moe #vram

6 min read

Jovan Chan

Jun 27

MOSS-TTS 1.5 Review 2026: Apache Voice Cloning on 8GB

#tts #voicecloning #selfhosted #ai

6 min read

Jovan Chan

Jun 27

MiniMax M3 Review 2026: Open-Weight 1M-Context Frontier

#minimax #llm #moe #localllm

5 min read

Jovan Chan

Jun 27

GPTQ vs AWQ vs GGUF for vLLM 2026: Which 4-Bit Wins

#vllm #quantization #gptq #awq

5 min read

Jovan Chan

Jun 27

Agentjacking 2026: How a Fake Sentry Error Hijacks Cursor, Claude Code, and Cline — and the Settings That Cut Your Exposure

#security #cursor #claudecode #cline

5 min read

Jovan Chan

Jun 27

Goose AI Agent Review 2026: Apache 2.0, Any LLM, and the Best Free Local Coding Agent?

#goose #cline #aider #claudecode

5 min read

Jovan Chan

Jun 26

Gemma 4 QAT for Local AI in 2026: How Google's June 5 Checkpoints Put the 26B in 15GB

#gemma #google #qat #quantization

6 min read

Jovan Chan

Jun 26

EXO Framework in 2026: Can You Pool RTX 3090s to Beat a DGX Spark? The Honest Distributed-Inference Reality

#distributedinference #localllm #gpu #rtx3090

6 min read

Jovan Chan

Jun 26

DiffusionGemma 26B for Local AI in 2026: 18GB VRAM, 4x Faster Generation, and Which Consumer GPUs Actually Saturate the 1,000 tok/s Ceiling

#google #diffusiongemma #localllm #gpu

6 min read

Jovan Chan

Jun 26

Google Colab CLI: Run AI Agents on Cloud GPUs 2026

#googlecolab #gpu #aider #openinterpreter

6 min read

Jovan Chan

Jun 26

DeepSeek V4 Pro Review 2026: MIT 1.6T MoE for Self-Hosters

#deepseek #llm #moe #localllm

5 min read

Jovan Chan

Jun 26

Bonsai Image 4B Review 2026: 1-Bit Local Image Gen

#bonsaiimage #imagegeneration #flux #quantization

6 min read

Jovan Chan

Jun 26

Google Colab CLI review 2026: free and cheap GPUs for Claude Code, Codex, and Cursor agents — from your terminal

#claudecode #codex #localllm #setupguide

6 min read

Jovan Chan

Jun 26

GLM 5.2 as your Cursor and Cline backend in 2026: MIT-licensed open-weight coding model, the config that works, and the honest cost math

#glm #cursor #cline #continuedev

6 min read

Jovan Chan

Jun 26

GitHub Copilot Max $100/Month: Is the New Heavy-Use Tier Worth It vs Cursor Pro and Claude Code?

#githubcopilot #pricing #comparison #cursor

5 min read

Jovan Chan

Jun 25

Qualcomm's $10B Tenstorrent Bid: What RISC-V AI Cards Mean for Home Labs in 2026

#tenstorrent #riscv #qualcomm #aiaccelerator

6 min read

Jovan Chan

Jun 25

GMKtec EVO-X2 Review 2026: A Sub-$2,000 Mini PC That Runs 235B Models on Ryzen AI Max+ 395

#amd #ryzenaimax #strixhalo #minipc

6 min read

Jovan Chan

Jun 25

AMD Ryzen AI Halo vs NVIDIA DGX Spark 2026: Which 128GB AI Dev Kit Actually Pays Off

#amd #nvidia #ryzenaimax #dgxspark

6 min read

Jovan Chan

Jun 25

Qwen3.6-35B-A3B Local Setup 2026: Ollama and 24GB VRAM

#ollama #llm #coding #selfhosted

6 min read

Jovan Chan

Jun 25

Qwen3-Coder-Next Local Setup Guide 2026: Ollama and GGUF

#ollama #llm #coding #selfhosted

5 min read

Jovan Chan

Jun 25

OpenHands Review 2026: The 76K-Star Coding Agent

#openhands #codingagents #ai #opensource

5 min read
