DEV Community

# nvidia

Posts

CUDA 13.3 Lands, AI Writes Blackwell Kernels, & FP4 VRAM Optimization for LLMs

Comments
3 min read
FlashAttention CUDA Kernel, Strix Halo MOE Boost, & NVIDIA DLSS 4.5 Driver Update

Comments
3 min read
From Chatbot to Agent — Tool Calling with NVIDIA NIM

1
Comments
7 min read
PatentLLM: CUDA TileLang/Triton B200 5x Speedup, RTX 5090 Power, PTX Grammar

Comments
3 min read
Tesla P40 in a Homelab: 24GB of Inference on a Budget

Comments
6 min read
RTX 5080 Undervolt Benchmarks, CGO-Free CUDA API Binding, & AMD GPU Compatibility Fix

Comments
3 min read
Diffusion Language Models Are Here: Deep Dive into NVIDIA's Nemotron-Labs DLM Architecture

Comments
15 min read
NVIDIA's Nemotron Diffusion: One Model, Three Generation Modes, 6x Faster

Comments
3 min read
AMD GPU/AI Launches, Legacy Driver Update & CUDA Optimization Platform

Comments
3 min read
RTX 5090 Cooling, BeeLlama VRAM Opts, Resizable BAR Performance Gains

1
Comments
4 min read
LLM Compilers, GGUF Quantization, & Radeon RX 9060 Benchmarks

Comments
3 min read
Go+CUDA Optimization, LLM VRAM Benchmarks & NVIDIA G-SYNC Firmware 1.1.6

2
Comments
3 min read
Who Wins the Future: Chips vs Frontier LLMs (2026)

1
Comments
17 min read
Intel Xe3P Leaks 160GB LPDDR5X; FlashAttention-2 in CuTe & Custom CUDA GPT-2 Engine

Comments
3 min read
GPU Bottleneck Analyzer, NVIDIA Rubin VRAM Demands, and Qwen VRAM Optimization

1
Comments
4 min read