Markola - joshuago's ai-research Bookmarks

19 JUN 2026

Of the many things now being built and called ‘world models,’ which functional pieces actually compose that capacity — and what is each one for?

ai-research

15 MAY 2026

Interaction Models: A Scalable Approach to Human-AI Collaboration

An interaction model trained from scratch, with real-time responsiveness enabled by a multi-stream, micro-turn design. A big break from the common turn-based models.

ai-research

01 MAY 2026

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

Paper from DeepSeek detailing the architectural and infrastructure changes that make DeepSeek-V4 low-cost and token-efficient.

ai-research

06 APR 2026

Mixture of Experts (MoEs) in Transformers

Different tokens activate different experts, based on their hidden representations.

ai-research

06 APR 2026

Mixture of Experts Explained

A look at the building blocks of MoEs, how they’re trained, and the tradeoffs to consider when serving them for inference.

ai-research

25 MAR 2026

TurboQuant: Redefining AI efficiency with extreme compression

A set of advanced, theoretically grounded quantization algorithms that enable massive compression for large language models and vector search engines.

ai-research

12 MAR 2026

Introducing Nemotron 3 Super: An Open Hybrid Mamba-Transformer MoE for Agentic Reasoning

Built on a fully open, end-to-end data pipeline that spans pretraining, post-training, and interactive reinforcement learning, which gives developers reproducible building blocks.

ai-research data-infra

19 NOV 2025

AMD GPUs go brrr

HipKittens is an opinionated collection of programming primitives to help developers realize the hardware's capabilities. It includes optimized register tiles, plus 8-wave and 4-wave kernel patterns (used instead of wave specialization) to schedule work within processors. It also includes chiplet-optimized cache reuse patterns to schedule work across processors.

ai-research
