Foundation Models & Frontier AI Labs

418 items · default last 14 days

Quoting Andreas Kling

Simon Willison's Weblog now

[AINews] not much happened today

Latent Space 5h

AI enthusiasts are in a race against time, AI skeptics are in a race against entropy

Simon Willison's Weblog 12h

Reality: The Final Eval — Lukas Petersson and Axel Backlund of Andon Labs

Latent Space 15h

Quoting Emanuel Maiberg, 404 Media

Simon Willison's Weblog 19h

How Endava is redesigning software delivery around AI agents

OpenAI News yest

Dreaming: Better memory for a more helpful ChatGPT

OpenAI News yest

[AINews] Reve 2 and Ideogram 4: Layouts in Imagegen

Latent Space yest

Biodefense in the Intelligence Age

OpenAI News yest

🔬Scaling Past Informal AI - Carina Hong, Axiom Math

Latent Space yest

⚡️Satya Nadella: No Priors x Latent Space Crossover Special at Microsoft Build

Latent Space yest

Introducing the Services Track and Partner Hub of the Claude Partner Network

Anthropic News yest

What we learned mapping a year’s worth of AI-enabled cyber threats

Anthropic News yest

Introducing new capabilities to GPT-Rosalind

OpenAI News yest

Uber Caps Usage of AI Tools Like Claude Code to Manage Costs

Simon Willison's Weblog Jun 3

How Wasmer used Codex to build a Node.js runtime for the edge

OpenAI News Jun 3

A blueprint for democratic governance of frontier AI

OpenAI News Jun 3

OpenAI public policy agenda

OpenAI News Jun 3

[AINews] Microsoft Build: MAI-Thinking-1 and MAI Family models

Latent Space Jun 3

Microsoft's new MAI models

Simon Willison's Weblog Jun 2

datasette-agent-micropython 0.1a0

Simon Willison's Weblog Jun 2

micropython-wasm 0.1a1

Simon Willison's Weblog Jun 2

California Brown Pelican

Simon Willison's Weblog Jun 2

GitHub's plan for Agents — Kyle Daigle, GitHub

Latent Space Jun 2

Expanding Project Glasswing

Anthropic News Jun 2

Travelers deploys AI-powered claims countrywide with OpenAI

OpenAI News Jun 2

Codex for every role, tool, and workflow

OpenAI News Jun 2

Advancing youth safety and opportunity through global leadership

OpenAI News Jun 2

Pasted File Editor

Simon Willison's Weblog Jun 2

micropython-wasm 0.1a0

Simon Willison's Weblog Jun 2

[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

Latent Space Jun 2

Codex is becoming a productivity tool for everyone

OpenAI News Jun 2

Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked

Simon Willison's Weblog Jun 1

Our views on AI policy and political advocacy

OpenAI News Jun 1

Anthropic confidentially submits draft S-1 to the SEC

Anthropic News Jun 1

Why Video Agent models are next — Ethan He, xAI Grok Imagine

Latent Space Jun 1

Building the infrastructure for the Intelligence Age in Michigan

OpenAI News Jun 1

OpenAI frontier models and Codex are now available on AWS

OpenAI News Jun 1

May 2026 newsletter

Simon Willison's Weblog Jun 1

Introducing Claude Design by Anthropic Labs

Anthropic News Jun 1

What 81,000 people want from AI

Anthropic News Jun 1

Claude is a space to think

Anthropic News Jun 1

datasette 1.0a32

Simon Willison's Weblog May 31

Anthropic raises $65B in Series H funding at $965B post-money valuation

Anthropic News May 31

Introducing Claude Opus 4.8

Anthropic News May 31

Anthropic opens Milan office to support Italian enterprise, research, and developers

Anthropic News May 31

Anthropic appoints KiYoung Choi as Representative Director of Korea ahead of Seoul office opening

Anthropic News May 31

Anthropic co-founder Chris Olah's remarks on Pope Leo XIV's encyclical "Magnifica humanitas"

Anthropic News May 31

Project Glasswing: An initial update

Anthropic News May 31

Widening the conversation on frontier AI

Anthropic News May 31

KPMG integrates Claude across its core business and workforce of more than 276,000 in strategic alliance

Anthropic News May 31

Anthropic acquires Stainless

Anthropic News May 31

PwC is deploying Claude to build technology, execute deals, and reinvent enterprise functions for clients

Anthropic News May 31

Claude Code Enterprise

Anthropic News May 31

Full leaderboard

LLM Stats — AI Updates / Leaderboard May 31

Claude Mythos Preview

LLM Stats — AI Updates / Leaderboard May 31

Artificial Analysis Coding Agent Index

Artificial Analysis May 31

Artificial Analysis Openness Index

Artificial Analysis May 31

The solution might be cancelling my AI subscription

Simon Willison's Weblog May 31

Quoting Karen Kwok for Reuters Breakingviews

Simon Willison's Weblog May 31

How we contain Claude across products

Simon Willison's Weblog May 30

Running Python ASGI apps in the browser via Pyodide + a service worker

Simon Willison's Weblog May 30

I Am Retiring from Tech to Live Offline

Simon Willison's Weblog May 30

Quoting Daniel Jalkut

Simon Willison's Weblog May 30

[AINews] Founders and Forward Deployed Engineers

Latent Space May 30

Boston Children’s uses AI to unlock new diagnoses

OpenAI News May 29

How Braintrust turns customer requests into code with Codex

OpenAI News May 29

datasette 1.0a31

Simon Willison's Weblog May 29

Strengthening societal resilience with Rosalind Biodefense

OpenAI News May 29

[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode

Latent Space May 29

Anthropic's run-rate revenue hits $47 billion

Simon Willison's Weblog May 29

A shared playbook for trustworthy third party evaluations

OpenAI News May 29

Claude Opus 4.8: "a modest but tangible improvement"

Simon Willison's Weblog May 28

llm-anthropic 0.25.1

Simon Willison's Weblog May 28

markdown-svg-renderer

Simon Willison's Weblog May 28

The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray

Latent Space May 28

How Endava builds an agentic organization with Codex

OpenAI News May 28

[AINews] Cognition raises $1B in $26B Series D

Latent Space May 28

OpenAI’s Frontier Governance Framework

OpenAI News May 28

MUFG aims to become AI-native with OpenAI

OpenAI News May 28

sqlite AGENTS.md

Simon Willison's Weblog May 27

🔬ESM: The Bitter Lesson is Coming for Proteins - Alex Rives, BioHub

Latent Space May 27

I think Anthropic and OpenAI have found product-market fit

Simon Willison's Weblog May 27

Cisco and OpenAI redefine enterprise engineering with Codex

OpenAI News May 27

Building self-improving tax agents with Codex

OpenAI News May 27

Quoting Kyle Ferrana

Simon Willison's Weblog May 27

[AINews] New AI Infra decacorns: Fireworks, Baseten (with OpenRouter on the way)

Latent Space May 27

Warp’s big bet on building open source with GPT-5.5

OpenAI News May 27

Election information and safeguards in 2026

OpenAI News May 27

The pressure

Simon Willison's Weblog May 26

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Hugging Face Daily Papers now

Seoul National University

Hugging Face Daily Papers now

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Hugging Face Daily Papers now

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

Hugging Face Daily Papers now

University of Illinois at Urbana-Champaign

Hugging Face Daily Papers now

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

Hugging Face Daily Papers now

RobotValues: Evaluating Household Robots When Human Values Conflict

Hugging Face Daily Papers now

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Hugging Face Daily Papers now

University of Zurich, Department of Computational Linguistics

Hugging Face Daily Papers now

LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing

Hugging Face Daily Papers now

Personal AI Agent for Camera Roll VQA

Hugging Face Daily Papers now

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

Hugging Face Daily Papers now

[NEW MODEL] SupraLabs just released a new model! - Supra-50M-Reasoning

r/LocalLLaMA 1h

RTX Pro 4500 Blackwell Performance Numbers

r/LocalLLaMA 2h

Gemma 4 12B is my new main squeeze

r/LocalLLaMA 5h

hello there! i made a tool to explore kokoro.

r/LocalLLaMA 7h

Here is my llama.cpp NVFP4/MXFP6 GGUF quantizer tool

r/LocalLLaMA 7h

Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM

r/LocalLLaMA 8h

How LLM-driven NPCs work in Ultima Online (ServUO)

r/LocalLLaMA 9h

RTX Spark Ads: DJT Edition

r/LocalLLaMA 11h

finally

r/LocalLLaMA 11h

OpenAI, Grupo Folha and Grupo UOL announce strategic content partnership

OpenAI News May 25

Higgs Audio v3 TTS 4B. Built for voice chat. Support 100 languages and inline control.

r/LocalLLaMA 13h

You guys were right - Qwen 3.6 35B IS good...and KV Cache DOES matter.

r/LocalLLaMA 16h

Nvidia's been paying shills on LinkedIn

r/LocalLLaMA 20h

Today made me realize just how bad things have gotten without Meta

r/LocalLLaMA 20h

VibeOS - Fully Hallucinated Operating System

r/LocalLLaMA 21h

KVarN: new KV-cache quant from Huawei. 3–5× KV cache compression with actual speed-up instead of slow-down, and unlike TurboQuant it holds up on reasoning (Apache 2.0, vLLM single flag)

r/LocalLLaMA 21h

Cosmos 3: Omnimodal World Models for Physical AI

Hugging Face Daily Papers yest

Audio Interaction Model

Hugging Face Daily Papers yest

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Hugging Face Daily Papers yest

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Hugging Face Daily Papers yest

Qwen-Image-Flash: Beyond Objective Design

Hugging Face Daily Papers yest

M^3Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks

Hugging Face Daily Papers yest

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

Hugging Face Daily Papers yest

Intern Large Models

Hugging Face Daily Papers yest

Echo-Infinity: Learning Evolving Memory for Real-Time Infinite Video Generation

Hugging Face Daily Papers yest

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

Hugging Face Daily Papers yest

Streaming Communication in Multi-Agent Reasoning

Hugging Face Daily Papers yest

Benchmarks are Not Enough: RAMP for Runtime Assessing of Agentic Models in Production Systems

Hugging Face Daily Papers yest

nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 · Hugging Face

r/LocalLLaMA yest

nex-agi/Nex-N2-mini • Huggingface

r/LocalLLaMA yest

Gemma 4 QAT confirmed to release soon!

r/LocalLLaMA yest

Gemma 4 12b 8Q Heretic Oneshot Coding

r/LocalLLaMA yest

The first Gemma 4 12B finetunes are ready

r/LocalLLaMA yest

Me visiting this sub

r/LocalLLaMA yest

NVIDIA Nemotron 3 Ultra

Ollama Blog yest

Trump signs narrower executive order on AI oversight after industry objections

r/LocalLLaMA yest

How can the numbers be this massive within a month ??

r/LocalLLaMA yest

New Google Gemma 4 12B Claims Near-26B Performance - We Tested Both!

r/LocalLLaMA yest

Gemma 4 12B first coding agent test on a 4080 Super

r/LocalLLaMA yest

gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint

r/LocalLLaMA yest

More Gemma 4 models incoming

r/LocalLLaMA yest

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

r/LocalLLaMA yest

Let us let Google know that we want the Gemma 4 124b

r/LocalLLaMA yest

google/gemma-4-12B · Hugging Face

r/LocalLLaMA yest

OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

Hugging Face Daily Papers yest

Trust Region On-Policy Distillation

Hugging Face Daily Papers yest

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

Hugging Face Daily Papers yest

Massachusetts Institute of Technology

Hugging Face Daily Papers yest

Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking

Hugging Face Daily Papers yest

KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks

Hugging Face Daily Papers yest

HUAWEI Computing Systems Lab

Hugging Face Daily Papers yest

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

Hugging Face Daily Papers yest

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Hugging Face Daily Papers yest

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

Hugging Face Daily Papers yest

AutoMedBench: Towards Medical AutoResearch with Agentic AI Models

Hugging Face Daily Papers yest

University of California, Santa Cruz

Hugging Face Daily Papers yest

ui: Mermaid Diagrams in chat + interactive preview by allozaur · Pull Request #24032 · ggml-org/llama.cpp

r/LocalLLaMA yest

Take Three: What’s the rub on memory sessions?

r/LocalLLaMA yest

Qwen 3.7 Plus just briefly appeared and then disappeared on OpenRouter.

r/LocalLLaMA yest

How does the new abliteration tool Apostate compare with others? - Abliterlitics

r/LocalLLaMA yest

Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b

r/LocalLLaMA yest

How much VRAM needed for Qwen 3.6 27B Q8 with 262K context?

r/LocalLLaMA Jun 3

Calling it now Microsoft is buying Unsloth.

r/LocalLLaMA Jun 3

Holo3.1 35B/9B/4B/0.8B (Qwen 3.5 finetunes)

r/LocalLLaMA Jun 3

Another shout out to llama.cpp build b9455 2x3090

r/LocalLLaMA Jun 3

Microsoft Aion 1.0 Instruct and Aion 1.0 Plan models!

r/LocalLLaMA Jun 3

[AINews] All Model Labs are now Agent Labs

Latent Space May 23

Nous Research — Hermes Desktop

r/LocalLLaMA Jun 3

Why do we benchmark quants on perplexity and prose but never on tool call validity?

r/LocalLLaMA Jun 3

I Put a Datacenter GPU in My Gaming PC for £200

r/LocalLLaMA Jun 2

Minimax M3 appears to have no political censorship

r/LocalLLaMA Jun 2

I have become George Jetson: my job is now Yes/No supervision for a machine I don’t fully understand.

r/LocalLLaMA Jun 2

Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs

Hugging Face Daily Papers Jun 2

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Hugging Face Daily Papers Jun 2

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Hugging Face Daily Papers Jun 2

Technion Israel Institute of Technology

Hugging Face Daily Papers Jun 2

K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts

Hugging Face Daily Papers Jun 2

Carnegie Mellon University

Hugging Face Daily Papers Jun 2

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Hugging Face Daily Papers Jun 2

Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding

Hugging Face Daily Papers Jun 2

Shanghai Jiao Tong University

Hugging Face Daily Papers Jun 2

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

Hugging Face Daily Papers Jun 2

King's College London

Hugging Face Daily Papers Jun 2

VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization

Hugging Face Daily Papers Jun 2

1-bit Bonsai Image 4B and Ternary Bonsai Image 4B Image Generation for Local Devices with just 0.93 GB and 1.21 GB respectively of Diffusion Transformer Footprint. So tiny!

r/LocalLLaMA Jun 2

ui: Add Thinking mode toggle with reasoning effort levels + improvements for Chat Form Add Action UI by allozaur · Pull Request #23434 · ggml-org/llama.cpp

r/LocalLLaMA Jun 2

Tiny LLM Benchmark: Jetson Orin Nano Super 8GB - Four Power Modes × Eight Models

r/LocalLLaMA Jun 2

Building a free, offline LLM “tutor” grounded in one university textbook — RAG, LoRA, or both? Sanity check wanted

r/LocalLLaMA Jun 2

Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to?

r/LocalLLaMA Jun 2

Replaced Claude with local Qwen3.6-27B in my multi-agent orchestrator for 2 weeks

r/LocalLLaMA Jun 2

Dual rtx 3090 build

r/LocalLLaMA Jun 2

Qwen 3.6-35B-A3B with 977 tk/s prompt processing and 262k context window on Intel Arc B70 Pro

r/LocalLLaMA Jun 2

Intel Arc Pro B70 llama.cpp benchmarks posted

r/LocalLLaMA Jun 2

[AINews] New AI Infra unicorns: Exa, Modal, TurboPuffer

Latent Space May 22

NVIDIA releases Cosmos 3 Omnimodal world models on HF

r/LocalLLaMA Jun 2

Moss tts 1.5 8b Examples. It is currently the best voice cloning model for English as of June 2026

r/LocalLLaMA Jun 2

How Virgin Atlantic ships faster with Codex

OpenAI News May 22

OpenAI named a Leader in enterprise coding agents by Gartner

OpenAI News May 22

Stop asking what model to run. There are literally only two.

r/LocalLLaMA Jun 1

RTX Spark does not have 600GB/s Bandwidth

r/LocalLLaMA Jun 1

Giving Agents Computers — Ivan Burazin, Daytona

Latent Space May 21

We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

Google DeepMind Blog May 21

I trusted random person on this subreddit and bought 3080 20gb made of chinesium

r/LocalLLaMA Jun 1

GrepSeek: Training Search Agents for Direct Corpus Interaction

Hugging Face Daily Papers Jun 1

University of Massachusetts Amherst

Hugging Face Daily Papers Jun 1

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Hugging Face Daily Papers Jun 1

Trust-Region Behavior Blending for On-Policy Distillation

Hugging Face Daily Papers Jun 1

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Hugging Face Daily Papers Jun 1

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

Hugging Face Daily Papers Jun 1

Mellum2 Technical Report

Hugging Face Daily Papers Jun 1

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Hugging Face Daily Papers Jun 1

Knowledge Engineer Group @ Tsinghua University

Hugging Face Daily Papers Jun 1

GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration

Hugging Face Daily Papers Jun 1

Function2Scene: 3D Indoor Scene Layout from Functional Specifications

Hugging Face Daily Papers Jun 1

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

Hugging Face Daily Papers Jun 1

llama: limit max outputs of `llama_context` by am17an · Pull Request #23861 · ggml-org/llama.cpp

r/LocalLLaMA Jun 1

So qwen3.7-4b when?

r/LocalLLaMA Jun 1

i dedicate this meme to you r/LocalLLaMA

r/LocalLLaMA Jun 1

For Ling-2.6-1T, what would make the size feel justified first: quality per token, local serving reality, or long context stability?

r/LocalLLaMA Jun 1

Mellum2 Goes Open Source: A Fast Model for AI Workflows | The JetBrains AI Blog

r/LocalLLaMA Jun 1

Mellum 2 12B A2.5B

r/LocalLLaMA Jun 1

Cheap V100 32gb

r/LocalLLaMA Jun 1

AdventHealth advances whole-person care with OpenAI

OpenAI News May 21

Entire world: We need more GPUs. Meanwhile, Jensen Huang:

r/LocalLLaMA Jun 1

A 1B humanizer that matches human writing on an AI detector

r/LocalLLaMA Jun 1

Just found a 1-click RCE in pewdiepie's Odysseus Chat

r/LocalLLaMA Jun 1

Open Models - May 2026

r/LocalLLaMA Jun 1

next MiniMax will be released in ~10 Days

r/LocalLLaMA Jun 1

[AINews] OpenAI GPT-next disproves 80 year old Erdős planar unit distance problem for under $1000

Latent Space May 21

NVIDIA announces Nemotron 3 Ultra

r/LocalLLaMA Jun 1

LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis

Hugging Face Daily Papers Jun 1

Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring

Hugging Face Daily Papers Jun 1

University of Wisconsin-Madison

Hugging Face Daily Papers Jun 1

when you spend 5 days fine-tuning a model and it still confidently makes things up

r/LocalLLaMA Jun 1

MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal

r/LocalLLaMA Jun 1

Introducing physics AI at Mistral: the foundation for engineering acceleration.

Mistral AI News Jun 1

Connect the dots: Build with built-in and custom MCPs in Studio

Mistral AI News Jun 1

Minimax M3 seems to be rolling out on the API

r/LocalLLaMA Jun 1

Get you some GPUs, it's not worth the hacks around lack of RAM

r/LocalLLaMA Jun 1

Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling

r/LocalLLaMA May 31

GPU Prices. Buy now, or buy later?

r/LocalLLaMA May 31

Railway: The Agent-Native Cloud — Jake Cooper

Latent Space May 20

G7 agrees on shared language around open-source AI and open weights AI

r/LocalLLaMA May 31

God dammit Qwen

r/LocalLLaMA May 31

I ported NVIDIA Parakeet (speech-to-text) to ggml: same output as NeMo, faster, GGUF-quantized, no Python

r/LocalLLaMA May 31

Gemini Deep Research

Gemini API Release Notes / Changelog May 31

Gemini 3.1 Flash Image

Gemini API Release Notes / Changelog May 31

Video-to-image generation

Gemini API Release Notes / Changelog May 31

gemini-3.1-flash-lite

Gemini API Release Notes / Changelog May 31

antigravity-preview-05-2026

Gemini API Release Notes / Changelog May 31

gemini-robotics-er-1.6-preview

Gemini API Release Notes / Changelog May 31

deep-research-preview-04-2026

Gemini API Release Notes / Changelog May 31

deep-research-max-preview-04-2026

Gemini API Release Notes / Changelog May 31

Gemini 3.1 Flash TTS Preview

Gemini API Release Notes / Changelog May 31

veo-3.1-lite-generate-preview

Gemini API Release Notes / Changelog May 31

gemini-3.1-flash-lite-preview

Gemini API Release Notes / Changelog May 31

gemini-3.1-flash-live-preview

Gemini API Release Notes / Changelog May 31

Vibe gets to work. The unified agent for long-horizon productivity and coding, launching with Work and Code modes. Plus, a new Vibe VS Code extension. (Product, May 28, 2026)

Mistral AI News May 31

AI Now Summit 2026: Innovations for global enterprises solving the world’s hardest problems. (Company, May 28, 2026)

Mistral AI News May 31

Introducing Search Toolkit: Production search pipelines, anywhere. (Product, May 28, 2026)

Mistral AI News May 31

Physics AI research that’s shaping the industry. Published breakthroughs pushing the state of the art. (Research, May 27, 2026)

Mistral AI News May 31

Emmi joins Mistral to accelerate the AI-native industry (Company, May 23, 2026)

Mistral AI News May 31

Remote agents in Vibe. Powered by Mistral Medium 3.5. Introducing Mistral Medium 3.5, remote coding agents in Vibe, plus new Work mode in Le Chat for complex tasks. (Product, May 22, 2026)

Mistral AI News May 31

Workflows for work that runs the business: Workflows is now in public preview. (Product, April 27, 2026)

Mistral AI News May 31

Spaces: A CLI Built for Humans and Agents (Engineering, March 31, 2026)

Mistral AI News May 31

Speaking of Voxtral. Voxtral TTS: A frontier, open-weights text-to-speech model that’s fast, instantly adaptable, and produces lifelike speech for voice agents. (Research, March 23, 2026)

Mistral AI News May 31

Introducing Forge. Today, we’re introducing Forge, a system for enterprises to build frontier-grade AI models grounded in their proprietary knowledge. (Product, March 17, 2026)

Mistral AI News May 31

Introducing Mistral Small 4 (Research, March 16, 2026)

Mistral AI News May 31

Mistral AI partners with NVIDIA to accelerate open frontier models (Company, March 16, 2026)

Mistral AI News May 31

AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

Hugging Face Daily Papers May 31

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Hugging Face Daily Papers May 31

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

Hugging Face Daily Papers May 31

CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation

Hugging Face Daily Papers May 31

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Hugging Face Daily Papers May 31

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

Hugging Face Daily Papers May 31

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

Hugging Face Daily Papers May 31

GenClaw: Code-Driven Agentic Image Generation

Hugging Face Daily Papers May 31

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Hugging Face Daily Papers May 31

EarlyTom: Early Token Compression Completes Fast Video Understanding

Hugging Face Daily Papers May 31

Native Audio-Visual Alignment for Generation

Hugging Face Daily Papers May 31

UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering

Hugging Face Daily Papers May 31

What's this sub's general opinion on quantizing the KV cache

r/LocalLLaMA May 31

What's actually happening when a model spills out of VRAM into system memory?

r/LocalLLaMA May 31

Llama Studio v0.2.0

r/LocalLLaMA May 31

Qwen3.6-35B vs Gemma4-26B on 7900 XTX

r/LocalLLaMA May 31

(YT) PewDiePie released his harness/webui

r/LocalLLaMA May 31

We might have a winner with the upcoming N1X

r/LocalLLaMA May 31

Added an old 2070 Super to my rig and I can't go back...worse, now I need more

r/LocalLLaMA May 31

13 abliterated Gemma 4 E2B variants, 44 GPU hours, Benchmark and Comparison - Abliterlitics

r/LocalLLaMA May 31

Stepfun 3.7 Flash is very good

r/LocalLLaMA May 31

Flash Attention for llama.cpp on RDNA3: 47% less KV VRAM than Vulkan f16 K, KLD almost lossless on F16 K / q4_0 V. Part 1.

r/LocalLLaMA May 31

<Think> toggle button for llama.cp web chat for QWEN3.6

r/LocalLLaMA May 31

[AINews] Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0

Latent Space May 20

It's funny how everything changes, yet somehow stays the same.

r/LocalLLaMA May 31

Dell confirms XPS laptop with NVIDIA N1X at Computex (basically a DGX Spark GB10 for consumers with Windows)

r/LocalLLaMA May 31

My home data center

r/LocalLLaMA May 31

Someone out there likely needs this

r/LocalLLaMA May 30

[AINews] How to land a job at a frontier lab (on Pretraining)

Latent Space May 19

Fast-tracking genetic leads to reverse cellular aging

Google DeepMind Blog May 18

The Autonomous Drone Tech Stack & Economics of Drones — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion

Latent Space May 18

Simulate real-world places with Project Genie and Street View

Google DeepMind Blog May 17

Introducing Gemini Omni

Google DeepMind Blog May 17

Introducing Google Antigravity 2.0

Google DeepMind Blog May 17

Gemini for Science: AI experiments and tools for a new era of discovery

Google DeepMind Blog May 17

Making it easier to understand how content was created and edited

Google DeepMind Blog May 17

OpenJarvis: a local-first personal AI is now available to run with Ollama

Ollama Blog May 28

Strengthening Singapore’s AI Future: A New National Partnership

Google DeepMind Blog May 16

Finding the molecular switches behind new infectious diseases

Google DeepMind Blog May 16

Opening new paths in aging research

Google DeepMind Blog May 16

Accelerating discovery of liver disease mechanisms

Google DeepMind Blog May 16

Uniting biological toolkits for a new approach to ALS

Google DeepMind Blog May 16

Uncovering repurposed medicines to fight liver fibrosis

Google DeepMind Blog May 16

[AINews] Cerebras' $60B IPO: Slowly, then All at Once

Latent Space May 16

How WeatherNext helped the National Hurricane Center better predict Hurricane Melissa’s historic landfall in Jamaica

Google DeepMind Blog May 16

Gemini 3.5: frontier intelligence with action

Google DeepMind Blog May 15

Import AI 458: Reckoning with the future; and a singularity story

Import AI May 26

Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics

arXiv cs.CL (Computation and Language) 8h

Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning

arXiv cs.CL (Computation and Language) 8h

Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO

arXiv cs.CL (Computation and Language) 8h

Generic Triple-Latent Compression with Gated Associative Retrieval

arXiv cs.CL (Computation and Language) 8h

PEFT of SLM for Telecommunications Customer Support: A Comparative Study of LoRA Configurations with Energy Consumption Analysis

arXiv cs.CL (Computation and Language) 8h

MCBench: A Multicontext Safety Assessment Benchmark for Omni Large Language Models

arXiv cs.CL (Computation and Language) 8h

Efficient Punctuation Restoration via Weighted Lookahead Scoring Method for Streaming ASR Systems

arXiv cs.CL (Computation and Language) 8h

From Scoring to Explanations: Evaluating SHAP and LLM Rationales for Rubric-based Teaching Quality Assessment

arXiv cs.CL (Computation and Language) 8h

Multi-Granularity Reasoning for Natural Language Inference

arXiv cs.CL (Computation and Language) 8h

LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations

arXiv cs.CL (Computation and Language) 8h

The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models

arXiv cs.CL (Computation and Language) 8h

LoRi: Low-Rank Distillation for Implicit Reasoning

arXiv cs.CL (Computation and Language) 8h

A Model of Multi-turn Human Persuadability Using Probabilistic Belief Tracing

arXiv cs.CL (Computation and Language) 8h

Self-supervised User Profile Generation for Personalization

arXiv cs.CL (Computation and Language) 8h

Trajectory Dynamics in Language Model Hidden States Predict Human Processing Costs Beyond Surprisal

arXiv cs.CL (Computation and Language) 8h

POLARIS: Guiding Small Models to Write Long Stories

arXiv cs.CL (Computation and Language) yest

Discourse-Role Labels as Presentation-Time Variables for Context Use in Language Models

arXiv cs.CL (Computation and Language) yest

Computational conceptual history of scientific concepts: From early digital methods to LLMs

arXiv cs.CL (Computation and Language) yest

SaliMory: Orchestrating Cognitive Memory for Conversational Agents

arXiv cs.CL (Computation and Language) yest

When Retrieval Doesn't Help: A Large-Scale Study of Biomedical RAG

arXiv cs.CL (Computation and Language) yest

Expert-Aware Refusal Steering

arXiv cs.CL (Computation and Language) yest

A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models

arXiv cs.CL (Computation and Language) yest

ACAT: A Collaborative Platform for Efficient Aspect-Based Sentiment Dataset Annotation

arXiv cs.CL (Computation and Language) yest

Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features

arXiv cs.CL (Computation and Language) yest

MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A

arXiv cs.CL (Computation and Language) yest

Supportive Token Revealing for Fast Diffusion Language Model Decoding

arXiv cs.CL (Computation and Language) yest

Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA

arXiv cs.CL (Computation and Language) yest

Long Live Fine-Tuning: Task-Specific Transformers Outperform Zero-Shot LLMs for Misinformation Response Classification on Reddit

arXiv cs.CL (Computation and Language) yest

Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings

arXiv cs.CL (Computation and Language) yest

LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding

arXiv cs.CL (Computation and Language) yest

Announcing Day-0 Support for NVIDIA Nemotron 3 Ultra on vLLM

vLLM Blog yest

IdiomX: A Multilingual Benchmark for Idiom Understanding, Retrieval, and Interpretation

arXiv cs.CL (Computation and Language) Jun 3

Greener Than Humans? Environmental Attitudes in Large Language Models

arXiv cs.CL (Computation and Language) Jun 3

On the Persistent Effects of Lexicality in Large Language Models

arXiv cs.CL (Computation and Language) Jun 3

Topics as Proxies for Sociodemographics: How Conversational Context Affects LLM Answers

arXiv cs.CL (Computation and Language) Jun 3

Do Value Vectors in Deep Layers Need Context from the Residual Stream?

arXiv cs.CL (Computation and Language) Jun 3

Translating Classical Poetry into Modern Prose

arXiv cs.CL (Computation and Language) Jun 3

Fixing FOLIO and MALLS: Verified Annotations and an LLM-assisted Framework to Focus Human Relabeling

arXiv cs.CL (Computation and Language) Jun 3

Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions

arXiv cs.CL (Computation and Language) Jun 3

Adaptive Latent Agentic Reasoning

arXiv cs.CL (Computation and Language) Jun 3

Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States

arXiv cs.CL (Computation and Language) Jun 3

WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents

arXiv cs.CL (Computation and Language) Jun 3

The Ghost Annotator: a Framework to Explore Human Label Variation in Content Moderation through Conformal Prediction

arXiv cs.CL (Computation and Language) Jun 3

Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt

arXiv cs.CL (Computation and Language) Jun 3

Fast-dLLM++: Fréchet Profile Decoding for Faster Diffusion LLM Inference

arXiv cs.CL (Computation and Language) Jun 3

EURO-5K: When Does Domain Pretraining Matter? Benchmarking Transformers for EU Reporting Obligation Extraction

arXiv cs.CL (Computation and Language) Jun 3

Fast & Efficient LLM Inference with vLLM: A New Course with DeepLearning.AI

vLLM Blog Jun 3

Import AI 457: AI stuxnet; cursed Muon optimizer; and positive alignment

Import AI May 18

DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset

arXiv cs.CL (Computation and Language) Jun 2

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

arXiv cs.CL (Computation and Language) Jun 2

AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection

arXiv cs.CL (Computation and Language) Jun 2

CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards

arXiv cs.CL (Computation and Language) Jun 2

SENSE: Semantic Embedding Navigation with Soft-gated Evaluation for Retrieval-based Speculative Decoding

arXiv cs.CL (Computation and Language) Jun 2

lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation

arXiv cs.CL (Computation and Language) Jun 2

TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

arXiv cs.CL (Computation and Language) Jun 2

ART: Attention Run-time Termination for Efficient Large Language Model Decoding

arXiv cs.CL (Computation and Language) Jun 2

Cognitive-Linguistic Indicators of Depression in Online Communities: Analysed by DistilBERT and Holographic Reduced Representation

arXiv cs.CL (Computation and Language) Jun 2

A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models

arXiv cs.CL (Computation and Language) Jun 2

TCAR-Gen: Temporal Graph Retrieval with Evidence Fusion for Knowledge-Grounded Generation

arXiv cs.CL (Computation and Language) Jun 2

LLMs for Cardiovascular Risk Prediction from Structured Clinical Data

arXiv cs.CL (Computation and Language) Jun 2

Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study

arXiv cs.CL (Computation and Language) Jun 2

DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models

arXiv cs.CL (Computation and Language) Jun 2

Agreement Metrics for LLM-as-Judge Evaluation: What to Report and Why

arXiv cs.CL (Computation and Language) Jun 2

Session-Aware Agentic Routing: Continuity-Aware Model Selection for Long-Horizon LLM Agents

vLLM Blog Jun 2

Accelerating vLLM-Omni Inference with AutoRound Quantization

vLLM Blog Jun 2

Protocol for evaluating ChatGPT in biomedical association generation and verification using a RAG-enabled, cross-model majority voting workflow

arXiv cs.CL (Computation and Language) Jun 1

Exploring Autonomous Agentic Data Engineering for Model Specialization

arXiv cs.CL (Computation and Language) Jun 1

Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology

arXiv cs.CL (Computation and Language) Jun 1

Cross-Lingual Steering for Figurative Language Generation

arXiv cs.CL (Computation and Language) Jun 1

Can LLM Teams Play What? Where? When?

arXiv cs.CL (Computation and Language) Jun 1

Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study

arXiv cs.CL (Computation and Language) Jun 1

Your Multimodal Speech Model Says I Have a Face for Radio

arXiv cs.CL (Computation and Language) Jun 1

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

arXiv cs.CL (Computation and Language) Jun 1

Configurable Reward Model for Balanced Safety Alignment

arXiv cs.CL (Computation and Language) Jun 1

CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law

arXiv cs.CL (Computation and Language) Jun 1

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

arXiv cs.CL (Computation and Language) Jun 1

Auditing LLM Benchmarks with Item Response Theory

arXiv cs.CL (Computation and Language) Jun 1

Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs

arXiv cs.CL (Computation and Language) Jun 1

Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages

arXiv cs.CL (Computation and Language) Jun 1

Refining Word-Based Grammatical Error Annotation for L2 Korean

arXiv cs.CL (Computation and Language) Jun 1

vLLM on the DGX Spark: Architecture, Configuration, and Local Evaluation

vLLM Blog Jun 1

Accelerating Laguna XS.2 Inference with vLLM, Speculators, and LLM Compressor

vLLM Blog May 28

Native RL APIs in vLLM

vLLM Blog May 28

Speculators v0.5.0: DFlash Support and Online Training

vLLM Blog May 28

From Text to Multimodal Routing: Hardening Vision Signals in vLLM Semantic Router

vLLM Blog May 28

Import AI 456: RSI and economic growth; radical optionality for AI regulation; and a neural computer

Import AI May 11

EAGLE 3.1: Advancing Speculative Decoding Through Collaboration Between the EAGLE Team, vLLM, and TorchSpec

vLLM Blog May 26

Import AI 455: AI systems are about to start building themselves.

Import AI May 4

vLLM x Novita AI: PegaFlow for Production-Grade External KV Cache

vLLM Blog May 18

Elastic Expert Parallelism in vLLM

vLLM Blog May 14

Announcing VeRL-Omni: Easy, Fast, and Stable RL Training for Diffusion and Omni-Modality Models

vLLM Blog May 14

A First Comprehensive Study of TurboQuant: Accuracy and Performance

vLLM Blog May 11

vLLM Tops the Artificial Analysis Leaderboard

vLLM Blog May 11

Serving Agentic Workloads at Scale with vLLM x Mooncake

vLLM Blog May 6

Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4

Import AI Apr 20

Import AI 453: Breaking AI agents; MirrorCode; and ten views on gradual disempowerment

Import AI Apr 13

Run Highly Efficient Multimodal Agentic AI with NVIDIA Nemotron 3 Nano Omni Using vLLM

vLLM Blog Apr 28

DeepSeek V4 in vLLM: Efficient Long-context Attention

vLLM Blog Apr 24

The State of FP8 KV-Cache and Attention Quantization in vLLM

vLLM Blog Apr 22

Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over GDP forecasting

Import AI Apr 6
