Foundation Models & Frontier AI Labs

Quoting Andreas Kling Simon Willison's Weblog now
[AINews] not much happened today Latent Space 5h
AI enthusiasts are in a race against time, AI skeptics are in a race against entropy Simon Willison's Weblog 12h
Reality: The Final Eval — Lukas Petersson and Axel Backlund of Andon Labs Latent Space 15h
Quoting Emanuel Maiberg, 404 Media Simon Willison's Weblog 19h
How Endava is redesigning software delivery around AI agents OpenAI News yest
Dreaming: Better memory for a more helpful ChatGPT OpenAI News yest
[AINews] Reve 2 and Ideogram 4: Layouts in Imagegen Latent Space yest
Biodefense in the Intelligence Age OpenAI News yest
🔬Scaling Past Informal AI - Carina Hong, Axiom Math Latent Space yest
⚡️Satya Nadella: No Priors x Latent Space Crossover Special at Microsoft Build Latent Space yest
Introducing the Services Track and Partner Hub of the Claude Partner Network Anthropic News yest
What we learned mapping a year’s worth of AI-enabled cyber threats Anthropic News yest
Introducing new capabilities to GPT-Rosalind OpenAI News yest
Uber Caps Usage of AI Tools Like Claude Code to Manage Costs Simon Willison's Weblog Jun 3
How Wasmer used Codex to build a Node.js runtime for the edge OpenAI News Jun 3
A blueprint for democratic governance of frontier AI OpenAI News Jun 3
OpenAI public policy agenda OpenAI News Jun 3
[AINews] Microsoft Build: MAI-Thinking-1 and MAI Family models Latent Space Jun 3
Microsoft's new MAI models Simon Willison's Weblog Jun 2
datasette-agent-micropython 0.1a0 Simon Willison's Weblog Jun 2
micropython-wasm 0.1a1 Simon Willison's Weblog Jun 2
California Brown Pelican Simon Willison's Weblog Jun 2
GitHub's plan for Agents — Kyle Daigle, GitHub Latent Space Jun 2
Expanding Project Glasswing Anthropic News Jun 2
Travelers deploys AI-powered claims countrywide with OpenAI OpenAI News Jun 2
Codex for every role, tool, and workflow OpenAI News Jun 2
Advancing youth safety and opportunity through global leadership OpenAI News Jun 2
Pasted File Editor Simon Willison's Weblog Jun 2
micropython-wasm 0.1a0 Simon Willison's Weblog Jun 2
[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark Latent Space Jun 2
Codex is becoming a productivity tool for everyone OpenAI News Jun 2
Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked Simon Willison's Weblog Jun 1
Our views on AI policy and political advocacy OpenAI News Jun 1
Anthropic confidentially submits draft S-1 to the SEC Anthropic News Jun 1
Why Video Agent models are next — Ethan He, xAI Grok Imagine Latent Space Jun 1
Building the infrastructure for the Intelligence Age in Michigan OpenAI News Jun 1
OpenAI frontier models and Codex are now available on AWS OpenAI News Jun 1
May 2026 newsletter Simon Willison's Weblog Jun 1
Introducing Claude Design by Anthropic Labs Anthropic News Jun 1
What 81,000 people want from AI Anthropic News Jun 1
Claude is a space to think Anthropic News Jun 1
datasette 1.0a32 Simon Willison's Weblog May 31
Anthropic raises $65B in Series H funding at $965B post-money valuation Anthropic News May 31
Introducing Claude Opus 4.8 Anthropic News May 31
Anthropic opens Milan office to support Italian enterprise, research, and developers Anthropic News May 31
Anthropic appoints KiYoung Choi as Representative Director of Korea ahead of Seoul office opening Anthropic News May 31
Anthropic co-founder Chris Olah's remarks on Pope Leo XIV's encyclical "Magnifica humanitas" Anthropic News May 31
Project Glasswing: An initial update Anthropic News May 31
Widening the conversation on frontier AI Anthropic News May 31
KPMG integrates Claude across its core business and workforce of more than 276,000 in strategic alliance Anthropic News May 31
Anthropic acquires Stainless Anthropic News May 31
PwC is deploying Claude to build technology, execute deals, and reinvent enterprise functions for clients Anthropic News May 31
Claude Code Enterprise Anthropic News May 31
Full leaderboard LLM Stats — AI Updates / Leaderboard May 31
Claude Mythos Preview LLM Stats — AI Updates / Leaderboard May 31
Artificial Analysis Coding Agent Index Artificial Analysis May 31
Artificial Analysis Openness Index Artificial Analysis May 31
The solution might be cancelling my AI subscription Simon Willison's Weblog May 31
Quoting Karen Kwok for Reuters Breakingviews Simon Willison's Weblog May 31
How we contain Claude across products Simon Willison's Weblog May 30
Running Python ASGI apps in the browser via Pyodide + a service worker Simon Willison's Weblog May 30
I Am Retiring from Tech to Live Offline Simon Willison's Weblog May 30
Quoting Daniel Jalkut Simon Willison's Weblog May 30
[AINews] Founders and Forward Deployed Engineers Latent Space May 30
Boston Children’s uses AI to unlock new diagnoses OpenAI News May 29
How Braintrust turns customer requests into code with Codex OpenAI News May 29
datasette 1.0a31 Simon Willison's Weblog May 29
Strengthening societal resilience with Rosalind Biodefense OpenAI News May 29
[AINews] Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode Latent Space May 29
Anthropic's run-rate revenue hits $47 billion Simon Willison's Weblog May 29
A shared playbook for trustworthy third party evaluations OpenAI News May 29
Claude Opus 4.8: "a modest but tangible improvement" Simon Willison's Weblog May 28
llm-anthropic 0.25.1 Simon Willison's Weblog May 28
markdown-svg-renderer Simon Willison's Weblog May 28
The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray Latent Space May 28
How Endava builds an agentic organization with Codex OpenAI News May 28
[AINews] Cognition raises $1B in $26B Series D Latent Space May 28
OpenAI’s Frontier Governance Framework OpenAI News May 28
MUFG aims to become AI-native with OpenAI OpenAI News May 28
sqlite AGENTS.md Simon Willison's Weblog May 27
🔬ESM: The Bitter Lesson is Coming for Proteins - Alex Rives, BioHub Latent Space May 27
I think Anthropic and OpenAI have found product-market fit Simon Willison's Weblog May 27
Cisco and OpenAI redefine enterprise engineering with Codex OpenAI News May 27
Building self-improving tax agents with Codex OpenAI News May 27
Quoting Kyle Ferrana Simon Willison's Weblog May 27
[AINews] New AI Infra decacorns: Fireworks, Baseten (with OpenRouter on the way) Latent Space May 27
Warp’s big bet on building open source with GPT-5.5 OpenAI News May 27
Election information and safeguards in 2026 OpenAI News May 27
The pressure Simon Willison's Weblog May 26
ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time? Hugging Face Daily Papers now
Seoul National University Hugging Face Daily Papers now
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration Hugging Face Daily Papers now
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints Hugging Face Daily Papers now
University of Illinois at Urbana-Champaign Hugging Face Daily Papers now
VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding Hugging Face Daily Papers now
RobotValues: Evaluating Household Robots When Human Values Conflict Hugging Face Daily Papers now
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation Hugging Face Daily Papers now
University of Zurich, Department of Computational Linguistics Hugging Face Daily Papers now
LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing Hugging Face Daily Papers now
Personal AI Agent for Camera Roll VQA Hugging Face Daily Papers now
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents Hugging Face Daily Papers now
[NEW MODEL] SupraLabs just released a new model! - Supra-50M-Reasoning r/LocalLLaMA 1h
RTX Pro 4500 Blackwell Performance Numbers r/LocalLLaMA 2h
Gemma 4 12B is my new main squeeze r/LocalLLaMA 5h
hello there! i made a tool to explore kokoro. r/LocalLLaMA 7h
Here is my llama.cpp NVFP4/MXFP6 GGUF quantizer tool r/LocalLLaMA 7h
Finally finished my LLM server: EPYC 9575F, 4× RTX 3090 (96GB VRAM), 768GB ECC RAM r/LocalLLaMA 8h
How LLM-driven NPCs work in Ultima Online (ServUO) r/LocalLLaMA 9h
RTX Spark Ads: DJT Edition r/LocalLLaMA 11h
finally r/LocalLLaMA 11h
OpenAI, Grupo Folha and Grupo UOL announce strategic content partnership OpenAI News May 25
Higgs Audio v3 TTS 4B. Built for voice chat. Support 100 languages and inline control. r/LocalLLaMA 13h
You guys were right - Qwen 3.6 35B IS good...and KV Cache DOES matter. r/LocalLLaMA 16h
Nvidia's been paying shills on LinkedIn r/LocalLLaMA 20h
Today made me realize just how bad things have gotten without Meta r/LocalLLaMA 20h
VibeOS - Fully Hallucinated Operating System r/LocalLLaMA 21h
KVarN: new KV-cache quant from Huawei. 3–5× KV cache compression with actual speed-up instead of slow-down, and unlike TurboQuant it holds up on reasoning (Apache 2.0, vLLM single flag) r/LocalLLaMA 21h
Cosmos 3: Omnimodal World Models for Physical AI Hugging Face Daily Papers yest
Audio Interaction Model Hugging Face Daily Papers yest
Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories Hugging Face Daily Papers yest
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Hugging Face Daily Papers yest
Qwen-Image-Flash: Beyond Objective Design Hugging Face Daily Papers yest
M^3Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks Hugging Face Daily Papers yest
OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs Hugging Face Daily Papers yest
Intern Large Models Hugging Face Daily Papers yest
Echo-Infinity: Learning Evolving Memory for Real-Time Infinite Video Generation Hugging Face Daily Papers yest
ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning Hugging Face Daily Papers yest
Streaming Communication in Multi-Agent Reasoning Hugging Face Daily Papers yest
Benchmarks are Not Enough: RAMP for Runtime Assessing of Agentic Models in Production Systems Hugging Face Daily Papers yest
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 · Hugging Face r/LocalLLaMA yest
nex-agi/Nex-N2-mini • Huggingface r/LocalLLaMA yest
Gemma 4 QAT confirmed to release soon! r/LocalLLaMA yest
Gemma 4 12b 8Q Heretic Oneshot Coding r/LocalLLaMA yest
The first Gemma 4 12B finetunes are ready r/LocalLLaMA yest
Me visiting this sub r/LocalLLaMA yest
NVIDIA Nemotron 3 Ultra Ollama Blog yest
Trump signs narrower executive order on AI oversight after industry objections r/LocalLLaMA yest
How can the numbers be this massive within a month ?? r/LocalLLaMA yest
New Google Gemma 4 12B Claims Near-26B Performance - We Tested Both! r/LocalLLaMA yest
Gemma 4 12B first coding agent test on a 4080 Super r/LocalLLaMA yest
gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks: Qwen is overall winner beating gemma in 5/8 benchmarks despite a smaller footprint r/LocalLLaMA yest
More Gemma 4 models incoming r/LocalLLaMA yest
Introducing Gemma 4 12B: a unified, encoder-free multimodal model r/LocalLLaMA yest
Let us let Google know that we want the Gemma 4 124b r/LocalLLaMA yest
google/gemma-4-12B · Hugging Face r/LocalLLaMA yest
OCC-RAG: Optimal Cognitive Core for Faithful Question Answering Hugging Face Daily Papers yest
Trust Region On-Policy Distillation Hugging Face Daily Papers yest
From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain Hugging Face Daily Papers yest
Massachusetts Institute of Technology Hugging Face Daily Papers yest
Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking Hugging Face Daily Papers yest
KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Hugging Face Daily Papers yest
HUAWEI Computing Systems Lab Hugging Face Daily Papers yest
A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL Hugging Face Daily Papers yest
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection Hugging Face Daily Papers yest
World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning Hugging Face Daily Papers yest
AutoMedBench: Towards Medical AutoResearch with Agentic AI Models Hugging Face Daily Papers yest
University of California, Santa Cruz Hugging Face Daily Papers yest
ui: Mermaid Diagrams in chat + interactive preview by allozaur · Pull Request #24032 · ggml-org/llama.cpp r/LocalLLaMA yest
Take Three: What’s the rub on memory sessions? r/LocalLLaMA yest
Qwen 3.7 Plus just briefly appeared and then disappeared on OpenRouter. r/LocalLLaMA yest
How does the new abliteration tool Apostate compare with others? - Abliterlitics r/LocalLLaMA yest
Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b r/LocalLLaMA yest
How much VRAM needed for Qwen 3.6 27B Q8 with 262K context? r/LocalLLaMA Jun 3
Calling it now Microsoft is buying Unsloth. r/LocalLLaMA Jun 3
Holo3.1 35B/9B/4B/0.8B (Qwen 3.5 finetunes) r/LocalLLaMA Jun 3
Another shout out to llama.cpp build b9455 2x3090 r/LocalLLaMA Jun 3
Microsoft Aion 1.0 Instruct and Aion 1.0 Plan models! r/LocalLLaMA Jun 3
[AINews] All Model Labs are now Agent Labs Latent Space May 23
Nous Research — Hermes Desktop r/LocalLLaMA Jun 3
Why do we benchmark quants on perplexity and prose but never on tool call validity? r/LocalLLaMA Jun 3
I Put a Datacenter GPU in My Gaming PC for £200 r/LocalLLaMA Jun 2
Minimax M3 appears to have no political censorship r/LocalLLaMA Jun 2
I have become George Jetson: my job is now Yes/No supervision for a machine I don’t fully understand. r/LocalLLaMA Jun 2
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Hugging Face Daily Papers Jun 2
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Hugging Face Daily Papers Jun 2
A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks Hugging Face Daily Papers Jun 2
Technion Israel Institute of Technology Hugging Face Daily Papers Jun 2
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts Hugging Face Daily Papers Jun 2
Carnegie Mellon University Hugging Face Daily Papers Jun 2
Draft-OPD: On-Policy Distillation for Speculative Draft Models Hugging Face Daily Papers Jun 2
Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding Hugging Face Daily Papers Jun 2
Shanghai Jiao Tong University Hugging Face Daily Papers Jun 2
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs Hugging Face Daily Papers Jun 2
King's College London Hugging Face Daily Papers Jun 2
VLMs are Good Teachers for Video Reasoning via Adaptive Test-Time Optimization Hugging Face Daily Papers Jun 2
1-bit Bonsai Image 4B and Ternary Bonsai Image 4B Image Generation for Local Devices with just 0.93 GB and 1.21 GB respectively of Diffusion Transformer Footprint. So tiny! r/LocalLLaMA Jun 2
ui: Add Thinking mode toggle with reasoning effort levels + improvements for Chat Form Add Action UI by allozaur · Pull Request #23434 · ggml-org/llama.cpp r/LocalLLaMA Jun 2
Tiny LLM Benchmark: Jetson Orin Nano Super 8GB - Four Power Modes × Eight Models r/LocalLLaMA Jun 2
Building a free, offline LLM “tutor” grounded in one university textbook — RAG, LoRA, or both? Sanity check wanted r/LocalLLaMA Jun 2
Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to? r/LocalLLaMA Jun 2
Replaced Claude with local Qwen3.6-27B in my multi-agent orchestrator for 2 weeks r/LocalLLaMA Jun 2
Dual rtx 3090 build r/LocalLLaMA Jun 2
Qwen 3.6-35B-A3B with 977 tk/s prompt processing and 262k context window on Intel Arc B70 Pro r/LocalLLaMA Jun 2
Intel Arc Pro B70 llama.cpp benchmarks posted r/LocalLLaMA Jun 2
[AINews] New AI Infra unicorns: Exa, Modal, TurboPuffer Latent Space May 22
NVIDIA releases Cosmos 3 Omnimodal world models on HF r/LocalLLaMA Jun 2
Moss tts 1.5 8b Examples. It is the currently best voice cloning model for English as of June 2026 r/LocalLLaMA Jun 2
How Virgin Atlantic ships faster with Codex OpenAI News May 22
OpenAI named a Leader in enterprise coding agents by Gartner OpenAI News May 22
Stop asking what model to run. There are literally only two. r/LocalLLaMA Jun 1
RTX Spark does not have 600GB/s Bandwidth r/LocalLLaMA Jun 1
Giving Agents Computers — Ivan Burazin, Daytona Latent Space May 21
We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks Google DeepMind Blog May 21
I trusted random person on this subreddit and bought 3080 20gb made of chinesium r/LocalLLaMA Jun 1
GrepSeek: Training Search Agents for Direct Corpus Interaction Hugging Face Daily Papers Jun 1
University of Massachusetts Amherst Hugging Face Daily Papers Jun 1
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Hugging Face Daily Papers Jun 1
Trust-Region Behavior Blending for On-Policy Distillation Hugging Face Daily Papers Jun 1
Representation Forcing for Bottleneck-Free Unified Multimodal Models Hugging Face Daily Papers Jun 1
SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue Hugging Face Daily Papers Jun 1
Mellum2 Technical Report Hugging Face Daily Papers Jun 1
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards Hugging Face Daily Papers Jun 1
Knowledge Engineer Group @ Tsinghua University Hugging Face Daily Papers Jun 1
GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration Hugging Face Daily Papers Jun 1
Function2Scene: 3D Indoor Scene Layout from Functional Specifications Hugging Face Daily Papers Jun 1
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer Hugging Face Daily Papers Jun 1
llama: limit max outputs of `llama_context` by am17an · Pull Request #23861 · ggml-org/llama.cpp r/LocalLLaMA Jun 1
So qwen3.7-4b when? r/LocalLLaMA Jun 1
i dedicate this meme to you r/LocalLLaMA r/LocalLLaMA Jun 1
For Ling-2.6-1T, what would make the size feel justified first: quality per token, local serving reality, or long context stability? r/LocalLLaMA Jun 1
Mellum2 Goes Open Source: A Fast Model for AI Workflows | The JetBrains AI Blog r/LocalLLaMA Jun 1
Mellum 2 12B A2.5B r/LocalLLaMA Jun 1
Cheap V100 32gb r/LocalLLaMA Jun 1
AdventHealth advances whole-person care with OpenAI OpenAI News May 21
Entire world: We need more GPUs. Meanwhile, Jensen Huang: r/LocalLLaMA Jun 1
A 1B humanizer that matches human writing on an AI detector r/LocalLLaMA Jun 1
Just found a 1-click RCE in pewdiepie's Odysseus Chat r/LocalLLaMA Jun 1
Open Models - May 2026 r/LocalLLaMA Jun 1
next MiniMax will be released in ~10 Days r/LocalLLaMA Jun 1
[AINews] OpenAI GPT-next disproves 80 year old Erdős planar unit distance problem for under $1000 Latent Space May 21
NVIDIA announces Nemotron 3 Ultra r/LocalLLaMA Jun 1
LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis Hugging Face Daily Papers Jun 1
Hide-and-Seek in Trajectories: Discovering Failure Signals for VLA Runtime Monitoring Hugging Face Daily Papers Jun 1
University of Wisconsin-Madison Hugging Face Daily Papers Jun 1
when you spend 5 days fine-tuning a model and it still confidently makes things up r/LocalLLaMA Jun 1
MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal r/LocalLLaMA Jun 1
Introducing physics AI at Mistral: the foundation for engineering acceleration. Mistral AI News Jun 1
Connect the dots: Build with built-in and custom MCPs in Studio Mistral AI News Jun 1
Minimax M3 seems to be rolling out on the API r/LocalLLaMA Jun 1
Get you some GPUs, it's not worth the hacks around lack of RAM r/LocalLLaMA Jun 1
Semantic Step Prediction: Multi-Step Latent Forecasting in LLM Reasoning Trajectories via Step Sampling r/LocalLLaMA May 31
GPU Prices. Buy now, or buy later? r/LocalLLaMA May 31
Railway: The Agent-Native Cloud — Jake Cooper Latent Space May 20
G7 agrees on shared language around open-source AI and open weights AI r/LocalLLaMA May 31
God dammit Qwen r/LocalLLaMA May 31
I ported NVIDIA Parakeet (speech-to-text) to ggml: same output as NeMo, faster, GGUF-quantized, no Python r/LocalLLaMA May 31
Gemini Deep Research Gemini API Release Notes / Changelog May 31
Gemini 3.1 Flash Image Gemini API Release Notes / Changelog May 31
Video-to-image generation Gemini API Release Notes / Changelog May 31
gemini-3.1-flash-lite Gemini API Release Notes / Changelog May 31
antigravity-preview-05-2026 Gemini API Release Notes / Changelog May 31
gemini-robotics-er-1.6-preview Gemini API Release Notes / Changelog May 31
deep-research-preview-04-2026 Gemini API Release Notes / Changelog May 31
deep-research-max-preview-04-2026 Gemini API Release Notes / Changelog May 31
Gemini 3.1 Flash TTS Preview Gemini API Release Notes / Changelog May 31
veo-3.1-lite-generate-preview Gemini API Release Notes / Changelog May 31
gemini-3.1-flash-lite-preview Gemini API Release Notes / Changelog May 31
gemini-3.1-flash-live-preview Gemini API Release Notes / Changelog May 31
[Product] Vibe gets to work. The unified agent for long-horizon productivity and coding, launching with Work and Code modes. Plus, a new Vibe VS Code extension. (May 28, 2026) Mistral AI News May 31
[Company] AI Now Summit 2026. Innovations for global enterprises solving the world’s hardest problems. (May 28, 2026) Mistral AI News May 31
[Product] Introducing Search Toolkit. Production search pipelines, anywhere. (May 28, 2026) Mistral AI News May 31
[Research] Physics AI research that’s shaping the industry. Published breakthroughs pushing the state of the art. (May 27, 2026) Mistral AI News May 31
[Company] Emmi joins Mistral to accelerate the AI-native industry (May 23, 2026) Mistral AI News May 31
[Product] Remote agents in Vibe. Powered by Mistral Medium 3.5. Introducing Mistral Medium 3.5, remote coding agents in Vibe, plus new Work mode in Le Chat for complex tasks. (May 22, 2026) Mistral AI News May 31
[Product] Workflows for work that runs the business. Workflows is now in public preview. (April 27, 2026) Mistral AI News May 31
[Engineering] Spaces: A CLI Built for Humans and Agents (March 31, 2026) Mistral AI News May 31
[Research] Speaking of Voxtral. Voxtral TTS: A frontier, open-weights text-to-speech model that’s fast, instantly adaptable, and produces lifelike speech for voice agents. (March 23, 2026) Mistral AI News May 31
[Product] Introducing Forge. Today, we’re introducing Forge, a system for enterprises to build frontier-grade AI models grounded in their proprietary knowledge. (March 17, 2026) Mistral AI News May 31
[Research] Introducing Mistral Small 4 (March 16, 2026) Mistral AI News May 31
[Company] Mistral AI partners with NVIDIA to accelerate open frontier models (March 16, 2026) Mistral AI News May 31
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security Hugging Face Daily Papers May 31
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Hugging Face Daily Papers May 31
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources Hugging Face Daily Papers May 31
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation Hugging Face Daily Papers May 31
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Hugging Face Daily Papers May 31
YoCausal: How Far is Video Generation from World Model? A Causality Perspective Hugging Face Daily Papers May 31
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Hugging Face Daily Papers May 31
GenClaw: Code-Driven Agentic Image Generation Hugging Face Daily Papers May 31
How LoRA Remembers? A Parametric Memory Law for LLM Finetuning Hugging Face Daily Papers May 31
EarlyTom: Early Token Compression Completes Fast Video Understanding Hugging Face Daily Papers May 31
Native Audio-Visual Alignment for Generation Hugging Face Daily Papers May 31
UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering Hugging Face Daily Papers May 31
What's this sub's general opinion on quantizing the KV cache r/LocalLLaMA May 31
What's actually happening when a model spills out of VRAM into system memory? r/LocalLLaMA May 31
Llama Studio v0.2.0 r/LocalLLaMA May 31
Qwen3.6-35B vs Gemma4-26B on 7900 XTX r/LocalLLaMA May 31
(YT) PewDiePie released his harness/webui r/LocalLLaMA May 31
We might have a winner with the upcoming N1X r/LocalLLaMA May 31
Added an old 2070 Super to my rig and I can't go back...worse, now I need more r/LocalLLaMA May 31
13 abliterated Gemma 4 E2B variants, 44 GPU hours, Benchmark and Comparison - Abliterlitics r/LocalLLaMA May 31
Stepfun 3.7 Flash is very good r/LocalLLaMA May 31
Flash Attention for llama.cpp on RDNA3: 47% less KV VRAM than Vulkan f16 K, KLD almost lossless on F16 K / q4_0 V. Part 1. r/LocalLLaMA May 31
<Think> toggle button for llama.cpp web chat for QWEN3.6 r/LocalLLaMA May 31
[AINews] Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0 Latent Space May 20
It's funny how everything changes, yet somehow stays the same. r/LocalLLaMA May 31
Dell confirms XPS laptop with NVIDIA N1X at Computex ( basically a DGX Spark GB10 for consumers with Windows ) r/LocalLLaMA May 31
My home data center r/LocalLLaMA May 31
Someone out there likely needs this r/LocalLLaMA May 30
[AINews] How to land a job at a frontier lab (on Pretraining) Latent Space May 19
Fast-tracking genetic leads to reverse cellular aging Google DeepMind Blog May 18
The Autonomous Drone Tech Stack & Economics of Drones — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion Latent Space May 18
Simulate real-world places with Project Genie and Street View Google DeepMind Blog May 17
Introducing Gemini Omni Google DeepMind Blog May 17
Introducing Google Antigravity 2.0 Google DeepMind Blog May 17
Gemini for Science: AI experiments and tools for a new era of discovery Google DeepMind Blog May 17
Making it easier to understand how content was created and edited Google DeepMind Blog May 17
OpenJarvis: a local-first personal AI is now available to run with Ollama Ollama Blog May 28
Strengthening Singapore’s AI Future: A New National Partnership Google DeepMind Blog May 16
Finding the molecular switches behind new infectious diseases Google DeepMind Blog May 16
Opening new paths in aging research Google DeepMind Blog May 16
Accelerating discovery of liver disease mechanisms Google DeepMind Blog May 16
Uniting biological toolkits for a new approach to ALS Google DeepMind Blog May 16
Uncovering repurposed medicines to fight liver fibrosis Google DeepMind Blog May 16
[AINews] Cerebras' $60B IPO: Slowly, then All at Once Latent Space May 16
How WeatherNext helped the National Hurricane Center better predict Hurricane Melissa’s historic landfall in Jamaica Google DeepMind Blog May 16
Gemini 3.5: frontier intelligence with action Google DeepMind Blog May 15
Import AI 458: Reckoning with the future; and a singularity story Import AI May 26
Epidemiology of Model Collapse: Modeling Synthetic Data Contamination via Bilayer SIR Dynamics arXiv cs.CL (Computation and Language) 8h
Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning arXiv cs.CL (Computation and Language) 8h
Improving Heart-Focused Medical Question Answering in LLMs via Variance-Aware Rubric Rewards with GRPO arXiv cs.CL (Computation and Language) 8h
Generic Triple-Latent Compression with Gated Associative Retrieval arXiv cs.CL (Computation and Language) 8h
PEFT of SLM for Telecommunications Customer Support: A Comparative Study of LoRA Configurations with Energy Consumption Analysis arXiv cs.CL (Computation and Language) 8h
MCBench: A Multicontext Safety Assessment Benchmark for Omni Large Language Models arXiv cs.CL (Computation and Language) 8h
Efficient Punctuation Restoration via Weighted Lookahead Scoring Method for Streaming ASR Systems arXiv cs.CL (Computation and Language) 8h
From Scoring to Explanations: Evaluating SHAP and LLM Rationales for Rubric-based Teaching Quality Assessment arXiv cs.CL (Computation and Language) 8h
Multi-Granularity Reasoning for Natural Language Inference arXiv cs.CL (Computation and Language) 8h
LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations arXiv cs.CL (Computation and Language) 8h
The Granularity Gap: A Multi-Dimensional Longitudinal Audit of Sycophancy in Gemini Models arXiv cs.CL (Computation and Language) 8h
LoRi: Low-Rank Distillation for Implicit Reasoning arXiv cs.CL (Computation and Language) 8h
A Model of Multi-turn Human Persuadability Using Probabilistic Belief Tracing arXiv cs.CL (Computation and Language) 8h
Self-supervised User Profile Generation for Personalization arXiv cs.CL (Computation and Language) 8h
Trajectory Dynamics in Language Model Hidden States Predict Human Processing Costs Beyond Surprisal arXiv cs.CL (Computation and Language) 8h
POLARIS: Guiding Small Models to Write Long Stories arXiv cs.CL (Computation and Language) yest
Discourse-Role Labels as Presentation-Time Variables for Context Use in Language Models arXiv cs.CL (Computation and Language) yest
Computational conceptual history of scientific concepts: From early digital methods to LLMs arXiv cs.CL (Computation and Language) yest
SaliMory: Orchestrating Cognitive Memory for Conversational Agents arXiv cs.CL (Computation and Language) yest
When Retrieval Doesn't Help: A Large-Scale Study of Biomedical RAG arXiv cs.CL (Computation and Language) yest
Expert-Aware Refusal Steering arXiv cs.CL (Computation and Language) yest
A Systematic Analysis of Linguistic Features in AI-Generated Text Detection Across Domains and Models arXiv cs.CL (Computation and Language) yest
ACAT: A Collaborative Platform for Efficient Aspect-Based Sentiment Dataset Annotation arXiv cs.CL (Computation and Language) yest
Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features arXiv cs.CL (Computation and Language) yest
MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A arXiv cs.CL (Computation and Language) yest
Supportive Token Revealing for Fast Diffusion Language Model Decoding arXiv cs.CL (Computation and Language) yest
Can I Take Another Dose? Evaluating LLM Decision-Making Under Temporal Uncertainty in OTC Dosing QA arXiv cs.CL (Computation and Language) yest
Long Live Fine-Tuning: Task-Specific Transformers Outperform Zero-Shot LLMs for Misinformation Response Classification on Reddit arXiv cs.CL (Computation and Language) yest
Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings arXiv cs.CL (Computation and Language) yest
LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding arXiv cs.CL (Computation and Language) yest
Announcing Day-0 Support for NVIDIA Nemotron 3 Ultra on vLLM vLLM Blog yest
IdiomX: A Multilingual Benchmark for Idiom Understanding, Retrieval, and Interpretation arXiv cs.CL (Computation and Language) Jun 3
Greener Than Humans? Environmental Attitudes in Large Language Models arXiv cs.CL (Computation and Language) Jun 3
On the Persistent Effects of Lexicality in Large Language Mod arXiv cs.CL (Computation and Language) Jun 3
Topics as Proxies for Sociodemographics: How Conversational Context Affects LLM Answers arXiv cs.CL (Computation and Language) Jun 3
Do Value Vectors in Deep Layers Need Context from the Residual Stream? arXiv cs.CL (Computation and Language) Jun 3
Translating Classical Poetry into Modern Prose arXiv cs.CL (Computation and Language) Jun 3
Fixing FOLIO and MALLS: Verified Annotations and an LLM-assisted Framework to Focus Human Relabeling arXiv cs.CL (Computation and Language) Jun 3
Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions arXiv cs.CL (Computation and Language) Jun 3
Adaptive Latent Agentic Reasoning arXiv cs.CL (Computation and Language) Jun 3
Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States arXiv cs.CL (Computation and Language) Jun 3
WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents arXiv cs.CL (Computation and Language) Jun 3
The Ghost Annotator: a Framework to Explore Human Label Variation in Content Moderation through Conformal Prediction arXiv cs.CL (Computation and Language) Jun 3
Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt arXiv cs.CL (Computation and Language) Jun 3
Fast-dLLM++: Fréchet Profile Decoding for Faster Diffusion LLM Inference arXiv cs.CL (Computation and Language) Jun 3
EURO-5K: When Does Domain Pretraining Matter? Benchmarking Transformers for EU Reporting Obligation Extraction arXiv cs.CL (Computation and Language) Jun 3
Fast & Efficient LLM Inference with vLLM: A New Course with DeepLearning.AI vLLM Blog Jun 3
Import AI 457: AI stuxnet; cursed Muon optimizer; and positive alignment Import AI May 18
DraDDP: A Multimodal Multi-Party Dialogue Discourse Parsing Dataset arXiv cs.CL (Computation and Language) Jun 2
Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval arXiv cs.CL (Computation and Language) Jun 2
AEyeDE: An Attention-Based Attribution Framework for AI-Generated Text Detection arXiv cs.CL (Computation and Language) Jun 2
CSRP: Chain-of-Thought Reasoning for Chinese Text Correction via Reinforcement Learning with Efficiency-Aware Rewards arXiv cs.CL (Computation and Language) Jun 2
SENSE: Semantic Embedding Navigation with Soft-gated Evaluation for Retrieval-based Speculative Decoding arXiv cs.CL (Computation and Language) Jun 2
lmfaoooo at SemEval-2026 Task 1: Humor Is an Audience. Preference Modeling for Constrained Humor Generation arXiv cs.CL (Computation and Language) Jun 2
TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models arXiv cs.CL (Computation and Language) Jun 2
ART: Attention Run-time Termination for Efficient Large Language Model Decoding arXiv cs.CL (Computation and Language) Jun 2
Cognitive-Linguistic Indicators of Depression in Online Communities: Analysed by DistilBERT and Holographic Reduced Representation arXiv cs.CL (Computation and Language) Jun 2
A Multi-Domain Red Teaming Framework for Safety, Robustness, and Fairness Evaluation of Medical Large Language Models arXiv cs.CL (Computation and Language) Jun 2
TCAR-Gen: Temporal Graph Retrieval with Evidence Fusion for Knowledge-Grounded Generation arXiv cs.CL (Computation and Language) Jun 2
LLMs for Cardiovascular Risk Prediction from Structured Clinical Data arXiv cs.CL (Computation and Language) Jun 2
Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study arXiv cs.CL (Computation and Language) Jun 2
DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models arXiv cs.CL (Computation and Language) Jun 2
Agreement Metrics for LLM-as-Judge Evaluation: What to Report and Why arXiv cs.CL (Computation and Language) Jun 2
Session-Aware Agentic Routing: Continuity-Aware Model Selection for Long-Horizon LLM Agents vLLM Blog Jun 2
Accelerating vLLM-Omni Inference with AutoRound Quantization vLLM Blog Jun 2
Protocol for evaluating ChatGPT in biomedical association generation and verification using a RAG-enabled, cross-model majority voting workflow arXiv cs.CL (Computation and Language) Jun 1
Exploring Autonomous Agentic Data Engineering for Model Specialization arXiv cs.CL (Computation and Language) Jun 1
Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology arXiv cs.CL (Computation and Language) Jun 1
Cross-Lingual Steering for Figurative Language Generation arXiv cs.CL (Computation and Language) Jun 1
Can LLM Teams Play What? Where? When? arXiv cs.CL (Computation and Language) Jun 1
Knowledge Graph-Enhanced Zero-Shot Topic Classification: A Multi-Strategy Comparative Study arXiv cs.CL (Computation and Language) Jun 1
Your Multimodal Speech Model Says I Have a Face for Radio arXiv cs.CL (Computation and Language) Jun 1
When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models arXiv cs.CL (Computation and Language) Jun 1
Configurable Reward Model for Balanced Safety Alignment arXiv cs.CL (Computation and Language) Jun 1
CanLegalRAGBench: Evaluating Retrieval-Augmented Generation on Canadian Case Law arXiv cs.CL (Computation and Language) Jun 1
Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs arXiv cs.CL (Computation and Language) Jun 1
Auditing LLM Benchmarks with Item Response Theory arXiv cs.CL (Computation and Language) Jun 1
Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs arXiv cs.CL (Computation and Language) Jun 1
Generalistic or Specific Embeddings, Which is Better? An Empirical Study on Search for Clinical Coding in Non-English Languages arXiv cs.CL (Computation and Language) Jun 1
Refining Word-Based Grammatical Error Annotation for L2 Korean arXiv cs.CL (Computation and Language) Jun 1
vLLM on the DGX Spark: Architecture, Configuration, and Local Evaluation vLLM Blog Jun 1
Accelerating Laguna XS.2 Inference with vLLM, Speculators, and LLM Compressor vLLM Blog May 28
Native RL APIs in vLLM vLLM Blog May 28
Speculators v0.5.0: DFlash Support and Online Training vLLM Blog May 28
From Text to Multimodal Routing: Hardening Vision Signals in vLLM Semantic Router vLLM Blog May 28
Import AI 456: RSI and economic growth; radical optionality for AI regulation; and a neural computer Import AI May 11
EAGLE 3.1: Advancing Speculative Decoding Through Collaboration Between the EAGLE Team, vLLM, and TorchSpec vLLM Blog May 26
Import AI 455: AI systems are about to start building themselves. Import AI May 4
vLLM x Novita AI: PegaFlow for Production-Grade External KV Cache vLLM Blog May 18
Elastic Expert Parallelism in vLLM vLLM Blog May 14
Announcing VeRL-Omni: Easy, Fast, and Stable RL Training for Diffusion and Omni-Modality Models vLLM Blog May 14
A First Comprehensive Study of TurboQuant: Accuracy and Performance vLLM Blog May 11
vLLM Tops the Artificial Analysis Leaderboard vLLM Blog May 11
Serving Agentic Workloads at Scale with vLLM x Mooncake vLLM Blog May 6
Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4 Import AI Apr 20
Import AI 453: Breaking AI agents; MirrorCode; and ten views on gradual disempowerment Import AI Apr 13
Run Highly Efficient Multimodal Agentic AI with NVIDIA Nemotron 3 Nano Omni Using vLLM vLLM Blog Apr 28
DeepSeek V4 in vLLM: Efficient Long-context Attention vLLM Blog Apr 24
The State of FP8 KV-Cache and Attention Quantization in vLLM vLLM Blog Apr 22
Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over GDP forecasting Import AI Apr 6
