Realtime LLM news
Follow model releases, benchmark shifts, and research signals without losing the source.
EvalKit refreshes this feed from free public sources every 15 minutes and falls back to curated citations if a source is temporarily unavailable.
Barret Zoph is out at OpenAI again after just five months
Five months after returning to OpenAI, Barret Zoph - the company's head of enterprise AI sales - has departed, The Verge has learned. Zoph returned to OpenAI in mid-January after a stint as co-founder and CTO of Thinking Machines Lab, the...
2026-06-19 · Press · today · press · openai
Billionaire Ambani wants AI in every call, app, and home
Reliance is weaving AI into telecom services used by more than 500 million people.
2026-06-19 · Press · today · press
Encryption, spyware, and now Mythos: History shows why cyber export control doesn’t work
For the last 30 years, stopping the flow of cybersecurity-related software has proven to be ineffective. It's unclear why it would work now with Anthropic’s cybersecurity model Mythos.
2026-06-19 · Press · today · press · anthropic
Is the US government’s Anthropic ban accidentally helping the brand?
Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersec...
2026-06-19 · Press · today · press · anthropic
Source: Elastic agrees to buy CRV-backed Deductive AI for up to $85M
Deductive AI, a startup that uses AI to catch and resolve bugs in software, was founded just three years ago.
2026-06-19 · Press · today · press
The CEO of Allbirds’ new AI biz has a plan, but no team
Call it a startup with a sole founder and a very large seed round, but what's next is less clear.
2026-06-19 · Press · today · press
The film about Sam Altman has been dropped by Amazon MGM
Luca Guadagnino's film about OpenAI CEO Sam Altman, Artificial, has reportedly been dropped by Amazon MGM. The film, which stars Andrew Garfield and covers the rollercoaster five days in 2023 spanning Altman's termination and reinstatement...
2026-06-19 · Press · today · press · openai
Adobe’s redesigned AI studio remembers what your creations look like
Adobe is introducing some new capabilities for its Firefly AI assistant, alongside a "reimagined" AI studio that lets you edit and generate new designs from a single interface. The new Firefly experience launching today in private beta is...
2026-06-18 · Press · today · press
AI inference startup Baseten reportedly raising $1.5B months after its last mega-round
Startup Baseten is reportedly close to finalizing a $1.5 billion round at a $13 billion valuation as the “inference gold rush” marches on.
2026-06-18 · Press · today · press · inference
Midjourney goes from generating cat images to full-body ultrasound scans
Midjourney CEO David Holz just showed off the company's first hardware product and plans to build a San Francisco spa, which he admitted is a bit different from the "cat pictures" produced by its AI image generator. Dubbed The Midjourney S...
2026-06-18 · Press · today · press
Photoshop and Premiere now have AI assistants
Adobe's plan to stick AI assistants into all of its Creative Cloud suite is now fully underway, with new chatbots now rolling out to its biggest editing and design apps. As part of a public beta launching today, Photoshop, Premiere, Illust...
2026-06-18 · Press · today · press
Snap spins off AI video team into new company, Dotmo, due to costs
The Snapchat maker is spinning off yet another internal unit. Dotmo will be composed of current Snap staff who are leaving the social media company to focus on AI video development.
2026-06-18 · Press · today · press
Who decides when AI is too dangerous?
On today’s episode of Decoder, my guest is Hayden Field, senior AI reporter for The Verge. Often when Hayden comes on the show, it’s because something has gone wrong in the world of AI. Last weekend, that something was a pretty intense mix...
2026-06-18 · Press · today · press
Anthropic got hit by export rules nobody understands
Anthropic has spent much of this week fighting to get its newest AI models back online after the Trump administration abruptly ordered the company to cut access for all foreign nationals, including users inside the US and its own employees...
2026-06-17 · Press · today · press · anthropic
The US banned Anthropic’s Fable 5 release, but the numbers don’t seem to care
Just as last week was ending, the US government forced Anthropic to pull its two newest models, Fable 5 and Mythos 5, citing national security concerns after Amazon researchers allegedly found a way to bypass Fable 5’s guardrails. Cybersec...
2026-06-19 · Press · release · press · anthropic
Improving health intelligence in ChatGPT
Learn how GPT-5.5 Instant improves ChatGPT’s health and wellness responses with stronger reasoning, better context, clearer communication, and physician-informed evaluations.
2026-06-18 · Official · release · official · eval
New usage analytics and updated spend controls for enterprises
OpenAI introduces new spend controls and usage analytics for ChatGPT Enterprise, helping organizations manage costs and scale AI with confidence.
2026-06-18 · Official · release · official · gpt
Using AI to help physicians diagnose rare genetic diseases affecting children
Researchers used an OpenAI reasoning model to help diagnose rare diseases, identifying 18 new diagnoses in previously unsolved cases.
2026-06-18 · Official · release · official · model
A near-autonomous AI chemist improves a challenging reaction in medicinal chemistry
OpenAI and Molecule.one show how a near-autonomous AI chemist using GPT-5.4 improved a key drug-making reaction, advancing medicinal chemistry research.
2026-06-17 · Official · release · official · gpt
Beyond Global Replanning: Hierarchical Recovery for Cross-Device Agent Systems
Real-world computer-use tasks often span multiple applications and devices, requiring agents to coordinate heterogeneous environments under dynamic runtime failures. Existing multi-device agent systems support task decomposition and cross-...
2026-06-18 · Research · research · agent
CATCH-ME if you RAG: a dataset of Contextually Annotated multi-Turn Counterspeech against Hate and Misinformation Exchanges
Online hate speech and misinformation frequently overlap, yet NLP research has mainly treated them in isolation. While LLMs represent a scalable solution for assisting humans in the generation of counterspeech for both threats, zero-shot m...
2026-06-18 · Research · research · llm
DeepSWIP: Quotient-WMC Counterfactuals for Neural Probabilistic Logic Programs
Neurosymbolic systems such as DeepProbLog combine neural perception with probabilistic logic, but standard inference is associational. Counterfactual reasoning additionally requires a causal semantics for interventions and evidence. We int...
2026-06-18 · Research · research · inference · reasoning
Execution-State Capsules: Graph-Bound Execution-State Checkpoint and Restore for Low-Latency, Small-Batch, On-Device Physical-AI Serving
Mainstream LLM serving systems reuse prefix work mainly through paged or radix key-value (KV) caches. This is highly effective for high-throughput, high-concurrency serving, but it manages only one positional fragment of execution state: t...
2026-06-18 · Research · research · llm
How Transparent is DiffusionGemma?
LLM reasoning transparency is a critical affordance for understanding model decisions, mitigating misuse and misalignment, and debugging surprising model behaviors. However, DiffusionGemma performs a larger fraction of its computation in a...
2026-06-18 · Research · research · llm · model
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Policy-adherent tool-calling agents in customer-service domains must maintain task states across turns while calling tools and obeying domain policies. Task states consist of relevant facts, identifiers, constraints, and conditions observe...
2026-06-18 · Research · research · agent
Multi-Task Bayesian In-Context Learning
Bayesian predictive inference provides a principled framework for uncertainty quantification, data efficiency, and robust generalization. However, exact inference is often intractable, and scalable approximations may remain computationally...
2026-06-18 · Research · research · inference
Optimal Deterministic Multicalibration and Omniprediction
A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for...
2026-06-18 · Research · research · model
PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback
Effective Automated Essay Scoring (AES) systems are expected to support both reliable assessment and actionable instructional feedback. However, existing approaches often treat scoring and feedback as separate components: neural scoring models pro...
2026-06-18 · Research · research · model
SARLO-80: Worldwide Slant SAR Language Optic Dataset 80cm
Multimodal foundation models have advanced rapidly thanks to large optical benchmarks, but comparable resources for synthetic aperture radar (SAR) remain limited. Existing SAR--optical datasets largely rely on low-resolution, intensity-onl...
2026-06-18 · Research · research · benchmark · foundation
Scalable Training of Spatially Grounded 2D Vision-Language Models for Radiology
We study how to train visually grounded vision-language models (VLMs) for radiology without manual spatial annotations. We introduce RefRad2D, a large-scale bilingual (German/English) dataset of 1.2M CT and MR image-text pairs derived from...
2026-06-18 · Research · research · language · model
Sovereign Execution Brokers: Enforcing Certificate-Bound Authority in Agentic Control Planes
Autonomous agents are increasingly connected to cloud, deployment, and data-control workflows, but production mutation authority should not reside inside non-deterministic reasoning processes. Existing access-control mechanisms authorize i...
2026-06-18 · Research · research · agent · reasoning
Structuring and Tokenizing Distributed User Interest Context for Generative Recommendation
Generative recommendation is an emerging paradigm that has shown promise in industrial recommendation systems, aiming to predict users' next interactions from their historical behaviors. At the core of generative recommendation lies item t...
2026-06-18 · Research · research
StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Multimodal large language models (MLLMs) are increasingly deployed in personally and societally consequential settings, yet the visual cues that shape how these models judge people remain poorly understood. Prior work often compares differ...
2026-06-18 · Research · research · language · large
Token-Operations-Oriented Inference Optimization Techniques for Large Models
Large model inference optimization serves as a key foundation for supporting the scalable, low-cost, and highly stable operation of large model services. Centered on token-oriented inference optimization technology, this paper proposes for...
2026-06-18 · Research · research · inference · model
UNIEGO: Proxies as Mediators for Unified Egocentric Video Representation Learning
Egocentric video understanding is inherently limited by the narrow perspective of wearable cameras: a single viewpoint, a single modality, a single model cannot capture the full richness of human action. We argue that a truly expressive eg...
2026-06-18 · Research · research · model
Your Mouse and Eyes Secretly Leak Your Preference: LLM Alignment using Implicit Feedback from Users
To align a Large Language Model (LLM), most existing methods collect explicit human feedback and train a reward model to predict the human preference based on the response text. These existing methods have two key limitations. First, the u...
2026-06-18 · Research · research · language · large