Visual Fraction Model

Open Source MiniMax M3 Outperforms Opus 4.7 for a Fraction of the Cost

Discover how the open-source MiniMax M3 AI model outperforms GPT 5.5 and Opus 4.7 in coding benchmarks while offering ...

IEEE

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

Abstract: Audio-visual approaches involving visual inputs have laid the foundation for recent progress in speech separation. However, the optimization of the concurrent usage of auditory and visual ...

Morningstar

Perceptron AI Launches Physical AI Model That Matches Frontier Labs at a Fraction of the Cost

Perceptron AI today announced the launch of its model purpose-built for video understanding and embodied reasoning. It delivers performance competitive with leading frontier models – including Google, ...

The New York Times

White House Considers Vetting A.I. Models Before They Are Released

The Trump administration, which took a noninterventionist approach to artificial intelligence, is now discussing imposing oversight on A.I. models before they are made publicly available. By Tripp ...

MIT Technology Review

Three reasons why DeepSeek’s new model matters

The long-awaited V4 is more efficient and a win for Chinese chipmakers. On April 24, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. The model can process much ...

CNN

China’s AI upstart DeepSeek drops new model. Will it make waves like last year?

China’s DeepSeek unveiled a preview version of its much-anticipated new model on Friday, promising to rival models from OpenAI, Anthropic and Google a year after the then little-known startup took ...

Campus Technology

Anthropic Launches Opus 4.7 AI Model, Focusing on Coding, Visual Tasks, and Cybersecurity Guardrails

Opus 4.7's most significant improvements are in complex, long-running software engineering tasks and high-resolution image processing, with the model now accepting images more than three times larger ...

VentureBeat

Mistral's Small 4 consolidates reasoning, vision and coding into one model — at a fraction of the inference cost

Enterprises that have been juggling separate models for reasoning, multimodal tasks, and agentic coding may be able to simplify their stack: Mistral’s new Small 4 brings all three into a single ...

GitHub

Visual-ERM: Reward Modeling for Visual Equivalence

🌈 Official repository for Visual-ERM, a multimodal generative reward model for vision-to-code tasks. 🔥 Task-agnostic reward supervision. A single reward model generalizes across multiple ...

Microsoft

Phi-4-reasoning-vision and the lessons of training a multimodal reasoning model

In this post, we share the motivations, design choices, experiments, and learnings that informed its development, as well as an evaluation of the model’s performance and guidance on how to use it. Our ...
