@www.marktechpost.com
//
DeepSeek has released a major update to its R1 reasoning model, dubbed DeepSeek-R1-0528, marking a significant step forward in open-source AI. The update boasts enhanced performance in complex reasoning, mathematics, and coding, positioning it as a strong competitor to leading commercial models like OpenAI's o3 and Google's Gemini 2.5 Pro. The model's weights, training recipes, and comprehensive documentation are openly available under the MIT license, fostering transparency and community-driven innovation. This release allows researchers, developers, and businesses to access cutting-edge AI capabilities without the constraints of closed ecosystems or expensive subscriptions.
The DeepSeek-R1-0528 update brings several core improvements. The model's parameter count has increased from 671 billion to 685 billion, enabling it to process and store more intricate patterns. Enhanced chain-of-thought layers deepen the model's reasoning capabilities, making it more reliable in handling multi-step logic problems. Post-training optimizations have also been applied to reduce hallucinations and improve output stability. In practical terms, the update introduces JSON outputs, native function calling, and simplified system prompts, all designed to streamline real-world deployment and enhance the developer experience.
Specifically, DeepSeek R1-0528 demonstrates a remarkable leap in mathematical reasoning. On the AIME 2025 test, its accuracy improved from 70% to an impressive 87.5%, rivaling OpenAI's o3. This improvement is attributed to "enhanced thinking depth," with the model now utilizing significantly more tokens per question, indicating more thorough and systematic logical analysis. The open-source nature of DeepSeek-R1-0528 empowers users to fine-tune and adapt the model to their specific needs, fostering further innovation and advancements within the AI community.
Recommended read:
References :
- pub.towardsai.net: DeepSeek R1 : Is It Right For You? (A Practical Self‑Assessment for Businesses and Individuals)
- AI News | VentureBeat: VentureBeat article on DeepSeek R1-0528.
- Analytics Vidhya: New Deepseek R1-0528 Update is INSANE
- Kyle Wiggers ?: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
- MacStories: Testing DeepSeek R1-0528 on the M3 Ultra Mac Studio and Installing Local GGUF Models with Ollama on macOS
- www.analyticsvidhya.com: When DeepSeek R1 launched in January, it instantly became one of the most talked-about open-source models on the scene, gaining popularity for its sharp reasoning and impressive performance. Fast-forward to today, and DeepSeek is back with a so-called “minor trial upgradeâ€, but don’t let the modest name fool you. DeepSeek-R1-0528 delivers major leaps in reasoning, […]
- www.marktechpost.com: DeepSeek, the Chinese AI Unicorn, has released an updated version of its R1 reasoning model, named DeepSeek-R1-0528. This release enhances the model’s capabilities in mathematics, programming, and general logical reasoning, positioning it as a formidable open-source alternative to leading models like OpenAI’s o3 and Google’s Gemini 2.5 Pro. Technical Enhancements The R1-0528 update introduces significant […]
- NextBigFuture.com: DeepSeek R1 has significantly improved its depth of reasoning and inference capabilities by leveraging increased computational resources and introducing algorithmic optimization mechanisms during post-training.
- MarkTechPost: Information about DeepSeek's R1-0528 model and its enhancements in math and code performance.
- Pandaily: In the early hours of May 29, Chinese AI startup DeepSeek quietly open-sourced the latest iteration of its R1 large language model, DeepSeek-R1-0528, on the Hugging Face platform .
- www.computerworld.com: Reports that DeepSeek releases a new version of its R1 reasoning AI model.
- techcrunch.com: DeepSeek updates its R1 reasoning AI model, releases it on Hugging Face
- the-decoder.com: Deepseek's R1 model closes the gap with OpenAI and Google after major update
- Simon Willison: Some notes on the new DeepSeek-R1-0528 - a completely different model from the R1 they released in January, despite having a very similar name Terrible LLM naming has managed to infect the Chinese AI labs too
- Analytics India Magazine: The new DeepSeek-R1 Is as good as OpenAI o3 and Gemini 2.5 Pro
- RunPod Blog: The 'Minor Upgrade' That's Anything But: DeepSeek R1-0528 Deep Dive
- simonwillison.net: Some notes on the new DeepSeek-R1-0528 - a completely different model from the R1 they released in January, despite having a very similar name Terrible LLM naming has managed to infect the Chinese AI labs too
- TheSequence: This article provides an overview of the new DeepSeek R1-0528 model and notes its improvements over the prior model released in January.
- Kyle Wiggers ?: News about the release of DeepSeek's updated R1 AI model, emphasizing its increased censorship.
- Fello AI: Reports that the R1-0528 model from DeepSeek is matching the capabilities of OpenAI's o3 and Google's Gemini 2.5 Pro.
- felloai.com: Latest DeepSeek Update Called R1-0528 Is Matching OpenAI’s o3 & Gemini 2.5 Pro
- www.tomsguide.com: DeepSeek’s latest update is a serious threat to ChatGPT and Google — here’s why
Chris McKay@Maginative
//
OpenAI has released its latest AI models, o3 and o4-mini, designed to enhance reasoning and tool use within ChatGPT. These models aim to provide users with smarter and faster AI experiences by leveraging web search, Python programming, visual analysis, and image generation. The models are designed to solve complex problems and perform tasks more efficiently, positioning OpenAI competitively in the rapidly evolving AI landscape. Greg Brockman from OpenAI noted the models "feel incredibly smart" and have the potential to positively impact daily life and solve challenging problems.
The o3 model stands out due to its ability to use tools independently, which enables more practical applications. The model determines when and how to utilize tools such as web search, file analysis, and image generation, thus reducing the need for users to specify tool usage with each query. The o3 model sets new standards for reasoning, particularly in coding, mathematics, and visual perception, and has achieved state-of-the-art performance on several competition benchmarks. The model excels in programming, business, consulting, and creative ideation.
Usage limits for these models vary, with o3 at 50 queries per week, and o4-mini at 150 queries per day, and o4-mini-high at 50 queries per day for Plus users, alongside 10 Deep Research queries per month. The o3 model is available to ChatGPT Pro and Team subscribers, while the o4-mini models are used across ChatGPT Plus. OpenAI says o3 is also beneficial in generating and critically evaluating novel hypotheses, especially in biology, mathematics, and engineering contexts.
Recommended read:
References :
- Simon Willison's Weblog: OpenAI are really emphasizing tool use with these: For the first time, our reasoning models can agentically use and combine every tool within ChatGPT—this includes searching the web, analyzing uploaded files and other data with Python, reasoning deeply about visual inputs, and even generating images. Critically, these models are trained to reason about when and how to use tools to produce detailed and thoughtful answers in the right output formats, typically in under a minute, to solve more complex problems.
- the-decoder.com: OpenAI’s new o3 and o4-mini models reason with images and tools
- venturebeat.com: OpenAI launches o3 and o4-mini, AI models that ‘think with images’ and use tools autonomously
- www.analyticsvidhya.com: o3 and o4-mini: OpenAI’s Most Advanced Reasoning Models
- www.tomsguide.com: OpenAI's o3 and o4-mini models
- Maginative: OpenAI’s latest models—o3 and o4-mini—introduce agentic reasoning, full tool integration, and multimodal thinking, setting a new bar for AI performance in both speed and sophistication.
- THE DECODER: OpenAI’s new o3 and o4-mini models reason with images and tools
- Analytics Vidhya: o3 and o4-mini: OpenAI’s Most Advanced Reasoning Models
- www.zdnet.com: These new models are the first to independently use all ChatGPT tools.
- The Tech Basic: OpenAI recently released its new AI models, o3 and o4-mini, to the public. Smart tools employ pictures to address problems through pictures, including sketch interpretation and photo restoration.
- thetechbasic.com: OpenAI’s new AI Can “See†and Solve Problems with Pictures
- www.marktechpost.com: OpenAI Introduces o3 and o4-mini: Progressing Towards Agentic AI with Enhanced Multimodal Reasoning
- MarkTechPost: OpenAI Introduces o3 and o4-mini: Progressing Towards Agentic AI with Enhanced Multimodal Reasoning
- analyticsindiamag.com: Access to o3 and o4-mini is rolling out today for ChatGPT Plus, Pro, and Team users.
- THE DECODER: OpenAI is expanding its o-series with two new language models featuring improved tool usage and strong performance on complex tasks.
- gHacks Technology News: OpenAI released its latest models, o3 and o4-mini, to enhance the performance and speed of ChatGPT in reasoning tasks.
- www.ghacks.net: OpenAI Launches o3 and o4-Mini models to improve ChatGPT's reasoning abilities
- Data Phoenix: OpenAI releases new reasoning models o3 and o4-mini amid intense competition. OpenAI has launched o3 and o4-mini, which combine sophisticated reasoning capabilities with comprehensive tool integration.
- Shelly Palmer: OpenAI Quietly Reshapes the Landscape with o3 and o4-mini. OpenAI just rolled out a major update to ChatGPT, quietly releasing three new models (o3, o4-mini, and o4-mini-high) that offer the most advanced reasoning capabilities the company has ever shipped.
- THE DECODER: Safety assessments show that OpenAI's o3 is probably the company's riskiest AI model to date
- shellypalmer.com: OpenAI Quietly Reshapes the Landscape with o3 and o4-mini
- BleepingComputer: OpenAI details ChatGPT-o3, o4-mini, o4-mini-high usage limits
- TestingCatalog: OpenAI’s o3 and o4‑mini bring smarter tools and faster reasoning to ChatGPT
- simonwillison.net: Introducing OpenAI o3 and o4-mini
- bdtechtalks.com: What to know about o3 and o4-mini, OpenAI’s new reasoning models
- bdtechtalks.com: What to know about o3 and o4-mini, OpenAI’s new reasoning models
- thezvi.wordpress.com: OpenAI has finally introduced us to the full o3 along with o4-mini. Greg Brockman (OpenAI): Just released o3 and o4-mini! These models feel incredibly smart. We’ve heard from top scientists that they produce useful novel ideas. Excited to see their …
- thezvi.wordpress.com: OpenAI has upgraded its entire suite of models. By all reports, they are back in the game for more than images. GPT-4.1 and especially GPT-4.1-mini are their new API non-reasoning models.
- felloai.com: OpenAI has just launched a brand-new series of GPT models—GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano—that promise major advances in coding, instruction following, and the ability to handle incredibly long contexts.
- Interconnects: OpenAI's o3: Over-optimization is back and weirder than ever
- www.ishir.com: OpenAI has released o3 and o4-mini, adding significant reasoning capabilities to its existing models. These advancements will likely transform the way users interact with AI-powered tools, making them more effective and versatile in tackling complex problems.
- www.bigdatawire.com: OpenAI released the models o3 and o4-mini that offer advanced reasoning capabilities, integrated with tool use, like web searches and code execution.
- Drew Breunig: OpenAI's o3 and o4-mini models offer enhanced reasoning capabilities in mathematical and coding tasks.
- TestingCatalog: OpenAI’s o3 and o4-mini bring smarter tools and faster reasoning to ChatGPT
- www.techradar.com: ChatGPT model matchup - I pitted OpenAI's o3, o4-mini, GPT-4o, and GPT-4.5 AI models against each other and the results surprised me
- www.techrepublic.com: OpenAI’s o3 and o4-mini models are available now to ChatGPT Plus, Pro, and Team users. Enterprise and education users will get access next week.
- Last Week in AI: OpenAI’s new GPT-4.1 AI models focus on coding, OpenAI launches a pair of AI reasoning models, o3 and o4-mini, Google’s newest Gemini AI model focuses on efficiency, and more!
- techcrunch.com: OpenAI’s new reasoning AI models hallucinate more.
- computational-intelligence.blogspot.com: OpenAI's new reasoning models, o3 and o4-mini, are a step up in certain capabilities compared to prior models, but their accuracy is being questioned due to increased instances of hallucinations.
- www.unite.ai: unite.ai article discussing OpenAI's o3 and o4-mini new possibilities through multimodal reasoning and integrated toolsets.
- Unite.AI: On April 16, 2025, OpenAI released upgraded versions of its advanced reasoning models.
- Digital Information World: OpenAI’s Latest o3 and o4-mini AI Models Disappoint Due to More Hallucinations than Older Models
- techcrunch.com: TechCrunch reports on OpenAI's GPT-4.1 models focusing on coding.
- Analytics Vidhya: o3 vs o4-mini vs Gemini 2.5 pro: The Ultimate Reasoning Battle
- THE DECODER: OpenAI's o3 achieves near-perfect performance on long context benchmark.
- the-decoder.com: OpenAI's o3 achieves near-perfect performance on long context benchmark
- www.analyticsvidhya.com: AI models keep getting smarter, but which one truly reasons under pressure? In this blog, we put o3, o4-mini, and Gemini 2.5 Pro through a series of intense challenges: physics puzzles, math problems, coding tasks, and real-world IQ tests.
- Simon Willison's Weblog: This post explores the use of OpenAI's o3 and o4-mini models for conversational AI, highlighting their ability to use tools in their reasoning process. It also discusses the concept of
- Simon Willison's Weblog: The benchmark score on OpenAI's internal PersonQA benchmark (as far as I can tell no further details of that evaluation have been shared) going from 0.16 for o1 to 0.33 for o3 is interesting, but I don't know if it it's interesting enough to produce dozens of headlines along the lines of "OpenAI's o3 and o4-mini hallucinate way higher than previous models"
- techstrong.ai: Techstrong.ai reports OpenAI o3, o4 Reasoning Models Have Some Kinks.
- www.marktechpost.com: OpenAI Releases a Practical Guide to Identifying and Scaling AI Use Cases in Enterprise Workflows
- Towards AI: OpenAI's o3 and o4-mini models have demonstrated promising improvements in reasoning tasks, particularly their use of tools in complex thought processes and enhanced reasoning capabilities.
- Analytics Vidhya: In this article, we explore how OpenAI's o3 reasoning model stands out in tasks demanding analytical thinking and multi-step problem solving, showcasing its capability in accessing and processing information through tools.
- pub.towardsai.net: TAI#149: OpenAI’s Agentic o3; New Open Weights Inference Optimized Models (DeepMind Gemma, Nvidia…
- composio.dev: OpenAI o3 vs. Gemini 2.5 Pro vs. o4-mini
- Composio: OpenAI o3 and o4-mini are out. They are two reasoning state-of-the-art models. They’re expensive, multimodal, and super efficient at tool use.
Ryan Daws@AI News
//
Anthropic has unveiled groundbreaking insights into the 'AI biology' of their advanced language model, Claude. Through innovative methods, researchers have been able to peer into the complex inner workings of the AI, demystifying how it processes information and learns strategies. This research provides a detailed look at how Claude "thinks," revealing sophisticated behaviors previously unseen, and showing these models are more sophisticated than previously understood.
These new methods allowed scientists to discover that Claude plans ahead when writing poetry and sometimes lies, showing the AI is more complex than previously thought. The new interpretability techniques, which the company dubs “circuit tracing” and “attribution graphs,” allow researchers to map out the specific pathways of neuron-like features that activate when models perform tasks. This approach borrows concepts from neuroscience, viewing AI models as analogous to biological systems.
This research, published in two papers, marks a significant advancement in AI interpretability, drawing inspiration from neuroscience techniques used to study biological brains. Joshua Batson, a researcher at Anthropic, highlighted the importance of understanding how these AI systems develop their capabilities, emphasizing that these techniques allow them to learn many things they “wouldn’t have guessed going in.” The findings have implications for ensuring the reliability, safety, and trustworthiness of increasingly powerful AI technologies.
Recommended read:
References :
- THE DECODER: Anthropic and Databricks have entered a five-year partnership worth $100 million to jointly sell AI tools to businesses.
- venturebeat.com: Anthropic has developed a new method for peering inside large language models like Claude, revealing for the first time how these AI systems process information and make decisions.
- venturebeat.com: Anthropic scientists expose how AI actually ‘thinks’ — and discover it secretly plans ahead and sometimes lies
- AI News: Anthropic provides insights into the ‘AI biology’ of Claude
- www.techrepublic.com: ‘AI Biology’ Research: Anthropic Looks Into How Its AI Claude ‘Thinks’
- THE DECODER: Anthropic's AI microscope reveals how Claude plans ahead when generating poetry
- The Tech Basic: Anthropic Now Redefines AI Research With Self Coordinating Agent Networks
Maximilian Schreiner@THE DECODER
//
Google DeepMind has announced Gemini 2.5 Pro, its latest and most advanced AI model to date. This new model boasts enhanced reasoning capabilities and improved accuracy, marking a significant step forward in AI development. Gemini 2.5 Pro is designed with built-in 'thinking' capabilities, enabling it to break down complex tasks into multiple steps and analyze information more effectively before generating a response. This allows the AI to deduce logical conclusions, incorporate contextual nuances, and make informed decisions with unprecedented accuracy, according to Google.
The Gemini 2.5 Pro has already secured the top position on the LMArena leaderboard, surpassing other AI models in head-to-head comparisons. This achievement highlights its superior performance and high-quality style in handling intricate tasks. The model also leads in math and science benchmarks, demonstrating its advanced reasoning capabilities across various domains. This new model is available as Gemini 2.5 Pro (experimental) on Google’s AI Studio and for Gemini Advanced users on the Gemini chat interface.
Recommended read:
References :
- Google DeepMind Blog: Gemini 2.5: Our most intelligent AI model
- Shelly Palmer: Google’s Gemini 2.5: AI That Thinks Before It Speaks
- AI News: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date
- Interconnects: Gemini 2.5 Pro and Google's second chance with AI
- SiliconANGLE: Google introduces Gemini 2.5 Pro with chain-of-thought reasoning built-in
- AI News | VentureBeat: Google releases ‘most intelligent model to date,’ Gemini 2.5 Pro
- Analytics Vidhya: Gemini 2.5 Pro is Now #1 on Chatbot Arena with Impressive Jump
- www.tomsguide.com: Google unveils Gemini 2.5 — claims AI breakthrough with enhanced reasoning and multimodal power
- Fello AI: Google’s Gemini 2.5 Shocks the World: Crushing AI Benchmark Like No Other AI Model!
- bdtechtalks.com: What to know about Google Gemini 2.5 Pro
- TestingCatalog: Gemini 2.5 Pro sets new AI benchmark and launches on AI Studio and Gemini
- AI News | VentureBeat: Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
- thezvi.wordpress.com: Gemini 2.5 is the New SoTA
- www.infoworld.com: Google has introduced version 2.5 of its , which the company said offers a new level of performance by combining an enhanced base model with improved post-training.
- Composio: Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison
- Composio: Google dropped its best-ever creation, Gemini 2.5 Pro Experimental, on March 25. It is a stupidly incredible reasoning model shining on every The post first appeared on.
- www.tomsguide.com: Gemini 2.5 Pro is now free to all users in surprise move
- Analytics India Magazine: Did Google Just Build The Best AI Model for Coding?
- www.zdnet.com: Everyone can now try Gemini 2.5 Pro - for free
Maximilian Schreiner@THE DECODER
//
Google has unveiled Gemini 2.5 Pro, its latest and "most intelligent" AI model to date, showcasing significant advancements in reasoning, coding proficiency, and multimodal functionalities. According to Google, these improvements come from combining a significantly enhanced base model with improved post-training techniques. The model is designed to analyze complex information, incorporate contextual nuances, and draw logical conclusions with unprecedented accuracy. Gemini 2.5 Pro is now available for Gemini Advanced users and on Google's AI Studio.
Google emphasizes the model's "thinking" capabilities, achieved through chain-of-thought reasoning, which allows it to break down complex tasks into multiple steps and reason through them before responding. This new model can handle multimodal input from text, audio, images, videos, and large datasets. Additionally, Gemini 2.5 Pro exhibits strong performance in coding tasks, surpassing Gemini 2.0 in specific benchmarks and excelling at creating visually compelling web apps and agentic code applications. The model also achieved 18.8% on Humanity’s Last Exam, demonstrating its ability to handle complex knowledge-based questions.
Recommended read:
References :
- SiliconANGLE: Google LLC said today it’s updating its flagship Gemini artificial intelligence model family by introducing an experimental Gemini 2.5 Pro version.
- The Tech Basic: Google's New AI Models “Think” Before Answering, Outperform Rivals
- AI News | VentureBeat: Google releases ‘most intelligent model to date,’ Gemini 2.5 Pro
- Analytics Vidhya: We Tried the Google 2.5 Pro Experimental Model and It’s Mind-Blowing!
- www.tomsguide.com: Google unveils Gemini 2.5 — claims AI breakthrough with enhanced reasoning and multimodal power
- Google DeepMind Blog: Gemini 2.5: Our most intelligent AI model
- THE DECODER: Google Deepmind has introduced Gemini 2.5 Pro, which the company describes as its most capable AI model to date. The article appeared first on .
- intelligence-artificielle.developpez.com: Google DeepMind a lancé Gemini 2.5 Pro, un modèle d'IA qui raisonne avant de répondre, affirmant qu'il est le meilleur sur plusieurs critères de référence en matière de raisonnement et de codage
- The Tech Portal: Google unveils Gemini 2.5, its most intelligent AI model yet with ‘built-in thinking’
- Ars OpenForum: Google says the new Gemini 2.5 Pro model is its “smartest†AI yet
- The Official Google Blog: Gemini 2.5: Our most intelligent AI model
- www.techradar.com: I pitted Gemini 2.5 Pro against ChatGPT o3-mini to find out which AI reasoning model is best
- bsky.app: Google's AI comeback is official. Gemini 2.5 Pro Experimental leads in benchmarks for coding, math, science, writing, instruction following, and more, ahead of OpenAI's o3-mini, OpenAI's GPT-4.5, Anthropic's Claude 3.7, xAI's Grok 3, and DeepSeek's R1. The narrative has finally shifted.
- Shelly Palmer: Google’s Gemini 2.5: AI That Thinks Before It Speaks
- bdtechtalks.com: Gemini 2.5 Pro is a new reasoning model that excels in long-context tasks and benchmarks, revitalizing Google’s AI strategy against competitors like OpenAI.
- Interconnects: The end of a busy spring of model improvements and what's next for the presumed leader in AI abilities.
- www.techradar.com: Gemini 2.5 is now available for Advanced users and it seriously improves Google’s AI reasoning
- www.zdnet.com: Google releases 'most intelligent' experimental Gemini 2.5 Pro - here's how to try it
- Unite.AI: Gemini 2.5 Pro is Here—And it Changes the AI Game (Again)
- TestingCatalog: Gemini 2.5 Pro sets new AI benchmark and launches on AI Studio and Gemini
- Analytics Vidhya: Google DeepMind's latest AI model, Gemini 2.5 Pro, has reached the #1 position on the Arena leaderboard.
- AI News: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date
- Fello AI: Google’s Gemini 2.5 Shocks the World: Crushing AI Benchmark Like No Other AI Model!
- Analytics India Magazine: Google Unveils Gemini 2.5, Crushes OpenAI GPT-4.5, DeepSeek R1, & Claude 3.7 Sonnet
- Practical Technology: Practical Tech covers the launch of Google's Gemini 2.5 Pro and its new AI benchmark achievements.
- Shelly Palmer: Google's Gemini 2.5: AI That Thinks Before It Speaks
- www.producthunt.com: Google's most intelligent AI model
- Windows Copilot News: Google reveals AI ‘reasoning’ model that ‘explicitly shows its thoughts’
- AI News | VentureBeat: Hands on with Gemini 2.5 Pro: why it might be the most useful reasoning model yet
- thezvi.wordpress.com: Gemini 2.5 Pro Experimental is America’s next top large language model. That doesn’t mean it is the best model for everything. In particular, it’s still Gemini, so it still is a proud member of the Fun Police, in terms of …
- www.computerworld.com: Gemini 2.5 can, among other things, analyze information, draw logical conclusions, take context into account, and make informed decisions.
- www.infoworld.com: Google introduces Gemini 2.5 reasoning models
- Maginative: Google's Gemini 2.5 Pro leads AI benchmarks with enhanced reasoning capabilities, positioning it ahead of competing models from OpenAI and others.
- www.infoq.com: Google's Gemini 2.5 Pro is a powerful new AI model that's quickly becoming a favorite among developers and researchers. It's capable of advanced reasoning and excels in complex tasks.
- AI News | VentureBeat: Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
- Communications of the ACM: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- The Next Web: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- www.tomsguide.com: Gemini 2.5 Pro is now free to all users in surprise move
- Composio: Google just launched Gemini 2.5 Pro on March 26th, claiming to be the best in coding, reasoning and overall everything. But I The post appeared first on .
- Composio: Google's Gemini 2.5 Pro, released on March 26th, is being hailed for its enhanced reasoning, coding, and multimodal capabilities.
- Analytics India Magazine: Gemini 2.5 Pro is better than the Claude 3.7 Sonnet for coding in the Aider Polyglot leaderboard.
- www.zdnet.com: Gemini's latest model outperforms OpenAI's o3 mini and Anthropic's Claude 3.7 Sonnet on the latest benchmarks. Here's how to try it.
- www.marketingaiinstitute.com: [The AI Show Episode 142]: ChatGPT’s New Image Generator, Studio Ghibli Craze and Backlash, Gemini 2.5, OpenAI Academy, 4o Updates, Vibe Marketing & xAI Acquires X
- www.tomsguide.com: Gemini 2.5 is free, but can it beat DeepSeek?
- www.tomsguide.com: Google Gemini could soon help your kids with their homework — here’s what we know
- PCWorld: Google’s latest Gemini 2.5 Pro AI model is now free for all users
- www.techradar.com: Google just made Gemini 2.5 Pro Experimental free for everyone, and that's awesome.
- Last Week in AI: #205 - Gemini 2.5, ChatGPT Image Gen, Thoughts of LLMs
Matthias Bastian@THE DECODER
//
Mistral AI, a French artificial intelligence startup, has launched Mistral Small 3.1, a new open-source language model boasting 24 billion parameters. According to the company, this model outperforms similar offerings from Google and OpenAI, specifically Gemma 3 and GPT-4o Mini, while operating efficiently on consumer hardware like a single RTX 4090 GPU or a MacBook with 32GB RAM. It supports multimodal inputs, processing both text and images, and features an expanded context window of up to 128,000 tokens, which makes it suitable for long-form reasoning and document analysis.
Mistral Small 3.1 is released under the Apache 2.0 license, promoting accessibility and competition within the AI landscape. Mistral AI aims to challenge the dominance of major U.S. tech firms by offering a high-performance, cost-effective AI solution. The model achieves inference speeds of 150 tokens per second and is designed for text and multimodal understanding, positioning itself as a powerful alternative to industry-leading models without the need for expensive cloud infrastructure.
Recommended read:
References :
- THE DECODER: Mistral launches improved Small 3.1 multimodal model
- venturebeat.com: Mistral AI launches efficient open-source model that outperforms Google and OpenAI offerings with just 24 billion parameters, challenging U.S. tech giants' dominance in artificial intelligence.
- Maginative: Mistral Small 3.1 Outperforms Gemma 3 and GPT-4o Mini
- TestingCatalog: Mistral Small 3: A 24B open-source AI model optimized for speed
- Simon Willison's Weblog: Mistral Small 3.1, an open-source AI model, delivers state-of-the-art performance.
- SiliconANGLE: Paris-based artificial intelligence startup Mistral AI said today it’s open-sourcing a new, lightweight AI model called Mistral Small 3.1, claiming it surpasses the capabilities of similar models created by OpenAI and Google LLC.
- Analytics Vidhya: Mistral Small 3.1: The Best Model in its Weight Class
- Analytics Vidhya: Mistral 3.1 vs Gemma 3: Which is the Better Model?
@bdtechtalks.com
//
Alibaba has recently launched Qwen-32B, a new reasoning model, which demonstrates performance levels on par with DeepMind's R1 model. This development signifies a notable achievement in the field of AI, particularly for smaller models. The Qwen team showcased that reinforcement learning on a strong base model can unlock reasoning capabilities for smaller models that enhances their performance to be on par with giant models.
Qwen-32B not only matches but also surpasses models like DeepSeek-R1 and OpenAI's o1-mini across key industry benchmarks, including AIME24, LiveBench, and BFCL. This is significant because Qwen-32B achieves this level of performance with only approximately 5% of the parameters used by DeepSeek-R1, resulting in lower inference costs without compromising on quality or capability. Groq is offering developers the ability to build FAST with Qwen QwQ 32B on GroqCloud™, running the 32B parameter model at ~400 T/s. This model is proving to be very competitive in reasoning benchmarks and is one of the top open source models being used.
The Qwen-32B model was explicitly designed for tool use and adapting its reasoning based on environmental feedback, which is a huge win for AI agents that need to reason, plan, and adapt based on context (outperforms R1 and o1-mini on the Berkeley Function Calling Leaderboard). With these capabilities, Qwen-32B shows that RL on a strong base model can unlock reasoning capabilities for smaller models that enhances their performance to be on par with giant models.
Recommended read:
References :
- Last Week in AI: LWiAI Podcast #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- Last Week in AI: #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Sebastian Raschka, PhD: This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on inference-time compute scaling that have emerged since the release of DeepSeek R1.
- Analytics Vidhya: China is rapidly advancing in AI, releasing models like DeepSeek and Qwen to rival global giants.
- Last Week in AI: Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1
- Maginative: Despite having far fewer parameters, Qwen’s new QwQ-32B model outperforms DeepSeek-R1 and OpenAI’s o1-mini in mathematical benchmarks and scientific reasoning, showcasing the power of reinforcement learning.
george.fitzmaurice@futurenet.com (George@Latest from ITPro
//
DeepSeek, a Chinese AI startup founded in 2023, is rapidly gaining traction as a competitor to established models like ChatGPT and Claude. They have quickly risen to prominence and are now competing against much larger parameter models with much smaller compute requirements. As of January 2025, DeepSeek boasts 33.7 million monthly active users and 22.15 million daily active users globally, showcasing its rapid adoption and impact.
Qwen has recently introduced QwQ-32B, a 32-billion-parameter reasoning model, designed to improve performance on complex problem-solving tasks through reinforcement learning and demonstrates robust performance in tasks requiring deep analytical thinking. The QwQ-32B leverages Reinforcement Learning (RL) techniques through a reward-based, multi-stage training process to improve its reasoning capabilities, and can match a 671B parameter model. QwQ-32B demonstrates that Reinforcement Learning (RL) scaling can dramatically enhance model intelligence without requiring massive parameter counts.
Recommended read:
References :
- Analytics Vidhya: QwQ-32B Vs DeepSeek-R1: Can a 32B Model Challenge a 671B Parameter Model?
- MarkTechPost: Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task
- Fello AI: DeepSeek is rapidly emerging as a significant player in the AI space, particularly since its public release in January 2025.
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- www.itpro.com: ‘Awesome for the community’: DeepSeek open sourced its code repositories, and experts think it could give competitors a scare
|
- How to Not Do Experiments: Phacking - Nishanth Tharakan
- My Reflection on Locally Running LLMs - Nishanth Tharakan
- Investigate that Tech: LinkedIn - Nishanth Tharakan
- How Do Models Think, and Why Is There Chinese In My English Responses? - Nishanth Tharakan
- CERN - Nishanth Tharakan
- The Intersection of Mathematics, Physics, Psychology, and Music - Nishanth Tharakan
- Python: The Language That Won AI (And How Hype Helped) - Nishanth Tharakan
- Beginner’s Guide to Oscillations - Nishanth Tharakan
- Russian-American Race - tanyakh
- The Evolution of Feminized Digital Assistants: From Telephone Operators to AI - Nishanth Tharakan
- Epidemiology Part 2: My Journey Through Simulating a Pandemic - Nishanth Tharakan
- The Mathematics Behind Epidemiology: Why do Masks, Social Distancing, and Vaccines Work? - Nishanth Tharakan
- The Game of SET for Groups (Part 2), jointly with Andrey Khesin - tanyakh
- Pi: The Number That Has Made Its Way Into Everything - Nishanth Tharakan
- Beginner’s Guide to Sets - Nishanth Tharakan
- How Changing Our Perspective on Math Expanded Its Possibilities - Nishanth Tharakan
- Beginner’s Guide to Differential Equations: An Overview of UCLA’s MATH33B Class - Nishanth Tharakan
- Beginner’s Guide to Mathematical Induction - Nishanth Tharakan
- Foams and the Four-Color Theorem - tanyakh
- Beginner’s Guide to Game Theory - Nishanth Tharakan
- Forever and Ever: Infinite Chess And How to Visually Represent Infinity - Nishanth Tharakan
- Math Values for the New Year - Annie Petitt
- Happy 2025! - tanyakh
- Identical Twins - tanyakh
- A Puzzle from the Möbius Tournament - tanyakh
- A Baker, a Decorator, and a Wedding Planner Walk into a Classroom - Annie Petitt
- Beliefs and Belongings in Mathematics - David Bressoud
- Red, Yellow, and Green Hats - tanyakh
- Square out of a Plus - tanyakh
- The Game of SET for Groups (Part 1), jointly with Andrey Khesin - tanyakh
- Alexander Karabegov’s Puzzle - tanyakh
|