Ryan Daws@AI News
//
Anthropic has published new insights into the 'AI biology' of its language model, Claude. Using new interpretability methods, researchers were able to examine the model's inner workings and see how it processes information and learns strategies. The research offers a detailed look at how Claude "thinks" and reveals behaviors that had not been observed before.
Among the findings: Claude plans ahead when writing poetry and sometimes lies. The new interpretability techniques, which the company dubs “circuit tracing” and “attribution graphs,” allow researchers to map out the specific pathways of neuron-like features that activate when models perform tasks. This approach borrows concepts from neuroscience, viewing AI models as analogous to biological systems.
This research, published in two papers, marks a significant advancement in AI interpretability, drawing inspiration from neuroscience techniques used to study biological brains. Joshua Batson, a researcher at Anthropic, highlighted the importance of understanding how these AI systems develop their capabilities, emphasizing that these techniques allow them to learn many things they “wouldn’t have guessed going in.” The findings have implications for ensuring the reliability, safety, and trustworthiness of increasingly powerful AI technologies.
References :
- THE DECODER: Anthropic and Databricks have entered a five-year partnership worth $100 million to jointly sell AI tools to businesses.
- venturebeat.com: Anthropic has developed a new method for peering inside large language models like Claude, revealing for the first time how these AI systems process information and make decisions.
- venturebeat.com: Anthropic scientists expose how AI actually ‘thinks’ — and discover it secretly plans ahead and sometimes lies
- AI News: Anthropic provides insights into the ‘AI biology’ of Claude
- www.techrepublic.com: ‘AI Biology’ Research: Anthropic Looks Into How Its AI Claude ‘Thinks’
- THE DECODER: Anthropic's AI microscope reveals how Claude plans ahead when generating poetry
- The Tech Basic: Anthropic Now Redefines AI Research With Self Coordinating Agent Networks
Maximilian Schreiner@THE DECODER
//
Google DeepMind has announced Gemini 2.5 Pro, its latest and most advanced AI model to date. This new model boasts enhanced reasoning capabilities and improved accuracy, marking a significant step forward in AI development. Gemini 2.5 Pro is designed with built-in 'thinking' capabilities, enabling it to break down complex tasks into multiple steps and analyze information more effectively before generating a response. This allows the AI to deduce logical conclusions, incorporate contextual nuances, and make informed decisions with unprecedented accuracy, according to Google.
Gemini 2.5 Pro has already taken the top position on the LMArena leaderboard, surpassing other AI models in head-to-head comparisons, which points to strong performance and style on intricate tasks. The model also leads in math and science benchmarks, demonstrating its reasoning capabilities across domains. It is available as Gemini 2.5 Pro (experimental) on Google’s AI Studio and for Gemini Advanced users on the Gemini chat interface.
References :
- Google DeepMind Blog: Gemini 2.5: Our most intelligent AI model
- Shelly Palmer: Google’s Gemini 2.5: AI That Thinks Before It Speaks
- AI News: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date
- Interconnects: Gemini 2.5 Pro and Google's second chance with AI
- SiliconANGLE: Google introduces Gemini 2.5 Pro with chain-of-thought reasoning built-in
- AI News | VentureBeat: Google releases ‘most intelligent model to date,’ Gemini 2.5 Pro
- Analytics Vidhya: Gemini 2.5 Pro is Now #1 on Chatbot Arena with Impressive Jump
- www.tomsguide.com: Google unveils Gemini 2.5 — claims AI breakthrough with enhanced reasoning and multimodal power
- Fello AI: Google’s Gemini 2.5 Shocks the World: Crushing AI Benchmark Like No Other AI Model!
- bdtechtalks.com: What to know about Google Gemini 2.5 Pro
- TestingCatalog: Gemini 2.5 Pro sets new AI benchmark and launches on AI Studio and Gemini
- AI News | VentureBeat: Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
- thezvi.wordpress.com: Gemini 2.5 is the New SoTA
- www.infoworld.com: Google has introduced version 2.5 of its Gemini model, which the company said offers a new level of performance by combining an enhanced base model with improved post-training.
- Composio: Gemini 2.5 Pro vs. Claude 3.7 Sonnet: Coding Comparison
- Composio: Google dropped its best-ever creation, Gemini 2.5 Pro Experimental, on March 25. It is a stupidly incredible reasoning model shining on every …
- www.tomsguide.com: Gemini 2.5 Pro is now free to all users in surprise move
- Analytics India Magazine: Did Google Just Build The Best AI Model for Coding?
- www.zdnet.com: Everyone can now try Gemini 2.5 Pro - for free
Maximilian Schreiner@THE DECODER
//
Google has unveiled Gemini 2.5 Pro, its latest and "most intelligent" AI model to date, showcasing significant advancements in reasoning, coding proficiency, and multimodal functionalities. According to Google, these improvements come from combining a significantly enhanced base model with improved post-training techniques. The model is designed to analyze complex information, incorporate contextual nuances, and draw logical conclusions with unprecedented accuracy. Gemini 2.5 Pro is now available for Gemini Advanced users and on Google's AI Studio.
Google emphasizes the model's "thinking" capabilities, achieved through chain-of-thought reasoning, which allows it to break down complex tasks into multiple steps and reason through them before responding. This new model can handle multimodal input from text, audio, images, videos, and large datasets. Additionally, Gemini 2.5 Pro exhibits strong performance in coding tasks, surpassing Gemini 2.0 in specific benchmarks and excelling at creating visually compelling web apps and agentic code applications. The model also achieved 18.8% on Humanity’s Last Exam, demonstrating its ability to handle complex knowledge-based questions.
References :
- SiliconANGLE: Google LLC said today it’s updating its flagship Gemini artificial intelligence model family by introducing an experimental Gemini 2.5 Pro version.
- The Tech Basic: Google's New AI Models “Think” Before Answering, Outperform Rivals
- AI News | VentureBeat: Google releases ‘most intelligent model to date,’ Gemini 2.5 Pro
- Analytics Vidhya: We Tried the Google 2.5 Pro Experimental Model and It’s Mind-Blowing!
- www.tomsguide.com: Google unveils Gemini 2.5 — claims AI breakthrough with enhanced reasoning and multimodal power
- Google DeepMind Blog: Gemini 2.5: Our most intelligent AI model
- THE DECODER: Google Deepmind has introduced Gemini 2.5 Pro, which the company describes as its most capable AI model to date.
- intelligence-artificielle.developpez.com: Google DeepMind a lancé Gemini 2.5 Pro, un modèle d'IA qui raisonne avant de répondre, affirmant qu'il est le meilleur sur plusieurs critères de référence en matière de raisonnement et de codage
- The Tech Portal: Google unveils Gemini 2.5, its most intelligent AI model yet with ‘built-in thinking’
- Ars OpenForum: Google says the new Gemini 2.5 Pro model is its “smartest” AI yet
- The Official Google Blog: Gemini 2.5: Our most intelligent AI model
- www.techradar.com: I pitted Gemini 2.5 Pro against ChatGPT o3-mini to find out which AI reasoning model is best
- bsky.app: Google's AI comeback is official. Gemini 2.5 Pro Experimental leads in benchmarks for coding, math, science, writing, instruction following, and more, ahead of OpenAI's o3-mini, OpenAI's GPT-4.5, Anthropic's Claude 3.7, xAI's Grok 3, and DeepSeek's R1. The narrative has finally shifted.
- Shelly Palmer: Google’s Gemini 2.5: AI That Thinks Before It Speaks
- bdtechtalks.com: Gemini 2.5 Pro is a new reasoning model that excels in long-context tasks and benchmarks, revitalizing Google’s AI strategy against competitors like OpenAI.
- Interconnects: The end of a busy spring of model improvements and what's next for the presumed leader in AI abilities.
- www.techradar.com: Gemini 2.5 is now available for Advanced users and it seriously improves Google’s AI reasoning
- www.zdnet.com: Google releases 'most intelligent' experimental Gemini 2.5 Pro - here's how to try it
- Unite.AI: Gemini 2.5 Pro is Here—And it Changes the AI Game (Again)
- TestingCatalog: Gemini 2.5 Pro sets new AI benchmark and launches on AI Studio and Gemini
- Analytics Vidhya: Google DeepMind's latest AI model, Gemini 2.5 Pro, has reached the #1 position on the Arena leaderboard.
- AI News: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date
- Fello AI: Google’s Gemini 2.5 Shocks the World: Crushing AI Benchmark Like No Other AI Model!
- Analytics India Magazine: Google Unveils Gemini 2.5, Crushes OpenAI GPT-4.5, DeepSeek R1, & Claude 3.7 Sonnet
- Practical Technology: Practical Tech covers the launch of Google's Gemini 2.5 Pro and its new AI benchmark achievements.
- www.producthunt.com: Google's most intelligent AI model
- Windows Copilot News: Google reveals AI ‘reasoning’ model that ‘explicitly shows its thoughts’
- AI News | VentureBeat: Hands on with Gemini 2.5 Pro: why it might be the most useful reasoning model yet
- thezvi.wordpress.com: Gemini 2.5 Pro Experimental is America’s next top large language model. That doesn’t mean it is the best model for everything. In particular, it’s still Gemini, so it still is a proud member of the Fun Police, in terms of …
- www.computerworld.com: Gemini 2.5 can, among other things, analyze information, draw logical conclusions, take context into account, and make informed decisions.
- www.infoworld.com: Google introduces Gemini 2.5 reasoning models
- Maginative: Google's Gemini 2.5 Pro leads AI benchmarks with enhanced reasoning capabilities, positioning it ahead of competing models from OpenAI and others.
- www.infoq.com: Google's Gemini 2.5 Pro is a powerful new AI model that's quickly becoming a favorite among developers and researchers. It's capable of advanced reasoning and excels in complex tasks.
- AI News | VentureBeat: Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
- Communications of the ACM: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- The Next Web: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- www.tomsguide.com: Gemini 2.5 Pro is now free to all users in surprise move
- Composio: Google just launched Gemini 2.5 Pro on March 26th, claiming to be the best in coding, reasoning and overall everything. But I …
- Composio: Google's Gemini 2.5 Pro, released on March 26th, is being hailed for its enhanced reasoning, coding, and multimodal capabilities.
- Analytics India Magazine: Gemini 2.5 Pro is better than the Claude 3.7 Sonnet for coding in the Aider Polyglot leaderboard.
- www.zdnet.com: Gemini's latest model outperforms OpenAI's o3 mini and Anthropic's Claude 3.7 Sonnet on the latest benchmarks. Here's how to try it.
- www.marketingaiinstitute.com: [The AI Show Episode 142]: ChatGPT’s New Image Generator, Studio Ghibli Craze and Backlash, Gemini 2.5, OpenAI Academy, 4o Updates, Vibe Marketing & xAI Acquires X
- www.tomsguide.com: Gemini 2.5 is free, but can it beat DeepSeek?
- www.tomsguide.com: Google Gemini could soon help your kids with their homework — here’s what we know
- PCWorld: Google’s latest Gemini 2.5 Pro AI model is now free for all users
- www.techradar.com: Google just made Gemini 2.5 Pro Experimental free for everyone, and that's awesome.
- Last Week in AI: #205 - Gemini 2.5, ChatGPT Image Gen, Thoughts of LLMs
Matthias Bastian@THE DECODER
//
Mistral AI, a French artificial intelligence startup, has launched Mistral Small 3.1, a new open-source language model boasting 24 billion parameters. According to the company, this model outperforms similar offerings from Google and OpenAI, specifically Gemma 3 and GPT-4o Mini, while operating efficiently on consumer hardware like a single RTX 4090 GPU or a MacBook with 32GB RAM. It supports multimodal inputs, processing both text and images, and features an expanded context window of up to 128,000 tokens, which makes it suitable for long-form reasoning and document analysis.
Mistral Small 3.1 is released under the Apache 2.0 license, promoting accessibility and competition within the AI landscape. Mistral AI aims to challenge the dominance of major U.S. tech firms by offering a high-performance, cost-effective AI solution. The model achieves inference speeds of 150 tokens per second and is designed for text and multimodal understanding, positioning itself as a powerful alternative to industry-leading models without the need for expensive cloud infrastructure.
References :
- THE DECODER: Mistral launches improved Small 3.1 multimodal model
- venturebeat.com: Mistral AI launches efficient open-source model that outperforms Google and OpenAI offerings with just 24 billion parameters, challenging U.S. tech giants' dominance in artificial intelligence.
- Maginative: Mistral Small 3.1 Outperforms Gemma 3 and GPT-4o Mini
- TestingCatalog: Mistral Small 3: A 24B open-source AI model optimized for speed
- Simon Willison's Weblog: Mistral Small 3.1, an open-source AI model, delivers state-of-the-art performance.
- SiliconANGLE: Paris-based artificial intelligence startup Mistral AI said today it’s open-sourcing a new, lightweight AI model called Mistral Small 3.1, claiming it surpasses the capabilities of similar models created by OpenAI and Google LLC.
- Analytics Vidhya: Mistral Small 3.1: The Best Model in its Weight Class
- Analytics Vidhya: Mistral 3.1 vs Gemma 3: Which is the Better Model?
@bdtechtalks.com
//
Alibaba has recently launched Qwen-32B, a new reasoning model that performs on par with DeepSeek's R1 model. This is a notable achievement for smaller models: the Qwen team showed that reinforcement learning (RL) on a strong base model can unlock reasoning capabilities that bring a smaller model's performance on par with much larger ones.
Qwen-32B not only matches but also surpasses models like DeepSeek-R1 and OpenAI's o1-mini across key industry benchmarks, including AIME24, LiveBench, and BFCL. It does so with only approximately 5% of the parameters used by DeepSeek-R1, resulting in lower inference costs without compromising on quality or capability. Groq offers the model to developers as Qwen QwQ 32B on GroqCloud™, running the 32B-parameter model at ~400 tokens per second. The model is proving very competitive in reasoning benchmarks and is one of the top open-source models in use.
Qwen-32B was explicitly designed for tool use and for adapting its reasoning based on environmental feedback, a significant advantage for AI agents that need to reason, plan, and adapt based on context. It outperforms R1 and o1-mini on the Berkeley Function Calling Leaderboard.
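The "approximately 5%" parameter figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes the 671B parameter count for DeepSeek-R1 that appears in coverage cited elsewhere on this page, and the helper name `param_share` is made up for the example.

```python
# Parameter counts in billions. The 671B value for DeepSeek-R1 is an
# assumption taken from headlines cited on this page, not an official spec.
QWQ_PARAMS_B = 32
R1_PARAMS_B = 671


def param_share(small_b: float, large_b: float) -> float:
    """Return the smaller model's parameter count as a percentage of the larger one."""
    return 100.0 * small_b / large_b


if __name__ == "__main__":
    share = param_share(QWQ_PARAMS_B, R1_PARAMS_B)
    print(f"The 32B model has about {share:.1f}% of the 671B model's parameters")
```

Under that assumption the share comes out just under 5%, which is consistent with the rounded figure quoted in the summary.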
References :
- Last Week in AI: LWiAI Podcast #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- Last Week in AI: #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Sebastian Raschka, PhD: This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on inference-time compute scaling that have emerged since the release of DeepSeek R1.
- Analytics Vidhya: China is rapidly advancing in AI, releasing models like DeepSeek and Qwen to rival global giants.
- Last Week in AI: Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1
- Maginative: Despite having far fewer parameters, Qwen’s new QwQ-32B model outperforms DeepSeek-R1 and OpenAI’s o1-mini in mathematical benchmarks and scientific reasoning, showcasing the power of reinforcement learning.
George Fitzmaurice@Latest from ITPro
//
DeepSeek, a Chinese AI startup founded in 2023, is rapidly gaining traction as a competitor to established models like ChatGPT and Claude. It has quickly risen to prominence and now competes against models with far more parameters while requiring much less compute. As of January 2025, DeepSeek boasts 33.7 million monthly active users and 22.15 million daily active users globally, showcasing its rapid adoption and impact.
Qwen has recently introduced QwQ-32B, a 32-billion-parameter reasoning model designed to improve performance on complex problem-solving tasks; it demonstrates robust performance in tasks requiring deep analytical thinking. QwQ-32B uses reinforcement learning (RL) in a reward-based, multi-stage training process to improve its reasoning capabilities, and can match a 671B-parameter model. It demonstrates that RL scaling can dramatically enhance model intelligence without requiring massive parameter counts.
References :
- Analytics Vidhya: QwQ-32B Vs DeepSeek-R1: Can a 32B Model Challenge a 671B Parameter Model?
- MarkTechPost: Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task
- Fello AI: DeepSeek is rapidly emerging as a significant player in the AI space, particularly since its public release in January 2025.
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- www.itpro.com: ‘Awesome for the community’: DeepSeek open sourced its code repositories, and experts think it could give competitors a scare
Michal Langmajer@Fello AI
//
OpenAI has announced the release of GPT-4.5, its latest language model, which it calls its 'last non-chain-of-thought model.' According to OpenAI, GPT-4.5 offers substantial enhancements over its predecessors, particularly in advanced reasoning, problem-solving, and contextual understanding. Sam Altman, CEO of OpenAI, described it as the "first model that feels like talking to a thoughtful person," noting moments of astonishment at the quality of advice received from the AI.
However, the rollout is facing challenges due to GPU shortages. Altman stated the company is "out of GPUs," leading to a staggered release, initially limited to ChatGPT Pro subscribers who pay $200 a month. GPT-4.5 is available to developers across all paid API tiers, and OpenAI plans to expand access to Plus and Team tiers next week, with tens of thousands of GPUs expected to arrive to alleviate the supply constraints. Although GPT-4.5 is not a reasoning model, OpenAI estimates that it is 30 times more expensive to run than GPT-4o.
References :
- Fello AI: OpenAI’s GPT‑4.5 Finally Arrived: Can It Beat Grok 3 and Claude 3.7?
- Shelly Palmer: Shelly Palmer discusses the release of OpenAI's GPT-4.5.
- Analytics Vidhya: Everything You Need to Know About OpenAI’s GPT-4.5
- www.tomshardware.com: Tom's Hardware reports on Sam Altman's statement about GPU shortages delaying the GPT-4.5 release.
- venturebeat.com: VentureBeat reports OpenAI releases GPT-4.5 claiming 10X efficiency over GPT-4, but says it’s ‘not a frontier model’
- Gradient Flow: Scaling Up, Costs Up: GPT-4.5 and the Intensifying AI Competition
- Pivot to AI: OpenAI releases GPT-4.5 with ridiculous prices for a mediocre model
- Techstrong.ai: TechStrong.ai article on OpenAI's GPT-4.5 AI model.
- THE DECODER: OpenAI has released GPT-4.5 as a "Research Preview".
- eWEEK: OpenAI releases GPT-4.5, a “Warm” Generative AI Model, for Paid Plans and APIs
- www.windowscentral.com: Sam Altman on GPT-4.5: Expensive, yet the closest thing to a thoughtful conversational partner we've seen
- THE DECODER: OpenAI's largest model GPT-4.5 delivers on vibes instead of benchmarks
- 9to5Mac: OpenAI announces GPT-4.5, ChatGPT’s largest and best model for chat
- www.engadget.com: OpenAI's new GPT-4.5 model is a better, more natural conversationalist
- The Verge: Anthropic’s new ‘hybrid reasoning’ AI model is its smartest yet
- THE DECODER: OpenAI has presented its largest language model to date. According to Mark Chen, Chief Research Officer at OpenAI, GPT 4.5 shows that the scaling of AI models has not yet reached its limits.
- The Verge: OpenAI is launching GPT-4.5 today, its newest and largest AI language model.
- NextBigFuture.com: OpenAI GPT 4.5 Has BIG Coding Improvement – Claims Scaling Still Works – Expensive
- TechCrunch: OpenAI unveils GPT-4.5 ‘Orion,’ its largest AI model yet
- PCMag Middle East ai: Reporting that OpenAI has launched GPT-4.5 but is limiting it to priciest tiers due to GPU shortages.
- Analytics Vidhya: Two days ago, on 27 Feb 2025, OpenAI dropped GPT-4.5, expectations were sky-high. But instead of a groundbreaking leap forward, we got a model prioritizing emotional intelligence over raw reasoning power.
- AI News | VentureBeat: GPT-4.5 for enterprise: Do its accuracy and knowledge justify the cost?
- Windows Report: OpenAI released GPT-4.5, but it’s not much of an upgrade from GPT-4o. After DeepSeek was unleashed into the world, everyone wondered what OpenAI would do now that another AI company had developed an extremely powerful model at a tiny fraction of the budget.
- iHLS: OpenAI Unveils GPT-4.5
- Data Phoenix: OpenAI releases the long-awaited GPT-4.5/Orion, its last non-chain-of-thought model
- Towards AI: GPT-4.5: The Next Evolution in AI
- Towards AI: Towards AI article on TAI #142: GPT-4.5 Released.
- www.marketingaiinstitute.com: [The AI Show Episode 138]: Introducing GPT-4.5, Claude 3.7 Sonnet, Alexa+, Deep Research Now in ChatGPT Plus & How AI Is Disrupting Writing
- Analytics Vidhya: Now, this is a shocker, despite a lot of backlash on the cost of GPT 4.5, it becomes #1 in the Chatbot Arena LLM Leaderboard! Securing over 3,200+ votes, OpenAI’s latest model has emerged as number one across all evaluation categories, prominently excelling in Style Control and Multi-Turn interactions.
Jibin Joseph@PCMag Middle East ai
//
Elon Musk's xAI has officially launched Grok 3, its latest AI model, which Musk has dubbed the "smartest AI on Earth." Trained on 200,000 GPUs, this new model is positioned as a direct competitor to OpenAI's GPT-4 and DeepSeek, excelling in math, science, and coding benchmarks. The launch appears to coincide with a price hike for X Premium+ subscriptions, offering initial access to Grok 3 for subscribers in the U.S., with a separate subscription planned for web and app versions.
xAI's Grok 3 comes with advanced reasoning and agentic abilities and uses more than 10 times the computing power of Grok 2. It has two modes: "Think," which uses a smaller Grok 3 mini model for simple queries, and "Big Brain," which uses Grok 3 for complex problems. A new agentic feature called DeepSearch conducts comprehensive analyses and generates reports, similar to tools recently released by OpenAI, Google, and Perplexity.
References :
- PCMag Middle East ai: Elon Musk Reveals Grok 3 AI Chatbot: Here's What It Can Do. The model was trained on 200,000 GPUs and beats its rivals in math, science, and coding benchmarks, Musk says. It appears to coincide with a price hike on X Premium+ accounts.
- www.theguardian.com: Elon Musk’s startup rolls out new Grok-3 chatbot as AI competition intensifies. Billionaire CEO claims bot is ‘maximally truth-seeking’ as he looks to rival DeepSeek, OpenAI and Google Gemini
- shellypalmer.com: xAI Releases Grok 3: Technical Details and Competitive Context
- techvro.com: xAI’s Grok 3: A New Challenger to OpenAI and DeepSeek Emerges
- The Tech Portal: Elon Musk’s xAI unveils new Grok 3 model
- WebProNews: Grok 3.0 Unveiled: A Technical Leap Forward in the AI Arms Race
- Techstrong.ai: Elon Musk's artificial intelligence (AI) startup xAI which released the latest iteration of its Grok chatbot on Tuesday, is nearing a $5 billion server deal with Dell Inc.
- venturebeat.com: Elon Musk's latest AI model, Grok 3, is expected to challenge leading AI players in the market.
- www.artificialintelligence-news.com: xAI unveiled its Grok 3 AI model on Monday, alongside new capabilities such as image analysis and refined question answering.
- AI News: xAI unveiled its Grok 3 AI model on Monday, alongside new capabilities such as image analysis and refined question answering.
- eWEEK: Reports that Grok 3 Launches Today.
- www.analyticsvidhya.com: Grok 3 vs DeepSeek R1: Which is Better?
- www.analyticsvidhya.com: Grok 3 is Here! And What It Can Do Will Blow Your Mind!
- Casey Newton: Training Grok 3 took Elon 200,000 GPUs and untold billions, and it's ... decent at best? I wrote about AI's commodity problem
- www.verdict.co.uk: The debut of Grok-3 comes at a crucial time in the AI sector, following DeepSeek's recent open-source model release.
- Analytics Vidhya: Grok 3 vs DeepSeek R1: Which is Better?
- www.eweek.com: The AI chatbot from Elon Musk’s startup xAI is slated to go live on February 17 at 8:00 p.m. Pacific time.
- futurism.com: Researchers Find Elon Musk's New Grok AI Is Extremely Vulnerable to Hacking
- www.analyticsvidhya.com: Is 100K+ GPUs for Grok 3 worth it?
- Analytics Vidhya: Is 100K+ GPUs for Grok 3 worth it?
- MarkTechPost: Grok-3, the latest iteration of xAI's chatbot, showed strong performance in several tasks.
- www.marketingaiinstitute.com: [The AI Show Episode 136]: Elon Musk Tries to Buy OpenAI, JD Vance’s AI Speech, New GenAI Jobs Study, GPT-4o Update, OpenAI Product Roadmap & Grok 3
- bdtechtalks.com: This article provides an overview of Grok-3, highlighting its capabilities and benchmarks.
- www.eweek.com: This news piece covers Grok-3, its pricing, benchmarks, and availability.
- bdtechtalks.com: Overview of Grok-3's capabilities and benchmarks.
- shellypalmer.com: AI is evolving faster than ever, and tech expert Shelly Palmer, Professor of Advanced Media at Syracuse University, breaks down the latest developments in large language models (LLMs) on CNN International.
- the-decoder.com: grok-3-wins-praise-from-openai-founder
- composio.dev: Grok 3 vs. Deepseek r1
- thezvi.wordpress.com: Blog post about Grok 3 and its capabilities.
- eWEEK: News article about Grok 3, focusing on benchmarks, pricing and availability.
- Unite.AI: Elon Musk’s xAI has introduced Grok-3, a next-generation AI chatbot designed to change the way people interact on social media.
- www.unite.ai: Elon Musk’s xAI has introduced Grok-3, a next-generation AI chatbot designed to change the way people interact on social media.
- THE DECODER: Grok 3, the new model from Musk's xAI, wins praise from OpenAI founder
- www.marktechpost.com: xAI Releases Grok 3 Beta: A Super Advanced AI Model Blending Strong Reasoning with Extensive Pretraining Knowledge
- Analytics Vidhya: Grok 3 Prompts that Can Make Your Work Easy!
- Fello AI: Grok 3 vs ChatGPT vs DeepSeek vs Claude vs Gemini – Which AI Is Best in February 2025?
- TestingCatalog: xAI to add file attachment support in DeepSearch mode for Grok 3 on X
- bsky.app: xAI’s Grok-3 claims to outperform OpenAI’s GPT-4o, Google’s Gemini, and DeepSeek’s V3. With $12B raised in 18 months and a 100K Nvidia GPU cluster, the investment appears to pay off.
- AI News | VentureBeat: xAI’s new Grok 3 model criticized for blocking sources that call Musk, Trump top spreaders of misinformation
- www.marketingaiinstitute.com: [The AI Show Episode 137]: GPT-4.5 and GPT-5 Release Dates, Grok 3, Forecasting New Jobs, DeepSeek Investigation, Microsoft Quantum Chip & Google AI “Co-Scientist”
- SiliconANGLE: SiliconAngle reports on the launch of Grok-3 and its advanced reasoning capabilities.
- futurism.com: Grok Caught Following Special Instructions for Queries About Elon Musk
- PCMag Middle East ai: Grok Was Briefly Instructed Not to Say Musk, Trump Spread Misinformation on X
- eWEEK: Grok AI Blocks Responses Claiming Trump and Musk “Spread Misinformation”
- OODAloop: A new generation of AIs: Claude 3.7 and Grok 3
- NextBigFuture.com: What Comes After XAI GROK 3?
@www.theverge.com
//
OpenAI has recently launched its o3-mini model, the first in its o3 family, showcasing advancements in both speed and reasoning capabilities. The model comes in two variants: o3-mini-high, which prioritizes in-depth reasoning, and o3-mini-low, designed for quicker responses. Benchmarks indicate that o3-mini offers performance comparable to its predecessor, o1, at a significantly reduced cost: approximately 15 times cheaper and five times faster. It is also cheaper than GPT-4o, although it carries a usage limit of 150 messages per hour while GPT-4o is unrestricted.
OpenAI is also now providing more detailed insights into the reasoning process of o3-mini, addressing criticism regarding transparency and competition from models like DeepSeek-R1. This includes revealing summarized versions of the chain of thought (CoT) used by the model, offering users greater clarity on its reasoning logic. OpenAI CEO Sam Altman believes that merging large language model scaling with reasoning capabilities could lead to "new scientific knowledge," hinting at future advancements beyond current limitations in inventing new algorithms or fields.
References :
- techcrunch.com: OpenAI on Friday launched a new AI "reasoning" model, o3-mini, the newest in the company's o family of reasoning models.
- www.theverge.com: o3-mini should outperform o1 and provide faster, more accurate answers.
- community.openai.com: Today we’re releasing the latest model in our reasoning series, OpenAI o3-mini, and you can start using it now in the API.
- Techmeme: OpenAI launches o3-mini, its latest reasoning model that it says is largely on par with o1 and o1-mini in capability, but runs faster and costs less.
- simonwillison.net: OpenAI's o3-mini costs $1.10 per 1M input tokens and $4.40 per 1M output tokens, cheaper than GPT-4o, which costs $2.50 and $10, and o1, which costs $15 and $60.
- community.openai.com: This article discusses the release of OpenAI's o3-mini model and its capabilities, including its ability to search the web for data and return what it found.
- futurism.com: This article discusses the release of OpenAI's o3-mini reasoning model, aiming to improve the performance of large language models (LLMs) by handling complex reasoning tasks. This new model is projected to be an advancement in both performance and cost efficiency.
- the-decoder.com: This article discusses how OpenAI's o3-mini reasoning model is poised to advance scientific knowledge through the merging of LLM scaling and reasoning capabilities.
- www.analyticsvidhya.com: This blog post highlights the development and use of OpenAI's reasoning model, focusing on its increased performance and cost-effectiveness compared to previous generations. The emphasis is on its use for handling complex reasoning tasks.
- AI News | VentureBeat: OpenAI is now showing more details of the reasoning process of o3-mini, its latest reasoning model. The change was announced on OpenAI’s X account and comes as the AI lab is under increased pressure by DeepSeek-R1, a rival open model that fully displays its reasoning tokens.
- Composio: This article discusses OpenAI's o3-mini model and its performance in reasoning tasks.
- composio.dev: This article discusses OpenAI's release of the o3-mini model, highlighting its improved speed and efficiency in AI reasoning.
- THE DECODER: Training larger and larger language models (LLMs) with more and more data hits a wall.
- Analytics Vidhya: OpenAI’s o3-mini is not even a week old and it’s already a favorite amongst ChatGPT users.
- slviki.org: OpenAI unveils o3-mini, a faster, more cost-effective reasoning model
- singularityhub.com: This post talks about improvements in LLMs, focusing on the new o3-mini model from OpenAI.
- computational-intelligence.blogspot.com: This blog post summarizes various AI-related news stories, including the launch of OpenAI's o3-mini model.
- www.lemonde.fr: OpenAI's new o3-mini model is designed to be faster and more cost-effective than prior models.
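As a rough check on the cost claims in the o3-mini summary, the per-token prices quoted in the simonwillison.net entry above can be compared directly. This is a minimal sketch that assumes those quoted prices are accurate; the `PRICES` table and `price_ratio` helper are hypothetical names used only for illustration.

```python
# USD per 1M tokens as (input, output), copied from the simonwillison.net
# entry in the reference list. Treated as assumptions, not verified rates.
PRICES = {
    "o3-mini": (1.10, 4.40),
    "GPT-4o": (2.50, 10.00),
    "o1": (15.00, 60.00),
}


def price_ratio(model: str, baseline: str = "o3-mini") -> tuple[float, float]:
    """Return how many times more `model` costs than `baseline` for (input, output) tokens."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


if __name__ == "__main__":
    for name in ("GPT-4o", "o1"):
        in_ratio, out_ratio = price_ratio(name)
        print(f"{name} vs o3-mini: {in_ratio:.1f}x input, {out_ratio:.1f}x output")
```

On these numbers o1 costs roughly 13 to 14 times as much as o3-mini per token, which is in the same range as the "approximately 15 times cheaper" figure in the summary, and GPT-4o costs a little over twice as much.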