@simonwillison.net
//
Google has broadened access to its advanced AI model, Gemini 2.5 Pro, showcasing impressive capabilities and competitive pricing designed to challenge rival models like OpenAI's GPT-4o and Anthropic's Claude 3.7 Sonnet. Google's latest flagship model is currently recognized as a top performer, excelling in Optical Character Recognition (OCR), audio transcription, and long-context coding tasks. Alphabet CEO Sundar Pichai highlighted Gemini 2.5 Pro as Google's "most intelligent model + now our most in demand." Demand has increased by over 80 percent this month alone across both Google AI Studio and the Gemini API.
Google's expansion includes a tiered pricing structure for the Gemini 2.5 Pro API, offering a more affordable option compared to competitors. Prompts with less than 200,000 tokens are priced at $1.25 per million for input and $10 per million for output, while larger prompts increase to $2.50 and $15 per million tokens, respectively. Although prompt caching is not yet available, its future implementation could potentially lower costs further. The free tier allows 500 free grounding queries with Google Search per day, with an additional 1,500 free queries in the paid tier, with costs per 1,000 queries set at $35 beyond that.
The AI research group EpochAI reported that Gemini 2.5 Pro scored 84% on the GPQA Diamond benchmark, surpassing the typical 70% score of human experts. This benchmark assesses challenging multiple-choice questions in biology, chemistry, and physics, validating Google's benchmark results. The model is now available as a paid model, along with a free tier option. The free tier can use data to improve Google's products while the paid tier cannot. Rates vary by tier and range from 150-2,000/minute. Google will retire the Gemini 2.0 Pro preview entirely in favor of 2.5.
Recommended read:
References :
- Data Phoenix: Google Unveils Gemini 2.5: Its Most Intelligent AI Model Yet
- AI News | VentureBeat: Gemini 2.5 Pro is now available without limits and for cheaper than Claude, GPT-4o
- Simon Willison's Weblog: Google's Gemini 2.5 Pro is currently the top model and, from , a superb model for OCR, audio transcription and long-context coding. You can now pay for it! The new gemini-2.5-pro-preview-03-25 model ID is priced like this: Prompts less than 200,00 tokens: $1.25/million tokens for input, $10/million for output Prompts more than 200,000 tokens (up to the 1,048,576 max): $2.50/million for input, $15/million for output This is priced at around the same level as Gemini 1.5 Pro ($1.25/$5 for input/output below 128,000 tokens, $2.50/$10 above 128,000 tokens), is cheaper than GPT-4o for shorter prompts ($2.50/$10) and is cheaper than Claude 3.7 Sonnet ($3/$15). Gemini 2.5 Pro is a reasoning model, and invisible reasoning tokens are included in the output token count. I just tried prompting "hi" and it charged me 2 tokens for input and 623 for output, of which 613 were "thinking" tokens. That still adds up to just 0.6232 cents (less than a cent) using my which I updated to support the new model just now. I released this morning adding support for the new model: llm install -U llm-gemini llm -m gemini-2.5-pro-preview-03-25 hi Note that the model continues to be available for free under the previous gemini-2.5-pro-exp-03-25 model ID: llm -m gemini-2.5-pro-exp-03-25 hi The free tier is "used to improve our products", the paid tier is not. Rate limits for the paid model - from 150/minute and 1,000/day for tier 1 (billing configured), 1,000/minute and 50,000/day for Tier 2 ($250 total spend) and 2,000/minute and unlimited/day for Tier 3 ($1,000 total spend). Meanwhile the free tier continues to limit you to 5 requests per minute and 25 per day. Google are entirely in favour of 2.5. Via Tags: , , , , , , ,
- THE DECODER: Google has opened broader access to Gemini 2.5 Pro, its latest AI flagship model, which demonstrates impressive performance in scientific testing while introducing competitive pricing.
- Bernard Marr: Google's latest AI model, Gemini 2.5 Pro, is poised to streamline complex mathematical and coding operations.
- The Cognitive Revolution: In this illuminating episode of The Cognitive Revolution, host Nathan Labenz speaks with Jack Rae, principal research scientist at Google DeepMind and technical lead on Google's thinking and inference time scaling work.
- bsky.app: Gemini 2. 5 Pro pricing was announced today - it's cheaper than both GPT-4o and Claude 3.7 Sonnet I've updated my llm-gemini plugin to add support for the new paid model Full notes here:
- Last Week in AI: Google unveils a next-gen AI reasoning model, OpenAI rolls out image generation powered by GPT-4o to ChatGPT, Tencent’s Hunyuan T1 AI reasoning model rivals DeepSeek in performance and price
Maximilian Schreiner@THE DECODER
//
Google has unveiled Gemini 2.5 Pro, its latest and "most intelligent" AI model to date, showcasing significant advancements in reasoning, coding proficiency, and multimodal functionalities. According to Google, these improvements come from combining a significantly enhanced base model with improved post-training techniques. The model is designed to analyze complex information, incorporate contextual nuances, and draw logical conclusions with unprecedented accuracy. Gemini 2.5 Pro is now available for Gemini Advanced users and on Google's AI Studio.
Google emphasizes the model's "thinking" capabilities, achieved through chain-of-thought reasoning, which allows it to break down complex tasks into multiple steps and reason through them before responding. This new model can handle multimodal input from text, audio, images, videos, and large datasets. Additionally, Gemini 2.5 Pro exhibits strong performance in coding tasks, surpassing Gemini 2.0 in specific benchmarks and excelling at creating visually compelling web apps and agentic code applications. The model also achieved 18.8% on Humanity’s Last Exam, demonstrating its ability to handle complex knowledge-based questions.
Recommended read:
References :
- SiliconANGLE: Google LLC said today it’s updating its flagship Gemini artificial intelligence model family by introducing an experimental Gemini 2.5 Pro version.
- The Tech Basic: Google's New AI Models “Think” Before Answering, Outperform Rivals
- AI News | VentureBeat: Google releases ‘most intelligent model to date,’ Gemini 2.5 Pro
- Analytics Vidhya: We Tried the Google 2.5 Pro Experimental Model and It’s Mind-Blowing!
- www.tomsguide.com: Google unveils Gemini 2.5 — claims AI breakthrough with enhanced reasoning and multimodal power
- Google DeepMind Blog: Gemini 2.5: Our most intelligent AI model
- THE DECODER: Google Deepmind has introduced Gemini 2.5 Pro, which the company describes as its most capable AI model to date. The article appeared first on .
- intelligence-artificielle.developpez.com: Google DeepMind a lancé Gemini 2.5 Pro, un modèle d'IA qui raisonne avant de répondre, affirmant qu'il est le meilleur sur plusieurs critères de référence en matière de raisonnement et de codage
- The Tech Portal: Google unveils Gemini 2.5, its most intelligent AI model yet with ‘built-in thinking’
- Ars OpenForum: Google says the new Gemini 2.5 Pro model is its “smartest†AI yet
- The Official Google Blog: Gemini 2.5: Our most intelligent AI model
- www.techradar.com: I pitted Gemini 2.5 Pro against ChatGPT o3-mini to find out which AI reasoning model is best
- bsky.app: Google's AI comeback is official. Gemini 2.5 Pro Experimental leads in benchmarks for coding, math, science, writing, instruction following, and more, ahead of OpenAI's o3-mini, OpenAI's GPT-4.5, Anthropic's Claude 3.7, xAI's Grok 3, and DeepSeek's R1. The narrative has finally shifted.
- Shelly Palmer: Google’s Gemini 2.5: AI That Thinks Before It Speaks
- bdtechtalks.com: Gemini 2.5 Pro is a new reasoning model that excels in long-context tasks and benchmarks, revitalizing Google’s AI strategy against competitors like OpenAI.
- Interconnects: The end of a busy spring of model improvements and what's next for the presumed leader in AI abilities.
- www.techradar.com: Gemini 2.5 is now available for Advanced users and it seriously improves Google’s AI reasoning
- www.zdnet.com: Google releases 'most intelligent' experimental Gemini 2.5 Pro - here's how to try it
- Unite.AI: Gemini 2.5 Pro is Here—And it Changes the AI Game (Again)
- TestingCatalog: Gemini 2.5 Pro sets new AI benchmark and launches on AI Studio and Gemini
- Analytics Vidhya: Google DeepMind's latest AI model, Gemini 2.5 Pro, has reached the #1 position on the Arena leaderboard.
- AI News: Gemini 2.5: Google cooks up its ‘most intelligent’ AI model to date
- Fello AI: Google’s Gemini 2.5 Shocks the World: Crushing AI Benchmark Like No Other AI Model!
- Analytics India Magazine: Google Unveils Gemini 2.5, Crushes OpenAI GPT-4.5, DeepSeek R1, & Claude 3.7 Sonnet
- Practical Technology: Practical Tech covers the launch of Google's Gemini 2.5 Pro and its new AI benchmark achievements.
- Shelly Palmer: Google's Gemini 2.5: AI That Thinks Before It Speaks
- www.producthunt.com: Google's most intelligent AI model
- Windows Copilot News: Google reveals AI ‘reasoning’ model that ‘explicitly shows its thoughts’
- AI News | VentureBeat: Hands on with Gemini 2.5 Pro: why it might be the most useful reasoning model yet
- thezvi.wordpress.com: Gemini 2.5 Pro Experimental is America’s next top large language model. That doesn’t mean it is the best model for everything. In particular, it’s still Gemini, so it still is a proud member of the Fun Police, in terms of …
- Computerworld: Gemini 2.5 can, among other things, analyze information, draw logical conclusions, take context into account, and make informed decisions.
- www.infoworld.com: Google introduces Gemini 2.5 reasoning models
- Maginative: Google's Gemini 2.5 Pro leads AI benchmarks with enhanced reasoning capabilities, positioning it ahead of competing models from OpenAI and others.
- www.infoq.com: Google's Gemini 2.5 Pro is a powerful new AI model that's quickly becoming a favorite among developers and researchers. It's capable of advanced reasoning and excels in complex tasks.
- AI News | VentureBeat: Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
- Communications of the ACM: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- The Next Web: Google has released Gemini 2.5 Pro, an updated AI model focused on enhanced reasoning, code generation, and multimodal processing.
- www.tomsguide.com: Gemini 2.5 Pro is now free to all users in surprise move
- Composio: Google just launched Gemini 2.5 Pro on March 26th, claiming to be the best in coding, reasoning and overall everything. But I The post appeared first on .
- Composio: Google's Gemini 2.5 Pro, released on March 26th, is being hailed for its enhanced reasoning, coding, and multimodal capabilities.
- Analytics India Magazine: Gemini 2.5 Pro is better than the Claude 3.7 Sonnet for coding in the Aider Polyglot leaderboard.
- www.zdnet.com: Gemini's latest model outperforms OpenAI's o3 mini and Anthropic's Claude 3.7 Sonnet on the latest benchmarks. Here's how to try it.
- www.marketingaiinstitute.com: [The AI Show Episode 142]: ChatGPT’s New Image Generator, Studio Ghibli Craze and Backlash, Gemini 2.5, OpenAI Academy, 4o Updates, Vibe Marketing & xAI Acquires X
- www.tomsguide.com: Gemini 2.5 is free, but can it beat DeepSeek?
- www.tomsguide.com: Google Gemini could soon help your kids with their homework — here’s what we know
- PCWorld: Google’s latest Gemini 2.5 Pro AI model is now free for all users
- www.techradar.com: Google just made Gemini 2.5 Pro Experimental free for everyone, and that's awesome.
- Last Week in AI: #205 - Gemini 2.5, ChatGPT Image Gen, Thoughts of LLMs
Matthias Bastian@THE DECODER
//
Mistral AI, a French artificial intelligence startup, has launched Mistral Small 3.1, a new open-source language model boasting 24 billion parameters. According to the company, this model outperforms similar offerings from Google and OpenAI, specifically Gemma 3 and GPT-4o Mini, while operating efficiently on consumer hardware like a single RTX 4090 GPU or a MacBook with 32GB RAM. It supports multimodal inputs, processing both text and images, and features an expanded context window of up to 128,000 tokens, which makes it suitable for long-form reasoning and document analysis.
Mistral Small 3.1 is released under the Apache 2.0 license, promoting accessibility and competition within the AI landscape. Mistral AI aims to challenge the dominance of major U.S. tech firms by offering a high-performance, cost-effective AI solution. The model achieves inference speeds of 150 tokens per second and is designed for text and multimodal understanding, positioning itself as a powerful alternative to industry-leading models without the need for expensive cloud infrastructure.
Recommended read:
References :
- THE DECODER: Mistral launches improved Small 3.1 multimodal model
- venturebeat.com: Mistral AI launches efficient open-source model that outperforms Google and OpenAI offerings with just 24 billion parameters, challenging U.S. tech giants' dominance in artificial intelligence.
- Maginative: Mistral Small 3.1 Outperforms Gemma 3 and GPT-4o Mini
- TestingCatalog: Mistral Small 3: A 24B open-source AI model optimized for speed
- Simon Willison's Weblog: Mistral Small 3.1, an open-source AI model, delivers state-of-the-art performance.
- SiliconANGLE: Paris-based artificial intelligence startup Mistral AI said today it’s open-sourcing a new, lightweight AI model called Mistral Small 3.1, claiming it surpasses the capabilities of similar models created by OpenAI and Google LLC.
- Analytics Vidhya: Mistral Small 3.1: The Best Model in its Weight Class
- Analytics Vidhya: Mistral 3.1 vs Gemma 3: Which is the Better Model?
@bdtechtalks.com
//
Alibaba has recently launched Qwen-32B, a new reasoning model, which demonstrates performance levels on par with DeepMind's R1 model. This development signifies a notable achievement in the field of AI, particularly for smaller models. The Qwen team showcased that reinforcement learning on a strong base model can unlock reasoning capabilities for smaller models that enhances their performance to be on par with giant models.
Qwen-32B not only matches but also surpasses models like DeepSeek-R1 and OpenAI's o1-mini across key industry benchmarks, including AIME24, LiveBench, and BFCL. This is significant because Qwen-32B achieves this level of performance with only approximately 5% of the parameters used by DeepSeek-R1, resulting in lower inference costs without compromising on quality or capability. Groq is offering developers the ability to build FAST with Qwen QwQ 32B on GroqCloud™, running the 32B parameter model at ~400 T/s. This model is proving to be very competitive in reasoning benchmarks and is one of the top open source models being used.
The Qwen-32B model was explicitly designed for tool use and adapting its reasoning based on environmental feedback, which is a huge win for AI agents that need to reason, plan, and adapt based on context (outperforms R1 and o1-mini on the Berkeley Function Calling Leaderboard). With these capabilities, Qwen-32B shows that RL on a strong base model can unlock reasoning capabilities for smaller models that enhances their performance to be on par with giant models.
Recommended read:
References :
- Last Week in AI: LWiAI Podcast #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- Last Week in AI: #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
- Sebastian Raschka, PhD: This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on inference-time compute scaling that have emerged since the release of DeepSeek R1.
- Analytics Vidhya: China is rapidly advancing in AI, releasing models like DeepSeek and Qwen to rival global giants.
- Last Week in AI: Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1
- Maginative: Despite having far fewer parameters, Qwen’s new QwQ-32B model outperforms DeepSeek-R1 and OpenAI’s o1-mini in mathematical benchmarks and scientific reasoning, showcasing the power of reinforcement learning.
@bdtechtalks.com
//
Alibaba's Qwen team has unveiled QwQ-32B, a 32-billion-parameter reasoning model that rivals much larger AI models in problem-solving capabilities. This development highlights the potential of reinforcement learning (RL) in enhancing AI performance. QwQ-32B excels in mathematics, coding, and scientific reasoning tasks, outperforming models like DeepSeek-R1 (671B parameters) and OpenAI's o1-mini, despite its significantly smaller size. Its effectiveness lies in a multi-stage RL training approach, demonstrating the ability of smaller models with scaled reinforcement learning to match or surpass the performance of giant models.
The QwQ-32B is not only competitive in performance but also offers practical advantages. It is available as open-weight under an Apache 2.0 license, allowing businesses to customize and deploy it without restrictions. Additionally, QwQ-32B requires significantly less computational power, running on a single high-end GPU compared to the multi-GPU setups needed for larger models like DeepSeek-R1. This combination of performance, accessibility, and efficiency positions QwQ-32B as a valuable resource for the AI community and enterprises seeking to leverage advanced reasoning capabilities.
Recommended read:
References :
- Groq: A Guide to Reasoning with Qwen QwQ 32B
- Analytics Vidhya: Qwen’s QwQ-32B: Small Model with Huge Potential
- Maginative: Alibaba's Latest AI Model, QwQ-32B, Beats Larger Rivals in Math and Reasoning
- bdtechtalks.com: Alibaba’s QwQ-32B reasoning model matches DeepSeek-R1, outperforms OpenAI o1-mini
- Last Week in AI: LWiAI Podcast #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
Michal Langmajer@Fello AI
//
OpenAI has announced the release of GPT-4.5, its latest language model which they are calling their 'last non-chain-of-thought model.' According to OpenAI, GPT-4.5 offers substantial enhancements over its predecessors, particularly in advanced reasoning, problem-solving, and contextual understanding. Sam Altman, CEO of OpenAI, described it as the "first model that feels like talking to a thoughtful person," noting moments of astonishment at the quality of advice received from the AI.
However, the rollout is facing challenges due to GPU shortages. Altman stated they are "out of GPUs," leading to a staggered release, initially limited to ChatGPT Pro subscribers who pay $200 a month. While GPT-4.5 is available to developers across all paid API tiers, OpenAI plans to expand access to Plus and Team tiers next week, with tens of thousands of GPUs expected to arrive to alleviate the supply constraints. Despite not being a reasoning model, OpenAI estimates that GPT-4.5 is 30 times more expensive to run than GPT-4o.
Recommended read:
References :
- Fello AI: OpenAI’s GPT‑4.5 Finally Arrived: Can It Beat Grok 3 and Claude 3.7?
- Shelly Palmer: Shelly Palmer discusses the release of OpenAI's GPT-4.5.
- Analytics Vidhya: Everything You Need to Know About OpenAI’s GPT-4.5
- www.tomshardware.com: Tom's Hardware reports on Sam Altman's statement about GPU shortages delaying the GPT-4.5 release.
- venturebeat.com: VentureBeat reports OpenAI releases GPT-4.5 claiming 10X efficiency over GPT-4, but says it’s ‘not a frontier model’
- Gradient Flow: Scaling Up, Costs Up: GPT-4.5 and the Intensifying AI Competition
- Pivot to AI: OpenAI releases GPT-4.5 with ridiculous prices for a mediocre model
- Techstrong.ai: TechStrong.ai article on OpenAI's GPT-4.5 AI model.
- THE DECODER: OpenAI has released GPT-4.5 as a "Research Preview".
- eWEEK: OpenAI releases GPT-4.5, a “Warm” Generative AI Model, for Paid Plans and APIs
- www.windowscentral.com: Sam Altman on GPT-4.5: Expensive, yet the closest thing to a thoughtful conversational partner we've seen
- THE DECODER: OpenAI's largest model GPT-4.5 delivers on vibes instead of benchmarks
- 9to5Mac: OpenAI announces GPT-4.5, ChatGPT’s largest and best model for chat
- www.engadget.com: OpenAI's new GPT-4.5 model is a better, more natural conversationalist
- The Verge: Anthropic’s new ‘hybrid reasoning’ AI model is its smartest yet
- THE DECODER: OpenAI has presented its largest language model to date. According to Mark Chen, Chief Research Officer at OpenAI, GPT 4.5 shows that the scaling of AI models has not yet reached its limits.
- The Verge: OpenAI is launching GPT-4.5 today, its newest and largest AI language model.
- NextBigFuture.com: OpenAI GPT 4.5 Has BIG Coding Improvement – Claims Scaling Still Works – Expensive
- TechCrunch: OpenAI unveils GPT-4.5 ‘Orion,’ its largest AI model yet
- PCMag Middle East ai: Reporting that OpenAI has launched GPT-4.5 but is limiting it to priciest tiers due to GPU shortages.
- Analytics Vidhya: Two days ago, on 27 Feb 2025, OpenAI dropped GPT-4.5, expectations were sky-high. But instead of a groundbreaking leap forward, we got a model prioritizing emotional intelligence over raw reasoning power.
- AI News | VentureBeat: GPT-4.5 for enterprise: Do its accuracy and knowledge justify the cost?
- Windows Report: OpenAI released GPT-4.5, but it’s not much of an upgrade from GPT-4o. After DeepSeek was unleashed into the world, everyone wondered what OpenAI would do now that another AI company had developed an extremely powerful model at a tiny fraction of the budget.
- iHLS: OpenAI Unveils GPT-4.5
- Data Phoenix: OpenAI releases the long-awaited GPT-4.5/Orion, its last non-chain-of-thought model
- Towards AI: GPT-4.5: The Next Evolution in AI
- Towards AI: Towards AI article on TAI #142: GPT-4.5 Released.
- www.marketingaiinstitute.com: [The AI Show Episode 138]: Introducing GPT-4.5, Claude 3.7 Sonnet, Alexa+, Deep Research Now in ChatGPT Plus & How AI Is Disrupting Writing
- Analytics Vidhya: Now, this is a shocker, despite a lot of backlash on the cost of GPT 4.5, it becomes #1 in the Chatbot Arena LLM Leaderboard! Securing over 3,200+ votes, OpenAI’s latest model has emerged as number one across all evaluation categories, prominently excelling in Style Control and Multi-Turn interactions.
Jibin Joseph@PCMag Middle East ai
//
Elon Musk's xAI has officially launched Grok 3, its latest AI model, which Musk has dubbed the "smartest AI on Earth." Trained on 200,000 GPUs, this new model is positioned as a direct competitor to OpenAI's GPT-4 and DeepSeek, excelling in math, science, and coding benchmarks. The launch appears to coincide with a price hike for X Premium+ subscriptions, offering initial access to Grok 3 for subscribers in the U.S., with a separate subscription planned for web and app versions.
xAI's Grok 3 comes with advanced reasoning and agentic abilities and uses more than 10 times the computing power of Grok 2. It has two modes: "Think," which uses a smaller Grok 3 mini model for simple queries, and "Big Brain," which utilizes Grok 3 for complex problems. A new agentic feature called DeepSearch, conducts comprehensive analyses and generates reports, similar to tools recently released by OpenAI, Google, and Perplexity.
Recommended read:
References :
- PCMag Middle East ai: Elon Musk Reveals Grok 3 AI Chatbot: Here's What It Can Do. The model was trained on 200,000 GPUs and beats its rivals in math, science, and coding benchmarks, Musk says. It appears to coincide with a price hike on X Premium+ accounts.
- www.theguardian.com: Elon Musk’s startup rolls out new Grok-3 chatbot as AI competition intensifies. Billionaire CEO claims bot is ‘maximally truth-seeking’ as he looks to rival DeepSeek, OpenAI and Google Gemini
- shellypalmer.com: xAI Releases Grok 3: Technical Details and Competitive Context
- techvro.com: xAI’s Grok 3: A New Challenger to OpenAI and DeepSeek Emerges
- The Tech Portal: Elon Musk’s xAI unveils new Grok 3 model
- WebProNews: Grok 3.0 Unveiled: A Technical Leap Forward in the AI Arms Race
- Techstrong.ai: Elon Musk's artificial intelligence (AI) startup xAI which released the latest iteration of its Grok chatbot on Tuesday, is nearing a $5 billion server deal with Dell Inc.
- venturebeat.com: Elon Musk's latest AI model, Grok 3, is expected to challenge leading AI players in the market.
- www.artificialintelligence-news.com: xAI unveiled its Grok 3 AI model on Monday, alongside new capabilities such as image analysis and refined question answering.
- AI News: xAI unveiled its Grok 3 AI model on Monday, alongside new capabilities such as image analysis and refined question answering.
- eWEEK: Reports that Grok 3 Launches Today.
- www.analyticsvidhya.com: Grok 3 vs DeepSeek R1: Which is Better?
- www.analyticsvidhya.com: Grok 3 is Here! And What It Can Do Will Blow Your Mind!
- Casey Newton: Training Grok 3 took Elon 200,000 GPUs and untold billions, and it's ... decent at best? I wrote about AI's commodity problem
- www.verdict.co.uk: The debut of Grok-3 comes at a crucial time in the AI sector, following DeepSeek's recent open-source model release.
- Analytics Vidhya: Grok 3 vs DeepSeek R1: Which is Better?
- www.eweek.com: The AI chatbot from Elon Musk’s startup xAI is slated to go live on February 17 at 8:00 p.m. Pacific time.
- futurism.com: Researchers Find Elon Musk's New Grok AI Is Extremely Vulnerable to Hacking
- www.analyticsvidhya.com: Is 100K+ GPUs for Grok 3 worth it?
- Analytics Vidhya: Is 100K+ GPUs for Grok 3 worth it?
- MarkTechPost: Grok-3, the latest iteration of xAI's chatbot, showed strong performance in several tasks.
- www.marketingaiinstitute.com: [The AI Show Episode 136]: Elon Musk Tries to Buy OpenAI, JD Vance’s AI Speech, New GenAI Jobs Study, GPT-4o Update, OpenAI Product Roadmap & Grok 3
- bdtechtalks.com: This article provides an overview of Grok-3, highlighting its capabilities and benchmarks.
- www.eweek.com: This news piece covers Grok-3, its pricing, benchmarks, and availability.
- bdtechtalks.com: Overview of Grok-3's capabilities and benchmarks.
- shellypalmer.com: AI is evolving faster than ever, and tech expert Shelly Palmer, Professor of Advanced Media at Syracuse University, breaks down the latest developments in large language models (LLMs) on CNN International.
- the-decoder.com: grok-3-wins-praise-from-openai-founder
- composio.dev: Grok 3 vs. Deepseek r1
- thezvi.wordpress.com: Blog post about Grok 3 and its capabilities.
- eWEEK: News article about Grok 3, focusing on benchmarks, pricing and availability.
- Unite.AI: Elon Musk’s xAI has introduced Grok-3, a next-generation AI chatbot designed to change the way people interact on social media.
- www.unite.ai: Elon Musk’s xAI has introduced Grok-3, a next-generation AI chatbot designed to change the way people interact on social media.
- THE DECODER: Grok 3, the new model from Musk's xAI, wins praise from OpenAI founder
- www.marktechpost.com: xAI Releases Grok 3 Beta: A Super Advanced AI Model Blending Strong Reasoning with Extensive Pretraining Knowledge
- Analytics Vidhya: Grok 3 Prompts that Can Make Your Work Easy!
- Fello AI: Grok 3 vs ChatGPT vs DeepSeek vs Claude vs Gemini – Which AI Is Best in February 2025?
- TestingCatalog: xAI to add file attachment support in DeepSearch mode for Grok 3 on X
- bsky.app: xAI’s Grok-3 claims to outperform OpenAI’s GPT-4o, Google’s Gemini, and DeepSeek’s V3. With $12B raised in 18 months and a 100K Nvidia GPU cluster, the investment appears to pay off.
- AI News | VentureBeat: xAI’s new Grok 3 model criticized for blocking sources that call Musk, Trump top spreaders of misinformation
- www.marketingaiinstitute.com: The AI Show Episode 137]: GPT-4.5 and GPT-5 Release Dates, Grok 3, Forecasting New Jobs, DeepSeek Investigation, Microsoft Quantum Chip & Google AI “Co-Scientist�
- SiliconANGLE: SiliconAngle reports on the launch of Grok-3 and its advanced reasoning capabilities.
- futurism.com: Grok Caught Following Special Instructions for Queries About Elon Musk
- PCMag Middle East ai: Grok Was Briefly Instructed Not to Say Musk, Trump Spread Misinformation on X
- eWEEK: Grok AI Blocks Responses Claiming Trump and Musk “Spread Misinformation�
- OODAloop: A new generation of AIs: Claude 3.7 and Grok 3
- NextBigFuture.com: What Comes After XAI GROK 3?
Jibin Joseph@PCMag Middle East ai
//
DeepSeek AI's R1 model, a reasoning model praised for its detailed thought process, is now available on platforms like AWS and NVIDIA NIM. This increased accessibility allows users to build and scale generative AI applications with minimal infrastructure investment. Benchmarks have also revealed surprising performance metrics, with AMD’s Radeon RX 7900 XTX outperforming the RTX 4090 in certain DeepSeek benchmarks. The rise of DeepSeek has put the spotlight on reasoning models, which break questions down into individual steps, much like humans do.
Concerns surrounding DeepSeek have also emerged. The U.S. government is investigating whether DeepSeek smuggled restricted NVIDIA GPUs via Singapore to bypass export restrictions. A NewsGuard audit found that DeepSeek’s chatbot often advances Chinese government positions in response to prompts about Chinese, Russian, and Iranian false claims. Furthermore, security researchers discovered a "completely open" DeepSeek database that exposed user data and chat histories, raising privacy concerns. These issues have led to proposed legislation, such as the "No DeepSeek on Government Devices Act," reflecting growing worries about data security and potential misuse of the AI model.
Recommended read:
References :
- aws.amazon.com: DeepSeek R1 models now available on AWS
- www.pcguide.com: DeepSeek GPU benchmarks reveal AMD’s Radeon RX 7900 XTX outperforming the RTX 4090
- www.tomshardware.com: U.S. investigates whether DeepSeek smuggled Nvidia AI GPUs via Singapore
- www.wired.com: Article details challenges of testing and breaking DeepSeek's AI safety guardrails.
- decodebuzzing.medium.com: Benchmarking ChatGPT, Qwen, and DeepSeek on Real-World AI Tasks
- medium.com: The blog post emphasizes the use of DeepSeek-R1 in a Retrieval-Augmented Generation (RAG) chatbot. It underscores its comparability in performance to OpenAI's o1 model and its role in creating a chatbot capable of handling document uploads, information extraction, and generating context-aware responses.
- www.aiwire.net: This article highlights the cost-effectiveness of DeepSeek's R1 model in training, noting its training on a significantly smaller cluster of older GPUs compared to leading models from OpenAI and others, which are known to have used far more extensive resources.
- futurism.com: OpenAI CEO Sam Altman has since congratulated DeepSeek for its "impressive" R1 reasoning model, he promised spooked investors to "deliver much better models."
- AWS Machine Learning Blog: Protect your DeepSeek model deployments with Amazon Bedrock Guardrails
- mobinetai.com: DeepSeek is a catastrophically broken model with non-existent, typical shoddy Chinese safety measures that take 60 seconds to dismantle.
- AI Alignment Forum: Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
- Pivot to AI: Of course DeepSeek lied about its training costs, as we had strongly suspected.
- Unite.AI: Artificial Intelligence (AI) is no longer just a technological breakthrough but a battleground for global power, economic influence, and national security.
- cset.georgetown.edu: China’s ability to launch DeepSeek’s popular chatbot draws US government panel’s scrutiny
- neuralmagic.com: Enhancing DeepSeek Models with MLA and FP8 Optimizations in vLLM
- www.unite.ai: Blog post about DeepSeek and the global power shift.
- cset.georgetown.edu: This article discusses DeepSeek and its impact on the US-China AI race.
|
|