Kyle Wiggers@TechCrunch - 61d
OpenAI has launched its new 'o1' model, a significant advancement in AI capabilities. The model is available to Plus and Team users and is part of the '12 Days of OpenAI' series, which aims to improve the accessibility and interactivity of OpenAI's tools. The o1 model offers enhanced reasoning and is faster and more capable than its predecessors, with notable improvements in math and coding, and it now also supports image processing. Internal tests show a 34% reduction in major errors compared to the o1-preview model, making it more reliable across a range of tasks.
The 'o1' model is also accessible through a new premium plan, ChatGPT Pro, priced at $200 per month. The subscription grants access to the model's advanced features, including a voice mode and improved answers to complex queries. Some reviewers note that while the pro version is roughly 11% better at coding than the standard version, the cost may be prohibitive for some users compared with alternatives, since it runs ten times the price of the regular subscription. Even so, the o1 pro mode is expected to be useful in fields such as math, physics, and medicine.
Mels Dees@Techzine Global - 53d
OpenAI has launched its upgraded 'o1' reasoning model, making it available through its API to a select group of top-tier developers. The rollout is part of OpenAI's "12 Days of OpenAI" campaign, and access is initially limited to developers in the "Tier 5" category, who have spent at least $1,000 with OpenAI and have held an account for more than a month. The new 'o1' model is a significant upgrade over the 'o1-preview' version and adds capabilities such as function calling, which lets the model connect to external data sources, structured JSON outputs, and image analysis. The model is also more customizable: a new 'reasoning_effort' parameter lets developers control how long the model thinks about a query.
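As a rough illustration of how these API features fit together, here is a minimal sketch using the OpenAI Python SDK. It assumes the 'o1' model accepts a reasoning_effort value and a JSON-schema response format as described above; the schema name and contents are placeholders chosen purely for illustration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask o1 to deliberate longer on the query and return structured JSON.
response = client.chat.completions.create(
    model="o1",
    reasoning_effort="high",  # "low" | "medium" | "high": trades latency and cost for deliberation
    messages=[
        {"role": "user", "content": "Summarize this bug report and rate its severity from 1 to 10."}
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "bug_triage",  # hypothetical schema for illustration
            "schema": {
                "type": "object",
                "properties": {
                    "summary": {"type": "string"},
                    "severity": {"type": "integer"},
                },
                "required": ["summary", "severity"],
                "additionalProperties": False,
            },
        },
    },
)

print(response.choices[0].message.content)  # JSON conforming to the schema above
```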
The 'o1' model is also more expensive because of the computing power it requires: $15 for every 750,000 words analyzed and $60 for every 750,000 words generated, nearly four times the price of GPT-4o. OpenAI has also integrated the GPT-4o and GPT-4o mini models into the Realtime API and added WebRTC support for real-time voice applications. In addition, it has introduced direct preference optimization for fine-tuning AI models, letting developers refine their models more efficiently by providing preferred answers instead of input/output pairs. The company says the newer model offers better accuracy, fewer prompt rejections, and, in particular, stronger performance on math and programming tasks.
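To make the direct preference optimization point concrete, the sketch below shows what a preference-tuning record might look like: instead of a single input/output pair, each record carries a prompt plus a preferred and a non-preferred answer. The field names are illustrative and are not guaranteed to match OpenAI's fine-tuning schema exactly.

```python
import json

# One preference record: the same prompt with a preferred and a
# non-preferred completion, rather than a single "correct" output.
example = {
    "input": {
        "messages": [
            {"role": "user", "content": "Explain WebRTC in one sentence."}
        ]
    },
    "preferred_output": [
        {"role": "assistant",
         "content": "WebRTC is a browser standard for low-latency, peer-to-peer audio, video, and data streams."}
    ],
    "non_preferred_output": [
        {"role": "assistant",
         "content": "WebRTC is a video codec."}
    ],
}

# Preference datasets are typically uploaded as JSONL, one record per line.
with open("preferences.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")
```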
@the-decoder.com - 40d
DeepSeek has unveiled its v3 large language model (LLM), a significant advancement in AI. The model was trained on 14.8 trillion tokens using 2,788,000 H800 GPU hours, at an estimated cost of approximately $5.576 million, remarkably low for a model of this capability. Training combined supervised fine-tuning and reinforcement learning, and the resulting model reaches benchmark performance comparable to Claude 3.5 Sonnet. DeepSeek v3 is a Mixture-of-Experts (MoE) model with 671 billion total parameters, of which 37 billion are activated for each token.
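As a rough illustration of why only 37 billion of the 671 billion parameters are active per token, here is a generic toy sketch of sparse Mixture-of-Experts routing with top-k gating. It is not DeepSeek's actual architecture or code; dimensions and expert functions are made up for the example.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route one token through only top_k of many experts (sparse MoE).

    x:        (d,) token representation
    gate_w:   (d, n_experts) router weights
    experts:  list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                    # one router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the selected experts run, so most parameters stay idle for this token.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: 8 experts, but each token activates just 2 of them.
rng = np.random.default_rng(0)
d, n_experts = 16, 8
experts = [(lambda W: (lambda x: np.tanh(W @ x)))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
print(moe_forward(rng.normal(size=d), gate_w, experts).shape)  # (16,)
```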
The release of DeepSeek v3 also includes API access, with highly competitive pricing. Input is priced at $0.27 per million tokens ($0.07 on cache hits) and output at $1.10 per million tokens; by comparison, Claude 3.5 Sonnet charges $3 per million input tokens and $15 per million output tokens. These prices, combined with its strong performance, position DeepSeek v3 to disrupt the market on both model quality and affordability. The model was also released as fully open source, with all associated papers and training frameworks provided to the research community.
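Using the per-million-token figures quoted above, a small back-of-the-envelope script makes the price gap concrete. The monthly token counts are hypothetical and chosen purely for illustration.

```python
# Per-million-token prices quoted above (USD, without cache hits).
PRICES = {
    "deepseek-v3":       {"input": 0.27, "output": 1.10},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

def cost(model, input_tokens, output_tokens):
    """Total API cost for a given number of input and output tokens."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Hypothetical workload: 50M input tokens and 10M output tokens per month.
for model in PRICES:
    print(f"{model}: ${cost(model, 50_000_000, 10_000_000):.2f}")
# deepseek-v3: $24.50
# claude-3.5-sonnet: $300.00
```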