Kyle Wiggers@TechCrunch - 61d
OpenAI has launched its new 'o1' model, representing a significant advancement in AI capabilities. The model is available to Plus and Team users and is part of the '12 Days of OpenAI' series, which aims to improve the accessibility and interactivity of AI tools. The o1 model boasts enhanced reasoning and is faster and more powerful than its predecessors, with notable improvements in math and coding, and it now also includes image processing. Internal tests show a 34% reduction in major errors compared to the o1-preview model, making it more reliable across a range of tasks.
The 'o1' model is also accessible through a new premium subscription, ChatGPT Pro, priced at $200 per month. The plan grants access to the advanced features of the 'o1' model, as well as a voice mode and improved answers to complex queries. Some reviewers note that while the model is about 11% better at coding than the standard version, the cost may be prohibitive for many users: the Pro plan costs ten times the regular subscription. Even so, o1 pro mode is expected to be useful in fields like math, physics, and medicine.
@www.businessinsider.com - 40d
OpenAI has announced plans to transition its for-profit arm into a Public Benefit Corporation (PBC) in Delaware, a move aimed at ensuring its long-term sustainability while maintaining its mission. The structural change is designed to balance profit generation with the company's broader goals, particularly in healthcare, education, and science, which will be pursued by its non-profit arm. The PBC structure will allow OpenAI to raise the capital it needs while keeping a public-benefit interest in its decision making. OpenAI has also said it must become an enduring company as it moves into 2025.
This transition comes with a clarified definition of Artificial General Intelligence (AGI): a system capable of generating over $100 billion in profits. The definition, agreed upon with Microsoft, matters because it triggers a clause in their agreement granting Microsoft access to OpenAI's advanced models only until AGI is reached; there are reports the company may be trying to remove this clause as well. The move follows a year of heavy losses, with OpenAI reportedly not expected to turn a profit until 2029.
@Techmeme - 46d
OpenAI has released its new o3 model, which demonstrates significantly improved performance in reasoning, coding, and mathematical problem-solving compared to its previous models. The o3 model achieves 75.7% on the ARC Prize Semi-Private Evaluation in low-compute mode and an impressive 87.5% in high-compute mode. That performance comes at a steep price, however, with the top-end configuration costing around $10,000 per task.
Mels Dees@Techzine Global - 53d
OpenAI has launched its upgraded 'o1' reasoning model, making it available through its API to a select group of top-tier developers. The rollout is part of OpenAI's "12 Days of OpenAI" campaign, and access is initially limited to "Tier 5" developers, who have spent at least $1,000 per month and have held an account for over a month. The new 'o1' model is a significant upgrade from the 'o1-preview' version, adding function calling (which lets the model connect to external data sources), structured JSON outputs, and image analysis. It is also more customizable: a new 'reasoning_effort' parameter lets developers control how long the model thinks about a query.
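The new options can be pictured as fields in a request payload. The sketch below is illustrative only: the helper function name is hypothetical, and the field names mirror OpenAI's chat completions API as described here, so verify them against the current API reference before relying on them.

```python
# Illustrative sketch of an o1 API request using the new parameters.
# build_o1_request is a hypothetical helper, not part of the OpenAI SDK.
def build_o1_request(prompt: str, effort: str = "medium") -> dict:
    if effort not in {"low", "medium", "high"}:
        raise ValueError("reasoning_effort must be low, medium, or high")
    return {
        "model": "o1",
        "reasoning_effort": effort,                   # how long the model thinks
        "response_format": {"type": "json_object"},   # structured JSON output
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_o1_request("List three prime numbers as JSON.", effort="high")
# The payload would then be sent via the OpenAI SDK or a plain HTTPS POST.
```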
The 'o1' model is also more expensive due to the computing power required: $15 for every 750,000 words analyzed and $60 for every 750,000 words generated, almost four times the price of the GPT-4o model. OpenAI has also integrated the GPT-4o and GPT-4o mini models into the Realtime API and added WebRTC support for real-time voice applications. In addition, it has introduced direct preference optimization for fine-tuning, letting developers refine models more efficiently by providing preferred answers instead of input/output pairs. OpenAI says the newer model offers better accuracy, fewer prompt refusals, and, in particular, stronger performance on math and programming tasks.
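At the quoted rates, estimating the cost of a workload is simple arithmetic. A minimal sketch, using the word-based units from the article rather than OpenAI's token-based billing; the function name is hypothetical:

```python
def o1_cost_usd(words_analyzed: int, words_generated: int) -> float:
    """Rough o1 API cost: $15 per 750k words in, $60 per 750k words out."""
    return words_analyzed / 750_000 * 15 + words_generated / 750_000 * 60

# A job that analyzes 1.5M words and generates 150k words:
cost = o1_cost_usd(1_500_000, 150_000)  # 2 * $15 + 0.2 * $60 = $42.00
```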
@Techmeme - 46d
OpenAI's new o3 model has achieved a significant breakthrough on the ARC-AGI benchmark, demonstrating advanced reasoning capabilities through a 'private chain of thought' mechanism. This approach involves the model searching over natural language programs to solve tasks, with a substantial increase in compute leading to a vastly improved score of 75.7% on the Semi-Private Evaluation set within a $10k compute limit, and 87.5% in a high-compute configuration. The o3 model uses deep learning to guide program search, moving beyond basic next-token prediction. Its ability to recombine knowledge at test time through program execution marks a major step toward more general AI capabilities.
The o3 model's architecture and performance represent a form of deep learning-guided program search, in which the model explores many paths through program space. The process, which can involve tens of millions of tokens and cost thousands of dollars for a single task, is guided by a base LLM. While o3 appears to be more than just next-token prediction, the core mechanisms of the process remain a matter of speculation. The breakthrough highlights how increases in compute can drastically improve performance, and it marks a substantial leap beyond previous GPT model performance. Testing also revealed that running o3 in "high efficiency" mode against the 400 public ARC-AGI puzzles cost around $6,677 for a score of 82.8%.
Theron Mohamed (tmohamed@insider.com)@All Content from Business Insider - 43d
Elon Musk's AI startup, xAI, has raised $6 billion in its Series C funding round. The investment will support xAI's mission to develop Artificial General Intelligence (AGI) with a focus on truth-seeking and the elimination of ideological biases. xAI has not said how it will deploy the new capital, but the size of the round signals investor confidence in the company and its direction.
@the-decoder.com - 15d
OpenAI's o3 model is facing scrutiny after achieving record-breaking results on the FrontierMath benchmark, an AI math test developed by Epoch AI. It has emerged that OpenAI quietly funded the development of FrontierMath, and had prior access to the benchmark's datasets. The company's involvement was not disclosed until the announcement of o3's unprecedented performance, where it achieved a 25.2% accuracy rate, a significant jump from the 2% scores of previous models. This lack of transparency has drawn comparisons to the Theranos scandal, raising concerns about potential data manipulation and biased results. Epoch AI's associate director has admitted the lack of transparency was a mistake.
The controversy has sparked debate within the AI community, with questions raised about the legitimacy of o3's performance. While OpenAI claims the data wasn't used for model training, concerns linger: six mathematicians who contributed to the benchmark said they were unaware of OpenAI's involvement or its exclusive access, and indicated that, had they known, they might not have contributed. Epoch AI says an "unseen-by-OpenAI hold-out set" was used to verify the model's capabilities, and it is now developing new hold-out questions to retest o3's performance, ensuring OpenAI has no prior access.
@www.marktechpost.com - 38d
A new AI paper is exploring how formal mathematical systems can revolutionize math-based Large Language Models (LLMs). This approach seeks to address fundamental logic and computational issues by combining structured logic with abstract mathematical reasoning. Current LLMs often struggle with advanced problems such as theorem proving and abstract logical deductions due to a reliance on informal datasets and lack of rigorous verification. By using formal systems like Lean, Coq, and Isabelle, the new approach aims to provide a robust framework for tackling complex problems, reducing errors and improving AI's capabilities in science, engineering, and quantitative fields.
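To give a flavor of the rigorous verification these systems provide, here is a small Lean 4 proof, an illustrative example of the formal style, not code from the paper:

```lean
-- Commutativity of addition on the natural numbers, proved by induction.
-- Every step is checked mechanically; an unsound proof fails to compile.
theorem add_comm' (a b : Nat) : a + b = b + a := by
  induction a with
  | zero => simp
  | succ n ih => simp [Nat.succ_add, Nat.add_succ, ih]
```

Because the checker rejects any invalid step, an LLM trained to emit such proofs receives an exact correctness signal rather than the informal feedback available from natural-language datasets.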
In other news, Bitwise is seeking regulatory approval for a Bitcoin Standard Corporations ETF. This ETF would track companies that hold more than 1,000 BTC. The proposal comes as Ki Young Ju, CEO of CryptoQuant, has expressed skepticism about the likelihood of the U.S. adopting a Bitcoin standard. Ju argues that historically, the U.S. has only turned to alternative asset standards during times of economic uncertainty, drawing parallels to the push for a gold standard in the past. He suggests any U.S. adoption of Bitcoin will be driven by economic threats rather than a strategic shift.
@analyticsindiamag.com - 28d
Nvidia is aggressively expanding its presence in the robotics sector, highlighted by the development of its Jetson Thor computing platform. The platform is designed to replicate Nvidia's success in AI chips and is a core component of its strategy to develop advanced humanoid robots. Nvidia is not working alone in this endeavor: it has partnered with Foxconn to create humanoid robots, aiming to move beyond manufacturing and into new technology areas. The strategic push underscores Nvidia's focus on becoming a dominant player in AI-driven robotics, specifically humanoid technology.
Nvidia is also addressing the challenge of training these robots through its Isaac GR00T Blueprint, unveiled at CES. The blueprint uses synthetic data generation to create the extensive datasets needed for imitation learning, allowing robots to mimic human actions. A new workflow captures human actions with Apple Vision Pro in a digital twin, and the data feeds the Isaac Lab framework, which teaches robots to move and interact safely. Nvidia's Cosmos platform contributes by generating physics-aware videos that are likewise used to train robots. CEO Jensen Huang frames humanoid robots as the next big leap in AI innovation, aiming to establish Nvidia as a key player in the future of robotics and autonomous systems.
@pub.towardsai.net - 36d
Recent developments in AI agent frameworks are paving the way for more efficient and scalable applications. The Jido framework, built in Elixir, is designed to run thousands of agents using minimal resources. Each agent requires only 25KB of memory at rest, enabling large-scale deployment without heavy infrastructure. This capability could significantly reduce the cost and complexity of running multiple parallel agents, a common challenge in current agent frameworks. Jido also allows agents to dynamically manage their own workflows and sub-agents utilizing Elixir's concurrency features and OTP architecture.
The core of Jido centers on four key concepts: Actions, Workflows, Agents, and Sensors. Actions are small, reusable tasks; workflows chain actions together to achieve broader goals; agents are stateful entities that plan and execute those workflows. The aim is a system in which agents can, to a degree, manage themselves without constant human intervention. Jido offers a practical approach to building autonomous, distributed systems through functional programming principles and dynamic error handling.
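The Action/Workflow/Agent pattern described above can be sketched conceptually. Jido itself is an Elixir framework and its real API differs; the Python below is a language-agnostic illustration of the pattern, and all names in it are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable

# An Action is a small, reusable task: state in, state out.
Action = Callable[[dict], dict]

@dataclass
class Workflow:
    """Chains actions together toward a broader goal."""
    actions: list  # list[Action]

    def run(self, state: dict) -> dict:
        for action in self.actions:
            state = action(state)
        return state

@dataclass
class Agent:
    """A stateful entity that plans and executes workflows."""
    name: str
    state: dict = field(default_factory=dict)

    def execute(self, workflow: Workflow) -> dict:
        self.state = workflow.run(self.state)
        return self.state

# Usage: two tiny actions chained into a workflow.
def fetch(state):  return {**state, "data": [1, 2, 3]}
def total(state):  return {**state, "sum": sum(state["data"])}

agent = Agent(name="summarizer")
result = agent.execute(Workflow(actions=[fetch, total]))  # result["sum"] == 6
```

In Jido the analogous pieces run as lightweight Elixir processes under OTP supervision, which is what makes thousands of concurrent agents cheap; a single-threaded sketch like this captures only the composition model, not that concurrency.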