Top Mathematics discussions

NishMath

Kyle Wiggers@TechCrunch - 60d
OpenAI has launched its new 'o1' model, representing a significant advancement in AI capabilities. The model is available to Plus and Team users and is part of the '12 Days of OpenAI' series, which aims to improve the accessibility and interactivity of AI tools. The o1 model boasts enhanced reasoning capabilities and is faster and more powerful than its predecessors, with notable improvements in math, coding, and now also includes image processing. Internal tests show a 34% reduction in major errors compared to the o1-preview model, making it more reliable for various tasks.

The 'o1' model is now accessible through a new 'Pro' plan, named ChatGPT Pro, which is priced at $200 per month. This premium subscription grants users access to the advanced features of the 'o1' model, as well as a voice module and improved answers to complex queries. Some reviewers have noted that while the model is about 11% better in coding compared to the standard version, the cost may be prohibitive for some users when compared to other alternatives, with the pro version costing 10 times the regular subscription. Despite this, the o1 pro mode is expected to be useful in fields like math, physics, and medicine.

Recommended read:
References :
  • TechCrunch: OpenAI confirms new $200 monthly subscription, ChatGPT Pro
  • Databricks acquires BladeBridge to aid data warehouse migrations | InfoWorld: OpenAI releases o1 LLM, unveils ChatGPT Pro
  • Analytics Vidhya: Comparison of OpenAI's o1 model with GPT-4o, discussing the strengths of each.
  • www.computerworld.com: Analysis of OpenAI's o1 family of large language models.
  • hackernoon.com: Blog post analyzing the value proposition of ChatGPT Pro.
  • Hands-on with the Windows answer to Apple?s Vision Pro ? Computerworld: The $200 monthly pricing OpenAI has set for a subscription to its recently launched ChatGPT Pro is definitely “surprising,”  said Gartner analyst Arun Chandrasekaran on Friday, but at the same time it’s indicative that the company is betting that organizations will ultimately pay more for enhanced AI capabilities.
  • Analytics Vidhya: OpenAI o1 is Out: The Most Advanced Model is Available to USE!
  • SiliconANGLE: OpenAI debuts ChatGPT Pro plan with reasoning-optimized o1 pro mode LLM
  • Analytics Vidhya: OpenAI recently released o1 and o1 pro in their 12 Days of OpenAI – Live updates, offering unlimited access through a $200 ChatGPT Pro subscription.
  • Analytics Vidhya: ChatGPT Pro: Is This $200 Plan the Ultimate AI Power Move?
  • NextBigFuture.com: OpenAI o1 Pro Mode Costs $200 Per Month and GPT5 Slipping to May 2025
  • HackerNoon - ai: OpenAI Launches ChatGPT Pro: Is the $200 Monthly Subscription Worth It?
  • Don't Worry About the Vase: So, how about OpenAI’s o1 and o1 Pro? Sam Altman: o1 is powerful but it’s not so powerful that the universe needs to send us a tsunami.
  • VentureBeat: OpenAI launches full o1 model with image uploads and analysis, debuts ChatGPT Pro
  • VentureBeat: OpenAI appears poised to launch ChatGPT Pro subscription plans at $200 USD per month
  • Composio: OpenAI o1 vs Claude 3.5 Sonnet: Which One’s Really Worth Your $20?

@www.businessinsider.com - 39d
References: Techmeme , Techmeme , www.techmeme.com ...
OpenAI has announced plans to transition its for-profit arm into a Public Benefit Corporation (PBC) in Delaware, a move aimed at ensuring its long-term sustainability while maintaining its mission. This structural change is designed to balance profit generation with the company's broader goals, particularly in healthcare, education, and science, which will be pursued by its non-profit arm. The PBC structure will allow OpenAI to raise necessary capital while also maintaining a public benefit interest in its decision making. The company has also indicated the need to become an enduring company as it moves into 2025.

This transition comes with a clarified definition of Artificial General Intelligence (AGI), defining it as a system capable of generating over $100 billion in profits. This definition, agreed upon with Microsoft, is important as it triggers a clause in their agreement, granting them access to advanced models only before AGI is reached. There are reports the company may be trying to remove this clause as well. The move comes after a year in which OpenAI has experienced large losses, with the company reportedly not expected to turn a profit until 2029.

Recommended read:
References :
  • Techmeme: OpenAI says the PBC will oversee commercial operations, and the nonprofit will hire staff to pursue charitable activities in health care, education, and science (Hayden Field/CNBC)
  • Techmeme: OpenAI says its board of directors is evaluating its corporate structure, including a plan to turn its for-profit arm into a Delaware Public Benefit Corporation (OpenAI)
  • Analytics India Magazine: OpenAI Must Earn $100 Bn to Prove AGI’s Worth to Microsoft
  • www.techmeme.com: OpenAI said Friday that in moving toward a new for-profit structure in 2025, the company will create a public benefit corporation …
  • analyticsindiamag.com: Only when OpenAI’s systems generate maximum profits, the company would have achieved AGI. The post appeared first on .
  • www.businessinsider.com: OpenAI has detailed its plans for a new corporate structure that would separate its business from being controlled by its nonprofit board.
  • Latest from Tomshardware in Artificial-intelligence: OpenAI switching to a for-profit company to raise more cash as it continues to lose money
  • Neowin: OpenAI will be transforming its for-profit into a Delaware Public Benefit Corporation
  • www.engadget.com: In a blog post penned by its board of directors, OpenAI said Thursday it plans to transform its for-profit arm into a Public
  • Quartz: OpenAI may become a for-profit because it needs 'more capital than imagined'
  • Fortune | FORTUNE: OpenAI confirms plans to become a for-profit company as it looks to raise even more investor money
  • Artificial intelligence (AI) | The Guardian: OpenAI lays out plan to shift to for-profit corporate structure
  • Gizmodo: To Further Its Mission of Benefitting Everyone, OpenAI Will Become Fully for-Profit
  • the-decoder.com: OpenAI wants to turn its profit-making arm into what's called a public benefit corporation - a business model that tries to balance making money with doing good for society.
  • LessWrong: Our plan is to transform our existing for-profit into a Delaware (PBC) with ordinary shares of stock and the OpenAI mission as its public benefit interest.
  • PCMag Middle East ai: OpenAI says the plan “would result in one of the best-resourced non-profits in history,” and allow it to raise funds with "conventional terms," like its competitors.
  • Benzinga - Stock Market Quotes, Business News, Financial News, Trading Ideas, and Stock Research by Professionals: OpenAI's For-Profit Transition, Trump's AI Advisor, And Google's Code Red: This Week In AI
  • www.tekedia.com: OpenAI Announces Reason For Transition to For-Profit Organization, Lays Out Plans

@Techmeme - 46d
OpenAI has released its new O3 model which demonstrates significantly improved performance in reasoning, coding, and mathematical problem-solving compared to its previous models. The O3 model achieves 75.7% on the ARC Prize Semi-Private Evaluation in low-compute mode and an impressive 87.5% in high-compute mode. However, this performance comes at a very high cost, with the top-end system costing around $10,000 per task which makes it very expensive to run.

Recommended read:
References :
  • Simon Willison's Weblog: OpenAI o3 breakthrough high score on ARC-AGI-PUB
  • Ars Technica - All content: ArsTechnica article on OpenAI o3 and o3-mini
  • TechCrunch: Techcrunch article about OpenAI new o3 model
  • THE DECODER: The Decoder article about OpenAI o3 models
  • www.heise.de: Heise Online news about OpenAI o3 models
  • NextBigFuture.com: OpenAI o3 sets new records in several key areas, particularly in reasoning, coding and mathematical problem-solving. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task in compute ) and 87.5% in high-compute mode (thousands of $ per task). It’s very expensive. It is not just brute force. These capabilities are ...
  • arcprize.org: OpenAI o3 87.5% High Score on ARC Prize Challenge
  • @julianharris.bsky.social - Julian Harris: OpenAI announced o3 that is significantly better than previous systems, according to an independent benchmark org (The Arc Prize) that apparently got access. Only thing is it’s wildly wildly expensive to run. Like its top end system is around $10k per TASK.
  • shellypalmer.com: OpenAI’s o3: Progress Toward AGI or Just More Hype?
  • pub.towardsai.net: OpenAI’s O3: A New Frontier in AI Reasoning Models
  • MarkTechPost: OpenAI Announces OpenAI o3: A Measured Advancement in AI Reasoning with 87.5% Score on Arc AGI Benchmarks
  • www.marktechpost.com: OpenAI Announces OpenAI o3: A Measured Advancement in AI Reasoning with 87.5% Score on Arc AGI Benchmarks
  • Research & Development World: Just how big of a deal is OpenAI’s o3 model anyway?
  • Shelly Palmer: OpenAI’s o3: Progress Toward AGI or Just More Hype?
  • NextBigFuture.com: OpenAI O3 Crushes Benchmark Tests But is it Intelligence ?
  • AI News | VentureBeat: OpenAI’s o3 shows remarkable progress on ARC-AGI, sparking debate on AI reasoning
  • Analytics India Magazine: OpenAI soft-launches AGI with o3 models, Enters Next Phase of AI. The company cracked the ARC-AGI benchmark in just five years.

Mels Dees@Techzine Global - 52d
OpenAI has launched its upgraded 'o1' reasoning model, making it available through its API to a select group of top-tier developers. This rollout is part of OpenAI's "12 Days of OpenAI" campaign, and access is initially limited to developers in the "Tier 5" category who have spent at least $1,000 per month and have an account for over a month. This new 'o1' model is a significant upgrade from the 'o1-preview' version and includes enhanced capabilities such as function calling, which allows the model to connect to external data sources, structured JSON outputs, and image analysis. The model is more customizable, with a new 'reasoning_effort' parameter that allows developers to control how long the model thinks about a query.

The 'o1' model is also more expensive due to the computing power required. It costs $15 for every 750,000 words analyzed and $60 for every 750,000 words generated, which is almost four times as expensive as the GPT-4o model. OpenAI has also integrated GPT-4o and GPT-4o mini models into the Realtime API and added WebRTC support for real-time voice applications. Additionally, they have introduced direct preference optimization for fine-tuning AI models, enabling developers to refine their models more efficiently by providing preferred answers instead of input/output pairs. The company noted this newer model boasts improvements, offering better accuracy and fewer prompt rejections and specifically, better performance with math and programming tasks.

Recommended read:
References :
  • hackernoon.com: OpenAI Launches ChatGPT Pro: Is the $200 Monthly Subscription Worth It?
  • Analytics Vidhya: OpenAI o1 is Out: The Most Advanced Model is Available to USE!
  • NextBigFuture.com: OpenAI o1 Pro Mode Costs $200 Per Month and GPT5 Slipping to May 2025
  • SiliconANGLE: OpenAI today introduced ChatGPT Pro, a new paid tier of its chatbot that provides access to large language models optimized for reasoning tasks. The subscription is priced at $200 per month, 10 times more than the consumer-focused ChatGPT Plus plan.
  • Analytics Vidhya: OpenAI recently released o1 and o1 pro in their 12 Days of OpenAI – Live updates, offering unlimited access through a $200 ChatGPT Pro subscription. With much speculation surrounding their capabilities, I wondered – Is this premium subscription worth the investment?
  • www.techmeme.com: OpenAI adds o1 to its API, but only for certain developers to start, announces new versions of GPT-4o and GPT-4o mini as part of its Realtime API, and more (Kyle Wiggers/TechCrunch)
  • Dave's linkblog feed: OpenAI’s API users get full access to the new o1 model Newest API upgrade also includes fine-tuning and real-time interaction improvements.
  • TechCrunch: OpenAI brings its o1 reasoning model to its API — for certain developers
  • THE DECODER: OpenAI's upgraded o1 model brings function calling, image analysis, and more to the API
  • Techzine Global: OpenAI launches o1 reasoning model exclusively for top developers
  • SiliconANGLE: OpenAI’s makes the full version of its o1 reasoning model available, but only to some developers
  • Don't Worry About the Vase: Blog post discussing the capabilities of the o1 model.
  • AI News | VentureBeat: OpenAI opens up its most powerful model, o1, to third-party developers

@Techmeme - 46d
OpenAI's new o3 model has achieved a significant breakthrough on the ARC-AGI benchmark, demonstrating advanced reasoning capabilities through a 'private chain of thought' mechanism. This approach involves the model searching over natural language programs to solve tasks, with a substantial increase in compute leading to a vastly improved score of 75.7% on the Semi-Private Evaluation set within a $10k compute limit, and 87.5% in a high-compute configuration. The o3 model uses deep learning to guide program search, moving beyond basic next-token prediction. Its ability to recombine knowledge at test time through program execution marks a major step toward more general AI capabilities.

The o3 model's architecture and performance represents a form of deep learning-guided program search, where it explores many paths through program space. This process, which can involve tens of millions of tokens and cost thousands of dollars for a single task, is guided by a base LLM. While o3 appears to be more than just next-token prediction, it’s still being speculated what the core mechanisms of this process are. This breakthrough highlights how increases in compute can drastically improve performance and marks a substantial leap in AI capabilities, moving far beyond previous GPT model performance. The model's development and testing also revealed that it cost around $6,677 to run o3 in "high efficiency" mode against the 400 public ARC-AGI puzzles for a score of 82.8%.

Recommended read:
References :
  • arcprize.org: OpenAI's new o3 system - trained on the ARC-AGI-1 Public Training set - has scored a breakthrough 75.7% on the Semi-Private Evaluation set at our stated public leaderboard $10k compute limit.
  • Simon Willison's Weblog: OpenAI o3 breakthrough high score on ARC-AGI-PUB
  • Techmeme: Techmeme report about O3 model.
  • TechCrunch: TechCrunch reporting on OpenAI's unveiling of o3 and o3-mini with advanced reasoning capabilities.
  • Ars Technica - All content: OpenAI announces o3 and o3-mini, its next simulated reasoning models
  • THE DECODER: OpenAI unveils o3, its most advanced reasoning model yet. A cost-effective mini version is set to launch in late January 2025, followed by the full version.
  • www.heise.de: OpenAI's new o3 model aims to outperform humans in reasoning benchmarks
  • NextBigFuture.com: OpenAI Releases O3 Model With High Performance and High Cost
  • www.techmeme.com: Techmeme post about OpenAI o3 model
  • @julianharris.bsky.social - Julian Harris: OpenAI announced o3 that is significantly better than previous systems, according to an independent benchmark org (The Arc Prize) that apparently got access. Only thing is it’s wildly wildly expensive to run. Like its top end system is around $10k per TASK.
  • shellypalmer.com: OpenAI’s o3: Progress Toward AGI or Just More Hype?
  • pub.towardsai.net: OpenAI’s O3: Pushing the Boundaries of Reasoning with Breakthrough Performance and Cost Efficiency Image Source: The world of AI continues to evolve at an astonishing pace, and OpenAI’s latest announcement has left the community buzzing with excitement.
  • www.marktechpost.com: OpenAI Announces OpenAI o3: A Measured Advancement in AI Reasoning with 87.5% Score on Arc AGI Benchmarks
  • www.rdworldonline.com: Just how big of a deal is OpenAI’s o3 model anyway?
  • NextBigFuture.com: OpenAI O3 Crushes Benchmark Tests But is it Intelligence ?
  • Analytics India Magazine: OpenAI soft-launches AGI with o3 models, Enters Next Phase of AI
  • OODAloop: OpenAI’s o3 shows remarkable progress on ARC-AGI, sparking debate on AI reasoning
  • pub.towardsai.net: TAI 131: OpenAI’s o3 Passes Human Experts; LLMs Accelerating With Inference Compute Scaling

@the-decoder.com - 14d
OpenAI's o3 model is facing scrutiny after achieving record-breaking results on the FrontierMath benchmark, an AI math test developed by Epoch AI. It has emerged that OpenAI quietly funded the development of FrontierMath, and had prior access to the benchmark's datasets. The company's involvement was not disclosed until the announcement of o3's unprecedented performance, where it achieved a 25.2% accuracy rate, a significant jump from the 2% scores of previous models. This lack of transparency has drawn comparisons to the Theranos scandal, raising concerns about potential data manipulation and biased results. Epoch AI's associate director has admitted the lack of transparency was a mistake.

The controversy has sparked debate within the AI community, with questions being raised about the legitimacy of o3's performance. While OpenAI claims the data wasn't used for model training, concerns linger as six mathematicians who contributed to the benchmark said that they were not aware of OpenAI's involvement or the company having exclusive access. They also indicated that had they known, they might not have contributed to the project. Epoch AI has said that an "unseen-by-OpenAI hold-out set" was used to verify the model's capabilities. Now, Epoch AI is working on developing new hold-out questions to retest the o3 model's performance, ensuring OpenAI does not have prior access.

Recommended read:
References :
  • Analytics India Magazine: The company has had prior access to datasets of a benchmark the o3 model scored record results on. 
  • the-decoder.com: OpenAI's involvement in funding FrontierMath, a leading AI math benchmark, only came to light when the company announced its record-breaking performance on the test.
  • THE DECODER: OpenAI's involvement in funding FrontierMath, a leading AI math benchmark, only came to light when the company announced its record-breaking performance on the test. Now, the benchmark's developer Epoch AI acknowledges they should have been more transparent about the relationship.
  • LessWrong: Some lessons from the OpenAI-FrontierMath debacle
  • Pivot to AI: OpenAI o3 beats FrontierMath — because OpenAI funded the test and had access to the questions