model

Citi: Open-source models' token share on OpenRouter surges to 65%, with the price gap between Chinese and U.S. models exceeding 20 times

According to Reuters, new data from Citi shows that adoption of low-cost open-source models is rising rapidly as companies look to cut AI spending. The share of tokens processed by open-source models on the AI aggregation platform OpenRouter rose from 34% in January to 65% in June of this year. Citi's report identifies cost as the core driver of the trend. Some Chinese open-source large models are steadily narrowing the performance gap with leading U.S. models while charging as little as 18 cents per million tokens; leading models in the industry charge about $4 per million tokens on average, a cost difference of more than 20 times.
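The price gap Citi describes can be checked with simple arithmetic. The sketch below uses only the two prices quoted above; the function and constant names are illustrative and do not come from the report.

```python
def price_ratio(expensive_per_million: float, cheap_per_million: float) -> float:
    """Return how many times higher one per-million-token price is than another."""
    if cheap_per_million <= 0:
        raise ValueError("cheap_per_million must be positive")
    return expensive_per_million / cheap_per_million


# Prices quoted in the item above, in US dollars per million tokens.
LEADING_MODEL_PRICE = 4.00   # approximate average for leading models
LOW_COST_MODEL_PRICE = 0.18  # lowest price cited for some Chinese open-source models

ratio = price_ratio(LEADING_MODEL_PRICE, LOW_COST_MODEL_PRICE)
print(f"Leading models cost about {ratio:.1f}x as much per million tokens")
```

The result is a ratio of roughly 22, consistent with the "more than 20 times" figure in the report.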

4 hours ago

high cost-performance model

NVIDIA: Claude series models are now available on Microsoft's Azure NVIDIA platform

According to Jinshi reports, NVIDIA announced that Anthropic's Claude series models are now available on Microsoft's Azure NVIDIA GB300 Blackwell Ultra platform.

20 hours ago

China Mobile establishes a "Token Office" led by group leadership to coordinate its large AI model and computing power businesses

According to C114 Communication Network, China Mobile recently established a group-level "Token Office" led directly by the group's core leadership, with the general manager of the Strategic Development Department serving as executive deputy director, a higher rank than that of the earlier Computing Power Office. The department reportedly aims to streamline the entire process of "creating tokens, delivering tokens, and applying tokens," ending the previous arrangement in which secondary departments such as the Computing Power Office, Mobile Cloud, the Digital Intelligence Business Unit, the Marketing Department, and the Government and Enterprise Business Unit operated independently. Its core mission is to put in place two foundational capabilities for China Mobile in the AI era: first, the MobileClaw intelligent framework with a rich set of "skill packages," and second, the Mobile MoMA model platform, which brings together more than 300 mainstream models. Together these are meant to accelerate operators' commercial rollout and large-scale operation of AI network infrastructure and model applications.

a day ago

Computing Power Business

Due to the shortage of computing power, Google restricts Meta's use of the Gemini AI model

According to a Financial Times report cited by Jinshi, Google has limited Meta's use of its Gemini AI model because Meta's demand for computing power exceeds what Google can provide. The report states that Google told Meta around March that it could not supply all of the Gemini computing capacity Meta sought to purchase, and that the shortfall disrupted and delayed some of Meta's internal AI projects. Several other Google clients were also affected, but to a lesser extent. The Financial Times said Meta was hit especially hard because of its exceptionally high demand for Google's models.

2026-06-28

computing power

U.S. large models are moving toward closed access in the name of security

The government successfully inserted itself as an approver between commercial AI models and their users for the first time.

2026-06-27

Coinbase has cut AI spending by nearly 50% and is trying open-weight models as the default

Coinbase CEO Brian Armstrong published an article on the company's latest progress in AI cost optimization. Armstrong said that as AI usage and token consumption continue to grow, the key to controlling costs is not restricting employee usage or sending frequent budget reminders, but optimizing default model selection, task routing, and caching strategy. He said Coinbase is trying open-weight models such as GLM 5.2 and Kimi 2.7 as the default options in its internal LLM gateway, while still letting engineers choose other models for specific tasks. According to the company's data, 91% of employees have never hit their AI usage quota, so Coinbase has not tightened quotas and has instead improved overall efficiency with lower-cost models. For model routing, Coinbase preprocesses prompts and then, taking cache hit rates and each model's pricing into account, automatically assigns each task to the most suitable model. Armstrong believes complex tasks such as planning and reasoning may need frontier models, while execution tasks do not necessarily need higher-cost ones, and that model selection should increasingly be automated by AI rather than decided manually. He also pointed to cache hit rate as an important factor in AI costs: Coinbase has added a cache-aware mechanism to its request path to increase reuse of earlier results. In LibreChat, for example, the cache hit rate rose from 5% to 60% after the caching setup was optimized. Armstrong added that engineers are asked to keep context as concise as possible, including starting a new session when switching tasks, narrowing the files included in context, and turning off unused tools, to cut unnecessary token consumption. According to him, these measures have reduced Coinbase's AI spending by nearly 50% even as token usage continues to grow.
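The routing idea described above, picking a model by capability tier, price, and expected cache hit rate, can be sketched as follows. This is not Coinbase's implementation: the model names, prices, cache discounts, and tier scheme are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    input_price: float      # USD per million input tokens (hypothetical)
    cached_discount: float  # fraction saved on cached input tokens (hypothetical)
    tier: int               # 1 = execution tasks, 2 = planning/reasoning tasks


def effective_input_price(model: ModelOption, cache_hit_rate: float) -> float:
    """Blend cached and uncached input prices by the expected cache hit rate."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return model.input_price * (1.0 - cache_hit_rate * model.cached_discount)


def route(models: list[ModelOption], required_tier: int,
          cache_hit_rates: dict[str, float]) -> ModelOption:
    """Pick the cheapest model (after cache effects) that meets the required tier."""
    candidates = [m for m in models if m.tier >= required_tier]
    if not candidates:
        raise ValueError("no model satisfies the required tier")
    return min(
        candidates,
        key=lambda m: effective_input_price(m, cache_hit_rates.get(m.name, 0.0)),
    )


# Hypothetical catalogue: one cheap open-weight default and one frontier model.
MODELS = [
    ModelOption("open-weight-default", input_price=0.50, cached_discount=0.5, tier=1),
    ModelOption("frontier", input_price=5.00, cached_discount=0.9, tier=2),
]
HIT_RATES = {"open-weight-default": 0.6, "frontier": 0.6}

print(route(MODELS, required_tier=1, cache_hit_rates=HIT_RATES).name)
print(route(MODELS, required_tier=2, cache_hit_rates=HIT_RATES).name)
```

Under these assumed numbers, an execution task goes to the cheaper default model, while a planning or reasoning task is sent to the frontier model because it is the only one that meets the tier requirement. Raising the cache hit rate lowers the effective price of whichever model is chosen.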

2026-06-27

open weight model

OpenAI launches the next-generation GPT-5.6 series, currently available only to trusted partners through Codex and the API

According to an official announcement, OpenAI has launched a preview of its next-generation GPT-5.6 series, which includes the flagship model Sol, the balanced model Terra, and the fast, low-cost model Luna. GPT-5.6 introduces a new maximum reasoning effort level and features an ultra-strength mode that accelerates complex tasks through sub-agents. The flagship model Sol introduces Ultra mode, which combines maximum reasoning effort with sub-agent collaboration. On the Terminal-Bench 2.1 command-line workflow test, Sol scored 88.8%, rising to 91.9% in Ultra mode, ahead of GPT-5.5's 83.4% and Claude Fable 5's 88.0%. The mid-range model Terra performs close to GPT-5.5 at half the price, and the lightest model, Luna, is designed for everyday automation tasks. Sol is priced at $5 per million input tokens and $30 per million output tokens, and it supports prompt caching to reduce the cost of repeated calls. On security, the safety assessment confirmed that Sol did not exceed the critical cybersecurity thresholds of the Preparedness Framework. OpenAI has invested more than 700,000 A100-equivalent GPU hours in automated red-team exercises and has equipped the entire series with a defense stack that includes refusal mechanisms, real-time abuse classifiers, and account-level audits. Although the current limited release follows the U.S. government's security framework, OpenAI emphasizes that it does not want government-led access control to become the long-term default, because it would limit defenders' access to cutting-edge tools.

2026-06-27

Gate.AI upgrades its full-chain large model management platform, strengthening unified model access and enterprise governance

Gate.AI, the full-chain large model management platform of the trading platform Gate, has recently completed an upgrade and launched a one-stop large model routing service for enterprises and developers. The platform now connects to more than 200 mainstream large models worldwide and supports the two major protocols, OpenAI's and Anthropic's. Enterprises can reach different model resources through a single API, gaining unified access and management and reducing development, operation, and migration costs. Gate.AI uses intelligent routing and an automatic fallback mechanism to match requests across heterogeneous models and keep business availability high. For governance and security, the platform provides a multi-level unified management system covering organizational structure, role-based permission control, members, and API keys, and it reinforces privacy protection with zero data retention (ZDR) and data processing agreements (DPA). Fine-grained cost governance measures such as shared quota pools help enterprises run AI resources efficiently, consistently, and transparently. As an important part of Gate's Intelligent Web3 strategy, Gate.AI continues to build out an open AI platform, connecting global model resources with enterprise-level governance to promote the large-scale application of AI in practical business scenarios. Gate says it will continue to invest in model access, intelligent routing, enterprise governance, and application innovation, building a full-chain open AI ecosystem to provide long-term support for the intelligent upgrade of enterprises worldwide.

2026-06-25

large model management platform

enterprise governance

intelligent routing

B.AI expands "Self-Selected Service Provider" again, adding mainstream models and four discount tiers for freer access to computing power

The model matrix for the B.AI platform's "Self-Selected Service Provider" module has officially expanded, with mainstream large models such as Moonshot (Kimi series) and Z.ai (GLM series) newly integrated. The module covers four discount groups: 10% off, 40% off, 60% off, and 80% off. Users can generate a personalized "discount model invocation Key" with a single click and switch flexibly between core business and daily testing, balancing high availability against low cost. All discounts can also be stacked with a recharge bonus of up to 1:1, further lowering the computing power threshold. Starting today, users can log in to the B.AI console to customize their own model combination.

2026-06-24

self-selected service provider

The Anthropic Mythos model detected security vulnerabilities in the U.S. government's classified systems

According to the Associated Press, Anthropic's Mythos model has discovered vulnerabilities in the U.S. government's classified systems.

2026-06-24

Vitalik: Ethereum Foundation budget cut by 40%, will shift to a long-term donation fund model

Ethereum co-founder Vitalik Buterin said the Ethereum Foundation (EF) has announced a budget cut of approximately 40% this year as part of its financial transformation plan. Under the funding management policy released last year, EF is gradually moving from an "expenditure-based organization" to an "endowment-based model," aiming to lower its annual spending ratio from about 15% to roughly 5% after 2030. The foundation says it accepts that this process will bring unavoidable staffing and resource adjustments, and it acknowledges the loss of some capabilities and experience. In this round of restructuring, EF has cut approximately 54 employees, about 20% of its overall team. Vitalik said many of the departing members may continue to take part in the Ethereum ecosystem from outside the foundation. Meanwhile, the foundation will shift its strategic focus to a more "lightweight" path for protocol governance and development, including advancing the long-term "Strawmap" roadmap, which covers core protocol upgrades such as consensus mechanisms, privacy technologies, account models, and state structures, and moving Ethereum into its third phase. In terms of specific structural changes, EF will de-emphasize the "multi-client redundancy first" model in favor of a development approach based more on specialized division of labor and AI-assisted formal verification; the privacy and scalability research team PSE will be restructured, moving from exploratory R&D to more focused engineering work; and ecosystem events such as Devcon will gradually be scaled down. EF will also reduce its investment in large cross-domain projects, placing greater emphasis on protocol security and high-value improvements while encouraging more innovative work to be done externally. Although the path is more streamlined, Ethereum will continue to strengthen its core positioning as a highly censorship-resistant and long-term stable protocol.

2026-06-23

Ethereum Foundation

zero-knowledge proofs

Sun Wukong Ecology will hold an X Space tonight at 20:00 on the theme "B.AI partners with MiniMax to fully subsidize the M3 model"

Sun Wukong Ecology announced that it will hold a roundtable discussion on X Space today at 20:00 Beijing time, themed "High-Performance Open Source Model M3 Free Access! B.AI partners with MiniMax for exclusive full subsidies to accelerate the implementation of the Agent economy." The event will reportedly feature special guests from MiniMax and industry KOLs, who will discuss how the exclusive subsidy program launched by B.AI with MiniMax supports complex agent scenarios. Participants will focus on how the "free access" model lowers both the barrier to entry and the cost of developing AI agents, and on its role in promoting large-scale practical use of the Agent economy.

2026-06-23

Sun Wukong Ecology

OpenAI expands its cybersecurity program Daybreak, launching a dedicated defense model GPT-5.5-Cyber

OpenAI announced a comprehensive expansion of its cybersecurity program Daybreak, which aims to use artificial intelligence to speed up the discovery and automatic remediation of software vulnerabilities. At the core of the expansion is GPT-5.5-Cyber, a full-version dedicated model released to trusted defenders. The model reportedly set record-high scores on multiple cybersecurity benchmarks, surpassing GPT-5.5's 81.8% and competitor Mythos 5's 83.8%, and significantly improves the accuracy of vulnerability scanning and patch generation. The Codex Security plugin, updated at the same time, is now deeply integrated into the developer workflow and supports fully automated codebase scanning, threat modeling, and patch generation. On the ecosystem side, OpenAI has launched an exclusive partner program that allows compliant security service providers to integrate GPT-5.5 with specific permissions into their commercial products, and it has started the "Patch the Planet" program with organizations such as Trail of Bits to help more than 30 foundational open-source projects, including Python and Go, fix vulnerabilities. OpenAI also said it is cooperating closely with governments and institutions in multiple countries, including the United States, the United Kingdom, France, and Japan, to strengthen the cybersecurity of critical infrastructure worldwide.

2026-06-23

vulnerability remediation

partner program

ByteDance releases Doubao 2.1 Pro large model, accelerating AI strategy towards the enterprise sector

According to the "Science and Technology Innovation Board Daily," ByteDance officially released Doubao 2.1 Pro, the latest flagship version of its Doubao large model, at the 2026 Volcano Engine Force Conference. Tan Dai, President of Volcano Engine, said the model has made breakthroughs in four areas: code delivery, long-range Agent tasks, multimodal understanding, and stable enterprise-grade operation, giving it stronger engineering delivery capabilities and the ability to handle complex enterprise R&D tasks. At the conference, ByteDance CEO Liang Rubo also emphasized that the company will focus on improving large model capabilities and invest firmly in its MaaS (Model as a Service) business. The report noted that ByteDance's AI strategy is clearly shifting toward enterprise services. Daily token calls to the Doubao large model have reached 180 trillion, more than 1,500 times the level at release and more than 10 times the level of a year ago. However, because consumer-side monetization remains a bottleneck and expenses are high (daily computing costs in the tens of millions against daily revenue below one million), ByteDance has adjusted its resource allocation accordingly. Meanwhile, the video generation model Seedance, aimed primarily at business customers, has demonstrated its commercial potential, with annual recurring revenue (ARR) now at 2 billion dollars, effectively offsetting Doubao's computing costs. The new version of Seedance will also be the first in the industry to launch a 3D white membrane preview feature.

2026-06-23

enterprise side

News: Anthropic's next-generation Mythos model has completed training

According to Andrew Curran, Anthropic's more capable next-generation Mythos model (to be named either Mythos 5.1 or Mythos 6) has completed training. It is not yet clear whether the model will be publicly released or kept for internal use to accelerate the development of later systems. Analysts note that although some cutting-edge models such as Fable 5 and Mythos 5 face restrictions or scrutiny over public release, this has not slowed the pace of research and development at top AI laboratories. Facing fierce competition from open-source models such as GLM-5.2, leading AI companies still need to keep investing in and training more powerful systems to maintain their commercial and technological lead.

2026-06-22

training completed

GLM-5.2 flagship model officially launched on the B.AI platform, API and Web Chat are now fully open

Web Chat for the GLM-5.2 flagship model has officially launched on the B.AI platform. With the API having opened on June 18, both services are now fully available. As a representative of the new generation of open-weight models, GLM-5.2 features an industry-leading 1M ultra-long context window and is designed for demanding scenarios such as large-scale code development, complex reasoning, and agent tasks, balancing cutting-edge performance with stable execution. Users can access it through the official standard API channel or through the discounted service provider options, or use the web-based chat directly. B.AI, a one-stop AI model service platform, says it is committed to giving developers and enterprises an efficient, secure, and user-friendly experience for deploying and invoking models. The integration of GLM-5.2 further enriches the platform ecosystem and supports the rollout of intelligent applications.

2026-06-20

American developers are accelerating the adoption of Chinese AI models to cut costs, but face challenges in revenue conversion

According to Rest of World, American developers and startups are accelerating their adoption of Chinese AI models such as DeepSeek, MiniMax, and Kimi to significantly reduce operating costs. Because these models are highly cost-effective for common tasks such as writing code, their market share is growing rapidly. Data shows that in May, DeepSeek's share of token usage on the service provider Vercel's platform surged from less than 1% to 17%. Microsoft is also reportedly exploring DeepSeek or other open-source models as a low-cost alternative for Copilot. However, despite the models' popularity on price, many American companies and developers, constrained by domestic political scrutiny and data security concerns, choose to access them indirectly through local servers or American cloud service providers. As a result, Chinese AI companies see heavy usage in the U.S. market but still struggle to convert that traffic into substantial direct revenue and long-term trust from enterprise customers.

2026-06-18

American developers

Chinese AI models

revenue conversion

JPMorgan restricts its Hong Kong employees' access to Anthropic AI models

According to the Financial Times, citing informed sources, JPMorgan Chase has stopped its Hong Kong employees from accessing Anthropic's AI models. The bank's Hong Kong employees can no longer select models such as Claude from the internally approved large language model dropdown list. The decision is reportedly based mainly on the specific wording of the usage terms in JPMorgan Chase's licensing agreement with Anthropic, under which the permitted scope of use for Anthropic's models excludes the Greater China region (including Hong Kong). Earlier this year, Goldman Sachs imposed similar access restrictions on its Hong Kong employees based on a strict reading of these terms. Although international institutions could previously get around regional restrictions by signing global contracts and hosting data overseas, Wall Street banks are tightening compliance as the U.S. government and regulators increasingly scrutinize the overseas use of AI technology. Anthropic has previously made clear that its Claude model has never been officially "supported" in Hong Kong.

2026-06-18

Hong Kong employees

access restrictions

Microsoft expands its business in China with the help of OpenAI models, and ByteDance's annual spending may exceed $1 billion

According to Bloomberg, citing informed sources, Microsoft has built a substantial business selling OpenAI models to Chinese companies despite intensifying competition between China and the United States in artificial intelligence. ByteDance has been one of Microsoft's largest artificial intelligence clients in recent years, primarily using OpenAI models. The sources say ByteDance's annual spending on Microsoft's artificial intelligence and cloud services is expected to exceed $1 billion.

2026-06-18

artificial intelligence

Aster upgrades its token economic model, increasing the buyback and burn ratio to 198%

Aster announced an update to the ASTER token economic model that raises the buyback and burn ratio to 198%. Starting today at 20:00 (UTC+8), 99% of Aster's daily platform fees will be used to buy back $ASTER. At the same time, an equivalent amount of $ASTER will be burned from the reserves, for a buyback-to-burn ratio of 1:1. The repurchased $ASTER will be allocated to stakers: each epoch, it will be added to the loyalty rewards (a 300,000 $ASTER base reward plus the buyback amount) and distributed to veASTER holders according to locking weight. The burn will be prioritized for the team. The initial supply of $ASTER is 8 billion tokens, and the burn will continue until the total supply is reduced to 3 billion tokens. The buyback will run automatically each day through TWAP and settle on-chain, with both the buyback and the burn publicly verifiable. In addition, each permissionless listing project on Aster Spot must pay a fee of 50,000 USDT, which will be used for additional ASTER buybacks and distributed as extra staking rewards.
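One reading of the 198% figure is that it adds the 99% of fees spent on buybacks to a matching 99% burned from reserves. The sketch below works through that arithmetic; the fee amount is a made-up example and the function name is illustrative, not part of Aster's announcement.

```python
def buyback_and_burn(daily_fees: float, buyback_share: float = 0.99,
                     burn_match: float = 1.0) -> tuple[float, float, float]:
    """Return (buyback value, burn value, combined ratio to fees).

    buyback_share: fraction of daily fees spent on buybacks (99% per the announcement).
    burn_match: value burned from reserves per unit bought back (1:1 per the announcement).
    """
    if daily_fees <= 0:
        raise ValueError("daily_fees must be positive")
    buyback = daily_fees * buyback_share
    burn = buyback * burn_match
    return buyback, burn, (buyback + burn) / daily_fees


# Hypothetical day with 100,000 USDT in platform fees.
buyback, burn, combined = buyback_and_burn(100_000.0)
print(f"buyback={buyback:,.0f} burn={burn:,.0f} combined={combined:.0%}")
```

Under this reading, every unit of fees removes or redistributes about 1.98 units of token value, which matches the headline ratio.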

2026-06-17

token economic model

ChainCatcher Building the Web3 world with innovations.