Weekly AI Rankings — June 28 – July 05, 2026
Top 5 AI Models of the Week
This week, Anthropic launched Claude Sonnet 5 and made it available to a wider audience, including free access. The model is positioned as a cheaper working default.
API pricing for Claude Sonnet 5 is $2 per 1M input tokens and $10 per 1M output tokens, which keeps it accessible for developers.
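To make the pricing concrete, the quoted rates can be turned into a per-request cost estimate. This is a minimal sketch: the function name estimate_cost_usd and the example token counts are illustrative assumptions, and only the two per-million rates come from the item above.

```python
# Rates quoted above for Claude Sonnet 5 (USD per 1M tokens).
INPUT_USD_PER_MTOK = 2.0
OUTPUT_USD_PER_MTOK = 10.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: 12,000 input tokens and 1,500 output tokens.
cost = estimate_cost_usd(12_000, 1_500)
print(f"${cost:.4f}")  # $0.0390
```

Because output tokens cost five times as much as input tokens at these rates, long responses tend to dominate the bill even when prompts are large.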
Z.ai released ZCode 3.0, an AI-native IDE that supports multi-agent workflows. The release marks a significant step for agent-based development.
ZCode 3.0 includes code review features and tight integration with GLM-5.2.
This week, GLM-5.2 surpassed Claude Opus 4.8 on Terminal Bench 2.1, sparking discussion about its performance. The model is also integrated into ZCode 3.0.
On Terminal Bench 2.1, GLM-5.2 scored 82.7 against 78.9 for Claude Opus 4.8, a lead of 3.8 points.
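The size of the gap is easy to check from the two reported scores. The snippet below is a small sketch using only those numbers; the variable names are illustrative, and it says nothing about how the benchmark itself is scored.

```python
# Scores reported above on Terminal Bench 2.1.
glm_score = 82.7
opus_score = 78.9

# Absolute gap in benchmark points, rounded to hide floating-point noise.
gap_points = round(glm_score - opus_score, 1)

# Relative lead of GLM-5.2 over Claude Opus 4.8, as a percentage.
relative_lead_pct = round((glm_score - opus_score) / opus_score * 100, 1)

print(gap_points, relative_lead_pct)  # 3.8 4.8
```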
Meituan announced LongCat 2.0, trained on Chinese chips, drawing attention to architectural solutions for long context.
The LongCat 2.0 announcement also covers API pricing and its large-scale training parameters.
Mistral updated Leanstral to version 1.5, an important step for formal verification support in Lean 4.
Leanstral 1.5 is aimed at helping developers formulate and verify proofs.
Top 5 AI Tools of the Week
This week, Anthropic released an official prompt library for Claude Code, marking a significant step for developers. The company also restricted access to the tool for developers in China.
The library includes templates for planning, debugging, and automation, making it useful for real engineering scenarios.
Sber launched GFusion, a diffusion LLM based on GigaChat, drawing attention to its atypical output mechanics. The model was released as open source.
Unlike traditional autoregressive models, which produce text one token at a time, GFusion attempts to generate and edit text in blocks.
Caveman Code has become a popular plugin that saves tokens by compressing the style of LLM responses. This tool has attracted developers' attention amid rising token costs.
Caveman Code can save up to 75% of tokens by reducing introductory explanations and lengthy transitions.
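As a rough illustration of what the quoted ceiling means, the sketch below computes how many tokens remain after a given fraction is saved. The function name compressed_tokens and the 800-token example are assumptions for illustration; 75% is the best case quoted above, not a typical or guaranteed saving.

```python
def compressed_tokens(original_tokens: int, savings_fraction: float = 0.75) -> int:
    """Tokens left after removing `savings_fraction` of a response."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    return round(original_tokens * (1.0 - savings_fraction))


# Hypothetical 800-token response at the best-case 75% saving.
print(compressed_tokens(800))  # 200
```

Lower savings fractions can be passed in to model more conservative scenarios, for example compressed_tokens(800, 0.4).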
Sber launched KVAE-Audio, an audio compression tool aimed at generative models. It reports strong compression of audio along the time dimension.
KVAE-Audio processes audio at 48 kHz and demonstrates compression of up to 960× in time.
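One way to read these figures, assuming the 960× factor refers to reduction along the time axis (the item does not spell this out), is as the number of compressed frames per second of audio. The sketch below works through that arithmetic; the function name latent_frames and the 10-second clip are illustrative assumptions.

```python
SAMPLE_RATE_HZ = 48_000   # sample rate quoted above
COMPRESSION_FACTOR = 960  # best-case factor quoted above


def latent_frames(duration_seconds: float) -> float:
    """Frames for a clip, assuming a 960x reduction along the time axis."""
    samples = duration_seconds * SAMPLE_RATE_HZ
    return samples / COMPRESSION_FACTOR


print(SAMPLE_RATE_HZ / COMPRESSION_FACTOR)  # 50.0 frames per second
print(latent_frames(10.0))                  # 500.0 frames for a 10 s clip
```

Under that assumption, each second of 48 kHz audio would be represented by 50 frames instead of 48,000 samples.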