LLM Benchmarks

Leading Large Language Models Compared

Monthly performance analyses of leading language models – from OpenAI and Google to local open-source solutions.

The AI Strategy & Research Hub of TIMETOACT GROUP Austria is among the leading experts in applied research on generative AI for enterprises. Our research findings feed directly into product development – enabling us to set the highest standards in implementing AI-powered solutions for businesses.

LLM Benchmarks | March 2025

Highlights:

  • New OpenAI models - o3-mini, o4-mini and GPT-4.1
  • Qwen3 - pushing state-of-the-art for local models
  • Google Is Breathing Down the Neck: Gemini Flash 2.5 Preview and Pro 2.5 v2
  • Practical insights from AI Coding

The benchmark categories in detail

Here's exactly what we're looking at with the different categories of LLM Leaderboards

Docs

How well can the model work with large documents and knowledge bases?

CRM

How well does the model support work with product catalogs and marketplaces?

Integrate

Can the model easily interact with external APIs, services and plugins?

Marketing

How well can the model support marketing activities, e.g. brainstorming, idea generation and text generation?

Reason

How well can the model reason and draw conclusions in a given context?

Code

Can the model generate code and help with programming?

Cost

The estimated cost of running the workload. For cloud-based models, we calculate the cost according to the pricing. For on-premises models, we estimate the cost based on GPU requirements for each model, GPU rental cost, model speed, and operational overhead.

Speed

The "Speed" column indicates the estimated speed of the model in requests per second (without batching). The higher the speed, the better.

Archive

Curious about how the scores have evolved? Here you can find all links to previously published leaderboards

Discover our AI workshops for businesses


Whether it's AI fundamentals, Prompt Engineering training, or potential analysis – we offer tailored solutions for every need.

Explore our AI Workshops

Transform your digital projects with the best AI language models!

Discover the transformative power of the best Large Language Models and revolutionize your business with AI! Stay future-oriented, increase efficiency and secure a clear competitive advantage. We support you in taking your business value to the next level.

* required

We use the data you send us only for contacting you in connection with your request. You can find all further information in our privacy policy.

Solve captcha, please!

captcha image
Martin Warnung
Sales Consultant TIMETOACT GROUP Österreich GmbH +43 664 881 788 80
Blog 2/3/25

ChatGPT & Co: LLM Benchmarks for January

Find out which large language models outperformed in the January 2025 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog 10/1/24

ChatGPT & Co: LLM Benchmarks for September

Find out which large language models outperformed in the September 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog 1/7/25

ChatGPT & Co: LLM Benchmarks for December

Find out which large language models outperformed in the December 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog 11/12/24

ChatGPT & Co: LLM Benchmarks for October

Find out which large language models outperformed in the October 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Blog 12/4/24

ChatGPT & Co: LLM Benchmarks for November

Find out which large language models outperformed in the November 2024 benchmarks. Stay informed on the latest AI developments and performance metrics.

Wissen 7/30/24

LLM-Benchmarks July 2024

This LLM Leaderboard from July 2024 helps to find the best Large Language Model for digital product development.

Wissen 6/30/24

LLM-Benchmarks June 2024

This LLM Leaderboard from june 2024 helps to find the best Large Language Model for digital product development.

Wissen 4/30/24

LLM-Benchmarks April 2024

This LLM Leaderboard from april 2024 helps to find the best Large Language Model for digital product development.

Wissen 8/30/24

LLM-Benchmarks August 2024

Instead of our general LLM benchmarks, we present the first benchmark of different AI architectures in August.

Wissen 5/30/24

LLM-Benchmarks May 2024

This LLM Leaderboard from may 2024 helps to find the best Large Language Model for digital product development.

Blog 5/16/24

Common Mistakes in the Development of AI Assistants

We share how failures when implementing AI occurr and what can be learned from them for future projects: So that AI assistants can be implemented more successfully in the future!

Blog 5/17/24

8 tips for developing AI assistants

8 practical tips for implementing AI assistants

Technologie Übersicht

Consulting for IBM products

IBM Software & Consulting with passion and experience – you can rely on us. Services relating to IBM Software Solutions have always been an integral part of our offer.

Headerbild zu Digitalem Ökosystem
Service

Fit for the digital ecosystem

Insurers are digitally networking with their ecosystem to gain critical capabilities in a division of labor. Personal data, object data are securely exchanged via common digital interfaces.

Wissen 5/2/24

Unlock the Potential of Data Culture in Your Organization

Are you ready to revolutionize your organization's potential by unleashing the power of data culture? Imagine a workplace where every decision is backed by insights, every strategy informed by data, and every employee equipped to navigate the digital landscape with confidence. This is the transformative impact of cultivating a robust data culture within your enterprise.

Headerbild zur Logistik- und Transportbranche
Branche

AI & Digitization for the Transportation and Logistics Indus

Digitale, transparente Prozesse und automatisierte Optimierung helfen Logistikunternehmen, Kosten und Leistung besser auszubalancieren, für eine starke, zukunftsfähige Rolle als Partner der Wirtschaft

Navigationsbild zu Data Science
Service

AI & Data Science

We offer comprehensive solutions in the fields of data science, machine learning and AI that are tailored to your specific challenges and goals.

Headerbild zu Cloud bei Versicherungen
Branche

Paths to the cloud for insurers

Cloud ist die Blaupause für eine moderne Nutzung von IT-Ressourcen. Doch traditionell scheuen viele Versicherer den Weg in die Cloud. Befürchtet werden Kontrollverlust oder fehlende Sicherheit für vertrauliche Daten. Doch all dies sind Themen, die technisch und organisatorisch gelöst sind. Es überwiegen die Vorzüge einer flexiblen und kostengünstigen Architektur.

Headerbild zu Digitale Transformation bei Versicherern
Leistung

Mastering digital transformation in insurance

Digital transformation is the transformation of the corporate world through new technologies and the Internet ► Learn how insurers can master this.

Headerbild für lokale Entwicklerressourcen in Deutschland
Branche

On-site digitization partner for insurance companies

We find the optimal IT solution for insurance companies! ► Everything from a single source ✓ Personally on site ✓ Arrange a personal exchange now.