Scaling Databases in the Chiplet Era: Lessons from MariaDB Performance Engineering

I’ll be presenting at MariaDB Day 2026 in Brussels on HammerDB performance tuning and the benchmarking methodology that delivers real-world architectural impact: How MariaDB Achieved 2.5× OLTP Throughput in Enterprise Server 11.8. This blog post provides some of the insights and background that went into this work.

In 2025 HammerDB began working with MariaDB engineering to analyse OLTP scalability on modern hardware platforms and to identify architectural bottlenecks that limit throughput at scale. The goal was not to tune a benchmark, but to use accurate, comparable and repeatable workloads to identify fundamental serialization points in the database engine. You can read about the results of that work in the blog post: Performance Engineered: MariaDB Enterprise Server 11.8 Accelerates OLTP Workloads by 2.5x

To deliver improvements as quickly as possible, these changes were introduced in MariaDB Enterprise Server 11.8.3-1, with the updates progressing into the community version from MariaDB 12.2.x onwards.

Note: It has been highlighted that some recent public benchmarks of MariaDB evaluated earlier MariaDB 11.8 Community builds that did not yet include these scalability improvements. To observe the latest performance engineering benefits you should use MariaDB Enterprise Server 11.8.3-1 or above, or MariaDB Community Server 12.2.1 (the latest version at the time of publishing this blog).


1. Why Modern Hardware Changes Database Scaling

You may notice that the subtitle explicitly says MariaDB 11.8 delivers up to 2.5x higher OLTP throughput on Dell PowerEdge R7715 servers powered by AMD EPYC™ processors. Database performance can never be viewed in isolation from the hardware. If you are familiar with Moore’s Law you will know that CPU transistor density historically doubled approximately every two years, leading to faster, cheaper, and more efficient computing power. However, in the chiplet era, performance scaling is no longer driven purely by transistor density: system architecture, topology, and interconnect design now dominate real-world database scalability.

Amdahl’s Law, USL, and CPU Topology

The classic underlying scaling model applied by performance engineering is Amdahl’s Law, which states that overall speedup is limited by the serial fraction of a workload. Even small serial sections cap scalability as core counts rise. The Universal Scalability Law extends this by modelling contention and coherency overheads, explaining why performance can plateau or even regress at high concurrency.
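For reference, the two models can be written as follows, where p is the parallelisable fraction of the workload, N the number of cores (or concurrent users), and α and β the USL contention and coherency coefficients:

S_{\mathrm{Amdahl}}(N) = \frac{1}{(1-p) + p/N}, \qquad C_{\mathrm{USL}}(N) = \frac{N}{1 + \alpha(N-1) + \beta N(N-1)}

As N grows, the Amdahl speedup tends towards 1/(1-p), while the βN(N-1) coherency term in the USL is what allows throughput to regress, rather than merely plateau, at high concurrency.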

In the chiplet era, CPU topology adds another dimension: NUMA boundaries and interconnect latency now create intra-CPU scaling knees that classical models did not explicitly predict. Modern database performance engineering therefore requires topology-aware analysis, not just thread-level scaling curves. For this reason we partnered with StorageReview to access the very latest Dell system to evaluate MariaDB performance.

The server platform used was a Dell R7715 equipped with:

      • AMD EPYC 9655P 1 x 96 cores / 192 threads
      • 768 GB RAM (12 x 64GB)
      • Dual PERC13 (Broadcom NVMe RAID)
      • Ubuntu 24.04

Given the topology focus, it is vital to cross-reference performance research on multiple systems. The results were also verified, and additional testing conducted, on systems based on the following CPUs:

      • Intel Xeon Gold 5412U / 128GB RAM
      • AMD EPYC 9454P 1 x 48 cores / 96 threads 256 GB RAM

2. The HammerDB TPROC-C Workload

HammerDB and TPC

HammerDB implements the TPC-C workload model as an open-source, fair-use derivative called TPROC-C. It is designed from the ground up for parallel throughput and modern hardware scaling. In 2018 HammerDB was adopted by the TPC and is hosted by the TPC-Council  with oversight via the TPC-OSS working group to ensure independence and fairness. HammerDB supports not only MariaDB, but Oracle, Microsoft SQL Server, IBM Db2, PostgreSQL and MySQL running natively on both Linux and Windows, enabling a wide range of workloads and insights.  Anyone can contribute to HammerDB.

TPC-C trademark and naming

Although hosted by the TPC-Council, HammerDB does not use ‘TPC’ or ‘TPC-like’ terminology, in line with the TPC fair use policies. TPC-C is a trademarked benchmark requiring audit and configuration disclosure. Non-audited workloads should be labelled ‘TPC-C derived’ and must not use TPC-C or tpmC in their naming, hence the HammerDB TPROC-C workload. Using the trademark without audit and disclosure not only violates the TPC trademark but can mislead readers about comparability with published results.

Transaction metric: tpmC and NOPM

TPC-C defines performance in committed New-Order transactions per minute (tpmC). HammerDB defines its own TPROC-C metric called NOPM (New Orders Per Minute). HammerDB also provides a database engine metric called TPM (transactions per minute); however, TPM cannot be directly compared between database systems. A system may execute more database transactions while completing fewer workload transactions. This also extends to low-level database operations such as SELECT, INSERT, UPDATE, and DELETE statements. NOPM is the meaningful metric for measuring TPC-C derived workloads.

Stored procedures vs client SQL

HammerDB uses stored procedures to reduce network round trips and client-side overhead.  Client-side SQL execution can increase network traffic by an order of magnitude and bias results of database engine scalability downward.

Sharded schemas vs single-instance scaling

The TPC-C specification defines a single schema with N warehouses. Multiplying schema instances (for example, 10 schemas × 100 warehouses) removes contention and changes scaling behaviour, effectively benchmarking a sharded architecture rather than a single schema. HammerDB implements both cached and scaled derived versions of the TPC-C specification, both of which enable scaling to thousands of warehouses without the need for sharding.

Warehouse access patterns and I/O bias

The cached implementation of HammerDB follows the default implementation concept of virtual users having a home warehouse. HammerDB also has an option called “use all warehouses” where random warehouse access patterns are intended to drive physical I/O and bias benchmarks toward storage rather than database concurrency control and buffer pool management. Cached workloads are typically used to observe core engine scalability.

With HammerDB we have a methodology proven over years of observation to align system comparison ratios with official audited TPC-C results.


3. MariaDB 10.6 Baseline and Analysis

HammerDB automatically generates graphical insights into your workloads, and the initial run of MariaDB 10.6.23-19 produced a highly respectable throughput of 782,949 NOPM, which aligns with 1.8M database transactions per minute (i.e. stored procedure commits and rollbacks). As noted for topology-aware scaling, our peak performance was measured at 48 active Virtual Users, aligning with the CPU architecture but below the system’s physical core and thread count.

HammerDB MariaDB 10.6 Result

HammerDB can group related performance jobs into performance profiles. As shown in the graph, a profile simply presents the results of repeated individual performance tests with the Virtual User count increased each time. It can be observed that although performance peaked at the hardware-defined 48 Virtual Users, it then decreased and levelled off as the count reached the physical CPU count.

HammerDB MariaDB 10.6 Performance Profile

At high concurrency, the first clear symptom was time accumulating in native_queued_spin_lock_slowpath — a kernel sign that many threads are repeatedly contending on the same lock. Tracing that contention into the engine led directly to the transactional write path, and in particular mtr_commit, which appends redo records and advances the Log Sequence Number (LSN) on every commit.

Historically, LSN allocation and log buffer management were protected by a global lsn_lock, implemented as a futex (a fast userspace mutex). Futexes are efficient under light contention, but when many threads compete for the same lock they can become a scalability bottleneck, effectively serialising  redo log throughput.

As a solution, MariaDB engineering redesigned this path to remove the global lock. Instead, threads use an atomic fetch_add to reserve non-overlapping slices of the log buffer and advance the LSN. What was previously a single serialized critical section becomes a scalable atomic allocation path, a key contributor to the step change in OLTP throughput seen in MariaDB 11.8 on modern multi-core and chiplet-based systems. You can see the implementation here:

MDEV-21923 LSN allocation is a bottleneck

The nature of identifying and resolving points of serialization is that resolving one area leads to the next, and the next point of contention identified was:

MDEV-19749 MDL scalability regression after backup locks

Historically, FLUSH TABLES WITH READ LOCK (FTWRL)—which freezes all writes during backups—used two separate metadata lock namespaces: one for global read locks and one for commit locks. This meant normal workload traffic was split across two independent contention points. When the newer BACKUP STAGE framework was introduced, these namespaces were merged into a single BACKUP lock namespace to simplify the code. While functionally correct, this change unintentionally collapsed two queues into one, concentrating all metadata lock traffic on a single shared data structure and significantly reducing scalability under high concurrency.

MariaDB engineering addressed this by introducing MDL fast lanes, which shard lightweight metadata locks across multiple independent instances while retaining a global path for heavyweight backup locks. Under normal operation, this restores parallelism by spreading contention across multiple lock instances, while still preserving strict correctness when a backup is in progress. In effect, a global metadata serialization point was split back into multiple scalable paths, shrinking the engine’s serial fraction once again.


4. MariaDB 11.8.3-1 Enterprise

Although MDEV-21923 and MDEV-19749 were the key implementations to scale the MariaDB database engine across CPU architectural boundaries, additional performance related MDEVs can be viewed in the MariaDB blog post. MariaDB Enterprise Server 11.8.3-1 was the first release to include all of these changes to identify and remove single points of contention in the transactional subsystem.

The result was 2,015,540 NOPM at 4.6M TPM for MariaDB Enterprise Server 11.8.3-1, more than 2.5x higher throughput than MariaDB 10.6.23-19 on the same server.

HammerDB MariaDB Enterprise 11.8.3-1 Result

The resulting scalability curves closely follow extended USL predictions and align with hardware topology boundaries, indicating that core engine serialization has been substantially reduced.

HammerDB MariaDB Enterprise 11.8.3-1 Performance Profile

5. Continuous Performance Engineering

As we have seen, removing one point of serialization inevitably reveals the next. This is the nature of performance engineering at scale: each architectural fix shifts the scalability frontier.

Work is continuing to further improve concurrency and topology-aware scaling, aligning MariaDB development with modern database hardware. HammerDB will continue to collaborate with MariaDB engineering to quantify these changes and provide reproducible, industry-standard benchmarks.


Conclusion

Modern database performance is shaped as much by hardware topology as by software design. Chiplet architectures have fundamentally altered scalability assumptions, and benchmarking methodologies must reflect this reality.

Failing to account for topology effects risks conflating software limitations with architectural boundaries. Database benchmarking in the chiplet era therefore requires explicit topology-aware analysis to remain a valid decision-making tool.

What Is a Benchmark?

Most people don’t realise that benchmarks pre-date computers by decades.

The idea comes from land surveying. Long before software or databases existed, surveyors faced a simple problem: how do you measure something today and be confident you’re measuring the same thing months or years later?

The answer was the BENCH MARK — a fixed reference point, literally set into stone, that could be returned to again and again, knowing it hadn’t moved.

That original idea still matters.


The Core of a Benchmark: Repeatability

At its heart, a benchmark is about repeatability.

If you can’t repeat a measurement and get a comparable result, you don’t really have a benchmark — you just have a number.

That’s as true for database benchmarks as it was for surveying:

  • the same workload

  • the same schema

  • the same scalability

  • the same rules

  • the same measurement method

Change those, and you’ve changed what’s being measured.

Experimentation isn’t the problem — it’s the point. A good benchmark makes experimentation meaningful.

When the benchmark is repeatable, changing a single variable — hardware, configuration, software version, or scale — means any difference you observe can reasonably be attributed to that change. Cause and effect become visible.

But without a stable reference point, you’re guessing.


“Fine or Imprisonment for Disturbing This Mark”

Many historic survey benchmarks, particularly metal markers, include a warning stamped directly into them:

“Fine or imprisonment for disturbing this mark.”

But the mark itself wasn’t valuable; what was valuable was trust in the measurement. If the mark could be moved, every future measurement taken from it would be suspect. Accumulated knowledge depended on that reference point remaining exactly where it was.


“We Only Benchmark Our Production Systems”

The problem is that production is rarely a single system.

Over time, organisations accumulate many databases, platforms, and teams. People move on, systems evolve, and each measurement captures a moment that can’t easily be compared with the next.

Each result may be valid on its own, but without a common benchmark none of them relate to each other. You don’t end up with a strategy — you end up with isolated measurements tied to individual systems and people with different skill-sets, opinions and bias.

And once that happens, you don’t really have a database strategy at all.

You end up with a sprawl: different databases, different platforms, different operating systems, some on-prem, some in the cloud. Each measured differently, at a different time. The numbers don’t line up, and there’s no consistent view of performance — or of what any of it actually costs.


Why Repeatability Unlocks Understanding

Once a benchmark is accurate and repeatable, something important changes: you can start to understand why things behave the way they do.

You can:

  • compare hardware generations

  • evaluate configuration changes

  • understand scaling behaviour

  • measure cost versus performance

  • make decisions based on evidence rather than instinct

The goal isn’t to recreate every detail of production. It’s to create a stable reference point you can return to — today, next month, or next year — and trust that the comparison still holds.


In summary

The concept of the benchmark has lasted for centuries because benchmarks are repeatable.

Long before databases or computers, someone looked at the problem of measurement and said: “If we want to measure something time and time again, we need a fixed reference point we can trust.”

That simple idea is still the foundation of benchmarking today.

How to get HammerDB PostgreSQL Metrics up and running fast with pgsentinel

In this post, we give a quick guide to getting HammerDB PostgreSQL metrics up and running with the pgsentinel extension from PostgreSQL build and install to running HammerDB.

Note that PostgreSQL metrics will also run with HammerDB on Windows; however, this guide uses Linux as a more straightforward example for building and installing PostgreSQL extensions. Firstly, download PostgreSQL from http://www.postgresql.org.

Build PostgreSQL and Extensions

Extract the PostgreSQL source into a directory called /opt/postgresql-18.0, cd to this directory, then build PostgreSQL and install it into a separate prefix, in this case /opt/postgresql18.

./configure --prefix=/opt/postgresql18
sudo mkdir /opt/postgresql18
sudo chown -R ubuntu:ubuntu /opt/postgresql18
make
make install
mkdir /opt/postgresql18/DATA

If we look in this directory, we can see the binary install and our DATA directory.

ubuntu@ubuntu24:/opt/postgresql18$ ls
bin DATA include lib share

Note we have not created a database yet, that will come later.

Now we want to export our PATH and LD_LIBRARY_PATH so pgsentinel and HammerDB can find where we built PostgreSQL.

export PATH=/opt/postgresql18/bin:$PATH
export LD_LIBRARY_PATH=/opt/postgresql18/lib:$LD_LIBRARY_PATH

Now, we want to go to the pg_stat_statements contrib directory and build and install this extension. (The pgsentinel extension is built and installed separately from its own source.)

cd /opt/postgresql-18.0/contrib/pg_stat_statements/
make
make install

The next step is to install our database into the /opt/postgresql18/DATA directory.

/opt/postgresql18$ ./bin/initdb ./DATA

Edit the postgresql.conf and add the following settings.

# Add settings for extensions here
shared_preload_libraries = 'pg_stat_statements,pgsentinel'
track_activity_query_size=2048
pg_stat_statements.save=on
pg_stat_statements.track=all
pgsentinel_pgssh.enable = true
pgsentinel_ash.max_entries = 1000000

Start PostgreSQL

/opt/postgresql18$ ./bin/pg_ctl start -D ./DATA
waiting for server to start.....2025-11-14 17:11:15.179 GMT [11871] LOG: starting PostgreSQL 18.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, 64-bit
2025-11-14 17:11:15.179 GMT [11871] LOG: listening on IPv4 address "127.0.0.1", port 5432
2025-11-14 17:11:15.190 GMT [11871] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-11-14 17:11:15.193 GMT [11877] LOG: database system was shut down at 2025-11-14 17:10:46 GMT
2025-11-14 17:11:15.196 GMT [11871] LOG: database system is ready to accept connections
2025-11-14 17:11:15.197 GMT [11880] LOG: starting bgworker pgsentinel
done
server started

Login to PostgreSQL and create the pg_stat_statements and pgsentinel extensions.

./bin/psql -d postgres
psql (18.0)
Type "help" for help.
postgres=# CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION
postgres=# CREATE EXTENSION pgsentinel;
CREATE EXTENSION
postgres=# SELECT * FROM pg_active_session_history LIMIT 5;
ash_time | datid | datname | pid | leader_pid | usesysid | usename | applicatio
n_name | client_addr | client_hostname | client_port | backend_start | xact_star
t | query_start | state_change | wait_event_type | wait_event | state | backend_
xid | backend_xmin | top_level_query | query | cmdtype | queryid | backend_type 
| blockers | blockerpid | blocker_state 
----------+-------+---------+-----+------------+----------+---------+-----------
-------+-------------+-----------------+-------------+---------------+----------
--+-------------+--------------+-----------------+------------+-------+---------
----+--------------+-----------------+-------+---------+---------+--------------
+----------+------------+---------------
(0 rows)

Next, switch to HammerDB. As we have already exported the LD_LIBRARY_PATH, HammerDB can find the PostgreSQL library.

./hammerdbcli 
HammerDB CLI v5.0
Copyright © HammerDB Ltd hosted by tpc.org 2019-2025
Type "help" for a list of commands
Initialized new Jobs on-disk database /tmp/hammer.DB
hammerdb>librarycheck
...
Checking database library for PostgreSQL
Success ... loaded library Pgtcl for PostgreSQL
...

Run HammerDB

Run HammerDB and create a PostgreSQL schema; in this example we use TPROC-C.

Once the schema is built, run the HammerDB workload.
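If you prefer to script this step, a minimal hammerdbcli (Tcl) sketch against the local PostgreSQL instance might look like the following; the warehouse count, timings and connection settings are illustrative only, and the full parameter list for your version can be checked with the print dict command.

dbset db pg
dbset bm TPC-C
diset connection pg_host localhost
diset connection pg_port 5432
# build a small TPROC-C schema with 4 loader virtual users
diset tpcc pg_count_ware 30
diset tpcc pg_num_vu 4
buildschema
# once the build is complete, configure and run a timed workload
diset tpcc pg_driver timed
diset tpcc pg_rampup 2
diset tpcc pg_duration 5
loadscript
vuset vu 4
vucreate
vurun

When the run completes, vudestroy releases the Virtual Users and the result is recorded as a job.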

Hit the Metrics button, and you will now see the HammerDB Active Session History overview.

Grab the Metrics tab from the notebook and drag the window out. Grab the corner of the Window to expand it fully. Note, you can also shrink and expand the graph pane separately.

You now see the full details of the PostgreSQL workload. In this view, we can see WALWrite immediately as the main wait event.

The important concept of an Active Session History is that you can select a time period in the main graph window by dragging the grey box between a start and end point and HammerDB will show the statistics for that time period. You can drill down on the SQL being run, the event and the user.

When you have started the CPU agent you can also view the CPU metrics in real-time.

Conclusion

HammerDB PostgreSQL metrics allows you to view historical PostgreSQL performance metrics for your HammerDB workloads.

 

 

How to Compare Open Source and Proprietary Databases with HammerDB

We’ve recently given presentations at MariaDB MeetUp days on the methodology for comparing open source and proprietary databases. As we’ve been asked numerous times for the slides, in this post we provide the content and a summary of the presentation.

Benchmarking Performance Without Restrictions

Sharing performance metrics for proprietary databases such as Oracle and SQL Server is restricted by license agreements, specifically DeWitt clauses. These clauses prohibit the publication of benchmark results without explicit vendor approval, and you should be fully aware of what you can and can’t share without permission. Open source databases such as MariaDB, MySQL and PostgreSQL have no such restrictions, enabling free sharing of performance evaluations using HammerDB.

HammerDB does not give legal advice; however, if you run your own benchmarks on commercial databases with HammerDB and keep the results private, then you are on safe ground regarding any license conditions. There are no restrictions on running benchmarks against any database, only restrictions on sharing the results.


HammerDB: The Open Source Benchmarking Tool

HammerDB is a widely adopted benchmarking tool designed for relational databases. It is hosted by the TPC Council and available under the GPLv3 license on GitHub. Its cross-platform design supports both Windows and Linux clients, and Docker containers are available for simplified deployment. ARM support is also in development.

Do check the HammerDB download statistics page, as HammerDB is cloned and downloaded hundreds of thousands of times globally.

HammerDB supports multiple databases including MariaDB, Oracle, SQL Server, IBM Db2, MySQL, and PostgreSQL all accessed via native interfaces written in C for maximum performance and scalability.

HammerDB implements two primary workloads:

  • TPROC-C: An OLTP workload based on the TPC-C standard, designed to emulate transaction processing environments with high concurrency and throughput.
  • TPROC-H: An OLAP workload modeled after TPC-H, used for analytics and decision support scenarios with large scan-heavy queries.

A common question is: if the workloads are based on the TPC-C and TPC-H standards, why not simply use TPC-C and TPC-H and the associated terminology such as tpmC and QphH? The answer is that this terminology is copyrighted and trademarked by the TPC-Council, and any tool or user employing it is violating these restrictions unless the workload is approved and published on the official TPC site. Using HammerDB you are assured that no copyright or trademarks are infringed when using or sharing (where permitted) performance data.

As would be expected of a database benchmarking tool, all workloads are implemented in SQL: for every workload and every database, what runs in the database is SQL. For efficiency, all of the HammerDB database interfaces are written against each database’s native C interface.

A key database benchmarking concept is that the tool and workload should stress the concurrency in the database and not in the client. A number of database benchmarking tools fall at this first hurdle. HammerDB is designed with low overhead and non-blocking, multithreaded parallel execution, ensuring we are stressing the concurrency control in the database and not in the client.

If you generate a flamegraph during a HammerDB test it demonstrates the minimal impact of the HammerDB client on system performance during TPROC-C benchmarking.

For OLTP tests, using stored procedures is highly recommended to reduce overhead and simulate realistic transactional patterns. This drastically reduces network round-trips and improves efficiency.

Another important concept that catches out many benchmarking beginners is that of cached vs scaled workloads. HammerDB always recommends starting with a cached workload, as this ‘sets the bar’ for performance of a particular database and server. Advanced users can then scale an environment with HammerDB advanced features to match the cached performance.


Interfaces, Schema, and Configuration

HammerDB offers multiple user interfaces: a native UHD GUI for Windows and Linux, a web interface (CloudTk), and full command-line support for automation.

Schema creation is a streamlined process. HammerDB builds the tables, loads the data, creates indexes and stored procedures, and collects statistics with a single action. You can also perform schema checks to ensure consistency.

Schema builds are affected by CPU (on the client), I/O (on the server), and network bandwidth if run remotely. For cloud-native benchmarks, it’s also possible to generate data into flat files and upload to the target system.

One common question is how many warehouses to configure. For most benchmarks, only a subset of warehouses is used, as each virtual user primarily accesses one warehouse. This keeps the data “hot” for a cached workload.

However, advanced scenarios like full-warehouse access allow for distributed I/O stress testing, especially in cloud environments.


Running Benchmarks and Automating Workloads

HammerDB supports flexible test configuration via driver scripts. These can be used to perform quick functional tests or timed runs that simulate production workloads. Test durations, ramp-up periods, and concurrency levels are easily configured.

While the GUI offers one-click test execution, automation for advanced users is where HammerDB excels. Using Tcl, Python, Bash, or PowerShell, entire workloads from schema creation to results capture can be scripted and run as part of CI pipelines. Docker images also enable HammerDB to be run in such environments in a self-contained environment.

Driver scripts offer flexible execution: quick test scripts, timed runs with ramp-up, and advanced custom options.

Running a test captures performance metrics and transactions over time.

Results are logged to a SQLite database, accessible via GUI or CLI.

A complete command-line test run can be built using a Tcl or Python script to define connection and workload parameters and iteratively run performance profiles, as sketched below.
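As a minimal sketch, assuming a local MariaDB instance (the maria_ prefixed parameters come from the TPROC-C options dictionary; other databases follow the same pattern and can be confirmed with print dict), such a driver script might look like this:

dbset db maria
dbset bm TPC-C
diset connection maria_host localhost
diset connection maria_port 3306
diset tpcc maria_driver timed
diset tpcc maria_rampup 2
diset tpcc maria_duration 5
loadscript
# iterate the Virtual User count to build a performance profile
foreach vu {1 2 4 8 16 32 48 64} {
    vuset vu $vu
    vucreate
    vurun
    # poll until the timed run completes, then release the Virtual Users
    while {![vucomplete]} { after 10000 }
    vudestroy
    after 5000
}

Each iteration is stored as a job in the results repository, so the whole profile can be re-run and compared between systems or software versions.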


Performance Analysis and Result Interpretation

Each benchmark run is recorded as a job in the SQLite repository, enabling traceable, repeatable testing. HammerDB can sequence multiple jobs in an automated profile to discover scalability limits and performance plateaus. This helps expert performance engineers evaluate how well the database handles increasing user loads.

Job summaries and individual run details are available in the UI and exported formats, providing a clear view of performance characteristics.

Arguably, the most common database benchmarking error in OLTP environments is to run a single benchmark test and extrapolate the results to compare databases. This is where the performance profile is key. Automated job sequences increment virtual users to discover peak performance and scalability characteristics. Databases perform and scale differently on different hardware, and what performs well on one system may not perform well on another. The most important metric is peak performance, and advanced users often observe how closely this correlates with physical and logical CPU counts.

HammerDB reports two main performance indicators:

  • NOPM (New Orders Per Minute) comparable across engines, approximates official tpmC scores.
  • TPM (Transactions Per Minute) measures internal engine activity, only comparable within the same engine.

Job detail output provides insights into performance at individual run levels for tuning and analysis.
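For CLI users, the jobs command queries the same repository; a brief sketch is shown below (the job identifier is a placeholder, and option names may vary slightly by version):

jobs                                  ;# list the jobs stored in the repository
jobs 6878B60C5E7703E273731353 result  ;# NOPM/TPM result for a given job
jobs 6878B60C5E7703E273731353 timing  ;# per-transaction response times, if captured
jobs 6878B60C5E7703E273731353 dict    ;# the configuration dictionary used for the job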


Oracle Workloads and Future Plans

HammerDB continues to expand its capabilities. A key planned enhancement area  is the Oracle workload replay feature, whereby Oracle trace files can be converted and run against a database for a bespoke test workload.

HammerDB Oracle functionality also includes an Active Session History (ASH) viewer for monitoring database performance over time. Active Session History is also supported for PostgreSQL.


Final Thoughts: Hardware and Transparency

Database performance is never just about the database engine. Hardware configurations including CPU topology, memory layout, I/O subsystems, and networking play a decisive role and only holistic testing of the entire system with a tool that is proven to scale such as HammerDB can accurately reveal the weak spots.

Lastly, if someone has already performed relevant benchmarks (where permitted) reuse them and contribute your own results back to the community.

HammerDB in the near future plans to support  public uploads to the TPC to encourage transparency and shared learning.

HammerDB v5.0 Released – Powered by Tcl 9.0 and Single-File Executables

🚀 We’re excited to announce the release of HammerDB v5.0, a major leap forward for the world’s leading open-source database benchmarking tool. With this release, we’ve delivered the HammerDB we’ve always envisioned—modern, portable, and more powerful than ever.

Built on Tcl/Tk 9.0—the first major upgrade in over 27 years—HammerDB v5.0 brings a fully streamlined experience, enhanced performance, and unmatched ease of installation across platforms.

🧪 Benchmark Any Major Database

Whether you’re running MariaDB, PostgreSQL, Oracle, SQL Server, MySQL, or IBM Db2, HammerDB v5.0 makes comparing databases faster, easier, and more consistent across platforms than ever before.

👉 Download HammerDB v5.0 now and start benchmarking with ease.

✅ Single-File Executables

No more juggling directories—v5.0 introduces standalone binaries for:

  • hammerdb (Graphical User Interface)

  • hammerdbcli (Command Line Interface)

  • hammerdbws (Web Service)

Each tool is now delivered as a single file with everything you need bundled inside. While Python dependencies remain external, they’re still fully supported via the familiar pylib directory.

✅ Effortless Installation

Getting started is now simpler than ever, with native packages for major platforms:

  • .deb packages for Ubuntu/Debian

  • .rpm packages for Red Hat/CentOS/Fedora

  • Native builds for Windows and Linux

  • ARM support coming soon

✅ Improved Windows Thread Handling

In earlier versions, Windows required 3x the threads per virtual user compared to Linux. With Tcl 9.0, HammerDB v5.0 now uses a 1:1 thread-to-virtual-user model across all platforms—bringing consistency and improved efficiency to Windows environments.

⚡ Performance Enhancements

Thanks to the Tcl 9 upgrade and native C package ports, v5.0 delivers internal performance boosts:

  • Up to 2–5% gains on TPROC-C workloads

  • Minimal changes for TPROC-H (by design)

While HammerDB benchmarks databases—not itself—these improvements can impact comparative results against earlier versions.

🔓 Still 100% Open Source

HammerDB remains fully open source under the GNU GPL v3.0. The source code is available on GitHub, and you can still build from source. Developer partners also have access to a special edition with the full directory structure of previous versions.

👉 Read the full changelog on GitHub

How to understand TPC-C tpmC and TPROC-C NOPM and what is ‘good’ performance?

More than ever, we see confusion in interpreting and comparing the performance of databases with workloads derived from the TPC-Council’s TPC-C specification, including HammerDB’s TPROC-C NOPM and TPM.

In this post, we revisit how to interpret transactional database performance metrics and give guidance on what levels of performance should be expected on up-to-date hardware and software in 2024.

tpmC

tpmC is the transactions per minute metric that is the measurement of the official TPC-C benchmark from the TPC-Council. To be more specific, tpmC measures the rate of New Order transactions only, executed per minute, while the database is also processing other transactions. Therefore, tpmC is not the same as the number of total database transactions (i.e. transaction commits and rollbacks) recorded from the database metrics. This is an important concept, as different database engines may produce better performance with different ratios between tpmC and database transactions. Put simply, some database engines may perform better committing more frequently and some less so. Therefore, tpmC is the official metric, and this is not the same as measuring database transactions.

Importantly, TPC-C and tpmC are registered trademarks of the TPC-Council. Without exception, TPC-C and tpmC can only be used for official audited TPC-C benchmarks published here by the TPC-Council. If a benchmark claiming a tpmC metric has not been audited and approved by the TPC-Council, then it is invalid as well as violating the registered trademarks; only benchmarks found at the TPC-Council can use TPC-C and tpmC.

The TPC provides the specification for the TPC-C benchmark for free, and workloads can be implemented derived from the specification; however, it is not permitted to use the official terminology unless audited, and HammerDB does not use the TPC-C and tpmC terminology, respecting the TPC’s trademarks.

NOPM and TPM/TPS

HammerDB implements a workload derived from the TPC-C specification called TPROC-C. HammerDB reports two metrics, NOPM and TPM. The easiest way to understand NOPM and TPM is to think of the two metrics as those you see on a dashboard: NOPM is how fast you are going, i.e. throughput, and TPM is how hard the engine is working to deliver that throughput.

The two metrics are related; however, they are not the same.

NOPM

The HammerDB NOPM metric measures the rate of New Order transactions only, executed per minute, while the database is also processing other transactions. It therefore measures and reports the same metric as the official tpmC; however, we are not permitted to use tpmC, for the reasons given in the section above. The NOPM value is the key measure of workload throughput.

TPM (TPS)

TPM is the metric derived from the database engine of the number of commits and rollbacks it is processing. This metric will relate to the transaction counter of database tools, for example Batches/sec in Activity Monitor in SQL Server Management Studio. It allows you to relate database performance to the performance analysis of a specific database engine, but it should not be compared between database engines. Where you see a similar TPS metric, simply multiply by 60 for a TPM equivalent. Note that we should typically expect TPM to be at least double that of NOPM, and therefore take care when comparing a TPM/TPS value from one workload with the NOPM value of another.

Fixed throughput workloads

Another concept to be familiar with, when comparing TPC-C derived workloads, is that of fixed throughput workloads and implementing keying and thinking time delays between transactions.  When using tpmC for an officially audited benchmark, this must be a fixed throughput workload, where each individual database session adds throughput so that the workload scales as we add more warehouses and users (and is another very good technical as well as legal reason not to use tpmC incorrectly). This type of approach can be implemented with HammerDB with the asynchronous scaling option as described in this post.

How to run a fixed throughput workload with HammerDB

However, at a very early stage of HammerDB development, it was identified that the performance ratio observed between systems was the same with the HammerDB default workload without keying and thinking time as with the official published TPC-C benchmarks with keying and thinking time.

Why this would be the case is straightforward. With official TPC-C benchmarks there tends to be a TP Monitor or middleware between the clients and the server, and therefore the clients are not connecting directly to the database. An official benchmark will not typically be implemented with thousands of connections to the database; instead the connections are managed by the middleware, just as HammerDB does with asynchronous scaling, where each Virtual User manages a much larger, user-defined number of connections. It is the middleware, maintaining connections to the database typically in the low hundreds, that drives high levels of throughput. This is why the default no key and think time approach produces similar scaling ratios to the official implementations.
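As a hedged sketch of what enabling that mode looks like for PostgreSQL in the CLI, with each Virtual User managing a number of asynchronous connections that each apply keying and thinking time (parameter names should be verified against print dict for your version):

diset tpcc pg_keyandthink true
diset tpcc pg_async_scale true
diset tpcc pg_async_client 10
loadscript
# 4 Virtual Users x 10 asynchronous clients = 40 database sessions
vuset vu 4
vucreate
vurun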

Response Times

With HammerDB, if you select the time profile checkbox, it will record the response times of all the transactions, as shown below using the HammerDB web service to view the results. NEWORD is the transaction that we measure for the NOPM metric.

Clearly, there is a relationship between response times and throughput. For example, if we increase the number of virtual users and see performance at or below what we saw with fewer virtual users, then the response times will be higher. For a ‘good’ level of throughput, as we shall discuss in the next section, our response times should be in the low single-digit or tens of milliseconds. We should not expect response times to be in the region of hundreds or thousands of milliseconds (i.e. seconds).

 

What is ‘good’ performance?

Firstly, it is worth noting that database performance is relative. With a high-performance database platform, you can do more with less to achieve the same results. However, there are trade-offs with, for example, uptime and reliability: a distributed database environment may provide lower performance and higher response times, but higher levels of availability.

Also, price/performance is an important consideration: unless you have an unlimited budget you should always consider the Total Cost of Ownership (TCO) of the system over a 3-year period to gauge how much a particular level of performance will cost you.

Nevertheless, the key concept of database benchmarking is repeatability: if you set up exactly the same database configuration and run HammerDB against it, you should expect the same result. Any difference in performance can be attributed to the changes that you make. Therefore, even when your testing environment is delivering throughput that far exceeds your expected production database performance, a test system that delivers better performance in a HammerDB test will also have the potential to outperform one for your real-world applications.

With these caveats, we can say that all the database engines that HammerDB supports can exceed 1M (1 million) NOPM for the HammerDB TPROC-C test, with the following example showing MariaDB.

Also note, as discussed, that the database transaction rate exceeds 2M, more than double the NOPM value, as expected.

So, regardless of database engine, we have the potential to see HammerDB TPROC-C performance in the millions of NOPM and transactions. Of course, the exact figure that you see will depend on both hardware and software: for hardware, the CPU, memory and I/O and the ability of the engineer to configure and tune the hardware platform; for software, the ability of the engineer to configure and tune the database software and operating system (and virtualization layer), as well as to set up and run HammerDB. Results in the tens of millions are not uncommon with the most advanced hardware, software and expertise.

Summary

We have seen what metrics are important to a TPC-C derived test and how to interpret them.  We have also seen what constitutes ‘good’ performance on an up-to-date system.

As a final point, it is worth noting that the commercial databases supported by HammerDB have been running the TPC-C workload from which HammerDB is derived for a long time, and a comparatively large amount of time and effort went into ensuring these databases ran the TPC-C workload very well. Unsurprisingly, this translates directly into HammerDB performance, and with the right hardware and engineering expertise the commercial databases will both perform and scale well and set the bar, as would be expected.

Additionally, open source database performance is improving rapidly all the time, and we have seen open source databases more than double performance in HammerDB tests in recent years compared to earlier releases. Always consider the latest releases when evaluating the potential of open source database performance. The ability to compare and contrast the performance of different database engines rather than running test workloads entirely in isolation that only run against one database is crucial to this improvement.

There has never been a more important time to be running your own database benchmarks to continually evaluate the optimal database platform for your needs.


Diagnosing CPU faults and performance with HammerDB

An often overlooked aspect of database benchmarking is that it should be used to stress test databases on all new hardware environments before they enter production. In this post we will look at an example of issue diagnosis and resolution with the desktop Intel® Core™ i9-14900K CPU, as well as examples of measuring database performance with both the performance and efficient cores of this CPU. We will look at some initial TPROC-C tests with SQL Server 2022 on Windows 11 and what we learned from the configuration.

APPCRASH and Diagnosis

To begin with, running in the default environment with a low number of Virtual Users works as expected, however as soon as we start to increase the number of Virtual Users, HammerDB crashes with an example error as below.

Faulting application name: wish90.exe, version: 9.0.2.0, time stamp: 0x6749d6f6
Faulting module name: tcl90.dll, version: 9.0.2.0, time stamp: 0x6749d49c
Exception code: 0xc0000005
Fault offset: 0x00000000001577ec
Faulting process id: 0x2510
Faulting application start time: 0x1DB44CF52D8545D
Faulting application path: E:\HammerDB-5.0-Win\HammerDB-5.0\bin\wish90.exe
Faulting module path: E:\HammerDB-5.0-Win\HammerDB-5.0\bin\tcl90.dll
Report Id: 364bd54e-7da0-4984-9d78-eb6c4ee343ea
Faulting package full name: 
Faulting package-relative application ID:

Looking in the Windows event logs we can see associated errors related to CPU instability.

A corrected hardware error has occurred.

Reported by component: Processor Core
Error Source: Corrected Machine Check
Error Type: Internal parity error
Processor APIC ID: 32

whilst every workload we run results in application crashes or a BSOD with errors such as CLOCK_WATCHDOG_TIMEOUT, clearly highlighting a problem with our configuration.

Whereas a user might initially look to errors in the application as the cause of the issue, searching for the errors we see in the system logs leads us to the following link Vmin Shift Instability Root Cause specifically related to the Intel® Core™ i9-14900K with the resolution given as follows “For all Intel® Core™ 13th/14th Gen desktop processor users: the 0x12B microcode update must be loaded via BIOS update and has been distributed to system and motherboard manufacturers to incorporate into their BIOS.”

We have therefore diagnosed and resolved an issue with database performance through stress testing the platform with HammerDB.

Testing the Solution

Now we have updated the system BIOS with the 0x12B microcode update, the system or application no longer crashes and the schema build completes as expected.

We can now run a test for a longer duration with the system exhibiting stability

at 2M SQL Server TPM

and a highly respectable throughput of 800K+ NOPM for a desktop environment.

CPU affinity and calibration

However, we can see from the HammerDB CPU metrics that, of the 32 logical CPUs available, the first 16 appear to be underutilised during the test. (Another important CPU check is to identify individual cores with high levels of system utilisation, which could indicate individual cores overburdened with interrupts.) Additionally, if we check the configuration of the i9-14900K we can see that there are 24 cores in this CPU, of which 8 are performance cores and 16 are efficient cores, and hyper-threading is only available on the performance cores; therefore the first 16 logical CPUs on the system could be the 2 x logical CPUs of the performance cores.

With SQL Server we can check this by manually setting the Processor Affinity

and using a single-threaded CPU test as described in the HammerDB documentation.

In this case we create and run the following routine on SQL Server

USE [tpcc]
GO
SET ANSI_NULLS ON
GO
CREATE PROCEDURE [dbo].[CPUSIMPLE] 
AS
   BEGIN
      DECLARE
         @n numeric(16,6) = 0,
         @a DATETIME,
         @b DATETIME
      DECLARE
         @f int
      SET @f = 1
      SET @a = CURRENT_TIMESTAMP
      WHILE @f <= 10000000 
         BEGIN
      SET @n = @n % 999999 + sqrt(@f)
            SET @f = @f + 1
         END
         SET @b = CURRENT_TIMESTAMP
         PRINT 'Timing = ' + ISNULL(CAST(DATEDIFF(MS, @a, @b) AS VARCHAR),'')
         PRINT 'Res = ' + ISNULL(CAST(@n AS VARCHAR),'')
   END

Running this test firstly on CPUs 0-15 and then on CPUs 16-31 confirms our expectations: with a single thread, CPUs 0-15 complete the test in just over 5 seconds

Timing = 5350
Res = 873729.721235

(1 row affected)

Completion time: 2024-12-03T13:22:00.6322394+00:00

and 12 seconds on CPUs 16-31

Timing = 12013
Res = 873729.721235

(1 row affected)

Completion time: 2024-12-03T13:22:59.1543056+00:00

Using this knowledge, what if we set, for example, the CPU affinity to the first 24 CPUs only? Re-running the test shows we are now using the performance cores

and seeing a boost in performance

and when the test completes we have now reached the symbolic 1M NOPM in a desktop environment, importantly without any system instabilities.

As we have run the CPU single-threaded tests in T-SQL, for completeness we can also run the same test within HammerDB itself, with an elapsed time of just over 1 second showing that the client-side logic of HammerDB offers high levels of performance compared to database stored procedures.
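A minimal Tcl sketch of the same loop, which can be pasted into the hammerdbcli prompt (the HammerDB documentation describes the full CPU test), is shown below:

# Tcl equivalent of the CPUSIMPLE stored procedure above
set n 0
set a [clock milliseconds]
for {set f 1} {$f <= 10000000} {incr f} {
    set n [expr {fmod($n, 999999) + sqrt($f)}]
}
set b [clock milliseconds]
puts "Timing = [expr {$b - $a}]"
puts "Res = [format %.6f $n]"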

We can also verify from system tools that the CPU is reaching the levels of frequency expected from the configuration.

(Note that as the test completes so quickly you can increase the number of iterations in the loop to observe the boost in frequency for longer.)

Summary

What we observed from these tests is that ALL hardware and software environments should be stress tested, up to and beyond expected operating conditions, to identify potential issues. In this example we found a repeatable way with HammerDB to identify and resolve a CPU instability issue; however, issues can also occur, for example, at the memory and I/O layers, and stress testing while observing system logs is the best practice to identify these before the configuration is deployed in production.

We also used the HammerDB CPU metrics to observe the distribution of the CPU utilisation across the available CPUs and use SQL Server CPU affinity to improve performance by 17% over the default configuration using the knowledge we had of this particular CPU’s configuration.

How to analyze PostgreSQL benchmark performance with HammerDB

In this post, we will look at the findings of a blog post by EnterpriseDB analyzing a HammerDB workload against PostgreSQL using Postgres Workload reports. We will provide a brief summary of the article and its conclusions, and demonstrate a better way to analyze PostgreSQL performance with HammerDB itself. We will then highlight the errors that have been made in the EnterpriseDB article and the reasons why.

The EnterpriseDB blog post can be found here, How Postgres Workload Reports Help Optimize Database Operations.

An excellent point that the author makes is that special attention should be given to two specific items: the actual_start_snap_ts (2024-02-15 07:44:18.588779+00) and the actual_end_snap_ts (2024-02-15 07:54:16.587401+00)… This indicates that the database is being evaluated over a 10-minute period. We will remember this point as it gives an indication of the common error made when using analysis based on snapshot type workload reports (we have seen the same error with Oracle AWR reports).

The key findings of the article were as follows:

  1. This server had a HammerDB benchmark running against it. One possibility – and in this case, the most probable conclusion – is that the client test machine was overwhelmed and could not respond to the server fast enough.
  2. The client was waiting for user input, such as a return from a prompt.

It doesn’t mean that something on the server is impacting the system’s throughput. Instead, the issue is with the client.

And the SQL with the highest wait time in the ClientRead wait event was the following:

COPY stock (s_i_id, s_w_id, s_quantity, s_dist_01, s_dist_02, s_dist_03, s_dist_04, s_dist_05, s_dist_06, s_dist_07, s_dist_08, s_dist_09, s_dist_10, s_data, s_ytd, s_order_cnt, s_remote_cnt) FROM STDIN WITH (FORMAT CSV)

An example PostgreSQL manual at AWS gives an indication of why a COPY operation might result in a ClientRead wait event.

During a copy operation, the data is transferred from the client’s file system to the Aurora PostgreSQL DB cluster. Sending a large amount of data to the DB cluster can delay transmission of data from the client to the DB cluster.

But why are we running a COPY operation during a benchmark anyway? To investigate further, we will analyze PostgreSQL performance using HammerDB built-in tools for a deeper insight into the workload.

(Note, we are using the new awbreezedark theme to be introduced with HammerDB v4.13).

Setting up pgSentinel

HammerDB PostgreSQL metrics is based on a superb tool called pgSentinel that brings active session history functionality to PostgreSQL. An active session history allows us to look back at a database workload and see the workload being run over time. Note that this is in contrast to a snapshot/report type functionality that shows us an average of the workload over a period of time, such as 10 minutes. The active session history instead allows us to drill down into specific time periods of interest.

To use pgSentinel we have installed pg_stat_statements and pgsentinel and added the following to our postgresql.conf.

shared_preload_libraries = 'pg_stat_statements,pgsentinel'
track_activity_query_size=2048
pg_stat_statements.save=on
pg_stat_statements.track=all
pgsentinel_pgssh.enable = true
pgsentinel_ash.pull_frequency = 1
pgsentinel_ash.max_entries = 1000000

Building the HammerDB schema

Before we run the workload, we need to install the schema, so for this example running locally on a laptop we are going to configure a 30 warehouse schema with 4 Virtual Users.

and start the build running.

Press the HammerDB metrics button to start the HammerDB PostgreSQL metrics collection, grab the Metrics tab in the notebook display, drag it outside the main window and expand it to reveal all the metrics being collected over time. Note that if you only have browser access to a remote server you can run HammerDB through the browser, so this functionality is available in all environments.

Deploy HammerDB as a Web Application with CloudTK

The first thing we see in real time is that we have on average 2 active sessions, waiting 100% of the time on ClientRead, and the SQL they are waiting on is the same COPY stock statement we saw in the EnterpriseDB blog post. (Click on the image to expand for a detailed view.)

So this COPY statement is coming from the schema build phase and not the HammerDB benchmark workload at all. It should be clear, as shown in the manual, that if we are running a COPY statement and transferring a large amount of data, we should expect to spend a lot of time waiting for the client to send that data. The client is not ‘overwhelmed’ at all; this is normal when running a COPY statement.

Running the HammerDB benchmark

Once the schema build has completed, we run the HammerDB benchmark workload itself.

and checking with the transaction counter, we can see we are now running a workload against the PostgreSQL database.

Now if we head back to the HammerDB metrics we will use the selection box in the graph to select a period of time when we are running the benchmark workload rather than the build. It should be clear from the graph itself that although the build is dominated by the ClientRead wait event, now that we are running the workload most of our time is spent in foreground CPU, and the top SQL is related to the benchmark workload and not the build.

If you click on the CPU event, it will also report the SQL that causes most occurrences of that event, which of course matches the top SQL shown first.

If you close the metrics window, it will reattach to HammerDB; experts can use this view for a quick overview of the wait events experienced by the database.

When the workload has completed, HammerDB reports the performance.

So, it should now be clear what error was made by EnterpriseDB in analyzing PostgreSQL performance using workload reports. The clue is in the special attention paid to the 10-minute period of the workload report. It is highly likely that in this period of time they ran a schema build and a benchmark workload and then mistakenly attributed the build wait events to the workload, reaching the wrong overall conclusion (that the client test machine was overwhelmed and could not respond to the server fast enough). Of course, we already know this will not be the case in advance, because we use stored procedures to ensure that the workload runs on the server and not the client.

We have learned an important aspect of snapshot type workload reports. Firstly, they show an average of all the work taking place on a database over a period of time, so as in this case if you have 2 separate scenarios (such as schema build and benchmark) in this time period the overall report can be confusing. Secondly, snapshot workload reports are relative. The wait events they show are only relative to the work taking place at that time, so for example, a report from an idle system may have the DBA diagnosing obscure housekeeping related wait events as the DBA taking the snapshot has missed the crucial workload. When using snapshot type reports, pay close attention to what the workload should be that you think you are capturing and clues that you could have made an error.

Instead, using pgSentinel for an active session history approach shows the active sessions over time and their wait events, making it easier to determine that different workloads have been running consecutively.

We looked at the blog post How Postgres Workload Reports Help Optimize Database Operations by EnterpriseDB and saw how using a workload report to analyze a HammerDB workload resulted in an incorrect conclusion. We then presented the HammerDB alternative based on pgSentinel and an active session history approach to illustrate that it was likely EnterpriseDB attributed the wait events from a HammerDB schema build to a benchmark run, thereby reaching this incorrect conclusion.

Of course, workload reports and the active session history complement each other in diagnosing database performance, however with workload reports it is especially important to know what workload is running on the database during the snapshot interval. For example, with the HammerDB Oracle workload a snapshot for an AWR report is taken automatically after the ramp-up period is complete to ensure that the report only shows the database metrics for the workload period.

Using HammerDB in database failure and failover scenarios

A frequently asked question with HammerDB concerns using the TPROC-C workload to test database failure and failover scenarios, by deliberately killing connections or shutting down the database during a workload and then restarting it. This post provides a guide to configuring HammerDB for such a scenario and the considerations to be aware of when doing so.

Is a TPROC-C workload valid if you have restarted the database?

This is an important consideration that is often missed, so although it seems obvious, it is worth reiterating that benchmarking applications are not typically designed to have connections dropped or the database restarted whilst the workload is running. Of course, you can configure your test environment however you wish, and HammerDB is scripted for exactly this flexibility. However, your results will not be valid or comparable to other results. Additionally, as HammerDB is designed as a benchmarking application, there are no plans to extend its functionality to automatically recover in-flight tests from database failures. If the database ‘goes away’ during a test, then the test is invalid: the database should be restarted and the benchmark re-run in its entirety.

Nevertheless, HammerDB does provide additional functionality both to make multiple connections to test environments and to run test scripts multiple times.

Connect pool functionality for clusters

Firstly, when testing clusters, it is possible to configure a pool of connections for HammerDB to use to connect to multiple instances at the same time. We can also configure multiple databases in the same instance to be tested at the same time. In the example below, we have configured the connect pool for PostgreSQL to connect to 3 different databases and round-robin the connections between all of them. Note that you can also direct each stored procedure to a different instance, for example read-write vs read-only.

HammerDB-4.11/config/connectpool$ more pgcpool.xml
<connpool>
<connections>
<c1>
<pg_host>localhost</pg_host>
<pg_port>5432</pg_port>
<pg_sslmode>prefer</pg_sslmode>
<pg_user>tpcc1</pg_user>
<pg_pass>tpcc1</pg_pass>
<pg_dbase>tpcc1</pg_dbase>
</c1>
<c2>
<pg_host>localhost</pg_host>
<pg_port>5432</pg_port>
<pg_sslmode>prefer</pg_sslmode>
<pg_user>tpcc2</pg_user>
<pg_pass>tpcc2</pg_pass>
<pg_dbase>tpcc2</pg_dbase>
</c2>
<c3>
<pg_host>localhost</pg_host>
<pg_port>5432</pg_port>
<pg_sslmode>prefer</pg_sslmode>
<pg_user>tpcc3</pg_user>
<pg_pass>tpcc3</pg_pass>
<pg_dbase>tpcc3</pg_dbase>
</c3>
</connections>
<sprocs>
<neworder>
<connections>c1 c2 c3</connections>
<policy>round_robin</policy>
</neworder>
<payment>
<connections>c1 c2 c3</connections>
<policy>round_robin</policy>
</payment>
<delivery>
<connections>c1 c2 c3</connections>
<policy>round_robin</policy>
</delivery>
<stocklevel>
<connections>c1 c2 c3</connections>
<policy>round_robin</policy>
</stocklevel>
<orderstatus>
<connections>c1 c2 c3</connections>
<policy>round_robin</policy>
</orderstatus>
</sprocs>
</connpool>

To use this XML configuration, select XML Connect Pool in the connection options.

See this post for more details on configuration.  When we run the workload, we now make a connection to all the configured databases and run the transactions according to the defined policy.

If we look at the configured connections when the workload has started, we can see the multiple connections across the defined databases. Note that in this example the main connection used for measuring the NOPM/TPM values connects to the database tpcc1.

00:00:00 postgres: tpcc1 tpcc1 127.0.0.1(52450) idle
00:00:07 postgres: tpcc1 tpcc1 127.0.0.1(34704) idle
00:00:07 postgres: tpcc2 tpcc2 127.0.0.1(34710) SELECT
00:00:07 postgres: tpcc3 tpcc3 127.0.0.1(34722) idle
00:00:00 postgres: tpcc1 tpcc1 127.0.0.1(34738) idle
00:00:07 postgres: tpcc1 tpcc1 127.0.0.1(34748) idle
00:00:07 postgres: tpcc2 tpcc2 127.0.0.1(34758) idle
00:00:07 postgres: tpcc3 tpcc3 127.0.0.1(34770) SELECT
00:00:00 postgres: tpcc1 tpcc1 127.0.0.1(34786) idle
00:00:07 postgres: tpcc1 tpcc1 127.0.0.1(34788) idle
00:00:07 postgres: tpcc2 tpcc2 127.0.0.1(34792) idle
00:00:07 postgres: tpcc3 tpcc3 127.0.0.1(34804) SELECT
00:00:00 postgres: tpcc1 tpcc1 127.0.0.1(34820) idle
00:00:07 postgres: tpcc1 tpcc1 127.0.0.1(34822) SELECT
00:00:07 postgres: tpcc2 tpcc2 127.0.0.1(34826) idle
00:00:07 postgres: tpcc3 tpcc3 127.0.0.1(34836) idle
00:00:00 postgres: tpcc1 tpcc1 127.0.0.1(34850) idle
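
If you prefer to confirm the connection distribution from inside the database rather than from the process list, a short Pgtcl script along the following lines can be used. This is a sketch only: the host, port and tpcc1 credentials are taken from the example configuration above and should be adapted to your environment.

#!/usr/bin/env tclsh
# Summarise connections per database and state using pg_stat_activity
package require Pgtcl
set conn [pg_connect -conninfo "host=localhost port=5432 dbname=tpcc1 user=tpcc1 password=tpcc1"]
set res [pg_exec $conn "select datname, state, count(*) from pg_stat_activity where datname like 'tpcc%' group by datname, state order by datname"]
for {set i 0} {$i < [pg_result $res -numTuples]} {incr i} {
    # Each tuple is returned as a Tcl list: database, state, connection count
    puts [pg_result $res -getTuple $i]
}
pg_result $res -clear
pg_disconnect $conn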

When the test has finished, the NOPM/TPM values are reported from the main connection.

Some databases will report NOPM from across the whole cluster while others report it only per database, so this feature also reports the client-side TPM to monitor the values per connection.

"4 Active Virtual Users configured",
"1",
"TEST RESULT : System achieved 3591 NOPM from 24731 PostgreSQL TPM",
"2",
"VU2 processed neworder 18343 payment 18127 delivery 1823 stocklevel 1853 orderstatus 1838 transactions",
"0",
"Vuser 2:FINISHED SUCCESS",
"4",
"VU4 processed neworder 18178 payment 18294 delivery 1791 stocklevel 1861 orderstatus 1860 transactions",
"0",
"Vuser 4:FINISHED SUCCESS",
"5",
"VU5 processed neworder 18269 payment 18206 delivery 1824 stocklevel 1866 orderstatus 1819 transactions",
"0",
"Vuser 5:FINISHED SUCCESS",
"3",
"VU3 processed neworder 18789 payment 18668 delivery 1912 stocklevel 1878 orderstatus 1761 transactions",
"0",
"Vuser 3:FINISHED SUCCESS",
"1",
"CLIENT SIDE TPM : neworder 10511 payment 10470 delivery 1050 stocklevel 1065 orderstatus 1039",
"0",
"Vuser 1:FINISHED SUCCESS",
"0",
"ALL VIRTUAL USERS COMPLETE"

When testing clusters, this feature allows you to measure throughput across multiple instances at the same time and therefore also to test the planned failure of some of the databases under test.

Virtual User Iterations

If you want to test killing connections and have HammerDB reconnect, then you should use the Virtual User Iterations feature. You can think of this value as an outer loop around Total Transactions per User set in the Driver Options for a workload. For example, if you set Total Transactions per User to 100,000 and Virtual User Iterations to 10, then each Virtual User will do 1,000,000 transactions in total across 10 separate sessions, having logged off and on 10 times. However, if you kill the connection first, the session will also reconnect, allowing you to use the same mechanism for failure scenarios.
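
For CLI users, the equivalent settings can be sketched as follows. This is an illustrative example for the PostgreSQL TPROC-C workload; parameter names such as pg_total_iterations may vary between HammerDB versions.

# HammerDB CLI sketch (PostgreSQL TPROC-C assumed; variable names may
# differ slightly between HammerDB versions)
dbset db pg
diset tpcc pg_total_iterations 100000 ;# Total Transactions per User (inner loop)
loadscript
vuset vu 4                            ;# number of Virtual Users
vuset iterations 10                   ;# Virtual User Iterations (outer loop)
vucreate
vurun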

When you create the virtual users, you can see the iterations value in the central panel.

Now, if you terminate a session, in this example with select pg_terminate_backend(9011); where 9011 is a Virtual User pid, you can see the error and that the Virtual User has reconnected.

Also note that when connections are killed, they can be left in an uncertain state that can result in crashes or errors when the same thread starts running again. Therefore, it is best practice to add a short pause before reconnecting, as shown with the line “after 2000” below, and also to pre-emptively close the connection before trying to reconnect, in this example with catch {pg_disconnect $lda}.
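
A minimal sketch of that reconnect pattern is shown below, as it might appear in the Virtual User error-handling path of the driver script. The ConnectToPostgres helper proc and its argument order are assumptions based on the standard HammerDB PostgreSQL driver script and may vary between HammerDB versions.

# Pre-emptively close the (possibly dead) handle, ignoring any error it raises
catch {pg_disconnect $lda}
# Short pause (2000 ms) to allow the server to clean up the terminated backend
after 2000
# Reconnect with the same connection settings used by the driver script;
# ConnectToPostgres is assumed from the standard HammerDB PostgreSQL driver
set lda [ ConnectToPostgres $host $port $sslmode $user $password $db ]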

Summary

Benchmarking applications are not typically designed for failure and failover scenarios. Nevertheless, HammerDB provides users with the functionality to create their own bespoke test environments, allowing them to test multiple instances in a cluster and to reconnect and restart a test after a failure. Used either independently or combined, these features turn HammerDB into an application for testing database or connection failure rather than benchmarking performance.