HammerDB Variable or Step Workloads are an advanced testing feature that enables you to automatically vary the load on the database over a period of time. When taking this approach you would not focus on the test result but instead monitor the database's ability to cope with the variation in demand and transaction response times.
To implement Step Workloads, HammerDB v4.1 introduces a CLI-only command called steprun, combined with a new XML configuration file called steps.xml. The steprun command reads the XML configuration file and creates primary and replica instances of HammerDB per step. The replica instances start at pre-defined time intervals and automatically connect back to the primary instance of HammerDB.
When defining the workload it is best to think of the configuration as defining a pyramid, with the primary at the base and the replicas sitting above it. Each replica must finish at the same time as or earlier than the primary. The primary running time continues to be defined by the standard settings. For example, when configuring an Oracle workload, the following commands set the rampup and duration running times respectively. In this case the workload would therefore run for 12 minutes: 2 minutes of rampup followed by 10 minutes of duration.
diset tpcc rampup 2
diset tpcc duration 10
The replicas are defined in steps.xml in the config directory. For each replica, the file sets how many virtual users it will configure, how many minutes after the previous instance has started it should start, and how many minutes it will run for.
<steps>
  <replica1>
    <start_after_prev>2</start_after_prev>
    <duration>8</duration>
    <virtual_users>2</virtual_users>
  </replica1>
  <replica2>
    <start_after_prev>2</start_after_prev>
    <duration>6</duration>
    <virtual_users>2</virtual_users>
  </replica2>
  <replica3>
    <start_after_prev>2</start_after_prev>
    <duration>4</duration>
    <virtual_users>2</virtual_users>
  </replica3>
  <replica4>
    <start_after_prev>2</start_after_prev>
    <duration>2</duration>
    <virtual_users>2</virtual_users>
  </replica4>
</steps>
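To see how these values translate into a timeline, the following Python sketch parses the same steps.xml content and computes the start and end minute of each replica. It is an illustration only and not part of HammerDB: the function name replica_schedule is hypothetical, and it assumes that the first replica's delay is counted from the end of the primary's rampup, as the steprun output in this section reports ("replica1 starts 2 minutes after rampup completes").

```python
import xml.etree.ElementTree as ET

# The steps.xml content from this section, embedded so the sketch is self-contained.
STEPS_XML = """
<steps>
  <replica1><start_after_prev>2</start_after_prev><duration>8</duration><virtual_users>2</virtual_users></replica1>
  <replica2><start_after_prev>2</start_after_prev><duration>6</duration><virtual_users>2</virtual_users></replica2>
  <replica3><start_after_prev>2</start_after_prev><duration>4</duration><virtual_users>2</virtual_users></replica3>
  <replica4><start_after_prev>2</start_after_prev><duration>2</duration><virtual_users>2</virtual_users></replica4>
</steps>
"""


def replica_schedule(xml_text, rampup):
    """Return (name, start_minute, end_minute, virtual_users) for each replica.

    Assumption: the first replica is delayed relative to the end of rampup,
    and every later replica relative to the start of the previous one.
    """
    schedule = []
    start = rampup
    for replica in ET.fromstring(xml_text.strip()):
        start += int(replica.findtext("start_after_prev"))
        end = start + int(replica.findtext("duration"))
        vus = int(replica.findtext("virtual_users"))
        schedule.append((replica.tag, start, end, vus))
    return schedule


schedule = replica_schedule(STEPS_XML, rampup=2)
for name, start, end, vus in schedule:
    print(f"{name}: starts at minute {start}, ends at minute {end}, {vus} VU")
```

With a rampup of 2 minutes, every replica in this configuration ends at minute 12, which is why the primary needs a duration of at least 10 minutes to form the base of the pyramid.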
If the configuration is incorrect, HammerDB will report the error and fail to start the Step Workload. In the following example, we have set the default 2 minutes of rampup and 5 minutes of test.
diset tpcc rampup 2
diset tpcc duration 5
In this case the workload fails with an error because the replica running times exceed the running time of the primary.
Value 2 for tpcc:rampup is the same as existing value 2, no change made
Value 5 for tpcc:duration is the same as existing value 5, no change made
primary starts immediately, runs rampup for 2 minutes then runs test for 5 minutes with 2 Active VU
replica1 starts 2 minutes after rampup completes and runs test for 8 minutes with 2 Active VU
Error: replica1 is set to complete after 12 minutes and is longer than the Primary running time of 7 minutes
replica2 starts 2 minutes after previous replica starts and runs test for 6 minutes with 2 Active VU
Error: replica2 is set to complete after 12 minutes and is longer than the Primary running time of 7 minutes
replica3 starts 2 minutes after previous replica starts and runs test for 4 minutes with 2 Active VU
Error: replica3 is set to complete after 12 minutes and is longer than the Primary running time of 7 minutes
replica4 starts 2 minutes after previous replica starts and runs test for 2 minutes with 2 Active VU
Error: replica4 is set to complete after 12 minutes and is longer than the Primary running time of 7 minutes
Error: Step workload primary running time must exceed the running time of all replicas
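The arithmetic behind these errors can be checked before running steprun. The following Python sketch is an approximation of the check and not HammerDB's own code: the function name overrunning_replicas is hypothetical, and it assumes that a replica is rejected only when its completion time is strictly later than the primary's rampup plus duration, which is consistent with a replica being allowed to finish at the same time as the primary.

```python
def overrunning_replicas(rampup, duration, steps):
    """Return the primary running time and the replicas that would outlast it.

    steps is a list of (start_after_prev, duration) pairs in minutes, in the
    order the replicas appear in steps.xml. Assumption: the first replica's
    delay is counted from the end of rampup.
    """
    primary_total = rampup + duration
    start = rampup
    overruns = []
    for index, (start_after_prev, step_duration) in enumerate(steps, start=1):
        start += start_after_prev
        end = start + step_duration
        if end > primary_total:
            overruns.append((f"replica{index}", end))
    return primary_total, overruns


# (start_after_prev, duration) for replica1..replica4 from steps.xml above.
STEPS = [(2, 8), (2, 6), (2, 4), (2, 2)]

failing = overrunning_replicas(rampup=2, duration=5, steps=STEPS)
passing = overrunning_replicas(rampup=2, duration=10, steps=STEPS)
print(failing)
print(passing)
```

With a duration of 5, the primary runs for 7 minutes and all four replicas would complete at minute 12, matching the errors reported. With a duration of 10, the primary runs for 12 minutes and no replica overruns it.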
Instead, the following script sets a rampup of 2 minutes and a duration of 10 minutes, so that the primary provides an adequate base for the pyramid, starting before all of the replicas and ending at the same time as or after them. The workload is started with the steprun command as the last command. No commands should follow steprun, because it intentionally exits all replicas and the primary when the workload is complete.
dbset db ora
dbset bm TPC-C
diset connection system_user system
diset connection system_password oracle
diset connection instance RAZPDB1
diset tpcc tpcc_user tpcc
diset tpcc tpcc_pass tpcc
diset tpcc total_iterations 10000000
diset tpcc ora_driver timed
diset tpcc rampup 2
diset tpcc duration 10
vuset logtotemp 1
vuset vu 2
steprun
Running this script shows that, without further intervention, the primary and replicas are created and the replicas automatically connect back to the primary. The replicas are then started at the time intervals given in the steps.xml file.
The primary sets the rampup in the replicas to zero (as the rampup has completed in the primary) and then sends the individual duration times to the replicas. Time profiling is also disabled in the replicas. When a replica completes, it calls exit from the primary, and when the final replica has completed, the primary also exits.
Note that it is expected for Virtual User 1 of the replicas to end immediately with the following message:
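As a rough illustration of this hand-off, the Python sketch below builds the list of commands that the primary sends to one replica. The command strings are copied from the Oracle run output in this section (ora_timeprofile is the Oracle-specific time profiling setting); the function name replica_commands is hypothetical, and the sketch only models the command list, not the way HammerDB actually transmits it.

```python
def replica_commands(duration, virtual_users):
    """Return the CLI commands sent to a replica, in the order seen in the run output."""
    return [
        "diset tpcc ora_timeprofile false",  # time profiling is disabled in replicas
        "diset tpcc rampup 0",               # rampup has already completed in the primary
        f"diset tpcc duration {duration}",   # per-replica duration from steps.xml
        f"vuset vu {virtual_users}",         # per-replica virtual user count from steps.xml
    ]


replica1_commands = replica_commands(duration=8, virtual_users=2)
for command in replica1_commands:
    print(command)
```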
Vuser 1:FINISHED SUCCESS
This is because Virtual User 1 is the monitor Virtual User, which does not run in the replica and therefore ends immediately. When each replica is started, a message beginning Sending "run_virtual" is recorded.
Sending "run_virtual" ....
The following output shows the previously defined step workload running against an Oracle database.
hammerdb>source runstepora.tcl
Database set to Oracle
Benchmark set to TPC-C for Oracle
Value system for connection:system_user is the same as existing value system, no change made
Changed connection:system_password from manager to oracle for Oracle
Changed connection:instance from oracle to RAZPDB1 for Oracle
Value tpcc for tpcc:tpcc_user is the same as existing value tpcc, no change made
Value tpcc for tpcc:tpcc_pass is the same as existing value tpcc, no change made
Changed tpcc:total_iterations from 1000000 to 10000000 for Oracle
Clearing Script, reload script to activate new setting
Script cleared
Changed tpcc:ora_driver from test to timed for Oracle
Value 2 for tpcc:rampup is the same as existing value 2, no change made
Changed tpcc:duration from 5 to 10 for Oracle
primary starts immediately, runs rampup for 2 minutes then runs test for 10 minutes with 2 Active VU
replica1 starts 2 minutes after rampup completes and runs test for 8 minutes with 2 Active VU
replica2 starts 2 minutes after previous replica starts and runs test for 6 minutes with 2 Active VU
replica3 starts 2 minutes after previous replica starts and runs test for 4 minutes with 2 Active VU
replica4 starts 2 minutes after previous replica starts and runs test for 2 minutes with 2 Active VU
Switch from Local to Primary mode?
Enter yes or no: replied yes
Setting Primary Mode at id : 20166, hostname : razorbill.home
Primary Mode active at id : 20166, hostname : razorbill.home
Starting 1 replica HammerDB instance
Starting 2 replica HammerDB instance
HammerDB CLI v4.1
Copyright (C) 2003-2021 Steve Shaw
Type "help" for a list of commands
HammerDB CLI v4.1
Copyright (C) 2003-2021 Steve Shaw
Type "help" for a list of commands
Starting 3 replica HammerDB instance
Starting 4 replica HammerDB instance
Doing wait to connnect ....
Primary waiting for all replicas to connect .... 0 out of 4 are connected
HammerDB CLI v4.1
Copyright (C) 2003-2021 Steve Shaw
Type "help" for a list of commands
HammerDB CLI v4.1
Copyright (C) 2003-2021 Steve Shaw
Type "help" for a list of commands
The xml is well-formed, applying configuration
The xml is well-formed, applying configuration
The xml is well-formed, applying configuration
Switch from Local to Replica mode?
Enter yes or no: replied yes
Switch from Local to Replica mode?
Enter yes or no: replied yes
Setting Replica Mode at id : 20182, hostname : razorbill.home
Replica connecting to localhost 20166 : Connection succeeded
Received a new replica connection from host ::1
Setting Replica Mode at id : 20181, hostname : razorbill.home
Replica connecting to localhost 20166 : Connection succeeded
New replica joined : {20182 razorbill.home}
The xml is well-formed, applying configuration
Received a new replica connection from host ::1
New replica joined : {20182 razorbill.home} {20181 razorbill.home}
Switch from Local to Replica mode?
Primary call back successful
Switched to Replica mode via callback
Enter yes or no: replied yes
Primary call back successful
Switched to Replica mode via callback
Setting Replica Mode at id : 20183, hostname : razorbill.home
Received a new replica connection from host ::1
Replica connecting to localhost 20166 : Connection succeeded
New replica joined : {20182 razorbill.home} {20181 razorbill.home} {20183 razorbill.home}
Primary call back successful
Switched to Replica mode via callback
Switch from Local to Replica mode?
Enter yes or no: replied yes
Setting Replica Mode at id : 20184, hostname : razorbill.home
Received a new replica connection from host ::1
Replica connecting to localhost 20166 : Connection succeeded
New replica joined : {20182 razorbill.home} {20181 razorbill.home} {20183 razorbill.home} {20184 razorbill.home}
Primary call back successful
Switched to Replica mode via callback
Primary waiting for all replicas to connect .... {20182 razorbill.home} {20181 razorbill.home} {20183 razorbill.home} {20184 razorbill.home} out of 4 are connected
Primary Received all replica connections {20182 razorbill.home} {20181 razorbill.home} {20183 razorbill.home} {20184 razorbill.home}
Database set to Oracle
Database set to Oracle
Database set to Oracle
Setting primary to run 2 virtual users for 10 duration
Database set to Oracle
Database set to Oracle
Value 10 for tpcc:duration is the same as existing value 10, no change made
Sending dbset all to 20182 razorbill.home
Setting replica1 to start after 2 duration 8 VU count 2, Replica instance is 20182 razorbill.home
Sending "diset tpcc ora_timeprofile false" to 20182 razorbill.home
Value false for tpcc:ora_timeprofile is the same as existing value false, no change made
Sending "diset tpcc rampup 0" to 20182 razorbill.home
Changed tpcc:rampup from 2 to 0 for Oracle
Sending "diset tpcc duration 8" to 20182 razorbill.home
Changed tpcc:duration from 10 to 8 for Oracle
Sending "vuset vu 2" to 20182 razorbill.home
Sending dbset all to 20181 razorbill.home
Setting replica2 to start after 2 duration 6 VU count 2, Replica instance is 20181 razorbill.home
Sending "diset tpcc ora_timeprofile false" to 20181 razorbill.home
Value false for tpcc:ora_timeprofile is the same as existing value false, no change made
Sending "diset tpcc rampup 0" to 20181 razorbill.home
Changed tpcc:rampup from 2 to 0 for Oracle
Sending "diset tpcc duration 6" to 20181 razorbill.home
Changed tpcc:duration from 10 to 6 for Oracle
Sending "vuset vu 2" to 20181 razorbill.home
Sending dbset all to 20183 razorbill.home
Setting replica3 to start after 2 duration 4 VU count 2, Replica instance is 20183 razorbill.home
Sending "diset tpcc ora_timeprofile false" to 20183 razorbill.home
Value false for tpcc:ora_timeprofile is the same as existing value false, no change made
Sending "diset tpcc rampup 0" to 20183 razorbill.home
Changed tpcc:rampup from 2 to 0 for Oracle
Sending "diset tpcc duration 4" to 20183 razorbill.home
Changed tpcc:duration from 10 to 4 for Oracle
Sending "vuset vu 2" to 20183 razorbill.home
Sending dbset all to 20184 razorbill.home
Setting replica4 to start after 2 duration 2 VU count 2, Replica instance is 20184 razorbill.home
Sending "diset tpcc ora_timeprofile false" to 20184 razorbill.home
Value false for tpcc:ora_timeprofile is the same as existing value false, no change made
Sending "diset tpcc rampup 0" to 20184 razorbill.home
Changed tpcc:rampup from 2 to 0 for Oracle
Sending "diset tpcc duration 2" to 20184 razorbill.home
Changed tpcc:duration from 10 to 2 for Oracle
Sending "vuset vu 2" to 20184 razorbill.home
Script loaded, Type "print script" to view
Script loaded, Type "print script" to view
Script loaded, Type "print script" to view
Script loaded, Type "print script" to view
Script loaded, Type "print script" to view
Vuser 1 created MONITOR - WAIT IDLE
Vuser 2 created - WAIT IDLE
Vuser 3 created - WAIT IDLE
Vuser 1 created MONITOR - WAIT IDLE
Vuser 2 created - WAIT IDLE
Vuser 1 created MONITOR - WAIT IDLE
Vuser 3 created - WAIT IDLE
Vuser 2 created - WAIT IDLE
3 Virtual Users Created with Monitor VU
Vuser 3 created - WAIT IDLE
3 Virtual Users Created with Monitor VU
Vuser 1 created MONITOR - WAIT IDLE
Vuser 2 created - WAIT IDLE
Vuser 3 created - WAIT IDLE
3 Virtual Users Created with Monitor VU
Vuser 1 created MONITOR - WAIT IDLE
Vuser 2 created - WAIT IDLE
Vuser 3 created - WAIT IDLE
3 Virtual Users Created with Monitor VU
Logging activated to /tmp/hammerdb.log
3 Virtual Users Created with Monitor VU
Starting Primary VUs
Vuser 1:RUNNING
Vuser 1:Beginning rampup time of 2 minutes
Vuser 2:RUNNING
Vuser 2:Processing 10000000 transactions with output suppressed...
Vuser 3:RUNNING
Vuser 3:Processing 10000000 transactions with output suppressed...
Delaying Start of Replicas to rampup 2 replica1 2 replica2 2 replica3 2 replica4 2
Delaying replica1 for 4 minutes.
Delaying replica2 for 6 minutes.
Delaying replica3 for 8 minutes.
Delaying replica4 for 10 minutes.
Primary entering loop waiting for vucomplete
Vuser 1:Rampup 1 minutes complete ...
Vuser 1:Rampup 2 minutes complete ...
Vuser 1:Rampup complete, Taking start AWR snapshot.
Vuser 1:Start Snapshot 18 taken at 10 MAY 2021 09:07 of instance RAZCDB1 (1) of database RAZCDB1 (171153594)
Vuser 1:Timing test period of 10 in minutes
Vuser 1:1 ...,
Sending "run_virtual" to 20182 razorbill.home
Vuser 1:RUNNING
Vuser 1:Operating in Replica Mode, No Snapshots taken...
Vuser 1:FINISHED SUCCESS
Vuser 2:RUNNING
Vuser 2:Processing 10000000 transactions with output suppressed...
Vuser 3:RUNNING
Vuser 3:Processing 10000000 transactions with output suppressed...
Vuser 1:2 ...,
Vuser 1:3 ...,
Sending "run_virtual" to 20181 razorbill.home
Vuser 1:RUNNING
Vuser 1:Operating in Replica Mode, No Snapshots taken...
Vuser 1:FINISHED SUCCESS
Vuser 2:RUNNING
Vuser 2:Processing 10000000 transactions with output suppressed...
Vuser 3:RUNNING
Vuser 3:Processing 10000000 transactions with output suppressed...
Vuser 1:4 ...,
Vuser 1:5 ...,
Sending "run_virtual" to 20183 razorbill.home
Vuser 1:RUNNING
Vuser 1:Operating in Replica Mode, No Snapshots taken...
Vuser 1:FINISHED SUCCESS
Vuser 2:RUNNING
Vuser 2:Processing 10000000 transactions with output suppressed...
Vuser 3:RUNNING
Vuser 3:Processing 10000000 transactions with output suppressed...
Vuser 1:6 ...,
Vuser 1:7 ...,
Sending "run_virtual" to 20184 razorbill.home
Vuser 1:RUNNING
Vuser 1:Operating in Replica Mode, No Snapshots taken...
Vuser 1:FINISHED SUCCESS
Vuser 2:RUNNING
Vuser 2:Processing 10000000 transactions with output suppressed...
Vuser 3:RUNNING
Vuser 3:Processing 10000000 transactions with output suppressed...
Vuser 1:8 ...,
Vuser 1:9 ...,
Vuser 1:10 ...,
Vuser 1:Test complete, Taking end AWR snapshot.
Vuser 1:End Snapshot 19 taken at 10 MAY 2021 09:17 of instance RAZCDB1 (1) of database RAZCDB1 (171153594)
Vuser 1:Test complete: view report from SNAPID 18 to 19
Vuser 1:2 Active Virtual Users configured
Vuser 1:TEST RESULT : System achieved 13607 NOPM from 28559 Oracle TPM
Vuser 1:FINISHED SUCCESS
Vuser 2:FINISHED SUCCESS
Vuser 3:FINISHED SUCCESS
Vuser 2:FINISHED SUCCESS
Vuser 3:FINISHED SUCCESS
Vuser 3:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
Vuser 2:FINISHED SUCCESS
Replica workload complete and calling exit from primary
Lost connection to : 20182 razorbill.home because target application died or connection lost
Vuser 3:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
Vuser 3:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
Vuser 2:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
Vuser 2:FINISHED SUCCESS
ALL VIRTUAL USERS COMPLETE
Replica workload complete and calling exit from primary
Lost connection to : 20181 razorbill.home because target application died or connection lost
Replica workload complete and calling exit from primary
Lost connection to : 20183 razorbill.home because target application died or connection lost
Replica workload complete and calling exit from primary
Lost connection to : 20184 razorbill.home because target application died or connection lost
Primary complete deleting port_file /tmp/hdbcallback.tcl
Step workload complete
Monitoring the workload enables you to see the variation and the impact of starting additional instances against the same database over time.
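When reading a monitoring graph, it can help to know what load profile to expect. The Python sketch below computes the planned number of active virtual users for each minute of the example configuration. It is an illustration under stated assumptions and not part of HammerDB: the function name active_vu_profile is hypothetical, the monitor virtual user is not counted, the primary's virtual users are counted from minute 0 (including rampup), and each replica is assumed to start exactly at its scheduled minute, whereas the run output shows that actual start times can differ slightly.

```python
def active_vu_profile(rampup, duration, primary_vus, steps):
    """Return the planned active virtual user count for each minute of the run.

    steps is a list of (start_after_prev, duration, virtual_users) tuples in
    minutes, in the order the replicas appear in steps.xml.
    """
    total = rampup + duration
    profile = [primary_vus] * total  # the primary runs for the whole period
    start = rampup                   # first replica delay counts from end of rampup
    for start_after_prev, step_duration, vus in steps:
        start += start_after_prev
        for minute in range(start, min(start + step_duration, total)):
            profile[minute] += vus
    return profile


# Primary: rampup 2, duration 10, 2 VU; replicas as defined in steps.xml above.
profile = active_vu_profile(2, 10, 2, [(2, 8, 2), (2, 6, 2), (2, 4, 2), (2, 2, 2)])
for minute, vus in enumerate(profile):
    print(f"minute {minute:2d}: {vus} active VU")
```

For this configuration the planned load steps up from 2 to 10 active virtual users in increments of 2 every 2 minutes, starting at minute 4.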
Step workloads enable you to configure complex Virtual User configurations to see how your database responds to changes in load over time.