This is a very common question and, although covered in the documentation, some extra detail may help with sizing and configuration. As a basic starting point, create a schema with 250-500 warehouses per server CPU socket; for more precise sizing, read on.
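As a minimal illustration of this starting point (the 250-500 per socket figure is the rule of thumb above; the socket count is an assumed example value):

```python
# Rule-of-thumb starting point: 250-500 warehouses per CPU socket.
sockets = 2  # assumed example: a two-socket server
low, high = 250 * sockets, 500 * sockets
print(f"Starting schema size: {low}-{high} warehouses")  # 500-1000
```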
The official TPC-C test has a fixed number of users per warehouse and uses keying and thinking time, so the workload generated by each user is not intensive. However, most people run HammerDB with keying and thinking time disabled, in which case each virtual user can drive approximately the CPU resources of one core on the database server. The relationship between virtual users, warehouses and cores follows from this: you need considerably fewer virtual users and warehouses than the official test to drive a system to maximum throughput.
Additionally, it is important to understand the workload. By default each virtual user has the concept of a home warehouse, where approximately 90% of its workload takes place. With HammerDB this home warehouse is chosen at random at the start of the test and remains fixed. It follows that if you configure a schema with 1000 warehouses and run a test with 10 virtual users, by default most of the workload will be concentrated on 10 warehouses. (Note the “by default” clause: there are options to change this behaviour if desired.) Because the home warehouse is chosen at random, the number of warehouses should be configured so that, at the maximum number of virtual users you will run, the random selection of home warehouses is evenly distributed across the available warehouses, with one or possibly two virtual users selecting the same home warehouse but not more.
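To see why the warehouse count needs headroom over the virtual user count, the following sketch simulates the random home warehouse selection described above and reports how many virtual users land on the most popular warehouse (the counts and trial numbers are illustrative, not HammerDB internals):

```python
import random
from collections import Counter

def max_shared_home(warehouses, virtual_users, trials=1000, seed=42):
    """Simulate random home warehouse selection and return the average
    number of virtual users sharing the most popular warehouse."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        homes = Counter(rng.randrange(warehouses) for _ in range(virtual_users))
        total += max(homes.values())
    return total / trials

# Plenty of warehouses: at most one or two virtual users share a warehouse.
print(max_shared_home(500, 100))  # typically 2-3
# Too few warehouses: many virtual users pile onto the same warehouses.
print(max_shared_home(10, 100))   # often 16-18
```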
As an example, configuring a 10-warehouse schema and running 100 virtual users against it would be a configuration error, as 10 or more virtual users would be expected to select the same home warehouse. The workload would then spend considerably more time in lock contention and would not produce valid results. Typically a ratio of 4 to 5 warehouses per virtual user is the minimum to ensure an even distribution of virtual users to warehouses; therefore for 100 virtual users a minimum of 400 to 500 warehouses should be configured. As noted, configuring more should not have a major impact on results because, depending on the number of virtual users in the test, most of the warehouses will be idle (and ideally the warehouses you are using will be cached in the buffer cache so that I/O to the data area is minimal).
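A sketch of that minimum calculation (the 4-5 warehouses per virtual user ratio is taken from the guidance above):

```python
def min_warehouses(virtual_users, per_vu=(4, 5)):
    """Return the minimum warehouse range for an even random
    distribution of home warehouses across virtual users."""
    return per_vu[0] * virtual_users, per_vu[1] * virtual_users

print(min_warehouses(100))  # (400, 500)
```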
Because one virtual user can drive most of the capacity of one CPU core, the number of warehouses you choose will depend upon the number of cores per socket. If using CPUs with Hyper-Threading, allow for the additional CPU capacity by sizing as if there were 35% more physical cores. Also note that some database software will not scale to fully utilise all cores; see the best practice guide for your chosen database. If CPU utilisation is limited in this way you will need fewer warehouses and virtual users for the test.
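Putting the pieces together, a hedged sizing sketch combining one virtual user per core, the 35% Hyper-Threading uplift and the 4-5 warehouses per virtual user minimum (the socket and core counts are assumed example values; databases that do not scale to all cores will need less):

```python
def size_schema(sockets, cores_per_socket, hyperthreading=True, per_vu=(4, 5)):
    """Estimate peak virtual users (one per effective core) and the
    warehouse range needed to distribute home warehouses evenly."""
    effective_cores = sockets * cores_per_socket
    if hyperthreading:
        # Size as if there were 35% more physical cores.
        effective_cores = round(effective_cores * 1.35)
    return effective_cores, (per_vu[0] * effective_cores, per_vu[1] * effective_cores)

vus, (wh_low, wh_high) = size_schema(sockets=2, cores_per_socket=28)
print(f"~{vus} virtual users, {wh_low}-{wh_high} warehouses")  # ~76 VUs, 304-380
```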
It should also be clear that there is no completely accurate one-size-fits-all guidance for warehouse sizing, as different databases scale differently and some may need more warehouses and virtual users than others to reach peak performance. A common error is to size many thousands of warehouses and virtual users with the aim of reaching high performance, only to produce high levels of contention and low performance instead. Even for highly scalable databases on large systems, the upper limit for tests without keying and thinking time is in the region of 2000 warehouses and up to 500 virtual users for maximum performance.
The exceptions to these guidelines are given in the section Advanced Driver Script Options in the following chapter, namely the Use All Warehouses and Event Driven Scaling options. For advanced users these options enable, respectively, increasing the amount of data area I/O and running more Virtual Users less intensively with keying and thinking time.