Working in the MongoDB Server Performance Testing team, we use Amazon EC2 for system-level testing. This allows us to flexibly deploy and tear down MongoDB clusters of various topologies, day after day. On the other hand, using a public cloud for performance testing can be challenging for repeatability of test results - to put it mildly. We therefore ended up spending several months just benchmarking EC2 itself. We compared combinations of different instance types and disks (ephemeral SSD vs PIOPS EBS). In the end, we found that the largest reduction in variability came from the same configuration options we use on physical hardware as well: turning off hyperthreading, using numactl, and turning off CPU power-saving states. Thus, you could argue that blaming "the cloud" for our performance trouble was wrong. When configured correctly, EC2 can deliver performance characteristics similar to physical hardware; when configured incorrectly, both physical and cloud hardware will perform poorly.
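The three host-tuning steps mentioned above could be applied on a Linux instance roughly as follows. This is an illustrative sketch, not the team's exact configuration: the sysfs paths and `cpupower` flags are standard Linux interfaces, but the right settings depend on the kernel version, instance type, and workload.

```shell
# Sketch of the host tuning described above (illustrative, not the exact
# configuration used by the team).

# 1. Disable hyperthreading (Linux 4.19+ exposes a global SMT switch;
#    older kernels require offlining sibling CPUs individually).
echo off | sudo tee /sys/devices/system/cpu/smt/control

# 2. Disable CPU power-saving states: pin the governor to "performance"
#    and disable deep C-states (cpupower comes from the linux-tools package).
sudo cpupower frequency-set -g performance
sudo cpupower idle-set -D 0

# 3. Start mongod with interleaved NUMA memory allocation, as recommended
#    in the MongoDB production notes.
numactl --interleave=all mongod --config /etc/mongod.conf
```

On EC2, settings like C-states are only adjustable on some instance families (via kernel boot parameters such as `intel_idle.max_cstate`), so the available knobs vary by instance type.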

With the new configuration, we've been able to greatly lower the variability of our daily performance tests and increase trust in the results. For WiredTiger tests, even the worst case shows less than a 10% min-max range, and MMAPv1 is close to that. We consider this to be below the threshold of performance change that most end users would notice anyway, so it is sufficient for our performance testing purposes.

The results also emphasized a golden rule of performance engineering: measure everything, assume nothing. It turned out that the configuration originally used for our performance testing actually had the worst variability of all the configurations we tested!
