Introducing Grasshopper – An Open Source Python Library for Load Testing – Alteryx

We’re excited to announce that a new open-source project has joined the Alteryx open-source ecosystem. Grasshopper is a library for automated load testing, written in Python.

Introduction

At Alteryx, one of our objectives is to develop innovative tools and frameworks that enhance software testing abilities. As part of this mission, we have designed a user-friendly, open-source framework that helps people get more out of their performance testing efforts. The framework measures software performance by load testing an API, generating metrics such as response times, failure rates, and custom trends, and reporting those metrics to a timeseries database. Our aim is to help everyone progress on their performance testing journey.

Our team loved how easy it was to get started with Locust, but it lacked an out-of-the-box way to measure timings that span multiple HTTP requests, which was a crucial factor in assessing our software’s performance. We were also unable to aggregate metrics for a generalized conditional statement or report those metrics to a timeseries database. These features were essential for efficient and comprehensive performance testing of our software, so we extended Locust to create Grasshopper, which includes all of these capabilities.

So, we are excited to announce a new addition to our open-source projects: Grasshopper, a load-testing Python library. Grasshopper uses Locust under the hood, and provides a few key features on top, namely:

Checks
Custom Trends
Timing Thresholds
Timeseries DB reporting and tagging (namely InfluxDB)
Integration with PyTest

An example Grafana dashboard that uses Grasshopper & InfluxDB metrics. To get started with your own dashboard, see the “example” folder in the Grasshopper repository.

Background

Recently, our team was presented with a rare opportunity that SDETs often dream of – a chance to pause and evaluate the tooling we were using for our performance testing. After carefully surveying the available options, we chose Locust as the foundation for our solution. This decision was made with the benefit of hindsight, allowing us to incorporate lessons learned from previous experiences into our tooling strategy.

The story of Grasshopper starts with trying out Locust, discovering the benefits it introduced, and then building on top of its features.

Why Locust?

As a performance team with SDETs, we were interested in Locust for a variety of reasons, including:

it’s Python – Many of our SDETs had expertise in Python and wanted its simplicity and the huge ecosystem of Python packages and tooling: decorators, pytest, and breakpoints, to name a few.
it’s modular – Locust was built with customizability in mind. If we had some test requirement that it did not initially fulfill, we could always add a custom listener or decorator to extend the functionality.
it’s realistic – Locust offered some great features around closely replicating user behavior. For example, one could define a list of tasks for each user to perform, and then put weights on each task to determine how often each will be performed.
we already had a shared API test library in Python – Our quality organization had already developed a library for many of the API actions used in functional tests. If we were able to re-use those functions in our performance tests, then that would be huge, as we could maintain both frameworks at once without any duplicated code.

Immediate Benefits

Our team made a significant breakthrough when we integrated Locust with the shared functional API test library. Though there were some details to iron out, we were ultimately successful in making it work. This achievement had a monumental impact on our team’s efficiency and the overall quality of our work. By leveraging the shared library, we were able to eliminate the need for a dedicated set of API actions for performance testing, freeing up our resources to concentrate on more critical tasks. Additionally, our ability to contribute to the shared library garnered enthusiasm and satisfaction from our colleagues in the quality organization.

But, we needed more… time for some innovation!

Despite our love for Locust, we identified a number of features that more mature tools and frameworks provide and that Locust did not offer out of the box:

Tag-based suites for trend analysis and other situations where we evaluate a proposed change to the product/configuration
Custom trends. For example, say you have some timed action that spans multiple HTTP calls. Locust does not have the functionality to keep track of such a timing, so we added this as a decorator that can be wrapped around the method that you care about.
Checks. Checks validate boolean conditions in the test. For example, a check could validate that a response body has a certain shape. This check is given a name, and is aggregated over time.
Custom tagging for all metrics (checks, HTTP requests, Custom Trends)
Reporting data to our timeseries database (InfluxDB) and dashboards (Grafana). This includes HTTP response times, custom trends, checks, and custom tags.
Thresholds. For example, you may require that the 90th percentile of response times for a certain HTTP request stays below x ms.
Reporting results to other locations (console reports, ReportPortal, Slack, etc.)
And lastly, what every good testing framework needs – some reusable base classes that take care of the majority of the boilerplate that tests often contain
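
To make the custom trend, check, and threshold ideas in the list above more concrete, here is a minimal plain-Python sketch. It does not use Grasshopper's actual API: the names timed_trend, record_check, percentile, and threshold_ok, the in-memory stores, and the nearest-rank percentile method are all assumptions chosen for illustration.

```python
import functools
import math
import time
from collections import defaultdict

# Hypothetical in-memory stores. A real setup would forward these values
# to a listener or a timeseries database instead.
TRENDS = defaultdict(list)  # trend name -> list of durations in ms
CHECKS = defaultdict(lambda: {"passed": 0, "failed": 0})


def timed_trend(name):
    """Decorator that records how long the wrapped callable takes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                TRENDS[name].append(elapsed_ms)
        return wrapper
    return decorator


def record_check(name, condition):
    """Count a named boolean condition as passed or failed."""
    CHECKS[name]["passed" if condition else "failed"] += 1
    return bool(condition)


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list, pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def threshold_ok(name, pct, limit_ms):
    """True when the pct-th percentile of a trend is at most limit_ms."""
    return percentile(TRENDS[name], pct) <= limit_ms


@timed_trend("create_and_fetch_order")
def create_and_fetch_order():
    # Stand-in for a step that would span several HTTP calls.
    time.sleep(0.01)
    response_body = {"id": 1, "status": "created"}
    record_check("order has an id", "id" in response_body)
    return response_body
```

Calling create_and_fetch_order() a few times fills TRENDS and CHECKS, after which threshold_ok("create_and_fetch_order", 90, 500) reports whether the 90th percentile stayed within a 500 ms limit.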

So, we set out to build a package that combines other open source tools with some additional code to add many of these items. Overall, our goal was that you install the package, make a copy of the example test, and start filling in the code that is specific to the thing you want to test. This ended up being a non-trivial amount of work, and we’d like to save you the same labor. So we are announcing Grasshopper: the full(er) performance testing solution written entirely in Python, built on top of Locust and pytest, with many convenience capabilities.

Life After Grasshopper

Since Grasshopper is so easy to get started with, we decided to evangelize the tool directly to our own developers within Alteryx, and we have seen a lot of success in the process. There are now developer teams that use the tool independently and have the flexibility to extend it for their own use cases. This has been a massive win for shifting performance testing left in our software development lifecycle.

As an added bonus, we were able to reuse the API model library built for functional tests, which saved a tremendous amount of work and avoided duplicated code. It also drastically speeds up our ability to reproduce cases that a customer has escalated.

Usage

Installation

This package can be installed via pip:

pip install locust-grasshopper

Future Outlook

In the future, we plan on adding the following functionality to Grasshopper:

Slack reporting
ReportPortal reporting
Prometheus reporting

Contributions

Special thanks to the team that built Grasshopper: Suzanne Ezell, Jacob Fiola, Logan Michalicek, Simon Stratton, Mykola Solopii, Vijaya Ayyappaneni, and Ashwin Chandrasekar. In addition, we would like to thank Laurie Linz. None of this would have been possible without her support and leadership.

We would also like to extend our thanks to the maintainers of Locust and locust-influxdb-listener. Grasshopper could not exist without these packages.

If you have ideas for enhancing or improving Grasshopper, open source contributions are welcome. To get started, check out the Grasshopper “contributing” section in the project README.

Alternative Tools

If Grasshopper doesn’t quite fit your needs, here are some other open source load testing tools we evaluated (definitely not an exhaustive list).

k6 – JavaScript based; the primary reason we did not choose k6 was that test scripts run in a JavaScript engine embedded in k6’s Go runtime rather than in Node.js, so debugging was challenging. This also prevented us from using most npm packages in our performance tests.
Gatling – Scala based, includes a recording feature and a nice GUI for watching live tests; we ruled this out because of a lack of Scala experience.
JMeter – Java based, fairly mature tool, but a major downside is that it uses a fully domain-specific language. In the end, we decided to go with a tool that could share our API library, which was in Python.