🚀

Are we there yet?

Lessons Learnt Performance Testing A Crypto Exchange API

Ben Shi

#dddsydney @hbish | /in/benshi | hbish.com

👋 hi

DDDSydney 2019 Sponsors

About Me

  • 📐⏱ Quality Assurance Engineer, ~3.5 years
  • 🛠🏗 Software Engineer / Consultant, ~7 years
  • 📝🌱 Software Engineering Manager, still growing!

Agenda

  • Boring Bits
  • Lessons Learnt
  • Performance Engineering
  • Wrap up

Quick Survey 🗳

...a testing practice performed to determine how a system performs in terms of responsiveness and stability...

(source: wikipedia)

So like?

slow button bash

fast

faster button bash

faster

fastest button bash

fastest

What flavours does it come in?

  • Load testing
  • Stress testing
  • Soak testing
  • Spike testing
  • Breakpoint testing
  • Configuration testing
  • Isolation testing
  • Internet testing

additional reading

The Art of Application Performance Testing - Ian Molyneaux

What can we measure?

Client

  • Latency/Response Time - Round Trip
  • Throughput

Server

  • Latency/Response Time - Processing Time
  • Throughput
  • Availability
  • Server metrics - CPU, memory, disk I/O, network I/O...
  • Connection pooling
  • Cache hit/miss ratios
  • Queue depth
  • and more...

Crypto Exchange: A Crash Course

Centralized vs Decentralized Exchanges (source: bitcoinwiki)

API

Typical Architecture

+-------------+---------------+
| RPC/Web API | Websocket API |
+----------------------------------------+
| Matching Engine | Asset Management     |
+----------------------------------------+-------------------+       +--------------------+
| Settlement Engine | Account Management | Wallet Management | <---> | Blockchain Network |
+----------------------------------------+---+---------------+       +--------------------+
| KYC Service | MFA/Security Services        |                                |
+--------------------------------------------+                                |
| Networking | Database | Event Bus & MQ     | <------------------------------+
+--------------------------------------------+
| Cold Wallet Storage                        |
+--------------------------------------------+

additional reading: How do cryptocurrency exchanges work? and what technologies are driving disruption - Naveen Saraswat

Let the show begin!

  • Centralised crypto exchange API
  • Order submission/execution
  • Gatling

Panda holding a gatling gun

Gatling

  • non-blocking/asynchronous stack (Scala, Akka, Netty)
  • not just a load runner; tests are scripted using a DSL
  • built-in assertions
  • good calculations and statistics
  • nice reporting
  • comes with a recorder
  • open-source*
  • can run in a distributed fashion and feed results into other performance platforms
gatling scripting
gatling report

🎯

1. Test with a goal in mind

Non-Functional Requirements

Expensive & Time Consuming

Why you no requirements

Baseline!

Baseline

Without data you're just another person with an opinion

- W. Edwards Deming

🔘

2. Consider all the parameters

🙁

POST https://{{host}}/api/order
jwt: abcdefg1234567890
Content-Type: application/json
Accept: application/json
Accept-Charset: utf-8

{
    "symbol": "EthBtc",
    "side": "Buy",
    "order_type": "Limit",
    "time_in_force": "Gtc",
    "quantity": 2,
    "price": 10.3123,
    "new_order_resp_type": "Ack",
    "timestamp": 1562336244
}

Parameters

// Which values should these take?
val price = ???
val quantity = ???

val Side = Array("Buy", "Sell")
val OrderTypes = Array("Market", "Limit", "StopLoss", "StopLossLimit", "TakeProfit", "TakeProfitLimit", "LimitMaker")
val TimeInForce = Array("Gtc", "Ioc")
val Symbols = Array("EthBtc", "EthLtc", "EthUsdt", "EthXrp")

exec(http("""POST /api/order STOPLOSS SELL ${symbol}""")
  .post("/order")
  .body(StringBody(
    """{
               "symbol": "${symbol}",
               "side": "${side}",
               "order_type": "StopLoss",
               "time_in_force": "Gtc",
               "quantity": ${quantity},
               "stop_price": ${stopSellPrice},
               "new_order_resp_type": "Ack",
               "timestamp": ${timestamp}
               }""")).asJson
  .headers(PostHeaders)
  .header("jwt", """${jwt}""")
  .check(status.is(200))
)

😕

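import java.time.Instant
import scala.util.Random

// Feeder with purely random values: the price is unrelated to the market.
val orderParams: Iterator[Map[String, Any]] = Iterator.continually(
  Map(
    "quantity" -> Random.nextDouble() * 100,
    "price" -> Random.nextDouble() * 10,
    "symbol" -> Symbols(Random.nextInt(Symbols.length)),
    "timestamp" -> Instant.now.getEpochSecond,
  )
)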

😃

val orderParams: Iterator[Map[String, Any]] = Iterator.continually(
    elem = {
        val symbol = Random.nextInt(Symbols.length)
        // Prices are anchored to the current market price for the symbol.
        val marketPrice = getMarketPrice(symbol)
        Map(
            "quantity" -> Random.nextDouble() * 100,
            "price" -> marketPrice,
            // Divide by 1000.0: Int / Int would truncate the offset to 0.
            "sellPrice" -> marketPrice * (1 + (Random.nextInt(5) / 1000.0)),
            "buyPrice" -> marketPrice * (1 - (Random.nextInt(5) / 1000.0)),
            "symbol" -> Symbols(symbol),
            "timestamp" -> Instant.now.getEpochSecond,
        )
    }
)
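
A minimal, self-contained sketch of the same idea: buy and sell prices kept within 0.4% of a market price. The `marketPrices` table and its values are made up for illustration; a real test would look the price up from the exchange. The offset has to be computed in floating point, because `Int / Int` truncates to zero in Scala.

```scala
import scala.util.Random

object OrderPrices {
  // Hypothetical stand-in for a market data lookup; the prices are invented.
  val marketPrices: Map[String, Double] = Map("EthBtc" -> 0.0312, "EthUsdt" -> 288.5)

  /** Returns (buyPrice, sellPrice), each at most 0.4% away from the market price. */
  def quote(symbol: String, rng: Random): (Double, Double) = {
    val market = marketPrices(symbol)
    // nextInt(5) yields 0..4, so the offsets range from 0.000 to 0.004.
    val buyOffset = rng.nextInt(5) / 1000.0
    val sellOffset = rng.nextInt(5) / 1000.0
    (market * (1 - buyOffset), market * (1 + sellOffset))
  }
}
```

Passing in a seeded `Random` makes a run reproducible, which helps when comparing results between test runs.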

💉

3. Functional Testing on Steroids

Add some asserts!

  • response code
  • check orders matched and reconcile with user balance
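
One way the reconciliation check could look, as a sketch only. The `Fill` shape is an assumption rather than the exchange's real schema, and fees are ignored.

```scala
object Reconcile {
  // Hypothetical shape of a matched order (fill); not the real API's schema.
  final case class Fill(side: String, quantity: BigDecimal, price: BigDecimal)

  /** Expected change in (base, quote) balances after the given fills, ignoring fees. */
  def expectedDelta(fills: Seq[Fill]): (BigDecimal, BigDecimal) =
    fills.foldLeft((BigDecimal(0), BigDecimal(0))) { case ((base, quote), f) =>
      val notional = f.quantity * f.price
      if (f.side == "Buy") (base + f.quantity, quote - notional)
      else (base - f.quantity, quote + notional)
    }

  /** True when the balance change observed over the run matches the fills. */
  def reconciles(
      before: (BigDecimal, BigDecimal),
      after: (BigDecimal, BigDecimal),
      fills: Seq[Fill]
  ): Boolean = {
    val (dBase, dQuote) = expectedDelta(fills)
    after._1 - before._1 == dBase && after._2 - before._2 == dQuote
  }
}
```

`BigDecimal` is used instead of `Double` so that the comparison is exact.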

🤖

Trading Bots

additional functional tests while under load

🛂

4. Trust, but Verify (Developers)

Developers' performance intuitions are often wrong

Myself included

Things Devs might say

  • "Get a bigger instance with more RAM/CPU"
  • "Just add more instances"
  • "It must be the database"
  • "Non-issue, it hasn't crashed yet"
  • "Must be an environment issue"
  • "Your tests are wrong"
Performance Test Result

🧐

1% tile: 974.0 (ns)
5% tile: 1075.0 (ns)
10% tile: 2292.0 (ns)
25% tile: 2695.0 (ns)
50% tile: 3671.0 (ns)
75% tile: 10440.0 (ns)
90% tile: 10091923.799999999 (ns)
99% tile: 68835579.60000025 (ns)

Things Devs might do

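use std::time::Instant;

// Times only the call itself, not the time a request spent waiting to be sent.
let start = Instant::now();
submit_order();
let duration = start.elapsed();

println!("Time elapsed in submit_order() is: {:?}", duration);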

Coordinated Omission Problem

additional reading: "How NOT to Measure Latency" by Gil Tene
People Buying Coffee
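
A small simulation of the effect, in virtual time (nothing actually sleeps). It assumes a single blocking client that intends to send one request every `intervalMs` but can only send the next one once the previous one has returned. The naive latency is what a stopwatch around the call would record; the corrected latency is measured from when the request should have been sent.

```scala
object CoordinatedOmission {
  /** Returns (naive, corrected) latencies in ms for one blocking client. */
  def latencies(serviceTimesMs: Seq[Long], intervalMs: Long): (Seq[Long], Seq[Long]) = {
    val (_, samples) =
      serviceTimesMs.zipWithIndex.foldLeft((0L, Vector.empty[(Long, Long)])) {
        case ((freeAt, acc), (service, i)) =>
          val intendedStart = i * intervalMs                // when the request should go out
          val actualStart = math.max(freeAt, intendedStart) // delayed if the client is still busy
          val finish = actualStart + service
          (finish, acc :+ ((finish - actualStart, finish - intendedStart)))
      }
    samples.unzip
  }
}
```

With service times of 1, 1, 100, 1, 1, 1 ms and a 10 ms interval, the naive view records a single slow request (1, 1, 100, 1, 1, 1), while the corrected view shows that the three requests queued behind it were slow too (1, 1, 100, 91, 82, 73).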

Application Performance Monitoring Metrics

+

Log Aggregation w/ Correlation IDs

=

Observability

⏱ Collect Data

🍾 Find the bottleneck

🛠 Fix it

♻ Repeat

Without data you're just another person with an opinion.

- W. Edwards Deming

📉 📈

5. Statistics Lie

Performance Test Result

Ignore

❌ Mean

❌ Median

❌ Standard Deviation

Anscombe's Quartet

Anscombe's Quartet - Data
Anscombe's Quartet - Graph

Look at

✅ Max

✅ Percentiles

Good Performance Test Result
Bad Performance Test Result
Percentile Plot
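
To make the point concrete, here is a sketch that computes the mean next to nearest-rank percentiles. The nearest-rank method is one common definition among several, and the sample in the note below is invented.

```scala
object LatencyStats {
  /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
  def percentile(samples: Seq[Double], p: Double): Double = {
    require(samples.nonEmpty && p > 0 && p <= 100, "need samples and 0 < p <= 100")
    val sorted = samples.sorted
    val rank = math.ceil(p * sorted.length / 100.0).toInt
    sorted(rank - 1)
  }

  def mean(samples: Seq[Double]): Double = samples.sum / samples.length
}
```

For 98 requests at 10 ms and 2 requests at 5000 ms, the mean is 109.8 ms and the median is 10 ms, yet the 99th percentile and the max are both 5000 ms. Neither the mean nor the median reveals that two users waited five seconds.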

🔎

6. Scaling your performance test

Focus on the actual test

Scaling your test comes later

Paid Performance Testing Platforms

Gatling Frontline

BlazeMeter

Flood.io

DIY (The Hard Way)

                       +---------------+
                       | Gatling Image |
                       +---------------+
                              |
                              v
+----------------+      +-----------+
| CloudFormation | ---> | ECS Tasks |
+----------------+      +-----------+
                              |
                              |    +---------------------------+     +-----------------------------+
                              +--> | Save simulation.log to S3 | --> | Aggregate & generate report |
                                   +---------------------------+     +-----------------------------+

Performance Engineering

Quickfire Edition

What about Microservices/Serverless?

  • Watch for timeouts
  • Perform spike tests to determine time taken to scale

What about GraphQL?

  • It's possible, but more complicated
  • Different combinations of fields and fragments
  • Make sure you have sensible request tracing

Many birds, one stone

Reusable Tests. Framework selection is critical!


E2E test in Dev/Test Deployment

./gradlew gatlingRun -DbaseUrl="http://dev.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=300

Perf Test

./gradlew gatlingRun -DbaseUrl="http://perf.env:80/api" -DnumberOfUser=2000 -DrunDurationSecs=3600

Smoke Test in Production

./gradlew gatlingRun -DbaseUrl="http://prod.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=10
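
A sketch of how a simulation might pick up these `-D` values, assuming the build forwards them to the JVM that runs the simulation. The property names come from the commands above; the `Settings` type and the fallback defaults are assumptions.

```scala
object SimulationConfig {
  final case class Settings(baseUrl: String, numberOfUser: Int, runDurationSecs: Int)

  /** Reads the run parameters from system properties, with small defaults for local runs. */
  def load(props: scala.collection.Map[String, String] = sys.props): Settings =
    Settings(
      baseUrl = props.getOrElse("baseUrl", "http://localhost:8080/api"), // assumed default
      numberOfUser = props.get("numberOfUser").map(_.toInt).getOrElse(1),
      runDurationSecs = props.get("runDurationSecs").map(_.toInt).getOrElse(10)
    )
}
```

Keeping every environment-specific value in one place is what lets the same scenario serve as an E2E test, a performance test and a smoke test.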

CI/CD

  • Run a small subset of your performance testing suite as part of your pipeline
  • Monitor the build time and capture performance metrics
  • Fail the build or raise an alert for any execution that regresses by more than n%
  • Run performance tests early in the SDLC and as often as possible
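
A threshold gate for a pipeline could be as small as the sketch below. It assumes a baseline figure (for example a 99th percentile in ms) has been stored from an earlier run; all names here are illustrative.

```scala
object PerfGate {
  /** Percentage change of the current value relative to the baseline (positive means slower). */
  def regressionPct(baselineMs: Double, currentMs: Double): Double = {
    require(baselineMs > 0, "baseline must be positive")
    (currentMs - baselineMs) / baselineMs * 100.0
  }

  /** True when the current run is within the allowed regression. */
  def passes(baselineMs: Double, currentMs: Double, maxRegressionPct: Double): Boolean =
    regressionPct(baselineMs, currentMs) <= maxRegressionPct
}
```

With a 200 ms baseline and a 10% allowance, a run at 210 ms passes and a run at 230 ms fails the build.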

It doesn't take a lot to cause an outage

A regular expression that backtracked enormously and exhausted CPU used for HTTP/HTTPS serving.
CloudFlare CPU Goes Boom
Introduce performance profiling for all rules to the test suite. (ETA: July 19)

Cloudflare Outage July 2019 - https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/

Test in Production?

Yes, but only if you can clean up the data

Otherwise, test in a separate environment with configuration similar to prod

Takeaways

  • Performance Testing != Non-Functional Testing
  • Know your parameters and limits upfront
  • Test with the end in mind
  • Forgo any assumptions and verify using performance metrics
  • Stats Lie!

Awesome Performance

https://github.com/hbish/awesome-performance

✌️

Thanks


Slide deck: https://dddsydney2019.hbish.com/

@hbish | /in/benshi | hbish.com