🚀
Are we there yet?
Lessons Learnt Performance Testing A Crypto Exchange API
Ben Shi
#dddsydney @hbish | /in/benshi | hbish.com
👋 hi
![DDDSydney 2019 Sponsors](ff63058d1e61818b4a401125030e4d18.webp)
![Ben Shi](35c700cd98e9b05a67540aef9c81ef12.webp)
About Me
Agenda
- Boring Bits
- Lessons Learnt
- Performance Engineering
- Wrap up
Quick Survey 🗳
...a testing practice performed to determine how a system performs in terms of responsiveness and stability...
(source: Wikipedia)
So like?
![slow button bash](8bdec1a2ba76450b059cc654dd522194.webp)
fast
![faster button bash](cb24ad39be64294080d903cc3b4c9512.webp)
faster
![fastest button bash](e3278cd74bb7eddfdf33156905caab47.webp)
fastest
What flavours does it come in?
- Load testing
- Stress testing
- Soak testing
- Spike testing
- Breakpoint testing
- Configuration testing
- Isolation testing
- Internet testing
additional reading
The Art of Application Performance Testing - Ian Molyneaux
What can we measure?
Client
- Latency/Response Time - Round Trip
- Throughput
Server
- Latency/Response Time - Processing Time
- Throughput
- Availability
- Server metrics - CPU, memory, disk I/O, network I/O...
- Connection pooling
- Cache hit/miss ratios
- Queue depth
- and more...
Crypto Exchange: A Crash Course
![Centralized vs Decentralized Exchanges](1636c5bf853d36384a6f9538abc5fd53.webp)
API
![Centralized vs Decentralized Exchanges](b19a544e3785b04e779a84d26412a372.webp)
Typical Architecture
+-------------+---------------+
| RPC/Web API | Websocket API |
+----------------------------------------+
| Matching Engine | Asset Management     |
+----------------------------------------+-------------------+       +--------------------+
| Settlement Engine | Account Management | Wallet Management | <---> | Blockchain Network |
+----------------------------------------+---+---------------+       +--------------------+
| KYC Service | MFA/Security Services        |                                |
+--------------------------------------------+                                |
| Networking | Database | Event Bus & MQ     | <------------------------------+
+--------------------------------------------+
| Cold Wallet Storage                        |
+--------------------------------------------+
additional reading: How do cryptocurrency exchanges work? and what technologies are driving disruption - Naveen Saraswat
Let the show begin!
- Centralised crypto exchange API
- Order submission/execution
- Gatling
![Panda holding a gatling gun](1da5fc1621b2dbb199f15112cae7ef5d.webp)
Gatling
- non-blocking/asynchronous stack (Scala, Akka, Netty)
- not just a load runner, can be scripted using a DSL
- built-in assertions
- good calculation and statistics
- nice reporting
- comes with a recorder
- open-source*
- can run in a distributed fashion and feed into other performance platforms
![gatling scripting](bbc8b7224fc2093f8f82d129a85112c3.webp)
![gatling report](cd3f91321e0f0db1eb032d53613c3e1e.webp)
🎯
1. Test with a goal in mind
Non-Functional Requirements
Expensive & Time Consuming
![Why you no requirements](de155757a48c6275fe60518d1404acec.webp)
Baseline!
![Baseline](ac5708ba4915b0da051ee5f0bdce5a54.webp)
Without data you're just another person with an opinion
- W. Edwards Deming
🔘
2. Consider all the parameters
🙁
POST https://{{host}}/api/order
jwt: abcdefg1234567890
Content-Type: application/json
Accept: application/json
Accept-Charset: utf-8

{
  "symbol": "EthBtc",
  "side": "Buy",
  "order_type": "Limit",
  "time_in_force": "Gtc",
  "quantity": 2,
  "price": 10.3123,
  "new_order_resp_type": "Ack",
  "timestamp": 1562336244
}
Parameters
val price = ???     // open question: which values are realistic?
val quantity = ???  // open question: which values are realistic?
val Side = Array("Buy", "Sell")
val OrderTypes = Array("Market", "Limit", "StopLoss", "StopLossLimit", "TakeProfit", "TakeProfitLimit", "LimitMaker")
val TimeInForce = Array("Gtc", "Ioc")
val Symbols = Array("EthBtc", "EthLtc", "EthUsdt", "EthXrp")
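To get a feel for how quickly the input space grows, the enumerated values above can be multiplied out. This is a minimal sketch in plain Scala with no Gatling dependency. The `OrderParameterSpace` object and `OrderCombination` case class are illustrative names that are not part of the original test suite, and a real exchange may well reject some of these combinations.

```scala
object OrderParameterSpace {
  // Values copied from the slide above.
  val Side: Array[String] = Array("Buy", "Sell")
  val OrderTypes: Array[String] =
    Array("Market", "Limit", "StopLoss", "StopLossLimit", "TakeProfit", "TakeProfitLimit", "LimitMaker")
  val TimeInForce: Array[String] = Array("Gtc", "Ioc")
  val Symbols: Array[String] = Array("EthBtc", "EthLtc", "EthUsdt", "EthXrp")

  // Hypothetical holder for one combination of the enumerated fields.
  final case class OrderCombination(side: String, orderType: String, timeInForce: String, symbol: String)

  // Cartesian product of every enumerated field.
  val all: List[OrderCombination] = for {
    side      <- Side.toList
    orderType <- OrderTypes.toList
    tif       <- TimeInForce.toList
    symbol    <- Symbols.toList
  } yield OrderCombination(side, orderType, tif, symbol)

  def main(args: Array[String]): Unit =
    println(s"${all.size} combinations before price and quantity are even considered")
}
```

Price and quantity are continuous, so they multiply this space further; that is where a feeder comes in.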
exec(http("""POST /api/order STOPLOSS SELL ${symbol}""")
  .post("/order")
  .body(StringBody(
    """{
      "symbol": "${symbol}",
      "side": "${side}",
      "order_type": "StopLoss",
      "time_in_force": "Gtc",
      "quantity": ${quantity},
      "stop_price": ${stopSellPrice},
      "new_order_resp_type": "Ack",
      "timestamp": ${timestamp}
    }""")).asJson
  .headers(PostHeaders)
  .header("jwt", """${jwt}""")
  .check(status.is(200))
)
😕
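import java.time.Instant
import scala.util.Random

// Fully random values: easy to write, but prices land nowhere near the market.
val orderParams: Iterator[Map[String, Any]] = Iterator.continually(
  Map(
    "quantity" -> Random.nextDouble() * 100,
    "price" -> Random.nextDouble() * 10,
    "symbol" -> Symbols(Random.nextInt(Symbols.length)),
    "timestamp" -> Instant.now.getEpochSecond,
  )
)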
😃
val orderParams: Iterator[Map[String, Any]] = Iterator.continually {
  val symbolIndex = Random.nextInt(Symbols.length)
  val marketPrice = getMarketPrice(symbolIndex)
  Map(
    "quantity" -> Random.nextDouble() * 100,
    "price" -> marketPrice,
    // Divide by 1000.0: integer division would always yield 0 and collapse the spread.
    "sellPrice" -> marketPrice * (1 + Random.nextInt(5) / 1000.0),
    "buyPrice" -> marketPrice * (1 - Random.nextInt(5) / 1000.0),
    "symbol" -> Symbols(symbolIndex),
    "timestamp" -> Instant.now.getEpochSecond,
  )
}
💉
3. Functional Testing on Steroids
Add some asserts!
- response code
- check orders matched and reconcile with user balance
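One way to picture the reconciliation check is to replay the fills a test user received and compare the result with the balance the exchange reports. The sketch below is plain Scala under loud assumptions: `Fill`, `Balance` and `BalanceReconciliation` are hypothetical shapes, fees are ignored, and a real exchange API will expose different fields.

```scala
object BalanceReconciliation {
  // Hypothetical shapes; not taken from any real exchange API.
  final case class Fill(side: String, quantity: BigDecimal, price: BigDecimal)
  final case class Balance(base: BigDecimal, quote: BigDecimal)

  // Replays the fills over the starting balance. Fees are ignored in this sketch.
  def expectedBalance(start: Balance, fills: Seq[Fill]): Balance =
    fills.foldLeft(start) { (balance, fill) =>
      val notional = fill.quantity * fill.price
      fill.side match {
        case "Buy"  => Balance(balance.base + fill.quantity, balance.quote - notional)
        case "Sell" => Balance(balance.base - fill.quantity, balance.quote + notional)
        case other  => throw new IllegalArgumentException(s"Unknown side: $other")
      }
    }

  // True when the balance reported by the exchange matches the replayed fills.
  def reconciles(start: Balance, fills: Seq[Fill], reported: Balance): Boolean = {
    val expected = expectedBalance(start, fills)
    expected.base == reported.base && expected.quote == reported.quote
  }

  def main(args: Array[String]): Unit = {
    val start = Balance(base = BigDecimal(0), quote = BigDecimal(100))
    val fills = Seq(
      Fill("Buy", BigDecimal(2), BigDecimal("10.5")),
      Fill("Sell", BigDecimal(1), BigDecimal(11))
    )
    println(expectedBalance(start, fills))
  }
}
```

`BigDecimal` is used instead of `Double` so that the comparison is not thrown off by floating-point rounding.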
🤖
Trading Bots
additional functional tests while under load
🛂
4. Trust, but Verify (Developers)
Developers' performance intuitions are often wrong
Myself included
Things Devs might say
- "Get a bigger instance with more ram/cpu"
- "Just add more instances"
- "It must be the database"
- "Non-issue, it hasn't crashed yet"
- "Must be an environment issue"
- "Your tests are wrong"
![Performance Test Result](e30d090ba03709354ad4599572c8c8aa.webp)
🧐
1% tile: 974.0 (ns)
5% tile: 1075.0 (ns)
10% tile: 2292.0 (ns)
25% tile: 2695.0 (ns)
50% tile: 3671.0 (ns)
75% tile: 10440.0 (ns)
90% tile: 10091923.799999999 (ns)
99% tile: 68835579.60000025 (ns)
Things Devs might do
use std::time::Instant;

let start = Instant::now();
submit_order();
let duration = start.elapsed();
println!("Time elapsed in submit_order() is: {:?}", duration);
Coordinated Omission Problem
additional reading: "How NOT to Measure Latency" by Gil Tene
![People Buying Coffee](89ffe04971f990fae538f1aaa2b69c49.webp)
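Coordinated omission happens when the load generator waits for a slow response before sending the next request: the requests that should have gone out during the stall are never sent on time, so their waiting is missing from the numbers. The sketch below is a deterministic simulation in plain Scala, not a real load generator. The `CoordinatedOmission` object, the single-client model and the sample service times are all assumptions made for illustration.

```scala
object CoordinatedOmission {
  // What a naive timer reports vs. what a caller arriving on schedule would have experienced.
  final case class Sample(naiveMs: Long, correctedMs: Long)

  // Simulates a single-threaded client that intends to send one request every
  // `intervalMs`, but cannot send while the previous request is still in flight.
  def simulate(serviceTimesMs: Seq[Long], intervalMs: Long): List[Sample] = {
    var clientFreeAt = 0L
    val samples = List.newBuilder[Sample]
    serviceTimesMs.zipWithIndex.foreach { case (serviceMs, i) =>
      val intendedStart = i * intervalMs
      val actualStart = math.max(intendedStart, clientFreeAt)
      val finish = actualStart + serviceMs
      clientFreeAt = finish
      samples += Sample(
        naiveMs = finish - actualStart,      // timer started when the request was finally sent
        correctedMs = finish - intendedStart // timer started when it should have been sent
      )
    }
    samples.result()
  }

  def main(args: Array[String]): Unit = {
    // Made-up service times: one 1000 ms stall among 10 ms responses, one request every 100 ms.
    simulate(Seq(10L, 1000L, 10L, 10L, 10L), intervalMs = 100L).foreach(println)
  }
}
```

In this made-up run the naive timer reports a single slow request, while the corrected view shows that every request scheduled during the stall was slow as well.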
Application Performance Monitoring Metrics
+
Log Aggregation w/ Correlation IDs
=
Observability
⏱ Collect Data
🍾 Find the bottleneck
🛠 Fix it
♻ Repeat
Without data you're just another person with an opinion.
- W. Edwards Deming
📉 📈
5. Statistics Lie
![Performance Test Result](e30d090ba03709354ad4599572c8c8aa.webp)
Ignore
❌ Mean
❌ Median
❌ Standard Deviation
Anscombe's Quartet
![Anscombe's Quartet - Data](524927c34d3bd374a42de0b6f886c19b.webp)
![Anscombe's Quartet - Graph](0496da71d504edce239c7ec4571e5732.webp)
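The point of the quartet can be checked numerically. The sketch below uses the y-values of Anscombe's published 1973 dataset, typed in here rather than read from the slide images, and computes the summary statistics with plain Scala. `AnscombeSummary` is an illustrative name.

```scala
object AnscombeSummary {
  // y-values of the four Anscombe datasets.
  val y1: Vector[Double] = Vector(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)
  val y2: Vector[Double] = Vector(9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74)
  val y3: Vector[Double] = Vector(7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73)
  val y4: Vector[Double] = Vector(6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89)
  val datasets: Vector[Vector[Double]] = Vector(y1, y2, y3, y4)

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Unbiased sample variance (divides by n - 1).
  def sampleVariance(xs: Seq[Double]): Double = {
    val m = mean(xs)
    xs.map(x => (x - m) * (x - m)).sum / (xs.size - 1)
  }

  def main(args: Array[String]): Unit =
    datasets.zipWithIndex.foreach { case (ys, i) =>
      println(f"y${i + 1}: mean=${mean(ys)}%.2f variance=${sampleVariance(ys)}%.2f max=${ys.max}%.2f")
    }
}
```

The means and variances agree to two decimal places across all four sets, yet the maxima differ, which is the same trap a latency report falls into when it only shows averages.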
Look at
✅ Max
✅ Percentiles
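As a rough illustration of why the max and the upper percentiles matter, the sketch below summarises a made-up latency sample with the nearest-rank percentile method. `LatencySummary` and the sample values are assumptions for illustration; Gatling and other tools may compute percentiles with a different method.

```scala
object LatencySummary {
  // Nearest-rank percentile over an already sorted sample; p is in (0, 100].
  def percentile(sorted: IndexedSeq[Double], p: Double): Double = {
    require(sorted.nonEmpty, "need at least one sample")
    require(p > 0 && p <= 100, "p must be in (0, 100]")
    val rank = math.ceil(p * sorted.length / 100.0).toInt
    sorted(math.max(rank, 1) - 1)
  }

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Made-up sample: 98 fast responses and 2 slow ones, in milliseconds.
  val latenciesMs: Vector[Double] = (Vector.fill(98)(1.0) ++ Vector.fill(2)(1000.0)).sorted

  def main(args: Array[String]): Unit = {
    println(f"mean   = ${mean(latenciesMs)}%.2f ms")
    println(f"median = ${percentile(latenciesMs, 50)}%.2f ms")
    println(f"p99    = ${percentile(latenciesMs, 99)}%.2f ms")
    println(f"max    = ${latenciesMs.max}%.2f ms")
  }
}
```

Here the mean describes a response time that no request actually had, and the median hides the slow requests entirely; only the p99 and the max surface them.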
![Good Performance Test Result](e30d090ba03709354ad4599572c8c8aa.webp)
![Bad Performance Test Result](1aff5307ef66ec60e628c73871e37b59.webp)
![Percentile Plot](ff4a6943417491645b02aaf730954288.webp)
🔎
6. Scaling your performance test
Focus on the actual test
Scaling your test comes later
Paid Performance Testing Platforms
Gatling Frontline
BlazeMeter
Flood.io
DIY (The Hard Way)
+---------------+
| Gatling Image |
+---------------+
|
v
+----------------+ +-----------+
| Cloudformation | ---> | ECS Tasks |
+----------------+ +-----------+
|
| +---------------------------+ +-----------------------------+
+--> | Save simulation.log to S3 | --> | Aggregate & generate report |
+---------------------------+ +-----------------------------+
⏱
Performance Engineering
Quickfire Edition
What about Microservices/Serverless?
- Watch for timeouts
- Perform spike tests to determine time taken to scale
What about GraphQL?
- It's possible, but more complicated
- Different combinations of fields and fragments
- Make sure you have sensible request tracing
Many birds, one stone
Reusable Tests. Framework selection is critical!
E2E test in Dev/Test Deployment
./gradlew gatlingRun -DbaseUrl="http://dev.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=300
Perf Test
./gradlew gatlingRun -DbaseUrl="http://perf.env:80/api" -DnumberOfUser=2000 -DrunDurationSecs=3600
Smoke Test in Production
./gradlew gatlingRun -DbaseUrl="http://prod.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=10
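For the same simulation to serve all three purposes, it has to read its settings from the `-D` properties shown above. The sketch below is one way that could look in plain Scala. The property names come from the commands above, while `SimulationConfig`, the default values, and the assumption that the build forwards `-D` properties to the simulation JVM are mine.

```scala
object SimulationConfig {
  // Settings a simulation would need; field names mirror the -D properties above.
  final case class Config(baseUrl: String, numberOfUser: Int, runDurationSecs: Int)

  // Reads settings from a property map, falling back to hypothetical defaults
  // that are safe for a local run.
  def fromProps(props: collection.Map[String, String]): Config = Config(
    baseUrl = props.getOrElse("baseUrl", "http://localhost:8080/api"),
    numberOfUser = props.get("numberOfUser").map(_.toInt).getOrElse(1),
    runDurationSecs = props.get("runDurationSecs").map(_.toInt).getOrElse(10)
  )

  def main(args: Array[String]): Unit =
    // sys.props exposes the JVM system properties, including any -D flags.
    println(fromProps(sys.props))
}
```

Taking the map as a parameter rather than reading `sys.props` directly keeps the parsing easy to unit test.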
CI/CD
- Run a small subset of your performance testing suite as part of your pipeline
- Monitor the build time and capture performance metrics
- Fail the build, or raise an alert, for any execution that regresses by more than n%
- Run performance tests early in the SDLC and as often as possible
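The "fail on regression" rule can be as small as a comparison against the last accepted baseline. A minimal sketch in plain Scala follows. `RegressionGate`, the choice of p99 as the tracked metric, and the numbers are illustrative assumptions, and how the baseline is stored is left to the pipeline.

```scala
object RegressionGate {
  // Percentage change of `currentMs` relative to `baselineMs`; positive means slower.
  def regressionPercent(baselineMs: Double, currentMs: Double): Double = {
    require(baselineMs > 0, "baseline must be positive")
    (currentMs - baselineMs) * 100.0 / baselineMs
  }

  def exceedsThreshold(baselineMs: Double, currentMs: Double, thresholdPercent: Double): Boolean =
    regressionPercent(baselineMs, currentMs) > thresholdPercent

  def main(args: Array[String]): Unit = {
    // Made-up numbers: p99 latency from the last accepted build vs. this build.
    val baselineP99Ms = 200.0
    val currentP99Ms = 230.0
    val thresholdPercent = 10.0
    if (exceedsThreshold(baselineP99Ms, currentP99Ms, thresholdPercent)) {
      println(f"p99 regressed by ${regressionPercent(baselineP99Ms, currentP99Ms)}%.1f%%, failing the build")
      sys.exit(1) // a non-zero exit code fails most CI steps
    }
  }
}
```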
It doesn't take a lot to cause an outage
A regular expression that backtracked enormously and exhausted CPU used for HTTP/HTTPS serving.
![CloudFlare CPU Goes Boom](29a020500649c82153e1d901eb150117.webp)
Introduce performance profiling for all rules to the test suite. (ETA: July 19)
Cloudflare Outage July 2019 - https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/
Test in Production?
Yes, but only if you can clean up the data
Otherwise, test in a separate environment with a configuration similar to prod
Takeaways
- Performance Test != Non-Functional Testing
- Know your parameters and limits upfront
- Test with the end in mind
- Forgo any assumptions and verify using performance metrics
- Stats Lie!
Awesome Performance
✌️
Thanks
Slide deck: https://dddsydney2019.hbish.com/