🚀
Are we there yet?
Lessons Learnt Performance Testing A Crypto Exchange API
Ben Shi
#dddsydney @hbish | /in/benshi | hbish.com
👋 hi
About Me
Agenda
- Boring Bits
- Lessons Learnt
- Performance Engineering
- Wrap up
Quick Survey 🗳
...a testing practice performed to determine how a system performs in terms of responsiveness and stability...
(source: Wikipedia)
So like?
fast
faster
fastest
What flavours does it come in?
- Load testing
- Stress testing
- Soak testing
- Spike testing
- Breakpoint testing
- Configuration testing
- Isolation testing
- Internet testing
additional reading
The Art of Application Performance Testing - Ian Molyneaux
What can we measure?
Client
- Latency/Response Time - Round Trip
- Throughput
Server
- Latency/Response Time - Processing Time
- Throughput
- Availability
- Server metrics - CPU, memory, disk I/O, network I/O...
- Connection pooling
- Cache hit/miss ratios
- Queue depth
- and more...
Crypto Exchange: A Crash Course
API
Typical Architecture
+-------------+---------------+
| RPC/Web API | Websocket API |
+----------------------------------------+
| Matching Engine | Asset Management |
+----------------------------------------+-------------------+ +--------------------+
| Settlement Engine | Account Management | Wallet Management | <---> | Blockchain Network |
+----------------------------------------+---+---------------+ +--------------------+
| KYC Service | MFA/Security Services | |
+--------------------------------------------+ |
| Networking | Database | Event Bus & MQ | <------------------------------+
+--------------------------------------------+
| Cold Wallet Storage |
+--------------------------------------------+
additional reading: How do cryptocurrency exchanges work? and what technologies are driving disruption - Naveen Saraswat
Let the show begin!
- Centralised crypto exchange API
- Order submission/execution
- Gatling
Gatling
- non-blocking/asynchronous stack (Scala, Akka, Netty)
- not just a load runner: tests are scripted using a DSL
- built-in assertions
- good calculations and statistics
- nice reporting
- comes with a recorder
- open-source*
- can run in a distributed fashion and feed results into other performance platforms
🎯
1. Test with a goal in mind
Non-Functional Requirements
Expensive & Time Consuming
Baseline!
Without data you're just another person with an opinion
- W. Edwards Deming
🔘
2. Consider all the parameters
🙁
POST https://{{host}}/api/order
jwt: abcdefg1234567890
Content-Type: application/json
Accept: application/json
Accept-Charset: utf-8
{
  "symbol": "EthBtc",
  "side": "Buy",
  "order_type": "Limit",
  "time_in_force": "Gtc",
  "quantity": 2,
  "price": 10.3123,
  "new_order_resp_type": "Ack",
  "timestamp": 1562336244
}
Parameters
val price = ??? // which values are realistic?
val quantity = ??? // which values are realistic?
val Side = Array("Buy", "Sell")
val OrderTypes = Array("Market", "Limit", "StopLoss", "StopLossLimit", "TakeProfit", "TakeProfitLimit", "LimitMaker")
val TimeInForce = Array("Gtc", "Ioc")
val Symbols = Array("EthBtc", "EthLtc", "EthUsdt", "EthXrp")
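To get a feel for how quickly the parameter space grows, the categorical arrays above can be enumerated as a Cartesian product. This is a minimal plain-Scala sketch with no Gatling dependency; `OrderCombo` and `ParameterSpace` are names made up for illustration, and it makes no claim that every combination is a valid order on a real exchange.

```scala
// Illustrative only: OrderCombo and ParameterSpace are hypothetical names,
// not part of Gatling or of the exchange API.
final case class OrderCombo(symbol: String, side: String, orderType: String, timeInForce: String)

object ParameterSpace {
  val Side = Array("Buy", "Sell")
  val OrderTypes = Array("Market", "Limit", "StopLoss", "StopLossLimit", "TakeProfit", "TakeProfitLimit", "LimitMaker")
  val TimeInForce = Array("Gtc", "Ioc")
  val Symbols = Array("EthBtc", "EthLtc", "EthUsdt", "EthXrp")

  // Cartesian product of the four categorical parameters.
  val allCombos: Seq[OrderCombo] =
    for {
      symbol    <- Symbols.toSeq
      side      <- Side.toSeq
      orderType <- OrderTypes.toSeq
      tif       <- TimeInForce.toSeq
    } yield OrderCombo(symbol, side, orderType, tif)

  def main(args: Array[String]): Unit =
    println(s"${allCombos.size} combinations before price and quantity are considered")
}
```

Price and quantity are continuous, so they multiply this space further. That is why choosing realistic ranges for them matters.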
exec(http("""POST /api/order STOPLOSS SELL ${symbol}""")
  .post("/order")
  .body(StringBody(
    """{
      "symbol": "${symbol}",
      "side": "${side}",
      "order_type": "StopLoss",
      "time_in_force": "Gtc",
      "quantity": ${quantity},
      "stop_price": ${stopSellPrice},
      "new_order_resp_type": "Ack",
      "timestamp": ${timestamp}
    }""")).asJson
  .headers(PostHeaders)
  .header("jwt", """${jwt}""")
  .check(status.is(200))
)
😕
val orderParams: Iterator[Map[String, Any]] = Iterator.continually(
  Map(
    "quantity" -> Random.nextDouble() * 100,
    "price" -> Random.nextDouble() * 10,
    "symbol" -> Symbols(Random.nextInt(Symbols.length)),
    "timestamp" -> Instant.now.getEpochSecond,
  )
)
😃
val orderParams: Iterator[Map[String, Any]] = Iterator.continually(
  elem = {
    val symbol = Random.nextInt(Symbols.length)
    val marketPrice = getMarketPrice(symbol)
    Map(
      "quantity" -> Random.nextDouble() * 100,
      "price" -> marketPrice,
      // Divide by 1000.0: integer division would always yield 0 and remove the spread
      "sellPrice" -> marketPrice * (1 + (Random.nextInt(5) / 1000.0)),
      "buyPrice" -> marketPrice * (1 - (Random.nextInt(5) / 1000.0)),
      "symbol" -> Symbols(symbol),
      "timestamp" -> Instant.now.getEpochSecond,
    )
  }
)
💉
3. Functional Testing on Steroids
Add some asserts!
- response code
- check orders matched and reconcile with user balance
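The reconciliation check could look roughly like the sketch below: replay the matched fills against a starting balance and compare the result with what the exchange reports. `Fill`, `Balances` and `Reconciler` are hypothetical types invented for this example. Fees are ignored, and for a pair such as EthBtc "base" is assumed to mean ETH and "quote" to mean BTC.

```scala
// Illustrative only: Fill, Balances and Reconciler are hypothetical types,
// not the exchange's real data model. Fees are deliberately ignored.
final case class Fill(side: String, quantity: BigDecimal, price: BigDecimal)
final case class Balances(base: BigDecimal, quote: BigDecimal)

object Reconciler {
  // Apply each matched fill to the starting balances.
  def expectedBalances(start: Balances, fills: Seq[Fill]): Balances =
    fills.foldLeft(start) { (bal, fill) =>
      val notional = fill.quantity * fill.price
      fill.side match {
        case "Buy"  => Balances(bal.base + fill.quantity, bal.quote - notional)
        case "Sell" => Balances(bal.base - fill.quantity, bal.quote + notional)
        case other  => throw new IllegalArgumentException(s"unknown side: $other")
      }
    }

  // True when the balance reported by the exchange matches what the fills imply.
  def reconciles(start: Balances, fills: Seq[Fill], reported: Balances): Boolean =
    expectedBalances(start, fills) == reported
}
```

`BigDecimal` is used instead of `Double` so that the comparison is not thrown off by floating-point rounding.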
🤖
Trading Bots
additional functional tests while under load
🛂
4. Trust, but Verify (Developers)
Developers' performance intuitions are often wrong
Myself included
Things Devs might say
- "Get a bigger instance with more ram/cpu"
- "Just add more instances"
- "It must be the database"
- "Non-issue, it hasn't crashed yet"
- "Must be an environment issue"
- "Your tests are wrong"
🧐
1% tile: 974.0 (ns)
5% tile: 1075.0 (ns)
10% tile: 2292.0 (ns)
25% tile: 2695.0 (ns)
50% tile: 3671.0 (ns)
75% tile: 10440.0 (ns)
90% tile: 10091923.799999999 (ns)
99% tile: 68835579.60000025 (ns)
Things Devs might do
use std::time::Instant;

// Times each call from the moment it is actually sent
let start = Instant::now();
submit_order();
let duration = start.elapsed();
println!("Time elapsed in submit_order() is: {:?}", duration);
Coordinated Omission Problem
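A toy simulation can make the problem concrete. It assumes a single-threaded, blocking load generator that intends to send one request every `intervalMs` milliseconds. The "naive" latency is measured from the moment each request is actually sent; the "corrected" latency is measured from the moment it was supposed to be sent. `CoordinatedOmission` and its method are made up for this sketch, which is not Gil Tene's implementation or that of any tool.

```scala
// Illustrative only: a simulated clock, not a real load generator.
object CoordinatedOmission {
  // Returns (naive, corrected) latencies in milliseconds.
  def latencies(serviceTimesMs: Seq[Long], intervalMs: Long): (Seq[Long], Seq[Long]) = {
    var clock = 0L // simulated wall-clock time in ms
    val corrected = Seq.newBuilder[Long]
    serviceTimesMs.zipWithIndex.foreach { case (service, i) =>
      val intendedStart = i * intervalMs
      // The generator cannot send while the previous call is still blocking.
      val actualStart = math.max(clock, intendedStart)
      clock = actualStart + service
      corrected += clock - intendedStart
    }
    // Measured from the actual send time, latency is just the service time.
    (serviceTimesMs, corrected.result())
  }

  def main(args: Array[String]): Unit = {
    val (naive, corrected) = latencies(Seq(1L, 1L, 100L, 1L, 1L), intervalMs = 10L)
    println(s"naive:     $naive")
    println(s"corrected: $corrected")
  }
}
```

With one 100 ms stall in five requests, the naive view records a single slow sample. The corrected view shows that the two requests queued behind the stall were also slow from the user's point of view.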
additional reading: "How NOT to Measure Latency" by Gil Tene
Application Performance Monitoring Metrics
+
Log Aggregation w/ Correlation IDs
=
Observability
⏱ Collect Data
🍾 Find the bottleneck
🛠 Fix it
♻ Repeat
Without data you're just another person with an opinion.
- W. Edwards Deming
📉 📈
5. Statistics Lie
Ignore
❌ Mean
❌ Median
❌ Standard Deviation
Anscombe's Quartet
Look at
✅ Max
✅ Percentiles
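As a rough illustration of why the tail matters, the sketch below computes a nearest-rank percentile next to the mean. `LatencyStats` is a hypothetical helper and the sample data is invented for the example; real tools may use other percentile definitions (for example, interpolation).

```scala
// Illustrative only: LatencyStats is a hypothetical helper using the
// nearest-rank percentile definition.
object LatencyStats {
  // Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
  def percentile(samples: Seq[Double], p: Double): Double = {
    require(samples.nonEmpty, "need at least one sample")
    require(p > 0 && p <= 100, "p must be in (0, 100]")
    val sorted = samples.sorted
    val rank = math.ceil(p * sorted.size / 100.0).toInt
    sorted(rank - 1)
  }

  def mean(samples: Seq[Double]): Double = samples.sum / samples.size

  def main(args: Array[String]): Unit = {
    // Invented data: 98 fast requests and 2 very slow ones (milliseconds).
    val samples = Seq.fill(98)(10.0) ++ Seq.fill(2)(1000.0)
    println(s"mean = ${mean(samples)}")
    println(s"p99  = ${percentile(samples, 99)}")
    println(s"max  = ${samples.max}")
  }
}
```

In this invented data set the mean sits just under 30 ms, while the 99th percentile and the max are both 1000 ms, which is what the slowest users actually experience.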
🔎
6. Scaling your performance test
Focus on the actual test
Scaling your test comes later
Paid Performance Testing Platform
Gatling Frontline
BlazeMeter
Flood.io
DIY - (The Hard way)
+---------------+
| Gatling Image |
+---------------+
|
v
+----------------+ +-----------+
| Cloudformation | ---> | ECS Tasks |
+----------------+ +-----------+
|
| +---------------------------+ +-----------------------------+
+--> | Save simulation.log to S3 | --> | Aggregate & generate report |
+---------------------------+ +-----------------------------+
⏱
Performance Engineering
Quickfire Edition
What about Microservices/Serverless?
- Watch for timeouts
- Perform spike tests to determine time taken to scale
What about GraphQL?
- It's possible, but more complicated
- Different combination of fields and fragments
- Make sure you have sensible request tracing
Many birds, one stone
Reusable Tests. Framework selection is critical!
E2E test in Dev/Test Deployment
./gradlew gatlingRun -DbaseUrl="http://dev.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=300
Perf Test
./gradlew gatlingRun -DbaseUrl="http://perf.env:80/api" -DnumberOfUser=2000 -DrunDurationSecs=3600
Smoke Test in Production
./gradlew gatlingRun -DbaseUrl="http://prod.env:80/api" -DnumberOfUser=1 -DrunDurationSecs=10
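One way a simulation might pick up these settings is by reading them from a property map with safe defaults, as sketched below. `TestConfig` is a hypothetical helper and the default values are assumptions; the property names mirror the `-D` flags above. Whether flags given to Gradle actually reach the JVM that runs Gatling depends on the build configuration, so treat that wiring as something to verify.

```scala
// Illustrative only: TestConfig is a hypothetical helper and the defaults are assumptions.
final case class TestConfig(baseUrl: String, numberOfUser: Int, runDurationSecs: Int)

object TestConfig {
  // Reads settings from a property map, falling back to single-user defaults.
  def from(props: collection.Map[String, String]): TestConfig =
    TestConfig(
      baseUrl = props.getOrElse("baseUrl", "http://localhost:8080/api"),
      numberOfUser = props.get("numberOfUser").map(_.toInt).getOrElse(1),
      runDurationSecs = props.get("runDurationSecs").map(_.toInt).getOrElse(10)
    )

  // In a real run the map would be the JVM system properties.
  def fromSystemProperties(): TestConfig = from(sys.props)
}
```

Defaulting to one user and a short duration means that a run with missing flags behaves like a smoke test rather than a full load test.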
CI/CD
- Run a small subset of your performance testing suite as part of your pipeline
- Monitor the build time and capture performance metrics
- Fail the build or raise an alert for any execution that regresses by more than n%
- Run perf early in the SDLC and as often as possible
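The "more than n%" rule can be sketched as a comparison against a stored baseline. `PerfGate` is a hypothetical helper; how the baseline is stored and which metric is compared (a high percentile is assumed here) are left open.

```scala
// Illustrative only: PerfGate is a hypothetical helper, not part of Gatling or any CI tool.
object PerfGate {
  // Percentage change of the current run relative to the baseline (positive means slower).
  def regressionPercent(baselineMs: Double, currentMs: Double): Double = {
    require(baselineMs > 0, "baseline must be positive")
    (currentMs - baselineMs) / baselineMs * 100.0
  }

  // True when the run should fail the build: slower than the baseline by more than thresholdPercent.
  def shouldFail(baselineMs: Double, currentMs: Double, thresholdPercent: Double): Boolean =
    regressionPercent(baselineMs, currentMs) > thresholdPercent
}
```

For example, with a 200 ms baseline and a 10% threshold, a 230 ms run would fail the build and a 210 ms run would pass.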
It doesn't take a lot to cause an outage
A regular expression that backtracked enormously and exhausted CPU used for HTTP/HTTPS serving.
Introduce performance profiling for all rules to the test suite. (ETA: July 19)
Cloudflare Outage July 2019 - https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/
Test in Production?
Yes, but only if you can clean up the data
Otherwise, test in a separate environment with configuration similar to prod
Takeaways
- Performance Testing != Non-Functional Testing
- Know your parameters and limits upfront
- Test with the end in mind
- Forgo any assumptions and verify using performance metrics
- Stats Lie!
Awesome Performance
✌️
Thanks
Slide deck: https://dddsydney2019.hbish.com/