6 iterations of backtest refinements, with key discoveries:
- stop losses don't work for prediction markets (prices gap)
- 50% take profit with no stop loss yields +9.37% vs +4.04% baseline
- diversification beats concentration: 100 positions → +18.98%
- added kalman filter, VPIN, and regime detection scorers (research)

exit config: take_profit 50%, stop_loss disabled, 48h max hold
position sizing: kelly 0.40, max 30% per position, 100 max positions
kalshi-backtest
quant-level backtesting framework for kalshi prediction markets, using a candidate pipeline architecture.
features
- multi-timeframe momentum - detects divergence between short and long-term trends
- bollinger bands mean reversion - signals when price touches statistical extremes
- order flow analysis - tracks buying vs selling pressure via taker_side
- kelly criterion position sizing - dynamic sizing based on edge and win probability
- exit signals - take profit, stop loss, time stops, and score reversal triggers
- category-aware weighting - different strategies for politics, weather, sports, etc.
- ensemble scoring - combine multiple models with dynamic weighting
- cross-market correlations - lead-lag relationships between related markets
- ML ensemble (optional) - LSTM + MLP models via ONNX runtime
architecture
Historical Data (CSV)
         |
         v
+--------------------+
|   Backtest Loop    |  <- simulates time progression
+--------------------+
         |
         v
+--------------------+
| Candidate Pipeline |
+--------------------+
    |          |
    v          v
 Sources    Filters -> Scorers -> Selector
         |
         v
+--------------------+
|   Trade Executor   |  <- kelly sizing, exit signals
+--------------------+
         |
         v
+--------------------+
|    P&L Tracker     |  <- tracks positions, returns
+--------------------+
         |
         v
 Performance Metrics
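the filter -> scorer -> selector flow in the diagram can be sketched synchronously as below. this is only an illustration: `Candidate`, `filter_min_volume`, `score_all`, and `select_top` are hypothetical names, and the crate's real pipeline works on `MarketCandidate` with async scorers.

```rust
/// Hypothetical candidate type; the crate's MarketCandidate carries more fields.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub ticker: String,
    pub volume: u64,
    pub score: f64,
}

/// Filter stage: drop markets below a minimum traded volume.
pub fn filter_min_volume(cands: Vec<Candidate>, min_volume: u64) -> Vec<Candidate> {
    cands.into_iter().filter(|c| c.volume >= min_volume).collect()
}

/// Scorer stage: apply a scoring function to every candidate.
pub fn score_all(mut cands: Vec<Candidate>, f: impl Fn(&Candidate) -> f64) -> Vec<Candidate> {
    for c in cands.iter_mut() {
        let s = f(&*c);
        c.score = s;
    }
    cands
}

/// Selector stage: keep the top `n` candidates, highest score first.
pub fn select_top(mut cands: Vec<Candidate>, n: usize) -> Vec<Candidate> {
    cands.sort_by(|a, b| b.score.total_cmp(&a.score));
    cands.truncate(n);
    cands
}
```

chaining the three calls (filter, then score, then select) mirrors one pass of the pipeline for a single simulated timestep.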
data format
fetch data from kalshi API using the included script:
python scripts/fetch_kalshi_data.py
or download from https://www.deltabase.tech/
markets.csv:
ticker,title,category,open_time,close_time,result,status,yes_bid,yes_ask,volume,open_interest
PRES-2024-DEM,Will Democrats win?,politics,2024-01-01 00:00:00,2024-11-06 00:00:00,no,finalized,45,47,10000,5000
trades.csv:
timestamp,ticker,price,volume,taker_side
2024-01-05 12:00:00,PRES-2024-DEM,45,100,yes
2024-01-05 13:00:00,PRES-2024-DEM,46,50,no
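as a rough illustration of how trades.csv rows could feed the order flow signal, the sketch below parses a row by hand and computes a buy/sell imbalance from taker_side. `Trade`, `parse_trade`, and `order_flow_imbalance` are hypothetical names rather than the crate's API, and the imbalance formula (yes volume minus no volume, over total volume) is an assumption.

```rust
/// One row of trades.csv (hypothetical struct, not the crate's own type).
#[derive(Debug, Clone, PartialEq)]
pub struct Trade {
    pub timestamp: String,
    pub ticker: String,
    pub price: u32,  // cents, as in the sample rows
    pub volume: u64, // contracts
    pub taker_is_yes: bool,
}

/// Parse one data row: timestamp,ticker,price,volume,taker_side.
/// Naive comma split; assumes trades.csv has no quoted fields.
pub fn parse_trade(line: &str) -> Result<Trade, String> {
    let f: Vec<&str> = line.trim().split(',').collect();
    if f.len() != 5 {
        return Err(format!("expected 5 fields, got {}", f.len()));
    }
    let taker_is_yes = match f[4] {
        "yes" => true,
        "no" => false,
        other => return Err(format!("unknown taker_side: {}", other)),
    };
    Ok(Trade {
        timestamp: f[0].to_string(),
        ticker: f[1].to_string(),
        price: f[2].parse::<u32>().map_err(|e| format!("bad price: {}", e))?,
        volume: f[3].parse::<u64>().map_err(|e| format!("bad volume: {}", e))?,
        taker_is_yes,
    })
}

/// Volume-weighted imbalance in [-1, 1]; +1 means all taker volume bought yes.
/// Returns 0.0 when there is no volume.
pub fn order_flow_imbalance(trades: &[Trade]) -> f64 {
    let (mut yes, mut no) = (0u64, 0u64);
    for t in trades {
        if t.taker_is_yes {
            yes += t.volume;
        } else {
            no += t.volume;
        }
    }
    let total = yes + no;
    if total == 0 {
        0.0
    } else {
        (yes as f64 - no as f64) / total as f64
    }
}
```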
usage
# build
cargo build --release
# run backtest with quant features
cargo run --release -- run \
--data-dir data \
--start 2024-01-01 \
--end 2024-06-01 \
--capital 10000 \
--max-position 500 \
--max-positions 10 \
--kelly-fraction 0.25 \
--max-position-pct 0.25 \
--take-profit 0.20 \
--stop-loss 0.15 \
--max-hold-hours 72 \
--compare-random
# view results
cargo run --release -- summary --results-file results/backtest_result.json
cli options
| option | default | description |
|---|---|---|
| --data-dir | data | directory with markets.csv and trades.csv |
| --start | required | backtest start date |
| --end | required | backtest end date |
| --capital | 10000 | initial capital |
| --max-position | 100 | max shares per position |
| --max-positions | 5 | max concurrent positions |
| --kelly-fraction | 0.25 | fraction of kelly criterion (0.1=conservative, 1.0=full) |
| --max-position-pct | 0.25 | max fraction of capital per position (0.25 = 25%) |
| --take-profit | 0.20 | take profit threshold (20% gain) |
| --stop-loss | 0.15 | stop loss threshold (15% loss) |
| --max-hold-hours | 72 | time stop in hours |
| --compare-random | false | compare vs random baseline |
scorers
basic scorers:
- MomentumScorer - price change over lookback period
- MeanReversionScorer - deviation from historical mean
- VolumeScorer - unusual volume detection
- TimeDecayScorer - prefer markets with more time to close
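a minimal sketch of the "price change over lookback period" idea, plus the short-vs-long divergence mentioned under features. `momentum` and `momentum_divergence` are hypothetical helpers; the actual MomentumScorer may normalize or window prices differently.

```rust
/// Relative price change over the last `lookback` observations.
/// Returns None if there is not enough history or the reference price is zero.
pub fn momentum(prices: &[f64], lookback: usize) -> Option<f64> {
    if lookback == 0 || prices.len() <= lookback {
        return None;
    }
    let now = prices[prices.len() - 1];
    let then = prices[prices.len() - 1 - lookback];
    if then == 0.0 {
        return None;
    }
    Some((now - then) / then)
}

/// Short-window momentum minus long-window momentum.
/// A negative value means the recent move is weaker than the longer trend.
pub fn momentum_divergence(prices: &[f64], short: usize, long: usize) -> Option<f64> {
    Some(momentum(prices, short)? - momentum(prices, long)?)
}
```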
quant scorers:
- MultiTimeframeMomentumScorer - analyzes 1h, 4h, 12h, 24h windows, detects divergence
- BollingerMeanReversionScorer - triggers at upper/lower band touches (2 std)
- OrderFlowScorer - buy/sell imbalance from taker_side
- CategoryWeightedScorer - different weights per category
- EnsembleScorer - combines models with dynamic weights
- CorrelationScorer - cross-market lead-lag signals
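to make the band-touch trigger concrete, here is a small sketch of a Bollinger check with bands at mean ± k standard deviations (k = 2 matches the description above). `BandSignal` and `bollinger_signal` are hypothetical, and using the population standard deviation over the window is an assumption.

```rust
/// Result of a Bollinger-band check (hypothetical enum for illustration).
#[derive(Debug, PartialEq)]
pub enum BandSignal {
    AboveUpper, // price at or above mean + k * std
    BelowLower, // price at or below mean - k * std
    Inside,
}

/// Population mean and standard deviation of a price window.
fn mean_std(window: &[f64]) -> Option<(f64, f64)> {
    if window.len() < 2 {
        return None;
    }
    let n = window.len() as f64;
    let mean = window.iter().sum::<f64>() / n;
    let var = window.iter().map(|p| (p - mean).powi(2)).sum::<f64>() / n;
    Some((mean, var.sqrt()))
}

/// Compare `price` against bands built from `window`.
/// Returns None when the window is too short to estimate a deviation.
pub fn bollinger_signal(window: &[f64], price: f64, k: f64) -> Option<BandSignal> {
    let (mean, std) = mean_std(window)?;
    if std == 0.0 {
        // flat history: bands collapse, so report no signal
        return Some(BandSignal::Inside);
    }
    let signal = if price >= mean + k * std {
        BandSignal::AboveUpper
    } else if price <= mean - k * std {
        BandSignal::BelowLower
    } else {
        BandSignal::Inside
    };
    Some(signal)
}
```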
ml scorers (requires ml feature):
- MLEnsembleScorer - LSTM + MLP via ONNX
position sizing
uses kelly criterion with safety multiplier:
kelly = (odds * win_prob - (1 - win_prob)) / odds
safe_kelly = kelly * kelly_fraction
position = min(bankroll * safe_kelly, max_position_pct * bankroll)
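the three formulas above translate directly into code. the sketch below is illustrative only: `odds_from_price` assumes a yes contract bought at `price_cents` that pays 100 cents with no fees, and flooring negative kelly at zero (no position when the edge is negative) is an assumption, not something the formulas state. function names are hypothetical.

```rust
/// Full kelly fraction for a bet paying `odds`-to-1 with win probability `win_prob`.
pub fn full_kelly(odds: f64, win_prob: f64) -> f64 {
    (odds * win_prob - (1.0 - win_prob)) / odds
}

/// Net odds of a yes contract bought at `price_cents`.
/// Assumes a 100-cent payout, no fees, and 0 < price_cents < 100.
pub fn odds_from_price(price_cents: f64) -> f64 {
    (100.0 - price_cents) / price_cents
}

/// Dollar size: min(bankroll * safe_kelly, max_position_pct * bankroll).
/// Negative kelly is floored at zero (assumption).
pub fn position_size(
    bankroll: f64,
    price_cents: f64,
    win_prob: f64,
    kelly_fraction: f64,
    max_position_pct: f64,
) -> f64 {
    let kelly = full_kelly(odds_from_price(price_cents), win_prob);
    let safe_kelly = (kelly * kelly_fraction).max(0.0);
    (bankroll * safe_kelly).min(max_position_pct * bankroll)
}
```

for example, with a 10000 bankroll, a 45-cent price, an estimated 55% win probability, and the default 0.25 multiplier, full kelly is about 0.18 and the sized position comes to roughly 455 dollars, well under the 25% cap.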
exit signals
positions can exit via:
- resolution - market resolves yes/no
- take profit - pnl exceeds threshold
- stop loss - pnl below threshold
- time stop - held too long (capital rotation)
- score reversal - strategy flips bearish
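one way the exit checks could be combined is sketched below. the types and the check order (resolution first, then take profit, stop loss, time stop, score reversal) are assumptions for illustration, as is treating a negative score as bearish; `stop_loss` is an `Option` so it can be disabled.

```rust
#[derive(Debug, PartialEq)]
pub enum ExitReason {
    Resolution,
    TakeProfit,
    StopLoss,
    TimeStop,
    ScoreReversal,
}

/// Hypothetical exit settings mirroring the CLI thresholds.
pub struct ExitConfig {
    pub take_profit: f64,       // 0.20 = exit at +20%
    pub stop_loss: Option<f64>, // Some(0.15) = exit at -15%; None disables
    pub max_hold_hours: f64,
}

/// Hypothetical snapshot of an open position.
pub struct PositionState {
    pub entry_price: f64, // assumed > 0
    pub current_price: f64,
    pub hours_held: f64,
    pub resolved: bool,
    pub score: f64, // latest strategy score; negative is read as bearish
}

/// Return the first exit condition that fires, or None to keep holding.
pub fn check_exit(cfg: &ExitConfig, pos: &PositionState) -> Option<ExitReason> {
    if pos.resolved {
        return Some(ExitReason::Resolution);
    }
    let pnl_pct = (pos.current_price - pos.entry_price) / pos.entry_price;
    if pnl_pct >= cfg.take_profit {
        return Some(ExitReason::TakeProfit);
    }
    if let Some(sl) = cfg.stop_loss {
        if pnl_pct <= -sl {
            return Some(ExitReason::StopLoss);
        }
    }
    if pos.hours_held >= cfg.max_hold_hours {
        return Some(ExitReason::TimeStop);
    }
    if pos.score < 0.0 {
        return Some(ExitReason::ScoreReversal);
    }
    None
}
```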
ml training (optional)
train ML models using pytorch, then export to ONNX:
# install dependencies
pip install torch pandas numpy
# train models
python scripts/train_ml_models.py \
--data data/trades.csv \
--markets data/markets.csv \
--output models/ \
--epochs 50
# enable ml feature
cargo build --release --features ml
metrics
- total return ($ and %)
- sharpe ratio (annualized)
- max drawdown
- win rate
- average trade P&L
- average hold time
- trades per day
- return by category
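two of these metrics are easy to get subtly wrong, so here is a hedged sketch of each. it assumes a zero risk-free rate, sample standard deviation, and a caller-supplied `periods_per_year`; the crate's own annualization convention is not documented here, and both function names are hypothetical.

```rust
/// Max drawdown as a fraction of the running peak (0.25 = 25% below peak).
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &e in equity {
        if e > peak {
            peak = e;
        }
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}

/// Annualized sharpe ratio from per-period returns.
/// Assumes zero risk-free rate; returns None with fewer than two
/// observations or zero variance.
pub fn sharpe_ratio(returns: &[f64], periods_per_year: f64) -> Option<f64> {
    if returns.len() < 2 {
        return None;
    }
    let n = returns.len() as f64;
    let mean = returns.iter().sum::<f64>() / n;
    let var = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let std = var.sqrt();
    if std == 0.0 {
        return None;
    }
    Some(mean / std * periods_per_year.sqrt())
}
```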
extending
add custom scorers by implementing the Scorer trait:
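use async_trait::async_trait;

// Scorer, TradingContext, and MarketCandidate are defined in this crate.
pub struct MyScorer;

#[async_trait]
impl Scorer for MyScorer {
    fn name(&self) -> &'static str {
        "MyScorer"
    }

    async fn score(
        &self,
        context: &TradingContext,
        candidates: &[MarketCandidate],
    ) -> Result<Vec<MarketCandidate>, String> {
        // compute scores...
        // placeholder so the body type-checks: returns the candidates unchanged
        // (assumes MarketCandidate implements Clone)
        Ok(candidates.to_vec())
    }

    // copy this scorer's output onto the pipeline's candidate
    fn update(&self, candidate: &mut MarketCandidate, scored: MarketCandidate) {
        if let Some(score) = scored.scores.get("my_score") {
            candidate.scores.insert("my_score".to_string(), *score);
        }
    }
}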
then add to the pipeline in backtest.rs.