
kalshi-backtest
===

a quantitative backtesting framework for kalshi prediction markets, built around a candidate pipeline (sources -> filters -> scorers -> selector).


features
---

- **multi-timeframe momentum** - detects divergence between short and long-term trends
- **bollinger bands mean reversion** - signals when price touches statistical extremes
- **order flow analysis** - tracks buying vs selling pressure via taker_side
- **kelly criterion position sizing** - dynamic sizing based on edge and win probability
- **exit signals** - take profit, stop loss, time stops, and score reversal triggers
- **category-aware weighting** - different strategies for politics, weather, sports, etc.
- **ensemble scoring** - combine multiple models with dynamic weighting
- **cross-market correlations** - lead-lag relationships between related markets
- **ML ensemble (optional)** - LSTM + MLP models via ONNX runtime


architecture
---

```
Historical Data (CSV)
         |
         v
+------------------+
|  Backtest Loop   |  <- simulates time progression
+------------------+
         |
         v
+--------------------+
| Candidate Pipeline |
+--------------------+
    |         |
    v         v
 Sources   Filters -> Scorers -> Selector
    |
    v
+------------------+
|  Trade Executor  |  <- kelly sizing, exit signals
+------------------+
         |
         v
+------------------+
|   P&L Tracker    |  <- tracks positions, returns
+------------------+
         |
         v
    Performance Metrics
```
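to make the pipeline stage concrete, here is a minimal synchronous sketch of the filter -> score -> select flow. it is not the project's actual code: `Candidate`, `run_pipeline`, `is_liquid`, and `volume_score` are hypothetical names, and the real pipeline uses async traits (see "extending").

```rust
/// Hypothetical candidate record, for illustration only.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub ticker: String,
    pub volume: u64,
    pub score: f64,
}

/// Example filter (assumption): keep markets with at least 100 contracts traded.
pub fn is_liquid(c: &Candidate) -> bool {
    c.volume >= 100
}

/// Example scorer (assumption): score a market by its raw volume.
pub fn volume_score(c: &Candidate) -> f64 {
    c.volume as f64
}

/// Apply every filter, sum every scorer's output, then keep the top `k`.
pub fn run_pipeline(
    sourced: Vec<Candidate>,
    filters: &[fn(&Candidate) -> bool],
    scorers: &[fn(&Candidate) -> f64],
    k: usize,
) -> Vec<Candidate> {
    // filters: a candidate survives only if all filters accept it
    let mut kept: Vec<Candidate> = sourced
        .into_iter()
        .filter(|c| filters.iter().all(|f| f(c)))
        .collect();

    // scorers: the total score is the sum of the individual scores
    for c in kept.iter_mut() {
        let total: f64 = scorers.iter().map(|s| s(&*c)).sum();
        c.score = total;
    }

    // selector: highest score first, truncated to k entries
    kept.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    kept.truncate(k);
    kept
}
```

summing scorer outputs is the simplest possible combination rule; the ensemble and category-weighted scorers listed under "scorers" presumably do something richer.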


data format
---

fetch data from the kalshi API using the included script:

```bash
python scripts/fetch_kalshi_data.py
```

or download from https://www.deltabase.tech/

**markets.csv**:
```csv
ticker,title,category,open_time,close_time,result,status,yes_bid,yes_ask,volume,open_interest
PRES-2024-DEM,Will Democrats win?,politics,2024-01-01 00:00:00,2024-11-06 00:00:00,no,finalized,45,47,10000,5000
```

**trades.csv**:
```csv
timestamp,ticker,price,volume,taker_side
2024-01-05 12:00:00,PRES-2024-DEM,45,100,yes
2024-01-05 13:00:00,PRES-2024-DEM,46,50,no
```
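as a rough illustration of the trades.csv layout, the sketch below parses one data row with only the standard library. the `Trade` struct and `parse_trade` function are hypothetical, the field types are guesses from the sample rows (integer prices, integer volumes), and quoted fields are not handled.

```rust
/// One row of trades.csv. Field names follow the header above; the Rust
/// types are assumptions based on the sample rows.
#[derive(Debug, PartialEq)]
pub struct Trade {
    pub timestamp: String, // kept as text; a real loader would parse a datetime
    pub ticker: String,
    pub price: u32,
    pub volume: u64,
    pub taker_is_yes: bool, // true for taker_side == "yes"
}

/// Parse a single data line (not the header line).
/// Assumes exactly five comma-separated fields with no quoting.
pub fn parse_trade(line: &str) -> Result<Trade, String> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 5 {
        return Err(format!("expected 5 fields, got {}", fields.len()));
    }
    let price = fields[2]
        .parse::<u32>()
        .map_err(|e| format!("bad price: {}", e))?;
    let volume = fields[3]
        .parse::<u64>()
        .map_err(|e| format!("bad volume: {}", e))?;
    let taker_is_yes = match fields[4] {
        "yes" => true,
        "no" => false,
        other => return Err(format!("bad taker_side: {}", other)),
    };
    Ok(Trade {
        timestamp: fields[0].to_string(),
        ticker: fields[1].to_string(),
        price,
        volume,
        taker_is_yes,
    })
}
```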


usage
---

```bash
# build
cargo build --release

# run backtest with quant features
cargo run --release -- run \
    --data-dir data \
    --start 2024-01-01 \
    --end 2024-06-01 \
    --capital 10000 \
    --max-position 500 \
    --max-positions 10 \
    --kelly-fraction 0.25 \
    --max-position-pct 0.25 \
    --take-profit 0.20 \
    --stop-loss 0.15 \
    --max-hold-hours 72 \
    --compare-random

# view results
cargo run --release -- summary --results-file results/backtest_result.json
```


cli options
---

| option | default | description |
|--------|---------|-------------|
| --data-dir | data | directory with markets.csv and trades.csv |
| --start | required | backtest start date (YYYY-MM-DD) |
| --end | required | backtest end date (YYYY-MM-DD) |
| --capital | 10000 | initial capital |
| --max-position | 100 | max shares per position |
| --max-positions | 5 | max concurrent positions |
| --kelly-fraction | 0.25 | fraction of kelly criterion (0.1=conservative, 1.0=full) |
| --max-position-pct | 0.25 | max fraction of capital per position (0.25 = 25%) |
| --take-profit | 0.20 | take profit threshold (20% gain) |
| --stop-loss | 0.15 | stop loss threshold (15% loss) |
| --max-hold-hours | 72 | time stop in hours |
| --compare-random | false | compare vs random baseline |


scorers
---

**basic scorers**:
- `MomentumScorer` - price change over lookback period
- `MeanReversionScorer` - deviation from historical mean
- `VolumeScorer` - unusual volume detection
- `TimeDecayScorer` - prefer markets with more time to close

**quant scorers**:
- `MultiTimeframeMomentumScorer` - analyzes 1h, 4h, 12h, 24h windows, detects divergence
- `BollingerMeanReversionScorer` - triggers at upper/lower band touches (2 standard deviations)
- `OrderFlowScorer` - buy/sell imbalance from taker_side
- `CategoryWeightedScorer` - different weights per category
- `EnsembleScorer` - combines models with dynamic weights
- `CorrelationScorer` - cross-market lead-lag signals

**ml scorers** (requires `ml` feature):
- `MLEnsembleScorer` - LSTM + MLP via ONNX
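
two of the quant signals above are simple enough to sketch in a few lines. this is an illustration of the general techniques (bollinger band touches and taker-side imbalance), not the code behind `BollingerMeanReversionScorer` or `OrderFlowScorer`; the function names, the +1/-1/0 signal convention, and the use of population standard deviation are all assumptions.

```rust
/// Mean and population standard deviation of a price window.
fn mean_std(prices: &[f64]) -> Option<(f64, f64)> {
    if prices.is_empty() {
        return None;
    }
    let n = prices.len() as f64;
    let mean = prices.iter().sum::<f64>() / n;
    let var = prices.iter().map(|p| (p - mean).powi(2)).sum::<f64>() / n;
    Some((mean, var.sqrt()))
}

/// Mean-reversion signal from Bollinger bands over `prices`:
/// +1.0 if the last price is at or below the lower band (expect a bounce),
/// -1.0 if it is at or above the upper band, 0.0 otherwise.
pub fn bollinger_signal(prices: &[f64], num_std: f64) -> f64 {
    let (mean, std) = match mean_std(prices) {
        Some(v) => v,
        None => return 0.0,
    };
    if std == 0.0 {
        return 0.0; // flat window: the bands collapse, so emit no signal
    }
    let last = prices[prices.len() - 1];
    if last <= mean - num_std * std {
        1.0
    } else if last >= mean + num_std * std {
        -1.0
    } else {
        0.0
    }
}

/// Order flow imbalance in [-1, 1] from (volume, taker_is_yes) pairs:
/// (yes volume - no volume) / total volume.
pub fn order_flow_imbalance(trades: &[(u64, bool)]) -> f64 {
    let (mut yes, mut no) = (0u64, 0u64);
    for &(volume, taker_is_yes) in trades {
        if taker_is_yes {
            yes += volume;
        } else {
            no += volume;
        }
    }
    let total = yes + no;
    if total == 0 {
        0.0
    } else {
        (yes as f64 - no as f64) / total as f64
    }
}
```

with the two sample rows from trades.csv (100 contracts taken on yes, 50 on no), the imbalance is (100 - 50) / 150, or about 0.33 in favor of buyers.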


position sizing
---

uses the kelly criterion with a safety multiplier (`--kelly-fraction`), capped at `--max-position-pct` of the bankroll:

```
kelly = (odds * win_prob - (1 - win_prob)) / odds
safe_kelly = kelly * kelly_fraction
position = min(bankroll * safe_kelly, max_position_pct * bankroll)
```
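
the formula above translates directly into a small function. the sketch below is an illustration, not the project's implementation: `kelly_position` is a hypothetical name, deriving `odds` as `(1 - price) / price` for a binary contract that pays 1 is an assumption (the formula does not define `odds`), and returning zero when there is no edge is an added safeguard.

```rust
/// Fractional Kelly position size in dollars.
///
/// `price` is the contract price as a fraction in (0, 1) and `win_prob`
/// is the estimated probability that the contract pays out.
pub fn kelly_position(
    bankroll: f64,
    price: f64,
    win_prob: f64,
    kelly_fraction: f64,
    max_position_pct: f64,
) -> f64 {
    if !(price > 0.0 && price < 1.0) {
        return 0.0; // odds are undefined at the boundaries
    }
    // assumption: net odds for risking `price` to win `1 - price`
    let odds = (1.0 - price) / price;
    let kelly = (odds * win_prob - (1.0 - win_prob)) / odds;
    if kelly <= 0.0 {
        return 0.0; // added safeguard: no edge, no position
    }
    let safe_kelly = kelly * kelly_fraction;
    (bankroll * safe_kelly).min(max_position_pct * bankroll)
}
```

for example, with a 10000 bankroll, a contract priced at 0.45, and an estimated win probability of 0.55, full kelly is 0.10 / 0.55, or about 0.18; at `kelly_fraction = 0.25` the position is about 454.55, well under the 2500 cap.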


exit signals
---

positions can exit via:
1. **resolution** - market resolves yes/no
2. **take profit** - pnl rises above `--take-profit`
3. **stop loss** - pnl falls below `--stop-loss`
4. **time stop** - held longer than `--max-hold-hours` (capital rotation)
5. **score reversal** - strategy flips bearish
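
the five triggers can be sketched as a single check per position. this is a hedged illustration: the type and function names are hypothetical, the checks run in the order listed above (the real priority may differ), pnl is measured as a fraction of entry cost, and "flips bearish" is approximated as a negative score.

```rust
#[derive(Debug, PartialEq)]
pub enum ExitReason {
    Resolution,
    TakeProfit,
    StopLoss,
    TimeStop,
    ScoreReversal,
}

/// Hypothetical snapshot of an open position.
pub struct PositionState {
    pub resolved: bool,  // market has resolved yes/no
    pub pnl_pct: f64,    // unrealized return as a fraction of entry cost
    pub hours_held: f64, // time since entry
    pub score: f64,      // current strategy score; negative means bearish
}

/// Thresholds mirroring --take-profit, --stop-loss, --max-hold-hours.
pub struct ExitConfig {
    pub take_profit: f64,
    pub stop_loss: f64,
    pub max_hold_hours: f64,
}

/// Return the first exit trigger that fires, or None to keep holding.
pub fn check_exit(p: &PositionState, cfg: &ExitConfig) -> Option<ExitReason> {
    if p.resolved {
        return Some(ExitReason::Resolution);
    }
    if p.pnl_pct >= cfg.take_profit {
        return Some(ExitReason::TakeProfit);
    }
    if p.pnl_pct <= -cfg.stop_loss {
        return Some(ExitReason::StopLoss);
    }
    if p.hours_held >= cfg.max_hold_hours {
        return Some(ExitReason::TimeStop);
    }
    if p.score < 0.0 {
        return Some(ExitReason::ScoreReversal);
    }
    None
}
```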


ml training (optional)
---

train ML models using pytorch, then export to ONNX:

```bash
# install dependencies
pip install torch pandas numpy

# train models
python scripts/train_ml_models.py \
    --data data/trades.csv \
    --markets data/markets.csv \
    --output models/ \
    --epochs 50

# enable ml feature
cargo build --release --features ml
```


metrics
---

- total return ($ and %)
- sharpe ratio (annualized)
- max drawdown
- win rate
- average trade P&L
- average hold time
- trades per day
- return by category
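
for reference, two of these metrics can be computed as follows. this is a generic sketch of the standard definitions, not the project's code: the function names are hypothetical, the sharpe ratio assumes a zero risk-free rate and uses sample standard deviation, and `periods_per_year` depends on how returns are sampled.

```rust
/// Annualized Sharpe ratio from per-period returns.
/// Assumes a zero risk-free rate; returns None when undefined.
pub fn sharpe_ratio(returns: &[f64], periods_per_year: f64) -> Option<f64> {
    if returns.len() < 2 {
        return None;
    }
    let n = returns.len() as f64;
    let mean = returns.iter().sum::<f64>() / n;
    // sample variance (divides by n - 1)
    let var = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let std = var.sqrt();
    if std == 0.0 {
        return None;
    }
    Some(mean / std * periods_per_year.sqrt())
}

/// Maximum drawdown of an equity curve, as a fraction of the running peak.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::NEG_INFINITY;
    let mut worst = 0.0_f64;
    for &e in equity {
        if e > peak {
            peak = e;
        }
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}
```

for an equity curve of 100, 120, 90, 110 the peak is 120 and the trough after it is 90, so the max drawdown is 30 / 120 = 0.25.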


extending
---

add custom scorers by implementing the `Scorer` trait:

```rust
use async_trait::async_trait;

pub struct MyScorer;

#[async_trait]
impl Scorer for MyScorer {
    fn name(&self) -> &'static str {
        "MyScorer"
    }

    async fn score(
        &self,
        context: &TradingContext,
        candidates: &[MarketCandidate],
    ) -> Result<Vec<MarketCandidate>, String> {
        // compute scores...
        todo!("return the scored candidates")
    }

    fn update(&self, candidate: &mut MarketCandidate, scored: MarketCandidate) {
        if let Some(score) = scored.scores.get("my_score") {
            candidate.scores.insert("my_score".to_string(), *score);
        }
    }
}
```

then add it to the pipeline in `backtest.rs`.