The Overfitting Trap
Its easy to optimize a strategy until it performs perfectly on historical data. Then it fails miserably in live trading. The curves were fit to noise, not signal.
The Solution: Multi-Layered Guardrails
1. Train/Test Split
- Training period: 2019-2023 (used for optimization)
- Test period: 2024 (out-of-sample validation)
- Success is judged by TEST performance, not training
2. Iteration Limits
- Maximum 20 optimization runs
- Stops after 5 consecutive runs with no improvement
- Automatic halt when overfitting detected
3. Complexity Limits
- Max 15 tunable parameters
- Max 800 lines of code
- Max 3 timeframes used
4. Per-Iteration Constraints
- Max 50% change per parameter between iterations
- Max 3 parameters changed at once
- Prevents wild swings that could fit noise
Overfitting Detection
The system monitors the overfit ratio = train_score / test_score:
| Ratio | Status |
|---|---|
| Less than 1.2 | Good generalization |
| 1.2-1.5 | Some overfitting (warning) |
| Greater than 1.5 | Significant overfitting (stop) |
Also detects divergence patterns: if training performance improves over 3+ consecutive runs while test performance degrades, optimization halts.
Scoring System
Multi-metric scoring prevents optimizing for a single number:
| Metric | Weight |
|---|---|
| Sharpe Ratio | 30% |
| Win Rate | 25% |
| Net Profit | 20% |
| Max Drawdown | -15% (penalty) |
| Profit/Loss Ratio | 10% |
Target thresholds (on TEST data):
- Sharpe >= 1.0
- Win Rate >= 55%
- Max Drawdown <= 15%
- Minimum 20 trades (statistical significance)
Example Strategy: Triangle Breakout
The example strategy optimized is a Triangle Breakout system for MES (Micro E-mini S&P 500) futures:
Pattern Detection:
- Identifies converging support (uptrend) and resistance (downtrend) lines
- Uses 4-hour bars with configurable lookback (80 bars)
- Requires minimum 40 bars to form valid trendlines
Exit Management:
- Trailing stop activation at 125% of triangle height
- Breakeven lock at 50% of height captured
- Time-based exit after 60 bars if MFE less than 25% of height
Key Parameters (14 total):
TOLERANCE_POINTS: 0.75 ENTRY_BUFFER: 0.25 TRIANGLE_LOOKBACK: 80 MIN_BARS: 40 MIN_TRIANGLE_HEIGHT: 15.0 MAX_TRIANGLE_HEIGHT: 100.0 BE_LOCK_FRAC: 0.50 TRAIL_ACTIVATE_FRAC: 1.25 MAX_HOLD_BARS: 60
Optimization Results
Run 1 (Baseline):
- Train: 60% win rate, -0.17 Sharpe, 10 trades
- Test: 100% win rate, -0.76 Sharpe, 2 trades
- Issue: Low trade counts for statistical significance
Run 2 (Loosened filters):
- Changed: MIN_TRIANGLE_HEIGHT 15->10, MIN_BREAKOUT_ABS 10->6
- Train: 56% win rate, 0.04 Sharpe, 18 trades
- Test: 50% win rate, -0.59 Sharpe, 4 trades
- More trades but lower quality on test
Key Takeaways
DO:
- Always use train/test split
- Judge success by TEST performance
- Make small, incremental changes
- Stop early if test performance degrades
- Prefer simpler strategies
DO NOT:
- Keep optimizing until backtest looks perfect
- Change many parameters at once
- Ignore warning signs of overfitting
- Add complexity without clear benefit
- Trust strategies that only work on training data
Code Structure
quantConnect/ guardrails.py # Overfitting prevention core (533 lines) qc_api.py # QuantConnect API client (285 lines) optimizer.py # Main orchestrator + templates (734 lines) iterate.py # Autonomous optimization loop (410 lines) best_params.json # Best parameters found optimization_results.json # Full run history
This system provides a disciplined framework for strategy development that prioritizes generalization over curve-fitting.