How Validation Works

See how backtest, paper, and live fit into the validation ladder without overstating certainty.

Validation exists to answer two linked questions: how much evidence does this strategy have, and given that evidence, what should the customer do next?

A good feature is not yet a good strategy

The validation ladder exists because showing that a feature is empirically associated with future returns and building a tradable implementation of it are different problems.

Layer             What it asks
Feature evidence  Does this idea appear to carry information empirically associated with future returns?
Backtest          Does a concrete implementation survive historical replay?
Paper             Does it still behave coherently under current conditions?
Live              Does it remain trustworthy with capital exposed?

This is why the workspace should show both evidence posture and validation posture instead of collapsing everything into one badge.

The ladder

Historical replay

Run the strategy against historical data to produce initial evidence.

Holdout review

Review performance on data periods held out from development to check for overfitting.

Paper observation

Observe the strategy against current market conditions without capital at risk.

Live promotion

Move to real execution only when evidence from all prior stages supports it.

A missing stage should increase caution, not create fake certainty. The system suggests the safer next action when evidence is incomplete.
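
The "safer next action" idea can be sketched as a walk down the ladder that stops at the first stage without evidence. This is a minimal illustration, not the product's actual logic: the stage identifiers and the `next_safer_action` function are hypothetical names chosen to mirror the four stages listed above.

```python
from typing import Iterable, Optional

# Ladder stages in the order this section lists them.
# The identifiers are hypothetical; the product may name them differently.
LADDER = [
    "historical_replay",
    "holdout_review",
    "paper_observation",
    "live_promotion",
]


def next_safer_action(completed: Iterable[str]) -> Optional[str]:
    """Return the earliest ladder stage that lacks evidence.

    A gap low on the ladder takes priority over anything completed
    above it, so a missing stage always pulls the suggestion back
    toward caution. Returns None when every stage is complete.
    """
    done = set(completed)
    for stage in LADDER:
        if stage not in done:
            return stage
    return None


# Paper observation was run, but holdout review was skipped:
# the suggestion falls back to the skipped stage.
suggestion = next_safer_action({"historical_replay", "paper_observation"})
print(suggestion)
```

Returning the earliest gap rather than the latest completed stage is the design choice that keeps a skipped stage from being silently treated as passed.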

Backtest, Paper, and Live are real actions

Inside the workspace, these are real customer actions — not merely internal labels:

Action    What it does
Backtest  Replays the strategy against historical data
Paper     Observes the strategy against current market data without capital at risk
Live      Moves into real execution with capital exposed

Why backtest is different from evidence

Backtest metrics such as PnL, drawdown, and trade count answer a different question than information coefficient (IC) style evidence.

  • evidence asks whether the feature contains useful information
  • backtest asks whether a tradable rule set survives replay after sizing and cost assumptions

Both are necessary. Neither should be allowed to masquerade as the other.
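
To make the distinction concrete, the sketch below computes both kinds of number on the same toy data. Everything in it is an assumption for illustration: the data is made up, `rank_ic` is a plain rank correlation standing in for "IC-style evidence", and `backtest` replays a hypothetical unit-size sign rule with a flat cost per position change.

```python
def ranks(values):
    """Rank values from 1..n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def rank_ic(feature, fwd_returns):
    """Rank correlation between a feature and forward returns."""
    rx, ry = ranks(feature), ranks(fwd_returns)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def backtest(feature, fwd_returns, cost_per_trade):
    """Replay a toy rule: long if feature > 0, else short, unit size.

    A flat cost is charged whenever the position changes
    (a hypothetical cost model).
    """
    equity = peak = max_drawdown = 0.0
    position = 0
    trades = 0
    for f, r in zip(feature, fwd_returns):
        target = 1 if f > 0 else -1
        if target != position:
            trades += 1
            equity -= cost_per_trade
            position = target
        equity += position * r
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)
    return {"pnl": equity, "max_drawdown": max_drawdown, "trade_count": trades}


# Made-up data: the feature orders the forward returns perfectly.
feature = [0.5, -0.2, 0.1, -0.4, 0.3]
fwd_returns = [0.02, -0.01, 0.005, -0.015, 0.01]

ic = rank_ic(feature, fwd_returns)
result = backtest(feature, fwd_returns, cost_per_trade=0.02)
print(ic, result)
```

On this toy data the feature ranks the forward returns perfectly, yet the rule flips position every period and the assumed costs exceed the gross return, so the replay loses money. Strong evidence and a failing backtest can coexist, which is why the two are reported separately.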

Institutional status vs. launch posture

Institutional status answers how far the feature or pattern under study has advanced internally through the governed research path. Examples: discovered, validated, paper_candidate, promoted_live.

Launch posture answers what the product suggests to the customer next. Examples: backtest suggested, paper suggested, live with warning, live verified.

These two things are related, but they are not interchangeable.
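
One way to keep the two from being conflated is to model them as separate fields with separate types. The sketch below is an assumption about shape only: the class names are hypothetical, and the enum members are limited to the examples this page gives, which may not be the full set.

```python
from dataclasses import dataclass
from enum import Enum


class InstitutionalStatus(Enum):
    """Internal progress through the governed research path (examples from this page)."""
    DISCOVERED = "discovered"
    VALIDATED = "validated"
    PAPER_CANDIDATE = "paper_candidate"
    PROMOTED_LIVE = "promoted_live"


class LaunchPosture(Enum):
    """What the product suggests to the customer next (examples from this page)."""
    BACKTEST_SUGGESTED = "backtest suggested"
    PAPER_SUGGESTED = "paper suggested"
    LIVE_WITH_WARNING = "live with warning"
    LIVE_VERIFIED = "live verified"


@dataclass(frozen=True)
class StrategyPosture:
    """Hypothetical record that carries both values instead of one merged badge."""
    institutional_status: InstitutionalStatus
    launch_posture: LaunchPosture


# Illustrative pairing only; this page does not define how one maps to the other.
posture = StrategyPosture(
    institutional_status=InstitutionalStatus.VALIDATED,
    launch_posture=LaunchPosture.PAPER_SUGGESTED,
)
print(posture.institutional_status.value, "/", posture.launch_posture.value)
```

Because the two fields have different types, code that tries to compare or substitute one for the other fails visibly instead of passing silently.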

What warnings mean

Warnings explain missing evidence, not hidden certainty:

  • No recent backtest
  • Stale research evidence
  • No successful paper validation
  • Not institutionally promoted for live

In these cases, the product explains the gap and suggests the next safer action, usually starting with a backtest.
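
As a rough sketch, the warnings above can be derived from a snapshot of which evidence is present. The `EvidenceSnapshot` fields and the `warnings_for` function are hypothetical, and the snapshot uses plain booleans because this page does not say how "recent" or "stale" is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvidenceSnapshot:
    """Hypothetical summary of the evidence a strategy currently has."""
    has_recent_backtest: bool
    research_evidence_is_fresh: bool
    has_successful_paper_validation: bool
    institutionally_promoted_for_live: bool


def warnings_for(snapshot: EvidenceSnapshot) -> List[str]:
    """Return one warning per missing piece of evidence, in ladder order."""
    checks = [
        (snapshot.has_recent_backtest, "No recent backtest"),
        (snapshot.research_evidence_is_fresh, "Stale research evidence"),
        (snapshot.has_successful_paper_validation, "No successful paper validation"),
        (snapshot.institutionally_promoted_for_live, "Not institutionally promoted for live"),
    ]
    return [message for present, message in checks if not present]


# A strategy with a recent backtest and fresh research, but no paper run
# and no institutional promotion.
snapshot = EvidenceSnapshot(
    has_recent_backtest=True,
    research_evidence_is_fresh=True,
    has_successful_paper_validation=False,
    institutionally_promoted_for_live=False,
)
for warning in warnings_for(snapshot):
    print(warning)
```

Each warning names a specific gap, so an empty list means only that no gap was detected, not that the strategy is certain to perform.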