Methodology And Risk

Review the high-level methodology, data framing, and warning philosophy behind the research surface.

This page explains the trust framework behind the research section without exposing internal runbooks or operational thresholds.

Methodology

Statly research is designed around a simple rule: evidence should reflect what could have been known at the time, not what became obvious later.

That is why the product speaks in terms of:

Point-in-time discipline
No-lookahead reasoning
Validation rather than single-run storytelling
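
As a minimal sketch of what point-in-time discipline means in practice, the lookup below returns only the latest value that was already available at the decision time. The function name `as_of`, the `(available_at, value)` layout, and the sample funding values are illustrative assumptions, not Statly's internal API or data.

```python
from bisect import bisect_right
from datetime import datetime


def as_of(observations, decision_time):
    """Return the latest value whose availability time is <= decision_time.

    `observations` is a list of (available_at, value) tuples sorted by
    available_at. Anything published after decision_time is ignored, which
    is what prevents look-ahead.
    """
    times = [available_at for available_at, _ in observations]
    idx = bisect_right(times, decision_time)
    return observations[idx - 1][1] if idx else None


# Hypothetical funding-rate observations, keyed by when each became known.
funding = [
    (datetime(2024, 1, 1, 0), 0.0001),
    (datetime(2024, 1, 1, 8), 0.0003),
    (datetime(2024, 1, 1, 16), -0.0002),
]

# A decision made at 09:00 may only see the 08:00 value, never the 16:00 one.
print(as_of(funding, datetime(2024, 1, 1, 9)))
```

Keying on the time a value became available, rather than the time it describes, is the design choice that keeps later revisions out of earlier decisions.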

Data provenance and hygiene

The public claim should be conservative:

research uses explicit data families rather than vague "black-box AI outputs"
candidate evidence is tied to known market inputs
point-in-time discipline matters more than headline win rate
provenance and recency are part of the trust framework, not afterthoughts

What we can say publicly today:

some lanes are vendor-backed and some are natively collected
manifests, runs, and candidate states are tracked as first-class objects
the product takes data validation seriously enough to surface stale evidence and warning posture instead of pretending every candidate is equally fresh

What we should not over-claim publicly yet:

lane-by-lane outlier policy details for every dataset
a fully published missing-data policy for every exchange feed
municipality-grade or vendor-grade operational detail that belongs in internal runbooks

For the public-facing provenance framing, see Data Provenance And Hygiene.

Data families

The research system works with the following data families:

Family	Description
Trades	Executed trade data
Best bid / offer	Order book top-of-book
Order book depth (L2)	Multi-level depth, imbalance, and order book pressure
Funding	Perpetual funding rates
Open interest	Market-wide open positions
Mark price	Exchange mark price
Index price	Cross-venue index
Liquidations	Forced liquidation events
Oracle price	External price feed data

Some data lanes rely on vendor-backed coverage while others are built on native collection. The customer-facing point is that the research system treats data provenance and validation as part of the trust framework.

Statistical rigor

The research docs should be explicit that screening is not allowed to quietly reward noise.

At the current boundary, the screening engine already uses:

fold-based out-of-sample evidence
a multiple-testing correction path
a concrete selection rule rather than hand-wavy ranking language
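
The page does not specify which fold scheme the screening engine uses. As one common possibility, the sketch below builds expanding-window walk-forward folds in which every test window sits strictly after its training window. The function name `walk_forward_folds` and its parameters are hypothetical.

```python
def walk_forward_folds(n_samples, n_folds, min_train):
    """Yield (train, test) index ranges for expanding-window walk-forward folds.

    Each test range starts where its train range ends, so no test sample
    is ever older than a sample used for fitting.
    """
    if n_folds < 1 or min_train < 1:
        raise ValueError("n_folds and min_train must be positive")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        yield range(0, train_end), range(train_end, train_end + test_size)


for train, test in walk_forward_folds(n_samples=100, n_folds=4, min_train=20):
    print(f"train [0, {train.stop}) -> test [{test.start}, {test.stop})")
```

Out-of-sample evidence is then whatever the feature scores on the test ranges only, aggregated across folds.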

Today, the public code-backed correction in the feature-screen path is:

bonferroni_ic_pvalue.v1

This means Statly is already taking the multiple-comparison problem seriously instead of pretending that the best-looking feature automatically deserves trust.

At a high level, that correction can be summarized as:

adjusted_p_value = min(1, raw_p_value * number_of_tests)

That formula matters because it makes it harder for a feature to survive screening just because many alternatives were tried.
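
A small worked sketch of that formula follows. The helper names, the sample p-values, and the 0.05 threshold are illustrative assumptions; the page does not publish the selection threshold used with bonferroni_ic_pvalue.v1.

```python
def bonferroni_adjust(raw_p_values):
    """Apply adjusted_p = min(1, raw_p * number_of_tests) to each p-value."""
    number_of_tests = len(raw_p_values)
    return [min(1.0, p * number_of_tests) for p in raw_p_values]


def survivors(raw_p_values, alpha=0.05):
    """Return indices of features whose adjusted p-value is still <= alpha.

    alpha=0.05 is a conventional placeholder, not a documented Statly value.
    """
    adjusted = bonferroni_adjust(raw_p_values)
    return [i for i, p in enumerate(adjusted) if p <= alpha]


# Four hypothetical features screened together.
raw = [0.001, 0.02, 0.04, 0.30]
print(bonferroni_adjust(raw))
print(survivors(raw))
```

In this example three of the four raw p-values sit below 0.05, but only the first feature survives once the number of tests is taken into account.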

Biases the docs should name openly

If the docs want to build trust, they should say these words directly:

look-ahead bias
overfitting
multiple testing
stale evidence
execution-cost optimism

The product becomes more credible when it shows that a favorable backtest or a strong-looking feature can still be rejected for honest reasons.

Backtest realism

A backtest is only useful if it is trying not to lie.

That means the public docs should keep reinforcing that replay quality depends on:

holdout review
paper shadow or paper observation
cost validation
slippage assumptions
implementation drift and latency awareness

The current stack already contains code and tests for execution cost, slippage, latency-aware stress handling, and depth-aware market structure. The customer-facing docs do not need every parameter, but they should clearly state that PnL is not treated as a frictionless fantasy number.

At the conceptual level, the backtest engine is always trying to defend against a fake equation like:

naive_pnl = gross_feature_excess

and replace it with something closer to:

realistic_pnl = gross_feature_excess - fees - slippage - latency_and_implementation_drag
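
To make the adjustment concrete, here is a minimal sketch with every term expressed in basis points. The function signature and the numbers are made up for illustration; the real engine's cost model and parameters are not described on this page.

```python
def realistic_pnl(gross_feature_excess, fees, slippage, latency_drag):
    """Subtract each friction term from the gross excess (all in basis points).

    latency_drag stands in for the combined latency and implementation drag.
    """
    return gross_feature_excess - fees - slippage - latency_drag


# Hypothetical per-trade figures in basis points.
naive = 12.0
net = realistic_pnl(gross_feature_excess=12.0, fees=4.0, slippage=5.0, latency_drag=4.0)
print(naive, net)
```

With these illustrative inputs a feature that looks profitable before costs (12 bps) ends up slightly negative after them (-1 bp), which is the kind of outcome the naive equation hides.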

Why warnings exist

Warnings exist because the system should not lie about what is missing.

If a candidate lacks:

Recent backtest evidence
Recent paper observation
Stronger institutional promotion for live use

...the product says that clearly and suggests the safer next step.

A warning is not a hidden block. It is an honest statement about the current evidence gap. The operator still has agency to choose their next action within the workspace.
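
The warning posture above could be sketched as a function that lists evidence gaps instead of blocking the candidate. Everything here is hypothetical: the `CandidateEvidence` fields, the message wording, and especially the 30-day freshness window, which is a placeholder and not an operational threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CandidateEvidence:
    last_backtest_at: Optional[datetime]
    last_paper_observation_at: Optional[datetime]
    promoted_for_live: bool


def evidence_warnings(ev: CandidateEvidence, now: datetime,
                      max_age: timedelta = timedelta(days=30)) -> List[str]:
    """Return human-readable warnings; an empty list means no known gap.

    max_age is an illustrative placeholder for "recent".
    """
    def stale(ts: Optional[datetime]) -> bool:
        return ts is None or now - ts > max_age

    warnings = []
    if stale(ev.last_backtest_at):
        warnings.append("No recent backtest evidence. Safer next step: rerun the backtest.")
    if stale(ev.last_paper_observation_at):
        warnings.append("No recent paper observation. Safer next step: observe in paper first.")
    if not ev.promoted_for_live:
        warnings.append("Not yet promoted for live. Safer next step: stay in paper.")
    return warnings


now = datetime(2024, 6, 1)
candidate = CandidateEvidence(datetime(2024, 5, 25), None, False)
for message in evidence_warnings(candidate, now):
    print(message)
```

The function only reports gaps and never raises or refuses, mirroring the idea that a warning informs the operator's next action rather than replacing it.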

Where features fail

The docs should also teach where good-looking candidates can still break:

thin-liquidity windows
regime transitions
cost sensitivity
live-vs-backtest decay
partial evidence where backtest exists but paper confirmation is still weak

This is not negative marketing. It is how the product earns trust.
