Simulation & Backtesting

Methodology

HFT simulation requires tick-level or order-book data replay rather than daily OHLCV bars. Historical tick data is replayed in chronological order, with configurable latency, queue position modeling, and fill assumptions (passive vs aggressive).

Key simulation parameters include round-trip latency, maker/taker fee structure, and queue position at the time of order submission. Survivorship bias and data quality are critical considerations when sourcing historical tick data.

This section dives into a niche but decisive problem: realistic passive-fill simulation. A common failure mode in HFT backtests is modeling alpha well but queue dynamics poorly. If your fill model is optimistic, every downstream result looks better than reality.

Research objective: estimate fill hazard, not just fill ratio

Instead of asking "did this order fill?", ask "what is the instantaneous hazard of fill given queue state and flow?" That turns simulation into a conditional survival problem:

h(t | x) = lim(dt -> 0) P(fill in [t, t+dt] | not filled by t, x) / dt

Here `x` includes queue-ahead size, short-horizon order-flow imbalance (OFI), spread regime, and local trade intensity. Conditioning on these features explains why two orders at the same price can have very different outcomes.
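As a minimal sketch of the hazard view, assume the hazard is constant within each feature bucket. Under that assumption the estimate is simply fills divided by total resting time, and orders cancelled before filling (censored) still contribute exposure. The function names, the bucketing rule, and the sample observations below are illustrative assumptions, not part of the original engine.

```python
from collections import defaultdict


def empirical_hazard(observations, bucket_of):
    """Estimate a piecewise-constant fill hazard per feature bucket.

    observations: iterable of (x, exposure_seconds, filled) tuples, where
    exposure_seconds is how long the order rested before it filled or was
    cancelled, and filled is True/False.
    Returns {bucket: fills / total exposure}, in fills per second.
    """
    fills = defaultdict(int)
    exposure = defaultdict(float)
    for x, exposure_seconds, filled in observations:
        b = bucket_of(x)
        exposure[b] += exposure_seconds
        fills[b] += 1 if filled else 0
    return {b: fills[b] / exposure[b] for b in exposure if exposure[b] > 0}


def queue_bucket(x):
    # Hypothetical bucketing on queue-ahead size only (in lots).
    return "short_queue" if x["q_ahead"] < 100 else "long_queue"


# Made-up observations for illustration.
obs = [
    ({"q_ahead": 20}, 2.0, True),
    ({"q_ahead": 50}, 3.0, True),
    ({"q_ahead": 400}, 5.0, False),  # cancelled before fill: censored
    ({"q_ahead": 300}, 5.0, True),
]
hazard = empirical_hazard(obs, queue_bucket)
print(hazard)
```

A fill ratio would score the long-queue bucket at 50%, while the hazard also accounts for how long those orders had to rest to get there.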

Event-driven replay with queue state vectors

In each replay step, we maintain a queue-state vector for every working order:

  • Q_ahead: estimated visible queue ahead at placement
  • dQ_trade: queue consumed by aggressive trades at level
  • dQ_cancel: queue reduction from cancellations ahead
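  • dQ_insert: same-price size that gains priority over the order (under strict price-time priority, later arrivals queue behind the order, so this term is zero there)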

Order fill progression then follows:

Q_ahead(t+1) = Q_ahead(t) - dQ_trade - dQ_cancel + dQ_insert

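A fill becomes possible once `Q_ahead <= 0`: further aggressive volume at that price then fills the order, subject to venue-consistent timing and ack-latency constraints (an order cannot fill before the venue has acknowledged it).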
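The queue update above can be sketched as a small state object. This is an illustration under stated assumptions, not the engine's actual code: the class and field names are hypothetical, cancels and inserts are applied before trades within a step, events before the acknowledgement time are ignored, and `dq_insert` defaults to zero, which matches strict price-time priority.

```python
from dataclasses import dataclass


@dataclass
class WorkingOrder:
    q_ahead: float    # estimated size ahead of us at this price
    remaining: float  # our unfilled size
    ack_time: float   # time the venue acknowledged the order
    filled: float = 0.0

    def on_event(self, t, dq_trade=0.0, dq_cancel=0.0, dq_insert=0.0):
        """Apply one replay step and return the size filled on this step."""
        if t < self.ack_time:
            return 0.0  # not yet live at the venue, so no queue position
        # Cancels and inserts adjust the queue ahead first (an assumption).
        self.q_ahead = max(self.q_ahead - dq_cancel + dq_insert, 0.0)
        # Trades consume the queue ahead, then spill over into our order.
        overflow = max(dq_trade - self.q_ahead, 0.0)
        self.q_ahead = max(self.q_ahead - dq_trade, 0.0)
        fill = min(overflow, self.remaining)
        self.remaining -= fill
        self.filled += fill
        return fill


order = WorkingOrder(q_ahead=100, remaining=10, ack_time=1.0)
f0 = order.on_event(0.5, dq_trade=500)               # before ack: ignored
f1 = order.on_event(1.5, dq_trade=60, dq_cancel=30)  # queue shrinks to 10
f2 = order.on_event(2.0, dq_trade=15)                # 5 lots reach our order
f3 = order.on_event(2.5, dq_trade=20)                # rest of the order fills
print(f0, f1, f2, f3, order.remaining)
```

Partial fills fall out naturally here: only the trade volume that exceeds the queue ahead reaches the order.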

Calibration workflow from historical data

A practical calibration pipeline for this niche problem:

  1. Collect order-level outcomes with timestamps and queue context features
  2. Bucket by spread state, volatility state, and time-of-day regime
  3. Fit hazard or discrete-time fill-probability model per bucket
  4. Validate calibration drift using rolling out-of-sample windows

This avoids one global model that silently underfits open/close behavior and overfits midday calm.
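A rough sketch of steps 2 to 4, assuming the simplest possible per-bucket model (the empirical fill rate) and a walk-forward check that fits on one window and scores on the next. The function names, the bucket keys, and the sample rows are hypothetical; a real pipeline would likely use a richer model per bucket.

```python
from collections import defaultdict


def fit_fill_prob(rows):
    """rows: iterable of (bucket, filled). Returns {bucket: fill rate}."""
    n = defaultdict(int)
    k = defaultdict(int)
    for bucket, filled in rows:
        n[bucket] += 1
        k[bucket] += int(filled)
    return {b: k[b] / n[b] for b in n}


def rolling_drift(windows):
    """windows: list of row lists in time order.

    Fit on window i, compare with window i+1, and return the mean absolute
    gap between predicted and realized fill rate over shared buckets.
    """
    gaps = []
    for train, test in zip(windows, windows[1:]):
        pred = fit_fill_prob(train)
        real = fit_fill_prob(test)
        shared = sorted(set(pred) & set(real))
        if not shared:
            gaps.append(None)  # nothing comparable in this step
            continue
        gaps.append(sum(abs(pred[b] - real[b]) for b in shared) / len(shared))
    return gaps


# Made-up outcomes keyed by (spread state, time-of-day regime).
w1 = [(("tight", "open"), True), (("tight", "open"), False),
      (("wide", "midday"), True), (("wide", "midday"), True)]
w2 = [(("tight", "open"), True), (("tight", "open"), True),
      (("wide", "midday"), True), (("wide", "midday"), False)]
w3 = list(w2)
drift = rolling_drift([w1, w2, w3])
print(drift)
```

A drift series that stays near zero suggests the bucket parameters are stable; a jump flags a window where the calibration should be refit or the bucketing revisited.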

PnL decomposition that exposes model errors

We keep a strict decomposition:

NetPnL = SpreadCapture + Rebates - Fees - MarkoutLoss - Slippage - InventoryCarry

If simulation overestimates queue quality, you will usually see inflated SpreadCapture and artificially low MarkoutLoss. This decomposition makes that mismatch obvious during post-run diagnostics.
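One way to keep the decomposition strict is to carry the components in a single record and compare simulation against live results component by component. The sketch below assumes all inputs are already aggregated to the same period and currency; the class name, helper name, and the numbers are made up for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class PnLBreakdown:
    spread_capture: float
    rebates: float
    fees: float
    markout_loss: float
    slippage: float
    inventory_carry: float

    def net(self):
        # Mirrors the decomposition: gains minus each cost component.
        return (self.spread_capture + self.rebates - self.fees
                - self.markout_loss - self.slippage - self.inventory_carry)


def component_gaps(sim, live):
    """Per-component sim minus live, to show where the sim is optimistic."""
    s, l = asdict(sim), asdict(live)
    return {name: s[name] - l[name] for name in s}


# Illustrative numbers only.
sim = PnLBreakdown(120.0, 15.0, 10.0, 20.0, 5.0, 3.0)
live = PnLBreakdown(90.0, 15.0, 10.0, 45.0, 5.0, 3.0)
gaps = component_gaps(sim, live)
print(sim.net(), live.net(), gaps)
```

In this made-up case the whole net gap comes from a higher spread capture and a lower markout loss in simulation, which is the signature of an optimistic queue model described above.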

Markout parity tests (the most useful validation I run)

For each simulated fill, compute k-horizon markouts and compare live vs sim distributions:

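Markout_k = side * (Mid_(t+k) - FillPrice_t)

where side is +1 for a buy and -1 for a sell, so a positive markout means the mid moved in the fill's favor.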

If simulated short-horizon markouts are systematically better than live for similar queue states, your hazard model or cancellation-latency model is too optimistic.
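A minimal sketch of the markout computation, assuming fills are `(time, side, price)` tuples with side +1 for a buy and -1 for a sell, and that the mid at `t + k` is the last observed mid at or before that time. The helper names and sample data are hypothetical, and the comparison here uses only the mean; in practice you would compare quantiles of the two distributions as well.

```python
from bisect import bisect_right
from statistics import mean


def mid_at(mid_times, mids, t):
    """Last observed mid at or before time t (mid_times sorted ascending)."""
    i = bisect_right(mid_times, t) - 1
    if i < 0:
        raise ValueError("no mid observed at or before t")
    return mids[i]


def markout(fill, mid_times, mids, k):
    """Signed k-horizon markout for one fill (time, side, price)."""
    t, side, price = fill
    return side * (mid_at(mid_times, mids, t + k) - price)


def markout_gap(sim_fills, live_fills, mid_times, mids, k):
    """Mean simulated markout minus mean live markout at horizon k."""
    sim = mean(markout(f, mid_times, mids, k) for f in sim_fills)
    live = mean(markout(f, mid_times, mids, k) for f in live_fills)
    return sim - live


# Made-up mid series and fills for illustration.
mid_times = [0.0, 1.0, 2.0, 3.0]
mids = [100.0, 100.5, 99.5, 100.0]
sim_fills = [(0.0, 1, 99.5)]    # buy, mid rises afterwards
live_fills = [(1.0, 1, 100.5)]  # buy, mid falls afterwards
m_sim = markout(sim_fills[0], mid_times, mids, 1.0)
gap = markout_gap(sim_fills, live_fills, mid_times, mids, 1.0)
print(m_sim, gap)
```

A persistently positive gap for comparable queue states is the optimism signal described above.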

Latency model coupling with queue model

Queue simulation and latency simulation cannot be separated. Cancellation delay changes the queue exposure window, which directly changes fill and markout quality. I model latency as a regime-dependent random variable:

T_cancel | regime ~ D_regime, with regime in {open, midday, news}

Then evaluate strategy robustness across these distributions, not one fixed value.
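As a sketch of that evaluation loop, the snippet below draws cancel latencies from a separate distribution per regime and averages a strategy result over the draws. The lognormal form, its parameters, and the toy strategy are placeholders chosen for illustration; the original text does not specify the distributions.

```python
import random

# Hypothetical lognormal parameters (mu, sigma of log-microseconds) per regime.
CANCEL_LATENCY = {
    "open": (5.5, 0.6),
    "midday": (5.0, 0.3),
    "news": (6.0, 0.8),
}


def sample_cancel_latency_us(regime, rng):
    """Draw one cancel latency in microseconds for the given regime."""
    mu, sigma = CANCEL_LATENCY[regime]
    return rng.lognormvariate(mu, sigma)


def robustness(run_strategy, regimes, n_draws, seed=0):
    """Return {regime: mean of run_strategy(latency_us)} over n_draws."""
    rng = random.Random(seed)
    out = {}
    for regime in regimes:
        results = [run_strategy(sample_cancel_latency_us(regime, rng))
                   for _ in range(n_draws)]
        out[regime] = sum(results) / n_draws
    return out


def toy_strategy(latency_us):
    # Stand-in for a full replay: result degrades linearly with latency.
    return 1.0 - latency_us / 1000.0


result = robustness(toy_strategy, ["open", "midday", "news"],
                    n_draws=2000, seed=42)
print(result)
```

In a real run, `run_strategy` would be a full replay with the sampled latency applied to every cancel, and you would look at the spread of results per regime, not only the mean.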

Statistical checks before trusting results

  • Parameter stability across rolling windows
  • Error bounds for fill probability and markout estimates
  • Sensitivity of EV to small latency shifts and fee changes
  • Out-of-sample degradation when parameters fit in one regime (e.g. the open) are applied to another (e.g. the close)
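For the error-bound check on fill probability, one reasonable choice is the Wilson score interval, which behaves better than the normal approximation when a bucket has few orders or a fill rate near 0 or 1. This is a suggested technique, not something the original pipeline is stated to use, and it treats orders in a bucket as independent, which is optimistic for clustered flow.

```python
from math import sqrt


def wilson_interval(fills, orders, z=1.96):
    """Wilson score interval for a fill probability (z=1.96 for ~95%)."""
    if orders <= 0:
        raise ValueError("orders must be positive")
    p = fills / orders
    denom = 1 + z * z / orders
    center = (p + z * z / (2 * orders)) / denom
    half = z * sqrt(p * (1 - p) / orders + z * z / (4 * orders * orders)) / denom
    return center - half, center + half


lo_small, hi_small = wilson_interval(30, 100)
lo_big, hi_big = wilson_interval(300, 1000)
print((lo_small, hi_small), (lo_big, hi_big))
```

If the interval for a bucket is wider than the fill-rate difference that would change a trading decision, the bucket needs more data or should be merged with a neighbor.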

For summary quality metrics:

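Sharpe = mean(r) / std(r) * sqrt(N)
MaxDrawdown = max over t of (RunningPeak_t - Equity_t)

where r is the per-period return series and N is the number of return periods per year.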

But for this niche use case, calibration error on fill/markout often matters more than Sharpe itself.
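The two summary metrics can be computed as follows. This sketch assumes N is an annualization factor (return periods per year), uses the sample standard deviation, and measures drawdown in absolute equity units from the running peak; the function names and sample series are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio with a zero risk-free rate."""
    sd = stdev(returns)
    if sd == 0:
        raise ValueError("returns have zero variance")
    return mean(returns) / sd * sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest drop from a running peak to a later value of the curve."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


# Made-up series for illustration.
s = sharpe([0.01, 0.03], periods_per_year=1)
dd = max_drawdown([100, 110, 105, 120, 90, 95])
print(s, dd)
```

Tracking the running peak matters: taking the global maximum minus the global minimum would overstate drawdown whenever the low comes before the high.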

How to use this in the project workflow

Use the engine iteratively: fit queue-hazard assumptions, replay with production-like latency, compare simulated and realized markouts, then tighten only the mismatched components. This cycle is slower than simplistic backtesting, but it is what makes live behavior converge with research behavior.

QuantifiedTrader

Independent quantitative research on trading methods, backtesting, and market analytics.

Research disclaimer

QuantifiedTrader is operated by an independent quantitative research group. We study, document, and compare different methods of trading, portfolio construction, risk management, and investment analysis. Our work is exploratory and academic in nature—we build tools, run backtests, and publish findings to advance understanding, not to promote any particular strategy or product.

Not investment advice. Nothing on this website constitutes investment, trading, financial, tax, legal, or other professional advice. We do not recommend, endorse, or solicit the purchase or sale of any security, derivative, or financial instrument, nor do we suggest that any strategy, model, or result presented here is suitable for any individual or institution. Any examples, simulations, or performance figures are illustrative research outputs only.

No client or advisory relationship. We do not provide investment advisory, brokerage, portfolio-management, custody, or asset-management services to any person or entity. Browsing this site, using our tools, or contacting us does not create a client, fiduciary, or advisory relationship. We do not manage money on behalf of third parties and do not act as agents for any financial institution.

Research & education only. Content, datasets, backtests, charts, code, and software made available here are for informational and educational research. Materials may be incomplete, simulated, hypothetical, or derived from third-party sources that we do not control. Past performance, backtested results, and historical analyses are not indicative of future results. Market conditions change; models may fail; assumptions may be wrong. You are solely responsible for evaluating any information and for all decisions you make.

No responsibility or liability. To the fullest extent permitted by applicable law, QuantifiedTrader and its contributors disclaim all responsibility and liability for any loss, damage, cost, or expense—direct or indirect—arising from access to, use of, or reliance on this website, its content, or its tools. All materials are provided “as is” and “as available,” without warranties of any kind, whether express or implied, including but not limited to accuracy, completeness, fitness for a particular purpose, or non-infringement.

Non-commercial research sharing. This site does not aim to profit from the knowledge, tools, or datasets published here. Materials are shared for non-commercial research and learning, subject to applicable open-source or site terms where noted. We are a research collective, not a commercial product or service provider.

Contact. For questions about this notice, the site, or published research materials, contact support@quantedx.com. Correspondence is for administrative and research purposes only and does not constitute advice or create any professional obligation on our part.

© 2026 QuantifiedTrader. All rights reserved.