
Simulation & Backtesting

Methodology

HFT simulation requires tick-level or order-book data replay rather than daily OHLCV bars. Historical tick data is replayed in chronological order, with configurable latency, queue position modeling, and fill assumptions (passive vs aggressive).
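As a rough illustration of chronological replay with configurable latency, the sketch below merges market ticks with strategy orders, where each order only becomes visible to the simulated venue after a fixed delay. The function name `replay_with_latency`, the tuple layout, and the single fixed latency are assumptions made for this example. A real engine would generate orders reactively during replay rather than from a precomputed list.

```python
import heapq


def replay_with_latency(ticks, orders, latency_us):
    """Return one time-ordered event stream of (timestamp_us, kind, payload).

    ticks:  iterable of (exchange_ts_us, payload)
    orders: iterable of (decision_ts_us, payload); each order reaches the
            venue latency_us after the strategy decided to send it.
    """
    def by_time(event):
        return event[0]

    market = sorted(((ts, "tick", p) for ts, p in ticks), key=by_time)
    delayed = sorted(((ts + latency_us, "order", p) for ts, p in orders), key=by_time)
    # heapq.merge keeps the combined stream sorted by timestamp; on equal
    # timestamps the market tick is emitted before the order.
    return list(heapq.merge(market, delayed, key=by_time))


# Hypothetical data: the order is decided at t=150 but arrives at t=250,
# so it sees the tick at t=200 before it reaches the book.
events = replay_with_latency(
    ticks=[(100, "t1"), (200, "t2"), (300, "t3")],
    orders=[(150, "buy")],
    latency_us=100,
)
```

Sweeping `latency_us` over a range of values is a cheap first check of how sensitive a strategy is to the latency assumption.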

Key simulation parameters include round-trip latency, maker/taker fee structure, and queue position at the time of order submission. Survivorship bias and data quality are critical considerations when sourcing historical tick data.

This section dives into a niche but decisive problem: realistic passive-fill simulation. A common failure mode in HFT backtests is modeling alpha carefully while modeling queue dynamics poorly. If the fill model is optimistic, every downstream result looks better than reality.

Research objective: estimate fill hazard, not just fill ratio

Instead of asking "did this order fill?", ask "what is the instantaneous hazard of fill given queue state and flow?" That turns simulation into a conditional survival problem:

h(t | x) = lim(dt -> 0) P(fill in [t, t+dt] | not filled by t, x) / dt

Here `x` includes queue-ahead size, short-horizon order flow imbalance (OFI), spread regime, and local trade intensity. This formulation captures why two orders at the same price can have very different outcomes: their queue states and the flow around them differ.
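To connect the hazard to a quantity a backtest can check, the sketch below turns a hazard path into a fill probability over a horizon, using the standard survival relation S = exp(-cumulative hazard). It assumes the hazard is piecewise constant over steps of length `dt`. The function name and the two hazard paths are hypothetical values chosen for illustration, not calibrated numbers.

```python
import math


def fill_probability(hazards, dt):
    """P(fill within the horizon) for a piecewise-constant hazard path.

    hazards: hazard rates in fills per second, one per step of length dt.
    Survival is S = exp(-sum(h_i * dt)), so P(fill) = 1 - S.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if any(h < 0 for h in hazards):
        raise ValueError("hazard rates must be non-negative")
    cumulative_hazard = sum(h * dt for h in hazards)
    return 1.0 - math.exp(-cumulative_hazard)


# Two orders at the same price but with different queue states
# (hypothetical hazard paths over three one-second steps).
p_front = fill_probability([0.50, 0.60, 0.70], dt=1.0)
p_back = fill_probability([0.05, 0.05, 0.10], dt=1.0)
```

Under these made-up inputs the order near the front of the queue has a much higher fill probability than the one near the back, even though both rest at the same price.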

Event-driven replay with queue state vectors

In each replay step, we maintain a queue-state vector for every working order:

  • Q_ahead: estimated visible queue ahead at placement
  • dQ_trade: queue consumed by aggressive trades at level
  • dQ_cancel: queue reduction from cancellations ahead
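  • dQ_insert: new same-price arrivals that gain priority over the order (zero under strict price-time priority, where later arrivals queue behind; nonzero under pro-rata or other non-FIFO matching)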

Order fill progression then follows:

Q_ahead(t+1) = Q_ahead(t) - dQ_trade - dQ_cancel + dQ_insert

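The order becomes fill-eligible when `Q_ahead <= 0`. The fill itself is recorded on the next aggressive trade that reaches the order at that level, subject to venue-consistent timing and ack-latency constraints.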
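The update rule can be sketched as a small state machine per working order. This is a minimal illustration under stated assumptions, not the engine's actual code: the names `WorkingOrder` and `apply_step` are hypothetical. Cancels and inserts within a step are applied before trades, only trade volume in excess of the queue ahead fills the order, and ack latency is ignored.

```python
from dataclasses import dataclass


@dataclass
class WorkingOrder:
    q_ahead: float        # estimated visible size ahead of our order
    remaining: float      # our unfilled size
    filled: float = 0.0   # cumulative filled size


def apply_step(order, dq_trade=0.0, dq_cancel=0.0, dq_insert=0.0):
    """Advance one replay step and return the size filled in that step."""
    # Cancellations and priority-gaining inserts move our position first
    # (an ordering assumption for events that share a replay step).
    q = max(0.0, order.q_ahead - dq_cancel + dq_insert)
    # Trade volume consumes the queue ahead; any excess reaches our order.
    excess = max(0.0, dq_trade - q)
    order.q_ahead = max(0.0, q - dq_trade)
    fill = min(order.remaining, excess)
    order.remaining -= fill
    order.filled += fill
    return fill


order = WorkingOrder(q_ahead=100.0, remaining=10.0)
first = apply_step(order, dq_trade=30.0, dq_cancel=20.0)   # queue ahead drops to 50
second = apply_step(order, dq_trade=55.0)                  # 5 units trade through to us
```

Clamping `q_ahead` at zero keeps the state interpretable: once the order is at the front, further cancellations ahead cannot make the queue negative.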

Calibration workflow from historical data

A practical calibration pipeline for this niche problem:

  1. Collect order-level outcomes with timestamps and queue context features
  2. Bucket by spread state, volatility state, and time-of-day regime
  3. Fit hazard or discrete-time fill-probability model per bucket
  4. Validate calibration drift using rolling out-of-sample windows

This avoids one global model that silently underfits open/close behavior and overfits midday calm.
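Steps 2 to 4 can be sketched with the simplest possible per-bucket model: a smoothed empirical fill rate per regime bucket, scored out of sample with a Brier score. The function names, the additive smoothing prior, the bucket labels, and the records are all assumptions for illustration. A hazard model or logistic regression per bucket would slot into the same structure.

```python
from collections import defaultdict


def fit_bucket_fill_rates(records, prior_fills=1.0, prior_trials=2.0):
    """Fit a smoothed fill probability per bucket.

    records: iterable of (bucket, filled), where bucket is any hashable
    regime key, e.g. (spread_state, time_of_day), and filled is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [fills, trials]
    for bucket, filled in records:
        counts[bucket][0] += int(filled)
        counts[bucket][1] += 1
    return {
        bucket: (fills + prior_fills) / (trials + prior_trials)
        for bucket, (fills, trials) in counts.items()
    }


def brier_score(model, records, default=0.5):
    """Mean squared error between predicted fill probability and outcome."""
    errors = [(model.get(b, default) - float(filled)) ** 2 for b, filled in records]
    return sum(errors) / len(errors)


# Hypothetical in-sample and out-of-sample order outcomes.
train = [
    (("tight", "open"), True), (("tight", "open"), True), (("tight", "open"), False),
    (("wide", "midday"), False), (("wide", "midday"), False),
]
holdout = [(("tight", "open"), True), (("wide", "midday"), False)]

model = fit_bucket_fill_rates(train)
oos_error = brier_score(model, holdout)
```

Refitting on a rolling window and tracking `oos_error` per bucket over time gives a direct view of calibration drift.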

PnL decomposition that exposes model errors

We keep a strict decomposition:

NetPnL = SpreadCapture + Rebates - Fees - MarkoutLoss - Slippage - InventoryCarry

If simulation overestimates queue quality, you will usually see inflated SpreadCapture and artificially low MarkoutLoss. This decomposition makes that mismatch obvious during post-run diagnostics.
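One way to make the mismatch visible is to hold the components in a typed record and difference simulation against live, component by component. The class `PnLBreakdown`, the helper `component_gaps`, and all numbers below are hypothetical and only mirror the decomposition stated above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PnLBreakdown:
    spread_capture: float
    rebates: float
    fees: float
    markout_loss: float
    slippage: float
    inventory_carry: float

    @property
    def net(self):
        # NetPnL = SpreadCapture + Rebates - Fees - MarkoutLoss - Slippage - InventoryCarry
        return (self.spread_capture + self.rebates - self.fees
                - self.markout_loss - self.slippage - self.inventory_carry)


def component_gaps(sim, live):
    """Simulated minus live value for every component."""
    return {f.name: getattr(sim, f.name) - getattr(live, f.name) for f in fields(sim)}


# Hypothetical numbers in the same currency unit over the same period.
sim = PnLBreakdown(120.0, 15.0, 20.0, 30.0, 5.0, 2.0)
live = PnLBreakdown(95.0, 14.0, 20.0, 48.0, 6.0, 2.0)
gaps = component_gaps(sim, live)
```

In this made-up case the positive gap in `spread_capture` together with the negative gap in `markout_loss` is the signature of an optimistic queue model described above.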

Markout parity tests (the most useful validation I run)

For each simulated fill, compute k-horizon markouts and compare live vs sim distributions:

Markout_k = side * (Mid_(t+k) - FillPrice_t), with side = +1 for a buy fill and -1 for a sell fill

If simulated short-horizon markouts are systematically better than live for similar queue states, your hazard model or cancellation-latency model is too optimistic.
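A minimal parity check can be sketched as follows. It assumes `side` is +1 for a buy fill and -1 for a sell fill, that each fill is given as a `(side, fill_price, mid_at_horizon)` tuple, and that the sim and live fills have already been matched on similar queue states. It compares means only, whereas a fuller test would compare whole distributions. Function names and prices are hypothetical.

```python
from statistics import mean


def markout(side, fill_price, mid_at_horizon):
    """Signed k-horizon markout: positive means the mid moved in our favor."""
    return side * (mid_at_horizon - fill_price)


def mean_markout_gap(sim_fills, live_fills):
    """Mean simulated markout minus mean live markout.

    A persistently positive gap suggests the simulation is too optimistic.
    """
    sim = mean([markout(*fill) for fill in sim_fills])
    live = mean([markout(*fill) for fill in live_fills])
    return sim - live


# Hypothetical fills: (side, fill_price, mid k seconds after the fill).
sim_fills = [(+1, 100.00, 100.02), (-1, 100.04, 100.03)]
live_fills = [(+1, 100.00, 99.99), (-1, 100.04, 100.05)]
gap = mean_markout_gap(sim_fills, live_fills)
```

Running the same comparison at several horizons `k` helps separate immediate adverse selection from slower drift.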

Latency model coupling with queue model

Queue simulation and latency simulation cannot be separated. Cancellation delay changes the queue exposure window, which directly changes fill and markout quality. I model latency as regime-dependent random variables:

T_cancel | regime ~ D_regime, regime in {open, midday, news}

Then evaluate strategy robustness across these distributions, not one fixed value.
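As a sketch of regime-dependent latency sampling, the example below draws cancel latencies from one lognormal distribution per regime. The lognormal family, the parameter values, and the names `CANCEL_LATENCY_REGIMES`, `sample_cancel_latency_us`, and `latency_summary` are all assumptions for illustration. In practice the distributions would be fitted to measured cancel acknowledgements.

```python
import math
import random
from statistics import median

# Hypothetical parameters per regime:
# (median latency in microseconds, sigma of log-latency).
CANCEL_LATENCY_REGIMES = {
    "open": (400.0, 0.6),
    "midday": (250.0, 0.5),
    "news": (900.0, 0.8),
}


def sample_cancel_latency_us(regime, rng):
    """Draw one cancel latency (microseconds) for the given regime."""
    median_us, sigma = CANCEL_LATENCY_REGIMES[regime]
    # For a lognormal, exp(mu) is the median, so mu = log(median).
    return rng.lognormvariate(math.log(median_us), sigma)


def latency_summary(regime, n=2000, seed=7):
    """Empirical median and 99th percentile from n seeded draws."""
    rng = random.Random(seed)
    draws = sorted(sample_cancel_latency_us(regime, rng) for _ in range(n))
    return {"median": median(draws), "p99": draws[int(0.99 * (n - 1))]}


summaries = {regime: latency_summary(regime) for regime in CANCEL_LATENCY_REGIMES}
```

Feeding per-order draws from these distributions into the queue replay, instead of one fixed delay, exposes how much of the simulated edge depends on the tail of the latency distribution.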

Statistical checks before trusting results

  • Parameter stability across rolling windows
  • Error bounds for fill probability and markout estimates
  • Sensitivity of EV to small latency shifts and fee changes
  • Out-of-sample degradation from open to close regimes

For summary quality metrics:

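Sharpe = mean(r) / std(r) * sqrt(N), MaxDrawdown = max(Peak - Trough), where r is the per-period return series, N is the number of return periods per year (the annualization factor), and Peak is the running equity maximum before each Trough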

But for this niche use case, calibration error on fill/markout often matters more than Sharpe itself.
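For completeness, both summary metrics can be computed in a few lines. This sketch reads `N` as the number of return periods per year, uses the sample standard deviation, and measures drawdown in absolute equity units from the running peak. These are assumptions about conventions the text leaves open, and the inputs are made-up numbers.

```python
import math
from statistics import mean, stdev


def sharpe(returns, periods_per_year):
    """Annualized Sharpe ratio: mean(r) / std(r) * sqrt(N), zero risk-free rate."""
    sd = stdev(returns)  # sample standard deviation
    if sd == 0:
        return float("nan")
    return mean(returns) / sd * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest drop from a running peak to a later trough, in equity units."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


# Hypothetical daily returns and equity curve.
daily_sharpe = sharpe([0.01, -0.01, 0.02, 0.0], periods_per_year=252)
drawdown = max_drawdown([100.0, 110.0, 105.0, 120.0, 90.0, 95.0])
```

Tracking the running peak ensures the trough is always measured against a peak that came before it, which a naive `max(equity) - min(equity)` would not guarantee.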

How to use this in the project workflow

Use the engine iteratively: fit queue-hazard assumptions, replay with production-like latency, compare simulated and realized markouts, then tighten only the mismatched components. This cycle is slower than simplistic backtesting, but it is what makes live behavior converge with research behavior.

QuantifiedTrader is operated by an independent research-only group focused on building, documenting, and improving open quantitative-finance tools. Our purpose is to study markets, models, and methods—not to sell products, manage assets, or act on behalf of third parties.

No services. We do not provide investment, trading, brokerage, advisory, portfolio-management, custody, tax, legal, or any other professional or commercial services to any person or entity. Nothing on this site constitutes an offer, solicitation, recommendation, or endorsement to buy or sell securities or to adopt any investment strategy.

Research & education only. Content, data, backtests, charts, and software made available here are for informational and educational research. They may be incomplete, simulated, or based on third-party sources; past performance is not indicative of future results. You are solely responsible for your own decisions and for verifying any information before use.

No commercial benefit from shared knowledge. This site does not aim to profit from the knowledge, tools, or datasets published here. Materials are provided without charge for non-commercial research and learning, subject to applicable open-source or site terms where noted.

Disclaimer of warranties. All content and tools are supplied “as is” and “as available,” without warranties of any kind, express or implied, including accuracy, fitness for a particular purpose, or non-infringement. We disclaim liability for any loss or damage arising from use of or reliance on this site, to the fullest extent permitted by law.

Contact & disputes. For questions about this notice, the site, or any dispute relating to published materials, contact support@quantedx.com. We will endeavour to respond in good faith; this contact channel is for administrative and research correspondence only and does not create a client, advisory, or fiduciary relationship.

© 2026 QuantifiedTrader