Linear Regression Prediction Channels in MQL5: Constructing Statistically Grounded Confidence and Prediction Bands

Ushana Kevin Iorkumbul

Introduction

Bollinger Bands and Donchian Channels are among the most widely used channel overlays in algorithmic and discretionary trading. Both are computationally straightforward, visually intuitive, and available in virtually every charting platform. Both are also descriptive tools rather than inferential ones: they summarize realized price behavior without committing to an explicit statistical model from which formal coverage statements could be derived. This distinction is not academic. It has direct consequences for how channel violations should be interpreted, how channel width should be modeled, and how position-sizing frameworks that reference channel boundaries should be designed.

A Donchian Channel records the highest high and lowest low over a lookback window. Its width is entirely determined by extreme realized prices. It says nothing about the expected future distribution of prices relative to any trend. A Bollinger Band places bands at a fixed multiple of the sample standard deviation above and below a simple moving average. The standard deviation multiplier, almost universally set to 2.0, is an arbitrary scaling convention rather than a parameter derived from a probability model. The resulting envelope has no formal coverage guarantee: you cannot claim that the band contains 95% of future closing prices in a statistically precise sense, because no such derivation was performed.

Statistical inference requires a generative model, parameter estimation, and a probability distribution for the estimation error. This article develops exactly that structure using Ordinary Least Squares regression fitted to a rolling window of closing prices. The result is a pair of bands with coverage properties that are formally defined within the model: a confidence interval for the mean of the regression line, and a prediction interval for an individual observation, both computed using Student's t critical values with the appropriate degrees of freedom rather than a fixed multiplier. Note that "formally defined within the model" does not mean "empirically guaranteed on price data." The nominal 95% holds only when the OLS assumptions hold, and for financial prices those assumptions are routinely violated. Both regression channels and Bollinger Bands require empirical calibration before any coverage claim is trusted. Regression channels are better motivated theoretically because their width is derived from an explicit model, not a heuristic multiplier.



Section 1: Theoretical Foundation

Ordinary Least Squares on a Rolling Window

Given n closing prices y₁, y₂, …, yₙ observed at integer bar indices x₁, x₂, …, xₙ, Ordinary Least Squares estimates the linear model:

ŷ = β₀ + β₁ · x

by minimizing the sum of squared residuals:

SSE = Σᵢ (yᵢ − β₀ − β₁ · xᵢ)²

The closed-form estimators are:

β₁ = [n·Σ(xᵢ·yᵢ) − Σxᵢ·Σyᵢ] / [n·Σxᵢ² − (Σxᵢ)²]

β₀ = ȳ − β₁·x̄

where x̄ and ȳ are the sample means of the bar indices and closing prices respectively.

OLS Variable Reference

Symbol   Meaning
x        Bar index within the regression window
y        Closing price
β₀       Intercept: regression line value at x = 0
β₁       Slope: price change per bar
SSE      Sum of squared residuals
σ²       True residual variance, estimated by s² = SSE / (n − 2)
n        Window size (number of bars in the regression)
x̄        Mean of bar indices
ȳ        Mean of closing prices
Sxx      Σ(xᵢ − x̄)²: sum of squared deviations of the bar indices

Residual Variance and Degrees of Freedom

Once β₀ and β₁ are estimated, the residual for each observation is:

eᵢ = yᵢ − (β₀ + β₁·xᵢ)

The unbiased estimator of residual variance is:

s² = SSE / (n − 2)

The denominator is n − 2 rather than n because two parameters, the slope and the intercept, have been estimated from the data. Each estimated parameter consumes one degree of freedom. Dividing by n − 2 corrects for this and produces an unbiased estimator of the true error variance σ².

Degrees of Freedom Reference

Parameter                     Value
Sample size                   n
Estimated parameters          2 (β₀ and β₁)
Residual degrees of freedom   n − 2

For a window of 50 bars, the regression has 48 degrees of freedom. The degrees of freedom determine the t critical value, and therefore the width of every interval computed in the sections that follow.
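As a quick numerical check of the estimators and the n − 2 divisor, the following stand-alone MQL5 script fits five made-up prices on the grid x = 0, …, 4. It is a minimal sketch, not one of the article's modules: the function name FitLine and the sample data are hypothetical, chosen so the arithmetic is easy to verify by hand (x̄ = 2, ȳ = 3.4, Sxx = 10, Sxy = 12).

```mql5
//+------------------------------------------------------------------+
//| Hypothetical helper (not part of the article's headers).         |
//| Fits y = b0 + b1*x on the grid x = 0..n-1 and returns SSE and    |
//| the residual variance s2 = SSE / (n - 2).                        |
//+------------------------------------------------------------------+
bool FitLine(const double &y[], double &b0, double &b1,
             double &sse, double &s2)
  {
   int n = ArraySize(y);
   if(n < 3)
      return(false);

   double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;
   for(int i = 0; i < n; i++)
     {
      double x = (double)i;
      sum_x  += x;
      sum_y  += y[i];
      sum_xx += x * x;
      sum_xy += x * y[i];
     }

//--- Closed-form estimators exactly as written in the text
   double denom = n * sum_xx - sum_x * sum_x;
   b1 = (n * sum_xy - sum_x * sum_y) / denom;
   b0 = sum_y / n - b1 * (sum_x / n);

//--- Residuals e_i = y_i - (b0 + b1*x_i), summed directly
   sse = 0.0;
   for(int i = 0; i < n; i++)
     {
      double e = y[i] - (b0 + b1 * i);
      sse += e * e;
     }

//--- Two estimated parameters, so the divisor is n - 2
   s2 = sse / (n - 2);
   return(true);
  }

void OnStart()
  {
   double y[] = {1.0, 2.0, 4.0, 4.0, 6.0};   // made-up prices
   double b0 = 0.0, b1 = 0.0, sse = 0.0, s2 = 0.0;

   if(FitLine(y, b0, b1, sse, s2))
      PrintFormat("b1=%.4f  b0=%.4f  SSE=%.4f  s2=%.4f", b1, b0, sse, s2);
  }
```

Worked by hand, the slope is 12/10 = 1.2, the intercept is 3.4 − 1.2·2 = 1.0, the residuals are 0, −0.2, 0.6, −0.6, 0.2, so SSE = 0.8 and s² = 0.8/3 ≈ 0.2667. Dividing by n = 5 instead would give 0.16, which understates the error variance.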

Confidence Intervals for the Mean Response

The confidence interval at position x within the regression window quantifies uncertainty in the estimated mean response, the expected value of y at that x given the estimated parameters. The formula is:

ŷ(x) ± t(α/2, n−2) · s · √[ 1/n + (x − x̄)² / Sxx ]

where:

  • t(α/2, n−2) is the critical value from the Student's t distribution with n−2 degrees of freedom at significance level α (e.g., 0.05 for 95% coverage)
  • s = √s² is the residual standard error
  • Sxx = Σ(xᵢ − x̄)²

The term under the square root has two components. The 1/n term reflects global estimation uncertainty, uncertainty about where the regression line lies on average across the entire window. The (x − x̄)²/Sxx term reflects leverage: how far the evaluation point x is from the center of the x-domain. Points far from x̄ have greater influence on the slope estimate and consequently accumulate more uncertainty.

This means confidence interval bands are narrowest at x̄ (the center of the regression window) and widen toward both edges. This is a deterministic geometric property, not a consequence of sparse data at the window boundaries.

Prediction Intervals for Individual Observations

A prediction interval answers a different question: given the estimated regression line, what range of individual y values is plausible at position x? The formula adds a 1 under the square root to account for irreducible observation-level noise:

ŷ(x) ± t(α/2, n−2) · s · √[ 1 + 1/n + (x − x̄)² / Sxx ]

This extra 1 represents the variance of a new, independent observation drawn from the process. Even if the regression parameters were known exactly, so that the 1/n and leverage terms vanished, a future observation would still scatter around the line with variance σ², which s² estimates. Whenever s > 0, prediction intervals are therefore strictly wider than confidence intervals at every point in the window.

In-Sample Edge Versus One-Step-Ahead Forecast

A precise statement of where the interval is evaluated matters as much as the formula itself. With bar indices assigned as x = 0, 1, …, n−1 over the window, two distinct evaluation points are meaningful.

Evaluating at x = n−1, the last observed bar inside the window, produces an in-sample edge interval: the conditional scatter band around the fitted line at the most recent observation used to fit it. This is the natural quantity to display as the right edge of a channel that tracks current price.

Evaluating at x = n, one bar beyond the window, produces a genuine one-step-ahead prediction interval for the next, not-yet-observed bar. Its fitted value and half-width are:

ŷ(n) = β̂₀ + β̂₁·n

ŷ(n) ± t(α/2, n−2) · s · √[ 1 + 1/n + (n − x̄)² / Sxx ]

These are not interchangeable. The in-sample edge interval describes dispersion at a point the model has already seen; the one-step-ahead interval describes dispersion at a point it has not. The accompanying indicator implements both and exposes the choice through an input, so the displayed band can be labeled honestly as either an edge band or a forecast band. Even the one-step-ahead interval is still constructed in-sample because the regression is refit on each bar using data up to that bar. A rigorous forecast evaluation requires the out-of-sample coverage test described later.
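To make the difference between the evaluation points concrete, the sketch below computes only the square-root factors that multiply t · s in the two formulas, at the window center, at the in-sample edge x = n − 1, and at the one-step-ahead point x = n. The function names CIFactor and PIFactor are hypothetical and are not part of the article's modules, and the 50-bar window is just an illustrative choice.

```mql5
//--- Hypothetical helpers: square-root factors of the interval
//--- half-widths on the integer grid x = 0..n-1.

//--- sqrt(1/n + (x - x_mean)^2 / Sxx): confidence interval factor
double CIFactor(int n, double x)
  {
   double x_mean = (n - 1) / 2.0;
   double sxx    = n * ((double)n * n - 1.0) / 12.0;
   double d      = x - x_mean;
   return(MathSqrt(1.0 / n + d * d / sxx));
  }

//--- sqrt(1 + 1/n + (x - x_mean)^2 / Sxx): prediction interval factor
double PIFactor(int n, double x)
  {
   double ci = CIFactor(n, x);
   return(MathSqrt(1.0 + ci * ci));
  }

void OnStart()
  {
   int    n       = 50;                       // illustrative window size
   double pts[]   = {24.5, 49.0, 50.0};       // x_mean, n-1, n
   string label[] = {"center x_mean", "edge x = n-1", "ahead x = n"};

   for(int i = 0; i < 3; i++)
      PrintFormat("%-14s CI factor = %.5f   PI factor = %.5f",
                  label[i], CIFactor(n, pts[i]), PIFactor(n, pts[i]));
  }
```

For n = 50 the confidence factor roughly doubles between the center (about 0.141) and the edge (about 0.279), while the prediction factor moves only from about 1.010 to 1.038, and to about 1.040 one bar ahead. The prediction band is dominated by the constant 1, which is why it looks nearly parallel on a chart while the confidence band visibly fans out.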

Channel Type Comparison

Channel               Width Source                               Statistical Basis
Donchian Channel      Historical extrema                         None (descriptive)
Bollinger Bands       Fixed multiplier × standard deviation      Heuristic
Confidence Interval   Mean estimation uncertainty (OLS)          Student's t, n−2 df
Prediction Interval   Individual observation uncertainty (OLS)   Student's t, n−2 df

Why the Student's t Distribution Is Required

Using z-values (normal distribution critical values) instead of t-values is asymptotically valid but systematically underestimates interval width for small and moderate samples. With 48 degrees of freedom, the 97.5th percentile of the t-distribution is approximately 2.0106, compared to 1.9600 for the standard normal. The discrepancy is small but grows rapidly as n decreases. For n = 10 bars (8 df), the t critical value is approximately 2.306, which produces a 17.7% wider interval than the z-approximation.

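As n → ∞, the t-distribution converges to the standard normal. Across practical rolling regression windows of 30 to 200 bars, the correction shrinks from about 4.5% (28 df, t ≈ 2.048) to under 1% (198 df, t ≈ 1.972). It costs nothing to compute, so it should be applied explicitly rather than replaced by the fixed z-value.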

The Leverage Interpretation of Band Widening

The quantity:

hᵢ = 1/n + (xᵢ − x̄)² / Sxx

is the hat matrix diagonal (leverage) for observation i. It measures how much influence the i-th observation has on its own fitted value. Leverage is maximized at the endpoints of the x-domain and minimized at x̄.

When evaluating intervals at the leftmost or rightmost bars of the regression window, the evaluation point is as far from x̄ as it can be within the sample, maximizing the leverage term and widening the band. Geometrically, a fitted line is least constrained at the edges of the fitted range. A small change in the estimated slope produces the largest vertical displacement there, so the line can pivot around the centroid while still fitting the central cluster. (When the interval is instead evaluated at x = n, one bar beyond the window, this becomes genuine extrapolation and the leverage term grows further still.)


Geometric explanation of leverage in OLS regression channels, drawn with the same five-line rendering the indicator uses. The fitted line (solid blue) pivots around the window mean x̄, where slope uncertainty produces the smallest displacement and the boundary lines are closest together. Moving toward either window edge, the leverage term (xᵢ − x̄)²/Sxx grows quadratically, so the confidence lines (solid green) and the wider prediction lines (dashed crimson) fan outward symmetrically. This widening is a deterministic consequence of the hat-matrix geometry, not of reduced data density at the boundaries.


This property distinguishes regression channels from Bollinger Bands. A Bollinger Band applies one standard deviation, computed over the whole window, at a single width, with no position-dependent leverage correction.
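A handy sanity check on the leverage formula is that the values hᵢ over the window sum to the number of fitted parameters, here 2: the 1/n terms add up to 1, and the (xᵢ − x̄)²/Sxx terms add up to 1 by the definition of Sxx. The sketch below uses a hypothetical function name Leverage and an illustrative 20-bar window; it is not part of the article's modules.

```mql5
//--- Hypothetical helper: leverage h_i = 1/n + (x_i - x_mean)^2 / Sxx
//--- for bar i on the integer grid x = 0..n-1.
double Leverage(int n, int i)
  {
   double x_mean = (n - 1) / 2.0;
   double sxx    = n * ((double)n * n - 1.0) / 12.0;
   double d      = i - x_mean;
   return(1.0 / n + d * d / sxx);
  }

void OnStart()
  {
   int    n     = 20;    // illustrative window size
   double total = 0.0;

//--- Sum the leverage over every bar in the window
   for(int i = 0; i < n; i++)
      total += Leverage(n, i);

   PrintFormat("h at left edge   = %.6f", Leverage(n, 0));
   PrintFormat("h nearest center = %.6f", Leverage(n, n / 2));
   PrintFormat("h at right edge  = %.6f", Leverage(n, n - 1));
   PrintFormat("sum of h         = %.6f", total);
  }
```

The two edge values are equal by symmetry and are the largest in the window, and the total should come out as 2 up to rounding. If a modified leverage calculation does not sum to 2, x̄ or Sxx is inconsistent with the grid actually used.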

OLS Assumptions and Their Violation in Financial Markets

OLS is derived under four core assumptions. Financial time series routinely violate most of them.

  1. Independence of residuals: OLS assumes that eᵢ and eⱼ are uncorrelated for i ≠ j. Price returns frequently exhibit autocorrelation at short lags (microstructure effects) and at longer horizons (momentum and mean reversion). Provided the regressors remain exogenous and the errors have zero conditional mean, autocorrelation does not by itself bias the OLS slope estimator; it does, however, invalidate the standard errors used to construct intervals, so the computed coverage diverges from the nominal 95%. In dynamic models of financial series, that exogeneity is itself a subtle assumption rather than a given.
  2. Homoscedasticity: OLS assumes that var(eᵢ) = σ² is constant across all observations. Financial returns exhibit volatility clustering: periods of high volatility produce large residuals, periods of low volatility produce small residuals. The single s² estimated from the regression window averages across these regimes and understates interval width during volatile periods while overstating it during calm periods.
  3. Normality of residuals: normality is not required for OLS to produce unbiased coefficient estimates, nor for the Gauss-Markov result that the OLS estimator is the Best Linear Unbiased Estimator — that property concerns the coefficients, not the intervals. Normality is required for the exact finite-sample t-interval to hold its nominal coverage. Financial returns have heavier tails than the normal distribution, so the nominal 95% coverage of the prediction interval may correspond to true coverage below 95%; extreme moves breach the interval more often than the model predicts. Note also that even unbiased coefficients and a correct point estimate do not guarantee that the naive t-interval keeps its coverage under heteroscedasticity or autocorrelation.
  4. Stationarity of the trend: OLS fits a deterministic linear trend within the window. If the true data-generating process involves structural breaks, regime changes, or stochastic trends (random walks), the regression will misspecify the conditional mean and residuals will absorb trend components, inflating s² and widening intervals artificially.

These violations do not eliminate the value of regression channels. They do mean that the 95% label should be treated as a model-derived approximation rather than a guaranteed empirical frequency.

Rolling Regression Computational Complexity

A naive implementation recomputes all five regression sums (Σx, Σy, Σx², Σxy, Σy²) from scratch on every bar, yielding O(n) work per bar and O(N·n) total work for an entire price series of length N with a window of size n.

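An incremental implementation maintains running sums and updates them on each new bar by adding the newest observation and subtracting the oldest. Because the window is re-indexed to x = 0, 1, …, n−1 after every shift, Σxy additionally needs a correction for the one-step index shift of the retained bars. This reduces per-bar work to O(1) and total complexity to O(N), independent of window size.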

Method                   Per-Bar Complexity   Total Complexity
Full recomputation       O(n)                 O(N·n)
Incremental statistics   O(1)                 O(N)

For clarity of exposition, the implementation in this article uses full recomputation within each window pass. An incremental optimization can be layered on top of the architecture presented here without changing the statistical logic.
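For readers who want to try the incremental variant, here is a minimal sketch of the O(1) update on the re-indexed grid, checked against full recomputation. The struct SRollingSums and the functions Recompute and Slide are hypothetical names, not part of the article's modules. When the window slides by one bar, every retained bar moves one index to the left, so the new Σxy equals the old Σxy minus the sum of the retained prices plus (n − 1) times the new price.

```mql5
//--- Hypothetical container for the three price-dependent sums
struct SRollingSums
  {
   double            sum_y;
   double            sum_xy;
   double            sum_yy;
  };

//--- Full O(n) pass over y[start..start+n-1], re-indexed to x = 0..n-1
void Recompute(const double &y[], int start, int n, SRollingSums &s)
  {
   s.sum_y  = 0.0;
   s.sum_xy = 0.0;
   s.sum_yy = 0.0;
   for(int i = 0; i < n; i++)
     {
      double v = y[start + i];
      s.sum_y  += v;
      s.sum_xy += i * v;
      s.sum_yy += v * v;
     }
  }

//--- O(1) slide: drop y_old (old index 0), append y_new (new index n-1).
//--- sum_xy must be updated first because it needs the OLD sum_y.
void Slide(SRollingSums &s, int n, double y_old, double y_new)
  {
   s.sum_xy = s.sum_xy - (s.sum_y - y_old) + (n - 1) * y_new;
   s.sum_y  = s.sum_y - y_old + y_new;
   s.sum_yy = s.sum_yy - y_old * y_old + y_new * y_new;
  }

void OnStart()
  {
   double y[] = {10.0, 10.4, 10.1, 10.9, 11.3, 11.0, 11.8};  // made-up prices
   int    n   = 4;                                           // window size
   SRollingSums inc, full;

   Recompute(y, 0, n, inc);
   for(int start = 1; start + n <= ArraySize(y); start++)
     {
      Slide(inc, n, y[start - 1], y[start + n - 1]);
      Recompute(y, start, n, full);
      PrintFormat("start=%d  dSy=%.3g  dSxy=%.3g  dSyy=%.3g", start,
                  inc.sum_y - full.sum_y, inc.sum_xy - full.sum_xy,
                  inc.sum_yy - full.sum_yy);
     }
  }
```

The printed differences should sit at rounding level. Over very long series, running sums accumulate floating-point drift, so a periodic full recomputation is a common safeguard.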


Section 2: Software Architecture

The implementation is split across five header files and one main indicator file. This structure separates statistical concerns from indicator mechanics and makes each module independently testable.

OLSStatistics.mqh        — Slope, intercept, SSE computation
ResidualAnalysis.mqh     — Residual variance, degrees of freedom
TDistribution.mqh        — Student's t critical value lookup
ConfidenceInterval.mqh   — Mean uncertainty band calculation
PredictionInterval.mqh   — Individual observation band calculation
RegressionChannels.mq5   — Main indicator: buffers, line plots, diagnostics

A note on coding style: all source files follow the MetaQuotes Standard Library conventions. Member functions are declared with explicit (void) for empty parameter lists, constructors and destructors are declared explicitly, return statements use parenthesized expressions, and brace placement follows the library's two-space indentation pattern. This makes the code immediately recognizable to experienced MQL5 developers and consistent with the platform's own includes.


Section 3: A Practical Note on Rendering: Why the Indicator Draws Lines Rather Than Filled Bands

Before presenting the modules, one design decision deserves explanation because it shapes the main indicator file. The natural instinct when visualizing nested channels is to shade the prediction interval in one color and the confidence interval in another using the DRAW_FILLING plot type. In practice, MetaTrader 5's DRAW_FILLING renderer is reliable only for a single filled region bounded by two lines. It does not dependably render two filled regions that overlap or share a boundary, which is exactly the geometry of a confidence interval nested inside a prediction interval.

Two related constraints compound the problem. First, the MQL5 color type is a three-byte BGR value with no alpha channel, so DRAW_FILLING colors are always fully opaque; there is no way to make an inner fill semi-transparent so the outer fill shows through. Attempts to encode an alpha byte in a hexadecimal color literal are silently truncated and reinterpreted as a solid BGR color. Second, when two filled regions share an exact boundary (the confidence upper edge coinciding with the lower edge of the upper prediction zone, for instance), the renderer treats the span as continuous and the later-drawn fill does not reliably paint over the earlier one.

The robust solution is to render the channel as five distinct DRAW_LINE plots: the regression line plus the four interval boundaries. Line plots render correctly in every MetaTrader 5 build, the nesting is visually clear through color and line style, and the underlying statistical buffers remain identical to what a filled version would use. The remainder of this article documents that line-based implementation.


Section 4: Module Implementations with Explanations

OLSStatistics.mqh — Regression Parameter Estimation

//+------------------------------------------------------------------+
//|                                                OLSStatistics.mqh |
//| Ordinary Least Squares parameter estimation for rolling windows. |
//+------------------------------------------------------------------+
#ifndef __OLS_STATISTICS_MQH__
#define __OLS_STATISTICS_MQH__

//+------------------------------------------------------------------+
//| Structure: SOLSResult                                            |
//| Holds all regression outputs from a single OLS pass.             |
//+------------------------------------------------------------------+
struct SOLSResult
  {
   double            slope;          // Estimated slope β₁
   double            intercept;      // Estimated intercept β₀
   double            sse;            // Sum of squared errors
   double            x_mean;         // Mean of bar indices x̄
   double            sxx;            // Σ(xᵢ − x̄)²
   int               n;              // Number of observations used
   bool              valid;          // False if computation failed
  };

//+------------------------------------------------------------------+
//| Class: COLSStatistics                                            |
//| Fits an OLS regression line to a price array over the fixed      |
//| integer grid x = 0,1,..,count-1.                                 |
//+------------------------------------------------------------------+
class COLSStatistics
  {
public:
                     COLSStatistics(void);
                    ~COLSStatistics(void);

   //--- x_mean and sxx are precomputed by the caller because, for a
   //--- fixed window over the grid 0..count-1, they are constants:
   //---   x_mean = (count-1)/2,  sxx = count(count^2-1)/12.
   SOLSResult        Compute(const double &y[], int count,
                             double x_mean, double sxx);
  };

//+------------------------------------------------------------------+
//| Constructor                                                      |
//+------------------------------------------------------------------+
COLSStatistics::COLSStatistics(void)
  {
  }

//+------------------------------------------------------------------+
//| Destructor                                                       |
//+------------------------------------------------------------------+
COLSStatistics::~COLSStatistics(void)
  {
  }

//+------------------------------------------------------------------+
//| COLSStatistics::Compute                                          |
//| Accepts a price array, its length, and the precomputed x-domain  |
//| constants x_mean and sxx. Only the y-dependent sums are computed |
//| here, in a single pass. Returns a populated SOLSResult.          |
//+------------------------------------------------------------------+
SOLSResult COLSStatistics::Compute(const double &y[], int count,
                                   double x_mean, double sxx)
  {
   SOLSResult result;
   result.valid = false;

   if(count < 3)
      return(result);

   if(MathAbs(sxx) < 1e-14)
      return(result);

//--- Accumulate only the y-dependent sums. The x-only sums are not
//--- needed because x_mean and sxx are supplied by the caller.
   double sum_y  = 0.0;
   double sum_xy = 0.0;
   double sum_yy = 0.0;

   for(int i = 0; i < count; i++)
     {
      double xi = (double)i;
      double yi = y[i];

      if(!MathIsValidNumber(yi))
         return(result);

      sum_y  += yi;
      sum_xy += xi * yi;
      sum_yy += yi * yi;
     }

   double n      = (double)count;
   double y_mean = sum_y / n;

//--- Sxy = Σ(xᵢ − x̄)(yᵢ − ȳ) = Σxᵢyᵢ − n·x̄·ȳ
   double sxy = sum_xy - n * x_mean * y_mean;

//--- OLS slope and intercept
   double slope     = sxy / sxx;
   double intercept = y_mean - slope * x_mean;

//--- SSE = Syy − β₁·Sxy, where Syy = Σyᵢ² − n·ȳ²
   double syy = sum_yy - n * y_mean * y_mean;
   double sse = syy - slope * sxy;

   if(sse < 0.0)
      sse = 0.0;

   result.slope     = slope;
   result.intercept = intercept;
   result.sse       = sse;
   result.x_mean    = x_mean;
   result.sxx       = sxx;
   result.n         = count;
   result.valid     = true;

   return(result);
  }

#endif
//+------------------------------------------------------------------+

The SOLSResult structure bundles every quantity that downstream modules require: the fitted slope and intercept for computing ŷ, the SSE for computing residual variance, x̄ and Sxx for computing the leverage term, and n for degrees of freedom. Bundling these avoids repeated recomputation across the pipeline.

COLSStatistics::Compute accepts the closing price array, its length, and the two x-domain constants x̄ and Sxx. Those constants are passed in rather than recomputed because, for a fixed window over the integer grid x = 0, 1, …, n−1, they never change: x̄ = (n−1)/2 and Sxx = n(n²−1)/12. The main indicator computes them once at initialization, so the per-bar work here is reduced to the sums that actually depend on price. This is a deliberate optimization — recomputing the x-only sums on every bar would be pure waste, since the x-grid is identical for every window.
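The two closed forms are easy to confirm by brute force. The following sketch, with a hypothetical function name GridConstantGap that is not part of the article's modules, compares x̄ = (n−1)/2 and Sxx = n(n²−1)/12 against direct summation for a few window sizes.

```mql5
//--- Hypothetical helper: largest absolute gap between the closed
//--- forms and brute-force summation over the grid x = 0..n-1.
double GridConstantGap(int n)
  {
//--- Brute-force mean of the bar indices
   double sum_x = 0.0;
   for(int i = 0; i < n; i++)
      sum_x += i;
   double mean_brute = sum_x / n;

//--- Brute-force sum of squared deviations
   double sxx_brute = 0.0;
   for(int i = 0; i < n; i++)
      sxx_brute += (i - mean_brute) * (i - mean_brute);

//--- Closed forms quoted in the text
   double mean_closed = (n - 1) / 2.0;
   double sxx_closed  = n * ((double)n * n - 1.0) / 12.0;

   return(MathMax(MathAbs(mean_brute - mean_closed),
                  MathAbs(sxx_brute - sxx_closed)));
  }

void OnStart()
  {
   int sizes[] = {3, 30, 50, 200};   // illustrative window sizes
   for(int k = 0; k < ArraySize(sizes); k++)
      PrintFormat("n=%d  max gap = %.3g", sizes[k], GridConstantGap(sizes[k]));
  }
```

For n = 50 the constants are x̄ = 24.5 and Sxx = 10412.5. The cast to double inside the Sxx expression matters in practice: with plain int arithmetic, n·n·n overflows for large windows and the division by 12 truncates.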

Inside the loop, only the y-dependent sums are accumulated: Σy, Σxy, and Σy². From these, Sxy = Σxᵢyᵢ − n·x̄·ȳ and the slope β₁ = Sxy/Sxx follow directly, and the intercept is β₀ = ȳ − β₁·x̄. Each price is checked with MathIsValidNumber before use, so a corrupt or non-finite quote causes the routine to return an invalid result rather than poisoning the sums. The guard MathAbs(sxx) < 1e-14 is retained as a defensive backstop, although for the integer grid with n ≥ 3 the supplied Sxx is deterministically positive and far from zero.

The SSE is computed via the algebraic identity SSE = Syy − β₁·Sxy, where Syy = Σyᵢ² − n·ȳ², rather than by summing squared residuals in a second loop. The two forms are algebraically identical and agree to rounding error for well-conditioned data. The clamp if(sse < 0.0) sse = 0.0 absorbs the small negative values that floating-point cancellation can otherwise produce; without it, MathSqrt would return NaN for the standard error downstream. Cancellation is worst when the residual scatter is tiny relative to the price level, as with data that is very nearly linear or nearly constant, and there a direct second pass over the residuals is more stable. For typical financial prices the identity form is both adequate and faster.
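The trade-off between the identity and a second pass can be observed directly. The sketch below computes SSE both ways on a synthetic, almost perfectly linear series, once at a price level near 1 and once at a level of one million. The function name SseBothWays and the synthetic data are hypothetical; the clamp mirrors the one in COLSStatistics::Compute.

```mql5
//--- Hypothetical helper: SSE via the one-pass identity (clamped at
//--- zero) and via a direct second pass over the residuals.
void SseBothWays(const double &y[], double &sse_identity, double &sse_direct)
  {
   int    n      = ArraySize(y);
   double x_mean = (n - 1) / 2.0;
   double sxx    = n * ((double)n * n - 1.0) / 12.0;

   double sum_y = 0.0, sum_xy = 0.0, sum_yy = 0.0;
   for(int i = 0; i < n; i++)
     {
      sum_y  += y[i];
      sum_xy += i * y[i];
      sum_yy += y[i] * y[i];
     }

   double y_mean    = sum_y / n;
   double sxy       = sum_xy - n * x_mean * y_mean;
   double slope     = sxy / sxx;
   double intercept = y_mean - slope * x_mean;

//--- Identity form: SSE = Syy - slope*Sxy
   sse_identity = (sum_yy - n * y_mean * y_mean) - slope * sxy;
   if(sse_identity < 0.0)
      sse_identity = 0.0;

//--- Direct form: sum of squared residuals
   sse_direct = 0.0;
   for(int i = 0; i < n; i++)
     {
      double e = y[i] - (intercept + slope * i);
      sse_direct += e * e;
     }
  }

void OnStart()
  {
   double levels[] = {1.0, 1000000.0};   // illustrative price levels
   for(int k = 0; k < 2; k++)
     {
      //--- Nearly linear series: small drift plus a tiny repeating wiggle
      double y[];
      ArrayResize(y, 50);
      for(int i = 0; i < 50; i++)
         y[i] = levels[k] + 0.01 * i + 0.001 * ((i % 3) - 1);

      double a = 0.0, b = 0.0;
      SseBothWays(y, a, b);
      PrintFormat("level=%.0f  identity=%.10g  direct=%.10g", levels[k], a, b);
     }
  }
```

At a level near 1 the two values should agree closely. At a level of one million with the same tiny scatter, Σy² is of order 5·10¹³, so its rounding granularity is far larger than the true SSE; the identity form then loses its significant digits and may hit the clamp, while the direct pass stays close to the low-level result.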

Computational complexity: O(n) per call, a single pass over the price array, with a small constant factor because the x-domain statistics are supplied by the caller rather than recomputed inside the loop.

ResidualAnalysis.mqh — Variance and Degrees of Freedom

//+------------------------------------------------------------------+
//|                                             ResidualAnalysis.mqh |
//| Residual variance and degrees of freedom from OLS output.        |
//+------------------------------------------------------------------+
#ifndef __RESIDUAL_ANALYSIS_MQH__
#define __RESIDUAL_ANALYSIS_MQH__

#include "OLSStatistics.mqh"

//+------------------------------------------------------------------+
//| Structure: SResidualStatistics                                   |
//| Holds residual analysis outputs.                                 |
//+------------------------------------------------------------------+
struct SResidualStatistics
  {
   double            variance;            // s² = SSE / (n−2)
   double            std_error;           // s  = √s²
   int               degrees_of_freedom;  // n − 2
   bool              valid;
  };

//+------------------------------------------------------------------+
//| Class: CResidualAnalysis                                         |
//| Derives residual statistics from a completed OLS result.         |
//+------------------------------------------------------------------+
class CResidualAnalysis
  {
public:
                     CResidualAnalysis(void);
                    ~CResidualAnalysis(void);

   SResidualStatistics Compute(const SOLSResult &ols);
  };

//+------------------------------------------------------------------+
//| Constructor                                                      |
//+------------------------------------------------------------------+
CResidualAnalysis::CResidualAnalysis(void)
  {
  }

//+------------------------------------------------------------------+
//| Destructor                                                       |
//+------------------------------------------------------------------+
CResidualAnalysis::~CResidualAnalysis(void)
  {
  }

//+------------------------------------------------------------------+
//| CResidualAnalysis::Compute                                       |
//| Receives a validated SOLSResult and returns residual statistics. |
//+------------------------------------------------------------------+
SResidualStatistics CResidualAnalysis::Compute(const SOLSResult &ols)
  {
   SResidualStatistics res;
   res.valid = false;

   if(!ols.valid || ols.n < 3)
      return(res);

   int df = ols.n - 2;

   if(df < 1)
      return(res);

   double variance  = ols.sse / (double)df;
   double std_error = MathSqrt(variance);

   res.variance           = variance;
   res.std_error          = std_error;
   res.degrees_of_freedom = df;
   res.valid              = true;

   return(res);
  }

#endif
//+------------------------------------------------------------------+

CResidualAnalysis::Compute performs a single computation: dividing SSE by n−2 to obtain the unbiased residual variance s². The square root yields the residual standard error s, which is the primary scaling factor for all interval half-widths.

The guard df < 1 prevents division by zero or degenerate variance for windows with fewer than three observations. In practice the inp_regression_window input in the main indicator enforces a sensible minimum at a higher level, so this branch is a defensive backstop rather than an expected path.

The module's only purpose is to centralize the degrees-of-freedom arithmetic. Separating it from COLSStatistics isolates variance estimation from parameter estimation. If an analyst wished to substitute a robust variance estimator (such as an HC3 sandwich estimator for heteroscedastic residuals), only this module would require modification. Under the Gauss-Markov conditions (zero-mean, uncorrelated, homoscedastic errors), the OLS estimator is the Best Linear Unbiased Estimator whether or not the residuals are normal. The exact coverage of the t-interval, however, relies on normality, and the t-distribution module needs the degrees of freedom to select its critical value, so SResidualStatistics exposes both the standard error and the degrees of freedom.

TDistribution.mqh — Student's t Critical Value

//+------------------------------------------------------------------+
//|                                                TDistribution.mqh |
//| Student's t critical value approximation for arbitrary df.       |
//+------------------------------------------------------------------+
#ifndef __T_DISTRIBUTION_MQH__
#define __T_DISTRIBUTION_MQH__

//+------------------------------------------------------------------+
//| Class: CTDistribution                                            |
//| Provides two-tailed t critical values via rational approximation.|
//| All public results signal failure through a negative return so   |
//| that callers can detect invalid input rather than receiving a    |
//| silently substituted value.                                      |
//+------------------------------------------------------------------+
class CTDistribution
  {
public:
                     CTDistribution(void);
                    ~CTDistribution(void);

   //--- Returns t_{alpha/2, df}; returns -1.0 on invalid input.
   double            CriticalValue(int degrees_of_freedom, double alpha);

private:
   double            InverseNormalApprox(double p);
   double            TQuantileApprox(double p, int df);
  };

//+------------------------------------------------------------------+
//| Constructor                                                      |
//+------------------------------------------------------------------+
CTDistribution::CTDistribution(void)
  {
  }

//+------------------------------------------------------------------+
//| Destructor                                                       |
//+------------------------------------------------------------------+
CTDistribution::~CTDistribution(void)
  {
  }

//+------------------------------------------------------------------+
//| CTDistribution::CriticalValue                                    |
//| Returns t_{alpha/2, df} — the upper tail critical value for a    |
//| two-tailed interval at significance level alpha.                 |
//| Example: alpha=0.05, df=48 returns approximately 2.011           |
//| (tabulated value 2.0106).                                        |
//| Returns -1.0 if df < 1 or alpha is outside (0, 1), so the caller |
//| can treat the result as invalid instead of proceeding with a     |
//| substituted constant.                                            |
//+------------------------------------------------------------------+
double CTDistribution::CriticalValue(int degrees_of_freedom, double alpha)
  {
//--- Reject invalid degrees of freedom and significance levels.
   if(degrees_of_freedom < 1)
      return(-1.0);
   if(alpha <= 0.0 || alpha >= 1.0)
      return(-1.0);

   double p = 1.0 - alpha * 0.5;
   return(TQuantileApprox(p, degrees_of_freedom));
  }

//+------------------------------------------------------------------+
//| CTDistribution::InverseNormalApprox                              |
//| Rational approximation to the inverse standard normal CDF.       |
//| Accuracy: absolute error < 4.5e-4 for 0 < p < 1.                 |
//| Source: Abramowitz and Stegun 26.2.23                            |
//+------------------------------------------------------------------+
double CTDistribution::InverseNormalApprox(double p)
  {
   if(p <= 0.0)
      return(-1e15);
   if(p >= 1.0)
      return(1e15);

   double sign = 1.0;
   double q    = p;

   if(q < 0.5)
     {
      sign = -1.0;
      q    = 1.0 - q;
     }

   double t = MathSqrt(-2.0 * MathLog(1.0 - q));

   double c0 = 2.515517;
   double c1 = 0.802853;
   double c2 = 0.010328;
   double d1 = 1.432788;
   double d2 = 0.189269;
   double d3 = 0.001308;

   double numerator   = c0 + c1 * t + c2 * t * t;
   double denominator = 1.0 + d1 * t + d2 * t * t + d3 * t * t * t;

   double z = t - numerator / denominator;

   return(sign * z);
  }

//+------------------------------------------------------------------+
//| CTDistribution::TQuantileApprox                                  |
//| Approximates the t-distribution quantile at probability p        |
//| with the given degrees of freedom.                               |
//| Uses the Cornish-Fisher expansion for moderate-to-large df,      |
//| and a direct closed-form for df=1 and df=2.                      |
//+------------------------------------------------------------------+
double CTDistribution::TQuantileApprox(double p, int df)
  {
//--- Handle special cases with exact closed-form results
   if(df == 1)
     {
      //--- Cauchy distribution: t = tan(π·(p − 0.5))
      return(MathTan(M_PI * (p - 0.5)));
     }

   if(df == 2)
     {
      //--- t₂ quantile: t = (2p−1)/√(2p(1−p))
      double q = 2.0 * p - 1.0;
      double r = 2.0 * p * (1.0 - p);
      if(r < 1e-15)
         return((q < 0.0) ? -1e15 : 1e15);   // keep the sign of the tail
      return(q / MathSqrt(r));
     }

//--- For df >= 3 use the Cornish-Fisher expansion
//--- t ≈ z + (z³+z)/(4·df) + (5z⁵+16z³+3z)/(96·df²) + ...
//--- where z is the corresponding normal quantile

   double z  = InverseNormalApprox(p);
   double z2 = z * z;
   double z3 = z2 * z;
   double z5 = z3 * z2;

   double v  = (double)df;
   double v2 = v * v;

   double t = z
              + (z3 + z) / (4.0 * v)
              + (5.0 * z5 + 16.0 * z3 + 3.0 * z) / (96.0 * v2)
              + (3.0 * z5 * z2 + 19.0 * z5 + 17.0 * z3 - 15.0 * z) / (384.0 * v2 * v);

   return(t);
  }

#endif
//+------------------------------------------------------------------+

CTDistribution::CriticalValue is the single public interface: it accepts degrees of freedom and a significance level α (e.g., 0.05 for 95% intervals) and returns the upper-tail critical value t(α/2, df). It validates both arguments first: degrees of freedom below 1, or an α outside the open interval (0, 1), yield a return value of −1.0, which the caller treats as invalid instead of silently substituting a constant. This is the engineering counterpart to the statistical care taken elsewhere: an out-of-range request fails visibly instead of quietly returning a plausible-looking number.

The implementation uses the Cornish-Fisher expansion, a classical series expansion that expresses t-quantiles as perturbations of the corresponding normal quantile. For the values of df encountered in rolling regression windows (typically 28 to 198 for windows of 30 to 200), this expansion converges rapidly. The absolute error of the combined approximation (Abramowitz-Stegun inverse-normal seed followed by the Cornish-Fisher t-expansion) is small across the practical range, but the article does not claim a guaranteed bound: the responsible way to use it is to verify the returned critical values against a reference table for the degrees of freedom actually in use. The diagnostics output includes the computed t critical value precisely so it can be checked against a standard table. For example, the value 2.011095 reported for 48 degrees of freedom differs from the tabulated 97.5th percentile of about 2.0106 by roughly 5 × 10⁻⁴, which is the same order as the error bound of the inverse-normal seed. For applications demanding certified accuracy, the module can be replaced by a tabulated lookup with interpolation or a direct numerical inversion of the t CDF without altering any other module.
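The two error sources can be separated offline. The following is a minimal Python sketch, not part of the indicator: it evaluates the same four-term series but takes its seed from the standard library's normal quantile instead of the rational approximation. The function name cornish_fisher_t is hypothetical, and the reference values are the usual tabulated 97.5th percentiles rounded to four decimals.

```python
from statistics import NormalDist

def cornish_fisher_t(p, df):
    """Four-term Cornish-Fisher approximation to the Student's t quantile."""
    z = NormalDist().inv_cdf(p)   # accurate seed, unlike the MQL5 module
    z2 = z * z
    z3 = z2 * z
    z5 = z3 * z2
    z7 = z5 * z2
    v = float(df)
    return (z
            + (z3 + z) / (4.0 * v)
            + (5.0 * z5 + 16.0 * z3 + 3.0 * z) / (96.0 * v ** 2)
            + (3.0 * z7 + 19.0 * z5 + 17.0 * z3 - 15.0 * z) / (384.0 * v ** 3))

# Tabulated two-sided 95% critical values (97.5th percentile), four decimals.
REFERENCE = {28: 2.0484, 48: 2.0106, 198: 1.9720}

for df, ref in REFERENCE.items():
    print(df, round(cornish_fisher_t(0.975, df), 6), ref)
```

With an accurate seed the series should agree with these table entries to the four decimals shown, which suggests that the residual error in the MQL5 module at these df comes mostly from the seed rather than from truncating the series.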

InverseNormalApprox implements the Abramowitz and Stegun rational approximation (formula 26.2.23) for the inverse normal CDF. This is the seed for the Cornish-Fisher expansion. Its stated maximum absolute error is below 4.5 × 10⁻⁴ across the entire (0, 1) domain.
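The error bound of the seed can be checked the same way. This Python sketch ports the rational approximation with the coefficients from the listing and compares it with the standard library's normal quantile on a grid; the function name inverse_normal_as and the choice of grid are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def inverse_normal_as(p):
    """Rational approximation to the inverse normal CDF, mirroring InverseNormalApprox."""
    if p <= 0.0:
        return -1e15
    if p >= 1.0:
        return 1e15
    # Work with the tail probability (<= 0.5) and restore the sign afterwards.
    sign, tail = (1.0, 1.0 - p) if p >= 0.5 else (-1.0, p)
    t = math.sqrt(-2.0 * math.log(tail))
    num = 2.515517 + 0.802853 * t + 0.010328 * t * t
    den = 1.0 + 1.432788 * t + 0.189269 * t * t + 0.001308 * t ** 3
    return sign * (t - num / den)

grid = [i / 1000.0 for i in range(1, 1000)]
max_err = max(abs(inverse_normal_as(p) - NormalDist().inv_cdf(p)) for p in grid)
print(f"max abs error on grid: {max_err:.2e}")
print(f"seed at p = 0.975:     {inverse_normal_as(0.975):.6f}")
```

At p = 0.975 the seed lands a few 10⁻⁴ above the exact quantile 1.959964, and that offset carries through the t-expansion almost unchanged.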

The special cases for df = 1 (Cauchy distribution) and df = 2 (exact algebraic form) are included for completeness and to keep the series approximation within its reliable range. They correspond to regression windows of three and four bars, which are too short for a meaningful fit; the minimum-window guard in the main indicator (inp_regression_window ≥ 5, hence df ≥ 3) ensures these branches are never reached during normal operation.
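Both closed forms are easy to confirm against a table. A short Python sketch (function names are hypothetical) evaluates them at p = 0.975, where standard tables list approximately 12.706 for df = 1 and 4.303 for df = 2:

```python
import math

def t_quantile_df1(p):
    """Cauchy quantile: tan(pi * (p - 0.5))."""
    return math.tan(math.pi * (p - 0.5))

def t_quantile_df2(p):
    """Exact t quantile for two degrees of freedom."""
    return (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))

print(round(t_quantile_df1(0.975), 3))
print(round(t_quantile_df2(0.975), 3))
```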

A precomputed lookup table for specific df values would be compact and exact but would require interpolation for intermediate df and would not generalize to different α inputs. The Cornish-Fisher approach is slightly more code but fully parametric.
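For comparison, the lookup alternative might look like the Python sketch below. It is an illustration only: the table holds four standard two-sided 95% critical values rounded to four decimals, the function name lookup_t975 is hypothetical, and interpolating linearly in 1/df is one common choice rather than anything the article prescribes. It also shows the limitation mentioned above, since the table serves a single α.

```python
# (df, t critical value) pairs for the 97.5th percentile, four decimals.
T975 = [(30, 2.0423), (40, 2.0211), (60, 2.0003), (120, 1.9799)]

def lookup_t975(df):
    """Interpolate the tabulated critical value linearly in 1/df."""
    if df < T975[0][0] or df > T975[-1][0]:
        raise ValueError("df outside the tabulated range")
    for (d0, t0), (d1, t1) in zip(T975, T975[1:]):
        if d0 <= df <= d1:
            w = (1.0 / d0 - 1.0 / df) / (1.0 / d0 - 1.0 / d1)
            return t0 + w * (t1 - t0)

print(round(lookup_t975(48), 4))
```

For df = 48 the interpolated value falls within about 10⁻⁴ of the tabulated 2.0106, at the cost of a separate table for every confidence level the user may select.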

ConfidenceInterval.mqh — Mean Uncertainty Bands

//+------------------------------------------------------------------+
//|                                       ConfidenceInterval.mqh     |
//| Computes OLS confidence intervals for the mean response.         |
//+------------------------------------------------------------------+
#ifndef __CONFIDENCE_INTERVAL_MQH__
#define __CONFIDENCE_INTERVAL_MQH__

#include "OLSStatistics.mqh"
#include "ResidualAnalysis.mqh"
#include "TDistribution.mqh"

//+------------------------------------------------------------------+
//| Structure: SIntervalBand                                         |
//| Holds the upper and lower band values at a single x position.    |
//+------------------------------------------------------------------+
struct SIntervalBand
  {
   double            upper;
   double            lower;
   double            fitted;
   bool              valid;
  };

//+------------------------------------------------------------------+
//| Class: CConfidenceInterval                                       |
//| Calculates mean confidence interval bands over a regression.     |
//+------------------------------------------------------------------+
class CConfidenceInterval
  {
public:
                     CConfidenceInterval(void);
                    ~CConfidenceInterval(void);

   SIntervalBand     Evaluate(const SOLSResult           &ols,
                              const SResidualStatistics  &res,
                              double                     t_critical,
                              int                        x_position);
  };

//+------------------------------------------------------------------+
//| Constructor                                                      |
//+------------------------------------------------------------------+
CConfidenceInterval::CConfidenceInterval(void)
  {
  }

//+------------------------------------------------------------------+
//| Destructor                                                       |
//+------------------------------------------------------------------+
CConfidenceInterval::~CConfidenceInterval(void)
  {
  }

//+------------------------------------------------------------------+
//| CConfidenceInterval::Evaluate                                    |
//| Computes the confidence interval at position x_position.         |
//|                                                                  |
//| Formula:                                                         |
//|    ŷ ± t · s · √(1/n + (x − x̄)² / Sxx)                           |
//+------------------------------------------------------------------+
SIntervalBand CConfidenceInterval::Evaluate(const SOLSResult           &ols,
      const SResidualStatistics  &res,
      double                     t_critical,
      int                        x_position)
  {
   SIntervalBand band;
   band.valid = false;

   if(!ols.valid || !res.valid)
      return(band);

   double x      = (double)x_position;
   double fitted = ols.intercept + ols.slope * x;

   double deviation = x - ols.x_mean;
   double leverage  = (1.0 / (double)ols.n) + (deviation * deviation) / ols.sxx;

   double half_width = t_critical * res.std_error * MathSqrt(leverage);

   band.fitted = fitted;
   band.upper  = fitted + half_width;
   band.lower  = fitted - half_width;
   band.valid  = true;

   return(band);
  }

#endif
//+------------------------------------------------------------------+

CConfidenceInterval::Evaluate implements the closed-form confidence interval half-width computation at a single evaluation point x_position.

The leverage variable corresponds to hᵢ = 1/n + (xᵢ − x̄)²/Sxx, the hat-matrix diagonal. The computation is structured to make this explicit: the deviation x − x_mean is squared and divided by Sxx (the denominator of the slope estimator), then added to the global uncertainty term 1/n. This is the standard hat-matrix formula evaluated at an arbitrary point rather than only at observed data points. That generality matters here because the indicator evaluates the band either at the in-sample edge x = n−1 or at x = n, one bar beyond the observed grid 0…n−1, where no hat-matrix entry exists but the same expression still gives the variance of the fitted mean in units of σ².

The half_width variable is t · s · √h, where t is the precomputed critical value, s is the residual standard error, and √h is the square root of the leverage. This factorization makes each contribution legible: t scales for sample size and confidence level, s scales for intrinsic price noisiness, and √h scales for position within the regression window.
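The factorization can be reproduced by hand. The Python sketch below (the function name ci_half_width is hypothetical) plugs in the t critical value and residual standard error printed in the article's diagnostics block for a 50-bar window evaluated at x = 49:

```python
import math

def ci_half_width(t_crit, s, n, x):
    """t * s * sqrt(h) with h = 1/n + (x - x_mean)^2 / Sxx on the grid 0..n-1."""
    x_mean = (n - 1) / 2.0
    sxx = n * (n * n - 1) / 12.0
    leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
    return t_crit * s * math.sqrt(leverage)

hw = ci_half_width(t_crit=2.011095, s=0.00267141, n=50, x=49)
print(f"{hw:.5f}")
```

The result rounds to the CI Half Width of 0.00150 shown in that diagnostics block.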

The result structure SIntervalBand carries the fitted value (the point on the regression line) alongside upper and lower, so the calling indicator code can populate the regression-line buffer and both confidence-line buffers from a single call. This structure is defined here and shared with the prediction-interval module.

PredictionInterval.mqh — Individual Observation Bands

//+------------------------------------------------------------------+
//|                                           PredictionInterval.mqh |
//| Computes OLS prediction intervals for individual observations.   |
//+------------------------------------------------------------------+
#ifndef __PREDICTION_INTERVAL_MQH__
#define __PREDICTION_INTERVAL_MQH__

#include "OLSStatistics.mqh"
#include "ResidualAnalysis.mqh"
#include "TDistribution.mqh"
#include "ConfidenceInterval.mqh"   // Provides SIntervalBand

//+------------------------------------------------------------------+
//| Class: CPredictionInterval                                       |
//| Calculates prediction interval bands over a regression.          |
//+------------------------------------------------------------------+
class CPredictionInterval
  {
public:
                     CPredictionInterval(void);
                    ~CPredictionInterval(void);

   SIntervalBand     Evaluate(const SOLSResult           &ols,
                              const SResidualStatistics  &res,
                              double                     t_critical,
                              int                        x_position);
  };

//+------------------------------------------------------------------+
//| Constructor                                                      |
//+------------------------------------------------------------------+
CPredictionInterval::CPredictionInterval(void)
  {
  }

//+------------------------------------------------------------------+
//| Destructor                                                       |
//+------------------------------------------------------------------+
CPredictionInterval::~CPredictionInterval(void)
  {
  }

//+------------------------------------------------------------------+
//| CPredictionInterval::Evaluate                                    |
//| Computes the prediction interval at position x_position.         |
//|                                                                  |
//| Formula:                                                         |
//|    ŷ ± t · s · √(1 + 1/n + (x − x̄)² / Sxx)                       |
//|                                                                  |
//| The additional 1 under the square root represents the variance   |
//| of a new individual observation around the regression line.      |
//+------------------------------------------------------------------+
SIntervalBand CPredictionInterval::Evaluate(const SOLSResult           &ols,
      const SResidualStatistics  &res,
      double                     t_critical,
      int                        x_position)
  {
   SIntervalBand band;
   band.valid = false;

   if(!ols.valid || !res.valid)
      return(band);

   double x      = (double)x_position;
   double fitted = ols.intercept + ols.slope * x;

   double deviation     = x - ols.x_mean;
   double pred_variance = 1.0
                          + (1.0 / (double)ols.n)
                          + (deviation * deviation) / ols.sxx;

   double half_width = t_critical * res.std_error * MathSqrt(pred_variance);

   band.fitted = fitted;
   band.upper  = fitted + half_width;
   band.lower  = fitted - half_width;
   band.valid  = true;

   return(band);
  }

#endif
//+------------------------------------------------------------------+

CPredictionInterval::Evaluate is structurally identical to the confidence-interval method with one critical difference: the pred_variance variable uses 1.0 + 1/n + (x−x̄)²/Sxx rather than 1/n + (x−x̄)²/Sxx. The leading 1.0 is the variance of a new observation's own error term, expressed in units of σ² (the whole expression is scaled by s² when the half-width is formed). Even if the regression parameters were estimated with infinite precision, a new price observation would still scatter around the line with a standard deviation of approximately s.

This added 1.0 guarantees that prediction intervals are always strictly wider than confidence intervals. The ratio of prediction half-width to confidence half-width at any point x is:

ratio = √(1 + 1/n + (x−x̄)²/Sxx) / √(1/n + (x−x̄)²/Sxx)

At x = x̄ (window center), leverage is minimized and the ratio is largest. For a window of 50 bars at the center: √(1 + 1/50) / √(1/50) = √51 ≈ 7.14 times wider. Near the window edges the leverage term grows and the relative excess narrows (at x = 49 the ratio is about 3.73, the Width Ratio that appears in the diagnostics output), but prediction intervals remain strictly wider at all points.
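The ratio depends only on n and x, so it can be tabulated without any price data. A small Python sketch (the function name width_ratio is hypothetical) evaluates it at the window center, at the in-sample edge, and one bar ahead for n = 50:

```python
import math

def width_ratio(n, x):
    """Prediction half-width divided by confidence half-width at position x."""
    x_mean = (n - 1) / 2.0
    sxx = n * (n * n - 1) / 12.0
    leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
    return math.sqrt((1.0 + leverage) / leverage)

n = 50
for label, x in (("center", (n - 1) / 2.0), ("edge", n - 1), ("next bar", n)):
    print(label, round(width_ratio(n, x), 4))
```

Because t and s cancel in the ratio, the edge value should reproduce the Width Ratio line of the diagnostics output regardless of the symbol or timeframe.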

This module includes ConfidenceInterval.mqh to obtain the SIntervalBand definition. The earlier version of this module omitted that include and relied on the main indicator file pulling the definition in first, which made the header fail to compile on its own. Including the defining header directly removes that fragile ordering dependency; the #ifndef include guards ensure the struct is defined exactly once even when several modules include it.

RegressionChannels.mq5 — Main Indicator

//+------------------------------------------------------------------+
//|                                           RegressionChannels.mq5 |
//| Rolling OLS regression channels with confidence and prediction   |
//| intervals computed using Student's t-distribution.               |
//| Line-only rendering: regression line plus four boundary lines.   |
//+------------------------------------------------------------------+

#property description "Linear Regression Channels: OLS + Student's t"
#property indicator_chart_window
#property indicator_buffers 5
#property indicator_plots   5

//--- Plot 1: Regression line
#property indicator_label1  "Regression Line"
#property indicator_type1   DRAW_LINE
#property indicator_color1  clrDodgerBlue
#property indicator_style1  STYLE_SOLID
#property indicator_width1  2

//--- Plot 2: Prediction interval upper boundary
#property indicator_label2  "PI Upper"
#property indicator_type2   DRAW_LINE
#property indicator_color2  clrCrimson
#property indicator_style2  STYLE_DASH
#property indicator_width2  1

//--- Plot 3: Prediction interval lower boundary
#property indicator_label3  "PI Lower"
#property indicator_type3   DRAW_LINE
#property indicator_color3  clrCrimson
#property indicator_style3  STYLE_DASH
#property indicator_width3  1

//--- Plot 4: Confidence interval upper boundary
#property indicator_label4  "CI Upper"
#property indicator_type4   DRAW_LINE
#property indicator_color4  clrForestGreen
#property indicator_style4  STYLE_SOLID
#property indicator_width4  1

//--- Plot 5: Confidence interval lower boundary
#property indicator_label5  "CI Lower"
#property indicator_type5   DRAW_LINE
#property indicator_color5  clrForestGreen
#property indicator_style5  STYLE_SOLID
#property indicator_width5  1

//--- Includes
#include <Linear_Regression_Prediction_Channels/OLSStatistics.mqh>
#include <Linear_Regression_Prediction_Channels/ResidualAnalysis.mqh>
#include <Linear_Regression_Prediction_Channels/TDistribution.mqh>
#include <Linear_Regression_Prediction_Channels/ConfidenceInterval.mqh>
#include <Linear_Regression_Prediction_Channels/PredictionInterval.mqh>

//+------------------------------------------------------------------+
//| Evaluation mode for the interval bands.                          |
//| CURRENT_EDGE: evaluate at x = window-1, the last observed bar    |
//|               inside the window. This is an in-sample edge band. |
//| NEXT_BAR:     evaluate at x = window, a genuine one-step-ahead   |
//|               forecast point one bar beyond the window.          |
//+------------------------------------------------------------------+
enum ENUM_EVAL_MODE
  {
   EVAL_CURRENT_EDGE = 0, // In-Sample Edge (x = n-1)
   EVAL_NEXT_BAR     = 1  // One-Step-Ahead Forecast (x = n)
  };

//--- Input parameters
input int            inp_regression_window = 50;              // Regression Window (Bars)
input double         inp_confidence_level  = 0.95;            // Confidence Level (0.90 to 0.99)
input ENUM_EVAL_MODE inp_eval_mode         = EVAL_CURRENT_EDGE;// Interval Evaluation Mode
input bool           inp_show_ci           = true;            // Show Confidence Interval Lines
input bool           inp_show_pi           = true;            // Show Prediction Interval Lines
input bool           inp_print_diagnostics = true;            // Print Diagnostics To Experts Log

//--- Indicator buffers
double g_buffer_regression[];
double g_buffer_pi_upper[];
double g_buffer_pi_lower[];
double g_buffer_ci_upper[];
double g_buffer_ci_lower[];

//--- Module instances (global)
COLSStatistics      g_ols_engine;
CResidualAnalysis   g_residual_engine;
CTDistribution      g_t_distribution;
CConfidenceInterval g_ci_engine;
CPredictionInterval g_pi_engine;

//--- Precomputed x-domain constants (fixed for a given window size)
double g_x_mean = 0.0;   // x̄ = (n-1)/2
double g_sxx    = 0.0;   // Σ(xᵢ − x̄)² = n(n²−1)/12
int    g_x_eval = 0;     // evaluation position (n-1 or n)

//--- Reusable working buffer for the price window (allocated once)
double g_prices[];

//--- Diagnostics state
bool g_diagnostics_printed = false;

//+------------------------------------------------------------------+
//| Self-check: verify x-domain constants for a known window.        |
//| Returns true when the closed-form constants match the expected   |
//| values for the active window, providing a lightweight unit test  |
//| that runs once at initialization.                                |
//+------------------------------------------------------------------+
bool SelfCheckXConstants(int window)
  {
   double expect_mean = (double)(window - 1) / 2.0;
   double expect_sxx  = (double)window * ((double)window * window - 1.0) / 12.0;

   if(MathAbs(g_x_mean - expect_mean) > 1e-9)
      return(false);
   if(MathAbs(g_sxx - expect_sxx) > 1e-6)
      return(false);

   return(true);
  }

//+------------------------------------------------------------------+
//| Custom indicator initialization function                         |
//+------------------------------------------------------------------+
int OnInit(void)
  {
//--- Validate inputs
   if(inp_regression_window < 5)
     {
      Print("RegressionChannels: inp_regression_window must be >= 5. Received: " +
            IntegerToString(inp_regression_window));
      return(INIT_PARAMETERS_INCORRECT);
     }

   if(inp_confidence_level <= 0.50 || inp_confidence_level >= 1.00)
     {
      Print("RegressionChannels: inp_confidence_level must be in (0.50, 1.00). Received: " +
            DoubleToString(inp_confidence_level, 4));
      return(INIT_PARAMETERS_INCORRECT);
     }

//--- Precompute x-domain constants for the fixed window. Because bar
//--- indices are always 0..window-1, these never change between bars,
//--- so they are computed once here rather than on every bar.
   int n = inp_regression_window;
   g_x_mean = (double)(n - 1) / 2.0;
   g_sxx    = (double)n * ((double)n * n - 1.0) / 12.0;

//--- Choose the evaluation position from the selected mode.
   g_x_eval = (inp_eval_mode == EVAL_NEXT_BAR) ? n : (n - 1);

//--- Run the lightweight self-check on the x-domain constants.
   if(!SelfCheckXConstants(n))
     {
      Print("RegressionChannels: x-domain self-check failed. "
            "x_mean=" + DoubleToString(g_x_mean, 4) +
            " sxx=" + DoubleToString(g_sxx, 4));
      return(INIT_FAILED);
     }

//--- Allocate the reusable price working buffer once.
   if(ArrayResize(g_prices, n) != n)
     {
      Print("RegressionChannels: failed to allocate price buffer.");
      return(INIT_FAILED);
     }

//--- Bind buffers
   SetIndexBuffer(0, g_buffer_regression, INDICATOR_DATA);
   SetIndexBuffer(1, g_buffer_pi_upper,   INDICATOR_DATA);
   SetIndexBuffer(2, g_buffer_pi_lower,   INDICATOR_DATA);
   SetIndexBuffer(3, g_buffer_ci_upper,   INDICATOR_DATA);
   SetIndexBuffer(4, g_buffer_ci_lower,   INDICATOR_DATA);

//--- Set empty value per plot (5 plots, indices 0..4)
   PlotIndexSetDouble(0, PLOT_EMPTY_VALUE, EMPTY_VALUE);
   PlotIndexSetDouble(1, PLOT_EMPTY_VALUE, EMPTY_VALUE);
   PlotIndexSetDouble(2, PLOT_EMPTY_VALUE, EMPTY_VALUE);
   PlotIndexSetDouble(3, PLOT_EMPTY_VALUE, EMPTY_VALUE);
   PlotIndexSetDouble(4, PLOT_EMPTY_VALUE, EMPTY_VALUE);

//--- Apply current visibility from inputs. Setting the draw type at
//--- run time (rather than relying only on the compile-time #property
//--- values) lets the plots track input changes on re-initialization.
   int pi_type = inp_show_pi ? DRAW_LINE : DRAW_NONE;
   int ci_type = inp_show_ci ? DRAW_LINE : DRAW_NONE;
   PlotIndexSetInteger(1, PLOT_DRAW_TYPE, pi_type);
   PlotIndexSetInteger(2, PLOT_DRAW_TYPE, pi_type);
   PlotIndexSetInteger(3, PLOT_DRAW_TYPE, ci_type);
   PlotIndexSetInteger(4, PLOT_DRAW_TYPE, ci_type);

//--- Display values with the symbol's price precision
   IndicatorSetInteger(INDICATOR_DIGITS, _Digits);

   string mode_tag   = (inp_eval_mode == EVAL_NEXT_BAR) ? "fwd" : "edge";
   string short_name = "RegCh(" +
                       IntegerToString(inp_regression_window) + "," +
                       DoubleToString(inp_confidence_level * 100.0, 0) + "%," +
                       mode_tag + ")";
   IndicatorSetString(INDICATOR_SHORTNAME, short_name);

   g_diagnostics_printed = false;

   return(INIT_SUCCEEDED);
  }

//+------------------------------------------------------------------+
//| Custom indicator deinitialization function                       |
//+------------------------------------------------------------------+
void OnDeinit(const int reason)
  {
  }

//+------------------------------------------------------------------+
//| Helper: write EMPTY_VALUE into every buffer at one bar           |
//+------------------------------------------------------------------+
void ClearBar(int bar)
  {
   g_buffer_regression[bar] = EMPTY_VALUE;
   g_buffer_pi_upper[bar]   = EMPTY_VALUE;
   g_buffer_pi_lower[bar]   = EMPTY_VALUE;
   g_buffer_ci_upper[bar]   = EMPTY_VALUE;
   g_buffer_ci_lower[bar]   = EMPTY_VALUE;
  }

//+------------------------------------------------------------------+
//| Custom indicator iteration function                              |
//+------------------------------------------------------------------+
int OnCalculate(const int      rates_total,
                const int      prev_calculated,
                const datetime &time[],
                const double   &open[],
                const double   &high[],
                const double   &low[],
                const double   &close[],
                const long     &tick_volume[],
                const long     &volume[],
                const int      &spread[])
  {
   int window = inp_regression_window;

//--- Need at least one full window before drawing
   if(rates_total < window)
      return(0);

//--- Determine starting bar for this pass
   int start_bar = prev_calculated - 1;
   if(start_bar < window - 1)
      start_bar = window - 1;

//--- Fill leading bars with EMPTY_VALUE
   if(prev_calculated == 0)
     {
      for(int i = 0; i < window - 1; i++)
         ClearBar(i);
     }

//--- Compute alpha from confidence level
   double alpha = 1.0 - inp_confidence_level;

//--- Process each bar from start_bar to the most recent
   for(int bar = start_bar; bar < rates_total; bar++)
     {
      //--- Copy this window's closing prices into the reusable buffer.
      //--- prices[0] = oldest bar in window, prices[window-1] = bar itself.
      for(int k = 0; k < window; k++)
         g_prices[k] = close[bar - (window - 1) + k];

      //--- Step 1: Fit OLS regression (passing the precomputed x stats)
      SOLSResult ols = g_ols_engine.Compute(g_prices, window, g_x_mean, g_sxx);

      if(!ols.valid)
        {
         ClearBar(bar);
         continue;
        }

      //--- Step 2: Compute residual variance
      SResidualStatistics res = g_residual_engine.Compute(ols);

      if(!res.valid)
        {
         ClearBar(bar);
         continue;
        }

      //--- Step 3: Get t critical value; skip the bar if it is invalid
      double t_crit = g_t_distribution.CriticalValue(res.degrees_of_freedom, alpha);

      if(t_crit < 0.0)
        {
         ClearBar(bar);
         continue;
        }

      //--- Step 4: Evaluate intervals at the configured position.
      //--- g_x_eval is window-1 for the in-sample edge, or window for a
      //--- one-step-ahead forecast one bar beyond the window.
      SIntervalBand ci = g_ci_engine.Evaluate(ols, res, t_crit, g_x_eval);
      SIntervalBand pi = g_pi_engine.Evaluate(ols, res, t_crit, g_x_eval);

      //--- Default this bar to empty, then fill in whatever is valid/enabled
      ClearBar(bar);

      //--- Regression line
      if(ci.valid)
         g_buffer_regression[bar] = ci.fitted;

      //--- Prediction interval boundary lines
      if(inp_show_pi && pi.valid)
        {
         g_buffer_pi_upper[bar] = pi.upper;
         g_buffer_pi_lower[bar] = pi.lower;
        }

      //--- Confidence interval boundary lines
      if(inp_show_ci && ci.valid)
        {
         g_buffer_ci_upper[bar] = ci.upper;
         g_buffer_ci_lower[bar] = ci.lower;
        }

      //--- Step 5: Print diagnostics on first valid bar only
      if(inp_print_diagnostics && !g_diagnostics_printed && ci.valid && pi.valid)
        {
         PrintDiagnostics(ols, res, t_crit, ci, pi, bar);
         g_diagnostics_printed = true;
        }
     }

   return(rates_total);
  }

//+------------------------------------------------------------------+
//| PrintDiagnostics                                                 |
//| Outputs regression statistics to the MetaTrader Experts log.     |
//+------------------------------------------------------------------+
void PrintDiagnostics(const SOLSResult          &ols,
                      const SResidualStatistics  &res,
                      double                     t_crit,
                      const SIntervalBand        &ci,
                      const SIntervalBand        &pi,
                      int                        bar_index)
  {
   string mode_text = (inp_eval_mode == EVAL_NEXT_BAR)
                      ? "One-Step-Ahead Forecast (x = n)"
                      : "In-Sample Edge (x = n-1)";

   Print("=== RegressionChannels Diagnostics ===");
   Print("Evaluation Mode     = " + mode_text);
   Print("Evaluation x        = " + IntegerToString(g_x_eval));
   Print("Bar Index           = " + IntegerToString(bar_index));
   Print("Regression Window   = " + IntegerToString(ols.n));
   Print("Degrees Of Freedom  = " + IntegerToString(res.degrees_of_freedom));
   Print("Regression Slope    = " + DoubleToString(ols.slope, 8));
   Print("Regression Intercept= " + DoubleToString(ols.intercept, 8));
   Print("SSE                 = " + DoubleToString(ols.sse, 8));
   Print("Residual Variance   = " + DoubleToString(res.variance, 8));
   Print("Residual Std Error  = " + DoubleToString(res.std_error, 8));
   Print("t Critical Value    = " + DoubleToString(t_crit, 6));
   Print("x Mean              = " + DoubleToString(ols.x_mean, 4));
   Print("Sxx                 = " + DoubleToString(ols.sxx, 4));
   Print("--- Confidence Interval ---");
   Print("CI Fitted           = " + DoubleToString(ci.fitted, _Digits));
   Print("CI Upper            = " + DoubleToString(ci.upper,  _Digits));
   Print("CI Lower            = " + DoubleToString(ci.lower,  _Digits));
   Print("CI Half Width       = " + DoubleToString(ci.upper - ci.fitted, _Digits));
   Print("--- Prediction Interval ---");
   Print("PI Fitted           = " + DoubleToString(pi.fitted, _Digits));
   Print("PI Upper            = " + DoubleToString(pi.upper,  _Digits));
   Print("PI Lower            = " + DoubleToString(pi.lower,  _Digits));
   Print("PI Half Width       = " + DoubleToString(pi.upper - pi.fitted, _Digits));
   Print("--- Width Ratio PI/CI ---");
   double ci_hw = ci.upper - ci.fitted;
   double pi_hw = pi.upper - pi.fitted;
   if(ci_hw > 1e-15)
      Print("Width Ratio         = " + DoubleToString(pi_hw / ci_hw, 6));
   Print("======================================");
  }
//+------------------------------------------------------------------+

The indicator declares five buffers and five plots, all of type DRAW_LINE. Each plot consumes exactly one buffer, so the indicator_buffers and indicator_plots counts match cleanly and the buffer index of each plot is simply its declaration order. This is the simplest possible buffer arrangement and is immune to the DRAW_FILLING rendering issues discussed in Section 3. The buffer-to-plot map is: buffer 0 = regression line, buffer 1 = PI upper, buffer 2 = PI lower, buffer 3 = CI upper, buffer 4 = CI lower.

OnInit validates the two critical inputs (window size and confidence level) before doing anything else, returning INIT_PARAMETERS_INCORRECT if either is out of range. It then precomputes the x-domain constants x̄ and Sxx from their closed forms and selects the evaluation position g_x_eval from the chosen mode: n−1 for the in-sample edge, or n for the one-step-ahead forecast. A lightweight self-check, SelfCheckXConstants, confirms that the precomputed x̄ and Sxx match the values expected for the active window before the indicator proceeds; this is a unit-style assertion that runs once and fails fast if the precomputation in OnInit ever drifts from the closed forms through a future edit. Because the check re-evaluates the same closed forms, it guards against the two code sites diverging rather than against an error in the formulas themselves. The reusable price buffer g_prices is allocated a single time here rather than on every bar.
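An independent confirmation of the closed forms is to sum over the integer grid directly. The Python sketch below is an offline check, not part of the indicator, and both function names are hypothetical:

```python
def x_constants_closed_form(n):
    """x_mean and Sxx for the grid 0..n-1 from the closed forms used in OnInit."""
    return (n - 1) / 2.0, n * (n * n - 1) / 12.0

def x_constants_brute_force(n):
    """The same quantities by explicit summation over the grid."""
    xs = range(n)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return x_mean, sxx

for n in (5, 50, 200):
    print(n, x_constants_closed_form(n), x_constants_brute_force(n))
```

For n = 50 both routes give x̄ = 24.5 and Sxx = 10412.5.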

After binding buffers and assigning EMPTY_VALUE as each plot's empty-value marker, OnInit sets the draw type of every band plot from the current input flags: DRAW_LINE when the band is enabled, DRAW_NONE when it is disabled. Because this runs on every initialization, the plots track input changes. Toggling a band or switching evaluation mode takes effect on re-initialization without removing and re-adding the indicator. The short name encodes the window, the confidence level, and the active mode (edge or forward) for immediate identification on the chart.

OnDeinit is intentionally empty. Clearing indicator buffers at deinitialization serves no purpose, since the buffers are released with the indicator instance; the cleanup is left to the platform rather than performed as unnecessary ceremony.

OnCalculate is the computational core. The start_bar logic respects MetaTrader's incremental recalculation protocol. On a fresh attach (prev_calculated == 0) every bar from window − 1 onward is processed. On later calls processing resumes at index prev_calculated − 1: an ordinary tick recomputes only the bar that is still forming, and when a new bar arrives the bar that has just closed is recomputed together with the new one, so no gap is left. For each bar the window's closing prices are copied into the preallocated g_prices buffer (prices[0] is the oldest bar in the window, prices[window−1] is the bar itself), avoiding a per-bar allocation.
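The index arithmetic can be traced in isolation. This Python sketch mirrors only the start_bar computation from OnCalculate; the function name bars_to_process and the bar counts are illustrative assumptions.

```python
def bars_to_process(rates_total, prev_calculated, window):
    """Bar indices the main loop visits for one OnCalculate call."""
    if rates_total < window:
        return []                      # not enough history for a full window
    start_bar = max(prev_calculated - 1, window - 1)
    return list(range(start_bar, rates_total))

print(bars_to_process(rates_total=53, prev_calculated=0, window=50))   # [49, 50, 51, 52]
print(bars_to_process(rates_total=53, prev_calculated=53, window=50))  # [52]
print(bars_to_process(rates_total=54, prev_calculated=53, window=50))  # [52, 53]
```

The three calls correspond to a first pass over history, a tick on the forming bar, and the arrival of a new bar.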

The pipeline within the bar loop follows the module hierarchy exactly: OLS fit, residual analysis, t critical value, then interval evaluation at g_x_eval. Each step is checked before the next. An invalid OLS fit or residual result routes to ClearBar(bar) and continue, and the t critical value is checked explicitly: if CriticalValue returns its negative sentinel for invalid input, the bar is cleared rather than drawn with a substituted constant. Both interval methods are evaluated at the configured position, so the displayed band is an in-sample edge band at x = n−1 or a genuine one-step-ahead forecast band at x = n, depending on the mode. The repeated buffer-clearing is factored into the ClearBar helper so that every invalid-bar path and the per-bar default behave identically, removing a class of copy-paste inconsistency. Diagnostics are printed only once per initialization, on the first fully valid bar, guarded by g_diagnostics_printed (which OnInit resets), so the Experts log is not flooded during full historical recalculation.
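For readers who want to cross-check a single bar outside MetaTrader, the whole pipeline fits in a few lines. The Python sketch below is a reference under stated assumptions, not a port of the modules: it uses textbook OLS formulas on the grid 0…n−1, takes the t critical value as an argument, and returns None where the indicator would clear the bar. The function name regression_bands and the synthetic prices are hypothetical; 2.306 is the tabulated 97.5th percentile for 8 degrees of freedom.

```python
import math

def regression_bands(prices, t_crit, x_eval):
    """Return (fitted, ci_half_width, pi_half_width) at x_eval, or None if invalid."""
    n = len(prices)
    if n < 5 or t_crit < 0.0:
        return None                            # mirrors the clear-and-continue paths
    x_mean = (n - 1) / 2.0
    sxx = n * (n * n - 1) / 12.0
    y_mean = sum(prices) / n
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(prices))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - (intercept + slope * i)) ** 2 for i, y in enumerate(prices))
    s = math.sqrt(sse / (n - 2))               # residual standard error, df = n - 2
    leverage = 1.0 / n + (x_eval - x_mean) ** 2 / sxx
    fitted = intercept + slope * x_eval
    return (fitted,
            t_crit * s * math.sqrt(leverage),
            t_crit * s * math.sqrt(1.0 + leverage))

# Synthetic 10-bar window: a mild uptrend with alternating noise.
prices = [1.10 + 0.001 * i + (0.002 if i % 2 else -0.002) for i in range(10)]
edge = regression_bands(prices, t_crit=2.306, x_eval=len(prices) - 1)   # x = n-1
ahead = regression_bands(prices, t_crit=2.306, x_eval=len(prices))      # x = n
print(edge)
print(ahead)
```

Moving the evaluation point from x = n−1 to x = n increases the leverage, so both half-widths grow slightly in the one-step-ahead mode.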

Deployment

All six files are presented complete and non-truncated. To deploy:

  • Create a folder named Linear_Regression_Prediction_Channels under MQL5/Include/ in the MetaTrader 5 data directory, and place the five .mqh files there. The angle-bracket include paths in RegressionChannels.mq5 resolve against the Include directory.
  • Place RegressionChannels.mq5 under MQL5/Indicators/ (a subfolder is optional).
  • Open the indicator in MetaEditor and compile with F7. It should compile with no errors and no warnings.
  • Attach it to a chart from the Navigator. Input changes — including toggling bands on or off and switching the evaluation mode — are applied cleanly on the indicator's normal re-initialization, so there is no need to remove and re-add it.

No external libraries are required beyond the five project headers.
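For orientation, the include directives implied by that folder layout would look roughly like the fragment below. This is a sketch that assumes the five headers sit directly inside the Linear_Regression_Prediction_Channels folder; the exact directives and their order in the attached RegressionChannels.mq5 may differ, for example if one header already includes another.

```mql5
// Assumed layout: MQL5/Include/Linear_Regression_Prediction_Channels/*.mqh
// Angle brackets make the compiler search relative to the MQL5/Include directory.
#include <Linear_Regression_Prediction_Channels\OLSStatistics.mqh>
#include <Linear_Regression_Prediction_Channels\ResidualAnalysis.mqh>
#include <Linear_Regression_Prediction_Channels\TDistribution.mqh>
#include <Linear_Regression_Prediction_Channels\ConfidenceInterval.mqh>
#include <Linear_Regression_Prediction_Channels\PredictionInterval.mqh>
```

If the compiler reports that a header cannot be opened, the folder name or its location under MQL5/Include is the first thing to check.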


Section 5: Expected Diagnostics Output

When the indicator is attached to a EURUSD H1 chart with a 50-bar window, a 95% confidence level, and the default in-sample edge mode, it writes a single diagnostic block to the Experts log on the first fully valid bar. The following is the actual output from such a run:

=== RegressionChannels Diagnostics ===
Evaluation Mode     = In-Sample Edge (x = n-1)
Evaluation x        = 49
Bar Index           = 49
Regression Window   = 50
Degrees Of Freedom  = 48
Regression Slope    = 0.00002402
Regression Intercept= 1.12126659
SSE                 = 0.00034255
Residual Variance   = 0.00000714
Residual Std Error  = 0.00267141
t Critical Value    = 2.011095
x Mean              = 24.5000
Sxx                 = 10412.5000
--- Confidence Interval ---
CI Fitted           = 1.12244
CI Upper            = 1.12394
CI Lower            = 1.12095
CI Half Width       = 0.00150
--- Prediction Interval ---
PI Fitted           = 1.12244
PI Upper            = 1.12802
PI Lower            = 1.11687
PI Half Width       = 0.00558
--- Width Ratio PI/CI ---
Width Ratio         = 3.725425
======================================

Several independent checks confirm the implementation behaves as the theory requires. The Evaluation Mode and Evaluation x lines make explicit which point the bands describe: here the in-sample edge at x = 49, the last bar inside the 50-bar window. The reported x Mean of 24.5000 and Sxx of 10412.5000 match the closed-form values for the integer grid 0…49 exactly — x̄ = (n−1)/2 = 24.5 and Sxx = n(n²−1)/12 = 10412.5 — which is the same arithmetic the initialization self-check verifies before any drawing occurs.
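The closed-form check can be reproduced in isolation with a short script. The function names below are illustrative and not the article's; the sketch compares a brute-force sum over the grid 0…n−1 with the closed form n(n²−1)/12.

```mql5
// Sketch: brute-force versus closed-form x-domain statistics on the grid 0..n-1.
// Function names are illustrative, not taken from the article's modules.
double BruteForceSxx(const int n)
  {
   const double x_mean = (n - 1) / 2.0;
   double sxx = 0.0;
   for(int i = 0; i < n; i++)
      sxx += (i - x_mean) * (i - x_mean);   // sum of squared deviations from the mean
   return(sxx);
  }

double ClosedFormSxx(const int n)
  {
   return(n * ((double)n * n - 1.0) / 12.0); // n(n^2 - 1)/12
  }

void OnStart()
  {
   const int n = 50;
   PrintFormat("x mean     = %.4f", (n - 1) / 2.0);
   PrintFormat("Sxx brute  = %.4f", BruteForceSxx(n));
   PrintFormat("Sxx closed = %.4f", ClosedFormSxx(n));
  }
```

For n = 50 both routes give 10412.5 and the mean is 24.5, matching the x Mean and Sxx lines of the diagnostic block.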

The t critical value of 2.011095 agrees with the tabulated 97.5th percentile of the Student's t distribution for 48 degrees of freedom (approximately 2.0106) to within about 0.0005, a relative difference of roughly 0.02%. That gap is negligible against the band widths involved and confirms that the Cornish-Fisher approximation is adequate at the degrees of freedom in use. Because this value is printed, it can be checked against a reference table directly, which is the intended way to validate the approximation rather than trusting it blindly.

The interval relationships hold as constructed. The fitted value is identical for the confidence and prediction intervals (1.12244) because both are centered on the same regression line; only their half-widths differ. The prediction half-width (0.00558) is several times the confidence half-width (0.00150), giving a width ratio of 3.725425. This ratio reflects the dominance of irreducible individual-observation variance over mean-estimation uncertainty at the evaluated point. The value is position-dependent: at the window center the ratio would be larger, since the shared leverage term is smallest there, and in one-step-ahead mode at x = 50 it would be slightly smaller, since the leverage term grows just beyond the window edge.
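These relationships can be recomputed from the logged numbers alone. The script below is a sketch, not part of the article's modules: Leverage and WidthRatio are hypothetical helper names, and the residual standard error and t critical value are copied from the diagnostic block above.

```mql5
// Leverage term h(x) = 1/n + (x - x_mean)^2 / Sxx on the integer grid 0..n-1.
double Leverage(const int n, const double x)
  {
   const double x_mean = (n - 1) / 2.0;
   const double sxx    = n * ((double)n * n - 1.0) / 12.0;
   return(1.0 / n + (x - x_mean) * (x - x_mean) / sxx);
  }

// PI/CI width ratio sqrt(1 + h) / sqrt(h); s and the t value cancel out.
double WidthRatio(const int n, const double x)
  {
   const double h = Leverage(n, x);
   return(MathSqrt((1.0 + h) / h));
  }

void OnStart()
  {
   const int    n      = 50;
   const double s      = 0.00267141;   // Residual Std Error from the log
   const double t_crit = 2.011095;     // t Critical Value from the log
   const double h_edge = Leverage(n, n - 1);

   PrintFormat("CI half width at x=49 : %.5f", t_crit * s * MathSqrt(h_edge));
   PrintFormat("PI half width at x=49 : %.5f", t_crit * s * MathSqrt(1.0 + h_edge));
   PrintFormat("ratio at edge, x=49   : %.4f", WidthRatio(n, n - 1));
   PrintFormat("ratio at center, x=24.5: %.4f", WidthRatio(n, (n - 1) / 2.0));
   PrintFormat("ratio one ahead, x=50 : %.4f", WidthRatio(n, n));
  }
```

Run as a script, this should reproduce the logged half-widths of 0.00150 and 0.00558 and an edge ratio of about 3.73. The center ratio comes out near 7.14 and the one-step-ahead ratio near 3.62, which is the ordering the paragraph above describes.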



Rolling OLS regression channel on a EURUSD H1 chart, showing the solid blue regression line, the two solid green confidence-interval lines nested inside, and the two dashed crimson prediction-interval lines outside, with all five lines narrowing toward the window center and widening at the edges.


Section 6: Reading the Channel on a Live Chart

When attached to a trending market, the Data Window readout for any single bar lists the five lines in a consistent order: PI Lower < CI Lower < Regression < CI Upper < PI Upper. A representative live reading on EURUSD H1 produced PI Lower 1.14343, CI Lower 1.14740, Regression 1.14886, CI Upper 1.15032, and PI Upper 1.15429. The strict nesting holds at every bar with a nonzero residual standard error. Both half-widths share the factor t·s; the confidence half-width multiplies it by √h and the prediction half-width by √(1 + h), where h = 1/n + (x − x̄)²/Sxx is the leverage term, and √(1 + h) always exceeds √h.

The interpretation of a boundary touch differs by line. Price reaching or exceeding a confidence-interval line is unremarkable; the confidence interval describes uncertainty in the trend itself, not in individual prices, so prices routinely sit outside it. Price reaching or exceeding a prediction-interval line is the statistically meaningful event. The strength of that statement depends on the evaluation mode. In one-step-ahead mode (x = n), the prediction line is a forecast for the next bar, and under the model roughly one close in twenty should fall outside the 95% band; a markedly higher rate signals assumption violations and should be confirmed with the out-of-sample coverage test of Section 8. In in-sample edge mode (x = n−1), the line describes dispersion at a bar the model has already fitted, so the "one in twenty" reading does not transfer directly and the band is better understood as a descriptive edge than as a forecast.


Section 7: Geometric Interpretation of the Widening Effect

The widening of both intervals toward the window boundaries is often described informally as a consequence of having less data at the edges. This description is incorrect, and the distinction matters for interpretation.

The regression window contains exactly n observations everywhere; there is no data-density gradient. The widening is a consequence of the hat matrix geometry. When a regression line is fitted to a scatter of points, the slope is estimated by effectively rotating the line around the centroid (x̄, ȳ) of the data. Small perturbations in the slope, caused by the uncertainty in the slope estimator, produce large displacements of the fitted line at positions far from x̄ and small displacements near x̄.

Formally, if the slope estimate β₁ has standard error SE(β₁) = s/√Sxx, then the uncertainty in ŷ(x) is:

var(ŷ(x)) = s² · [ 1/n + (x − x̄)² / Sxx ]

The (x − x̄)² term grows quadratically with distance from the centroid. At x̄ it vanishes entirely, leaving only the 1/n term. At the window edges, where x = 0 or x = n−1, the deviation is ±(n−1)/2 and the leverage is maximized.
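The step from the slope's standard error to the variance of the fitted value can be written out explicitly. The derivation below is the standard textbook one and assumes uncorrelated errors with common variance σ², which is what makes the covariance between ȳ and the slope estimate vanish; replacing σ² with its estimate s² gives the expression above.

```latex
\begin{aligned}
\hat{y}(x) &= \bar{y} + \hat{\beta}_1\,(x - \bar{x}), \\
\operatorname{Var}(\bar{y}) &= \frac{\sigma^2}{n}, \qquad
\operatorname{Var}(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}}, \qquad
\operatorname{Cov}(\bar{y}, \hat{\beta}_1) = 0, \\
\operatorname{Var}\bigl(\hat{y}(x)\bigr)
  &= \sigma^2 \left[ \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}} \right], \\
h(0) = h(n-1)
  &= \frac{1}{n} + \frac{3\,(n-1)}{n\,(n+1)}
   = \frac{2\,(2n-1)}{n\,(n+1)}
  \quad\text{using } \bar{x} = \tfrac{n-1}{2},\; S_{xx} = \tfrac{n(n^2-1)}{12}.
\end{aligned}
```

For n = 50 the edge leverage is 198/2550 ≈ 0.0776, close to four times the 1/n = 0.02 found at the centroid, so the confidence band is roughly twice as wide at either edge of the window as at its center.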


Section 8: Residual Distribution and Model Diagnostics

A well-specified OLS model should produce residuals that are approximately normally distributed with constant variance. Systematic departures indicate model misspecification.


Distribution of residuals from a rolling 50-bar OLS fit on synthetic EURUSD H1 closing prices, overlaid with the Gaussian curve implied by the estimated residual variance. The empirical distribution is sharply peaked at the center and carries heavier tails than the Gaussian (excess kurtosis ≈ +2.98) while remaining nearly symmetric (skewness ≈ +0.04). This leptokurtic shape illustrates why the nominal 95% prediction interval coverage should be treated as approximate: fat-tailed residuals breach the band more often than a normal model predicts.

In practice, financial price residuals frequently exhibit:

  • Excess kurtosis (fat tails): more extreme residuals than a normal distribution predicts, which causes the actual coverage of the 95% prediction interval to fall below 95%; the bands will be breached more often than the nominal level implies.
  • Skewness: asymmetric residual distributions, common in instruments with directional momentum.
  • Autocorrelation in residuals: if the Durbin-Watson statistic deviates materially from 2.0, the OLS standard errors are biased and the interval widths are unreliable.

None of these violations make the formula derivations algebraically incorrect. They cause the probabilistic coverage to diverge from the nominal values. A statistically literate user of this indicator treats the 95% labels as model outputs conditional on the OLS assumptions and calibrates interpretation accordingly.
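The three diagnostics listed above can be computed from any array of residuals. The function below is a minimal sketch with a hypothetical name, not one of the article's modules. It uses population (1/n) moments and leaves it to the caller to decide whether the residuals come from a single window or are pooled across windows.

```mql5
// Hypothetical helper: skewness, excess kurtosis and Durbin-Watson for a residual array.
// Uses population (1/n) moments. Returns false for fewer than 3 values or zero variance.
bool ResidualDiagnostics(const double &res[], double &skew, double &excess_kurt, double &dw)
  {
   const int n = ArraySize(res);
   if(n < 3)
      return(false);

   double mean = 0.0;
   for(int i = 0; i < n; i++)
      mean += res[i];
   mean /= n;

   double m2 = 0.0, m3 = 0.0, m4 = 0.0;      // central moment accumulators
   double sum_sq = 0.0, sum_diff_sq = 0.0;   // Durbin-Watson denominator and numerator
   for(int i = 0; i < n; i++)
     {
      const double d = res[i] - mean;
      m2 += d * d;
      m3 += d * d * d;
      m4 += d * d * d * d;
      sum_sq += res[i] * res[i];
      if(i > 0)
        {
         const double step = res[i] - res[i - 1];
         sum_diff_sq += step * step;
        }
     }
   m2 /= n;
   m3 /= n;
   m4 /= n;
   if(m2 <= 0.0 || sum_sq <= 0.0)
      return(false);

   skew        = m3 / MathPow(m2, 1.5);
   excess_kurt = m4 / (m2 * m2) - 3.0;       // 0 for a normal distribution
   dw          = sum_diff_sq / sum_sq;       // near 2 when residuals are uncorrelated
   return(true);
  }

void OnStart()
  {
   double res[] = {1.0, -1.0, 1.0, -1.0};    // toy residuals that alternate in sign
   double skew, kurt, dw;
   if(ResidualDiagnostics(res, skew, kurt, dw))
      PrintFormat("skew=%.2f excess kurtosis=%.2f DW=%.2f", skew, kurt, dw);
  }
```

The toy input alternates in sign, so it yields zero skewness, an excess kurtosis of −2, and a Durbin-Watson value of 3, the signature of negative autocorrelation. Raw-price residuals typically err in the other direction, with Durbin-Watson values well below 2.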

Empirical Coverage and Robust Alternatives

Because the nominal 95% is a model statement, it should be checked rather than assumed. A minimal empirical test records, over a long out-of-sample stretch, the fraction of bars whose close falls outside the one-step-ahead 95% prediction interval evaluated at x = n. Under correct specification that fraction should be near 5%; a materially higher rate is direct evidence that the assumptions — most often homoscedasticity or normality — are being violated by the prevailing regime. A more complete study repeats this across instruments and timeframes and compares the breach rate against a Bollinger envelope of comparable width.
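A minimal version of this test is sketched below. The function name and the array alignment are assumptions, not taken from the indicator's source: lower[i] and upper[i] are taken to hold the x = n band computed from the window ending at bar i, so they are compared against the close of bar i + 1. If the buffers are aligned differently, the index offset has to change accordingly.

```mql5
// Hypothetical helper: fraction of closes falling outside a one-step-ahead band.
// Assumption: lower[i] / upper[i] come from the window ending at bar i (evaluated
// at x = n), so they are forecasts for close[i + 1]. Non-series indexing.
// Returns -1.0 when no bar could be tested.
double ForecastBreachRate(const double &close[], const double &lower[],
                          const double &upper[], int &tested)
  {
   tested = 0;
   int breaches = 0;
   int n = ArraySize(close);
   if(ArraySize(lower) < n) n = ArraySize(lower);
   if(ArraySize(upper) < n) n = ArraySize(upper);

   for(int i = 0; i + 1 < n; i++)
     {
      if(lower[i] == EMPTY_VALUE || upper[i] == EMPTY_VALUE)
         continue;                            // no valid band at this bar
      tested++;
      if(close[i + 1] < lower[i] || close[i + 1] > upper[i])
         breaches++;
     }
   return(tested > 0 ? (double)breaches / tested : -1.0);
  }

void OnStart()
  {
   double close[] = {1.0, 2.0, 3.0, 10.0};
   double lower[] = {0.0, 1.0, 2.0, 0.0};
   double upper[] = {3.0, 4.0, 5.0, 0.0};
   int tested;
   const double rate = ForecastBreachRate(close, lower, upper, tested);
   PrintFormat("tested=%d breach rate=%.4f", tested, rate);
  }
```

The toy data gives one breach in three tested bars. On real data the sample size matters: if breaches were independent, the standard error of a 5% rate over 2,000 bars would be about √(0.05 · 0.95 / 2000) ≈ 0.005, so an observed rate of 6% is weak evidence while 8% is not.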

Two structural refinements address the most common violations. First, fitting the regression on log prices or on price differences rather than on raw price levels mitigates the near-non-stationarity that makes raw-price residuals strongly autocorrelated. Second, replacing the homoscedastic standard error s with a heteroscedasticity-robust estimator (such as HC3) or a heteroscedasticity-and-autocorrelation-consistent estimator (such as Newey-West), or modeling the conditional variance separately with a GARCH-type specification, produces interval widths that adapt to volatility clustering. These are natural extensions of the architecture presented here: only the residual-variance module would change, while the OLS, t-distribution, and interval modules remain intact.


Section 9: Comparison of Channel Indicators

| Indicator | Width Determinant | Position Dependence | Statistical Interpretation |
| --- | --- | --- | --- |
| Donchian Channel | Historical high/low range | None | Descriptive extrema |
| Bollinger Bands | Sample std dev × fixed multiplier | None (uniform width) | Heuristic volatility envelope |
| OLS Confidence Interval | Mean estimation uncertainty | Yes, narrowest at x̄ | Uncertainty in estimated trend |
| OLS Prediction Interval | Mean + observation uncertainty | Yes, narrowest at x̄ | Uncertainty for an individual observation |

The critical structural distinction is position dependence. Bollinger Bands produce a symmetric, uniform envelope regardless of where a bar falls relative to the moving-average window. Regression channels produce position-dependent envelopes that are narrowest at the center of the regression window and widen deterministically toward both edges. This non-uniform width is a direct consequence of the statistical model rather than a stylistic choice.

Bollinger Bands should not be characterized as statistically invalid. They are a well-defined descriptive tool for measuring volatility relative to a moving average. Their limitation is that they make no formal probability claims and cannot be calibrated to a target coverage level without specifying an explicit generative model.


Conclusion

This article presented a complete MQL5 implementation of linear regression channels built on Ordinary Least Squares estimation, Student's t critical values, and formal confidence and prediction intervals. The slope and intercept are estimated in a single pass over each window, and the residual variance uses the n−2 denominator that accounts for the two estimated parameters. The t critical values, obtained from a Cornish-Fisher approximation whose output is meant to be checked against a reference table, replace the fixed heuristic multiplier that descriptive channels rely on.

Two distinctions carry the practical weight of the design. First, the confidence interval describes uncertainty in the estimated trend, while the wider prediction interval adds the scatter of an individual observation; both narrow at the window center and widen toward the edges as leverage grows. Second, the band evaluated at x = n−1 is an in-sample edge band, whereas the band at x = n is a genuine one-step-ahead forecast, and the indicator exposes this as a selectable mode so the displayed channel can be labeled honestly. Throughout, the nominal 95% is exact only within the model: on financial prices, where independence, homoscedasticity, normality, and stationarity are routinely violated, it is a motivated approximation that should be confirmed by an empirical coverage test rather than assumed.

On the engineering side, the per-bar cost is an O(n) pass over the price window — already reduced in constant factor by precomputing the fixed x-domain statistics, and reducible to O(1) with running sums. Rendering uses five distinct DRAW_LINE plots to sidestep MetaTrader 5's unreliable overlapping-fill behavior and its lack of an alpha channel. Calibrating channel width from an explicit model and its true degrees of freedom, rather than from a heuristic multiplier, gives these channels a clearer interpretation than Bollinger or Donchian envelopes — provided their coverage claims are validated before being trusted as probabilistic signals.


Programs used in the article

| # | Name | Type | Description |
| --- | --- | --- | --- |
| 1 | OLSStatistics.mqh | Include File | Fits the ordinary least squares regression line to each window of closing prices, returning the slope, intercept, and sum of squared errors; it takes the fixed x-axis statistics as inputs so they are computed once rather than on every bar. |
| 2 | ResidualAnalysis.mqh | Include File | Turns the regression's sum of squared errors into the residual variance and standard error, dividing by the n−2 degrees of freedom that remain after the slope and intercept are estimated. |
| 3 | TDistribution.mqh | Include File | Supplies the Student's t critical value for any degrees of freedom and confidence level using a Cornish-Fisher approximation, and reports a clear failure signal instead of substituting a fallback number when the input is invalid. |
| 4 | ConfidenceInterval.mqh | Include File | Computes the confidence interval around the regression line (the band describing uncertainty in the estimated trend itself) and defines the shared band structure that the prediction-interval module also uses. |
| 5 | PredictionInterval.mqh | Include File | Computes the wider prediction interval, which adds the scatter of an individual observation to the trend uncertainty and so describes where a single price point is expected to fall. |
| 6 | RegressionChannels.mq5 | Custom Indicator | The main indicator that ties the modules together: it runs the rolling regression bar by bar, draws the regression line and the four interval boundaries as five separate lines, lets the user choose between an in-sample edge band and a one-step-ahead forecast band, verifies its own setup constants, and prints a diagnostic summary to the Experts log. |
| 7 | Linear_Regression.zip | Zip Archive | Zip archive containing all the attached files and their paths relative to the terminal's root folder. |
