
Detrended Fluctuation Analysis (DFA)

Detrended Fluctuation Analysis (DFA) is a powerful method for analyzing long-range correlations in time series data. It is widely used in various fields, including finance, biology, and physics. This article will guide you through the DFA algorithm step by step, including the necessary formulas and code snippets to help you understand each stage of the process.

Overview

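DFA aims to identify the scaling properties of non-stationary time series. Unlike traditional methods such as autocorrelation or spectral analysis, which assume stationarity, DFA can handle data with trends and other non-stationarities. The core idea is to examine how the size of the fluctuations around a local trend varies with the time scale over which they are measured.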

Key Steps in DFA

Cumulative Sum of the Data
Detrending the Data and Calculating Local Fluctuations
Calculating Total Fluctuation
Estimating the DFA Exponent

Step 1: Cumulative Sum of the Data

The first step in DFA is to subtract the mean from the time series and compute the cumulative sum of the result. This integrated series is often called the profile.

\[y(t) = \sum_{i=1}^{t} (x_i - \langle x \rangle)\]

where:

\(y(t)\) is the cumulative sum at time t.
\(x_i\) is the value of the time series at index i.
\(\langle x \rangle\) is the mean of the time series.

Python Code

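import numpy as np

def cumulative_sum(x):
    x = np.asarray(x, dtype=float).ravel()  # accept lists or pandas objects
    return np.cumsum(x - np.mean(x))  # subtract the mean, then integrate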
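
As a quick check of the formula, the following sketch integrates a five-point toy series. The function name `profile` and the input values are made up for illustration. The mean is 3, so the deviations are -2, -1, 0, 1, 2, and their running sum is -2, -3, -3, -2, 0. The final value is zero, as it must be for any mean-subtracted series.

```python
import numpy as np

def profile(x):
    """Mean-subtracted cumulative sum (hypothetical helper mirroring Step 1)."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x - x.mean())

x = [1, 2, 3, 4, 5]   # toy data, mean = 3
y = profile(x)        # running sum of the deviations -2, -1, 0, 1, 2
print(y)
```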

Step 2: Detrending the Data and Calculating Local Fluctuations

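Next, we split the cumulative sum into non-overlapping segments of length n and fit a straight line to each segment by linear regression to capture its local trend. Samples at the end of the series that do not fill a complete segment are discarded.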

For each segment:

\[X = (y_{(i-1)n+1}, \ldots, y_{in}) \quad \text{(cumulative-sum values in segment } i\text{)}\]

\[Y_t = \text{slope} \cdot t + \text{intercept} \quad \text{(fitted line)}\]

The detrended values are computed as:

\[\varepsilon_t = y_t - Y_t\]

The root-mean-square deviation from the local trend is then calculated for each segment, which gives the fluctuation function for that segment:

\[F(n, i) = \sqrt{\frac{1}{n} \sum_{t=(i-1)n+1}^{in} \left( y_t - Y_t \right)^2}\]

where:

\(F(n, i)\) is the fluctuation for segment i of length n.
\(Y_t\) is the fitted value from the linear regression.

Python Code

import numpy as np

def calc_rms(x, scale):
    x = np.asarray(x, dtype=float).ravel()
    # Split x into non-overlapping windows of length `scale`,
    # dropping leftover samples at the end
    n_windows = x.shape[0] // scale
    X = x[:n_windows * scale].reshape(n_windows, scale)
    scale_ax = np.arange(scale)
    rms = np.zeros(X.shape[0])

    for e, xcut in enumerate(X):
        coeff = np.polyfit(scale_ax, xcut, 1)  # Linear regression
        xfit = np.polyval(coeff, scale_ax)  # Fitted values
        rms[e] = np.sqrt(np.mean((xcut - xfit)**2))  # RMS of the detrended data

    return rms
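
To see what the detrending step does, here is a minimal sketch under two assumptions: the helper name `segment_rms` is hypothetical, and the windows are built with a plain `reshape`. A profile that is exactly linear is removed completely by the fit, so every segment has (numerically) zero fluctuation, while adding noise to it leaves a nonzero residual.

```python
import numpy as np

def segment_rms(y, n):
    """RMS residual around a least-squares line in each window of length n."""
    y = np.asarray(y, dtype=float)
    n_seg = len(y) // n                       # leftover samples are dropped
    windows = y[:n_seg * n].reshape(n_seg, n)
    t = np.arange(n)
    out = np.empty(n_seg)
    for i, w in enumerate(windows):
        slope, intercept = np.polyfit(t, w, 1)
        out[i] = np.sqrt(np.mean((w - (slope * t + intercept)) ** 2))
    return out

t_all = np.arange(40)
line = 0.5 * t_all + 2.0                      # a pure linear trend
rng = np.random.default_rng(0)                # seeded so the run is repeatable
noisy = line + rng.normal(size=t_all.size)

rms_line = segment_rms(line, 10)              # essentially zero in all 4 segments
rms_noisy = segment_rms(noisy, 10)            # strictly positive
```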

Step 3: Calculating Total Fluctuation

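The total fluctuation at scale n is the root-mean-square of the segment fluctuations \(F(n, i)\) over all \(N/n\) segments, where \(N\) is the length of the series: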

\[F(n)={\sqrt {{\frac {1}{N/n}}\sum _{i=1}^{N/n}F(n,i)^{2}}}\]

Python Code

def calculate_fluctuations(y, scales):
    fluct = np.zeros(len(scales))

    for e, sc in enumerate(scales):
        fluct[e] = np.sqrt(np.mean(calc_rms(y, sc)**2))

    return fluct
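
Because all segments have the same length, the root-mean-square of the per-segment values F(n, i) equals the root-mean-square of all detrended residuals pooled together. The sketch below checks this numerically on a synthetic random walk. The names `residuals`, `f_segments`, and `f_pooled`, the window length, and the input data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=600))           # synthetic profile (random walk)
n = 50                                        # window length
windows = y[: (len(y) // n) * n].reshape(-1, n)
t = np.arange(n)

# Detrend every window with its own least-squares line
residuals = np.array([w - np.polyval(np.polyfit(t, w, 1), t) for w in windows])

f_segments = np.sqrt(np.mean(residuals ** 2, axis=1))   # F(n, i)
f_total = np.sqrt(np.mean(f_segments ** 2))             # F(n) as in the formula
f_pooled = np.sqrt(np.mean(residuals ** 2))             # RMS over all residuals
```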

Step 4: Estimating the DFA Exponent

Finally, we fit a line to the logarithmic values of the fluctuation function versus the scale to estimate the DFA exponent α.

\[\log F(n) = \alpha \log n + \beta\]

Python Code

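import matplotlib.pyplot as plt

def dfa(x, scale_lim=(5, 9), scale_dens=0.25, show=False):
    y = cumulative_sum(x)
    # Window sizes evenly spaced in log2; np.int was removed in NumPy 1.24, so use int
    scales = (2 ** np.arange(scale_lim[0], scale_lim[1], scale_dens)).astype(int)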
    fluct = calculate_fluctuations(y, scales)
    
    coeff = np.polyfit(np.log2(scales), np.log2(fluct), 1)
    
    if show:
        plt.loglog(scales, fluct, 'bo')
        plt.loglog(scales, 2**np.polyval(coeff, np.log2(scales)), 'r', label=r'$\alpha$ = %0.2f' % coeff[0])
        plt.title('DFA')
        plt.xlabel('window size n')  # loglog already puts both axes on a log scale
        plt.ylabel('F(n)')
        plt.legend()
        plt.show()

    return scales, fluct, coeff[0]
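
The fitting step can be checked in isolation. The sketch below builds the default window sizes used by `dfa`, feeds in an exact power law F(n) = 3 n^0.7 (the constants 3 and 0.7 are arbitrary values chosen for illustration), and recovers the exponent as the slope of the log-log fit.

```python
import numpy as np

# Default window sizes from dfa(): 16 values between 32 and 430
scales = (2 ** np.arange(5, 9, 0.25)).astype(int)
fluct = 3.0 * scales ** 0.7                 # synthetic F(n), not real data

slope, intercept = np.polyfit(np.log2(scales), np.log2(fluct), 1)
alpha = slope                # recovers 0.7 up to rounding error
prefactor = 2 ** intercept   # recovers 3.0 up to rounding error
```

Any logarithm base gives the same slope, as long as the same base is used for both axes.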

A straight line of slope \(\alpha\) on the log-log plot indicates a statistical self-affinity of form \(F(n)\propto n^{\alpha }\). Since \(F(n)\) monotonically increases with \(n\), we always have \(\alpha>0\).

The scaling exponent \(\alpha\) is a generalization of the Hurst exponent, and its value gives information about the self-correlations of the series:

· \(\alpha <1/2\): anti-correlated

· \(\alpha \simeq 1/2\): uncorrelated, white noise

· \(\alpha >1/2\): correlated

· \(\alpha \simeq 1\): 1/f-noise, pink noise

· \(\alpha >1\): non-stationary, unbounded

· \(\alpha \simeq 3/2\): Brownian noise

Because the expected displacement in an uncorrelated random walk of length N grows like \(\sqrt {N}\), an exponent of \(\tfrac {1}{2}\) would correspond to uncorrelated white noise. When the exponent is between 0 and 1, the result is fractional Gaussian noise.
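
The interpretation above can be sanity-checked on synthetic signals whose exponents are known. The sketch below uses a compact, self-contained DFA and applies it to seeded white noise and to its cumulative sum, a random walk. The function name `dfa_alpha`, the scale range, and the series length are assumptions made for this example. The estimates should land near 1/2 and 3/2 respectively, although finite-sample estimates will not hit those values exactly.

```python
import numpy as np

def dfa_alpha(x, scales):
    """Estimate the DFA exponent with linear detrending (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                    # Step 1: profile
    fluct = []
    for n in scales:
        n_seg = len(y) // n
        windows = y[:n_seg * n].reshape(n_seg, n)  # Step 2: segments
        t = np.arange(n)
        # polyfit accepts a 2-D y and fits one line per column
        coeffs = np.polyfit(t, windows.T, 1)
        trend = np.outer(t, coeffs[0]) + coeffs[1]
        # Step 3: pooled RMS of residuals (equal to the RMS of F(n, i))
        fluct.append(np.sqrt(np.mean((windows.T - trend) ** 2)))
    # Step 4: slope of log F(n) against log n
    return np.polyfit(np.log(scales), np.log(fluct), 1)[0]

rng = np.random.default_rng(42)                    # seeded for repeatability
white = rng.normal(size=8192)                      # uncorrelated noise
walk = np.cumsum(white)                            # Brownian-type signal
scales = np.unique((2 ** np.arange(4, 9, 0.5)).astype(int))

alpha_white = dfa_alpha(white, scales)             # expected near 0.5
alpha_walk = dfa_alpha(walk, scales)               # expected near 1.5
print(alpha_white, alpha_walk)
```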

Complete Code Example

Combining all the parts, here’s the complete code for DFA applied to daily Bitcoin returns over the past 5 years:

import numpy as np
import matplotlib.pyplot as plt

def cumulative_sum(x):
    x = np.asarray(x, dtype=float).ravel()  # accept lists or pandas objects
    return np.cumsum(x - np.mean(x))  # subtract the mean, then integrate

def calc_rms(x, scale):
    # Split x into non-overlapping windows of length `scale`,
    # dropping leftover samples at the end
    n_windows = x.shape[0] // scale
    X = x[:n_windows * scale].reshape(n_windows, scale)
    scale_ax = np.arange(scale)
    rms = np.zeros(X.shape[0])

    for e, xcut in enumerate(X):
        coeff = np.polyfit(scale_ax, xcut, 1)
        xfit = np.polyval(coeff, scale_ax)
        rms[e] = np.sqrt(np.mean((xcut - xfit)**2))

    return rms

def calculate_fluctuations(y, scales):
    fluct = np.zeros(len(scales))

    for e, sc in enumerate(scales):
        fluct[e] = np.sqrt(np.mean(calc_rms(y, sc)**2))

    return fluct

def dfa(x, scale_lim=(5, 9), scale_dens=0.25, show=False):
    y = cumulative_sum(x)
    # Window sizes evenly spaced in log2; np.int was removed in NumPy 1.24, so use int
    scales = (2 ** np.arange(scale_lim[0], scale_lim[1], scale_dens)).astype(int)
    fluct = calculate_fluctuations(y, scales)

    coeff = np.polyfit(np.log2(scales), np.log2(fluct), 1)

    if show:
        plt.loglog(scales, fluct, 'bo')
        plt.loglog(scales, 2**np.polyval(coeff, np.log2(scales)), 'r', label=r'$\alpha$ = %0.2f' % coeff[0])
        plt.title('DFA')
        plt.xlabel('window size n')  # loglog already puts both axes on a log scale
        plt.ylabel('F(n)')
        plt.legend()
        plt.show()

    return scales, fluct, coeff[0]

if __name__ == '__main__':
    import yfinance as yf

    # Download BTC-USD data from yfinance
    data = yf.download('BTC-USD', period='5y')

    # Calculate returns: (P_t - P_t-1) / P_t-1
    r = data['Close'].diff() / data['Close'].shift(1)
    r.dropna(inplace=True)

    scales, fluct, alpha = dfa(r, show=True)
    print("Scales:", scales)
    print("Fluctuations:", fluct)
    print("DFA Exponent: {}".format(alpha))

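[Figure: DFA log-log plot of F(n) against window size, with the fitted line]

We get \(\alpha = 0.57\), slightly above 1/2, which indicates weakly persistent behavior in the returns. This means that trends tend to continue. In financial markets, this suggests that an upward movement in price tends to be followed by further upward movements, and vice versa for downward movements, although an exponent this close to 1/2 is not far from uncorrelated noise.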

Conclusion

This article provided a comprehensive guide to Detrended Fluctuation Analysis (DFA). By following the outlined steps, you can analyze the scaling behavior of your time series data effectively. The accompanying code snippets allow you to implement DFA and visualize the results, aiding in understanding the long-range correlations in your data.