
A Bitcoin Trading Strategy with State-Space Models in Python and Backtrader

In the quest for profitable algorithmic trading strategies, identifying and acting on market trends is a cornerstone. While simple moving averages can offer a glimpse, more sophisticated statistical models can provide deeper insights into the underlying dynamics of price movements. This article explores a Python-based Bitcoin trading strategy that leverages state-space models to decompose price data, estimate trends, and generate trading signals, all within the flexible Backtrader framework.

We’ll walk through the components of this strategy, from data acquisition and model fitting to signal generation and backtesting, complete with code explanations to help you understand and potentially adapt this approach.

The Core Idea: State-Space Models for Trend Estimation

At the heart of our strategy lies the concept of a State-Space Model (SSM). SSMs provide a powerful way to represent a time series (like Bitcoin prices) through a set of unobserved or latent “state” variables that evolve over time. The observed data is then considered a function of these hidden states.

Specifically, we use the UnobservedComponents model from the statsmodels library in Python. This allows us to decompose the time series into components like:

Trend: The underlying direction of the price. We’ll model this as a “local linear trend” ('lltrend'), which means the model estimates both the current level of the trend and its slope (rate of change). Importantly, these components are stochastic, allowing them to change over time to adapt to new market conditions.
Seasonality (Optional): Patterns that repeat over a fixed period (e.g., weekly). While not the focus of this specific version, it can be incorporated.
Irregular Component (Noise): The random fluctuations not captured by other components.

By filtering the observed data through this model, we can obtain estimates of the unobserved trend level and, crucially for our strategy, the slope of the trend. A positive slope suggests an uptrend, while a negative slope indicates a downtrend.
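For reference, the local linear trend specification can be written out as below. This is the standard textbook form; the symbols (μ for the level, β for the slope, and the three disturbance variances) are notation introduced here for illustration, not names used in the code.

```latex
\begin{aligned}
y_t         &= \mu_t + \varepsilon_t,      & \varepsilon_t &\sim N(0, \sigma_\varepsilon^2) \\
\mu_{t+1}   &= \mu_t + \beta_t + \eta_t,   & \eta_t        &\sim N(0, \sigma_\eta^2) \\
\beta_{t+1} &= \beta_t + \zeta_t,          & \zeta_t       &\sim N(0, \sigma_\zeta^2)
\end{aligned}
```

Here y_t is the observed series, μ_t the trend level, and β_t the slope that the strategy trades on. Fitting estimates the three variances; filtering then yields an estimate of μ_t and β_t for each bar.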

Before feeding prices into the model, we apply a logarithmic transformation (np.log(prices)). This is a common practice in financial modeling as it helps to stabilize the variance of the series and makes exponential growth appear linear, often improving model fit and interpretation.
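A quick numeric illustration of that point, using made-up prices that grow 1% per bar (the values are purely hypothetical): after the log transform, consecutive differences are constant, which is exactly what a linear trend looks like.

```python
import math

# Hypothetical price path growing 1% per bar (illustrative values only)
prices = [100.0 * 1.01 ** t for t in range(6)]

# Raw price changes grow over time...
raw_steps = [b - a for a, b in zip(prices, prices[1:])]

# ...while log-price changes are constant: each equals log(1.01)
log_prices = [math.log(p) for p in prices]
log_steps = [b - a for a, b in zip(log_prices, log_prices[1:])]

print([round(s, 4) for s in raw_steps])
print([round(s, 6) for s in log_steps])
```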

The Trading Strategy: StateSpaceTrendVolatility

Let’s dive into the StateSpaceTrendVolatility class, the engine of our trading logic, built using Backtrader.

Python

import backtrader as bt
import numpy as np
import statsmodels.api as sm
import pandas as pd
from datetime import datetime
import warnings

class StateSpaceTrendVolatility(bt.Strategy):
    params = (
        ('model_update_period', 60),    # Default: How often to re-fit the model (bars)
        ('lookback_period', 365),       # Default: Data window for model fitting (bars)
        ('trend_slope_threshold_buy', 0.0005), # Default: Min positive slope for buy
        ('trend_slope_threshold_sell', -0.0005),# Default: Max negative slope for sell
    )

    def __init__(self):
        self.btc_close = self.datas[0].close
        self.order = None
        self.model_fit_day = -self.p.model_update_period # Ensure model fits early
        self.trend_level = None 
        self.trend_slope = None 
        self.log_price_at_fit = None # For debugging model fit

    def log(self, txt, dt=None):
        dt = dt or self.datas[0].datetime.date(0)
        print(f'{dt.isoformat()} {txt}')

    # notify_order and notify_trade methods for logging (standard Backtrader)
    def notify_order(self, order):
        if order.status in [order.Submitted, order.Accepted]:
            return
        if order.status in [order.Completed]:
            if order.isbuy():
                self.log(f'BUY EXECUTED, Price: {order.executed.price:.2f}, Cost: {order.executed.value:.2f}, Comm: {order.executed.comm:.2f}')
            elif order.issell():
                self.log(f'SELL EXECUTED, Price: {order.executed.price:.2f}, Cost: {order.executed.value:.2f}, Comm: {order.executed.comm:.2f}')
            # self.bar_executed = len(self) # Optional: track bar of execution
        elif order.status in [order.Canceled, order.Margin, order.Rejected]:
            self.log('Order Canceled/Margin/Rejected')
        self.order = None

    def notify_trade(self, trade):
        if not trade.isclosed:
            return
        self.log(f'OPERATION PROFIT, GROSS {trade.pnl:.2f}, NET {trade.pnlcomm:.2f}')

    # ... (fit_state_space_model and next methods discussed below) ...

Strategy Parameters:

model_update_period: Defines how often (in trading bars) the state-space model is re-estimated using fresh data.
lookback_period: Specifies the window of past data (in bars) used for each model fitting.
trend_slope_threshold_buy: The minimum positive slope value that triggers a buy signal.
trend_slope_threshold_sell: The (negative) slope value below which a sell/short signal is triggered.

These parameters are crucial for tuning the strategy’s responsiveness and sensitivity.
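Because the model is fitted on log prices, the slope is a per-bar change in log price, so the thresholds can be read as approximate growth rates. The helper below is a small sketch of that conversion; the function name and the 365-bars-per-year figure (daily bars on a market that trades every day) are assumptions for illustration, not part of the strategy.

```python
import math

def slope_to_growth(log_slope, bars_per_year=365):
    """Convert a per-bar log-price slope into simple growth rates.

    bars_per_year=365 assumes daily bars on a market that trades every day.
    """
    per_bar = math.exp(log_slope) - 1.0
    per_year = math.exp(log_slope * bars_per_year) - 1.0
    return per_bar, per_year

for threshold in (0.0005, 0.001):
    per_bar, per_year = slope_to_growth(threshold)
    print(f"slope {threshold}: {per_bar:.4%} per bar, {per_year:.1%} per year if sustained")
```

Read this way, the default buy threshold of 0.0005 requires the estimated trend to be rising at roughly 0.05% per bar, or about 20% a year if that pace were sustained.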

Fitting the State-Space Model (fit_state_space_model)

This method is where the statistical heavy lifting occurs. It’s called periodically as defined by model_update_period.

Python

    def fit_state_space_model(self):
        current_bar_index = len(self)
        if current_bar_index < self.p.lookback_period:
            self.log(f"Not enough data to fit model yet. Have {current_bar_index}, need {self.p.lookback_period}.")
            return False

        # Collect historical close prices for the lookback period
        dates = [bt.num2date(self.datas[0].datetime[-i]) for i in range(self.p.lookback_period - 1, -1, -1)]
        closes = [self.datas[0].close[-i] for i in range(self.p.lookback_period - 1, -1, -1)]
        
        ts_data = pd.Series(closes, index=pd.to_datetime(dates)).dropna()
        
        # Validate before logging: indexing an empty series would raise IndexError
        if ts_data.empty or (ts_data <= 0).any():
            self.log("ERROR: Data for model is empty or contains non-positive values.")
            return False
        self.log(f"Fitting SSM: Data from {ts_data.index[0].date()} to {ts_data.index[-1].date()}, {len(ts_data)} points.")

        try:
            log_ts_data = np.log(ts_data)
            self.log_price_at_fit = log_ts_data.iloc[-1] 

            if log_ts_data.isnull().any() or np.isinf(log_ts_data).any():
                self.log("ERROR: log_ts_data contains NaN or Inf values!")
                return False

            # Define the Unobserved Components model: Local Linear Trend.
            # The 'lltrend' specification already includes an irregular (noise)
            # component, so no separate irregular=True argument is needed.
            model = sm.tsa.UnobservedComponents(
                log_ts_data,
                level='lltrend',
            )
            
            with warnings.catch_warnings(): # Suppress common fitting warnings
                warnings.simplefilter("ignore")
                result = model.fit(method='lbfgs', disp=False, maxiter=500) 

            self.log(f"DEBUG: Model converged: {result.mle_retvals['converged']}")
            self.log(f"DEBUG: Model log-likelihood: {result.llf:.4f}")
            
            # statsmodels stores filtered_state with shape (k_states, nobs):
            # rows are states, columns are time points.
            filtered_state = result.filter_results.filtered_state

            if filtered_state.shape[0] >= 2: # Expecting at least level and slope
                self.trend_level = filtered_state[0, -1]  # Last log-level
                self.trend_slope = filtered_state[1, -1]  # Last log-slope
                self.log(f"Model Fit: Actual Last Log-Price: {self.log_price_at_fit:.4f}, Estimated Log-Trend Level: {self.trend_level:.4f}, Slope: {self.trend_slope:.4f}")
                
                # Sanity check for model fit
                if np.isclose(self.trend_level, 0.0, atol=1e-3) and not np.isclose(self.log_price_at_fit, 0.0, atol=1e-3):
                    self.log("WARNING: Estimated log-trend level is suspiciously close to zero. Check model fit quality.")
                
                if np.isnan(self.trend_level) or np.isnan(self.trend_slope):
                    self.log("ERROR: Trend level or slope is NaN after model fit.")
                    return False
                return True
            else:
                self.log(f"ERROR: Filtered state unexpected shape: {filtered_state.shape}")
                return False
        except Exception as e:
            self.log(f"ERROR fitting state-space model: {e}")
            import traceback
            self.log(traceback.format_exc())
            return False

Key steps in fit_state_space_model:

Data Collection: Gathers lookback_period worth of closing prices from the Backtrader data feed.
Preprocessing: Converts to a Pandas Series, drops NaNs, and checks for non-positive values before log transformation.
Log Transformation: Applies np.log() to the price data.
Model Definition: Initializes sm.tsa.UnobservedComponents with level='lltrend'.
Model Fitting: Uses model.fit() with the L-BFGS optimizer (method='lbfgs') to estimate the model parameters by maximum likelihood; the unobserved states are then obtained from the Kalman filter. Warnings common during optimization are suppressed for cleaner output.
State Extraction: Retrieves filtered_state (estimates based on data up to the current point). statsmodels stores this array with one row per state and one column per time point, so for an 'lltrend' model the latest level is the last column of row 0 and the latest slope is the last column of row 1.
Debugging & Sanity Checks: Logs the convergence status and log-likelihood, prints the estimated log-trend level next to the actual last log price used in fitting, and warns when the estimated level is close to zero while the log price is not. This helps flag obviously poor model fits.
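To make the state-extraction step concrete, here is a minimal NumPy sketch of a Kalman filter for a local linear trend with fixed, hand-picked variances. It is not a reimplementation of statsmodels, which additionally estimates the variances by maximum likelihood; the function name, the variance values, and the synthetic data are all assumptions for illustration. The output deliberately uses the same (k_states, nobs) layout as filtered_state, so row 0 is the level and row 1 is the slope.

```python
import numpy as np

def local_linear_trend_filter(y, var_obs, var_level, var_slope):
    """Kalman-filter a series under a local linear trend model.

    Returns an array of shape (2, len(y)): row 0 = level, row 1 = slope.
    The variances are fixed inputs here (hypothetical values, not estimated).
    """
    T = np.array([[1.0, 1.0], [0.0, 1.0]])  # level += slope; slope persists
    Z = np.array([1.0, 0.0])                # we observe the level plus noise
    Q = np.diag([var_level, var_slope])     # state disturbance covariance
    a = np.array([y[0], 0.0])               # rough initial state guess
    P = np.eye(2) * 1e6                     # large prior variance (vague prior)
    filtered = np.empty((2, len(y)))
    for t, obs in enumerate(y):
        # Update step: correct the prediction with the new observation
        F = Z @ P @ Z + var_obs             # innovation variance (scalar)
        K = (P @ Z) / F                     # Kalman gain, shape (2,)
        a = a + K * (obs - Z @ a)
        P = P - np.outer(K, Z @ P)
        filtered[:, t] = a
        # Prediction step: roll the state forward one bar
        a = T @ a
        P = T @ P @ T.T + Q
    return filtered

# Synthetic log-price series: known slope of 0.002 per bar plus small noise
rng = np.random.default_rng(0)
y = 10.0 + 0.002 * np.arange(365) + rng.normal(0.0, 0.01, 365)

states = local_linear_trend_filter(y, var_obs=1e-4, var_level=1e-8, var_slope=1e-10)
last_level, last_slope = states[0, -1], states[1, -1]  # last column = latest bar
print(f"level={last_level:.4f}, slope={last_slope:.5f}")
```

The indexing on the second-to-last line is the part that carries over to the strategy: the latest estimates live in the last column, not the last row.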

Generating Signals (next method)

The next() method is called for each new bar of data. It decides whether to refit the model and then checks for trading signals.

Python

    def next(self):
        # Periodically re-fit the model
        if len(self) >= self.model_fit_day + self.p.model_update_period and \
           len(self) >= self.p.lookback_period:
            if self.fit_state_space_model():
                self.model_fit_day = len(self) # Update last fit day
            else:
                self.log("Model fitting failed. Holding off on trading.")
                # Consider closing open positions if model becomes unreliable
                return 

        if self.order: # An order is pending, do nothing
            return

        if self.trend_level is None or self.trend_slope is None:
            # Model not ready or failed, do nothing
            return 

        current_position_size = self.getposition().size

        # Buy Signal Logic
        if self.trend_slope > self.p.trend_slope_threshold_buy:
            if current_position_size == 0: # Not in market
                self.log(f'BUY CREATE @ {self.btc_close[0]:.2f}, Slope: {self.trend_slope:.5f}, Est.LogLvl: {self.trend_level:.4f}')
                self.order = self.buy()
            elif current_position_size < 0: # Currently short
                self.log(f'COVER SHORT & CONSIDER BUY @ {self.btc_close[0]:.2f}, Slope: {self.trend_slope:.5f}')
                self.order = self.close() # Close short position
                                          # Buy will be re-evaluated on next bar if signal persists

        # Sell Signal Logic
        elif self.trend_slope < self.p.trend_slope_threshold_sell:
            if current_position_size == 0: # Not in market
                self.log(f'SELL CREATE (SHORT) @ {self.btc_close[0]:.2f}, Slope: {self.trend_slope:.5f}, Est.LogLvl: {self.trend_level:.4f}')
                self.order = self.sell()
            elif current_position_size > 0: # Currently long
                self.log(f'LIQUIDATE LONG & CONSIDER SHORT @ {self.btc_close[0]:.2f}, Slope: {self.trend_slope:.5f}')
                self.order = self.close() # Close long position
                                          # Short will be re-evaluated on next bar
        # else: # Optional: Neutral zone - close positions if slope flattens
            # if current_position_size != 0:
            #     self.order = self.close()

Signal logic:

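If trend_slope exceeds trend_slope_threshold_buy and the strategy is flat, a buy order is placed. If it is currently short, the short position is closed first, and the buy is re-evaluated on the next bar.
If trend_slope falls below trend_slope_threshold_sell and the strategy is flat, a sell (short) order is placed. If it is currently long, the long position is closed first, and the short is re-evaluated on the next bar.
Between the two thresholds nothing happens: an existing position is held. Note also that trend_slope only changes when the model is refitted, so the same signal applies to every bar until the next refit.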
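The branching in next() can be isolated into a small pure function, which makes the rules easy to unit-test outside Backtrader. This is a sketch: the function name and the string labels it returns are hypothetical, and the thresholds default to the class defaults shown earlier.

```python
def decide_action(slope, position_size, buy_threshold=0.0005, sell_threshold=-0.0005):
    """Mirror the branch structure of next(): return 'buy', 'sell', 'close' or 'hold'."""
    if slope > buy_threshold:
        if position_size == 0:
            return "buy"      # flat: open a long
        if position_size < 0:
            return "close"    # short: cover first, re-evaluate on the next bar
    elif slope < sell_threshold:
        if position_size == 0:
            return "sell"     # flat: open a short
        if position_size > 0:
            return "close"    # long: liquidate first, re-evaluate on the next bar
    return "hold"             # neutral zone, or already positioned with the trend

for slope, pos in [(0.001, 0), (0.001, -1), (0.001, 1), (-0.001, 1), (0.0, 1)]:
    print(slope, pos, decide_action(slope, pos))
```

The last case is worth noticing: a long position is kept while the slope sits in the neutral zone, because the optional else branch that would close it is commented out in the strategy.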

Setting Up the Backtest

The if __name__ == '__main__': block orchestrates the backtest:

Python

# Assumes bt, pd and StateSpaceTrendVolatility from the listings above are in scope
import yfinance as yf
from curl_cffi import requests # For robust yfinance downloads
from sys import exit # For clean exit on data errors
import matplotlib.pyplot as plt
# %matplotlib qt5 # For interactive plots in IPython/Spyder, run this in your console

session = requests.Session(impersonate="chrome") # Mimic browser

if __name__ == '__main__':
    cerebro = bt.Cerebro()

    # --- Data Feed Section ---
    try:
        data_df = yf.download('BTC-USD', period='3y', interval='1d', session=session)
    except Exception as e_yf_sess: # Fallback if session download fails
        print(f"Failed to download data using yfinance with custom session: {e_yf_sess}")
        print("Attempting yfinance download without custom session...")
        try:
            data_df = yf.download('BTC-USD', period='3y', interval='1d')
        except Exception as e_yf:
            print(f"Failed to download data using standard yfinance: {e_yf}")
            exit() # Exit if data can't be fetched

    if data_df.empty:
        print("Could not download BTC-USD data. Exiting.")
        exit()

    print("DEBUG: Columns after yf.download:", data_df.columns)
    if isinstance(data_df.columns, pd.MultiIndex):
        # For a single-ticker download, the second level holds one repeated value
        # ('' in some yfinance versions, the ticker symbol in others; this depends
        # on the installed version), so the field names in level 0 are enough.
        if data_df.columns.get_level_values(1).nunique() == 1:
            print("DEBUG: Flattening MultiIndex columns to level 0.")
            data_df.columns = data_df.columns.get_level_values(0)
        else: # Other MultiIndex structures: inspect and flatten manually
            print(f"DEBUG: Unexpected MultiIndex structure, check manually. Current: {data_df.columns}")
            # If the structure is ('Ticker', 'Field'), droplevel(0) might work:
            # data_df.columns = data_df.columns.droplevel(0) # Example
    
    rename_map = { # Standardize to lowercase for Backtrader
        'Open': 'open', 'High': 'high', 'Low': 'low', 'Close': 'close',
        'Adj Close': 'adjclose', 'Volume': 'volume'
    }
    data_df.rename(columns=rename_map, inplace=True)
    for col in ['open', 'high', 'low', 'close', 'volume']: # Check essential columns
        if col not in data_df.columns:
            raise ValueError(f"Missing required column '{col}'. Columns are: {data_df.columns}")
    data_df['openinterest'] = 0 # Required by Backtrader
    data_df.index = pd.to_datetime(data_df.index)
    
    data = bt.feeds.PandasData(dataname=data_df)
    cerebro.adddata(data)

    # Add Strategy with tuned parameters (example values)
    cerebro.addstrategy(StateSpaceTrendVolatility,
                        model_update_period=90,  # Refit every ~3 months
                        lookback_period=365,     # Use 1 year of data for fitting
                        trend_slope_threshold_buy=0.001, # Stricter buy threshold
                        trend_slope_threshold_sell=-0.001)# Stricter sell threshold

    # Broker, Sizer, Analyzers
    cerebro.broker.setcash(100000.0)
    cerebro.broker.setcommission(commission=0.001) # 0.1%
    cerebro.addsizer(bt.sizers.PercentSizer, percents=90) # Use 90% of cash

    cerebro.addanalyzer(bt.analyzers.SharpeRatio, _name='sharpe_ratio', timeframe=bt.TimeFrame.Days, annualize=True, riskfreerate=0.0)
    cerebro.addanalyzer(bt.analyzers.AnnualReturn, _name='annual_return')
    cerebro.addanalyzer(bt.analyzers.DrawDown, _name='drawdown')
    cerebro.addanalyzer(bt.analyzers.TradeAnalyzer, _name='trade_analyzer')

    print('Starting Portfolio Value: %.2f' % cerebro.broker.getvalue())
    results = cerebro.run() # Optimization can be run with cerebro.optstrategy
    print('Final Portfolio Value: %.2f' % cerebro.broker.getvalue())

    # Print Analysis (with safer dictionary access)
    strat = results[0] # Assuming single strategy run
    print("\n--- Strategy Analysis ---")
    # ... (detailed printout for the other analyzers omitted) ...
    # (Example for Sharpe)
    sharpe_analysis = strat.analyzers.sharpe_ratio.get_analysis()
    sharpe = sharpe_analysis.get('sharperatio')
    # The value can be None (e.g. too few returns), so guard the numeric format
    print(f"Sharpe Ratio: {sharpe:.2f}" if sharpe is not None else "Sharpe Ratio: N/A")


    # Plotting
    plt.rcParams['figure.figsize'] = [12, 7] # Adjust figure size
    # For Spyder/IPython with Qt5 backend, ensure '%matplotlib qt5' was run in console
    cerebro.plot(iplot=False, style='candlestick') 
    plt.tight_layout()
    # plt.show() # May be needed depending on environment

Data Handling Highlights:

curl_cffi with Session(impersonate="chrome"): This makes yfinance downloads more robust by mimicking a web browser, which can help bypass potential blocks from Yahoo Finance.
Robust Column Handling: The code checks for and attempts to flatten MultiIndex columns returned by yfinance, a common source of issues. It then renames columns to lowercase as Backtrader expects and adds the mandatory openinterest column.
Strategy Parameter Overrides: Notice how parameters like model_update_period=90, lookback_period=365, etc., are passed when adding the strategy. This allows for easy tuning without modifying the class defaults.
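It helps to see how often the model is actually refitted with these overrides. The sketch below replays the refit condition from next() on bar counts alone; the function name is hypothetical, and the figure of 1,095 bars is an assumption (roughly three years of daily data), not a number taken from the download.

```python
def refit_bars(total_bars, lookback_period, model_update_period):
    """Return the bar counts (values of len(self)) at which next() would refit,
    assuming every fit succeeds."""
    last_fit = -model_update_period   # same initial value as model_fit_day
    fits = []
    for bar in range(1, total_bars + 1):
        if bar >= last_fit + model_update_period and bar >= lookback_period:
            fits.append(bar)
            last_fit = bar
    return fits

schedule = refit_bars(total_bars=1095, lookback_period=365, model_update_period=90)
print(len(schedule), schedule)
```

Under these assumptions the first fit happens at bar 365 and the slope is re-estimated only nine times in total, so the strategy can change its view of the trend at most nine times over the backtest.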

Interpreting Results & Plotting:

The script includes standard Backtrader analyzers to evaluate performance: Sharpe Ratio, Annual Return, Drawdown, and Trade Analyzer. cerebro.plot(iplot=False) will use your Matplotlib backend (like Qt5 if you’ve set %matplotlib qt5 in an interactive console) to display the results, including price, trades, and portfolio value.

Stochastic Volatility Aspect

Strategies of this type are often described as involving "stochastic volatility." This implementation has no dedicated volatility model (such as a GARCH-type component for the observation error's variance) inside UnobservedComponents. What the level='lltrend' specification does provide is a stochastic level and a stochastic slope, whose disturbance variances are estimated by the model. These variances are fixed within each fit and change only when the model is re-estimated on a new window, so shifts in market volatility are reflected only indirectly, through the re-estimated parameters. A full GARCH-in-SSM is a more advanced topic, often requiring custom state-space model definitions.

Sample Backtest Output

--- Strategy Analysis ---
Sharpe Ratio: 0.99
Annual Returns:
  2022: 0.00%
  2023: 42.05%
  2024: 108.99%
  2025: 0.77%
Max Drawdown: 26.29%
Max Money Drawdown: 87883.43
Total Trades: 5
Winning Trades: 3
Losing Trades: 1
Win Rate: 60.00%
Average Winning Trade: 66385.05
Average Losing Trade: -1016.25
Profit Factor: 195.97
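As a consistency check, the win rate and profit factor can be recomputed from the other figures in this output. The sketch assumes the profit factor is gross profit divided by gross loss and that the win rate is taken over all five trades; since three wins and one loss account for only four trades, the fifth was presumably still open when the backtest ended.

```python
# Figures copied from the analysis output above
total_trades = 5
winning_trades, losing_trades = 3, 1
avg_win, avg_loss = 66385.05, -1016.25

gross_profit = winning_trades * avg_win
gross_loss = abs(losing_trades * avg_loss)

win_rate = winning_trades / total_trades     # assumption: denominator is all trades
profit_factor = gross_profit / gross_loss    # assumption: gross profit / gross loss

print(f"Win Rate: {win_rate:.2%}")
print(f"Profit Factor: {profit_factor:.2f}")
```

Both values agree with the reported 60.00% and 195.97. With a single losing trade in the sample, the profit factor says little about how the strategy would behave over a larger set of trades.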

Key Considerations and Path Forward

Model Health is Paramount: The success of this strategy hinges on the state-space model accurately estimating the trend level and slope. The debugging logs (Actual Last Log-Price vs. Estimated Log-Trend Level) are vital. If the model fit is poor, the signals will be unreliable.
Parameter Tuning:
- lookback_period: Shorter periods make the model more reactive but potentially noisier. Longer periods offer more stability but slower adaptation.
- model_update_period: Balances adaptiveness with computational cost.
- trend_slope_threshold_buy/sell: These are highly sensitive. Small changes can drastically alter the number of trades and performance. Systematic optimization (e.g., using cerebro.optstrategy) is recommended, followed by out-of-sample validation to avoid overfitting.
Risk Management: The current strategy lacks explicit stop-losses. Adding risk management rules is crucial for any real-world application.
Computational Cost: Fitting state-space models can be computationally intensive, especially with long lookback periods or frequent updates.
Overfitting: Be extremely cautious when interpreting backtest results, especially after optimization. What works on historical data may not work in the future. Walk-forward optimization and testing on truly unseen data are essential.
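One way to set up the walk-forward testing mentioned above is to generate rolling train/test windows over the bar index and optimize only on each training slice. The generator below is a minimal sketch; its name, the window lengths, and the non-overlapping test windows are illustrative choices rather than a prescribed scheme.

```python
def walk_forward_splits(n_bars, train_size, test_size):
    """Yield (train_start, train_end, test_end) index triples.

    Train on bars [train_start, train_end), test on [train_end, test_end).
    Windows roll forward by test_size so test segments never overlap.
    """
    start = 0
    while start + train_size + test_size <= n_bars:
        train_end = start + train_size
        yield start, train_end, train_end + test_size
        start += test_size

# Example: roughly 3 years of daily bars, 1 year to tune, 90 bars to validate
splits = list(walk_forward_splits(n_bars=1095, train_size=365, test_size=90))
for train_start, train_end, test_end in splits:
    print(f"train [{train_start}, {train_end})  test [{train_end}, {test_end})")
```

Each test window would then be run with the parameters chosen on its own training window, and only the stitched-together test results treated as an estimate of out-of-sample performance.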

Conclusion

This state-space modeling approach offers a sophisticated way to define and trade market trends. By decomposing prices into unobserved components, we aim to get a clearer picture of the underlying market direction. Backtrader provides the environment to rigorously test such ideas. However, success depends on a well-fitting statistical model, careful parameter tuning, robust data handling, and sound risk management principles. This framework serves as a solid starting point for further exploration and refinement in your algorithmic trading journey.