A rigorous approach to measuring and managing financial risk hinges on understanding how expected returns, volatility, and the co‑movement of assets interact—and how to combine assets into portfolios that optimize these trade‑offs.
Python Setup & Data Fetching
First, we’ll set up our Python environment and fetch some stock data using yfinance.
import yfinance as yf
import pandas as pd
import numpy as np

# Define the tickers and the time period for historical data
tickers = ['AAPL', 'MSFT', 'GOOG']
market_index = '^GSPC'  # S&P 500
start_date = '2020-01-01'
end_date = '2023-12-31'

# Download historical stock data for 'Adj Close' prices
# 'Adj Close' accounts for dividends and stock splits
try:
    data = yf.download(tickers, start=start_date, end=end_date, auto_adjust=False)['Adj Close']
    market_data = yf.download(market_index, start=start_date, end=end_date, auto_adjust=False)['Close']
except Exception as e:
    print(f"Error downloading data: {e}")
    data = pd.DataFrame()  # Create empty dataframe to avoid further errors
    market_data = pd.Series(dtype='float64')  # Create empty series

# Display the first few rows of the data
print("Stock Data:")
print(data.head())
print("\nMarket Data (S&P 500):")
print(market_data.head())
Stock Data:
Ticker AAPL GOOG MSFT
Date
2020-01-02 72.716072 68.046196 153.323242
2020-01-03 72.009117 67.712280 151.414124
2020-01-06 72.582916 69.381874 151.805511
2020-01-07 72.241524 69.338585 150.421371
2020-01-08 73.403648 69.884995 152.817352
Market Data (S&P 500):
Ticker ^GSPC
Date
2020-01-02 3257.850098
2020-01-03 3234.850098
2020-01-06 3246.280029
2020-01-07 3237.179932
2020-01-08 3253.050049
1.1 Portfolio Return
If you invest fractions \(w_i\) of your wealth in \(n\) assets whose individual returns are \(r_i\), the portfolio return over a period is
\[r_p = \sum_{i=1}^n w_i\,r_i,\]
and its expectation is
\[E[r_p] = \sum_{i=1}^n w_i\,E[r_i],\]
with \(\sum_i w_i = 1\).
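As a quick numeric check of the weighted-sum formula, the following sketch computes an expected portfolio return. The weights and expected returns are made-up illustrative values, not estimates from the downloaded data.

```python
import numpy as np

# Hypothetical weights and expected returns for three assets (illustrative only)
w = np.array([0.5, 0.3, 0.2])
expected_r = np.array([0.08, 0.12, 0.05])

# The weights must sum to one
assert np.isclose(w.sum(), 1.0)

# E[r_p] = sum_i w_i * E[r_i]
expected_rp = float(w @ expected_r)
print(f"Expected portfolio return: {expected_rp:.4f}")
```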
1.2 Portfolio Volatility
Risk is measured by the standard deviation of returns. For asset \(i\),
\[\sigma_i^2 = E\bigl[(r_i - E[r_i])^2\bigr], \quad \sigma_i = \sqrt{\sigma_i^2}.\]
Portfolio variance blends individual variances and covariances:
\[\sigma_p^2 = \sum_{i=1}^n\sum_{j=1}^n w_i\,w_j\,\mathrm{Cov}(r_i,r_j) = \sum_{i,j} w_i\,w_j\,\sigma_{ij},\]
where \(\sigma_{ij}=\mathrm{Cov}(r_i,r_j)\) and \(\sigma_{ii}=\sigma_i^2\).
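The double sum is a quadratic form, \(w^\top\Sigma\,w\). The sketch below evaluates it for two assets and compares it with the term-by-term expansion. All inputs are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical two-asset inputs (illustrative only)
w = np.array([0.6, 0.4])
sigma = np.array([0.20, 0.30])  # individual standard deviations
cov_12 = 0.018                  # covariance between the two assets

Sigma = np.array([[sigma[0]**2, cov_12],
                  [cov_12, sigma[1]**2]])

# Double sum written as a quadratic form: w' Sigma w
var_matrix = float(w @ Sigma @ w)

# The same quantity expanded term by term
var_expanded = (w[0]**2 * sigma[0]**2 + w[1]**2 * sigma[1]**2
                + 2 * w[0] * w[1] * cov_12)

print(f"Portfolio variance:   {var_matrix:.6f}")
print(f"Portfolio volatility: {np.sqrt(var_matrix):.4f}")
```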
1.3 Correlation
\[\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i\,\sigma_j}, \quad -1\le \rho_{ij}\le 1.\]
Less-than-perfect correlations (\(\rho_{ij}<1\)) enable diversification: with non-negative weights, they pull \(\sigma_p\) below the weighted average of the individual volatilities.
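To make the diversification effect concrete, here is a minimal sketch that recomputes a two-asset portfolio's volatility for several correlation values. The helper `two_asset_volatility` and its inputs are hypothetical, introduced only for this illustration.

```python
import numpy as np

def two_asset_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weights (w1, 1 - w1)."""
    w2 = 1.0 - w1
    variance = ((w1 * sigma1)**2 + (w2 * sigma2)**2
                + 2 * w1 * w2 * rho * sigma1 * sigma2)
    return float(np.sqrt(variance))

# Hypothetical inputs: equal weights, both assets with 20% volatility
for rho in (1.0, 0.5, 0.0, -0.5):
    vol = two_asset_volatility(0.5, 0.20, 0.20, rho)
    print(f"rho = {rho:+.1f} -> portfolio volatility = {vol:.4f}")
```

Only at \(\rho = 1\) does the portfolio volatility equal the 20% of its components; every lower correlation gives a smaller figure.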
# Calculate daily returns
asset_returns = data.pct_change().dropna()
market_returns = market_data.pct_change().dropna()

print("Asset Daily Returns (head):")
print(asset_returns.head())
print("\nMarket Daily Returns (head):")
print(market_returns.head())

# Equal weights (1/n per asset); these reproduce the portfolio figures printed below
weights = np.full(len(tickers), 1 / len(tickers))

# Calculate mean daily returns for each asset (proxy for E[r_i])
mean_daily_returns = asset_returns.mean()
print("\nMean Daily Returns:")
print(mean_daily_returns)

# Calculate expected portfolio return (daily)
expected_portfolio_return_daily = np.sum(mean_daily_returns * weights)
print(f"\nExpected Daily Portfolio Return: {expected_portfolio_return_daily:.6f}")

# Annualize the expected portfolio return (assuming 252 trading days)
annual_trading_days = 252
expected_portfolio_return_annual = expected_portfolio_return_daily * annual_trading_days
print(f"Expected Annual Portfolio Return: {expected_portfolio_return_annual:.4f}")

# Calculate daily covariance matrix of asset returns
cov_matrix_daily = asset_returns.cov()
print("\nDaily Covariance Matrix of Asset Returns:")
print(cov_matrix_daily)

# Calculate portfolio variance (daily): w' Sigma w
portfolio_variance_daily = np.dot(weights.T, np.dot(cov_matrix_daily, weights))
print(f"\nDaily Portfolio Variance: {portfolio_variance_daily:.8f}")

# Calculate portfolio volatility (standard deviation - daily)
portfolio_volatility_daily = np.sqrt(portfolio_variance_daily)
print(f"Daily Portfolio Volatility: {portfolio_volatility_daily:.6f}")

# Annualize portfolio volatility (volatility scales with the square root of time)
portfolio_volatility_annual = portfolio_volatility_daily * np.sqrt(annual_trading_days)
print(f"Annual Portfolio Volatility: {portfolio_volatility_annual:.4f}")

# Calculate daily correlation matrix
correlation_matrix = asset_returns.corr()
print("\nDaily Correlation Matrix of Asset Returns:")
print(correlation_matrix)
Asset Daily Returns (head):
Ticker AAPL GOOG MSFT
Date
2020-01-03 -0.009722 -0.004907 -0.012452
2020-01-06 0.007968 0.024657 0.002585
2020-01-07 -0.004703 -0.000624 -0.009118
2020-01-08 0.016087 0.007880 0.015928
2020-01-09 0.021241 0.011044 0.012493
Market Daily Returns (head):
Ticker ^GSPC
Date
2020-01-03 -0.007060
2020-01-06 0.003533
2020-01-07 -0.002803
2020-01-08 0.004902
2020-01-09 0.006655
Mean Daily Returns:
Ticker
AAPL 0.001187
GOOG 0.000942
MSFT 0.001095
dtype: float64
Expected Daily Portfolio Return: 0.001075
Expected Annual Portfolio Return: 0.2708
Daily Covariance Matrix of Asset Returns:
Ticker AAPL GOOG MSFT
Ticker
AAPL 0.000447 0.000307 0.000338
GOOG 0.000307 0.000444 0.000333
MSFT 0.000338 0.000333 0.000422
Daily Portfolio Variance: 0.00036328
Daily Portfolio Volatility: 0.019060
Annual Portfolio Volatility: 0.3026
Daily Correlation Matrix of Asset Returns:
Ticker AAPL GOOG MSFT
Ticker
AAPL 1.000000 0.689177 0.777003
GOOG 0.689177 1.000000 0.769150
MSFT 0.777003 0.769150 1.000000
To calibrate models to observed return series, Maximum Likelihood Estimation (MLE) finds the parameter values \(\theta\) that maximize the likelihood of the observed data. For independent observations \(x_1,\dots,x_n\) with density \(f(x;\theta)\):
- Likelihood: \[L(\theta) = \prod_{i=1}^n f(x_i;\theta).\]
- Log‑Likelihood: \[\ell(\theta) = \sum_{i=1}^n \ln f(x_i;\theta),\] which is maximized by solving \(\dfrac{\partial \ell}{\partial \theta_j}=0\) for each parameter \(\theta_j\).
Example (Normal Distribution):
If \(x_i\sim\mathcal{N}(\mu,\sigma^2)\),
\[\ell(\mu,\sigma) = -\tfrac{n}{2}\ln(2\pi)-n\ln\sigma -\frac{1}{2\sigma^2}\sum_i(x_i-\mu)^2.\]
First‑order conditions yield \(\hat\mu=\tfrac{1}{n}\sum_i x_i\) and \(\hat\sigma=\sqrt{\tfrac{1}{n}\sum_i(x_i-\hat\mu)^2}\).
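One way to sanity-check the closed-form estimates is to confirm that both partial derivatives of the log-likelihood vanish at \((\hat\mu,\hat\sigma)\). The sketch below does this on a synthetic normal sample; the sample parameters and the helper `log_likelihood` are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.001, scale=0.02, size=1000)  # synthetic "returns"

def log_likelihood(mu, sigma, x):
    """Normal log-likelihood of the sample x."""
    n = len(x)
    return (-0.5 * n * np.log(2 * np.pi) - n * np.log(sigma)
            - np.sum((x - mu)**2) / (2 * sigma**2))

# Closed-form MLE: sample mean, and standard deviation dividing by n (not n - 1)
mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat)**2))

# Partial derivatives of the log-likelihood evaluated at the estimates
score_mu = np.sum(x - mu_hat) / sigma_hat**2
score_sigma = -len(x) / sigma_hat + np.sum((x - mu_hat)**2) / sigma_hat**3

print(f"mu_hat = {mu_hat:.6f}, sigma_hat = {sigma_hat:.6f}")
print(f"d l / d mu = {score_mu:.2e}, d l / d sigma = {score_sigma:.2e}")
```

Both derivatives should be zero up to floating-point error, and moving either parameter away from its estimate should lower the log-likelihood.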
# For demonstration, let's calculate sample mean and std for AAPL's daily returns
# These are the MLE estimates if we assume returns are normally distributed.
aapl_returns = asset_returns['AAPL']
mu_hat_aapl = aapl_returns.mean()  # MLE for mu
sigma_hat_aapl = aapl_returns.std(ddof=0)  # MLE for sigma (ddof=0 divides by n, the population std)

print(f"\nMLE estimate for AAPL daily return mean (mu_hat): {mu_hat_aapl:.6f}")
print(f"MLE estimate for AAPL daily return std (sigma_hat): {sigma_hat_aapl:.6f}")
MLE estimate for AAPL daily return mean (mu_hat): 0.001187
MLE estimate for AAPL daily return std (sigma_hat): 0.021135
3.1 Normality Assumption
Many models assume returns are normally distributed, \(r\sim\mathcal{N}(\mu,\sigma^2)\), so that about 68% of outcomes lie within one \(\sigma\) of \(\mu\) and about 95% lie within two \(\sigma\).
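These percentages follow from the normal distribution function: \(P(|r-\mu| < k\sigma) = \mathrm{erf}(k/\sqrt{2})\). A minimal check using only the standard library (the helper name `prob_within` is hypothetical):

```python
import math

def prob_within(k):
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
```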
3.2 Efficient Frontier
Harry Markowitz showed that, when portfolios are plotted in \((\sigma_p,E[r_p])\) space, the upper boundary of the attainable set (the efficient frontier) consists of the portfolios offering the highest expected return for each level of risk.
To find the minimum‑variance portfolio delivering a target expected return \(\mu_P\):
\[\begin{aligned} &\min_{w}\;\tfrac12\,w^\top\Sigma\,w\\ &\text{s.t.}\quad w^\top\mathbf{1}=1,\quad w^\top r=\mu_P, \end{aligned}\]
where \(r\) is the vector of expected asset returns and \(\Sigma\) their covariance matrix. Introducing Lagrange multipliers leads to explicit weights
\[w^* = \lambda_1\,\Sigma^{-1}r\;+\;\lambda_2\,\Sigma^{-1}\mathbf{1},\]
with
\[\begin{aligned} A &= \mathbf{1}^\top\Sigma^{-1}r,\quad B = r^\top\Sigma^{-1}r,\quad C = \mathbf{1}^\top\Sigma^{-1}\mathbf{1},\\ \Delta &= BC - A^2, \end{aligned}\]
\[\lambda_1=\frac{C\mu_P - A}{\Delta},\qquad \lambda_2=\frac{B - A\mu_P}{\Delta}.\]
Plotting \(\sigma_p^2\) against \(\mu_P\) yields a parabola (a hyperbola in \((\sigma_p,\mu_P)\) space); its upper branch, where \(\mu_P \ge A/C\), is the efficient frontier.
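Substituting \(w^*\) into \(w^\top\Sigma\,w\) and using both constraints gives \(\sigma_p^2 = \lambda_1\mu_P + \lambda_2 = (C\mu_P^2 - 2A\mu_P + B)/\Delta\). The sketch below checks the weights and this variance formula numerically. The expected returns, covariance matrix, and target return are hypothetical, and `frontier_weights` is a helper introduced only for this example.

```python
import numpy as np

# Hypothetical expected returns and covariance matrix (illustrative only)
r = np.array([0.08, 0.12, 0.10])
Sigma = np.array([[0.040, 0.006, 0.010],
                  [0.006, 0.090, 0.012],
                  [0.010, 0.012, 0.060]])
ones = np.ones(len(r))

# Solve linear systems rather than forming the explicit inverse
Sigma_inv_r = np.linalg.solve(Sigma, r)
Sigma_inv_1 = np.linalg.solve(Sigma, ones)

A = ones @ Sigma_inv_r
B = r @ Sigma_inv_r
C = ones @ Sigma_inv_1
Delta = B * C - A**2

def frontier_weights(mu_P):
    """Minimum-variance weights for target expected return mu_P."""
    lambda1 = (C * mu_P - A) / Delta
    lambda2 = (B - A * mu_P) / Delta
    return lambda1 * Sigma_inv_r + lambda2 * Sigma_inv_1

mu_P = 0.10
w_star = frontier_weights(mu_P)

var_quadratic = w_star @ Sigma @ w_star
var_closed_form = (C * mu_P**2 - 2 * A * mu_P + B) / Delta

print("weights:", np.round(w_star, 4))
print(f"sum of weights = {w_star.sum():.6f}, portfolio return = {w_star @ r:.6f}")
print(f"variance (w' Sigma w) = {var_quadratic:.6f}, closed form = {var_closed_form:.6f}")
```

Using `np.linalg.solve` here is a design choice for numerical stability; an explicit inverse gives the same result on well-conditioned inputs.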
import matplotlib.pyplot as plt

# Ensure mean_daily_returns and cov_matrix_daily are available and not empty
if 'mean_daily_returns' in locals() and not mean_daily_returns.empty and \
   'cov_matrix_daily' in locals() and not cov_matrix_daily.empty and \
   len(mean_daily_returns) == cov_matrix_daily.shape[0] and \
   len(tickers) == len(mean_daily_returns):  # Check consistency

    r = mean_daily_returns.values    # Vector of mean daily returns
    Sigma = cov_matrix_daily.values  # Covariance matrix of daily returns
    num_assets_frontier = len(r)
    ones = np.ones(num_assets_frontier)

    try:
        Sigma_inv = np.linalg.inv(Sigma)  # Inverse of the covariance matrix

        # Calculate A, B, C, and Delta
        A = ones.T @ Sigma_inv @ r
        B = r.T @ Sigma_inv @ r
        C = ones.T @ Sigma_inv @ ones
        Delta = B * C - A**2

        if np.isclose(Delta, 0.0):  # Tolerance-based check instead of exact float equality
            print("Delta is zero. Cannot calculate efficient frontier using this method (assets might be perfectly correlated or other degeneracy).")
        else:
            # Determine a range of target expected portfolio returns (mu_P) for plotting
            # We'll go from the Global Minimum Variance Portfolio return upwards
            mu_gmvp_daily = A / C  # Expected return of the Global Minimum Variance Portfolio (daily)

            # Target returns from the GMVP up to a bit beyond the max individual asset return
            min_target_return_daily = mu_gmvp_daily
            max_target_return_daily = np.max(r) * 1.5

            # If the upper bound is not above the lower bound (e.g. max(r) * 1.5 < A / C),
            # move it slightly above the lower bound
            if max_target_return_daily <= min_target_return_daily:
                max_target_return_daily = min_target_return_daily * (1.05 if min_target_return_daily > 0 else 0.95)
                if max_target_return_daily == min_target_return_daily:  # mu_gmvp_daily is zero
                    max_target_return_daily = 0.001  # Small positive value

            target_returns_daily = np.linspace(min_target_return_daily, max_target_return_daily, 100)

            portfolio_volatilities_daily = []
            portfolio_returns_daily_plot = []  # Store actual mu_P used for plotting
            all_weights = []

            for mu_P_daily in target_returns_daily:
                lambda1 = (C * mu_P_daily - A) / Delta
                lambda2 = (B - A * mu_P_daily) / Delta

                w_star = lambda1 * (Sigma_inv @ r) + lambda2 * (Sigma_inv @ ones)

                # Portfolio variance for this mu_P_daily.
                # Equivalent to w_star.T @ Sigma @ w_star.
                var_p_daily = (C * (mu_P_daily**2) - 2 * A * mu_P_daily + B) / Delta

                # Numerical precision can make the variance a tiny negative number; clamp at 0
                var_p_daily = max(var_p_daily, 0.0)
                std_p_daily = np.sqrt(var_p_daily)

                portfolio_volatilities_daily.append(std_p_daily)
                portfolio_returns_daily_plot.append(mu_P_daily)  # This is our target return
                all_weights.append(w_star)

            # Convert to numpy arrays for easier annualization
            portfolio_volatilities_daily = np.array(portfolio_volatilities_daily)
            portfolio_returns_daily_plot = np.array(portfolio_returns_daily_plot)

            # Annualize for plotting
            portfolio_returns_annual_plot = portfolio_returns_daily_plot * annual_trading_days
            portfolio_volatilities_annual = portfolio_volatilities_daily * np.sqrt(annual_trading_days)

            # Plotting the Efficient Frontier
            plt.figure(figsize=(10, 6))
            plt.plot(portfolio_volatilities_annual, portfolio_returns_annual_plot, 'b-', lw=2, label='Efficient Frontier')

            # Plot individual assets
            asset_volatilities_annual = np.sqrt(np.diag(Sigma)) * np.sqrt(annual_trading_days)
            asset_returns_annual = r * annual_trading_days
            plt.scatter(asset_volatilities_annual, asset_returns_annual, marker='o', s=50, label='Individual Assets')
            # Label in the column order of the returns, which can differ from the order of `tickers`
            for i, ticker in enumerate(mean_daily_returns.index):
                plt.text(asset_volatilities_annual[i], asset_returns_annual[i], f' {ticker}', fontsize=9)

            # Plot Global Minimum Variance Portfolio (GMVP)
            sigma_gmvp_daily = np.sqrt(1 / C)
            mu_gmvp_annual = mu_gmvp_daily * annual_trading_days
            sigma_gmvp_annual = sigma_gmvp_daily * np.sqrt(annual_trading_days)
            plt.scatter(sigma_gmvp_annual, mu_gmvp_annual, marker='*', color='red', s=150, label='Global Minimum Variance Portfolio (GMVP)')
            plt.text(sigma_gmvp_annual, mu_gmvp_annual, ' GMVP', fontsize=9)

            plt.title('Efficient Frontier')
            plt.xlabel('Annualized Volatility (Standard Deviation)')
            plt.ylabel('Annualized Expected Return')
            plt.legend()
            plt.grid(True)
            plt.show()

            # You can also find the portfolio with the highest Sharpe Ratio (Tangency Portfolio)
            # This requires a risk-free rate and typically numerical optimization,
            # or solving for specific lambdas if a risk-free asset is introduced analytically.
            # For now, we'll just plot the frontier of risky assets.

    except np.linalg.LinAlgError:
        print("Covariance matrix is singular or not invertible. Cannot calculate efficient frontier.")
    except Exception as e:
        print(f"An error occurred during efficient frontier calculation: {e}")
elif not ('mean_daily_returns' in locals() and 'cov_matrix_daily' in locals()):
    print("Mean returns or covariance matrix not available. Skipping efficient frontier calculation.")
elif mean_daily_returns.empty or cov_matrix_daily.empty:
    print("Mean returns or covariance matrix is empty. Skipping efficient frontier calculation.")
else:
    print("Data inconsistency. Skipping efficient frontier calculation. Check dimensions of returns and covariance matrix vs tickers.")
The market price of risk \(\lambda\) is the extra expected return per unit of risk:
\[\lambda = \frac{E[r] - r_f}{\sigma},\]
where \(r_f\) is the risk‑free rate. In derivatives theory, no‑arbitrage arguments show that for any claim with drift \(\mu\) and volatility \(\sigma\):
\[\frac{\mu - r_f}{\sigma} = \lambda,\]
consistent across all claims on the same underlying.
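A small numeric illustration of this consistency condition: once \(\lambda\) is pinned down by one claim, the drift of any other claim on the same underlying follows from its volatility. The function `market_price_of_risk` and all inputs are hypothetical values for the sketch.

```python
def market_price_of_risk(expected_return, risk_free_rate, volatility):
    """Excess expected return per unit of volatility."""
    if volatility <= 0:
        raise ValueError("volatility must be positive")
    return (expected_return - risk_free_rate) / volatility

# Hypothetical annualized inputs for a first claim
r_f = 0.03
lam = market_price_of_risk(expected_return=0.10, risk_free_rate=r_f, volatility=0.20)

# A second claim on the same underlying with twice the volatility
# must earn the drift implied by the same lambda
sigma_2 = 0.40
mu_2 = r_f + lam * sigma_2

print(f"lambda = {lam:.2f}")
print(f"implied drift of second claim = {mu_2:.2f}")
```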
print("\nDefining Risk-Free Rate and Market Parameters...")
daily_risk_free_rate_placeholder = 0.0  # Placeholder for general context
annual_risk_free_rate_placeholder = np.nan
annual_market_return_placeholder = np.nan
market_premium_placeholder = np.nan
market_mean_daily_for_premium = np.nan

if 'annual_trading_days' in locals() and 'market_returns' in locals() and not market_returns.empty:
    # Annualize by compounding the daily rate
    annual_risk_free_rate_placeholder = (1 + daily_risk_free_rate_placeholder)**annual_trading_days - 1

    # market_returns may be a DataFrame (one column per index) or a Series
    if isinstance(market_returns, pd.DataFrame):
        if market_returns.shape[1] == 0:
            market_mean_daily_for_premium = np.nan
            print("Warning: market_returns DataFrame is empty. Market mean daily return is NaN.")
        elif market_returns.shape[1] == 1:
            market_mean_daily_for_premium = market_returns.iloc[:, 0].mean()
        else:
            market_mean_daily_for_premium = market_returns.iloc[:, 0].mean()
            print(f"Warning: market_returns DataFrame has {market_returns.shape[1]} columns. Used mean of first column: {market_returns.columns[0]}.")
    elif isinstance(market_returns, pd.Series):
        market_mean_daily_for_premium = market_returns.mean()
    else:
        market_mean_daily_for_premium = np.nan
        print("Error: market_returns is of an unexpected type. Market mean daily return set to NaN.")

    if pd.isna(market_mean_daily_for_premium):
        print("Warning: Market mean daily return is NaN. Cannot calculate annual market return accurately.")
    else:
        annual_market_return_placeholder = (1 + market_mean_daily_for_premium)**annual_trading_days - 1
        market_premium_placeholder = annual_market_return_placeholder - annual_risk_free_rate_placeholder

    print(f"Annualized Risk-Free Rate (placeholder): {annual_risk_free_rate_placeholder:.4f}")
    print(f"Annualized Expected Market Return (placeholder): {annual_market_return_placeholder:.4f}")
    print(f"Market Risk Premium (placeholder): {market_premium_placeholder:.4f}")
else:
    print("annual_trading_days or market_returns not available/empty. Skipping some parameter calculations.")
Defining Risk-Free Rate and Market Parameters...
Annualized Risk-Free Rate (placeholder): 0.0000
Annualized Expected Market Return (placeholder): 0.1300
Market Risk Premium (placeholder): 0.1300
6.1 Key Assumptions
- Single‑period horizon; investors care only about mean and variance
- Frictionless markets, with no taxes or transaction costs
- Assets infinitely divisible; unlimited borrowing and lending at \(r_f\)
- Homogeneous expectations
6.2 Security Market Line
Under the CAPM, an asset’s required return depends solely on its beta (\(\beta_i\)), its sensitivity to market moves:
\[\beta_i = \frac{\mathrm{Cov}(r_i,r_m)}{\mathrm{Var}(r_m)}, \qquad E[r_i] = r_f + \beta_i\bigl(E[r_m] - r_f\bigr).\]
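The covariance definition of beta coincides with the slope of an ordinary least-squares regression of asset returns on market returns. The sketch below checks this on synthetic data and then applies the SML formula. The simulated series, the "true" beta of 1.2, and the annual \(r_f\) and \(E[r_m]\) values are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily returns: the asset is built with a "true" beta of 1.2 (assumption)
true_beta = 1.2
r_m = rng.normal(0.0005, 0.01, size=2000)
r_i = 0.0001 + true_beta * r_m + rng.normal(0.0, 0.005, size=2000)

# beta_i = Cov(r_i, r_m) / Var(r_m), using the same ddof in both estimates
beta_hat = np.cov(r_i, r_m, ddof=1)[0, 1] / np.var(r_m, ddof=1)

# The OLS slope of r_i on r_m is the same number
ols_slope = np.polyfit(r_m, r_i, 1)[0]

# Security Market Line with hypothetical annual inputs
r_f = 0.02
expected_market = 0.08
required_return = r_f + beta_hat * (expected_market - r_f)

print(f"beta (cov/var) = {beta_hat:.4f}, beta (OLS slope) = {ols_slope:.4f}")
print(f"CAPM required return = {required_return:.4f}")
```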
# Calculate betas by regressing each asset on the market
import statsmodels.api as sm  # For beta calculation via regression

print("\nCalculating Betas via Regression...")
betas_regression = {}

if not asset_returns.empty and not market_returns.empty and tickers:
    for t in tickers:
        if t in asset_returns.columns:
            # Align asset and market returns on common dates
            df_regression = pd.concat([asset_returns[t], market_returns], axis=1).dropna()
            df_regression.columns = ['Asset', 'Market']
            if len(df_regression) > 1:
                X = sm.add_constant(df_regression['Market'])  # Intercept plus market return
                y = df_regression['Asset']
                try:
                    model = sm.OLS(y, X)
                    results = model.fit()
                    betas_regression[t] = results.params['Market']  # Slope coefficient is beta
                except Exception as e:
                    print(f"Could not calculate beta for {t} via regression: {e}")
                    betas_regression[t] = np.nan
            else:
                print(f"Not enough data points to calculate beta for {t} after alignment.")
                betas_regression[t] = np.nan
        else:
            print(f"Ticker {t} not found in asset_returns columns.")
            betas_regression[t] = np.nan
    print("Betas calculated via regression:", betas_regression)
else:
    print("Asset returns, market returns, or tickers list is empty. Skipping beta calculation.")
# Plot the Security Market Line (SML)
print("\nPlotting Security Market Line...")
if 'betas_regression' in locals() and betas_regression and \
   'market_returns' in locals() and not market_returns.empty and \
   'mean_daily_returns' in locals() and not mean_daily_returns.empty and \
   'tickers' in locals() and tickers and \
   'annual_trading_days' in locals():

    daily_risk_free_rate_sml = 0.0001  # Daily risk-free rate assumed for the SML plot
    annual_risk_free_rate_sml = (1 + daily_risk_free_rate_sml)**annual_trading_days - 1

    # market_returns may be a DataFrame (one column per index) or a Series
    market_mean_daily_sml = np.nan
    if isinstance(market_returns, pd.DataFrame):
        if market_returns.shape[1] == 0:
            market_mean_daily_sml = np.nan
            print("Warning: market_returns DataFrame is empty for SML. Market mean daily return is NaN.")
        elif market_returns.shape[1] == 1:
            market_mean_daily_sml = market_returns.iloc[:, 0].mean()
        else:
            market_mean_daily_sml = market_returns.iloc[:, 0].mean()
            print(f"Warning: market_returns DataFrame has {market_returns.shape[1]} columns for SML. Used mean of first column: {market_returns.columns[0]}.")
    elif isinstance(market_returns, pd.Series):
        market_mean_daily_sml = market_returns.mean()
    else:
        market_mean_daily_sml = np.nan
        print("Error: market_returns is of an unexpected type for SML. Market mean daily return set to NaN.")

    if pd.isna(market_mean_daily_sml):
        print("Market mean daily return is NaN for SML. Cannot plot SML accurately.")
    else:
        annual_market_return_sml = (1 + market_mean_daily_sml)**annual_trading_days - 1
        market_premium_sml = float(annual_market_return_sml - annual_risk_free_rate_sml)

        beta_values_sml = np.linspace(0, 2.5, 50)
        sml_expected_returns = annual_risk_free_rate_sml + beta_values_sml * market_premium_sml

        plt.figure(figsize=(10, 6))
        plt.plot(beta_values_sml, sml_expected_returns, 'r-', lw=2, label='Security Market Line (SML)')

        # Plot each asset at (beta, historical annualized return)
        plotted_sml_assets_count = 0
        for t in tickers:
            if t in betas_regression and not pd.isna(betas_regression[t]) and \
               t in mean_daily_returns.index and not pd.isna(mean_daily_returns[t]):
                asset_beta = betas_regression[t]
                asset_historical_annual_return = (1 + mean_daily_returns[t])**annual_trading_days - 1
                plt.scatter(asset_beta, asset_historical_annual_return, s=70, label=f'{t} (Historical)')
                plt.text(asset_beta * 1.02, asset_historical_annual_return * 1.02, t, fontsize=9)
                plotted_sml_assets_count += 1
            else:
                print(f"Skipping {t} on SML plot: Beta or historical return missing/NaN.")

        if plotted_sml_assets_count == 0 and len(tickers) > 0:
            print("Warning: No individual assets were plotted on the SML chart. Check beta/return data for all tickers.")

        # Reference points: risk-free asset (beta = 0) and market portfolio (beta = 1)
        plt.scatter(0, annual_risk_free_rate_sml, marker='s', s=100, color='blue', edgecolor='black', label='Risk-Free Asset ($R_f$)')
        plt.text(0 + 0.03, annual_risk_free_rate_sml, '$R_f$', fontsize=9)
        plt.scatter(1, annual_market_return_sml, marker='P', s=150, color='green', edgecolor='black', label='Market Portfolio ($\\beta=1$)')
        plt.text(1 * 1.02, annual_market_return_sml * 1.02, 'Market', fontsize=9)

        plt.title('Security Market Line (SML)')
        plt.xlabel('Beta ($\\beta$)')
        plt.ylabel('Annualized Expected Return ($E[R_i]$)')
        plt.legend()
        plt.grid(True)
        plt.axhline(0, color='black', lw=0.5)
        plt.axvline(0, color='black', lw=0.5)
        plt.show()
else:
    print("Prerequisite data for SML plot is missing or inconsistent. Skipping SML plot.")
Calculating Betas via Regression...
Betas calculated via regression: {'AAPL': np.float64(1.1896773359003732), 'MSFT': np.float64(1.1735182508973696), 'GOOG': np.float64(1.118838130001145)}
6.3 Beyond CAPM
Multi‑factor models (e.g., the Arbitrage Pricing Theory) express expected returns as a sum of exposures to several risk factors \(F_j\):
\[E[r_i] = r_f + \sum_{j=1}^m \beta_{i,j}\bigl(E[r_{F_j}] - r_f\bigr),\]
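with \(\beta_{i,j}=\mathrm{Cov}(r_i,r_{F_j})/\mathrm{Var}(r_{F_j})\) when the factors are mutually uncorrelated; for correlated factors, the \(\beta_{i,j}\) are the coefficients of a multiple regression of \(r_i\) on all factors.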
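In practice the factor exposures are usually estimated jointly by regressing an asset's excess returns on the factor excess returns. The sketch below does this with NumPy's least-squares solver on synthetic data. The two factors, the "true" betas, and the factor premia are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_obs = 3000

# Two synthetic factor excess-return series (hypothetical)
factors = rng.normal(0.0, 0.01, size=(n_obs, 2))
true_betas = np.array([0.9, 0.4])
asset_excess = factors @ true_betas + rng.normal(0.0, 0.005, size=n_obs)

# OLS with an intercept: asset_excess = alpha + factors @ beta + noise
X = np.column_stack([np.ones(n_obs), factors])
coef, *_ = np.linalg.lstsq(X, asset_excess, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1:]

# Expected return implied by hypothetical annual factor premia E[r_F] - r_f
r_f = 0.02
factor_premia = np.array([0.05, 0.02])
expected_return = r_f + beta_hat @ factor_premia

print("estimated betas:", np.round(beta_hat, 3))
print(f"model-implied expected return = {expected_return:.4f}")
```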
With these tools—expected returns, risk measures, parameter estimation, portfolio optimization, market pricing of risk, and asset‑pricing models—you have a complete framework to quantify and manage the trade‑off between risk and return.