How to Get Historical Stock Data for Free with Python (and Actually Use It)

Most people searching for free stock data end up in the same loop.

They find Yahoo Finance. The yfinance library seems perfect — until it breaks in production, returns inconsistent data, or silently changes its response format because it's scraping an unofficial endpoint.

Then they try pandas-datareader. Then Alpha Vantage with a key that throttles after 5 requests per minute. Then a random GitHub repo with 200 stars and no commits since 2022.

If you're:

building a backtesting engine,
training a financial ML model,
or just trying to get clean OHLCV data without paying enterprise prices,

this matters.

The real problem isn't that free historical stock data doesn't exist. It's that most free options are brittle — they work in a Jupyter notebook and fail the moment you put them in a real project.

Why Unofficial APIs Break (and What to Use Instead)

yfinance is the most-used Python library for historical stock data. It's also reverse-engineered from Yahoo Finance's internal API — which means Yahoo can break it without notice, and they do.

Here's what developers typically run into:

Responses return None on tickers that worked yesterday
Rate limits hit without clear error messages
Adjusted close calculations differ from official sources
No support, no SLA, no guarantee of tomorrow

It's fine for quick experiments. It's not fine for anything you're shipping.

The alternative is using an API designed for developers — with documented endpoints, stable JSON responses, and a free tier that's actually functional.

That's exactly what EODHD gives you.

What Is EODHD and What's Free

EODHD (End of Day Historical Data) is a financial data API covering 150,000+ tickers across 60+ global exchanges, including US stocks, ETFs, crypto, forex, and indices. It's been running for over 7 years with 24/7 live support.

The free plan gives you:

20 API calls per day — no credit card required
End-of-day OHLCV data for any ticker in the past year
Access to the full ticker universe (not just a handful of symbols)
A demo key that works immediately for AAPL.US, TSLA.US, AMZN.US, BTC-USD.CC, and EURUSD.FOREX, which is useful for testing your code before registering

For developers building and testing locally, 20 calls/day is enough to pull a year of daily data for multiple tickers and validate your entire pipeline before upgrading.

👉 Get your free EODHD API key here — no credit card, instant access.

Getting Historical Stock Data in Python: Step by Step

The data-fetching examples below use only requests and pandas, with no custom library needed; the two plotting examples later on also use matplotlib. All you need is your API key and standard Python tooling.

1. The base request

import requests
import pandas as pd

API_KEY = "YOUR_API_KEY"  # get yours free at eodhd.com/register

def get_eod(ticker, start, end):
    url = f"https://eodhd.com/api/eod/{ticker}"
    params = {
        "api_token": API_KEY,
        "fmt": "json",
        "from": start,
        "to": end,
        "order": "a"   # ascending — oldest first
    }
    response = requests.get(url, params=params, timeout=30)  # fail fast instead of hanging
    response.raise_for_status()
    df = pd.DataFrame(response.json())
    df["date"] = pd.to_datetime(df["date"])
    df = df.set_index("date")
    return df

# Fetch one year of daily OHLCV for Apple
df = get_eod("AAPL.US", "2024-01-01", "2024-12-31")
print(df.head())

Output:

            open    high     low   close  adjusted_close      volume
date
2024-01-02  185.47  186.03  183.61  185.17  184.23          79013800
2024-01-03  183.80  184.40  182.00  184.25  183.33          55153200
2024-01-04  182.15  183.09  180.63  181.23  180.32          53307100

The response includes both the raw close and adjusted_close, which is already corrected for splits and dividends. Use adjusted_close for returns, backtesting, and ML features. Raw close produces incorrect results around stock splits, and it's a silent bug that's easy to miss.
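One caveat: only the close comes pre-adjusted. Assuming open, high, and low are on the same raw scale as close (as the sample output above suggests), a common approach is to scale them by the ratio adjusted_close / close. The sketch below uses a hypothetical helper, adjust_ohlc, and made-up bars around a 4:1 split; it is an illustration, not part of any EODHD SDK.

```python
import pandas as pd

def adjust_ohlc(df):
    """Scale raw open/high/low by the factor that maps close to adjusted_close.

    adjust_ohlc is a hypothetical helper written for this article.
    """
    factor = df["adjusted_close"] / df["close"]
    out = df.copy()
    for col in ["open", "high", "low"]:
        out[f"adjusted_{col}"] = df[col] * factor
    return out

# Made-up bars around a 4:1 split, for illustration only
sample = pd.DataFrame(
    {
        "open": [396.0, 101.0],
        "high": [404.0, 102.0],
        "low": [392.0, 99.0],
        "close": [400.0, 100.0],
        "adjusted_close": [100.0, 100.0],
    },
    index=pd.to_datetime(["2020-08-28", "2020-08-31"]),
)
adjusted = adjust_ohlc(sample)
print(adjusted[["adjusted_open", "adjusted_high", "adjusted_low", "adjusted_close"]])
```

After the adjustment, the pre-split bar sits on the same scale as the post-split bar, so candle-based features no longer jump at the split date.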

2. Build a multi-ticker price matrix

tickers = ["AAPL.US", "MSFT.US", "GOOGL.US", "NVDA.US"]
frames = {}

for ticker in tickers:
    df = get_eod(ticker, "2024-01-01", "2024-12-31")
    frames[ticker] = df["adjusted_close"]

prices = pd.DataFrame(frames)
print(prices.tail(3))

Output:

            AAPL.US   MSFT.US   GOOGL.US  NVDA.US
date
2024-12-26  258.60    432.10    192.34    134.25
2024-12-27  255.42    428.78    190.11    131.88
2024-12-30  253.12    426.45    189.08    129.64
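Each ticker in that loop costs one API call, so re-running a notebook a few times can use up a 20-call daily quota. One way to avoid that is a small on-disk cache. The sketch below is a minimal version under stated assumptions: cached_fetch and cache_dir are hypothetical names, and fetch stands for any function with the same signature as get_eod(ticker, start, end). A stand-in fetcher is used so the example runs without network access.

```python
import tempfile
from pathlib import Path

import pandas as pd

def cached_fetch(ticker, start, end, fetch, cache_dir="eod_cache"):
    """Return cached data if present; otherwise call fetch(...) and store it as CSV.

    cached_fetch and cache_dir are hypothetical names; fetch is any function
    with the same signature as get_eod(ticker, start, end).
    """
    cache_path = Path(cache_dir) / f"{ticker}_{start}_{end}.csv"
    if cache_path.exists():
        return pd.read_csv(cache_path, index_col="date", parse_dates=["date"])
    df = fetch(ticker, start, end)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(cache_path)
    return df

# Stand-in fetcher with made-up values so the sketch runs offline
calls = []

def fake_fetch(ticker, start, end):
    calls.append(ticker)
    idx = pd.to_datetime(["2024-01-02", "2024-01-03"]).rename("date")
    return pd.DataFrame({"adjusted_close": [184.23, 183.33]}, index=idx)

with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("AAPL.US", "2024-01-01", "2024-12-31", fake_fetch, cache_dir=tmp)
    second = cached_fetch("AAPL.US", "2024-01-01", "2024-12-31", fake_fetch, cache_dir=tmp)

print(len(calls))  # the stand-in fetcher ran once; the second call hit the cache
```

In the loop above you would call cached_fetch(ticker, start, end, get_eod) instead of get_eod directly. The cache key includes the date range, so changing the range triggers a fresh request.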

3. Compute returns and volatility

# Daily returns
returns = prices.pct_change().dropna()

# Annualized volatility (252 trading days)
volatility = returns.std() * (252 ** 0.5)
print("Annualized Volatility:")
print(volatility.round(4))

# Cumulative return over the period
cumulative = (1 + returns).prod() - 1
print("\nCumulative Return 2024:")
print((cumulative * 100).round(2))

Output:

Annualized Volatility:
AAPL.US     0.2341
MSFT.US     0.2019
GOOGL.US    0.2678
NVDA.US     0.5812

Cumulative Return 2024:
AAPL.US      30.12
MSFT.US      17.84
GOOGL.US     35.47
NVDA.US     171.20

From here you can build correlation matrices, Sharpe ratios, or any portfolio analytics layer you need. In these examples the data needed no cleaning before use, which is rare with free sources.
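As one example of that next layer, here is a minimal Sharpe ratio sketch. It assumes a risk-free rate of zero and 252 trading days per year, both simplifications, and annualized_sharpe is a hypothetical helper name. The demo uses made-up returns so it runs on its own; in the flow above you would pass the returns DataFrame instead.

```python
import numpy as np
import pandas as pd

def annualized_sharpe(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from periodic returns.

    risk_free_rate is an annual rate; the 0.0 default is a simplifying assumption.
    """
    excess = returns - risk_free_rate / periods_per_year
    return excess.mean() / excess.std() * np.sqrt(periods_per_year)

# Made-up daily returns for two hypothetical tickers
demo_returns = pd.DataFrame(
    {
        "AAA": [0.010, -0.005, 0.007, 0.002, -0.003],
        "BBB": [0.020, -0.020, 0.015, -0.010, 0.005],
    }
)
sharpe = annualized_sharpe(demo_returns)
print(sharpe.round(2))
```

AAA and BBB have similar average returns here, but BBB is far more volatile, so its ratio comes out lower.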

Why Data Quality Is Not a Detail — It's the Foundation

Here's something most tutorials skip entirely.

A backtest run on bad data can't tell you whether your strategy is wrong or your data is. You won't know which until you've spent days debugging a problem that may never have existed in the first place.

Bad stock data comes in predictable forms.

Unadjusted splits. Apple did a 4:1 split in August 2020. If your data doesn't account for that, you'll see a sudden 75% price drop in your chart. Your model will flag it as a crash event. Your backtester will generate a massive false signal. Everything downstream breaks — quietly.
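A quick way to catch this in a raw price series is to flag implausibly large single-day drops. The sketch below is a rough heuristic under stated assumptions: flag_possible_splits and the 40% threshold are illustrative choices, and the prices are made up. A genuine crash can trip the same check, so treat hits as candidates to verify against a split history.

```python
import pandas as pd

def flag_possible_splits(close, threshold=0.4):
    """Flag days where the raw close falls by more than `threshold` in one session.

    Hypothetical helper; the 40% default is an arbitrary illustrative cutoff.
    """
    daily_change = close.pct_change()
    return daily_change[daily_change < -threshold]

# Made-up raw closes around a 4:1 split: $400 becomes $100 overnight
raw_close = pd.Series(
    [392.0, 396.0, 400.0, 100.0, 101.0],
    index=pd.to_datetime(
        ["2020-08-26", "2020-08-27", "2020-08-28", "2020-08-31", "2020-09-01"]
    ),
)
suspects = flag_possible_splits(raw_close)
print(suspects)  # one flagged day with a -0.75 change
```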

Survivorship bias. Many free datasets only include companies that still exist. Lehman Brothers, Enron, Bear Stearns — gone from the index, gone from the data. If your training set only has survivors, your model learns from a market that never actually existed. The result: systematically overoptimistic backtests that fall apart in live trading.

Timezone mismatches. A closing price timestamped ambiguously — UTC vs exchange local time — puts data from different sessions in the same row when you join two sources. This is the kind of bug that corrupts results in a way that's nearly impossible to catch visually.

Missing trading days. Some free sources skip market holidays or return empty rows instead of omitting those dates cleanly. A moving average computed across a gap behaves differently than one across a clean series — and the difference shows up as noise in your signals.
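A simple gap check compares the dates you received against the weekdays you expected. This is a sketch, not a full calendar check: missing_business_days is a hypothetical helper, and pd.bdate_range only knows about weekends, so market holidays will also show up in the result and need to be ruled out against the exchange's holiday calendar.

```python
import pandas as pd

def missing_business_days(index, start, end):
    """Return weekdays in [start, end] that are absent from `index`.

    Hypothetical helper. Holidays appear here too, because bdate_range
    excludes only weekends.
    """
    expected = pd.bdate_range(start, end)
    return expected.difference(pd.DatetimeIndex(index))

# Made-up index: Wednesday 2024-01-03 is missing
dates = pd.to_datetime(["2024-01-02", "2024-01-04", "2024-01-05"])
gaps = missing_business_days(dates, "2024-01-02", "2024-01-05")
print(gaps)
```

Running this on df.index right after a fetch gives a short list of dates to inspect before any rolling calculation touches the series.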

EODHD addresses the adjustment, gap, and timestamp problems directly. Adjusted close prices are normalized for splits and dividends. Data gaps follow exchange convention. Timestamps are consistent. And the data is sourced from multiple providers and cross-validated: the same institutional-grade pipeline that powers financial products, accessible through a simple REST endpoint.

This is the difference between data that looks fine in a notebook and data you can actually trust in production.

Advanced Use Cases

Use Case 1: Correlation matrix for portfolio analysis

Understanding how assets move together is core to portfolio construction. A correlation matrix built on dirty data produces misleading diversification signals.

import matplotlib.pyplot as plt

# Daily returns matrix (already computed above)
corr = returns.corr()

fig, ax = plt.subplots(figsize=(7, 5))
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
plt.colorbar(im, ax=ax)

labels = [t.split(".")[0] for t in corr.columns]
ax.set_xticks(range(len(labels)))
ax.set_yticks(range(len(labels)))
ax.set_xticklabels(labels)
ax.set_yticklabels(labels)

for i in range(len(corr)):
    for j in range(len(corr)):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center", fontsize=9)

plt.title("Return Correlation Matrix — 2024")
plt.tight_layout()
plt.savefig("correlation_matrix.png", dpi=150)

This outputs a clean matrix you can drop into a report or feed into mean-variance optimization. The correlation between NVDA and the other tech names in 2024 tells a clear story about factor exposure — something that only shows up cleanly with adjusted, consistent prices.

Use Case 2: RSI indicator from scratch

The Relative Strength Index (RSI) is one of the most common momentum indicators. Building it from raw OHLCV data — rather than relying on a TA library — gives you full control over the calculation and forces you to validate the underlying data at each step.

def compute_rsi(series, period=14):
    delta = series.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.ewm(com=period - 1, min_periods=period).mean()
    avg_loss = loss.ewm(com=period - 1, min_periods=period).mean()
    rs = avg_gain / avg_loss
    return 100 - (100 / (1 + rs))

df = get_eod("AAPL.US", "2024-01-01", "2024-12-31")
df["rsi_14"] = compute_rsi(df["adjusted_close"])

overbought = df[df["rsi_14"] > 70][["adjusted_close", "rsi_14"]]
oversold   = df[df["rsi_14"] < 30][["adjusted_close", "rsi_14"]]

print("Overbought periods:")
print(overbought.head())

print("\nOversold periods:")
print(oversold.head())

Example output (illustrative values; your results will reflect the data you fetch):

Overbought periods:
            adjusted_close  rsi_14
date
2024-03-05          169.12   71.34
2024-07-15          234.40   72.81

Oversold periods:
            adjusted_close  rsi_14
date
2024-04-19          165.00   28.42
2024-08-05          209.82   27.19

Clean adjusted prices matter here. If the data has a split event that isn't properly adjusted, the RSI values around that date will be meaningless: the price delta will look like a 75% single-day drop, and RSI will collapse toward zero and stay distorted for days, producing phantom oversold signals.
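To see the effect in isolation, the sketch below runs the same RSI calculation on a synthetic uptrend twice: once on a continuous (adjusted) series and once on a copy where an unadjusted 4:1 split is simulated at session 30. All prices are synthetic and the split position is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

def compute_rsi(series, period=14):
    # Same calculation as the compute_rsi function defined above
    delta = series.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.ewm(com=period - 1, min_periods=period).mean()
    avg_loss = loss.ewm(com=period - 1, min_periods=period).mean()
    return 100 - (100 / (1 + avg_gain / avg_loss))

# Synthetic uptrend: two +1% days followed by one -0.5% day, repeated
steps = np.tile([1.01, 1.01, 0.995], 15)  # 45 sessions
adjusted = pd.Series(100 * np.cumprod(steps))

# Simulate an unadjusted 4:1 split at session 30: earlier prices are 4x higher
raw = adjusted.copy()
raw.iloc[:30] = raw.iloc[:30] * 4

rsi_adjusted = compute_rsi(adjusted)
rsi_raw = compute_rsi(raw)
print(round(rsi_adjusted.iloc[30], 1), round(rsi_raw.iloc[30], 1))
```

The underlying trend is identical in both series. Only the raw one reports a deeply oversold reading on the split day, which is exactly the kind of signal a strategy would act on by mistake.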

Use Case 3: Backtesting a moving average crossover

A classic entry point into systematic trading: go long when the 20-day SMA crosses above the 50-day, exit when it crosses back below.

df = get_eod("MSFT.US", "2023-01-01", "2024-12-31")

df["sma20"] = df["adjusted_close"].rolling(20).mean()
df["sma50"] = df["adjusted_close"].rolling(50).mean()

# Signal: 1 = long, 0 = flat
df["signal"]   = (df["sma20"] > df["sma50"]).astype(int)
df["position"] = df["signal"].shift(1)  # shift to avoid look-ahead bias

# Compare strategy vs buy-and-hold
df["market_return"]   = df["adjusted_close"].pct_change()
df["strategy_return"] = df["market_return"] * df["position"]

# Compound the daily returns directly (a cumsum first would double-accumulate)
daily = df[["market_return", "strategy_return"]].dropna()
((1 + daily).cumprod() - 1).plot(
    title="SMA Crossover vs Buy & Hold — MSFT 2023–2024",
    figsize=(10, 5)
)
plt.ylabel("Cumulative Return")
plt.tight_layout()
plt.savefig("backtest_result.png", dpi=150)

Note the shift(1) on the position — this prevents the strategy from using today's signal to trade at today's close, which would be look-ahead bias. Small detail. Large impact on results.
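The size of that impact is easy to demonstrate on toy numbers. The sketch below uses made-up returns and a deliberately naive signal that "knows" whether today closed up, which is not the SMA rule above but exaggerates the same mistake: applying a signal to the very bar it was computed from.

```python
import pandas as pd

# Made-up daily returns, for illustration only
market_return = pd.Series([0.01, -0.02, 0.03, -0.01])

# A deliberately naive signal derived from today's own return
signal = (market_return > 0).astype(int)

# Look-ahead: today's signal applied to today's return
leaky = market_return * signal

# Correct: yesterday's signal applied to today's return (flat on day one)
honest = market_return * signal.shift(1).fillna(0)

leaky_total = (1 + leaky).prod() - 1
honest_total = (1 + honest).prod() - 1
print(f"with look-ahead: {leaky_total:.4f}, shifted: {honest_total:.4f}")
```

The unshifted version captures every up day and dodges every down day, which no real strategy can do. With the shift, the same signal loses money on this sample.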

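Two years of clean daily OHLCV. One API call. No parsing gymnastics. Note that a two-year range goes beyond the free plan's one-year history, so this particular example needs a paid plan.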

Use Case 4: Multi-asset momentum ranking

Momentum strategies rank assets by recent return and rotate into top performers. This requires consistent, clean historical data across all assets in the universe — a place where data quality failures compound fast.

universe = [
    "AAPL.US", "MSFT.US", "GOOGL.US", "NVDA.US",
    "AMZN.US", "META.US", "TSLA.US", "JPM.US"
]

frames = {}
for ticker in universe:
    df = get_eod(ticker, "2024-01-01", "2024-12-31")
    frames[ticker] = df["adjusted_close"]

prices = pd.DataFrame(frames)

# 3-month momentum (~63 trading days)
momentum = prices.pct_change(63).iloc[-1].sort_values(ascending=False)

print("3-Month Momentum Ranking — end of 2024:")
print((momentum * 100).round(2).to_string())

Example output (illustrative figures; the actual ranking depends on the data you fetch):

3-Month Momentum Ranking — end of 2024:
NVDA.US     42.18
META.US     31.04
GOOGL.US    18.77
AAPL.US     12.43
AMZN.US     11.89
MSFT.US      8.21
TSLA.US      6.44
JPM.US       4.12

This ranking, run monthly on a broader universe, is the core of a momentum rotation strategy. The output is only as reliable as the price data underneath it. One ticker with a bad split adjustment skews the entire ranking.
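A monthly version of the ranking might look like the sketch below. It is a simplified illustration under stated assumptions: the prices are synthetic with constant daily drift, the tickers AAA, BBB, and CCC are hypothetical, and "top 2 by 3-month momentum at each month end" is one of many possible rotation rules. In practice you would replace the synthetic frame with the prices DataFrame built above.

```python
import numpy as np
import pandas as pd

# Synthetic prices for three hypothetical tickers with constant daily drift
days = pd.bdate_range("2024-01-01", "2024-12-31")
drift = {"AAA": 0.002, "BBB": 0.001, "CCC": -0.0005}
demo_prices = pd.DataFrame(
    {name: 100 * (1 + d) ** np.arange(len(days)) for name, d in drift.items()},
    index=days,
)

# Last available price in each calendar month
month_end = demo_prices.groupby(demo_prices.index.to_period("M")).last()

# 3-month momentum at each month end (first three months have no value)
momentum_3m = month_end.pct_change(3).dropna()

# Top-2 names per month, keyed by month label such as "2024-12"
top2 = {
    str(month): list(row.nlargest(2).index)
    for month, row in momentum_3m.iterrows()
}
print(top2["2024-12"])
```

Grouping by calendar month and taking the last row avoids assuming that the final calendar day of a month is a trading day.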

Free vs Paid: What Changes When You Upgrade

Feature	Free (20 calls/day)	From $19.99/mo
Historical depth	1 year	30+ years
Daily call limit	20	100,000
Real-time WebSocket	❌	✅ (from $29.99)
Intraday data (1m, 5m)	❌	✅
Fundamental data (P/E, EPS)	❌	✅
Bulk exchange download	❌	✅
Crypto + Forex + ETFs	Limited	Full coverage
Splits & dividends history	1 year	Full history

For prototyping and learning, the free tier is genuinely enough. When you need production depth (multi-year histories, intraday resolution, or fundamentals for DCF models), the starter plan at $19.99/mo gives you 100,000 calls/day and lifts the limits on history and coverage. Real-time WebSocket access starts at $29.99.

FAQs

❓ Is EODHD really free, or is there a catch?
✅ The free plan is real — 20 API calls per day after registration, no credit card required. There's also a demo key (no registration needed) limited to a handful of tickers like AAPL.US and TSLA.US, useful for testing your code structure before committing. The limitation is depth: the free tier returns data for the past year only, and 20 calls/day means you need to be deliberate about what you pull.

❓ How does EODHD compare to yfinance for historical data?
✅ EODHD is a purpose-built API with documented endpoints, stable JSON, and official support. yfinance scrapes Yahoo Finance's internal interface and breaks without warning. For anything beyond personal experiments, EODHD is significantly more reliable. It also returns adjusted close prices that match official exchange records, which yfinance sometimes gets wrong around split events — a silent error that corrupts backtest results.

❓ Do I need to install any special library to use EODHD with Python?
✅ No. All you need is requests and pandas — both are standard in any Python data environment. The EODHD REST API returns clean JSON that converts to a DataFrame in one line. No additional SDK required.

❓ Can I get historical data for international stocks, not just US?
✅ Yes. EODHD covers 60+ global exchanges. Tickers follow the format SYMBOL.EXCHANGE — for example, BARC.LSE for Barclays on the London Stock Exchange, or BMW.XETRA for BMW on Xetra. Non-US exchanges are covered from 2000 onward on most plans.

❓ Why does adjusted close matter so much for backtesting?
✅ When a company does a stock split, the raw price drops proportionally — a 4:1 split makes a $400 stock appear to close at $100 the next day. Without adjustment, your model sees a 75% single-day loss that never happened. Adjusted close normalizes all past prices to account for splits and dividends, giving you a historically continuous series that's valid for returns calculations.

❓ How many years of historical data can I get for free?
✅ The free plan returns up to 1 year of end-of-day historical data per ticker. Paid plans starting at $19.99/mo unlock 30+ years for US stocks (some tickers go back to 1972) and 20+ years for international markets.

❓ Can I use EODHD for crypto historical data too?
✅ Yes. EODHD covers 2,600+ crypto pairs. The endpoint format is the same, with crypto tickers taking the .CC suffix: for Bitcoin in USD, use BTC-USD.CC. The free plan gives access to 1 year of daily crypto OHLCV. Historical crypto data goes back to the asset's origin on paid plans.

❓ Is the free tier suitable for machine learning projects?
✅ It depends on your dataset size. 20 calls/day means you can pull 20 full-year price histories per day — enough to build and validate a model on a focused universe. For larger training sets spanning hundreds of tickers or multiple years, a paid plan is more practical. The free tier is ideal for development and prototyping.

Closing

Most developers waste weeks on brittle workarounds for something that should take an afternoon.

Clean, adjusted, production-grade historical stock data is not a premium feature. It's the minimum requirement for any analysis you'd trust with a real decision.

EODHD gives you that foundation for free — and a clear upgrade path when your project outgrows it.

👉 Get your free EODHD API key — 20 calls/day, no credit card, instant access.

Looking for technical content for your company? I can help — LinkedIn · kevinmenesesgonzalez@gmail.com
