Price Repair
The repair package detects and corrects common data quality issues in Yahoo Finance data.
Overview
Yahoo Finance data sometimes contains errors that can affect analysis:
| Error Type | Description | Example |
|---|---|---|
| 100x Errors | Price in wrong currency unit | $150 shown as $15000 (cents) |
| Stock Split Errors | Split adjustment not applied | Pre-split prices not adjusted |
| Dividend Errors | Dividend adjustment incorrect | Adj Close miscalculated |
| Capital Gains Double-Count | CG counted twice in Adj Close | ETF/MutualFund specific |
| Missing/Zero Values | Price gaps or zeros | Trading halts, data errors |
Basic Usage
Enable repair when fetching history:
import (
	"log"
	"github.com/wnjoon/go-yfinance/pkg/models"
	"github.com/wnjoon/go-yfinance/pkg/ticker"
)
t, err := ticker.New("AAPL")
if err != nil {
	log.Fatal(err)
}
defer t.Close()
// Enable all repairs
bars, err := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
})
if err != nil {
	log.Fatal(err)
}
Fine-Grained Control
Use RepairOptions for specific repairs:
bars, err := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixCapitalGains: true,  // ETF/MutualFund
		FixSplits:       true,  // Stock splits
		FixDividends:    false, // Disable dividend repair
		FixUnitMixups:   true,  // 100x errors
		FixZeroes:       true,  // Missing values
	},
})
Capital Gains Repair
For ETFs and Mutual Funds, Yahoo Finance sometimes double-counts capital gains in the Adjusted Close calculation.
The Problem
When an ETF distributes capital gains:
1. Yahoo adds Capital Gains to the Dividends column
2. Yahoo also adjusts Adj Close for Capital Gains separately
3. Result: Capital Gains counted twice, historical prices appear too low
Detection
The repair algorithm:
1. Calculates normal price volatility (excluding distribution days)
2. Compares the price drop against the dividend expectation and against (dividend + capital gains)
3. If more than 66% of events show double-counting, repairs all of them
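The majority rule in the last step can be sketched as follows. This is a minimal illustration, not the package's implementation: the `distEvent` type, both function names, and the "closer to the dividend alone" test are assumptions made for the example, based on the reading that the Dividends column already contains the capital gains amount.

```go
package main

import "math"

// distEvent is a hypothetical record for one capital gains distribution day.
type distEvent struct {
	Drop float64 // observed price drop on the ex-date, net of normal volatility
	Div  float64 // value from the Dividends column
	CG   float64 // value from the Capital Gains column
}

// looksDoubleCounted reports whether the drop is better explained by the
// dividend alone than by dividend plus capital gains (an assumed heuristic).
func looksDoubleCounted(e distEvent) bool {
	return math.Abs(e.Drop-e.Div) < math.Abs(e.Drop-(e.Div+e.CG))
}

// shouldRepairAll applies the majority rule: repair every event when the
// flagged fraction exceeds the threshold (0.66 in the text above).
func shouldRepairAll(events []distEvent, threshold float64) bool {
	if len(events) == 0 {
		return false
	}
	flagged := 0
	for _, e := range events {
		if looksDoubleCounted(e) {
			flagged++
		}
	}
	return float64(flagged)/float64(len(events)) > threshold
}
```

Deciding for all events at once, rather than event by event, keeps a single noisy ex-date from producing an inconsistent Adj Close series.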
Example
// VWILX is a mutual fund that distributes capital gains
t, err := ticker.New("VWILX")
if err != nil {
	log.Fatal(err)
}
defer t.Close()
bars, err := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixCapitalGains: true,
	},
})
if err != nil {
	log.Fatal(err)
}
// Check if any bars were repaired
for _, bar := range bars {
	if bar.Repaired {
		fmt.Printf("%s was repaired\n", bar.Date.Format("2006-01-02"))
	}
}
Dividend Repair
Fixes various dividend-related errors in Yahoo Finance data.
Error Types
- Missing Adjustment: Adj Close = Close (no dividend applied)
- Dividend 100x Too Big: Currency unit error (e.g., $ shown as cents)
- Dividend 100x Too Small: Currency unit error (e.g., cents shown as $)
- Phantom Dividend: Duplicate dividend within 7-17 days
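As an illustration of the phantom dividend case, the sketch below pairs up dividends of nearly equal size that fall 7 to 17 days apart. The `Dividend` type, the `day` helper, the function name, and the relative tolerance are all hypothetical; the package may match duplicates differently, and which entry of a pair is the phantom is left to the caller.

```go
package main

import (
	"math"
	"time"
)

// Dividend is a hypothetical record of one dividend payment.
type Dividend struct {
	Date   time.Time
	Amount float64
}

// day builds a UTC date; a small helper for examples.
func day(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

// phantomPairs returns index pairs (i, j), i < j, of dividends whose amounts
// differ by at most relTol (relative) and whose dates are 7 to 17 days apart.
// divs must be sorted by date, ascending.
func phantomPairs(divs []Dividend, relTol float64) [][2]int {
	var pairs [][2]int
	for i := range divs {
		for j := i + 1; j < len(divs); j++ {
			gap := divs[j].Date.Sub(divs[i].Date).Hours() / 24
			if gap > 17 {
				break // sorted input: later dividends are even further away
			}
			if gap < 7 {
				continue
			}
			if math.Abs(divs[j].Amount-divs[i].Amount) <= relTol*divs[i].Amount {
				pairs = append(pairs, [2]int{i, j})
			}
		}
	}
	return pairs
}
```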
Detection Algorithm
For each dividend:
1. Calculate expected price drop based on dividend yield
2. Compare with typical window volatility
3. Check Adj Close ratio before/after dividend
4. Identify anomalies exceeding normal volatility
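The Adj Close ratio check can be sketched with the standard dividend adjustment factor, 1 - dividend / previous close. The function names below are illustrative, and the sketch deliberately omits the volatility comparison: it only asks whether the observed change in the Adj Close / Close ratio across the ex-date is closer to "no adjustment" than to the expected factor.

```go
package main

import "math"

// expectedFactor is the standard dividend adjustment factor applied to prices
// before the ex-date: 1 - dividend / previous close.
func expectedFactor(div, prevClose float64) float64 {
	return 1 - div/prevClose
}

// adjustmentMissing compares how the AdjClose/Close ratio changes across the
// ex-date with the expected factor. It returns true when the observed change
// is closer to 1 (no adjustment) than to the expected factor.
func adjustmentMissing(prevClose, prevAdj, exClose, exAdj, div float64) bool {
	observed := (prevAdj / prevClose) / (exAdj / exClose)
	expected := expectedFactor(div, prevClose)
	return math.Abs(observed-1) < math.Abs(observed-expected)
}
```

For a $2 dividend on a $100 previous close the expected factor is 0.98, so a previous-day Adj Close that still equals 100 signals a missing adjustment.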
Example
// Fix dividend errors
bars, _ := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixDividends: true,
	},
})
Interval Restrictions
Dividend repair is only applied to 1d intervals. Weekly and monthly intervals are too volatile for reliable detection. For longer intervals, fetch daily data, repair, then resample.
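The resampling step is not shown in this document, so here is a minimal sketch of one way to do it after fetching repaired daily bars. The trimmed `Bar` type, the `day` helper, and `ResampleWeekly` are hypothetical and not part of the package; dividends, splits, and Adj Close are left out for brevity.

```go
package main

import "time"

// Bar is a trimmed, hypothetical stand-in for the library's Bar type.
type Bar struct {
	Date                   time.Time
	Open, High, Low, Close float64
	Volume                 int64
}

// day builds a UTC date; a small helper for examples.
func day(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

// ResampleWeekly groups daily bars (sorted by date, ascending) into ISO weeks.
// Each weekly bar takes the first open, the highest high, the lowest low,
// the last close, and the summed volume of its days.
func ResampleWeekly(daily []Bar) []Bar {
	var weekly []Bar
	lastYear, lastWeek := 0, 0
	for _, b := range daily {
		y, w := b.Date.ISOWeek()
		if len(weekly) == 0 || y != lastYear || w != lastWeek {
			weekly = append(weekly, b) // first day of a new week
			lastYear, lastWeek = y, w
			continue
		}
		cur := &weekly[len(weekly)-1]
		if b.High > cur.High {
			cur.High = b.High
		}
		if b.Low < cur.Low {
			cur.Low = b.Low
		}
		cur.Close = b.Close
		cur.Volume += b.Volume
	}
	return weekly
}
```

Each weekly bar keeps the date of its first trading day, so weeks that start on a holiday are dated by the first session that actually traded.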
Stock Split Repair
Detects when Yahoo fails to apply split adjustments to historical prices.
Detection Algorithm
- Identifies split events from data
- Uses IQR-based outlier detection for normal volatility
- Detects price changes matching split ratio
- Filters false positives using volume analysis
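The IQR step can be illustrated with a small helper that derives outlier bounds from a series of price changes. This is a sketch under assumptions: the function names are made up for the example, and the multiplier `k` is a parameter because the value the package uses is not stated here (1.5 is the conventional choice).

```go
package main

import "sort"

// quantile returns the q-th quantile (0 <= q <= 1) of an ascending-sorted,
// non-empty slice, using linear interpolation between neighbouring values.
func quantile(sorted []float64, q float64) float64 {
	pos := q * float64(len(sorted)-1)
	lo := int(pos)
	if lo+1 >= len(sorted) {
		return sorted[len(sorted)-1]
	}
	frac := pos - float64(lo)
	return sorted[lo] + frac*(sorted[lo+1]-sorted[lo])
}

// iqrBounds returns Q1 - k*IQR and Q3 + k*IQR for the given values.
// Values outside these bounds are candidates for "abnormal" moves.
// The input slice is not modified.
func iqrBounds(values []float64, k float64) (lower, upper float64) {
	if len(values) == 0 {
		return 0, 0
	}
	s := append([]float64(nil), values...)
	sort.Float64s(s)
	q1, q3 := quantile(s, 0.25), quantile(s, 0.75)
	iqr := q3 - q1
	return q1 - k*iqr, q3 + k*iqr
}
```

A price change that falls outside the bounds and is also close to the split ratio would then be a split-repair candidate.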
Example
// Check for split repair
bars, _ := t.History(models.HistoryParams{
	Period:   "max",
	Interval: "1wk",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixSplits: true,
	},
})
Unit Mixup Repair (100x Errors)
Yahoo Finance sometimes returns prices in wrong currency units (e.g., dollars vs cents, pounds vs pence).
Types of Errors
- Random 100x Errors: Sporadic errors scattered throughout data
- Unit Switch: Permanent change in currency unit at some date
Detection Algorithm
For random errors:
1. Apply 2D median filter (3x3 window) to OHLC prices
2. Calculate ratio of actual to median-filtered prices
3. Round ratio to nearest 20, check if ~100
4. Correct outliers by dividing/multiplying by 100
For unit switch:
1. Calculate daily percentage changes
2. Use IQR to estimate normal volatility
3. Detect changes ~100x larger than normal
4. Apply correction to all preceding bars
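The ratio test for random errors can be sketched as below. The function names are illustrative, and the median-filtered price is assumed to be computed elsewhere; only the "round to the nearest 20 and compare with 100" step and the correction are shown.

```go
package main

import "math"

// roundsToHundred rounds the ratio to the nearest multiple of 20 and checks
// whether the result is 100, which tolerates ratios from roughly 90 to 110.
func roundsToHundred(ratio float64) bool {
	return math.Round(ratio/20)*20 == 100
}

// fixUnitMixup compares a price with its median-filtered neighbourhood value
// and returns the corrected price plus a flag saying whether it was changed.
func fixUnitMixup(actual, median float64) (float64, bool) {
	if actual <= 0 || median <= 0 {
		return actual, false
	}
	switch {
	case roundsToHundred(actual / median):
		return actual / 100, true // price is ~100x too big
	case roundsToHundred(median / actual):
		return actual * 100, true // price is ~100x too small
	}
	return actual, false
}
```

Rounding to the nearest 20 rather than testing for exactly 100 leaves room for ordinary price movement between the bad bar and its neighbours.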
Example
// Fix 100x currency errors
bars, _ := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixUnitMixups: true,
	},
})
Currency-Specific Handling
- USD, GBP, EUR, etc.: 100x multiplier (dollars/cents)
- KWF (Kuwaiti Dinar): 1000x multiplier (dinar/fils)
Zero Value Repair
Fixes bars where prices are 0 or NaN but trading likely occurred.
Detection Criteria
A zero bar is flagged for repair if:
- Volume > 0 (trading occurred)
- Stock split event present
- Dividend payment present
- Continuous price movement around the gap
Repair Methods
- Interpolation: If both prev and next bars exist, average the boundary prices
- Forward Fill: Use previous bar's close if only prev available
- Backward Fill: Use next bar's open if only next available
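The choice between the three methods can be sketched as a single function. The trimmed `Bar` type and `fillPrice` are hypothetical, and taking the previous close and the next open as the "boundary prices" is an assumption for the example.

```go
package main

// Bar is a trimmed, hypothetical stand-in for the library's Bar type.
type Bar struct {
	Open, Close float64
}

// fillPrice picks a replacement price for a zero bar from its neighbours.
// A nil pointer means that neighbour does not exist. The boolean result is
// false when there is nothing to fill from.
func fillPrice(prev, next *Bar) (float64, bool) {
	switch {
	case prev != nil && next != nil:
		return (prev.Close + next.Open) / 2, true // interpolation
	case prev != nil:
		return prev.Close, true // forward fill
	case next != nil:
		return next.Open, true // backward fill
	}
	return 0, false
}
```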
Example
// Fix zero/missing values
bars, _ := t.History(models.HistoryParams{
	Period:   "1y",
	Interval: "1d",
	Repair:   true,
	RepairOptions: &models.RepairOptions{
		FixZeroes: true,
	},
})
Partial Zero Handling
When only some OHLC values are zero:
- Calculate mean of valid prices
- Fill zero values with the mean
- Ensure High >= all prices, Low <= all prices
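A minimal sketch of the partial-zero procedure follows. The function names are illustrative rather than the package's own, and NaN is treated the same as zero, in line with the section introduction.

```go
package main

import "math"

// missing reports whether a price is zero or NaN.
func missing(v float64) bool { return v == 0 || math.IsNaN(v) }

// fillPartial fills missing OHLC values with the mean of the valid ones, then
// widens High and Low so that High >= all prices and Low <= all prices.
// It returns open, high, low, close in that order.
func fillPartial(o, h, l, c float64) (float64, float64, float64, float64) {
	vals := [4]float64{o, h, l, c}
	sum, n := 0.0, 0
	for _, v := range vals {
		if !missing(v) {
			sum += v
			n++
		}
	}
	if n == 0 || n == 4 {
		return o, h, l, c // nothing to fill from, or nothing to fill
	}
	mean := sum / float64(n)
	for i, v := range vals {
		if missing(v) {
			vals[i] = mean
		}
	}
	hi, lo := vals[0], vals[0]
	for _, v := range vals[1:] {
		hi = math.Max(hi, v)
		lo = math.Min(lo, v)
	}
	return vals[0], hi, lo, vals[3]
}
```

With Open 10, a missing High, Low 9, and Close 11, the mean of the valid prices is 10; the High is first filled with 10 and then raised to 11 so that it is not below the Close.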
Data Fields
The Bar struct includes repair-related fields:
type Bar struct {
	Date         time.Time
	Open         float64
	High         float64
	Low          float64
	Close        float64
	AdjClose     float64
	Volume       int64
	Dividends    float64 // Dividend amount
	Splits       float64 // Split ratio
	CapitalGains float64 // Capital gains (ETF/MutualFund)
	Repaired     bool    // True if this bar was repaired
}
Repair Order
The repair system applies fixes in a specific order:
1. Dividend adjustments - Must come first
2. 100x unit errors - Price level corrections
3. Stock split errors - Split ratio corrections
4. Zero/missing values - Gap filling
5. Capital gains - Last (needs clean data)
This order ensures each repair has clean data to work with.
Performance Considerations
- Repair operations add processing overhead
- For large datasets, consider caching repaired data
- Repair is most useful for long-term historical data
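One way to cache repaired data is a small in-memory map keyed by the request parameters. Everything in this sketch is hypothetical (`RepairCache`, `cacheKey`, the trimmed `Bar` type); the `fetch` callback stands in for whatever call produces the repaired bars, such as a History request with Repair enabled.

```go
package main

import "sync"

// Bar is a trimmed, hypothetical stand-in for the library's Bar type.
type Bar struct {
	Close    float64
	Repaired bool
}

// cacheKey identifies one history request.
type cacheKey struct {
	Symbol, Period, Interval string
}

// RepairCache memoizes repaired bars so the repair work runs once per request.
type RepairCache struct {
	mu   sync.Mutex
	data map[cacheKey][]Bar
}

// NewRepairCache returns an empty cache.
func NewRepairCache() *RepairCache {
	return &RepairCache{data: make(map[cacheKey][]Bar)}
}

// Get returns cached bars for the key, calling fetch only on a cache miss.
// Failed fetches are not cached.
func (c *RepairCache) Get(symbol, period, interval string, fetch func() ([]Bar, error)) ([]Bar, error) {
	key := cacheKey{symbol, period, interval}
	c.mu.Lock()
	defer c.mu.Unlock()
	if bars, ok := c.data[key]; ok {
		return bars, nil
	}
	bars, err := fetch()
	if err != nil {
		return nil, err
	}
	c.data[key] = bars
	return bars, nil
}
```

The lock is held while `fetch` runs, which keeps the sketch simple but serializes concurrent misses; a production cache would likely also want expiry, since recent bars change.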
Comparison with Python yfinance
This implementation matches the behavior of Python yfinance's repair=True option, ensuring data consistency across languages.
See Also
- HistoryParams - History parameters including RepairOptions
- Statistics Package - Statistical functions for repair algorithms
- Repair Package - Price repair implementation details
- MIC Codes - Exchange MIC code mapping