repair
Package repair provides price data repair functionality for financial time series.
This package detects and corrects common data quality issues in Yahoo Finance data, including 100x currency errors, bad stock split adjustments, dividend double-counting, capital gains double-counting, and missing/zero values.
Overview
Yahoo Finance data sometimes contains errors that need to be repaired:
- 100x errors: Price appears in cents instead of dollars (or vice versa)
- Bad stock splits: Split adjustments not applied or applied incorrectly
- Bad dividends: Dividend adjustments not applied correctly
- Capital gains double-counting: For ETFs/MutualFunds, capital gains counted twice
- Zero/missing values: Prices showing as 0 or NaN
Usage
Create a Repairer and call Repair on your bar data:
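opts := repair.DefaultOptions() // all repairs enabled
opts.Interval = "1d"
opts.QuoteType = repair.QuoteTypeETF
repairer := repair.New(opts)
repairedBars, err := repairer.Repair(bars) // bars is a []models.Bar
if err != nil {
	// handle the error
}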
Repair Options
Individual repair functions can be enabled/disabled:
opts := repair.Options{
	FixUnitMixups:   true, // Fix 100x errors
	FixZeroes:       true, // Fix zero/missing values
	FixSplits:       true, // Fix stock split errors
	FixDividends:    true, // Fix dividend adjustment errors
	FixCapitalGains: true, // Fix capital gains double-counting
}
Capital Gains Repair (v1.1.0)
For ETFs and Mutual Funds, Yahoo Finance sometimes double-counts capital gains in the Adjusted Close calculation. This repair detects and corrects this issue:
// Only applies to ETF and MUTUALFUND quote types
opts.QuoteType = "ETF"
opts.FixCapitalGains = true
The algorithm compares the price drop on each distribution day with two expected drops, one based on the dividend alone and one based on the dividend plus capital gains, to detect double-counting.
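The comparison described above can be sketched roughly as follows. This is an illustration only, not the package's implementation: the `distribution` type, `looksDoubleCounted`, and `doubleCountRatio` are hypothetical names, and the decision rule (a drop closer to the dividend alone suggests the dividend figure already includes the capital gain) is an assumption about how the two expected drops are used.

```go
package main

import (
	"fmt"
	"math"
)

// distribution is a hypothetical record for one distribution day.
// It is not a type from the repair package.
type distribution struct {
	PrevClose    float64 // close on the day before the distribution
	Close        float64 // close on the distribution day
	Dividend     float64 // reported dividend
	CapitalGains float64 // reported capital gains
}

// looksDoubleCounted reports whether the observed price drop is closer to the
// dividend alone than to dividend + capital gains. ASSUMPTION: that outcome is
// treated here as a sign that the dividend already contains the capital gain.
func looksDoubleCounted(d distribution) bool {
	if d.CapitalGains <= 0 {
		return false
	}
	drop := d.PrevClose - d.Close
	diffDividendOnly := math.Abs(drop - d.Dividend)
	diffDividendPlusCG := math.Abs(drop - (d.Dividend + d.CapitalGains))
	return diffDividendOnly < diffDividendPlusCG
}

// doubleCountRatio returns the share of capital gains events that look
// double-counted, or 0 when there are no such events.
func doubleCountRatio(ds []distribution) float64 {
	total, flagged := 0, 0
	for _, d := range ds {
		if d.CapitalGains <= 0 {
			continue
		}
		total++
		if looksDoubleCounted(d) {
			flagged++
		}
	}
	if total == 0 {
		return 0
	}
	return float64(flagged) / float64(total)
}

func main() {
	events := []distribution{
		{PrevClose: 100, Close: 98, Dividend: 2, CapitalGains: 1.5}, // drop matches the dividend alone
		{PrevClose: 50, Close: 47, Dividend: 1, CapitalGains: 2},    // drop matches dividend + capital gains
	}
	fmt.Println(doubleCountRatio(events))
}
```

Real price drops also include ordinary market movement, so a production check would likely need a tolerance or a minimum share of flagged events before applying any correction.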
Stock Split Repair
Detects cases where Yahoo fails to apply stock split adjustments to historical data.
It uses IQR-based outlier detection to identify suspicious price changes that match the split ratio, then applies corrections.
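As a rough illustration of the idea, the sketch below flags day-to-day close ratios that fall outside the usual 1.5 x IQR fences and that also sit near the split ratio or its inverse. The helper names, the tolerance parameter, and the choice of close-to-close ratios are assumptions made for this example; the package's actual detection logic may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// quantile returns the p-th quantile (0 <= p <= 1) of an already sorted,
// non-empty slice, using linear interpolation between neighbours.
func quantile(sorted []float64, p float64) float64 {
	pos := p * float64(len(sorted)-1)
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	return sorted[lo] + (pos-float64(lo))*(sorted[hi]-sorted[lo])
}

// suspectSplitJumps is a hypothetical helper, not part of the repair package.
// It returns the indices i where closes[i]/closes[i-1] is an IQR outlier and
// lies within tol (relative) of splitRatio or 1/splitRatio.
// It assumes all closes are strictly positive.
func suspectSplitJumps(closes []float64, splitRatio, tol float64) []int {
	if len(closes) < 3 || splitRatio <= 0 {
		return nil
	}
	ratios := make([]float64, 0, len(closes)-1)
	for i := 1; i < len(closes); i++ {
		ratios = append(ratios, closes[i]/closes[i-1])
	}

	// Compute the IQR fences on a sorted copy so ratios keeps its order.
	sorted := append([]float64(nil), ratios...)
	sort.Float64s(sorted)
	q1, q3 := quantile(sorted, 0.25), quantile(sorted, 0.75)
	iqr := q3 - q1
	lower, upper := q1-1.5*iqr, q3+1.5*iqr

	var flagged []int
	for i, r := range ratios {
		if r >= lower && r <= upper {
			continue // ordinary day-to-day move
		}
		matchesSplit := math.Abs(r*splitRatio-1) < tol || math.Abs(r/splitRatio-1) < tol
		if matchesSplit {
			flagged = append(flagged, i+1) // ratios[i] compares bar i+1 with bar i
		}
	}
	return flagged
}

func main() {
	// A 2:1 split that was not applied to the earlier bars shows up as a
	// sudden halving of the close.
	closes := []float64{100, 101, 99, 100, 50, 51, 50.5, 50}
	fmt.Println(suspectSplitJumps(closes, 2, 0.05))
}
```

Requiring both conditions keeps large but genuine moves (outliers that do not match the split ratio) from being treated as split errors.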
This package is designed to match the behavior of Python yfinance's price repair functionality.
Index
- func CountRepaired(bars []models.Bar) int
- func DetectBadDividends(bars []models.Bar, currency string) []int
- func DetectBadSplits(bars []models.Bar) []int
- func DetectUnitMixups(bars []models.Bar) []int
- func DetectZeroes(bars []models.Bar) []int
- func HasCapitalGains(bars []models.Bar) bool
- func HasDividends(bars []models.Bar) bool
- func HasSplits(bars []models.Bar) bool
- type CapitalGainsRepairStats
- type DividendInfo
- type DividendRepairStats
- type Options
- func DefaultOptions() Options
- type QuoteType
- type Repairer
- func New(opts Options) *Repairer
- func (r *Repairer) AnalyzeCapitalGains(bars []models.Bar) CapitalGainsRepairStats
- func (r *Repairer) AnalyzeDividends(bars []models.Bar) DividendRepairStats
- func (r *Repairer) AnalyzeSplits(bars []models.Bar) SplitRepairStats
- func (r *Repairer) AnalyzeUnitMixups(bars []models.Bar) UnitMixupStats
- func (r *Repairer) AnalyzeZeroes(bars []models.Bar) ZeroRepairStats
- func (r *Repairer) Repair(bars []models.Bar) ([]models.Bar, error)
- type SplitInfo
- type SplitRepairStats
- type UnitMixupStats
- type ZeroRepairStats
func CountRepaired
CountRepaired counts how many bars have been repaired.
func DetectBadDividends
DetectBadDividends checks if there are dividend issues in the data. Returns indices of bars with suspected dividend problems.
func DetectBadSplits
DetectBadSplits checks if there are unadjusted splits in the data.
func DetectUnitMixups
DetectUnitMixups checks if there are 100x errors in the data. Returns indices of bars with suspected 100x errors.
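One simple way to picture a 100x check is to compare each close with the series median and flag values that are roughly 100 times larger or smaller. The sketch below does only that; `suspectUnitMixups` and its tolerance are hypothetical, and DetectUnitMixups itself may use a different (for example, local rather than global) reference.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of xs without modifying the input slice.
func median(xs []float64) float64 {
	n := len(xs)
	if n == 0 {
		return 0
	}
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// suspectUnitMixups is a hypothetical helper, not the package's API. It flags
// indices whose close is within tol (relative) of 100x or 1/100 of the median.
func suspectUnitMixups(closes []float64, tol float64) []int {
	med := median(closes)
	if med <= 0 {
		return nil
	}
	var idx []int
	for i, c := range closes {
		if c <= 0 {
			continue // zero/missing values are a separate problem
		}
		r := c / med
		if math.Abs(r/100-1) < tol || math.Abs(r*100-1) < tol {
			idx = append(idx, i)
		}
	}
	return idx
}

func main() {
	closes := []float64{10.1, 10.3, 1025, 10.2, 10.4}
	fmt.Println(suspectUnitMixups(closes, 0.2))
}
```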
func DetectZeroes
DetectZeroes checks for bars with zero/missing values. Returns indices of bars that may need repair.
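A minimal sketch of such a check is shown below. It uses a local, hypothetical `bar` type with only OHLC fields, because the layout of `models.Bar` is not shown here, and it treats a bar as suspect when any price is exactly 0 or NaN.

```go
package main

import (
	"fmt"
	"math"
)

// bar is a hypothetical stand-in for models.Bar with only price fields.
type bar struct {
	Open, High, Low, Close float64
}

// isBad reports whether a price is zero or NaN.
func isBad(v float64) bool {
	return v == 0 || math.IsNaN(v)
}

// suspectZeroes is a hypothetical helper that returns the indices of bars
// with at least one zero or NaN price.
func suspectZeroes(bars []bar) []int {
	var idx []int
	for i, b := range bars {
		if isBad(b.Open) || isBad(b.High) || isBad(b.Low) || isBad(b.Close) {
			idx = append(idx, i)
		}
	}
	return idx
}

func main() {
	bars := []bar{
		{10, 11, 9.5, 10.5},
		{10.5, 11, 0, 10.8},            // zero low
		{10.8, 11.2, 10.6, math.NaN()}, // missing close
		{11, 11.5, 10.9, 11.3},
	}
	fmt.Println(suspectZeroes(bars)) // prints [1 2]
}
```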
func HasCapitalGains
HasCapitalGains checks if any bar has capital gains data.
func HasDividends
HasDividends checks if any bar has dividend data.
func HasSplits
HasSplits checks if any bar has split data.
type CapitalGainsRepairStats
CapitalGainsRepairStats contains statistics about capital gains repair.
type CapitalGainsRepairStats struct {
	TotalEvents       int     // Number of capital gains events
	DoubleCountEvents int     // Number detected as double-counted
	DoubleCountRatio  float64 // Ratio of double-counted events
	RepairApplied     bool    // Whether repair was applied
	BarsRepaired      int     // Number of bars that were repaired
}
type DividendInfo
DividendInfo contains information about a single dividend.
type DividendInfo struct {
	Date         time.Time
	Amount       float64
	IsMissingAdj bool
	IsTooSmall   bool
	IsTooLarge   bool
	IsPhantom    bool
	WasRepaired  bool
}
type DividendRepairStats
DividendRepairStats contains statistics about dividend repairs.
type DividendRepairStats struct {
	TotalDividends int            // Number of dividend events found
	MissingAdj     int            // Dividends with missing adjustment
	TooSmall       int            // Dividends 100x too small
	TooLarge       int            // Dividends 100x too big
	Phantoms       int            // Phantom (duplicate) dividends
	BarsRepaired   int            // Total bars modified
	Dividends      []DividendInfo // Details of each dividend
}
type Options
Options configures the repair behavior.
type Options struct {
	// Data context
	Ticker    string    // Ticker symbol
	Interval  string    // Data interval (1d, 1wk, 1mo, etc.)
	Timezone  string    // Exchange timezone
	Currency  string    // Price currency
	QuoteType QuoteType // Type of instrument (EQUITY, ETF, MUTUALFUND, etc.)

	// Feature flags - which repairs to apply
	FixUnitMixups   bool // Fix 100x currency errors ($/cents, £/pence)
	FixZeroes       bool // Fix missing/zero values
	FixSplits       bool // Fix bad stock split adjustments
	FixDividends    bool // Fix bad dividend adjustments
	FixCapitalGains bool // Fix capital gains double-counting (ETF/MutualFund only)
}
func DefaultOptions
DefaultOptions returns options with all repairs enabled.
type QuoteType
QuoteType represents the type of financial instrument.
const (
	QuoteTypeEquity     QuoteType = "EQUITY"
	QuoteTypeETF        QuoteType = "ETF"
	QuoteTypeMutualFund QuoteType = "MUTUALFUND"
	QuoteTypeIndex      QuoteType = "INDEX"
	QuoteTypeCurrency   QuoteType = "CURRENCY"
	QuoteTypeCrypto     QuoteType = "CRYPTOCURRENCY"
)
type Repairer
Repairer handles price data repair operations.
func New
New creates a new Repairer with the given options.
func (*Repairer) AnalyzeCapitalGains
AnalyzeCapitalGains analyzes bars for capital gains issues without modifying them. Useful for debugging and understanding the data.
func (*Repairer) AnalyzeDividends
AnalyzeDividends analyzes bars for dividend issues without modifying them.
func (*Repairer) AnalyzeSplits
AnalyzeSplits analyzes bars for split issues without modifying them.
func (*Repairer) AnalyzeUnitMixups
AnalyzeUnitMixups analyzes bars for 100x errors without modifying them.
func (*Repairer) AnalyzeZeroes
AnalyzeZeroes analyzes bars for zero/missing values without modifying them.
func (*Repairer) Repair
Repair applies all enabled repair operations to the bar data. The order of operations matters:
- Fix dividend adjustments (must come before price-level errors)
- Fix 100x unit errors
- Fix stock split errors
- Fix zero/missing values
- Fix capital gains double-counting (last, needs clean adjustment data)
Returns the repaired bars and any error encountered.
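The ordering above can be pictured as a fixed list of steps, each gated by its feature flag. The sketch below only computes which steps would run and in what order; the `options` type and `repairPlan` function are hypothetical stand-ins, and skipping the capital gains step for quote types other than ETF and MUTUALFUND is an assumption based on the note that this repair only applies to those types.

```go
package main

import "fmt"

// options is a hypothetical, trimmed-down stand-in for repair.Options.
type options struct {
	QuoteType       string
	FixUnitMixups   bool
	FixZeroes       bool
	FixSplits       bool
	FixDividends    bool
	FixCapitalGains bool
}

// repairPlan returns the names of the enabled repair steps in the documented
// order: dividends, unit mixups, splits, zeroes, capital gains.
func repairPlan(o options) []string {
	// ASSUMPTION: capital gains repair is skipped for non-fund instruments.
	isFund := o.QuoteType == "ETF" || o.QuoteType == "MUTUALFUND"
	steps := []struct {
		name    string
		enabled bool
	}{
		{"dividends", o.FixDividends},
		{"unit mixups", o.FixUnitMixups},
		{"splits", o.FixSplits},
		{"zeroes", o.FixZeroes},
		{"capital gains", o.FixCapitalGains && isFund},
	}
	var plan []string
	for _, s := range steps {
		if s.enabled {
			plan = append(plan, s.name)
		}
	}
	return plan
}

func main() {
	o := options{
		QuoteType:       "EQUITY",
		FixUnitMixups:   true,
		FixSplits:       true,
		FixDividends:    true,
		FixCapitalGains: true, // has no effect for EQUITY in this sketch
	}
	fmt.Println(repairPlan(o)) // prints [dividends unit mixups splits]
}
```

Keeping the steps in a single ordered slice makes the sequence explicit and hard to reorder by accident when a new repair is added.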
type SplitInfo
SplitInfo contains information about a single split.
type SplitRepairStats
SplitRepairStats contains statistics about split repair.
type SplitRepairStats struct {
	TotalSplits    int         // Number of split events found
	SplitsRepaired int         // Number of splits that were repaired
	BarsRepaired   int         // Total bars modified
	Splits         []SplitInfo // Details of each split
}
type UnitMixupStats
UnitMixupStats contains statistics about unit mixup repairs.
type UnitMixupStats struct {
	TotalBars        int  // Total bars analyzed
	BarsRepaired     int  // Bars with 100x errors fixed
	HasUnitSwitch    bool // Whether a permanent unit switch was detected
	SwitchIndex      int  // Index where unit switch occurred (-1 if none)
	RandomMixupCount int  // Number of random 100x errors found
}
type ZeroRepairStats
ZeroRepairStats contains statistics about zero value repairs.