repair

import "github.com/wnjoon/go-yfinance/pkg/repair"

Package repair provides price data repair functionality for financial time series.

This package detects and corrects common data quality issues in Yahoo Finance data, including 100x currency errors, bad stock split adjustments, dividend double-counting, capital gains double-counting, and missing/zero values.

Overview

Yahoo Finance data sometimes contains errors that need to be repaired:

  • 100x errors: Price appears in cents instead of dollars (or vice versa)
  • Bad stock splits: Split adjustments not applied or applied incorrectly
  • Bad dividends: Dividend adjustments not applied correctly
  • Capital gains double-counting: For ETFs/MutualFunds, capital gains counted twice
  • Zero/missing values: Prices showing as 0 or NaN

Usage

Create a Repairer and call Repair on your bar data:

opts := repair.DefaultOptions()
opts.Interval = "1d"
opts.QuoteType = repair.QuoteTypeETF // typed constant instead of a raw string

repairer := repair.New(opts)
repairedBars, err := repairer.Repair(bars)

Repair Options

Individual repair functions can be enabled/disabled:

opts := repair.Options{
    FixUnitMixups:   true,   // Fix 100x errors
    FixZeroes:       true,   // Fix zero/missing values
    FixSplits:       true,   // Fix stock split errors
    FixDividends:    true,   // Fix dividend adjustment errors
    FixCapitalGains: true,   // Fix capital gains double-counting
}

Capital Gains Repair (v1.1.0)

For ETFs and Mutual Funds, Yahoo Finance sometimes double-counts capital gains in the Adjusted Close calculation. This repair detects and corrects this issue:

// Only applies to ETF and MUTUALFUND quote types
opts.QuoteType = repair.QuoteTypeETF
opts.FixCapitalGains = true

The algorithm compares the price drop on each distribution day with two expected drops, one based on the dividend alone and one based on the dividend plus capital gains, to decide whether the capital gains were double-counted.
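The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: the `distribution` struct, `looksDoubleCounted`, and `doubleCountRatio` are hypothetical names, and the rule "the drop is closer to the dividend alone" is one plausible reading of the description. It also ignores ordinary market movement on the distribution day.

```go
package main

import (
	"fmt"
	"math"
)

// distribution is a hypothetical record for one distribution day.
// It is not a type from the repair package.
type distribution struct {
	PrevClose    float64 // close on the day before the distribution
	Close        float64 // close on the distribution day
	Dividend     float64 // reported dividend per share
	CapitalGains float64 // reported capital gains per share
}

// looksDoubleCounted reports whether the observed price drop is closer to
// the dividend alone than to dividend + capital gains. Under the assumption
// made here, that suggests the dividend already includes the capital gains.
func looksDoubleCounted(d distribution) bool {
	if d.CapitalGains <= 0 {
		return false // nothing to double-count
	}
	drop := d.PrevClose - d.Close
	errDividendOnly := math.Abs(drop - d.Dividend)
	errBoth := math.Abs(drop - (d.Dividend + d.CapitalGains))
	return errDividendOnly < errBoth
}

// doubleCountRatio returns the share of capital gains events that look
// double-counted, or 0 when there are no such events.
func doubleCountRatio(events []distribution) float64 {
	total, flagged := 0, 0
	for _, e := range events {
		if e.CapitalGains <= 0 {
			continue
		}
		total++
		if looksDoubleCounted(e) {
			flagged++
		}
	}
	if total == 0 {
		return 0
	}
	return float64(flagged) / float64(total)
}

func main() {
	events := []distribution{
		{PrevClose: 100, Close: 98, Dividend: 2, CapitalGains: 1.5}, // drop matches dividend only
		{PrevClose: 50, Close: 47, Dividend: 1, CapitalGains: 2},    // drop matches dividend + gains
	}
	fmt.Printf("double-count ratio: %.2f\n", doubleCountRatio(events))
}
```

A ratio like this is the kind of quantity the DoubleCountRatio field of CapitalGainsRepairStats could summarize; the threshold the package uses to decide on a repair is not documented here.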

Stock Split Repair

Detects when Yahoo fails to apply stock split adjustments to historical data:

opts.FixSplits = true

The repair uses interquartile-range (IQR) outlier detection to find suspicious price changes that match the split ratio, then corrects them.
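As a rough illustration of IQR-based detection, the sketch below flags day-over-day close ratios that fall outside the usual 1.5×IQR fences and also sit near the split ratio or its inverse. Everything here is an assumption made for illustration: the function names, the 1.5 multiplier, the relative tolerance, and the use of plain close prices. The package's actual thresholds and inputs may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// quantile returns the q-th quantile (0..1) of an ascending-sorted slice,
// using linear interpolation between neighbouring values.
func quantile(sorted []float64, q float64) float64 {
	if len(sorted) == 0 {
		return math.NaN()
	}
	pos := q * float64(len(sorted)-1)
	lo := int(math.Floor(pos))
	hi := int(math.Ceil(pos))
	return sorted[lo] + (pos-float64(lo))*(sorted[hi]-sorted[lo])
}

// suspectSplitIndices returns indices i where closes[i]/closes[i-1] is an
// IQR outlier and lies within tol (relative) of splitRatio or 1/splitRatio.
// It assumes all closes are positive.
func suspectSplitIndices(closes []float64, splitRatio, tol float64) []int {
	if len(closes) < 3 || splitRatio <= 0 {
		return nil
	}
	changes := make([]float64, 0, len(closes)-1)
	for i := 1; i < len(closes); i++ {
		changes = append(changes, closes[i]/closes[i-1])
	}

	sorted := append([]float64(nil), changes...)
	sort.Float64s(sorted)
	q1, q3 := quantile(sorted, 0.25), quantile(sorted, 0.75)
	iqr := q3 - q1
	lower, upper := q1-1.5*iqr, q3+1.5*iqr

	var out []int
	for i, c := range changes {
		if c >= lower && c <= upper {
			continue // ordinary price move
		}
		nearRatio := math.Abs(c/splitRatio-1) <= tol
		nearInverse := math.Abs(c*splitRatio-1) <= tol
		if nearRatio || nearInverse {
			out = append(out, i+1) // index of the bar after the jump
		}
	}
	return out
}

func main() {
	// Prices halve between index 3 and 4, consistent with an unadjusted 2:1 split.
	closes := []float64{200, 202, 198, 201, 100, 101, 99.5, 100.5}
	fmt.Println(suspectSplitIndices(closes, 2, 0.05))
}
```

Requiring both conditions keeps ordinary volatility (an outlier that does not match the ratio) and small moves (a ratio match that is not an outlier, as with splits close to 1:1) from being flagged.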

This package is designed to match the behavior of Python yfinance's price repair functionality.

func CountRepaired

func CountRepaired(bars []models.Bar) int

CountRepaired counts how many bars have been repaired.

func DetectBadDividends

func DetectBadDividends(bars []models.Bar, currency string) []int

DetectBadDividends checks if there are dividend issues in the data. Returns indices of bars with suspected dividend problems.

func DetectBadSplits

func DetectBadSplits(bars []models.Bar) []int

DetectBadSplits checks if there are unadjusted splits in the data. Returns indices of bars with suspected split problems.

func DetectUnitMixups

func DetectUnitMixups(bars []models.Bar) []int

DetectUnitMixups checks if there are 100x errors in the data. Returns indices of bars with suspected 100x errors.
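To make the idea of a 100x error concrete, here is a minimal sketch that compares each close with the series median and flags values roughly 100 times larger or smaller. It is a hypothetical standalone helper, not the logic behind DetectUnitMixups; the median baseline and the tolerance parameter are assumptions chosen for brevity.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of values, or NaN for an empty slice.
func median(values []float64) float64 {
	s := append([]float64(nil), values...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return math.NaN()
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// suspect100x returns indices of closes whose ratio to the series median is
// within tol (relative) of 100 or of 1/100. Non-positive closes are skipped.
func suspect100x(closes []float64, tol float64) []int {
	m := median(closes)
	if math.IsNaN(m) || m <= 0 {
		return nil
	}
	var out []int
	for i, c := range closes {
		if c <= 0 {
			continue
		}
		r := c / m
		if math.Abs(r/100-1) <= tol || math.Abs(r*100-1) <= tol {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// Index 2 looks like pence instead of pounds; index 4 is the reverse.
	closes := []float64{10.1, 10.2, 1015, 10.0, 0.102, 10.3}
	fmt.Println(suspect100x(closes, 0.2))
}
```

A global median only works for isolated errors. A permanent unit switch partway through the series, which UnitMixupStats reports separately, would need a baseline computed on each side of the switch.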

func DetectZeroes

func DetectZeroes(bars []models.Bar) []int

DetectZeroes checks for bars with zero/missing values. Returns indices of bars that may need repair.

func HasCapitalGains

func HasCapitalGains(bars []models.Bar) bool

HasCapitalGains checks if any bar has capital gains data.

func HasDividends

func HasDividends(bars []models.Bar) bool

HasDividends checks if any bar has dividend data.

func HasSplits

func HasSplits(bars []models.Bar) bool

HasSplits checks if any bar has split data.

type CapitalGainsRepairStats

CapitalGainsRepairStats contains statistics about capital gains repair.

type CapitalGainsRepairStats struct {
    TotalEvents       int     // Number of capital gains events
    DoubleCountEvents int     // Number detected as double-counted
    DoubleCountRatio  float64 // Ratio of double-counted events
    RepairApplied     bool    // Whether repair was applied
    BarsRepaired      int     // Number of bars that were repaired
}

type DividendInfo

DividendInfo contains information about a single dividend.

type DividendInfo struct {
    Date         time.Time
    Amount       float64
    IsMissingAdj bool
    IsTooSmall   bool
    IsTooLarge   bool
    IsPhantom    bool
    WasRepaired  bool
}

type DividendRepairStats

DividendRepairStats contains statistics about dividend repairs.

type DividendRepairStats struct {
    TotalDividends int            // Number of dividend events found
    MissingAdj     int            // Dividends with missing adjustment
    TooSmall       int            // Dividends 100x too small
    TooLarge       int            // Dividends 100x too big
    Phantoms       int            // Phantom (duplicate) dividends
    BarsRepaired   int            // Total bars modified
    Dividends      []DividendInfo // Details of each dividend
}

type Options

Options configures the repair behavior.

type Options struct {
    // Data context
    Ticker    string    // Ticker symbol
    Interval  string    // Data interval (1d, 1wk, 1mo, etc.)
    Timezone  string    // Exchange timezone
    Currency  string    // Price currency
    QuoteType QuoteType // Type of instrument (EQUITY, ETF, MUTUALFUND, etc.)

    // Feature flags - which repairs to apply
    FixUnitMixups   bool // Fix 100x currency errors ($/cents, £/pence)
    FixZeroes       bool // Fix missing/zero values
    FixSplits       bool // Fix bad stock split adjustments
    FixDividends    bool // Fix bad dividend adjustments
    FixCapitalGains bool // Fix capital gains double-counting (ETF/MutualFund only)
}

func DefaultOptions

func DefaultOptions() Options

DefaultOptions returns options with all repairs enabled.

type QuoteType

QuoteType represents the type of financial instrument.

type QuoteType string

const (
    QuoteTypeEquity     QuoteType = "EQUITY"
    QuoteTypeETF        QuoteType = "ETF"
    QuoteTypeMutualFund QuoteType = "MUTUALFUND"
    QuoteTypeIndex      QuoteType = "INDEX"
    QuoteTypeCurrency   QuoteType = "CURRENCY"
    QuoteTypeCrypto     QuoteType = "CRYPTOCURRENCY"
)

type Repairer

Repairer handles price data repair operations.

type Repairer struct {
    // contains filtered or unexported fields
}

func New

func New(opts Options) *Repairer

New creates a new Repairer with the given options.

func (*Repairer) AnalyzeCapitalGains

func (r *Repairer) AnalyzeCapitalGains(bars []models.Bar) CapitalGainsRepairStats

AnalyzeCapitalGains analyzes bars for capital gains issues without modifying them. Useful for debugging and understanding the data.

func (*Repairer) AnalyzeDividends

func (r *Repairer) AnalyzeDividends(bars []models.Bar) DividendRepairStats

AnalyzeDividends analyzes bars for dividend issues without modifying them.

func (*Repairer) AnalyzeSplits

func (r *Repairer) AnalyzeSplits(bars []models.Bar) SplitRepairStats

AnalyzeSplits analyzes bars for split issues without modifying them.

func (*Repairer) AnalyzeUnitMixups

func (r *Repairer) AnalyzeUnitMixups(bars []models.Bar) UnitMixupStats

AnalyzeUnitMixups analyzes bars for 100x errors without modifying them.

func (*Repairer) AnalyzeZeroes

func (r *Repairer) AnalyzeZeroes(bars []models.Bar) ZeroRepairStats

AnalyzeZeroes analyzes bars for zero/missing values without modifying them.

func (*Repairer) Repair

func (r *Repairer) Repair(bars []models.Bar) ([]models.Bar, error)

Repair applies all enabled repair operations to the bar data. The order of operations matters:

  1. Fix dividend adjustments (must come before price-level errors)
  2. Fix 100x unit errors
  3. Fix stock split errors
  4. Fix zero/missing values
  5. Fix capital gains double-counting (last, needs clean adjustment data)

Returns the repaired bars and any error encountered.

type SplitInfo

SplitInfo contains information about a single split.

type SplitInfo struct {
    Date        time.Time
    Ratio       float64
    WasRepaired bool
}

type SplitRepairStats

SplitRepairStats contains statistics about split repair.

type SplitRepairStats struct {
    TotalSplits    int         // Number of split events found
    SplitsRepaired int         // Number of splits that were repaired
    BarsRepaired   int         // Total bars modified
    Splits         []SplitInfo // Details of each split
}

type UnitMixupStats

UnitMixupStats contains statistics about unit mixup repairs.

type UnitMixupStats struct {
    TotalBars        int  // Total bars analyzed
    BarsRepaired     int  // Bars with 100x errors fixed
    HasUnitSwitch    bool // Whether a permanent unit switch was detected
    SwitchIndex      int  // Index where unit switch occurred (-1 if none)
    RandomMixupCount int  // Number of random 100x errors found
}

type ZeroRepairStats

ZeroRepairStats contains statistics about zero value repairs.

type ZeroRepairStats struct {
    TotalBars       int // Total bars analyzed
    ZeroBars        int // Bars with zero prices
    PartialZeroBars int // Bars with some zero values
    ZeroVolumeBars  int // Bars with zero volume but price changed
    BarsRepaired    int // Total bars repaired
}