stats
Package stats provides statistical utility functions for price repair operations.
This package includes functions for percentile calculations, z-score computations, median filtering, and outlier detection. These utilities are essential for detecting and correcting data quality issues in financial time series.
Percentile Functions
The package calculates percentiles using linear interpolation (see Percentile).
Z-Score Functions
Z-scores are calculated for standardization and outlier detection (see ZScore, ZScoreSlice, and DetectOutliersByZScore).
Filtering Functions
Median filtering and IQR-based outlier detection are provided for noise reduction (see MedianFilter and OutlierMask).
These functions are designed to match the behavior of the NumPy and SciPy functions used in the Python yfinance implementation.
Index
- func Abs(data []float64) []float64
- func All(mask []bool) bool
- func Any(mask []bool) bool
- func ClipOutliers(data []float64, multiplier float64) []float64
- func CountTrue(mask []bool) int
- func DetectOutliersByZScore(data []float64, threshold float64) []bool
- func Diff(data []float64) []float64
- func FilterByMask(data []float64, mask []bool) []float64
- func FindBlocks(mask []bool) [][2]int
- func IQR(data []float64) (q1, q3, iqr float64)
- func InlierMask(data []float64, multiplier float64) []bool
- func Mean(data []float64) float64
- func Median(data []float64) float64
- func MedianFilter(data []float64, windowSize int) []float64
- func MedianFilter2D(data [][]float64, windowSize int) [][]float64
- func MedianOfSlice(data []float64) float64
- func OHLCMedian(open, high, low, close float64) float64
- func OutlierBounds(data []float64, multiplier float64) (lower, upper float64)
- func OutlierMask(data []float64, multiplier float64) []bool
- func PctChange(data []float64) []float64
- func Percentile(data []float64, p float64) float64
- func RemoveNaN(data []float64) []float64
- func RollingMean(data []float64, windowSize int) []float64
- func RollingStd(data []float64, windowSize int) []float64
- func Std(data []float64, ddof int) float64
- func WeightedMean(data, weights []float64) float64
- func ZScore(value, mean, std float64) float64
- func ZScoreSlice(data []float64) []float64
- func ZScoreWithParams(data []float64, mean, std float64) []float64
func Abs
Abs returns absolute values of the data.
func All
All returns true if all values in the mask are true.
func Any
Any returns true if any value in the mask is true.
func ClipOutliers
ClipOutliers replaces outliers with boundary values.
func CountTrue
CountTrue counts the number of true values in a boolean slice.
func DetectOutliersByZScore
DetectOutliersByZScore identifies outliers based on z-score threshold. Returns a boolean mask where true indicates an outlier.
Parameters:
- data: slice of float64 values
- threshold: z-score threshold (typically 2.0 or 3.0)
func Diff
Diff calculates the difference between consecutive elements. Returns slice of length n-1.
func FilterByMask
FilterByMask returns elements where mask is true.
func FindBlocks
FindBlocks identifies contiguous blocks of true values in a boolean mask. Returns a slice of half-open [start, end) index pairs.
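The half-open [start, end) convention can be illustrated with a minimal sketch. The function name findBlocksSketch is hypothetical and the code is a standalone illustration of the described behavior, not the package's implementation.

```go
package main

import "fmt"

// findBlocksSketch is a hypothetical stand-in for FindBlocks. It returns
// half-open [start, end) index pairs, one per run of true values.
func findBlocksSketch(mask []bool) [][2]int {
	var blocks [][2]int
	start := -1 // -1 means "not currently inside a block"
	for i, v := range mask {
		switch {
		case v && start < 0:
			start = i // a new block opens here
		case !v && start >= 0:
			blocks = append(blocks, [2]int{start, i}) // end is exclusive
			start = -1
		}
	}
	if start >= 0 { // the mask ended while a block was still open
		blocks = append(blocks, [2]int{start, len(mask)})
	}
	return blocks
}

func main() {
	mask := []bool{false, true, true, false, true}
	fmt.Println(findBlocksSketch(mask)) // [[1 3] [4 5]]
}
```

With an exclusive end index, each pair can be used directly as a slice expression such as data[start:end].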
func IQR
IQR calculates the interquartile range (Q3 - Q1). Returns Q1, Q3, and IQR.
The interquartile range is used for outlier detection:
- Lower bound: Q1 - 1.5 * IQR
- Upper bound: Q3 + 1.5 * IQR
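The bounds above can be sketched as follows. This is a standalone illustration under stated assumptions: the names quantileSorted and outlierBoundsSketch are hypothetical, quartiles are taken by linear interpolation, and a value counts as an outlier only when it lies strictly outside the bounds.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// quantileSorted interpolates linearly on already-sorted, non-empty data.
// q is a fraction in [0, 1]. Hypothetical helper.
func quantileSorted(sorted []float64, q float64) float64 {
	rank := q * float64(len(sorted)-1)
	lo := int(rank)
	if lo >= len(sorted)-1 {
		return sorted[len(sorted)-1]
	}
	return sorted[lo] + (rank-float64(lo))*(sorted[lo+1]-sorted[lo])
}

// outlierBoundsSketch returns Q1 - m*IQR and Q3 + m*IQR.
// Returning NaN for empty input is an assumption of this sketch.
func outlierBoundsSketch(data []float64, multiplier float64) (lower, upper float64) {
	if len(data) == 0 {
		return math.NaN(), math.NaN()
	}
	sorted := append([]float64(nil), data...) // leave the caller's slice untouched
	sort.Float64s(sorted)
	q1 := quantileSorted(sorted, 0.25)
	q3 := quantileSorted(sorted, 0.75)
	iqr := q3 - q1
	return q1 - multiplier*iqr, q3 + multiplier*iqr
}

func main() {
	data := []float64{1, 2, 3, 4, 100}
	lower, upper := outlierBoundsSketch(data, 1.5)
	fmt.Println(lower, upper) // -1 7
	for _, v := range data {
		if v < lower || v > upper {
			fmt.Println("outlier:", v) // outlier: 100
		}
	}
}
```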
func InlierMask
InlierMask creates a boolean mask for inliers (non-outliers). Returns true for values that are NOT outliers.
func Mean
Mean calculates the arithmetic mean of the data. Returns NaN for empty data.
func Median
Median calculates the median (50th percentile) of the data.
func MedianFilter
MedianFilter applies a 1D median filter to the data. This is similar to scipy.ndimage.median_filter for 1D arrays.
Parameters:
- data: input slice
- windowSize: filter window size (should be odd)
Returns filtered data with same length as input. Edge values use smaller windows.
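A minimal sketch of a 1D median filter with shrinking edge windows follows. The names medianOf and medianFilterSketch are hypothetical, and averaging the two middle values for even-length edge windows is an assumption; the package may handle edges differently.

```go
package main

import (
	"fmt"
	"sort"
)

// medianOf returns the median of a non-empty window without modifying it.
// Even-length windows average the two middle values (an assumption).
func medianOf(window []float64) float64 {
	s := append([]float64(nil), window...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// medianFilterSketch centers a window on each element and truncates the
// window at the slice boundaries, so edge values use smaller windows.
func medianFilterSketch(data []float64, windowSize int) []float64 {
	out := make([]float64, len(data))
	half := windowSize / 2
	for i := range data {
		lo := i - half
		if lo < 0 {
			lo = 0
		}
		hi := i + half + 1 // exclusive upper index
		if hi > len(data) {
			hi = len(data)
		}
		out[i] = medianOf(data[lo:hi])
	}
	return out
}

func main() {
	prices := []float64{1, 2, 100, 3, 4} // 100 is a one-point spike
	fmt.Println(medianFilterSketch(prices, 3)) // [1.5 2 3 4 3.5]
}
```

The spike at index 2 is replaced by the median of its neighborhood, while the first and last outputs come from two-element windows.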
func MedianFilter2D
MedianFilter2D applies a 2D median filter to the data matrix. This is similar to scipy.ndimage.median_filter for 2D arrays.
Parameters:
- data: 2D slice [rows][cols]
- windowSize: filter window size for both dimensions
Returns filtered 2D data.
func MedianOfSlice
MedianOfSlice calculates the median without sorting the original slice.
func OHLCMedian
OHLCMedian calculates the median of the Open, High, Low, and Close values. This provides a robust estimate of the "typical" price.
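For four values, the median is the mean of the two middle values after sorting. The sketch below uses the hypothetical name ohlcMedianSketch and shows why this estimate is robust to a single bad field; it is an illustration, not the package's code.

```go
package main

import (
	"fmt"
	"sort"
)

// ohlcMedianSketch sorts the four prices and averages the middle two.
func ohlcMedianSketch(open, high, low, close float64) float64 {
	v := []float64{open, high, low, close}
	sort.Float64s(v)
	return (v[1] + v[2]) / 2
}

func main() {
	fmt.Println(ohlcMedianSketch(10, 12, 9, 11))   // 10.5
	fmt.Println(ohlcMedianSketch(10, 1200, 9, 11)) // 10.5, the bad High is ignored
}
```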
func OutlierBounds
OutlierBounds calculates the lower and upper bounds for outlier detection using the IQR method with a configurable multiplier.
Parameters:
- data: slice of float64 values
- multiplier: IQR multiplier (typically 1.5 for outliers, 3.0 for extreme outliers)
Returns lower bound, upper bound.
func OutlierMask
OutlierMask creates a boolean mask for outliers using the IQR method. Returns true for values that are outliers.
Parameters:
- data: slice of float64 values
- multiplier: IQR multiplier (typically 1.5)
func PctChange
PctChange calculates the percentage change between consecutive elements. Returns slice of length n-1.
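Diff and PctChange both map n inputs to n-1 outputs. The sketch below uses hypothetical names and assumes PctChange returns a fractional change (0.1 for a 10% rise), as pandas pct_change does; whether the package scales by 100 is not stated here.

```go
package main

import "fmt"

// diffSketch returns data[i] - data[i-1] for i = 1..n-1.
func diffSketch(data []float64) []float64 {
	if len(data) < 2 {
		return nil
	}
	out := make([]float64, len(data)-1)
	for i := 1; i < len(data); i++ {
		out[i-1] = data[i] - data[i-1]
	}
	return out
}

// pctChangeSketch returns the fractional change relative to the previous
// element. A zero previous value yields Inf or NaN under IEEE 754 division.
func pctChangeSketch(data []float64) []float64 {
	if len(data) < 2 {
		return nil
	}
	out := make([]float64, len(data)-1)
	for i := 1; i < len(data); i++ {
		out[i-1] = (data[i] - data[i-1]) / data[i-1]
	}
	return out
}

func main() {
	prices := []float64{100, 110, 99}
	fmt.Println(diffSketch(prices))      // [10 -11]
	fmt.Println(pctChangeSketch(prices)) // [0.1 -0.1]
}
```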
func Percentile
Percentile calculates the p-th percentile of the given data using linear interpolation. This matches numpy.percentile with default interpolation method.
Parameters:
- data: slice of float64 values
- p: percentile to compute (0-100)
Returns the percentile value. Returns NaN for empty data.
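Linear interpolation places the p-th percentile at fractional rank p/100 * (n-1) in the sorted data and blends the two neighboring values. The following is a minimal sketch under that definition; percentileSketch is a hypothetical name and the code assumes p lies in [0, 100].

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentileSketch interpolates linearly between the two sorted values
// that surround the fractional rank. It assumes 0 <= p <= 100.
func percentileSketch(data []float64, p float64) float64 {
	if len(data) == 0 {
		return math.NaN()
	}
	sorted := append([]float64(nil), data...) // do not reorder the input
	sort.Float64s(sorted)
	rank := p / 100 * float64(len(sorted)-1)
	lo := int(math.Floor(rank))
	hi := int(math.Ceil(rank))
	frac := rank - float64(lo)
	return sorted[lo] + frac*(sorted[hi]-sorted[lo])
}

func main() {
	data := []float64{4, 1, 3, 2}
	fmt.Println(percentileSketch(data, 50)) // 2.5
	fmt.Println(percentileSketch(data, 25)) // 1.75
}
```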
func RemoveNaN
RemoveNaN returns a new slice with NaN values removed.
func RollingMean
RollingMean calculates a rolling (moving) mean with the specified window size. Uses center alignment. Returns NaN for positions where window is incomplete.
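Center alignment means each output is the mean of a window centered on that position, so the first and last windowSize/2 positions have no complete window. The sketch below uses the hypothetical name rollingMeanSketch and assumes an odd window size; how the package aligns even windows is not stated here.

```go
package main

import (
	"fmt"
	"math"
)

// rollingMeanSketch computes a centered moving mean. Positions whose
// window would extend past either end of the slice are set to NaN.
func rollingMeanSketch(data []float64, windowSize int) []float64 {
	out := make([]float64, len(data))
	half := windowSize / 2
	for i := range data {
		lo, hi := i-half, i+half // inclusive window bounds
		if windowSize < 1 || lo < 0 || hi >= len(data) {
			out[i] = math.NaN()
			continue
		}
		sum := 0.0
		for _, v := range data[lo : hi+1] {
			sum += v
		}
		out[i] = sum / float64(hi-lo+1)
	}
	return out
}

func main() {
	fmt.Println(rollingMeanSketch([]float64{1, 2, 3, 4, 5}, 3)) // [NaN 2 3 4 NaN]
}
```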
func RollingStd
RollingStd calculates a rolling (moving) standard deviation. Uses center alignment and sample std (ddof=1).
func Std
Std calculates the standard deviation of the data. The denominator is n - ddof, so ddof=1 gives the sample standard deviation (n-1 denominator).
Parameters:
- data: slice of float64 values
- ddof: delta degrees of freedom (0 for population, 1 for sample)
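The effect of ddof can be seen in a short sketch. The name stdSketch is hypothetical, and returning NaN when n - ddof is not positive is an assumption of this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// stdSketch divides the sum of squared deviations by n - ddof.
func stdSketch(data []float64, ddof int) float64 {
	n := len(data)
	if n-ddof <= 0 {
		return math.NaN() // assumption: undefined when no degrees of freedom remain
	}
	mean := 0.0
	for _, v := range data {
		mean += v
	}
	mean /= float64(n)
	ss := 0.0
	for _, v := range data {
		d := v - mean
		ss += d * d
	}
	return math.Sqrt(ss / float64(n-ddof))
}

func main() {
	data := []float64{2, 4, 4, 4, 5, 5, 7, 9}
	fmt.Println(stdSketch(data, 0))                      // 2
	fmt.Println(stdSketch(data, 1) > stdSketch(data, 0)) // true
}
```

The sample estimate (ddof=1) is always at least as large as the population estimate because it divides the same sum by a smaller number.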
func WeightedMean
WeightedMean calculates the weighted arithmetic mean. Returns NaN if weights sum to zero or if slices have different lengths.
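The two NaN cases can be sketched as follows; weightedMeanSketch is a hypothetical name and the code is a standalone illustration of the documented behavior.

```go
package main

import (
	"fmt"
	"math"
)

// weightedMeanSketch returns sum(data[i]*weights[i]) / sum(weights).
func weightedMeanSketch(data, weights []float64) float64 {
	if len(data) != len(weights) {
		return math.NaN()
	}
	var sum, wsum float64
	for i, v := range data {
		sum += v * weights[i]
		wsum += weights[i]
	}
	if wsum == 0 {
		return math.NaN() // also covers two empty slices
	}
	return sum / wsum
}

func main() {
	fmt.Println(weightedMeanSketch([]float64{1, 2, 3}, []float64{1, 1, 2})) // 2.25
	fmt.Println(weightedMeanSketch([]float64{1, 2}, []float64{1}))          // NaN
}
```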
func ZScore
ZScore calculates the z-score (standard score) for a single value.
Z-score = (value - mean) / std
Returns NaN if std is zero or NaN.
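A minimal sketch of the formula and its guard follows. The name zScoreSketch is hypothetical; the code mirrors the documented behavior rather than reproducing the package source.

```go
package main

import (
	"fmt"
	"math"
)

// zScoreSketch returns (value - mean) / std, or NaN when std is zero or NaN.
func zScoreSketch(value, mean, std float64) float64 {
	if std == 0 || math.IsNaN(std) {
		return math.NaN()
	}
	return (value - mean) / std
}

func main() {
	fmt.Println(zScoreSketch(7, 5, 2)) // 1
	fmt.Println(zScoreSketch(7, 5, 0)) // NaN
}
```

Guarding against a zero std avoids the infinite result that plain division would produce for constant data.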
func ZScoreSlice
ZScoreSlice calculates z-scores for all values in the data. Uses sample standard deviation (ddof=1).
func ZScoreWithParams
ZScoreWithParams calculates z-scores using provided mean and std.