Histogram
PortfolioOptimisers.Knuth Type
struct Knuth <: AstroPyBins end

Histogram binning algorithm using Knuth's rule.
Knuth implements Knuth's rule for selecting the optimal number of bins in a histogram, as provided by the AstroPy library. This method aims to maximize the posterior probability of the histogram given the data, resulting in an adaptive binning strategy that balances bias and variance.
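Knuth's rule can be illustrated with a pure-Python sketch of its Bayesian log-posterior over the number of equal-width bins (an illustrative re-implementation, not the AstroPy code; the function names here are hypothetical):

```python
import math

def knuth_log_posterior(data, m):
    """Knuth's log-posterior for m equal-width bins (up to an additive constant)."""
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / m
    # Bin counts n_k for m equal-width bins; clamp the maximum into the last bin.
    counts = [0] * m
    for x in data:
        counts[min(int((x - lo) / width), m - 1)] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))

def knuth_num_bins(data, max_bins=50):
    """Pick the bin count that maximises the posterior."""
    return max(range(1, max_bins + 1), key=lambda m: knuth_log_posterior(data, m))
```

AstroPy's `knuth_bin_width` performs an equivalent optimisation, returning the bin width rather than the count.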
Related
PortfolioOptimisers.FreedmanDiaconis Type
struct FreedmanDiaconis <: AstroPyBins end

Histogram binning algorithm using the Freedman-Diaconis rule.
FreedmanDiaconis implements the Freedman-Diaconis rule for selecting the number of bins in a histogram, as provided by the AstroPy library. This method determines bin width based on the interquartile range (IQR) and the number of data points, making it robust to outliers and suitable for skewed distributions.
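The rule sets the bin width to `2 * IQR / n^(1/3)`. A minimal pure-Python sketch (illustrative only; the quantile helper is a simple linear-interpolation stand-in, not AstroPy's implementation):

```python
import math

def freedman_diaconis_width(data):
    """Freedman-Diaconis bin width: 2 * IQR / n^(1/3)."""
    xs = sorted(data)
    n = len(xs)

    def quantile(q):
        # Linear interpolation between order statistics.
        pos = q * (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    iqr = quantile(0.75) - quantile(0.25)
    return 2 * iqr / n ** (1 / 3)
```

Because it depends on the IQR rather than the standard deviation, a handful of extreme outliers barely changes the chosen width.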
Related
PortfolioOptimisers.Scott Type
struct Scott <: AstroPyBins end

Histogram binning algorithm using Scott's rule.
Scott implements Scott's rule for selecting the number of bins in a histogram, as provided by the AstroPy library. This method chooses bin width based on the standard deviation of the data and the number of observations, providing a good default for normally distributed data.
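Scott's rule sets the bin width proportional to `sigma * n^(-1/3)`; AstroPy uses the constant `(24 * sqrt(pi))^(1/3) ≈ 3.5`. A pure-Python sketch (illustrative, assuming the population standard deviation):

```python
import math

def scott_width(data):
    """Scott's rule: bin width = sigma * (24 * sqrt(pi) / n)^(1/3) ~ 3.5 * sigma * n^(-1/3)."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sigma * (24 * math.sqrt(math.pi) / n) ** (1 / 3)
```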
Related
PortfolioOptimisers.HacineGharbiRavier Type
struct HacineGharbiRavier <: AbstractBins end

Histogram binning algorithm using the Hacine-Gharbi–Ravier rule.
HacineGharbiRavier implements the Hacine-Gharbi–Ravier rule for selecting the number of bins in a histogram. This method adapts the bin count based on the correlation structure and sample size, and is particularly useful for information-theoretic measures such as mutual information and variation of information.
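One common statement of the univariate Hacine-Gharbi–Ravier rule depends only on the sample size `n`. A pure-Python sketch of that form (illustrative; the bivariate variant, which also uses the correlation between the pair, is not shown):

```python
import math

def hgr_num_bins(n):
    """Hacine-Gharbi-Ravier bin count for n observations (univariate form)."""
    xi = (8 + 324 * n + 12 * math.sqrt(36 * n + 729 * n ** 2)) ** (1 / 3)
    return round(xi / 6 + 2 / (3 * xi) + 1 / 3)
```

The count grows slowly with `n`, which keeps histogram-based mutual-information estimates from being dominated by sparsely populated bins.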
Related
PortfolioOptimisers.AbstractBins Type
abstract type AbstractBins <: AbstractAlgorithm end

Abstract supertype for all histogram binning algorithms.
AbstractBins is the abstract type for all binning algorithm types used in histogram-based calculations within PortfolioOptimisers.jl, such as mutual information and variation of information analysis. Concrete subtypes implement specific binning strategies (e.g., Knuth, Freedman-Diaconis, Scott, Hacine-Gharbi-Ravier) and provide a consistent interface for bin selection.
Related
PortfolioOptimisers.AstroPyBins Type
abstract type AstroPyBins <: AbstractBins end

Abstract supertype for all histogram binning algorithms implemented using AstroPy's bin width selection methods.
AstroPyBins is the abstract type for all binning algorithm types that rely on bin width selection functions from the AstroPy Python library, such as Knuth, Freedman-Diaconis, and Scott. Concrete subtypes implement specific binning strategies and provide a consistent interface for bin selection in histogram-based calculations within PortfolioOptimisers.jl.
Related
PortfolioOptimisers.get_bin_width_func Function
get_bin_width_func(bins::Union{<:AbstractBins, <:Integer})

Return the bin width selection function associated with a histogram binning algorithm.
This utility dispatches on the binning algorithm type and returns the corresponding bin width function from the AstroPy Python library for Knuth, FreedmanDiaconis, and Scott. For HacineGharbiRavier and integer bin counts, it returns nothing, as these strategies do not use a bin width function.
Arguments
- bins::Knuth: Use Knuth's rule (astropy.stats.knuth_bin_width).
- bins::FreedmanDiaconis: Use the Freedman-Diaconis rule (astropy.stats.freedman_bin_width).
- bins::Scott: Use Scott's rule (astropy.stats.scott_bin_width).
- bins::Union{<:HacineGharbiRavier, <:Integer}: No bin width function (returns nothing).
Returns
bin_width_func::Function: The corresponding bin width function (callable), or nothing if not applicable.
Examples
julia> PortfolioOptimisers.get_bin_width_func(Knuth())
Python: <function knuth_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(FreedmanDiaconis())
Python: <function freedman_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(Scott())
Python: <function scott_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(HacineGharbiRavier())

julia> PortfolioOptimisers.get_bin_width_func(10)

Related
PortfolioOptimisers.calc_num_bins Function
calc_num_bins(bins::Union{<:AbstractBins, <:Integer}, xj::AbstractVector,
              xi::AbstractVector, j::Integer, i::Integer, bin_width_func, T::Integer)

Compute the number of histogram bins for a pair of variables using a specified binning algorithm.
This function determines the number of bins to use for histogram-based calculations (such as mutual information or variation of information) between two variables, based on the selected binning strategy. It dispatches on the binning algorithm type and uses the appropriate method for each:
- For AstroPyBins, it computes the bin width using the provided bin_width_func and calculates the number of bins as the range divided by the bin width, rounded to the nearest integer. For off-diagonal pairs, it uses the maximum of the two variables' bin counts.
- For HacineGharbiRavier, it uses the Hacine-Gharbi–Ravier rule, which adapts the bin count based on the correlation and sample size.
- For an integer, it returns the specified number of bins directly.
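The AstroPyBins branch can be sketched in pure Python, using a simple Freedman-Diaconis width as a stand-in for the bin width function (all names here are illustrative, not the package's API):

```python
def fd_width(data):
    """Stand-in width function: 2 * IQR / n^(1/3) with a crude order-statistic IQR."""
    xs = sorted(data)
    n = len(xs)
    iqr = xs[(3 * n) // 4] - xs[n // 4]
    return 2 * iqr / n ** (1 / 3)

def num_bins(xj, xi, j, i, width_func=fd_width):
    """Bins = range / width, rounded; off-diagonal pairs take the max of the two counts."""
    nj = round((max(xj) - min(xj)) / width_func(xj))
    if i == j:
        return nj
    ni = round((max(xi) - min(xi)) / width_func(xi))
    return max(nj, ni)
```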
Arguments
- bins: Binning algorithm or fixed number of bins.
- xj: Data vector for variable j.
- xi: Data vector for variable i.
- j: Index of variable j.
- i: Index of variable i.
- bin_width_func: Bin width selection function (from get_bin_width_func), or nothing.
- T: Number of observations (used by some algorithms).
Returns
nbins::Int: The computed number of bins for the variable pair.
Related
PortfolioOptimisers.calc_hist_data Function
calc_hist_data(xj::AbstractVector, xi::AbstractVector, bins::Integer)

Compute histogram-based marginal and joint distributions for two variables.
This function computes the normalised histograms (probability mass functions) for two variables xj and xi using the specified number of bins, as well as their joint histogram. It returns the marginal entropies and the joint histogram, which are used in mutual information and variation of information calculations.
Arguments
- xj: Data vector for variable j.
- xi: Data vector for variable i.
- bins: Number of bins to use for the histograms.
Returns
- ex::Real: Entropy of xj.
- ey::Real: Entropy of xi.
- hxy::Matrix{<:Real}: Joint histogram (counts, not normalised to probability).
Details
- The histograms are computed using StatsBase.fit(Histogram, ...) over the range of each variable, with bin edges expanded slightly using eps to ensure all data is included.
- The marginal histograms are normalised to sum to 1 before entropy calculation.
- The joint histogram is not normalised, as it is used directly in mutual information calculations.
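The steps above can be sketched in pure Python (an illustrative re-implementation, not the StatsBase-backed code; the slight edge expansion mimics the eps adjustment described above):

```python
import math

def hist_data(xj, xi, bins):
    """Marginal entropies of xj and xi plus their joint count matrix."""
    def edges(v):
        lo, hi = min(v), max(v)
        eps = 1e-12 * (hi - lo)  # expand slightly so the maximum lands in the last bin
        return lo - eps, (hi - lo + 2 * eps) / bins

    def bin_index(x, lo, width):
        return min(int((x - lo) / width), bins - 1)

    (loj, wj), (loi, wi) = edges(xj), edges(xi)
    hxy = [[0] * bins for _ in range(bins)]
    for a, b in zip(xj, xi):
        hxy[bin_index(a, loj, wj)][bin_index(b, loi, wi)] += 1

    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * math.log(c / total) for c in counts if c > 0)

    ex = entropy([sum(row) for row in hxy])                           # marginal of xj
    ey = entropy([sum(row[k] for row in hxy) for k in range(bins)])   # marginal of xi
    return ex, ey, hxy
```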
Related
PortfolioOptimisers.intrinsic_mutual_info Function
intrinsic_mutual_info(X::AbstractMatrix)

Compute the intrinsic mutual information from a joint histogram.
This function calculates the mutual information between two variables given their joint histogram matrix X. It is used as a core step in information-theoretic measures such as mutual information and variation of information.
Arguments
- X: Joint histogram matrix.
Returns
mi::Real: The intrinsic mutual information between the two variables.
Details
- The function computes marginal distributions by summing over rows and columns.
- Only nonzero entries in the joint histogram are considered.
- The mutual information is computed as the sum over all nonzero joint probabilities of p(x, y) * log(p(x, y) / (p(x) * p(y))), with careful handling of logs and normalisation.
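This computation can be sketched directly from the formula (a pure-Python illustration operating on a joint count matrix; the function name is hypothetical):

```python
import math

def mutual_info_from_joint(hxy):
    """MI from a joint count matrix: sum of p(x,y) * log(p(x,y) / (p(x) * p(y)))."""
    total = sum(sum(row) for row in hxy)
    px = [sum(row) / total for row in hxy]                              # row marginals
    py = [sum(row[k] for row in hxy) / total for k in range(len(hxy[0]))]  # column marginals
    mi = 0.0
    for r, row in enumerate(hxy):
        for c, cnt in enumerate(row):
            if cnt > 0:  # only nonzero joint entries contribute
                pxy = cnt / total
                mi += pxy * math.log(pxy / (px[r] * py[c]))
    return mi
```

A diagonal joint histogram (perfect dependence between two equiprobable bins) yields log 2, while a uniform joint histogram (independence) yields 0.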
Related
PortfolioOptimisers.variation_info Function
variation_info(X::AbstractMatrix;
               bins::Union{<:AbstractBins, <:Integer} = HacineGharbiRavier(),
               normalise::Bool = true)

Compute the variation of information (VI) matrix for a set of variables.
This function calculates the pairwise variation of information between all columns of the data matrix X, using histogram-based entropy and mutual information estimates. VI quantifies the amount of information lost and gained when moving from one variable to another, and is a true metric on the space of discrete distributions.
Arguments
- X: Data matrix (observations × variables).
- bins: Binning algorithm or fixed number of bins.
- normalise: Whether to normalise the VI by the joint entropy.
Returns
var_mtx::Matrix{<:Real}: Symmetric matrix of pairwise variation of information values.
Details
- For each pair of variables, the function computes marginal entropies and the joint histogram using calc_hist_data.
- The mutual information is computed using intrinsic_mutual_info.
- VI is calculated as H(X) + H(Y) - 2 * I(X, Y). If normalise is true, it is divided by the joint entropy.
- The result is clamped to [0, typemax(eltype(X))] and is symmetric.
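The per-pair formula can be illustrated in a few lines of Python (a sketch of the arithmetic only, assuming the marginal entropies and mutual information have already been computed; the joint entropy used for normalisation is H(X) + H(Y) - I(X, Y)):

```python
def variation_of_information(ex, ey, mi, normalise=True):
    """VI = H(X) + H(Y) - 2 * I(X, Y); optionally divided by the joint entropy."""
    vi = ex + ey - 2 * mi
    if normalise:
        hxy = ex + ey - mi  # joint entropy H(X, Y)
        vi = vi / hxy if hxy > 0 else 0.0
    return max(vi, 0.0)     # clamp tiny negatives from floating-point rounding
```

Identical variables give VI = 0, and independent variables give a normalised VI of 1, consistent with VI being a metric.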
Related
PortfolioOptimisers.mutual_info Function
mutual_info(X::AbstractMatrix;
            bins::Union{<:AbstractBins, <:Integer} = HacineGharbiRavier(),
            normalise::Bool = true)

Compute the mutual information (MI) matrix for a set of variables.
This function calculates the pairwise mutual information between all columns of the data matrix X, using histogram-based entropy and mutual information estimates. MI quantifies the amount of shared information between pairs of variables, and is widely used in information-theoretic analysis of dependencies.
Arguments
- X: Data matrix (observations × variables).
- bins: Binning algorithm or fixed number of bins.
- normalise: Whether to normalise the MI by the minimum marginal entropy.
Returns
mut_mtx::Matrix{<:Real}: Symmetric matrix of pairwise mutual information values.
Details
- For each pair of variables, the function computes marginal entropies and the joint histogram using calc_hist_data.
- The mutual information is computed using intrinsic_mutual_info.
- If normalise is true, the MI is divided by the minimum of the two marginal entropies.
- The result is clamped to [0, typemax(eltype(X))] and is symmetric.
Related