Histogram
PortfolioOptimisers.Knuth Type
struct Knuth <: AstroPyBins end

Histogram binning algorithm using Knuth's rule.
Knuth implements Knuth's rule for selecting the optimal number of bins in a histogram, as provided by the AstroPy library. This method aims to maximize the posterior probability of the histogram given the data, resulting in an adaptive binning strategy that balances bias and variance.
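Knuth's rule can be illustrated with a pure-Python sketch of its Bayesian log-posterior over the number of equal-width bins (an illustrative re-implementation, not the AstroPy code; the function names here are hypothetical):

```python
import math

def knuth_log_posterior(data, m):
    """Knuth's log-posterior for m equal-width bins (up to an additive constant)."""
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / m
    # Bin counts n_k for m equal-width bins; clamp the maximum into the last bin.
    counts = [0] * m
    for x in data:
        counts[min(int((x - lo) / width), m - 1)] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2)
            + sum(math.lgamma(c + 0.5) for c in counts))

def knuth_num_bins(data, max_bins=50):
    """Pick the bin count that maximises the posterior."""
    return max(range(1, max_bins + 1), key=lambda m: knuth_log_posterior(data, m))
```

AstroPy's `knuth_bin_width` performs an equivalent optimisation, returning the bin width rather than the count.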
Related
PortfolioOptimisers.FreedmanDiaconis Type
struct FreedmanDiaconis <: AstroPyBins end

Histogram binning algorithm using the Freedman-Diaconis rule.
FreedmanDiaconis implements the Freedman-Diaconis rule for selecting the number of bins in a histogram, as provided by the AstroPy library. This method determines bin width based on the interquartile range (IQR) and the number of data points, making it robust to outliers and suitable for skewed distributions.
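The rule sets the bin width to `2 * IQR / n^(1/3)`. A minimal pure-Python sketch (illustrative only; the quantile helper is a simple linear-interpolation stand-in, not AstroPy's implementation):

```python
import math

def freedman_diaconis_width(data):
    """Freedman-Diaconis bin width: 2 * IQR / n^(1/3)."""
    xs = sorted(data)
    n = len(xs)

    def quantile(q):
        # Linear interpolation between order statistics.
        pos = q * (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    iqr = quantile(0.75) - quantile(0.25)
    return 2 * iqr / n ** (1 / 3)
```

Because it depends on the IQR rather than the standard deviation, a handful of extreme outliers barely changes the chosen width.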
Related
PortfolioOptimisers.Scott Type
struct Scott <: AstroPyBins end

Histogram binning algorithm using Scott's rule.
Scott implements Scott's rule for selecting the number of bins in a histogram, as provided by the AstroPy library. This method chooses bin width based on the standard deviation of the data and the number of observations, providing a good default for normally distributed data.
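Scott's rule sets the bin width proportional to `sigma * n^(-1/3)`; AstroPy uses the constant `(24 * sqrt(pi))^(1/3) ≈ 3.5`. A pure-Python sketch (illustrative, assuming the population standard deviation):

```python
import math

def scott_width(data):
    """Scott's rule: bin width = sigma * (24 * sqrt(pi) / n)^(1/3) ~ 3.5 * sigma * n^(-1/3)."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sigma * (24 * math.sqrt(math.pi) / n) ** (1 / 3)
```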
Related
PortfolioOptimisers.HacineGharbiRavier Type
struct HacineGharbiRavier <: AbstractBins end

Histogram binning algorithm using the Hacine-Gharbi–Ravier rule.
HacineGharbiRavier implements the Hacine-Gharbi–Ravier rule for selecting the number of bins in a histogram. This method adapts the bin count based on the correlation structure and sample size, and is particularly useful for information-theoretic measures such as mutual information and variation of information.
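One common statement of the univariate Hacine-Gharbi–Ravier rule depends only on the sample size `n`. A pure-Python sketch of that form (illustrative; the bivariate variant, which also uses the correlation between the pair, is not shown):

```python
import math

def hgr_num_bins(n):
    """Hacine-Gharbi-Ravier bin count for n observations (univariate form)."""
    xi = (8 + 324 * n + 12 * math.sqrt(36 * n + 729 * n ** 2)) ** (1 / 3)
    return round(xi / 6 + 2 / (3 * xi) + 1 / 3)
```

The count grows slowly with `n`, which keeps histogram-based mutual-information estimates from being dominated by sparsely populated bins.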
Related
PortfolioOptimisers.AbstractBins Type
abstract type AbstractBins <: AbstractAlgorithm end

Abstract supertype for all histogram binning algorithms.
AbstractBins is the abstract type for all binning algorithm types used in histogram-based calculations within PortfolioOptimisers.jl, such as mutual information and variation of information analysis. Concrete subtypes implement specific binning strategies (e.g., Knuth, Freedman-Diaconis, Scott, Hacine-Gharbi-Ravier) and provide a consistent interface for bin selection.
Related
PortfolioOptimisers.AstroPyBins Type
abstract type AstroPyBins <: AbstractBins end

Abstract supertype for all histogram binning algorithms implemented using AstroPy's bin width selection methods.
AstroPyBins is the abstract type for all binning algorithm types that rely on bin width selection functions from the AstroPy Python library, such as Knuth, Freedman-Diaconis, and Scott. Concrete subtypes implement specific binning strategies and provide a consistent interface for bin selection in histogram-based calculations within PortfolioOptimisers.jl.
Related
PortfolioOptimisers.get_bin_width_func Function
get_bin_width_func(bins::Union{<:AbstractBins, <:Integer})

Return the bin width selection function associated with a histogram binning algorithm.
This utility dispatches on the binning algorithm type and returns the corresponding bin width function from the AstroPy Python library for Knuth, FreedmanDiaconis, and Scott. For HacineGharbiRavier and integer bin counts, it returns nothing, as these strategies do not use a bin width function.
Arguments
- bins::Knuth: Use Knuth's rule (astropy.stats.knuth_bin_width).
- bins::FreedmanDiaconis: Use the Freedman-Diaconis rule (astropy.stats.freedman_bin_width).
- bins::Scott: Use Scott's rule (astropy.stats.scott_bin_width).
- bins::Union{<:HacineGharbiRavier, <:Integer}: No bin width function (returns nothing).
Returns
bin_width_func::Function: The corresponding bin width function (callable), or nothing if not applicable.
Examples
julia> PortfolioOptimisers.get_bin_width_func(Knuth())
Python: <function knuth_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(FreedmanDiaconis())
Python: <function freedman_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(Scott())
Python: <function scott_bin_width at 0x7da1178e0fe0>

julia> PortfolioOptimisers.get_bin_width_func(HacineGharbiRavier())

julia> PortfolioOptimisers.get_bin_width_func(10)

Related
PortfolioOptimisers.calc_num_bins Function
calc_num_bins(bins::Union{<:AbstractBins, <:Integer}, xj::AbstractVector,
              xi::AbstractVector, j::Integer, i::Integer, bin_width_func, T::Integer)

Compute the number of histogram bins for a pair of variables using a specified binning algorithm.
This function determines the number of bins to use for histogram-based calculations (such as mutual information or variation of information) between two variables, based on the selected binning strategy. It dispatches on the binning algorithm type and uses the appropriate method for each:
- For AstroPyBins, it computes the bin width using the provided bin_width_func and calculates the number of bins as the range divided by the bin width, rounded to the nearest integer. For off-diagonal pairs, it uses the maximum of the two variables' bin counts.
- For HacineGharbiRavier, it uses the Hacine-Gharbi–Ravier rule, which adapts the bin count based on the correlation and sample size.
- For an integer, it returns the specified number of bins directly.
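The AstroPyBins branch can be sketched in pure Python, using a simple Freedman-Diaconis width as a stand-in for the bin width function (all names here are illustrative, not the package's API):

```python
def fd_width(data):
    """Stand-in width function: 2 * IQR / n^(1/3) with a crude order-statistic IQR."""
    xs = sorted(data)
    n = len(xs)
    iqr = xs[(3 * n) // 4] - xs[n // 4]
    return 2 * iqr / n ** (1 / 3)

def num_bins(xj, xi, j, i, width_func=fd_width):
    """Bins = range / width, rounded; off-diagonal pairs take the max of the two counts."""
    nj = round((max(xj) - min(xj)) / width_func(xj))
    if i == j:
        return nj
    ni = round((max(xi) - min(xi)) / width_func(xi))
    return max(nj, ni)
```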
Arguments
- bins: Binning algorithm or fixed number of bins.
- xj: Data vector for variable j.
- xi: Data vector for variable i.
- j: Index of variable j.
- i: Index of variable i.
- bin_width_func: Bin width selection function (from get_bin_width_func), or nothing.
- T: Number of observations (used by some algorithms).
Returns
nbins::Int: The computed number of bins for the variable pair.
Related
PortfolioOptimisers.calc_hist_data Function
calc_hist_data(xj::AbstractVector, xi::AbstractVector, bins::Integer)

Compute histogram-based marginal and joint distributions for two variables.
This function computes the normalised histograms (probability mass functions) for two variables xj and xi using the specified number of bins, as well as their joint histogram. It returns the marginal entropies and the joint histogram, which are used in mutual information and variation of information calculations.
Arguments
- xj: Data vector for variable j.
- xi: Data vector for variable i.
- bins: Number of bins to use for the histograms.
Returns
- ex::Real: Entropy of xj.
- ey::Real: Entropy of xi.
- hxy::Matrix{<:Real}: Joint histogram (counts, not normalised to probability).
Details
- The histograms are computed using StatsBase.fit(Histogram, ...) over the range of each variable, with bin edges expanded slightly using eps to ensure all data is included.
- The marginal histograms are normalised to sum to 1 before entropy calculation.
- The joint histogram is not normalised, as it is used directly in mutual information calculations.
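The steps above can be sketched in pure Python (an illustrative re-implementation, not the StatsBase-backed code; the slight edge expansion mimics the eps adjustment described above):

```python
import math

def hist_data(xj, xi, bins):
    """Marginal entropies of xj and xi plus their joint count matrix."""
    def edges(v):
        lo, hi = min(v), max(v)
        eps = 1e-12 * (hi - lo)  # expand slightly so the maximum lands in the last bin
        return lo - eps, (hi - lo + 2 * eps) / bins

    def bin_index(x, lo, width):
        return min(int((x - lo) / width), bins - 1)

    (loj, wj), (loi, wi) = edges(xj), edges(xi)
    hxy = [[0] * bins for _ in range(bins)]
    for a, b in zip(xj, xi):
        hxy[bin_index(a, loj, wj)][bin_index(b, loi, wi)] += 1

    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * math.log(c / total) for c in counts if c > 0)

    ex = entropy([sum(row) for row in hxy])                           # marginal of xj
    ey = entropy([sum(row[k] for row in hxy) for k in range(bins)])   # marginal of xi
    return ex, ey, hxy
```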
Related
PortfolioOptimisers.intrinsic_mutual_info Function
intrinsic_mutual_info(X::AbstractMatrix)

Compute the intrinsic mutual information from a joint histogram.
This function calculates the mutual information between two variables given their joint histogram matrix X. It is used as a core step in information-theoretic measures such as mutual information and variation of information.
Arguments
- X: Joint histogram matrix.
Returns
mi::Real: The intrinsic mutual information between the two variables.
Details
- The function computes marginal distributions by summing over rows and columns.
- Only nonzero entries in the joint histogram are considered.
- The mutual information is computed as the sum over all nonzero joint probabilities of p(x, y) * log(p(x, y) / (p(x) * p(y))), with careful handling of logs and normalisation.
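This computation can be sketched directly from the formula (a pure-Python illustration operating on a joint count matrix; the function name is hypothetical):

```python
import math

def mutual_info_from_joint(hxy):
    """MI from a joint count matrix: sum of p(x,y) * log(p(x,y) / (p(x) * p(y)))."""
    total = sum(sum(row) for row in hxy)
    px = [sum(row) / total for row in hxy]                              # row marginals
    py = [sum(row[k] for row in hxy) / total for k in range(len(hxy[0]))]  # column marginals
    mi = 0.0
    for r, row in enumerate(hxy):
        for c, cnt in enumerate(row):
            if cnt > 0:  # only nonzero joint entries contribute
                pxy = cnt / total
                mi += pxy * math.log(pxy / (px[r] * py[c]))
    return mi
```

A diagonal joint histogram (perfect dependence between two equiprobable bins) yields log 2, while a uniform joint histogram (independence) yields 0.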
Related
PortfolioOptimisers.variation_info Function
variation_info(X::AbstractMatrix;
               bins::Union{<:AbstractBins, <:Integer} = HacineGharbiRavier(),
               normalise::Bool = true)

Compute the variation of information (VI) matrix for a set of variables.
This function calculates the pairwise variation of information between all columns of the data matrix X, using histogram-based entropy and mutual information estimates. VI quantifies the amount of information lost and gained when moving from one variable to another, and is a true metric on the space of discrete distributions.
Arguments
- X: Data matrix (observations × variables).
- bins: Binning algorithm or fixed number of bins.
- normalise: Whether to normalise the VI by the joint entropy.
Returns
var_mtx::Matrix{<:Real}: Symmetric matrix of pairwise variation of information values.
Details
- For each pair of variables, the function computes marginal entropies and the joint histogram using calc_hist_data.
- The mutual information is computed using intrinsic_mutual_info.
- VI is calculated as H(X) + H(Y) - 2 * I(X, Y). If normalise is true, it is divided by the joint entropy.
- The result is clamped to [0, typemax(eltype(X))] and is symmetric.
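The per-pair formula can be illustrated in a few lines of Python (a sketch of the arithmetic only, assuming the marginal entropies and mutual information have already been computed; the joint entropy used for normalisation is H(X) + H(Y) - I(X, Y)):

```python
def variation_of_information(ex, ey, mi, normalise=True):
    """VI = H(X) + H(Y) - 2 * I(X, Y); optionally divided by the joint entropy."""
    vi = ex + ey - 2 * mi
    if normalise:
        hxy = ex + ey - mi  # joint entropy H(X, Y)
        vi = vi / hxy if hxy > 0 else 0.0
    return max(vi, 0.0)     # clamp tiny negatives from floating-point rounding
```

Identical variables give VI = 0, and independent variables give a normalised VI of 1, consistent with VI being a metric.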
Related
PortfolioOptimisers.mutual_info Function
mutual_info(X::AbstractMatrix;
            bins::Union{<:AbstractBins, <:Integer} = HacineGharbiRavier(),
            normalise::Bool = true)

Compute the mutual information (MI) matrix for a set of variables.
This function calculates the pairwise mutual information between all columns of the data matrix X, using histogram-based entropy and mutual information estimates. MI quantifies the amount of shared information between pairs of variables, and is widely used in information-theoretic analysis of dependencies.
Arguments
- X: Data matrix (observations × variables).
- bins: Binning algorithm or fixed number of bins.
- normalise: Whether to normalise the MI by the minimum marginal entropy.
Returns
mut_mtx::Matrix{<:Real}: Symmetric matrix of pairwise mutual information values.
Details
- For each pair of variables, the function computes marginal entropies and the joint histogram using calc_hist_data.
- The mutual information is computed using intrinsic_mutual_info.
- If normalise is true, the MI is divided by the minimum of the two marginal entropies.
- The result is clamped to [0, typemax(eltype(X))] and is symmetric.
Related