
Combinatorial

PortfolioOptimisers.CombinatorialCrossValidation Type
julia
struct CombinatorialCrossValidation{__T_n_folds, __T_n_test_folds, __T_purged_size, __T_embargo_size} <: NonSequentialCrossValidationEstimator

Implements combinatorial non-sequential cross-validation with purging and embargoing, allowing for all possible combinations of test folds.

Fields

  • n_folds: Number of folds to split the data into.

  • n_test_folds: Number of folds to use as test set in each split.

  • purged_size: Number of observations to exclude from the start/end of each train set adjacent to a test set.

  • embargo_size: Number of observations to exclude from the start of each train set after a test set.
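To make the two exclusion rules concrete, here is a minimal sketch in plain Julia. The helper `purged_train_indices` is hypothetical and not part of PortfolioOptimisers. It assumes a single contiguous test block, that purging removes `purged_size` observations on both sides of that block, and that the embargo removes a further `embargo_size` observations after it. The library's exact boundary handling may differ.

```julia
# Hypothetical helper (not the library's API): training indices that remain
# after purging and embargoing around one contiguous test block.
function purged_train_indices(T::Integer, test::UnitRange{<:Integer};
                              purged_size::Integer = 0,
                              embargo_size::Integer = 0)
    # Assumption: purge `purged_size` observations immediately before the test block.
    lo = first(test) - purged_size
    # Assumption: purge `purged_size` observations immediately after the test block,
    # then embargo a further `embargo_size` observations.
    hi = last(test) + purged_size + embargo_size
    return [i for i in 1:T if i < lo || i > hi]
end

train = purged_train_indices(20, 9:12; purged_size = 2, embargo_size = 1)
# Observations 7:8 and 13:14 are purged, observation 15 is embargoed,
# so `train` holds 1:6 followed by 16:20.
```

With both sizes left at their default of 0, the sketch simply returns every index outside the test block.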

Constructors

julia
CombinatorialCrossValidation(;
    n_folds::Integer = 10,
    n_test_folds::Integer = 8,
    purged_size::Integer = 0,
    embargo_size::Integer = 0,
    warn_comb::Integer = 100_000,
) -> CombinatorialCrossValidation

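Keyword arguments correspond to the struct's fields. The exception is warn_comb, which is not a field: it only sets the threshold for the warning described under Validation.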

Validation

  • n_folds must be non-empty, greater than zero, and finite.

  • n_test_folds must be non-empty, greater than zero, and finite.

  • purged_size and embargo_size must be non-empty and finite.

  • Warns if the number of combinations exceeds warn_comb.
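The combination count behind the warning can be checked by hand. The sketch below assumes one split is generated per way of choosing `n_test_folds` folds out of `n_folds`, so the count is a binomial coefficient. It uses only `Base.binomial` and does not call the library.

```julia
# Assumption: the number of splits equals the number of ways to choose the test folds.
n_folds, n_test_folds = 10, 8
n_splits = binomial(n_folds, n_test_folds)    # 45 splits for the default configuration

warn_comb = 100_000                           # default threshold from the constructor
exceeds_default = n_splits > warn_comb        # false: no warning expected

# A larger configuration crosses the default threshold:
n_splits_large = binomial(20, 10)             # 184756
exceeds_large = n_splits_large > warn_comb    # true
```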

Examples

julia
julia> CombinatorialCrossValidation(; n_folds = 10, n_test_folds = 8, purged_size = 2,
                                    embargo_size = 1)
CombinatorialCrossValidation
       n_folds ┼ Int64: 10
  n_test_folds ┼ Int64: 8
   purged_size ┼ Int64: 2
  embargo_size ┴ Int64: 1

PortfolioOptimisers.CombinatorialCrossValidationResult Type
julia
struct CombinatorialCrossValidationResult{__T_train_idx, __T_test_idx, __T_path_ids} <: NonSequentialCrossValidationResult

Result type produced by CombinatorialCrossValidation after splitting data into combinatorial training and testing folds.

Stores the train index vectors, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.

Fields

  • train_idx: Vector of training index ranges for each split.

  • test_idx: Vector of vectors of testing index ranges (one per path per split).

  • path_ids: Matrix mapping fold indices to path indices.

Base.split Method
julia
Base.split(ccv::CombinatorialCrossValidation, rd::ReturnsResult) -> CombinatorialCrossValidationResult

Split the returns data rd into all possible combinations of training and test folds using combinatorial cross-validation with optional purging and embargoing.

Arguments

  • ccv::CombinatorialCrossValidation: Combinatorial cross-validation estimator.

  • rd::ReturnsResult: Returns data to split.

Returns

  • CombinatorialCrossValidationResult: Result containing train indices, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.
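The shape of the result can be illustrated without the library. The sketch below is an assumption-laden toy: `fold_ranges` and `fold_combinations` are hypothetical helpers, folds are assumed to be contiguous with any remainder spread over the first folds, and purging and embargoing are left out. It only shows how each combination of test folds yields one training index vector and a list of test ranges.

```julia
# Hypothetical helper: split 1:T into n_folds contiguous ranges.
# Assumption: the first `T % n_folds` folds receive one extra observation.
function fold_ranges(T::Integer, n_folds::Integer)
    base, extra = divrem(T, n_folds)
    ranges = UnitRange{Int}[]
    start = 1
    for f in 1:n_folds
        len = base + (f <= extra ? 1 : 0)
        push!(ranges, start:(start + len - 1))
        start += len
    end
    return ranges
end

# Hypothetical helper: all k-element subsets of 1:n, enumerated via bitmasks.
# The ordering is arbitrary and may differ from the library's.
fold_combinations(n, k) = [[f for f in 1:n if (mask >> (f - 1)) & 1 == 1]
                           for mask in 0:(2^n - 1) if count_ones(mask) == k]

T, n_folds, n_test_folds = 10, 4, 2
folds = fold_ranges(T, n_folds)               # [1:3, 4:6, 7:8, 9:10]

splits = map(fold_combinations(n_folds, n_test_folds)) do test_folds
    test = [folds[f] for f in test_folds]     # one range per test fold
    train = reduce(vcat, [collect(folds[f]) for f in 1:n_folds if !(f in test_folds)])
    (train = train, test = test)
end
```

Here `length(splits)` equals `binomial(4, 2)`, and every fold serves as a test fold in three of the six splits.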

PortfolioOptimisers.n_test_paths Function
julia
n_test_paths(n_folds, n_test_folds)

Compute the number of test paths in combinatorial cross-validation.

Returns the number of unique recombined test paths from n_folds folds choosing n_test_folds test folds. Also accepts a CombinatorialCrossValidation object directly.

Arguments

  • n_folds: Total number of folds.

  • n_test_folds: Number of test folds per combination.

Returns

  • Integer number of test paths.
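As a rough check of the count, the sketch below uses the usual combinatorial cross-validation identity: each fold is a test fold in `binomial(n_folds - 1, n_test_folds - 1)` splits, which equals `binomial(n_folds, n_test_folds) * n_test_folds / n_folds`. The function name `sketch_n_test_paths` is hypothetical, and the assumption is that the library computes the same quantity.

```julia
# Hypothetical re-implementation, assuming the standard path-count identity.
sketch_n_test_paths(n_folds::Integer, n_test_folds::Integer) =
    binomial(n_folds, n_test_folds) * n_test_folds ÷ n_folds

paths_default = sketch_n_test_paths(10, 8)    # 45 * 8 ÷ 10 = 36
paths_small = sketch_n_test_paths(4, 2)       # 6 * 2 ÷ 4 = 3

# Equivalent form: the number of splits in which any single fold is tested.
same_count = paths_default == binomial(9, 7)  # true
```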

PortfolioOptimisers.average_train_size Function
julia
average_train_size(T, n_folds, n_test_folds)

Compute the average training set size for combinatorial cross-validation.

Arguments

  • T: Total number of observations.

  • n_folds: Total number of folds.

  • n_test_folds: Number of test folds per combination.

Returns

  • Average number of training observations per split.
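A back-of-the-envelope version of this quantity, ignoring purging and embargoing, is the size of one fold times the number of folds left for training. The name `sketch_average_train_size` is hypothetical and the formula is an assumption about what the library computes.

```julia
# Hypothetical re-implementation: average fold size times the number of training folds.
# Assumption: purging and embargoing are not accounted for.
sketch_average_train_size(T, n_folds, n_test_folds) =
    T / n_folds * (n_folds - n_test_folds)

avg_default = sketch_average_train_size(1000, 10, 8)   # 100 per fold * 2 training folds
avg_few_test = sketch_average_train_size(1000, 10, 2)  # 100 per fold * 8 training folds
```

With the default `n_folds = 10` and `n_test_folds = 8`, only a fifth of the observations end up in each training set.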

PortfolioOptimisers.recombined_paths Function
julia
recombined_paths(ccv)

Generate the recombined test paths for combinatorial cross-validation.

Returns a vector of vectors representing the recombined test paths — sequences of test fold indices that together cover the entire dataset.

Arguments

  • ccv: Combinatorial cross-validation estimator.

Returns

  • Vector of recombined path index vectors.
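One common way to build recombined paths is to take, for every fold, the p-th split in which that fold is a test fold. The sketch below follows that construction in plain Julia. It is an illustration under that assumption, with an arbitrary split ordering, and the library's return layout may differ.

```julia
n_folds, n_test_folds = 4, 2

# All combinations of test folds, enumerated via bitmasks (ordering is arbitrary).
combos = [[f for f in 1:n_folds if (mask >> (f - 1)) & 1 == 1]
          for mask in 0:(2^n_folds - 1) if count_ones(mask) == n_test_folds]

# Assumption: the number of paths follows the standard identity.
n_paths = binomial(n_folds, n_test_folds) * n_test_folds ÷ n_folds

# For each fold, the indices of the splits in which it is a test fold.
occurrences = [[s for (s, c) in enumerate(combos) if f in c] for f in 1:n_folds]

# Path p uses, for each fold, the p-th split that tests that fold.
paths = [[occurrences[f][p] for f in 1:n_folds] for p in 1:n_paths]
```

Each inner vector of `paths` has one entry per fold, so stitching together the corresponding test predictions covers the whole dataset exactly once.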

PortfolioOptimisers.optimal_number_folds Function
julia
optimal_number_folds(T::Integer, target_train_size::Integer,
                     target_n_test_paths::Integer; train_size_w::Number = 1,
                     n_test_paths_w::Number = 1, maxval::Number = 1e5) -> Tuple{Int, Int}

Find the optimal (n_folds, n_test_folds) pair for combinatorial cross-validation by minimising a weighted cost that balances the average training size against the number of test paths.

Arguments

  • T: Total number of observations in the dataset.

  • target_train_size: Desired average number of observations in each training set.

  • target_n_test_paths: Desired number of recombined test paths.

  • train_size_w: Weight applied to the training-size component of the cost (default 1).

  • n_test_paths_w: Weight applied to the test-paths component of the cost (default 1).

  • maxval: Early-exit threshold; a fold configuration whose cost exceeds maxval prunes subsequent higher n_test_folds values (default 1e5).

Returns

  • Tuple{Int, Int}: The optimal (n_folds, n_test_folds) pair minimising the weighted cost. Returns (0, 0) when no valid configuration is found.

Details

The cost function for a candidate (n_folds, n_test_folds) pair is:

$$\text{cost} = w_{\text{ntp}} \, \frac{\left| P(n,k) - P \right|}{P} + w_{\text{tr}} \, \frac{\left| \bar{T}(n,k) - T \right|}{T}$$

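where $n$ is n_folds, $k$ is n_test_folds, $P(n,k)$ is the number of test paths, $\bar{T}(n,k)$ is the average training size, $P$ is target_n_test_paths, $T$ is target_train_size (not the total number of observations, which the Arguments list also calls T), and $w_{\text{ntp}}$ and $w_{\text{tr}}$ are the weights n_test_paths_w and train_size_w.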
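The cost above can be evaluated by hand for a candidate pair. The sketch below reads it as a weighted sum of two relative absolute deviations, and reuses the standard path-count and average-train-size formulas as assumptions. All `sketch_*` names are hypothetical, and the search loop and `maxval` pruning of `optimal_number_folds` are not reproduced.

```julia
# Hypothetical helpers, assuming the standard combinatorial CV formulas.
sketch_n_test_paths(n, k) = binomial(n, k) * k ÷ n
sketch_average_train_size(T, n, k) = T / n * (n - k)

# Weighted relative deviation from the two targets.
function sketch_cost(T, n, k, target_train_size, target_n_test_paths;
                     train_size_w = 1, n_test_paths_w = 1)
    paths_term = abs(sketch_n_test_paths(n, k) - target_n_test_paths) /
                 target_n_test_paths
    train_term = abs(sketch_average_train_size(T, n, k) - target_train_size) /
                 target_train_size
    return n_test_paths_w * paths_term + train_size_w * train_term
end

# A pair that hits both targets exactly has zero cost.
cost_exact = sketch_cost(1000, 10, 8, 200, 36)   # 0.0

# A pair far from both targets: |9 - 36| / 36 + |800 - 200| / 200.
cost_far = sketch_cost(1000, 10, 2, 200, 36)     # 0.75 + 3.0
```

Because both terms are relative deviations, the weights express how much a given percentage miss on one target matters compared with the same percentage miss on the other.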
