Combinatorial
PortfolioOptimisers.CombinatorialCrossValidation Type
struct CombinatorialCrossValidation{__T_n_folds, __T_n_test_folds, __T_purged_size, __T_embargo_size} <: NonSequentialCrossValidationEstimator
Implements combinatorial non-sequential cross-validation with purging and embargoing, allowing for all possible combinations of test folds.
Fields
- n_folds: Number of folds to split the data into.
- n_test_folds: Number of folds to use as test set in each split.
- purged_size: Number of observations to exclude from the start/end of each train set adjacent to a test set.
- embargo_size: Number of observations to exclude from the start of each train set after a test set.
Constructors
CombinatorialCrossValidation(;
n_folds::Integer = 10,
n_test_folds::Integer = 8,
purged_size::Integer = 0,
embargo_size::Integer = 0,
warn_comb::Integer = 100_000,
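) -> CombinatorialCrossValidation
Keyword arguments correspond to the struct's fields, except warn_comb, which only sets the threshold for the combination-count warning described under Validation.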
Validation
- n_folds must be non-empty, greater than zero, and finite.
- n_test_folds must be non-empty, greater than zero, and finite.
- purged_size and embargo_size must be non-empty and finite.
- Warns if the number of combinations exceeds warn_comb.
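To get a feel for when the warn_comb warning would trigger, the number of splits can be counted before constructing the estimator. The sketch below assumes the number of combinations is the binomial coefficient of n_folds over n_test_folds; `n_splits_sketch` is a hypothetical helper for illustration, not a function of the package.

```julia
# Hypothetical helper (not part of PortfolioOptimisers): number of train/test
# splits when every choice of `n_test_folds` folds out of `n_folds` is used
# as a test set. Assumes the count is the plain binomial coefficient.
n_splits_sketch(n_folds::Integer, n_test_folds::Integer) =
    binomial(n_folds, n_test_folds)

n_splits_sketch(10, 8)   # 45
n_splits_sketch(30, 15)  # 155117520, far above the default warn_comb of 100_000
```

The count grows quickly as n_test_folds approaches n_folds / 2, which is where such a warning is most likely to appear.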
Examples
julia> CombinatorialCrossValidation(; n_folds = 10, n_test_folds = 8, purged_size = 2,
embargo_size = 1)
CombinatorialCrossValidation
n_folds ┼ Int64: 10
n_test_folds ┼ Int64: 8
purged_size ┼ Int64: 2
embargo_size ┴ Int64: 1
Related
PortfolioOptimisers.CombinatorialCrossValidationResult Type
struct CombinatorialCrossValidationResult{__T_train_idx, __T_test_idx, __T_path_ids} <: NonSequentialCrossValidationResult
Result type produced by CombinatorialCrossValidation after splitting data into combinatorial training and testing folds.
Stores the train index vectors, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.
Fields
- train_idx: Vector of training index ranges for each split.
- test_idx: Vector of vectors of testing index ranges (one per path per split).
- path_ids: Matrix mapping fold indices to path indices.
Related
Base.split Method
Base.split(ccv::CombinatorialCrossValidation, rd::ReturnsResult) -> CombinatorialCrossValidationResult
Split the returns data rd into all possible combinations of training and test folds using combinatorial cross-validation with optional purging and embargoing.
Arguments
- ccv::CombinatorialCrossValidation: Combinatorial cross-validation estimator.
- rd::ReturnsResult: Returns data to split.
Returns
CombinatorialCrossValidationResult: Result containing train indices, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.
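The splitting logic can be illustrated with a small standalone sketch in plain Julia. Everything here is an assumption made for illustration: the helper names (`fold_ranges`, `subsets`, `combinatorial_split_sketch`), the near-equal contiguous folds, and the exact purging and embargo rules are not taken from the package source, and the sketch works on an observation count rather than a ReturnsResult.

```julia
# Illustrative sketch only; not the PortfolioOptimisers implementation.

# Contiguous, near-equal fold ranges covering 1:T.
function fold_ranges(T::Integer, n_folds::Integer)
    bounds = round.(Int, range(0, T; length = n_folds + 1))
    return [(bounds[i] + 1):bounds[i + 1] for i in 1:n_folds]
end

# All k-element subsets of 1:n in lexicographic order.
function subsets(n::Integer, k::Integer, start::Integer = 1)
    k == 0 && return [Int[]]
    out = Vector{Int}[]
    for i in start:(n - k + 1), rest in subsets(n, k - 1, i + 1)
        push!(out, vcat(i, rest))
    end
    return out
end

function combinatorial_split_sketch(T, n_folds, n_test_folds;
                                    purged_size = 0, embargo_size = 0)
    folds = fold_ranges(T, n_folds)
    train_idx = Vector{Int}[]
    test_idx = Vector{UnitRange{Int}}[]
    for combo in subsets(n_folds, n_test_folds)
        keep = trues(T)
        for f in combo
            r = folds[f]
            # Drop the test fold plus `purged_size` observations on either side.
            keep[max(first(r) - purged_size, 1):min(last(r) + purged_size, T)] .= false
            # Drop `embargo_size` further observations after the purged region.
            lo = last(r) + purged_size + 1
            keep[lo:min(lo + embargo_size - 1, T)] .= false
        end
        push!(train_idx, findall(keep))
        push!(test_idx, folds[combo])
    end
    return (train_idx = train_idx, test_idx = test_idx)
end

res = combinatorial_split_sketch(100, 5, 2; purged_size = 2, embargo_size = 1)
length(res.train_idx)  # 10 splits, one per pair of test folds
```

In this sketch the first split tests on folds 1:20 and 21:40, so with purging and embargo the training indices start at observation 44.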
Related
PortfolioOptimisers.n_test_paths Function
n_test_paths(n_folds, n_test_folds)
Compute the number of test paths in combinatorial cross-validation.
Returns the number of unique recombined test paths from n_folds folds choosing n_test_folds test folds. Also accepts a CombinatorialCrossValidation object directly.
Arguments
- n_folds: Total number of folds.
- n_test_folds: Number of test folds per combination.
Returns
- Integer number of test paths.
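As a worked check of the count, the sketch below assumes the standard combinatorial purged cross-validation result: each fold is a test fold in the same number of splits, so the number of paths is the number of splits times n_test_folds divided by n_folds. The name `n_test_paths_sketch` is hypothetical and chosen to avoid clashing with the package function.

```julia
# Hypothetical re-derivation, assuming the standard CPCV path count:
# paths = binomial(n_folds, n_test_folds) * n_test_folds / n_folds.
# The division is always exact, so integer division is safe here.
n_test_paths_sketch(n_folds::Integer, n_test_folds::Integer) =
    binomial(n_folds, n_test_folds) * n_test_folds ÷ n_folds

n_test_paths_sketch(10, 8)  # 36
n_test_paths_sketch(6, 2)   # 5
```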
Related
PortfolioOptimisers.average_train_size Function
average_train_size(T, n_folds, n_test_folds)
Compute the average training set size for combinatorial cross-validation.
Arguments
- T: Total number of observations.
- n_folds: Total number of folds.
- n_test_folds: Number of test folds per combination.
Returns
- Average number of training observations per fold.
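A rough version of this quantity can be written in one line, assuming equal-sized folds and ignoring purging and embargoing: each split trains on the n_folds - n_test_folds folds that are not held out. The name `average_train_size_sketch` is hypothetical, and the package may compute the value differently.

```julia
# Hypothetical sketch: average training size with equal-sized folds,
# ignoring any observations removed by purging or embargoing.
average_train_size_sketch(T::Integer, n_folds::Integer, n_test_folds::Integer) =
    T / n_folds * (n_folds - n_test_folds)

average_train_size_sketch(1000, 10, 8)  # 200.0
```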
Related
PortfolioOptimisers.recombined_paths Function
recombined_paths(ccv)
Generate the recombined test paths for combinatorial cross-validation.
Returns a vector of vectors representing the recombined test paths — sequences of test fold indices that together cover the entire dataset.
Arguments
- ccv: CombinatorialCrossValidation configuration.
Returns
- Vector of recombined path index vectors.
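One way to picture the recombination is as a bookkeeping pass over the splits: every time a fold appears as a test fold, its prediction is assigned to the next path that still lacks that fold. The sketch below is an assumption about the idea, not the package's algorithm or output layout; `subsets` and `recombined_paths_sketch` are hypothetical names, and each returned path stores, per fold, the index of the split that supplies it.

```julia
# Illustrative sketch only; the layout of the real result may differ.

# All k-element subsets of 1:n in lexicographic order.
function subsets(n::Integer, k::Integer, start::Integer = 1)
    k == 0 && return [Int[]]
    out = Vector{Int}[]
    for i in start:(n - k + 1), rest in subsets(n, k - 1, i + 1)
        push!(out, vcat(i, rest))
    end
    return out
end

function recombined_paths_sketch(n_folds::Integer, n_test_folds::Integer)
    combos = subsets(n_folds, n_test_folds)
    n_paths = binomial(n_folds, n_test_folds) * n_test_folds ÷ n_folds
    paths = [zeros(Int, n_folds) for _ in 1:n_paths]
    seen = zeros(Int, n_folds)   # times each fold has been a test fold so far
    for (s, combo) in enumerate(combos), f in combo
        seen[f] += 1
        paths[seen[f]][f] = s    # path number seen[f] takes fold f from split s
    end
    return paths
end

recombined_paths_sketch(4, 2)  # [[1, 1, 2, 3], [2, 4, 4, 5], [3, 5, 6, 6]]
```

Each path contains one entry per fold, so stitching the corresponding test predictions together covers the whole dataset exactly once.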
Related
PortfolioOptimisers.optimal_number_folds Function
optimal_number_folds(T::Integer, target_train_size::Integer,
target_n_test_paths::Integer; train_size_w::Number = 1,
n_test_paths_w::Number = 1, maxval::Number = 1e5) -> Tuple{Int, Int}
Find the optimal (n_folds, n_test_folds) pair for combinatorial cross-validation by minimising a weighted cost that balances the average training size against the number of test paths.
Arguments
- T: Total number of observations in the dataset.
- target_train_size: Desired average number of observations in each training set.
- target_n_test_paths: Desired number of recombined test paths.
- train_size_w: Weight applied to the training-size component of the cost (default 1).
- n_test_paths_w: Weight applied to the test-paths component of the cost (default 1).
- maxval: Early-exit threshold; a fold configuration whose cost exceeds maxval prunes subsequent higher n_test_folds values (default 1e5).
Returns
Tuple{Int, Int}: The optimal (n_folds, n_test_folds) pair minimising the weighted cost. Returns (0, 0) when no valid configuration is found.
Details
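The cost function for a candidate (n_folds, n_test_folds) pair is a weighted sum of two components: the deviation of the number of test paths from target_n_test_paths, weighted by n_test_paths_w, and the deviation of the average training size from target_train_size, weighted by train_size_w.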
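One plausible form of such a cost, assumed here for illustration rather than taken from the package source, uses relative absolute deviations from the two targets. The symbols are introduced only for this sketch: P is the number of test paths and S the average training size for the candidate pair, P* is target_n_test_paths, S* is target_train_size, and w_P and w_S stand for n_test_paths_w and train_size_w.

```latex
\mathrm{cost}(n_{\mathrm{folds}}, n_{\mathrm{test\_folds}})
  = w_P \, \frac{\lvert P - P^{*} \rvert}{P^{*}}
  + w_S \, \frac{\lvert S - S^{*} \rvert}{S^{*}}
```

Under this assumed form, a pair that hits both targets exactly has zero cost, and the weights set how strongly a miss on one target is traded against a miss on the other.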
Related