Combinatorial
PortfolioOptimisers.CombinatorialCrossValidation Type
struct CombinatorialCrossValidation{__T_n_folds, __T_n_test_folds, __T_purged_size, __T_embargo_size} <: NonSequentialCrossValidationEstimator
Implements combinatorial non-sequential cross-validation with purging and embargoing, allowing for all possible combinations of test folds.
Fields
- n_folds: Number of folds.
- n_test_folds: Number of test folds.
- purged_size: Number of observations to purge between train and test sets.
- embargo_size: Number of observations to embargo after the test set.
Constructors
CombinatorialCrossValidation(;
    n_folds::Integer = 10,
    n_test_folds::Integer = 8,
    purged_size::Integer = 0,
    embargo_size::Integer = 0,
    warn_comb::Integer = 100_000,
) -> CombinatorialCrossValidation
Keyword arguments correspond to the struct's fields, except warn_comb, which sets the threshold for the combinations warning described under Validation.
Validation
- n_folds must be non-empty, greater than zero, and finite.
- n_test_folds must be non-empty, greater than zero, and finite.
- purged_size and embargo_size must be non-empty and finite.
- Warns if the number of combinations exceeds warn_comb.
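The warning condition can be pictured with a binomial coefficient. This is a minimal sketch, assuming "number of combinations" means the number of ways to choose n_test_folds test folds out of n_folds; the helper name n_combinations is hypothetical and not part of PortfolioOptimisers.

```julia
# Hypothetical helper, not part of the PortfolioOptimisers API.
# Assumption: the number of combinations is C(n_folds, n_test_folds).
n_combinations(n_folds::Integer, n_test_folds::Integer) = binomial(n_folds, n_test_folds)

# Constructor defaults: 10 folds, 8 test folds, warn_comb = 100_000.
n_splits = n_combinations(10, 8)
exceeds_default_threshold = n_splits > 100_000

# A larger configuration that would cross the default threshold.
large_exceeds = n_combinations(30, 15) > 100_000
```

Under that assumption the defaults stay far below the threshold, while a 30-fold, 15-test-fold configuration would not.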
Examples
julia> CombinatorialCrossValidation(; n_folds = 10, n_test_folds = 8, purged_size = 2,
embargo_size = 1)
CombinatorialCrossValidation
n_folds ┼ Int64: 10
n_test_folds ┼ Int64: 8
purged_size ┼ Int64: 2
embargo_size ┴ Int64: 1
PortfolioOptimisers.CombinatorialCrossValidationResult Type
struct CombinatorialCrossValidationResult{__T_train_idx, __T_test_idx, __T_path_ids} <: NonSequentialCrossValidationResult
Result type produced by CombinatorialCrossValidation after splitting data into combinatorial training and testing folds.
Stores the train index vectors, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.
Fields
- train_idx: Training set indices.
- test_idx: Test set indices.
- path_ids: Path identifiers for cross-validation splits.
Constructors
CombinatorialCrossValidationResult(;
    train_idx::VecVecInt,
    test_idx::VecVecVecInt,
    path_ids::AbstractMatrix{<:Integer}
) -> CombinatorialCrossValidationResult
Keyword arguments correspond to the struct's fields.
Validation
- !isempty(train_idx).
- !isempty(test_idx).
- !isempty(path_ids).
- length(train_idx) == length(test_idx) == size(path_ids, 2).
PortfolioOptimisers.CombCVER Type
const CombCVER = Union{<:CombinatorialCrossValidation,
                       <:CombinatorialCrossValidationResult}
Alias for a combinatorial cross-validation estimator or result.
Matches either a CombinatorialCrossValidation estimator or a CombinatorialCrossValidationResult.
Base.split Method
Base.split(ccv::CombinatorialCrossValidation, rd::ReturnsResult) -> CombinatorialCrossValidationResult
Split the returns data rd into all possible combinations of training and test folds using combinatorial cross-validation with optional purging and embargoing.
Arguments
- ccv::CombinatorialCrossValidation: Combinatorial cross-validation estimator.
- rd::ReturnsResult: Returns data to split.
Returns
CombinatorialCrossValidationResult: Result containing train indices, nested test index vectors (one per path), and a matrix of path IDs mapping folds to paths.
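The effect of purging and embargoing on a single split can be sketched in plain Julia. This is an illustration only, not the package's implementation: it assumes purging drops purged_size observations on each side of every test block and the embargo drops a further embargo_size observations after the block. The function name purged_train_indices is hypothetical.

```julia
# Hypothetical sketch; not the PortfolioOptimisers implementation.
# Assumption: purging removes `purged_size` observations on both sides of each
# test block, and the embargo removes `embargo_size` more after the block.
function purged_train_indices(T::Integer, test_blocks::Vector{UnitRange{Int}},
                              purged_size::Integer, embargo_size::Integer)
    keep = trues(T)                      # start with every observation kept
    for block in test_blocks
        lo = max(first(block) - purged_size, 1)
        hi = min(last(block) + purged_size + embargo_size, T)
        keep[lo:hi] .= false             # drop test block, purge zone, embargo
    end
    return findall(keep)                 # remaining indices form the training set
end

# 20 observations, one test block 5:8, purge 2, embargo 1.
train = purged_train_indices(20, [5:8], 2, 1)
```

With these inputs, observations 3 to 11 are excluded, so the training set is 1:2 followed by 12:20.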
PortfolioOptimisers.test_set_index Method
test_set_index(ccv)
Generate all test set index combinations for combinatorial cross-validation.
Returns a vector of test fold index combinations for ccv.
Arguments
- ccv: CombinatorialCrossValidation configuration.
Returns
- Vector of test index combinations.
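To make "all test set index combinations" concrete, the following sketch enumerates every way of choosing k test folds from n folds using only Base Julia. The function fold_combinations is hypothetical; the ordering and element type of what test_set_index actually returns are not stated on this page and may differ.

```julia
# Hypothetical helper, not part of the PortfolioOptimisers API.
# Enumerates all k-element subsets of 1:n in lexicographic order.
function fold_combinations(n::Integer, k::Integer)
    out = Vector{Vector{Int}}()
    combo = Int[]
    function extend!(start)
        if length(combo) == k
            push!(out, copy(combo))      # store a completed combination
            return
        end
        for i in start:n
            push!(combo, i)              # choose fold i
            extend!(i + 1)               # choose the remaining folds after i
            pop!(combo)                  # backtrack
        end
    end
    extend!(1)
    return out
end

combos = fold_combinations(4, 2)
```

For 4 folds and 2 test folds this yields the six pairs from [1, 2] to [3, 4].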
PortfolioOptimisers.binary_train_test_sets Method
binary_train_test_sets(ccv)
Generate binary train/test set assignment matrices for combinatorial cross-validation.
Returns a matrix indicating which samples are in train (0) and test (1) sets for each combination.
Arguments
- ccv: CombinatorialCrossValidation configuration.
Returns
- Binary train/test assignment matrix.
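A binary assignment matrix of this kind can be sketched at the fold level. The example below assumes one row per fold and one column per combination, with 1 marking a test fold and 0 a training fold; the layout used by binary_train_test_sets itself is not specified here, and binary_fold_matrix is a hypothetical name.

```julia
# Hypothetical sketch; the real matrix layout may differ.
# Rows are folds, columns are test-fold combinations; 1 = test, 0 = train.
function binary_fold_matrix(n_folds::Integer, n_test_folds::Integer)
    # Each bitmask with exactly n_test_folds set bits is one combination.
    masks = [m for m in 0:(2^n_folds - 1) if count_ones(m) == n_test_folds]
    B = zeros(Int, n_folds, length(masks))
    for (j, m) in enumerate(masks), i in 1:n_folds
        B[i, j] = (m >> (i - 1)) & 1     # bit i-1 says whether fold i is a test fold
    end
    return B
end

B = binary_fold_matrix(4, 2)
```

Every column then sums to n_test_folds, and every fold is a test fold in the same number of combinations.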
PortfolioOptimisers.get_path_ids Method
get_path_ids(ccv)
Get path identifiers for each test combination in combinatorial cross-validation.
Returns the path assignment for each test combination, mapping combinations to their recombined paths.
Arguments
- ccv: CombinatorialCrossValidation configuration.
Returns
- Vector of path IDs.
PortfolioOptimisers.n_test_paths Function
n_test_paths(n_folds, n_test_folds)
Compute the number of test paths in combinatorial cross-validation.
Returns the number of unique recombined test paths from n_folds folds choosing n_test_folds test folds. Also accepts a CombinatorialCrossValidation object directly.
Arguments
- n_folds: Total number of folds.
- n_test_folds: Number of test folds per combination.
Returns
- Integer number of test paths.
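The count of recombined paths has a well-known closed form in combinatorial purged cross-validation: each of the C(n_folds, n_test_folds) combinations tests n_test_folds folds, and a full path needs one test result per fold. The sketch below assumes n_test_paths follows that standard formula; the name n_test_paths_sketch is hypothetical.

```julia
# Hypothetical sketch, assuming the standard combinatorial CV formula:
# paths = C(n_folds, n_test_folds) * n_test_folds / n_folds.
function n_test_paths_sketch(n_folds::Integer, n_test_folds::Integer)
    return binomial(n_folds, n_test_folds) * n_test_folds ÷ n_folds
end

default_paths = n_test_paths_sketch(10, 8)   # constructor defaults
small_paths = n_test_paths_sketch(4, 2)
```

The same value can be written as C(n_folds - 1, n_test_folds - 1), which makes it clear the division is always exact.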
PortfolioOptimisers.average_train_size Function
average_train_size(T, n_folds, n_test_folds)
Compute the average training set size for combinatorial cross-validation.
Arguments
- T: Total number of observations.
- n_folds: Total number of folds.
- n_test_folds: Number of test folds per combination.
Returns
- Average number of training observations per fold.
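As a rough guide to the magnitude of this quantity, the sketch below assumes equally sized folds and ignores purging and embargoing, so the training set is simply the n_folds - n_test_folds folds not used for testing. Whether average_train_size uses exactly this expression is an assumption; average_train_size_sketch is a hypothetical name.

```julia
# Hypothetical sketch: equal-size folds, no purging or embargo.
# Training observations = (T / n_folds) * (n_folds - n_test_folds).
average_train_size_sketch(T, n_folds, n_test_folds) = T / n_folds * (n_folds - n_test_folds)

avg = average_train_size_sketch(1000, 10, 8)
```

With 1000 observations, 10 folds and 8 test folds, two folds of 100 observations each remain for training.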
PortfolioOptimisers.recombined_paths Function
recombined_paths(ccv)
Generate the recombined test paths for combinatorial cross-validation.
Returns a vector of vectors representing the recombined test paths: sequences of test fold indices that together cover the entire dataset.
Arguments
- ccv: CombinatorialCrossValidation configuration.
Returns
- Vector of recombined path index vectors.
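One common way to recombine paths is sketched below, under stated assumptions: for each fold, list the combinations in which it is a test fold, then let path p take the p-th such combination for every fold. Each path then has one entry per fold and covers the whole dataset. The output layout of recombined_paths may differ, and recombined_paths_sketch is a hypothetical name.

```julia
# Hypothetical sketch; not the PortfolioOptimisers implementation.
# Path p holds, for every fold, the index of the p-th combination that tests it.
function recombined_paths_sketch(n_folds::Integer, n_test_folds::Integer)
    # Each bitmask with exactly n_test_folds set bits is one combination.
    masks = [m for m in 0:(2^n_folds - 1) if count_ones(m) == n_test_folds]
    # For each fold, the combinations in which that fold is a test fold.
    per_fold = [[j for (j, m) in enumerate(masks) if ((m >> (f - 1)) & 1) == 1]
                for f in 1:n_folds]
    n_paths = length(first(per_fold))
    return [[per_fold[f][p] for f in 1:n_folds] for p in 1:n_paths]
end

paths = recombined_paths_sketch(4, 2)
```

For 4 folds and 2 test folds this gives 3 paths of length 4, and every combination index appears exactly twice across the paths, once for each fold it tests.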
PortfolioOptimisers.optimal_number_folds Function
optimal_number_folds(T::Integer, target_train_size::Integer,
                     target_n_test_paths::Integer; train_size_w::Number = 1,
                     n_test_paths_w::Number = 1, maxval::Number = 1e5) -> Tuple{Int, Int}
Find the optimal (n_folds, n_test_folds) pair for combinatorial cross-validation by minimising a weighted cost that balances the average training size against the number of test paths.
Mathematical definition
The cost for a candidate (n_folds, n_test_folds) pair combines two weighted components, one for the number of test paths and one for the average training size. It is defined in terms of:
- the weight on the test-paths component (n_test_paths_w);
- the weight on the training-size component (train_size_w);
- the number of test paths for n_folds folds and n_test_folds test folds;
- the average training size for n_folds folds and n_test_folds test folds;
- the target number of test paths (target_n_test_paths);
- the target training size (target_train_size).
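One form consistent with the quantities listed above is given below. It assumes each component is the absolute deviation from its target, scaled by that target; this scaling is an assumption, as are all the symbols: $w_p$ and $w_s$ for the two weights, $\phi(n, k)$ for the number of test paths and $\bar{s}(n, k)$ for the average training size with $n$ folds and $k$ test folds, and $\phi^{*}$, $s^{*}$ for the targets.

```latex
C(n, k) = w_p \, \frac{\lvert \phi(n, k) - \phi^{*} \rvert}{\phi^{*}}
        + w_s \, \frac{\lvert \bar{s}(n, k) - s^{*} \rvert}{s^{*}}
```

Under this form the cost is zero exactly when both targets are met, and the weights trade off relative misses on the two components.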
Arguments
- T: Total number of observations in the dataset.
- target_train_size: Desired average number of observations in each training set.
- target_n_test_paths: Desired number of recombined test paths.
- train_size_w: Weight applied to the training-size component of the cost (default 1).
- n_test_paths_w: Weight applied to the test-paths component of the cost (default 1).
- maxval: Early-exit threshold; a fold configuration whose cost exceeds maxval prunes subsequent higher n_test_folds values (default 1e5).
Returns
Tuple{Int, Int}: The optimal (n_folds, n_test_folds) pair minimising the weighted cost. Returns (0, 0) when no valid configuration is found.
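The search can be illustrated with a brute-force sketch. It is not the package's algorithm: it assumes a cost built from relative absolute deviations from each target, uses the standard path-count formula and an equal-fold training size, omits the maxval pruning, and caps the search with a hypothetical max_folds keyword. All function names here are hypothetical.

```julia
# Hypothetical sketch; not the PortfolioOptimisers implementation.
paths_count(n, k) = binomial(n, k) * k ÷ n          # assumed path-count formula
train_size(T, n, k) = T / n * (n - k)               # assumed equal-size folds

# Assumed cost: weighted relative absolute deviations from the two targets.
function fold_cost(T, n, k, target_train, target_paths; train_w = 1, paths_w = 1)
    return paths_w * abs(paths_count(n, k) - target_paths) / target_paths +
           train_w * abs(train_size(T, n, k) - target_train) / target_train
end

# Exhaustive search over (n_folds, n_test_folds); `max_folds` is a
# hypothetical cap that keeps the binomial coefficients small.
function best_folds(T, target_train, target_paths; max_folds = 20)
    best, best_cost = (0, 0), Inf
    for n in 3:min(T, max_folds), k in 2:(n - 1)
        c = fold_cost(T, n, k, target_train, target_paths)
        if c < best_cost
            best, best_cost = (n, k), c
        end
    end
    return best
end

result = best_folds(1000, 200, 36)
```

With 1000 observations, a target training size of 200 and a target of 36 test paths, the pair (10, 8) meets both targets exactly, so it is the minimiser under this assumed cost. The (0, 0) initial value mirrors the documented return when no valid configuration is found.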