x_valid module

This module contains functions used for the validation of ArchPy models. For now, validation relies mainly on cross-validation through the k-fold method.

x_valid.X_valid(ArchTable, k=3, nreal_un=5, nreal_fa=2, plot=True, brier=True, proba_correct=True, aggregate_method=None, save_figs=False, fig_dir=None, weighting_method='same_weights', dic_weights=None, parallel=False, seed=15, folding_method='random', aspect='auto', verbose=1, **kwargs)

Perform a cross-validation on the given ArchTable.

Parameters:

ArchTable (base.Arch_table object) – The ArchTable to perform the X-validation on
k (int) – Number of folds
nreal_un (int) – Number of unit realizations to estimate score
nreal_fa (int) – Number of facies realizations to estimate score
brier (bool) – Return Brier scores
proba_correct (bool) – Return proportion of correct cells per unit/facies
aggregate_method (str or None) – Method used to perform the cross-validation on a mean model rather than on individual realizations. This parameter is passed to the realizations_aggregation function. Ignored if None.
weighting_method (str) – Method used to apply the weights. Possible values: same_weights, prop_weights, user_weights
folding_method (str) – Folding method used to split the data. Available methods: “random”, “k_means” (to be implemented), “stratified” (to be implemented)
plot (bool) – Display plots
seed (int) – Seed
verbose (0 or 1) – Verbosity level

Returns:

(score_folds, df_conf_norm_un, df_conf_norm_fa) where score_folds is a list of size k containing the scores for each fold.

Each entry of the score_folds list is a tuple containing various scores:

dictionary of final scores (units/facies and Brier/probability of correct classification)

separate Brier scores for each borehole (units and facies)

proportion of correct cells for each borehole (units and facies)

list of test boreholes

list of train boreholes (nreal, nreal_fa, n_boreholes)

df_conf_norm_un is a dataframe containing the normalized confusion matrix for each unit; df_conf_norm_fa is a dataframe containing the normalized confusion matrix for each facies.

Return type:

tuple of size 3
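
As an illustration of the “random” folding method, the sketch below shuffles a list of borehole names with a fixed seed and deals them into k folds, each fold serving once as the test set. It is a minimal standard-library sketch, not ArchPy's implementation: the helper names random_k_folds and train_test_splits and the borehole names are hypothetical, and only the defaults k=3 and seed=15 come from the signature above.

```python
import random


def random_k_folds(items, k=3, seed=15):
    """Shuffle items reproducibly and deal them into k nearly equal folds."""
    if not 1 < k <= len(items):
        raise ValueError("k must be between 2 and len(items)")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Round-robin slicing: fold j takes elements j, j + k, j + 2k, ...
    return [shuffled[j::k] for j in range(k)]


def train_test_splits(items, k=3, seed=15):
    """Yield (train, test) pairs, using each fold once as the test set."""
    folds = random_k_folds(items, k, seed)
    for j, test in enumerate(folds):
        train = [x for m, fold in enumerate(folds) if m != j for x in fold]
        yield train, test


# Hypothetical borehole names, used only for illustration.
borehole_names = [f"bh{n}" for n in range(7)]
splits = list(train_test_splits(borehole_names, k=3, seed=15))
```

Every borehole appears in exactly one test set, so each observation is used for testing once and for training k - 1 times.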

x_valid.bh2array(ArchTable, bh, typ='units')

Convert an ArchPy borehole to an array of IDs, either unit IDs or facies IDs. The values are given by the ID of each unit/facies.

Parameters:

ArchTable (ArchPy.base.Arch_table) – ArchPy table
bh (ArchPy.base.borehole) – ArchPy borehole
typ (str) – type of data to extract, either “units” or “facies”

Returns:

array of IDs

Return type:

ndarray

x_valid.brier_func(p, i)

Compute the Brier score given a vector of probabilities

Parameters:

p (sequence of float) – vector of probabilities. Values must be between 0 and 1
i (int) – index of the true answer in vector

Returns:

Brier score

Return type:

float
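
For reference, the sketch below computes the multi-class Brier score in its original form: the sum over classes of the squared difference between the predicted probability and the one-hot outcome. This is an assumption about the definition, since brier_func may normalize differently (for example by dividing by the number of classes). The name brier_score_sketch is hypothetical.

```python
def brier_score_sketch(p, i):
    """Multi-class Brier score of one prediction (unnormalized sum form)."""
    if not 0 <= i < len(p):
        raise IndexError("i must index an entry of p")
    if any(q < 0 or q > 1 for q in p):
        raise ValueError("probabilities must lie in [0, 1]")
    # One-hot outcome: 1 for the true class, 0 for every other class.
    return sum((q - (1.0 if j == i else 0.0)) ** 2 for j, q in enumerate(p))


# A prediction that puts all probability on the true class.
perfect = brier_score_sketch([0.0, 1.0, 0.0], 1)
# An uninformative prediction over four classes.
uniform = brier_score_sketch([0.25, 0.25, 0.25, 0.25], 0)
```

With this definition the score ranges from 0 (all probability on the true class) to 2 (all probability on a single wrong class), and lower is better.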

x_valid.img_to_3D_colors(arr, dic)

Replace values in an array using a dictionary linking actual values to new values

Parameters:

arr (ndarray) – array of values to replace
dic (dict) – dictionary linking actual values to new values

Returns:

array of new values

Return type:

ndarray
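
The kind of dictionary-driven replacement described above can be sketched with NumPy boolean masks. This is only an illustration of the idea with scalar replacement values, not the actual body of img_to_3D_colors. The name replace_values_sketch and the sample arrays are hypothetical.

```python
import numpy as np


def replace_values_sketch(arr, dic):
    """Return a copy of arr where each key of dic is replaced by its value."""
    arr = np.asarray(arr)
    out = arr.copy()
    for old, new in dic.items():
        # Build the mask from the original array so that an already
        # replaced cell is never replaced a second time (e.g. when swapping).
        out[arr == old] = new
    return out


# Hypothetical ID grid recoded to new values.
recoded = replace_values_sketch(np.array([[1, 2], [2, 3]]), {1: 10, 2: 20, 3: 30})
# Swapping two values works because masks refer to the original array.
swapped = replace_values_sketch(np.array([1, 2, 1]), {1: 2, 2: 1})
```

Values absent from the dictionary are left unchanged, and the output keeps the dtype of the input, so the new values should be representable in that dtype.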

x_valid.plot_confusion_matrix(df, title='Confusion matrix', cmap='plasma')

x_valid.test_bh_1fold(ArchTable, bhs_real, bhs_test, weighting_method='same_weight', dic_weights=None, plot=True, save_figs=False, fig_dir=None, brier=True, proba_correct=False, aspect='auto')

Perform a test on a single fold of a cross-validation

Parameters:

ArchTable (ArchPy.base.Arch_table) – ArchPy table to test
bhs_real (ndarray of size (nreal_units, nreal_fa, n_boreholes) of ArchPy.base.borehole) – “fake” boreholes sampled in the realizations of the ArchTable
bhs_test (sequence of ArchPy.base.borehole) – “real” boreholes to test
weighting_method (str) – method to weight the realizations, either “same_weight” or “weights”
dic_weights (dict) – dictionary linking realization index to weight
plot (bool) – plot or not
save_figs (bool) – save figures or not
fig_dir (str) – directory to save figures
brier (bool) – compute Brier score or not
proba_correct (bool) – compute probability of correct classification or not
aspect (str) – aspect ratio of the plot

Returns:

dictionary of results

Return type:

dict