x_valid module

This module contains functions used to validate ArchPy models. For now, validation relies mainly on k-fold cross-validation.

x_valid.X_valid(ArchTable, k=3, nreal_un=5, nreal_fa=2, plot=True, brier=True, proba_correct=True, aggregate_method=None, save_figs=False, fig_dir=None, weighting_method='same_weights', dic_weights=None, parallel=False, seed=15, folding_method='random', aspect='auto', verbose=1, **kwargs)

Perform a k-fold cross-validation on the given ArchTable

Parameters:
  • ArchTable (base.Arch_table object) – The ArchTable to perform the X-validation on

  • k (int) – Number of folds

  • nreal_un (int) – Number of unit realizations used to estimate the score

  • nreal_fa (int) – Number of facies realizations used to estimate the score

  • brier (bool) – Return brier scores

  • proba_correct (bool) – Return proportion of correct cells per units/facies

  • aggregate_method (str or None) – Perform the cross-validation on the mean model rather than on the individual realizations. This parameter is passed to the realizations_aggregation function. Ignored if None

  • weighting_method (str) – Method used to apply the weights. Possible values are: same_weights, prop_weights, user_weights

  • folding_method (str) – Folding method used to split the data. Available methods: “random”, “k_means” (to be implemented), “stratified” (to be implemented)

  • plot (bool) – Display plots

  • seed (int) – Seed for the random number generator

  • verbose (0 or 1) – Verbosity level

Returns:

(score_folds, df_conf_norm_un, df_conf_norm_fa) where score_folds is a list of size k containing the scores for each fold.

Each entry of the score_folds list is a tuple containing various scores:

  • dictionary of final scores (units/facies and Brier/probability of correct classification)

  • distinct Brier scores for each borehole (units and facies)

  • proportion of correct cells for each borehole (units and facies)

  • list of test boreholes

  • list of train boreholes (nreal, nreal_fa, n_boreholes)

df_conf_norm_un is a dataframe containing the normalized confusion matrix for each unit. df_conf_norm_fa is a dataframe containing the normalized confusion matrix for each facies.

Return type:

tuple of size 3
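
As an illustration of the “random” folding method, the sketch below shuffles item indices (for example, borehole indices) with a fixed seed and splits them into k test folds, each paired with the remaining indices as the training set. The helper name random_folds and the use of NumPy's default_rng are assumptions made for this example; the actual splitting code in ArchPy may differ.

```python
import numpy as np


def random_folds(n_items, k=3, seed=15):
    """Hypothetical helper: split indices 0..n_items-1 into k random folds.

    Returns a list of (train_idx, test_idx) tuples, one per fold.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_items)

    # array_split tolerates n_items not being divisible by k
    test_folds = np.array_split(shuffled, k)

    folds = []
    for test_idx in test_folds:
        # Every index that is not in the test fold is used for training
        train_idx = np.setdiff1d(shuffled, test_idx)
        folds.append((train_idx, test_idx))
    return folds


folds = random_folds(10, k=3, seed=15)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each index appears in exactly one test fold, so every item is held out once over the k folds.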

x_valid.bh2array(ArchTable, bh, typ='units')

Convert an ArchPy borehole to an array of unit or facies IDs. The values in the array are the IDs of the corresponding units or facies

Parameters:
  • ArchTable (ArchPy.base.Arch_table) – ArchPy table

  • bh (ArchPy.base.borehole) – ArchPy borehole

  • typ (str) – type of data to extract, either “units” or “facies”

Returns:

array of IDs

Return type:

ndarray

x_valid.brier_func(p, i)

Compute the Brier score given a vector of probabilities

Parameters:
  • p (sequence of float) – vector of probabilities. Values must be between 0 and 1

  • i (int) – index of the true answer in p

Returns:

Brier score

Return type:

float
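
For reference, a minimal sketch of the standard multi-class Brier score is given below: the sum of squared differences between the predicted probabilities and a one-hot vector marking the true class. The function name brier_score is hypothetical, and whether brier_func applies any additional normalization is not stated here, so treat this as the textbook definition rather than the exact ArchPy implementation.

```python
import numpy as np


def brier_score(p, i):
    """Hypothetical sketch: multi-class Brier score for one prediction.

    p : sequence of float, predicted probabilities (each in [0, 1])
    i : int, index of the true class in p
    """
    p = np.asarray(p, dtype=float)
    if np.any(p < 0) or np.any(p > 1):
        raise ValueError("probabilities must be between 0 and 1")

    # One-hot encoding of the observed outcome
    outcome = np.zeros_like(p)
    outcome[i] = 1.0

    return float(np.sum((p - outcome) ** 2))


# 0.0 for a perfect prediction, 2.0 for full confidence in a wrong class
print(brier_score([0.7, 0.2, 0.1], 0))
```

Lower values indicate better predictions under this definition.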

x_valid.img_to_3D_colors(arr, dic)

Replace values in an array using a dictionary linking current values to new values

Parameters:
  • arr (ndarray) – array of values to replace

  • dic (dict) – dictionary linking current values to new values

Returns:

array of new values

Return type:

ndarray
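
The dictionary-based replacement described above can be sketched with boolean masks, as shown below. The helper name replace_values is hypothetical and only handles scalar replacement values; the actual img_to_3D_colors function may behave differently (for instance, its name suggests it could map IDs to colors).

```python
import numpy as np


def replace_values(arr, dic):
    """Hypothetical sketch: replace values of arr according to dic.

    Values that are not keys of dic are left unchanged.
    """
    arr = np.asarray(arr)
    out = arr.copy()
    for old, new in dic.items():
        # Masks are computed on the original array, so a replacement
        # is never itself replaced by a later dictionary entry
        out[arr == old] = new
    return out


ids = np.array([[1, 2], [2, 3]])
print(replace_values(ids, {1: 10, 2: 20}))
```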

x_valid.plot_confusion_matrix(df, title='Confusion matrix', cmap='plasma')

x_valid.test_bh_1fold(ArchTable, bhs_real, bhs_test, weighting_method='same_weight', dic_weights=None, plot=True, save_figs=False, fig_dir=None, brier=True, proba_correct=False, aspect='auto')

Perform a test on a single fold of a cross-validation

Parameters:
  • ArchTable (ArchPy.base.Arch_table) – ArchPy table to test

  • bhs_real (ndarray of size (nreal_units, nreal_fa, n_boreholes) of ArchPy.base.borehole) – “fake” boreholes sampled in the realizations of the ArchTable

  • bhs_test (sequence of ArchPy.base.borehole) – “real” boreholes to test

  • weighting_method (str) – method to weight the realizations, either “same_weight” or “weights”

  • dic_weights (dict) – dictionary linking realization index to weight

  • plot (bool) – plot or not

  • save_figs (bool) – save figures or not

  • fig_dir (str) – directory to save figures

  • brier (bool) – compute Brier score or not

  • proba_correct (bool) – compute probability of correct classification or not

  • aspect (str) – aspect ratio of the plot

Returns:

dictionary of results

Return type:

dict
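
To illustrate how realization weights could enter the scores, the sketch below turns a stack of simulated ID arrays (one row per realization, one column per cell along a borehole) into per-cell class probabilities, using either equal weights or user-supplied weights. The function name weighted_class_probabilities and the array layout are assumptions made for this example and are not taken from the ArchPy source.

```python
import numpy as np


def weighted_class_probabilities(ids, class_ids, weights=None):
    """Hypothetical sketch: per-cell class probabilities from realizations.

    ids       : array of shape (n_real, n_cells) with simulated IDs
    class_ids : sequence of the possible IDs
    weights   : optional sequence of n_real weights (equal weights if None)

    Returns an array of shape (n_classes, n_cells).
    """
    ids = np.asarray(ids)
    n_real = ids.shape[0]

    if weights is None:
        w = np.full(n_real, 1.0 / n_real)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()  # normalize so the probabilities sum to 1

    # For each class, sum the weights of the realizations that
    # simulated this class in each cell
    return np.stack([(w[:, None] * (ids == c)).sum(axis=0) for c in class_ids])


sim = np.array([[1, 1, 2],
                [1, 2, 2]])
print(weighted_class_probabilities(sim, class_ids=[1, 2]))
print(weighted_class_probabilities(sim, class_ids=[1, 2], weights=[3, 1]))
```

Each column of the result is a probability vector for one cell, which is the kind of input a Brier score computation expects.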