gala.evaluate: Segmentation evaluation¶

gala.evaluate.
adapted_rand_error
(seg, gt, all_stats=False)[source]¶ Compute Adapted Rand error as defined by the SNEMI3D contest [1]
Formula is given as 1  the maximal Fscore of the Rand index (excluding the zero component of the original labels). Adapted from the SNEMI3D MATLAB script, hence the strange style.Parameters: seg : np.ndarray
the segmentation to score, where each value is the label at that point
 gt : np.ndarray, same shape as seg
the groundtruth to score against, where each value is a label
 all_stats : boolean, optional
whether to also return precision and recall as a 3tuple with rand_error
Returns: are : float
The adapted Rand error; equal to $1 
rac{2pr}{p + r}$,
where $p$ and $r$ are the precision and recall described below.
 prec : float, optional
The adapted Rand precision. (Only returned when all_stats is
True
.) rec : float, optional
The adapted Rand recall. (Only returned when all_stats is
True
.)

gala.evaluate.
adj_rand_index
(x, y=None)[source]¶ Return the adjusted Rand index.
The Adjusted Rand Index (ARI) is the deviation of the Rand Index from the expected value if the marginal distributions of the contingency table were independent. Its value ranges from 1 (perfectly correlated marginals) to 1 (perfectly anticorrelated).
Parameters: x, y : np.ndarray
Either x and y are provided as equalshaped np.ndarray label fields (int type), or y is not provided and x is a contingency table (sparse.csc_matrix) that is not normalised to sum to 1.
Returns: ari : float
The adjusted Rand index of x and y.

gala.evaluate.
assignment_table
(seg_or_ctable, gt=None, *, dtype=<class 'numpy.bool_'>)[source]¶ Create an assignment table of value in seg to gt.
Parameters: seg_or_ctable : array of int, or 2D array of float
The segmentation to assign. Every value in seg will be assigned to a single value in gt. Alternatively, pass a single, precomputed contingency table to be converted to an assignment table.
gt : array of int, same shape as seg
The segmentation to assign to. Don’t pass if seg_or_cont is a contingency matrix.
dtype : numpy dtype specification
The desired data type for the assignment matrix.
Returns: assignments : sparse matrix
A matrix with True at position [i, j] if segment i in seg is assigned to segment j in gt.
Examples
>>> seg = np.array([0, 1, 1, 1, 2, 2]) >>> gt = np.array([1, 1, 1, 2, 2, 2]) >>> assignment_table(seg, gt).toarray() array([[False, True, False], [False, True, False], [False, False, True]], dtype=bool) >>> cont = contingency_table(seg, gt) >>> assignment_table(cont).toarray() array([[False, True, False], [False, True, False], [False, False, True]], dtype=bool)

gala.evaluate.
contingency_table
(seg, gt, *, ignore_seg=(), ignore_gt=(), norm=True)[source]¶ Return the contingency table for all regions in matched segmentations.
Parameters: seg : np.ndarray, int type, arbitrary shape
A candidate segmentation.
gt : np.ndarray, int type, same shape as seg
The ground truth segmentation.
ignore_seg : iterable of int, optional
Values to ignore in seg. Voxels in seg having a value in this list will not contribute to the contingency table. (default: [0])
ignore_gt : iterable of int, optional
Values to ignore in gt. Voxels in gt having a value in this list will not contribute to the contingency table. (default: [0])
norm : bool, optional
Whether to normalize the table so that it sums to 1.
Returns: cont : scipy.sparse.csr_matrix
A contingency table. cont[i, j] will equal the number of voxels labeled i in seg and j in gt. (Or the proportion of such voxels if norm=True.)

class
gala.evaluate.
csrRowExpandableCSR
(arg1, shape=None, dtype=None, copy=False, max_num_rows=None, max_nonzero=None, expansion_factor=2)[source]¶ Like a scipy CSR matrix, but rows can be appended.
Use mat[i] = v to append the rowvector v as row i to the matrix mat. Any rows between the current last row and i are filled with zeros.
Parameters: arg1 :
Any valid instantiation of a sparse.csr_matrix. This includes a dense matrix or 2D NumPy array, any SciPy sparse matrix, or a tuple of the three defining values of a scipy sparse matrix, (data, indices, indptr). See the documentation for sparse.csr_matrix for more information.
dtype : numpy dtype specification, optional
The data type contained in the matrix, e.g. ‘float32’, np.float64, np.complex128.
shape : tuple of two ints, optional
The number of rows and columns of the matrix.
copy : bool, optional
This argument does nothing, and is maintained for compatibility with the csr_matrix constructor. Because we create biggerthan necessary buffer arrays, the data must always be copied.
max_num_rows : int, optional
The initial maximum number of rows. Note that more rows can always be added; this is used only for efficiency. If None, defaults to twice the initial number of rows.
max_nonzero : int, optional
The maximum number of nonzero elements. As with max_num_rows, this is only necessary for efficiency.
expansion_factor : int or float, optional
The maximum number of rows or nonzero elements will be this number times the initial number of rows or nonzero elements. This is overridden if max_num_rows or max_nonzero are provided.
Examples
>>> init = csrRowExpandableCSR([[0, 0, 2], [0, 4, 0]]) >>> init[2] = np.array([9, 0, 0]) >>> init[4] = sparse.csr_matrix([0, 0, 5]) >>> init.nnz 4 >>> init.data array([2, 4, 9, 5], dtype=int64) >>> init.toarray() array([[0, 0, 2], [0, 4, 0], [9, 0, 0], [0, 0, 0], [0, 0, 5]], dtype=int64)
Attributes
Methods

data
¶ The data array is virtual, truncated from the data “buffer”, _data.


gala.evaluate.
divide_columns
(matrix, row, in_place=False)[source]¶ Divide each column of matrix by the corresponding element in row.
The result is as follows: out[i, j] = matrix[i, j] / row[j]
Parameters: matrix : np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)
The input matrix.
column : a 1D np.ndarray, shape (N,)
The row dividing matrix.
in_place : bool (optional, default False)
Do the computation inplace.
Returns: out : same type as matrix
The result of the rowwise division.

gala.evaluate.
divide_rows
(matrix, column, in_place=False)[source]¶ Divide each row of matrix by the corresponding element in column.
The result is as follows: out[i, j] = matrix[i, j] / column[i]
Parameters: matrix : np.ndarray, scipy.sparse.csc_matrix or csr_matrix, shape (M, N)
The input matrix.
column : a 1D np.ndarray, shape (M,)
The column dividing matrix.
in_place : bool (optional, default False)
Do the computation inplace.
Returns: out : same type as matrix
The result of the rowwise division.

gala.evaluate.
edit_distance
(aseg, gt, size_threshold=1000, sp=None)[source]¶ Find the number of splits and merges needed to convert aseg to gt.
Parameters: aseg : np.ndarray, int type, arbitrary shape
The candidate automatic segmentation being evaluated.
gt : np.ndarray, int type, same shape as aseg
The ground truth segmentation.
size_threshold : int or float, optional
Ignore splits or merges smaller than this number of voxels.
sp : np.ndarray, int type, same shape as aseg, optional
A superpixel map. If provided, compute the edit distance to the best possible agglomeration of sp to gt, rather than to gt itself.
Returns: (false_merges, false_splits) : float
The number of splits and merges needed to convert aseg to gt.

gala.evaluate.
fm_index
(x, y=None)[source]¶ Return the FowlkesMallows index. [1]
Parameters: x, y : np.ndarray
Either x and y are provided as equalshaped np.ndarray label fields (int type), or y is not provided and x is a contingency table (sparse.csc_matrix) that is not normalised to sum to 1.
Returns: fm : float
The FM index of x and y. 1 is perfect agreement.
References
[1] EB Fowlkes & CL Mallows. (1983) A method for comparing two hierarchical clusterings. J Am Stat Assoc 78: 553

gala.evaluate.
get_stratified_sample
(ar, n)[source]¶ Get a regularlyspaced sample of the unique values of an array.
Parameters: ar : np.ndarray, arbitrary shape and type
The input array.
n : int
The desired sample size.
Returns: u : np.ndarray, shape approximately (n,)
Notes
If len(np.unique(ar)) <= 2*n, all the values of ar are returned. The requested sample size is taken as an approximate lower bound.
Examples
>>> ar = np.array([[0, 4, 1, 3], ... [4, 1, 3, 5], ... [3, 5, 2, 1]]) >>> np.unique(ar) array([0, 1, 2, 3, 4, 5]) >>> get_stratified_sample(ar, 3) array([0, 2, 4])

gala.evaluate.
make_synaptic_functions
(fn, fcts)[source]¶ Make evaluation functions that only evaluate at synaptic sites.
Parameters: fn : string
Filename containing synapse coordinates, in Raveler format. [1]
fcts : function, or iterable of functions
Functions to be converted to synaptic evaluation.
Returns: syn_fcts : function or iterable of functions
Evaluation functions that will evaluate only at synaptic sites.
Raises: ImportError : if the syngeo package [2, 3] is not installed.
References
[1] https://wiki.janelia.org/wiki/display/flyem/synapse+annotation+file+format [2] https://github.com/janeliaflyem/synapsegeometry [3] https://github.com/jni/synapsegeometry

gala.evaluate.
merge_contingency_table
(a, b, ignore_seg=[0], ignore_gt=[0])[source]¶ A contingency table that has additional rows for merging initial rows.
Parameters: a
b
ignore_seg
ignore_gt
Returns: ct : array, shape (2M + 1, N)

gala.evaluate.
nzcol
(mat, row_idx)[source]¶ Return the nonzero elements of given row in a CSR matrix.
Parameters: mat : CSR matrix
Input matrix.
row_idx : int
The index of the row (if mat is CSR) for which the nonzero elements are desired.
Returns: nz : array of int
The location of nonzero elements of mat[main_axis_idx].
Examples
>>> mat = sparse.csr_matrix(np.array([[0, 1, 0, 0], [0, 5, 8, 0]])) >>> nzcol(mat, 1) array([1, 2], dtype=int32) >>> mat[1, 2] = 0 >>> nzcol(mat, 1) array([1], dtype=int32)

gala.evaluate.
pixel_wise_boundary_precision_recall
(pred, gt)[source]¶ Evaluate voxel prediction accuracy against a ground truth.
Parameters: pred : np.ndarray of int or bool, arbitrary shape
The voxelwise discrete prediction. 1 for boundary, 0 for nonboundary.
gt : np.ndarray of int or bool, same shape as pred
The ground truth boundary voxels. 1 for boundary, 0 for nonboundary.
Returns: pr : float
rec : float
The precision and recall values associated with the prediction.
Notes
Precision is defined as “True Positives / Total Positive Calls”, and Recall is defined as “True Positives / Total Positives in Ground Truth”.
This function only calculates this value for discretized predictions, i.e. it does not work with continuous prediction confidence values.

gala.evaluate.
rand_by_threshold
(ucm, gt, npoints=None)[source]¶ Compute Rand and Adjusted Rand indices for each threshold of a UCM
Parameters: ucm : np.ndarray, arbitrary shape
An Ultrametric Contour Map of region boundaries having specific values. Higher values indicate higher boundary probabilities.
gt : np.ndarray, int type, same shape as ucm
The ground truth segmentation.
npoints : int, optional
If provided, only compute values at npoints thresholds, rather than all thresholds. Useful when ucm has an extremely large number of unique values.
Returns: ris : np.ndarray of float, shape (3, len(np.unique(ucm))) or (3, npoints)
The rand indices of the segmentation induced by thresholding and labeling ucm at different values. The 3 rows of ris are the values used for thresholding, the corresponding Rand Index at that threshold, and the corresponding Adjusted Rand Index at that threshold.

gala.evaluate.
rand_index
(x, y=None)[source]¶ Return the unadjusted Rand index. [1]
Parameters: x, y : np.ndarray
Either x and y are provided as equalshaped np.ndarray label fields (int type), or y is not provided and x is a contingency table (sparse.csc_matrix) that is not normalised to sum to 1.
Returns: ri : float
The Rand index of x and y.
References
[1] WM Rand. (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc. 66: 846–850

gala.evaluate.
rand_values
(cont_table)[source]¶ Calculate values for Rand Index and related values, e.g. Adjusted Rand.
Parameters: cont_table : scipy.sparse.csc_matrix
A contingency table of the two segmentations.
Returns: a, b, c, d : float
The values necessary for computing Rand Index and related values. [1, 2]
a : float
Refers to the number of pairs of elements in the input image that are both the same in seg1 and in seg2,
b : float
Refers to the number of pairs of elements in the input image that are different in both seg1 and in seg2.
c : float
Refers to the number of pairs of elements in the input image that are the same in seg1 but different in seg2.
d : float
Refers to the number of pairs of elements in the input image that are different in seg1 but the same in seg2.
References
[1] Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. J Am Stat Assoc. [2] http://en.wikipedia.org/wiki/Rand_index#Definition on 20130516.

gala.evaluate.
raw_edit_distance
(aseg, gt, size_threshold=1000)[source]¶ Compute the edit distance between two segmentations.
Parameters: aseg : np.ndarray, int type, arbitrary shape
The candidate automatic segmentation.
gt : np.ndarray, int type, same shape as aseg
The ground truth segmentation.
size_threshold : int or float, optional
Ignore splits or merges smaller than this number of voxels.
Returns: (false_merges, false_splits) : float
The number of splits and merges required to convert aseg to gt.

gala.evaluate.
reduce_vi
(fn_pattern='testing/%i/flatsinglechanneltr%i%i%.2f.lzf.h5', iterable=[(0, 1, 0), (0, 2, 0), (0, 3, 0), (0, 4, 0), (0, 5, 0), (0, 6, 0), (0, 7, 0), (1, 0, 1), (1, 2, 1), (1, 3, 1), (1, 4, 1), (1, 5, 1), (1, 6, 1), (1, 7, 1), (2, 0, 2), (2, 1, 2), (2, 3, 2), (2, 4, 2), (2, 5, 2), (2, 6, 2), (2, 7, 2), (3, 0, 3), (3, 1, 3), (3, 2, 3), (3, 4, 3), (3, 5, 3), (3, 6, 3), (3, 7, 3), (4, 0, 4), (4, 1, 4), (4, 2, 4), (4, 3, 4), (4, 5, 4), (4, 6, 4), (4, 7, 4), (5, 0, 5), (5, 1, 5), (5, 2, 5), (5, 3, 5), (5, 4, 5), (5, 6, 5), (5, 7, 5), (6, 0, 6), (6, 1, 6), (6, 2, 6), (6, 3, 6), (6, 4, 6), (6, 5, 6), (6, 7, 6), (7, 0, 7), (7, 1, 7), (7, 2, 7), (7, 3, 7), (7, 4, 7), (7, 5, 7), (7, 6, 7)], thresholds=array([0. , 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 , 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2 , 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3 , 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4 , 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5 , 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6 , 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7 , 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8 , 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9 , 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1. ]))[source]¶ Compile evaluation results embedded in many .h5 files under “vi”.
Parameters: fn_pattern : string, optional
A format string defining the files to be examined.
iterable : iterable of tuples, optional
The (partial) tuples to apply to the format string to obtain individual files.
thresholds : iterable of float, optional
The final tuple elements to apply to the format string. The final tuples are the product of iterable and thresholds.
Returns: vi : np.ndarray of float, shape (3, len(thresholds))
The under and over segmentation components of VI at each threshold. vi[0, :] is the threshold, vi[1, :] the undersegmentation and vi[2, :] is the oversegmentation.

gala.evaluate.
relabel_from_one
(label_field)[source]¶ Convert labels in an arbitrary label field to {1, … number_of_labels}.
This function also returns the forward map (mapping the original labels to the reduced labels) and the inverse map (mapping the reduced labels back to the original ones).
Parameters: label_field : numpy ndarray (integer type)
Returns: relabeled : numpy array of same shape as ar
forward_map : 1d numpy array of length np.unique(ar) + 1
inverse_map : 1d numpy array of length len(np.unique(ar))
The length is len(np.unique(ar)) + 1 if 0 is not in np.unique(ar)
Examples
>>> import numpy as np >>> label_field = np.array([1, 1, 5, 5, 8, 99, 42]) >>> relab, fw, inv = relabel_from_one(label_field) >>> relab array([1, 1, 2, 2, 3, 5, 4]) >>> fw array([0, 1, 0, 0, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5]) >>> inv array([ 0, 1, 5, 8, 42, 99]) >>> (fw[label_field] == relab).all() True >>> (inv[relab] == label_field).all() True

gala.evaluate.
sem
(ar, axis=None)[source]¶ Calculate the standard error of the mean (SEM) along an axis.
Parameters: ar : np.ndarray
The input array of values.
axis : int, optional
Calculate SEM along the given axis. If omitted, calculate along the raveled array.
Returns: sem : float or np.ndarray of float
The SEM over the whole array (if axis=None) or over the chosen axis.

gala.evaluate.
sorted_vi_components
(s1, s2, ignore1=[0], ignore2=[0], compress=False)[source]¶ Return lists of the most entropic segments in s1s2 and s2s1.
Parameters: s1, s2 : np.ndarray of int
Segmentations to be compared. Usually, s1 will be a candidate segmentation and s2 will be the ground truth or target segmentation.
ignore1, ignore2 : list of int, optional
Labels in these lists are ignored in computing the VI. 0labels are ignored by default; pass empty lists to use all labels.
compress : bool, optional
The ‘compress’ flag performs a remapping of the labels before doing the VI computation, resulting in memory savings when many labels are not used in the volume. (For example, if you have just two labels, 1 and 1,000,000, ‘compress=False’ will give a vector of length 1,000,000, whereas with ‘compress=True’ it will have just size 2.)
Returns: ii1 : np.ndarray of int
The labels in s2 having the most entropy. If s1 is the automatic segmentation, these are the worst false merges.
h2g1 : np.ndarray of float
The conditional entropy corresponding to the labels in ii1.
ii2 : np.ndarray of int (seg)
The labels in s1 having the most entropy. These correspond to the worst false splits.
h2g1 : np.ndarray of float
The conditional entropy corresponding to the labels in ii2.

gala.evaluate.
special_points_evaluate
(eval_fct, coords, flatten=True, coord_format=True)[source]¶ Return an evaluation function to only evaluate at special coordinates.
Parameters: eval_fct : function taking at least two np.ndarray of equal shapes as args
The function to be used for evaluation.
coords : np.ndarray of int, shape (n_points, n_dim) or (n_points,)
The coordinates at which to evaluate the function. The coordinates can either be subscript format (one index into each dimension of input arrays) or index format (a single index into the linear array). For the latter, use flatten=False.
flatten : bool, optional
Whether to flatten the coordinates (default) or leave them untouched (if they are already in raveled format).
coord_format : bool, optional
Format the coordinates to a tuple of np.ndarray as numpy expects. Set to False if coordinates are already in this format or flattened.
Returns: special_eval_fct : function taking at least two np.ndarray of equal shapes
The returned function is the same as the above function but only evaluated at the coordinates specified. This can be used, for example, to subsample a volume, or to evaluate only whether synapses are correctly assigned, rather than every voxel, in a neuronal image volume.

gala.evaluate.
split_components
(idx, cont, num_elems=4, axis=0)[source]¶ Return the indices of the bodies most overlapping with body idx.
Parameters: idx : int
The segment index being examined.
cont : sparse.csc_matrix
The normalized contingency table.
num_elems : int, optional
The number of overlapping bodies desired.
axis : int, optional
The axis along which to perform the calculations. Assuming cont has the automatic segmentation as the rows and the gold standard as the columns, axis=0 will return the segment IDs in the gold standard of the worst merges comprising idx, while axis=1 will return the segment IDs in the automatic segmentation of the worst splits comprising idx.
Value:
comps : list of (int, float, float) tuples
num_elems indices of the biggest overlaps comprising idx, along with the percent of idx that they comprise and the percent of themselves that overlaps with idx.

gala.evaluate.
split_vi
(x, y=None, ignore_x=[0], ignore_y=[0])[source]¶ Return the symmetric conditional entropies associated with the VI.
The variation of information is defined as VI(X,Y) = H(XY) + H(YX). If Y is the groundtruth segmentation, then H(YX) can be interpreted as the amount of undersegmentation of Y and H(XY) is then the amount of oversegmentation. In other words, a perfect oversegmentation will have H(YX)=0 and a perfect undersegmentation will have H(XY)=0.
If y is None, x is assumed to be a contingency table.
Parameters: x : np.ndarray
Label field (int type) or contingency table (float). x is interpreted as a contingency table (summing to 1.0) if and only if y is not provided.
y : np.ndarray of int, same shape as x, optional
A label field to compare to x.
ignore_x, ignore_y : list of int, optional
Any points having a label in this list are ignored in the evaluation. Ignore 0labeled points by default.
Returns: sv : np.ndarray of float, shape (2,)
The conditional entropies of YX and XY.
See also

gala.evaluate.
split_vi_threshold
(tup)[source]¶ Compute VI with tuple input (to support multiprocessing).
Parameters: tup : a tuple, (np.ndarray, np.ndarray, [int], [int], float)
 The tuple should consist of::
 the UCM for the candidate segmentation,
 the gold standard,
 list of ignored labels in the segmentation,
 list of ignored labels in the gold standard,
 threshold to use for the UCM.
Returns: sv : np.ndarray of float, shape (2,)
The undersegmentation and oversegmentation of the comparison between applying a threshold and connected components labeling of the first array, and the second array.

gala.evaluate.
vi
(x, y=None, weights=array([1., 1.]), ignore_x=[0], ignore_y=[0])[source]¶ Return the variation of information metric. [1]
VI(X, Y) = H(X  Y) + H(Y  X), where H(..) denotes the conditional entropy.
Parameters: x : np.ndarray
Label field (int type) or contingency table (float). x is interpreted as a contingency table (summing to 1.0) if and only if y is not provided.
y : np.ndarray of int, same shape as x, optional
A label field to compare to x.
weights : np.ndarray of float, shape (2,), optional
The weights of the conditional entropies of x and y. Equal weights are the default.
ignore_x, ignore_y : list of int, optional
Any points having a label in this list are ignored in the evaluation. Ignore 0labeled points by default.
Returns: v : float
The variation of information between x and y.
References
[1] Meila, M. (2007). Comparing clusterings  an information based distance. Journal of Multivariate Analysis 98, 873895.

gala.evaluate.
vi_by_threshold
(ucm, gt, ignore_seg=[], ignore_gt=[], npoints=None, nprocessors=None)[source]¶ Compute the VI at every threshold of the provided UCM.
Parameters: ucm : np.ndarray of float, arbitrary shape
The Ultrametric Contour Map, where each 0.0region is separated by a boundary. Higher values of the boundary indicate more confidence in its presence.
gt : np.ndarray of int, same shape as ucm
The ground truth segmentation.
ignore_seg : list of int, optional
The labels to ignore in the segmentation of the UCM.
ignore_gt : list of int, optional
The labels to ignore in the ground truth.
npoints : int, optional
The number of thresholds to sample. By default, all thresholds are sampled.
nprocessors : int, optional
Number of processors to use for the parallel evaluation of different thresholds.
Returns: result : np.ndarray of float, shape (3, npoints)
The evaluation of segmentation at each threshold. The rows of this array are:
 the threshold used
 the undersegmentation component of VI
 the oversegmentation component of VI

gala.evaluate.
vi_pairwise_matrix
(segs, split=False)[source]¶ Compute the pairwise VI distances within a set of segmentations.
If ‘split’ is set to True, two matrices are returned, one for each direction of the conditional entropy.
0labeled pixels are ignored.
Parameters: segs : iterable of np.ndarray of int
A list or iterable of segmentations. All arrays must have the same shape.
split : bool, optional
Should the split VI be returned, or just the VI itself (default)?
Returns: vi_sq : np.ndarray of float, shape (len(segs), len(segs))
The distances between segmentations. If split==False, this is a symmetric square matrix of distances. Otherwise, the lower triangle of the output matrix is the false split distance, while the upper triangle is the false merge distance.

gala.evaluate.
vi_statistics
(vi_table)[source]¶ Descriptive statistics from a block of related VI evaluations.
Parameters: vi_table : np.ndarray of float
An array containing VI evaluations of various samples. The last axis represents the samples.
Returns: means, sems, medians : np.ndarrays of float
The statistics of the given array along the samples axis.

gala.evaluate.
vi_tables
(x, y=None, ignore_x=[0], ignore_y=[0])[source]¶ Return probability tables used for calculating VI.
If y is None, x is assumed to be a contingency table.
Parameters: x, y : np.ndarray
Either x and y are provided as equalshaped np.ndarray label fields (int type), or y is not provided and x is a contingency table (sparse.csc_matrix) that may or may not sum to 1.
ignore_x, ignore_y : list of int, optional
Rows and columns (respectively) to ignore in the contingency table. These are labels that are not counted when evaluating VI.
Returns: pxy : sparse.csc_matrix of float
The normalized contingency table.
px, py, hxgy, hygx, lpygx, lpxgy : np.ndarray of float
The proportions of each label in x and y (px, py), the persegment conditional entropies of x given y and viceversa, the persegment conditional probability p log p.

gala.evaluate.
wiggle_room_precision_recall
(pred, boundary, margin=2, connectivity=1)[source]¶ Voxelwise, continuous value precision recall curve allowing drift.
Voxelwise precision recall evaluates predictions against a ground truth. Wiggleroom precision recall (WRPR, “warper”) allows calls from nearby voxels to be counted as correct. Specifically, if a voxel is predicted to be a boundary within a dilation distance of margin (distance defined according to connectivity) of a true boundary voxel, it will be counted as a True Positive in the Precision, and viceversa for the Recall.
Parameters: pred : np.ndarray of float, arbitrary shape
The prediction values, expressed as probability of observing a boundary (i.e. a voxel with label 1).
boundary : np.ndarray of int, same shape as pred
The true boundary map. 1 indicates boundary, 0 indicates nonboundary.
margin : int, optional
The number of dilations that define the margin. default: 2.
connectivity : {1, …, pred.ndim}, optional
The morphological voxel connectivity (defined as in SciPy) for the dilation step.
Returns: ts, pred, rec : np.ndarray of float, shape (len(np.unique(pred)+1),)
The prediction value thresholds corresponding to each precision and recall value, the precision values, and the recall values.

gala.evaluate.
xlogx
(x, out=None, in_place=False)[source]¶ Compute x * log_2(x).
We define 0 * log_2(0) = 0
Parameters: x : np.ndarray or scipy.sparse.csc_matrix or csr_matrix
The input array.
out : same type as x (optional)
If provided, use this array/matrix for the result.
in_place : bool (optional, default False)
Operate directly on x.
Returns: y : same type as x
Result of x * log_2(x).