Classify your own data

Routines to automatically classify a grain based on the definitions in the paper.

`classify_sic_grain(c12_c13=None, n14_n15=None, d29si=None, d30si=None, al26_al27=None, rho_si=0, ret_probabilities=False)`

Classify a measured grain according to the classification scheme.

This returns either a tuple of grain type and subtype (if `ret_probabilities=False`, the default) or a dictionary of probabilities for each grain type (if `ret_probabilities=True`). If several grain types have the same probability, preference follows the order of the `types` list in the source code: M, AB, Y, Z, X, C, N, D.

Measurement values can either be given as (value, uncertainty) or, in the case of C and N, if asymmetric uncertainties are available, as (value, (uncertainty_plus, uncertainty_minus)).

If no uncertainties are given (as None or np.nan), the uncertainty is assumed to be the ratio divided by 10.
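
The input convention described above can be illustrated with a minimal sketch. The helper `with_default_uncertainty` is hypothetical and not part of pgdtools; it only mirrors the documented rule that a missing uncertainty (`None` or `np.nan`) is replaced by the value divided by 10. Taking the absolute value is an assumption of this sketch, made so that a negative delta value does not yield a negative uncertainty.

```python
import math
from typing import Optional, Tuple, Union

Uncertainty = Union[float, Tuple[float, float]]
Measurement = Tuple[float, Optional[Uncertainty]]


def with_default_uncertainty(
    meas: Optional[Measurement],
) -> Optional[Tuple[float, Uncertainty]]:
    """Hypothetical helper: fill a missing uncertainty with |value| / 10."""
    if meas is None:  # no measurement available for this isotope system
        return None
    value, unc = meas
    # Asymmetric uncertainties (plus, minus) are passed through unchanged.
    if isinstance(unc, tuple):
        return value, unc
    if unc is None or math.isnan(unc):
        return value, abs(value) / 10  # assumption: absolute value
    return value, unc


print(with_default_uncertainty((50.0, None)))          # (50.0, 5.0)
print(with_default_uncertainty((50.0, float("nan"))))  # (50.0, 5.0)
print(with_default_uncertainty((50.0, (2.0, 1.5))))    # (50.0, (2.0, 1.5))
```

Tuples prepared this way have the `(value, uncertainty)` or `(value, (uncertainty_plus, uncertainty_minus))` shape that the parameters below expect.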

Parameters:

Name	Type	Description	Default
`c12_c13`	`Tuple[float, Union[float, Tuple[float, float]]]`	Carbon 12/13 isotopic ratio and uncertainty.	`None`
`n14_n15`	`Tuple[float, Union[float, Tuple[float, float]]]`	Nitrogen 14/15 isotopic ratio and uncertainty.	`None`
`d29si`	`Tuple[float, float]`	Silicon 29/28 isotopic ratio as delta value in permil and uncertainty.	`None`
`d30si`	`Tuple[float, float]`	Silicon 30/28 isotopic ratio as delta value in permil and uncertainty.	`None`
`al26_al27`	`Tuple[float, float]`	Aluminium 26/27 isotopic ratio and uncertainty.	`None`
`rho_si`	`float`	Silicon correlation coefficient between d30Si and d29Si.	`0`
`ret_probabilities`	`bool`	Return probabilities for each grain type? Defaults to `False`.	`False`
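
Because `d29si` and `d30si` are expected as delta values in permil, a measured ratio has to be converted before it is passed in. The sketch below uses the standard definition `delta = (R_sample / R_standard - 1) * 1000`. The function name `delta_permil` and all numbers are illustrative assumptions, not values or functions taken from pgdtools, and the standard ratio is treated as exact.

```python
from typing import Tuple


def delta_permil(
    ratio: float, ratio_unc: float, standard: float
) -> Tuple[float, float]:
    """Convert a measured ratio and its uncertainty to a delta value in permil.

    The standard ratio is treated as exact (an assumption of this sketch).
    """
    delta = (ratio / standard - 1.0) * 1000.0
    delta_unc = ratio_unc / standard * 1000.0
    return delta, delta_unc


# Illustrative numbers only, not reference values from pgdtools.
standard_29_28 = 0.05  # assumed reference 29Si/28Si ratio
d29si = delta_permil(0.0525, 0.0005, standard_29_28)
print(d29si)  # close to (50.0, 10.0)
```

The resulting tuple has the `(value, uncertainty)` form expected for `d29si`; the same conversion would apply to `d30si` with the corresponding 30Si/28Si ratios.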

Returns:

Type	Description
`Union[Tuple[str, Union[str, None]], Dict[str, float]]`	Tuple of grain type and subtype or dictionary of probabilities.
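
When `ret_probabilities=True`, the result is a dictionary that maps each grain type to a probability. A minimal sketch of ranking such a dictionary follows. The probability values are made up for illustration, and `rank_grain_types` is a hypothetical helper rather than a pgdtools function.

```python
from typing import Dict, List, Tuple


def rank_grain_types(probabilities: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort grain types from most to least probable.

    Python's sort is stable, so types with equal probability keep the
    order in which they appear in the dictionary.
    """
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)


# Made-up probabilities, keyed in the same order as the `types` list.
example = {
    "M": 0.62, "AB": 0.0, "Y": 0.21, "Z": 0.21,
    "X": 0.0, "C": 0.0, "N": 0.0, "D": 0.0,
}
ranked = rank_grain_types(example)
print(ranked[:3])  # [('M', 0.62), ('Y', 0.21), ('Z', 0.21)]
```

Looking at the runner-up types in this way can show how close a classification was, which the `(type, subtype)` tuple alone does not reveal.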

Source code in pgdtools/classify.py

def classify_sic_grain(
    c12_c13: Tuple[float, Union[float, Tuple[float, float]]] = None,
    n14_n15: Tuple[float, Union[float, Tuple[float, float]]] = None,
    d29si: Tuple[float, float] = None,
    d30si: Tuple[float, float] = None,
    al26_al27: Tuple[float, float] = None,
    rho_si: float = 0,
    ret_probabilities: bool = False,
) -> Union[Tuple[str, Union[str, None]], Dict[str, float]]:
    """Classify a measured grain according to the classification scheme.

    This returns either a tuple of grain type and subtype
    (if `ret_probabilities=False`, default case) or a dictionary of probabilities
    for each grain type (if `ret_probabilities=True`).
    If several grain types have the same probability, preference follows the order
    of the `types` list: M, AB, Y, Z, X, C, N, D.

    Measurement values can either be given as `(value, uncertainty)` or, in the case
    of C and N, if asymmetric uncertainties are available, as
    `(value, (uncertainty_plus, uncertainty_minus))`.

    If no uncertainties are given (as ``None`` or ``np.nan``), the uncertainty is
    assumed to be the ratio divided by 10.

    :param c12_c13: Carbon 12/13 isotopic ratio and uncertainty.
    :param n14_n15: Nitrogen 14/15 isotopic ratio and uncertainty.
    :param d29si: Silicon 29/28 isotopic ratio as delta value in permil and uncertainty.
    :param d30si: Silicon 30/28 isotopic ratio as delta value in permil and uncertainty.
    :param al26_al27: Aluminium 26/27 isotopic ratio and uncertainty.
    :param rho_si: Silicon correlation coefficient between d30Si and d29Si.
    :param ret_probabilities: Return probabilities for each grain type?
        Defaults to `False`.

    :return: Tuple of grain type and subtype or dictionary of probabilities.
    """
    # todo some checking of input data
    types = ["M", "AB", "Y", "Z", "X", "C", "N", "D"]
    probabilities = np.zeros(len(types))

    if c12_c13 is None and n14_n15 is None and d29si is None and d30si is None:
        if not ret_probabilities:
            return "U", None  # unclassified
        else:
            return dict(zip(types, probabilities))  # all probabilities are zero

    # replace missing errors (None or nan) with ratio / 10
    c12_c13 = _replace_errors(c12_c13)
    n14_n15 = _replace_errors(n14_n15)
    d29si = _replace_errors(d29si)
    d30si = _replace_errors(d30si)
    al26_al27 = _replace_errors(al26_al27)

    # get elemental probabilities
    prob_al = _aluminium_probabilities(al26_al27)
    prob_c = _carbon_probabilities(c12_c13)
    prob_n = _nitrogen_probabilities(n14_n15)
    prob_si = _silicon_probabilities(d29si, d30si, rho_si)

    for it, gtype in enumerate(types):
        probabilities[it] = (
            prob_al[gtype] * prob_c[gtype] * prob_n[gtype] * prob_si[gtype]
        )

    probabilities = np.round(probabilities, 3)  # round to three decimal places
    # find the maximum probability: sort `1 - probabilities` in ascending order.
    # a stable sort guarantees that the first element wins for equal probabilities
    index_max = np.argsort(1 - probabilities, kind="stable")[0]

    if probabilities[index_max] < 0.01:
        gtype = "U"
    else:
        gtype = types[index_max]

    if gtype in ["X", "AB", "C"]:
        subtype = _find_subtype(gtype, c12_c13, n14_n15, d29si, d30si)
    else:
        subtype = None

    if not ret_probabilities:
        return gtype, subtype
    else:
        return dict(zip(types, probabilities))
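
The combination step in the source above (multiply the per-element probabilities, round, pick the first maximum, fall back to `"U"` below 0.01) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the library's implementation: `pick_type` is a hypothetical name, the per-element probabilities are made up, and `np.argmax` is used here because it returns the first index of the maximum.

```python
from typing import Dict, List

import numpy as np

TYPES = ["M", "AB", "Y", "Z", "X", "C", "N", "D"]  # order as in the source above


def pick_type(per_element: List[Dict[str, float]]) -> str:
    """Multiply per-element probabilities and select the most likely type.

    `per_element` holds one dict per isotope system, each mapping every
    grain type to a probability.
    """
    combined = np.ones(len(TYPES))
    for probs in per_element:
        combined *= np.array([probs[t] for t in TYPES])
    combined = np.round(combined, 3)  # three decimal places
    idx = int(np.argmax(combined))  # first index wins on ties
    return "U" if combined[idx] < 0.01 else TYPES[idx]


# Made-up probabilities in which M and Y end up tied.
prob_c = {"M": 0.8, "AB": 0.0, "Y": 0.8, "Z": 0.8, "X": 0.5, "C": 0.1, "N": 0.0, "D": 0.0}
prob_si = {"M": 0.5, "AB": 0.5, "Y": 0.5, "Z": 0.2, "X": 0.0, "C": 0.0, "N": 0.0, "D": 0.0}

print(pick_type([prob_c, prob_si]))            # M (listed before Y)
print(pick_type([{t: 0.0 for t in TYPES}]))    # U (nothing reaches 0.01)
```

Because the tie is resolved by position, the order of the type list is what encodes the preference between grain types.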