Classify your own data

Routines to automatically classify a grain based on the definitions given in the paper.

classify_sic_grain(c12_c13=None, n14_n15=None, d29si=None, d30si=None, al26_al27=None, rho_si=0, ret_probabilities=False)

Classify a measured grain according to the classification scheme.

This returns either a tuple of grain type and subtype (if ret_probabilities=False, the default) or a dictionary of probabilities for each grain type (if ret_probabilities=True). If several grain types share the same highest probability, preference is given in the following order: M, AB, Y, Z, X, C, N, D.
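The tie-breaking rule can be illustrated with a minimal sketch. It assumes that the preference is simply the position in a list of type labels (copied here from the `types` list in the source listing) and that the first maximum wins. `pick_type` and `PREFERENCE` are hypothetical names used for illustration, not part of pgdtools.

```python
from typing import Dict, List

# Order copied from the `types` list in the source listing; earlier entries win ties.
PREFERENCE: List[str] = ["M", "AB", "Y", "Z", "X", "C", "N", "D"]


def pick_type(probabilities: Dict[str, float]) -> str:
    """Return the most probable type; ties go to the earlier entry in PREFERENCE."""
    # max() returns the first maximal item it encounters, so iterating in
    # preference order resolves ties in favour of the preferred type.
    return max(PREFERENCE, key=lambda gtype: probabilities.get(gtype, 0.0))


# Illustrative numbers only, not measured data.
example = {"M": 0.4, "AB": 0.4, "Y": 0.1, "X": 0.05}
print(pick_type(example))  # M
```

Here M and AB are equally probable, so the earlier label M is returned.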

Measurement values can either be given as (value, uncertainty) or, in the case of C and N, if asymmetric uncertainties are available, as (value, (uncertainty_plus, uncertainty_minus)).
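The two accepted input shapes can be told apart as in the following sketch. `split_uncertainty` is a hypothetical helper written for this illustration; how pgdtools unpacks the values internally is not shown in this section.

```python
from typing import Tuple, Union

Uncertainty = Union[float, Tuple[float, float]]


def split_uncertainty(
    measurement: Tuple[float, Uncertainty],
) -> Tuple[float, float, float]:
    """Return (value, uncertainty_plus, uncertainty_minus) for either input form."""
    value, unc = measurement
    if isinstance(unc, (tuple, list)):
        unc_plus, unc_minus = unc  # asymmetric: (plus, minus)
    else:
        unc_plus = unc_minus = unc  # symmetric: same value in both directions
    return value, unc_plus, unc_minus


# Illustrative values only.
print(split_uncertainty((50.0, 2.0)))         # (50.0, 2.0, 2.0)
print(split_uncertainty((50.0, (3.0, 1.5))))  # (50.0, 3.0, 1.5)
```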

If no uncertainties are given (as None or np.nan), the uncertainty is assumed to be the ratio divided by 10.
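As a rough sketch of this default, the snippet below substitutes value / 10 whenever the uncertainty is `None` or NaN. `fill_uncertainty` is a hypothetical stand-in and only handles the symmetric case; the library's own `_replace_errors` may behave differently in details that this page does not describe, for example for negative delta values.

```python
import math
from typing import Optional, Tuple


def fill_uncertainty(value: float, uncertainty: Optional[float]) -> Tuple[float, float]:
    """Return (value, uncertainty), substituting value / 10 when none is given."""
    # np.nan is an ordinary Python float, so math.isnan() covers it as well.
    missing = uncertainty is None or (
        isinstance(uncertainty, float) and math.isnan(uncertainty)
    )
    if missing:
        uncertainty = value / 10
    return value, uncertainty


print(fill_uncertainty(90.0, None))          # (90.0, 9.0)
print(fill_uncertainty(90.0, float("nan")))  # (90.0, 9.0)
print(fill_uncertainty(90.0, 4.5))           # (90.0, 4.5)
```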

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| c12_c13 | Tuple[float, Union[float, Tuple[float, float]]] | Carbon 12/13 isotopic ratio and uncertainty. | None |
| n14_n15 | Tuple[float, Union[float, Tuple[float, float]]] | Nitrogen 14/15 isotopic ratio and uncertainty. | None |
| d29si | Tuple[float, float] | Silicon 29/28 isotopic ratio as delta value in permil and uncertainty. | None |
| d30si | Tuple[float, float] | Silicon 30/28 isotopic ratio as delta value in permil and uncertainty. | None |
| al26_al27 | Tuple[float, float] | Aluminium 26/27 isotopic ratio and uncertainty. | None |
| rho_si | float | Silicon correlation coefficient between d30Si and d29Si. | 0 |
| ret_probabilities | bool | Return probabilities for each grain type? Defaults to False. | False |
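To make the role of rho_si concrete, the sketch below builds a 2x2 covariance matrix for (d30Si, d29Si) from the two uncertainties and a correlation coefficient, using the standard relation cov = rho * sigma_x * sigma_y. `si_covariance` is a hypothetical helper; whether pgdtools uses exactly this construction internally is an assumption, since the silicon probability routine is not shown here.

```python
from typing import List


def si_covariance(
    sigma_d29: float, sigma_d30: float, rho_si: float = 0.0
) -> List[List[float]]:
    """Build the 2x2 covariance matrix for (d30Si, d29Si) from 1-sigma uncertainties."""
    if not -1.0 <= rho_si <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    cov = rho_si * sigma_d29 * sigma_d30  # off-diagonal term
    return [
        [sigma_d30**2, cov],
        [cov, sigma_d29**2],
    ]


# Illustrative uncertainties in permil.
print(si_covariance(10.0, 20.0))       # [[400.0, 0.0], [0.0, 100.0]]
print(si_covariance(10.0, 20.0, 0.5))  # [[400.0, 100.0], [100.0, 100.0]]
```

With the default rho_si of 0 the off-diagonal terms vanish, which treats the two delta values as uncorrelated.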

Returns:

| Type | Description |
| --- | --- |
| Union[Tuple[str, Union[str, None]], Dict[str, float]] | Tuple of grain type and subtype or dictionary of probabilities. |
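Because the return type depends on ret_probabilities, calling code may want to handle both shapes. The following sketch shows one way to do that; `describe_result` is a hypothetical helper, and the subtype label in the example is a placeholder rather than output produced by pgdtools.

```python
from typing import Dict, Optional, Tuple, Union

Result = Union[Tuple[str, Optional[str]], Dict[str, float]]


def describe_result(result: Result) -> str:
    """Format either return shape of the classifier as a short string."""
    if isinstance(result, dict):
        # Probability dictionary: one entry per grain type.
        return ", ".join(f"{gtype}: {prob:.3f}" for gtype, prob in result.items())
    gtype, subtype = result  # (type, subtype) tuple; subtype may be None
    if subtype is None:
        return f"type {gtype}"
    return f"type {gtype}, subtype {subtype}"


# Placeholder inputs for illustration only.
print(describe_result(("M", None)))              # type M
print(describe_result(("X", "X1")))              # type X, subtype X1
print(describe_result({"M": 0.9, "AB": 0.05}))   # M: 0.900, AB: 0.050
```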

Source code in pgdtools/classify.py
from typing import Dict, Tuple, Union

import numpy as np

def classify_sic_grain(
    c12_c13: Tuple[float, Union[float, Tuple[float, float]]] = None,
    n14_n15: Tuple[float, Union[float, Tuple[float, float]]] = None,
    d29si: Tuple[float, float] = None,
    d30si: Tuple[float, float] = None,
    al26_al27: Tuple[float, float] = None,
    rho_si: float = 0,
    ret_probabilities: bool = False,
) -> Union[Tuple[str, Union[str, None]], Dict[str, float]]:
    """Classify a measured grain according to the classification scheme.

    This returns either a Tuple of grain type and subtype (if
    `ret_probabilities=False`, default case) or a dictionary of probabilities for each
    grain type (if `ret_probabilities=True`).
    If multiple probabilities are the same for groups, the preference is given in the
    following way: M, AB, Y, Z, X, C, N, D.

    Measurement values can either be given as `(value, uncertainty)` or, in the case
    of C and N, if asymmetric uncertainties are available, as
    `(value, (uncertainty_plus, uncertainty_minus))`.

    If no uncertainties are given (as ``None`` or ``np.nan``), the uncertainty is
    assumed to be the ratio divided by 10.

    :param c12_c13: Carbon 12/13 isotopic ratio and uncertainty.
    :param n14_n15: Nitrogen 14/15 isotopic ratio and uncertainty.
    :param d29si: Silicon 29/28 isotopic ratio as delta value in permil and uncertainty.
    :param d30si: Silicon 30/28 isotopic ratio as delta value in permil and uncertainty.
    :param al26_al27: Aluminium 26/27 isotopic ratio and uncertainty.
    :param rho_si: Silicon correlation coefficient between d30Si and d29Si.
    :param ret_probabilities: Return probabilities for each grain type?
        Defaults to `False`.

    :return: Tuple of grain type and subtype or dictionary of probabilities.
    """
    # todo some checking of input data
    types = ["M", "AB", "Y", "Z", "X", "C", "N", "D"]
    probabilities = np.zeros(len(types))

    if c12_c13 is None and n14_n15 is None and d29si is None and d30si is None:
        if not ret_probabilities:
            return "U", None  # unclassified
        else:
            return dict(zip(types, probabilities))

    # replace missing errors (None or nan) with ratio / 10
    c12_c13 = _replace_errors(c12_c13)
    n14_n15 = _replace_errors(n14_n15)
    d29si = _replace_errors(d29si)
    d30si = _replace_errors(d30si)
    al26_al27 = _replace_errors(al26_al27)

    # get elemental probabilities
    prob_al = _aluminium_probabilities(al26_al27)
    prob_c = _carbon_probabilities(c12_c13)
    prob_n = _nitrogen_probabilities(n14_n15)
    prob_si = _silicon_probabilities(d29si, d30si, rho_si)

    for it, gtype in enumerate(types):
        probabilities[it] = (
            prob_al[gtype] * prob_c[gtype] * prob_n[gtype] * prob_si[gtype]
        )

    probabilities = np.round(probabilities, 3)  # round to three decimal places
    # find maximum probability by sorting from highest to lowest.
    # a stable sort keeps the first element in front in case of equal probabilities
    index_max = np.argsort(1 - probabilities, kind="stable")[0]

    if probabilities[index_max] < 0.01:
        gtype = "U"
    else:
        gtype = types[index_max]

    if gtype in ["X", "AB", "C"]:
        subtype = _find_subtype(gtype, c12_c13, n14_n15, d29si, d30si)
    else:
        subtype = None

    if not ret_probabilities:
        return gtype, subtype
    else:
        return dict(zip(types, probabilities))
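
The combination step in the listing above (multiply the per-element probabilities for each type, round, and fall back to "U" below 0.01) can be sketched without NumPy as follows. `combine` and `full` are hypothetical helpers, and the input probabilities are made-up numbers, not output of the library's `_carbon_probabilities` or `_silicon_probabilities`.

```python
from typing import Dict, List

TYPES: List[str] = ["M", "AB", "Y", "Z", "X", "C", "N", "D"]


def full(**given: float) -> Dict[str, float]:
    """Build a probability dictionary for all types, defaulting to 0."""
    return {gtype: given.get(gtype, 0.0) for gtype in TYPES}


def combine(per_element: List[Dict[str, float]], threshold: float = 0.01) -> str:
    """Multiply per-element probabilities per type and pick the best type, or "U"."""
    combined = {}
    for gtype in TYPES:
        product = 1.0
        for probs in per_element:
            product *= probs[gtype]
        combined[gtype] = round(product, 3)  # three decimal places
    best = max(TYPES, key=lambda gtype: combined[gtype])  # first maximum wins
    return best if combined[best] >= threshold else "U"


# Made-up per-element probabilities for two elements.
print(combine([full(M=0.8, Y=0.2), full(M=0.9, Y=0.1)]))  # M
print(combine([full(M=0.05), full(M=0.05)]))              # U
```

In the second call the best product is far below the 0.01 threshold, so the grain stays unclassified.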