qualia_core.evaluation.Evaluator module

class qualia_core.evaluation.Evaluator.Evaluator[source]

Bases: ABC

apply_dataaugmentations(framework: LearningFramework[Any], dataaugmentations: list[DataAugmentation] | None, test_x: numpy.typing.NDArray[np.float32], test_y: numpy.typing.NDArray[np.int32]) → tuple[numpy.typing.NDArray[np.float32], numpy.typing.NDArray[np.int32]][source]

Apply evaluation-time qualia_core.dataaugmentation.DataAugmentation.DataAugmentation modules to the dataset.

Only the qualia_core.dataaugmentation.DataAugmentation.DataAugmentation modules with qualia_core.dataaugmentation.DataAugmentation.DataAugmentation.evaluate set are applied. This is not meant to perform actual data augmentation on the evaluation data; use it instead to run conversion or transform qualia_core.dataaugmentation.DataAugmentation.DataAugmentation modules.

Parameters:
  • framework – Learning framework in use

  • dataaugmentations – List of qualia_core.dataaugmentation.DataAugmentation.DataAugmentation modules to apply, or None

  • test_x – Input data

  • test_y – Input labels

Returns:

Tuple of data and labels after applying qualia_core.dataaugmentation.DataAugmentation.DataAugmentation sequentially
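
Example (a minimal usage sketch; evaluator, framework and normalize are hypothetical names, not part of qualia_core):

    # Hypothetical usage from a concrete Evaluator subclass instance.
    # Only the modules with their evaluate flag set are applied.
    test_x, test_y = evaluator.apply_dataaugmentations(
        framework=framework,            # LearningFramework instance
        dataaugmentations=[normalize],  # e.g. a conversion/transform module
        test_x=test_x,
        test_y=test_y,
    )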

shuffle_dataset(test_x: numpy.typing.NDArray[np.float32], test_y: numpy.typing.NDArray[np.int32]) → tuple[numpy.typing.NDArray[np.float32], numpy.typing.NDArray[np.int32]][source]

Shuffle the input data, keeping the labels in the same order as the shuffled data.

Shuffling uses the seeded shared random generator from qualia_core.random.shared.

Parameters:
  • test_x – Input data

  • test_y – Input labels

Returns:

Tuple of shuffled data and labels
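
Example (a sketch of the underlying idea; a locally seeded NumPy generator stands in for the shared one from qualia_core.random.shared):

    import numpy as np

    rng = np.random.default_rng(42)  # stand-in for qualia_core.random.shared

    test_x = np.arange(6, dtype=np.float32).reshape(3, 2)
    test_y = np.array([0, 1, 2], dtype=np.int32)

    # A single permutation reorders both arrays, so each label stays
    # aligned with its sample.
    perm = rng.permutation(len(test_x))
    test_x, test_y = test_x[perm], test_y[perm]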

limit_dataset(test_x: numpy.typing.NDArray[np.float32], test_y: numpy.typing.NDArray[np.int32], limit: int | None) → tuple[numpy.typing.NDArray[np.float32], numpy.typing.NDArray[np.int32]][source]

Truncate the dataset to at most limit samples.

Parameters:
  • test_x – Input data

  • test_y – Input labels

  • limit – Number of samples to keep; data is returned as-is if None or 0

Returns:

Tuple of data and labels limited to limit samples
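
Example (a sketch of the truncation behaviour; the arrays are placeholders):

    import numpy as np

    test_x = np.zeros((500, 2), dtype=np.float32)
    test_y = np.zeros(500, dtype=np.int32)
    limit = 100

    # A falsy limit (None or 0) leaves the data unchanged.
    if limit:
        test_x, test_y = test_x[:limit], test_y[:limit]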

compute_accuracy(preds: list[int], truth: numpy.typing.NDArray[np.int32]) → float[source]

Compute accuracy from the target results.

Parameters:
  • preds – List of predicted labels from inference on the target

  • truth – Array of one-hot encoded ground truth labels

Returns:

Accuracy (micro) between 0 and 1
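
Example (an equivalent computation, not necessarily the library's implementation, assuming one-hot encoded ground truth as described above):

    import numpy as np

    preds = [1, 0, 2, 2]                             # predicted labels from the target
    truth = np.eye(3, dtype=np.int32)[[1, 0, 2, 1]]  # one-hot encoded ground truth

    # Micro accuracy: correct predictions over total samples.
    accuracy = float(np.mean(np.asarray(preds) == truth.argmax(axis=1)))
    # accuracy == 0.75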

abstract evaluate(framework: LearningFramework[Any], model_kind: str, dataset: RawDataModel, target: str, tag: str, limit: int | None = None, dataaugmentations: list[DataAugmentation] | None = None) → Stats | None[source]
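
Subclasses must implement evaluate(). A minimal sketch of how a concrete evaluator might chain the helpers above; the dataset accessors, the target inference call and the Stats construction are assumptions, not the actual qualia_core API:

    class MyEvaluator(Evaluator):
        def evaluate(self, framework, model_kind, dataset, target, tag,
                     limit=None, dataaugmentations=None):
            # Hypothetical accessors for the test set of a RawDataModel.
            test_x, test_y = dataset.sets.test.x, dataset.sets.test.y

            test_x, test_y = self.apply_dataaugmentations(framework, dataaugmentations,
                                                          test_x, test_y)
            test_x, test_y = self.shuffle_dataset(test_x, test_y)
            test_x, test_y = self.limit_dataset(test_x, test_y, limit)

            preds = run_inference_on_target(target, test_x)  # hypothetical deployment call
            accuracy = self.compute_accuracy(preds, test_y)
            ...  # build and return a Stats object (construction not shown)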