Suite¶

The suite page defines the reusable data-layer interfaces that concrete clinical suites implement.

Base classes, not CKD internals

This page documents the framework base classes. For the current CKD implementation, see Framework Guide -> Suites -> CKD.

Suite Base Classes¶

base ¶

krisis/data/base.py

Abstract base classes for the Krisis data layer. All domain-specific data modules (CKD, Hypertension, Diabetes) inherit from these contracts.

BaseDataSuite ¶

Bases: ABC

The top-level data contract that Benchmark receives.

A suite is the public API of the data layer. It wires together a preprocessor, feature engineer, and generator, and exposes a clean list of PatientRecord objects ready for evaluation.

Example

suite = MyClinicalSuite(config=SuiteConfig(task=Task.STAGING))
records = suite.load()

The suite handles train/test splitting internally. Benchmark always receives the test split only.

domain `property` ¶

domain: str

Human-readable domain name, e.g. 'CKD' or 'Hypertension'.

load `abstractmethod` ¶

load() -> list[PatientRecord]

Run the full data pipeline and return test-split PatientRecords.

Pipeline order:

1. Load raw source data
2. Preprocess (encode, impute, scale)
3. Engineer features (domain-specific derivations)
4. Generate synthetic records (if n_synthetic > 0)
5. Merge real + synthetic
6. Split, then return the test split as a list of PatientRecord objects
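The six steps can be sketched end to end in a few lines. This is a minimal illustration, not the Krisis implementation: `ToySuite`, its constructor parameters, the `PatientRecord` fields, and the toy blood-pressure rows are all hypothetical. Rows are plain dicts rather than a DataFrame so the sketch has no dependencies.

```python
import random
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical stand-in; the real PatientRecord fields are not shown on this page."""
    features: dict
    label: int
    synthetic: bool = False


class ToySuite:
    """Illustrative only: mirrors the six pipeline steps with plain dicts."""

    def __init__(self, n_synthetic: int = 0, test_fraction: float = 0.25, seed: int = 0):
        self.n_synthetic = n_synthetic
        self.test_fraction = test_fraction
        self.seed = seed

    def _load_raw(self) -> list[dict]:
        # Step 1: load raw source data (hard-coded toy rows, one missing value).
        return [
            {"sbp": 120.0, "dbp": 80.0, "label": 0},
            {"sbp": 150.0, "dbp": None, "label": 1},
            {"sbp": 135.0, "dbp": 85.0, "label": 0},
            {"sbp": 165.0, "dbp": 105.0, "label": 1},
        ]

    def load(self) -> list[PatientRecord]:
        rng = random.Random(self.seed)
        rows = self._load_raw()

        # Step 2: preprocess (here: mean-impute missing diastolic values).
        known = [r["dbp"] for r in rows if r["dbp"] is not None]
        mean_dbp = sum(known) / len(known)
        rows = [{**r, "dbp": mean_dbp if r["dbp"] is None else r["dbp"]} for r in rows]

        # Step 3: engineer features (pulse pressure = systolic - diastolic).
        rows = [{**r, "pulse_pressure": r["sbp"] - r["dbp"], "synthetic": False} for r in rows]

        # Step 4: generate synthetic rows only if n_synthetic > 0
        # (jittered copies of real rows, a deliberately crude placeholder).
        synthetic = []
        for _ in range(self.n_synthetic):
            base = rng.choice(rows)
            sbp = base["sbp"] + rng.gauss(0, 2)
            synthetic.append(
                {**base, "sbp": sbp, "pulse_pressure": sbp - base["dbp"], "synthetic": True}
            )

        # Step 5: merge real + synthetic.
        merged = rows + synthetic

        # Step 6: shuffle, split, and return only the test split.
        rng.shuffle(merged)
        n_test = max(1, round(len(merged) * self.test_fraction))
        return [
            PatientRecord(
                features={k: v for k, v in r.items() if k not in ("label", "synthetic")},
                label=r["label"],
                synthetic=r["synthetic"],
            )
            for r in merged[:n_test]
        ]
```

Because every random choice goes through one seeded `random.Random` instance, two calls with the same seed return the same test split.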

describe `abstractmethod` ¶

describe() -> dict[str, Any]

Return a summary of the suite configuration and data statistics. Used by results.report() to document what was evaluated.

Should include at minimum:

- domain name
- feature set (full/reduced)
- task type
- n_real records
- n_synthetic records
- label distribution
- seed
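A summary covering these minimum fields might look like the sketch below. The function name `describe_suite`, the dict key names, and the sample values are assumptions for illustration; the page does not specify the exact keys that results.report() expects.

```python
from collections import Counter
from typing import Any


def describe_suite(
    domain: str,
    feature_set: str,
    task: str,
    labels_real: list[int],
    labels_synthetic: list[int],
    seed: int,
) -> dict[str, Any]:
    """Build a summary dict with one entry per minimum field (key names are hypothetical)."""
    distribution = Counter(labels_real + labels_synthetic)
    return {
        "domain": domain,
        "feature_set": feature_set,
        "task": task,
        "n_real": len(labels_real),
        "n_synthetic": len(labels_synthetic),
        # Sort by label so the summary is stable across runs.
        "label_distribution": dict(sorted(distribution.items())),
        "seed": seed,
    }


summary = describe_suite(
    domain="CKD",
    feature_set="reduced",
    task="staging",
    labels_real=[1, 2, 2, 3],
    labels_synthetic=[3, 4],
    seed=42,
)
```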

BasePreprocessor ¶

Bases: ABC

Cleans and imputes raw domain data.

Each domain implements this to handle its own encoding, imputation strategy, and scaling. The contract is simple: fit_transform takes a raw DataFrame and returns a clean one.

fit_transform `abstractmethod` ¶

fit_transform(df: pd.DataFrame) -> pd.DataFrame

Fit preprocessing on df and return the transformed DataFrame. Sets self._is_fitted = True on completion.

transform `abstractmethod` ¶

transform(df: pd.DataFrame) -> pd.DataFrame

Apply already-fitted preprocessing to new data. Raises RuntimeError if called before fit_transform.
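The fit-then-transform contract, including the RuntimeError guard, can be sketched as follows. `MeanImputer` is a hypothetical toy class, not part of Krisis, and it works on lists of dicts instead of a DataFrame to stay dependency-free.

```python
class MeanImputer:
    """Toy preprocessor following the fit_transform / transform contract."""

    def __init__(self) -> None:
        self._is_fitted = False
        self._means: dict[str, float] = {}

    def fit_transform(self, rows: list[dict]) -> list[dict]:
        # Learn one mean per column from the non-missing values.
        columns = {key for row in rows for key in row}
        for col in columns:
            known = [row[col] for row in rows if row.get(col) is not None]
            # Falling back to 0.0 for an all-missing column is an arbitrary choice.
            self._means[col] = sum(known) / len(known) if known else 0.0
        self._is_fitted = True
        return self.transform(rows)

    def transform(self, rows: list[dict]) -> list[dict]:
        if not self._is_fitted:
            raise RuntimeError("transform() called before fit_transform()")
        # Fill gaps with the means learned at fit time; never refit here.
        return [
            {col: (self._means[col] if row.get(col) is None else row[col]) for col in self._means}
            for row in rows
        ]
```

Keeping the learned statistics on the instance means new data is imputed with training-time values, which avoids leaking information from the data being transformed.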

BaseFeatureEngineer ¶

Bases: ABC

Derives new clinically meaningful features from preprocessed data.

This is where domain-specific engineering happens:

- CKD: eGFR computation, sex generation, stage derivation
- Hypertension: MAP, pulse pressure, BP stage
- Diabetes: HbA1c staging, insulin resistance markers

The engineer sits between the preprocessor and the generator — it operates on clean data and produces an enriched DataFrame that the generator can sample from.

fit_transform `abstractmethod` ¶

fit_transform(df: pd.DataFrame) -> pd.DataFrame

Engineer new features and return the enriched DataFrame.
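As an illustration of the kind of derivation an engineer performs, the sketch below computes two of the hypertension features named earlier: pulse pressure (systolic minus diastolic) and mean arterial pressure, using the common approximation MAP ≈ DBP + (SBP − DBP) / 3. The function name and the `sbp` / `dbp` column names are assumptions, and rows are plain dicts rather than a DataFrame.

```python
def engineer_bp_features(rows: list[dict]) -> list[dict]:
    """Return new rows enriched with pulse pressure and MAP; inputs are not mutated."""
    enriched = []
    for row in rows:
        pulse_pressure = row["sbp"] - row["dbp"]
        enriched.append(
            {
                **row,
                "pulse_pressure": pulse_pressure,
                # Standard one-third approximation of mean arterial pressure.
                "map": row["dbp"] + pulse_pressure / 3,
            }
        )
    return enriched
```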

get_feature_names `abstractmethod` ¶

get_feature_names(feature_set: FeatureSet) -> list[str]

Return the list of feature column names for the given feature set. Used by the suite to select the right columns before passing records to the model backend.
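A feature-set lookup of this kind could be sketched as below. The `FeatureSet` enum here is a stand-in (the page only mentions full and reduced sets), and the column names and the `select_columns` helper are hypothetical.

```python
from enum import Enum


class FeatureSet(Enum):
    """Hypothetical stand-in for the framework's FeatureSet type."""
    FULL = "full"
    REDUCED = "reduced"


# Illustrative column lists; a real suite would define its own.
_FEATURES = {
    FeatureSet.FULL: ["age", "sbp", "dbp", "pulse_pressure", "map"],
    FeatureSet.REDUCED: ["age", "map"],
}


def get_feature_names(feature_set: FeatureSet) -> list[str]:
    # Return a copy so callers cannot mutate the shared definition.
    return list(_FEATURES[feature_set])


def select_columns(rows: list[dict], feature_set: FeatureSet) -> list[dict]:
    """Keep only the columns belonging to the requested feature set."""
    names = get_feature_names(feature_set)
    return [{name: row[name] for name in names} for row in rows]
```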

BaseGenerator ¶

Bases: ABC

Generates synthetic patient records from a fitted distribution.

Synthetic generation in Krisis is stage-aware — records are generated along physiologically plausible disease progression arcs, not sampled randomly. This ensures the benchmark tests models on clinically coherent inputs rather than statistical noise.

The generator is seeded for reproducibility. Two researchers running the same suite with the same seed get identical synthetic patients.

fit `abstractmethod` ¶

fit(df: pd.DataFrame) -> BaseGenerator

Fit the generator on an engineered DataFrame. Learns the statistical distribution of each feature per stage. Returns self for chaining.

generate `abstractmethod` ¶

generate(n: int) -> pd.DataFrame

Generate n synthetic patient records. Returns a DataFrame with the same schema as the fitted data. Raises RuntimeError if called before fit().
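The fit / generate contract, with seeding and the RuntimeError guard, can be sketched as follows. This is a simplified stand-in: `PerStageGaussianGenerator` is a hypothetical class that samples each feature independently from a per-stage Gaussian and picks stages uniformly, which is much cruder than generation along disease progression arcs. Rows are plain dicts, and the `stage` column name is an assumption.

```python
import random
import statistics
from collections import defaultdict


class PerStageGaussianGenerator:
    """Toy generator: learns a per-stage mean and standard deviation for each feature."""

    def __init__(self, seed: int = 0, stage_column: str = "stage") -> None:
        self._rng = random.Random(seed)
        self._stage_column = stage_column
        self._stats: dict = {}
        self._stages: list = []
        self._is_fitted = False

    def fit(self, rows: list[dict]) -> "PerStageGaussianGenerator":
        by_stage = defaultdict(list)
        for row in rows:
            by_stage[row[self._stage_column]].append(row)
        for stage, group in by_stage.items():
            features = [k for k in group[0] if k != self._stage_column]
            self._stats[stage] = {
                name: (
                    statistics.fmean([r[name] for r in group]),
                    statistics.pstdev([r[name] for r in group]),
                )
                for name in features
            }
        # Sort stages so sampling order does not depend on input row order.
        self._stages = sorted(self._stats)
        self._is_fitted = True
        return self  # enables generator.fit(rows).generate(n)

    def generate(self, n: int) -> list[dict]:
        if not self._is_fitted:
            raise RuntimeError("generate() called before fit()")
        out = []
        for _ in range(n):
            stage = self._rng.choice(self._stages)
            row = {self._stage_column: stage}
            for name, (mean, std) in self._stats[stage].items():
                row[name] = self._rng.gauss(mean, std)
            out.append(row)
        return out
```

Two generators built with the same seed and fitted on the same rows produce identical output, which is the reproducibility property described above.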
