tea_tasting.metrics.base
Base classes for metrics.
AggrCols
Bases: NamedTuple
Columns to be aggregated for a metric analysis.
Attributes:
Name | Type | Description |
---|---|---|
has_count | `bool` | If |
mean_cols | `Sequence[str]` | Column names for calculation of sample means. |
var_cols | `Sequence[str]` | Column names for calculation of sample variances. |
cov_cols | `Sequence[tuple[str, str]]` | Pairs of column names for calculation of sample covariances. |
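To illustrate how the four attributes fit together, here is a minimal sketch. It defines a stand-in `NamedTuple` with the same field names as `AggrCols`; it is not the library class. The name `AggrColsSketch` and the column names `"orders"` and `"sessions"` are assumptions made for illustration only.

```python
from collections.abc import Sequence
from typing import NamedTuple


class AggrColsSketch(NamedTuple):
    """Stand-in mirroring the documented AggrCols fields (not the library class)."""

    has_count: bool
    mean_cols: Sequence[str]
    var_cols: Sequence[str]
    cov_cols: Sequence[tuple[str, str]]


# Hypothetical column names, chosen only for illustration.
cols = AggrColsSketch(
    has_count=True,
    mean_cols=("orders", "sessions"),
    var_cols=("orders", "sessions"),
    cov_cols=(("orders", "sessions"),),
)

# NamedTuple fields are accessible by name and by position.
print(cols.mean_cols)
print(cols[0])
```

Because the real `AggrCols` is also a `NamedTuple`, the same keyword-based construction and attribute access should carry over to it.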
MetricBase
Bases: ABC, Generic[R], ReprMixin
Base class for metrics.
analyze(data, control, treatment, variant)
abstractmethod
Analyze a metric in an experiment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table` | Experimental data. | required |
control | `Any` | Control variant. | required |
treatment | `Any` | Treatment variant. | required |
variant | `str` | Variant column name. | required |
Returns:
Type | Description |
---|---|
`R` | Analysis result. |
Source code in src/tea_tasting/metrics/base.py
MetricBaseAggregated
Bases: MetricBase[R], _HasAggrCols
Base class for metrics that are analyzed using aggregated statistics.
aggr_cols: AggrCols
abstractmethod
property
Columns to be aggregated for an analysis.
analyze(data, control, treatment, variant=None)
Analyze a metric in an experiment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table \| dict[Any, Aggregates]` | Experimental data. | required |
control | `Any` | Control variant. | required |
treatment | `Any` | Treatment variant. | required |
variant | `str \| None` | Variant column name. | `None` |
Returns:
Type | Description |
---|---|
`R` | Analysis result. |
Source code in src/tea_tasting/metrics/base.py
analyze_aggregates(control, treatment)
abstractmethod
Analyze a metric in an experiment using aggregated statistics.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
control | `Aggregates` | Control data. | required |
treatment | `Aggregates` | Treatment data. | required |
Returns:
Type | Description |
---|---|
`R` | Analysis result. |
Source code in src/tea_tasting/metrics/base.py
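The split between a concrete `analyze` and an abstract `analyze_aggregates` resembles a template-method pattern. The sketch below shows that pattern with standard-library stand-ins only. `AggregatesSketch`, `MetricAggregatedSketch`, and `MeanDiff` are hypothetical names, and the body of `analyze` is an assumption about how the delegation might work, not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, NamedTuple, TypeVar

R = TypeVar("R")


class AggregatesSketch(NamedTuple):
    """Hypothetical stand-in for pre-computed per-variant statistics."""

    count: int
    mean: float


class MetricAggregatedSketch(ABC, Generic[R]):
    """Stand-in base: analyze() selects variants, subclasses do the math."""

    def analyze(
        self,
        data: dict[Any, AggregatesSketch],
        control: Any,
        treatment: Any,
    ) -> R:
        # Assumption: data is already a dictionary keyed by variant.
        return self.analyze_aggregates(data[control], data[treatment])

    @abstractmethod
    def analyze_aggregates(
        self,
        control: AggregatesSketch,
        treatment: AggregatesSketch,
    ) -> R:
        """Analyze a metric from aggregated statistics."""


class MeanDiff(MetricAggregatedSketch[float]):
    """Toy metric: difference of sample means."""

    def analyze_aggregates(self, control, treatment):
        return treatment.mean - control.mean


data = {
    0: AggregatesSketch(count=100, mean=2.0),
    1: AggregatesSketch(count=100, mean=2.5),
}
result = MeanDiff().analyze(data, control=0, treatment=1)
print(result)
```

With this layout a subclass only implements the statistics, while the shared base handles variant selection.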
MetricBaseGranular
Bases: MetricBase[R], _HasCols
Base class for metrics that are analyzed using granular data.
cols: Sequence[str]
abstractmethod
property
Columns to be fetched for an analysis.
analyze(data, control, treatment, variant=None)
Analyze a metric in an experiment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table \| dict[Any, DataFrame]` | Experimental data. | required |
control | `Any` | Control variant. | required |
treatment | `Any` | Treatment variant. | required |
variant | `str \| None` | Variant column name. | `None` |
Returns:
Type | Description |
---|---|
`R` | Analysis result. |
Source code in src/tea_tasting/metrics/base.py
analyze_dataframes(control, treatment)
abstractmethod
Analyze a metric in an experiment using granular data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
control | `DataFrame` | Control data. | required |
treatment | `DataFrame` | Treatment data. | required |
Returns:
Type | Description |
---|---|
`R` | Analysis result. |
Source code in src/tea_tasting/metrics/base.py
MetricPowerResults
Bases: UserList[P], PrettyDictsMixin
Power analysis results.
to_dicts()
to_html(keys=None, formatter=get_and_format_num)
Convert the object to HTML.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keys | `Sequence[str] \| None` | Keys to convert. If a key is not defined in the dictionary it's assumed to be | `None` |
formatter | `Callable[[dict[str, Any], str], str]` | Custom formatter function. It should accept a dictionary of metric result attributes and an attribute name, and return a formatted attribute value. | `get_and_format_num` |
Returns:
Type | Description |
---|---|
`str` | A table with results rendered as HTML. |
Default formatting rules
- If a name starts with "rel_" or equals "power", consider it a percentage value. Round percentage values to 2 significant digits, multiply by 100, and add "%".
- Round other values to 3 significant digits.
- If a value is less than 0.001, format it in exponential presentation.
- If a name ends with "_ci", consider it a confidence interval. Look up the attributes "{name}_lower" and "{name}_upper", and format the interval as "[{lower_bound}, {upper_bound}]".
Source code in src/tea_tasting/utils.py
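The default formatting rules listed above can be sketched as a custom formatter with the documented `Callable[[dict[str, Any], str], str]` shape. This is an illustration under stated assumptions, not the source of `get_and_format_num`: the names `format_num_sketch` and `get_and_format_sketch` are hypothetical, the threshold is applied to the absolute value of the displayed number, and the sample keys and values are invented.

```python
import math
from typing import Any


def format_num_sketch(value: float, sig: int, pct: bool) -> str:
    """Format a number with `sig` significant digits (illustrative only)."""
    suffix = "%" if pct else ""
    if pct:
        # Scaling by 100 before rounding gives the same significant digits.
        value *= 100
    if value == 0:
        return "0" + suffix
    if abs(value) < 0.001:
        # Assumption: the threshold applies to the absolute displayed value.
        return f"{value:.{sig - 1}e}{suffix}"
    # Number of decimal places needed to keep `sig` significant digits.
    digits = sig - 1 - math.floor(math.log10(abs(value)))
    return f"{round(value, digits):.{max(digits, 0)}f}{suffix}"


def get_and_format_sketch(data: dict[str, Any], key: str) -> str:
    """Look up `key` in `data` and format it per the rules listed above."""
    pct = key.startswith("rel_") or key == "power"
    sig = 2 if pct else 3
    if key.endswith("_ci"):
        lower = format_num_sketch(data[f"{key}_lower"], sig, pct)
        upper = format_num_sketch(data[f"{key}_upper"], sig, pct)
        return f"[{lower}, {upper}]"
    return format_num_sketch(data[key], sig, pct)


# Invented values, used only to exercise each rule.
row = {
    "rel_effect_size": 0.0567,
    "rel_effect_size_ci_lower": -0.0123,
    "rel_effect_size_ci_upper": 0.1257,
    "power": 0.8,
    "effect_size": 0.123456,
    "pvalue": 0.0000123,
}
for name in ("rel_effect_size", "rel_effect_size_ci", "power", "effect_size", "pvalue"):
    print(name, get_and_format_sketch(row, name))
```

A function with this signature could in principle be passed as the `formatter` argument of `to_html`, `to_pretty`, or `to_string`, since it accepts a dictionary of attributes and an attribute name and returns a string.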
to_pandas()
to_pretty(keys=None, formatter=get_and_format_num)
Convert the object to a pandas DataFrame with formatted values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keys | `Sequence[str] \| None` | Keys to convert. If a key is not defined in the dictionary it's assumed to be | `None` |
formatter | `Callable[[dict[str, Any], str], str]` | Custom formatter function. It should accept a dictionary of metric result attributes and an attribute name, and return a formatted attribute value. | `get_and_format_num` |
Returns:
Type | Description |
---|---|
`DataFrame` | pandas DataFrame with formatted values. |
Default formatting rules
- If a name starts with "rel_" or equals "power", consider it a percentage value. Round percentage values to 2 significant digits, multiply by 100, and add "%".
- Round other values to 3 significant digits.
- If a value is less than 0.001, format it in exponential presentation.
- If a name ends with "_ci", consider it a confidence interval. Look up the attributes "{name}_lower" and "{name}_upper", and format the interval as "[{lower_bound}, {upper_bound}]".
Source code in src/tea_tasting/utils.py
to_string(keys=None, formatter=get_and_format_num)
Convert the object to a string.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
keys | `Sequence[str] \| None` | Keys to convert. If a key is not defined in the dictionary it's assumed to be | `None` |
formatter | `Callable[[dict[str, Any], str], str]` | Custom formatter function. It should accept a dictionary of metric result attributes and an attribute name, and return a formatted attribute value. | `get_and_format_num` |
Returns:
Type | Description |
---|---|
`str` | A table with results rendered as a string. |
Default formatting rules
- If a name starts with "rel_" or equals "power", consider it a percentage value. Round percentage values to 2 significant digits, multiply by 100, and add "%".
- Round other values to 3 significant digits.
- If a value is less than 0.001, format it in exponential presentation.
- If a name ends with "_ci", consider it a confidence interval. Look up the attributes "{name}_lower" and "{name}_upper", and format the interval as "[{lower_bound}, {upper_bound}]".
Source code in src/tea_tasting/utils.py
PowerBase
Bases: ABC, Generic[S], ReprMixin
Base class for the analysis of power.
solve_power(data, parameter='rel_effect_size')
abstractmethod
Solve for a parameter of the power of a test.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table` | Sample data. | required |
parameter | `Literal['power', 'effect_size', 'rel_effect_size', 'n_obs']` | Parameter name. | `'rel_effect_size'` |
Returns:
Type | Description |
---|---|
`S` | Power analysis result. |
Source code in src/tea_tasting/metrics/base.py
PowerBaseAggregated
Bases: PowerBase[S], _HasAggrCols
Base class for the analysis of power using aggregated statistics.
aggr_cols: AggrCols
abstractmethod
property
Columns to be aggregated for an analysis.
solve_power(data, parameter='rel_effect_size')
Solve for a parameter of the power of a test.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table \| Aggregates` | Sample data. | required |
parameter | `Literal['power', 'effect_size', 'rel_effect_size', 'n_obs']` | Parameter name. | `'rel_effect_size'` |
Returns:
Type | Description |
---|---|
`S` | Power analysis result. |
Source code in src/tea_tasting/metrics/base.py
solve_power_from_aggregates(data, parameter='rel_effect_size')
abstractmethod
Solve for a parameter of the power of a test.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `Aggregates` | Sample data. | required |
parameter | `Literal['power', 'effect_size', 'rel_effect_size', 'n_obs']` | Parameter name. | `'rel_effect_size'` |
Returns:
Type | Description |
---|---|
`S` | Power analysis result. |
Source code in src/tea_tasting/metrics/base.py
aggregate_by_variants(data, aggr_cols, variant=None)
Aggregate experimental data by variants.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table \| dict[Any, Aggregates]` | Experimental data. | required |
aggr_cols | `AggrCols` | Columns to be aggregated. | required |
variant | `str \| None` | Variant column name. | `None` |
Raises:
Type | Description |
---|---|
`ValueError` | The variant parameter is required but was not provided. |
`TypeError` | data is not an instance of DataFrame, Table, or a dictionary of Aggregates. |
Returns:
Type | Description |
---|---|
`dict[Any, Aggregates]` | Experimental data as a dictionary of Aggregates. |
Source code in src/tea_tasting/metrics/base.py
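The input handling and error contract described above (pass a variant-keyed dictionary through, require `variant` for tabular data, reject other types) can be sketched with the standard library. This is not the library's code: a list of row dictionaries stands in for a DataFrame or Table, the returned structure is a plain dictionary rather than `Aggregates`, and the function and column names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_by_variants_sketch(data, mean_cols, variant=None):
    """Return {variant: {"count": n, "mean": {col: mean}}} (stand-in structure)."""
    if isinstance(data, dict):
        # Already keyed by variant: pass through unchanged.
        return data
    if not isinstance(data, list):
        raise TypeError("data must be a list of row dicts or a dict keyed by variant")
    if variant is None:
        raise ValueError("variant is required for granular data")
    # Group rows by the value of the variant column.
    groups = defaultdict(list)
    for row in data:
        groups[row[variant]].append(row)
    return {
        key: {
            "count": len(rows),
            "mean": {col: fmean(r[col] for r in rows) for col in mean_cols},
        }
        for key, rows in groups.items()
    }


# Invented rows, used only for illustration.
rows = [
    {"variant": 0, "orders": 1},
    {"variant": 0, "orders": 3},
    {"variant": 1, "orders": 4},
]
aggregated = aggregate_by_variants_sketch(rows, ["orders"], variant="variant")
print(aggregated)
```

Accepting pre-aggregated input lets callers aggregate once and reuse the result across several metrics, which is one plausible reason for the `dict[Any, Aggregates]` option in the documented signature.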
read_dataframes(data, cols, variant=None)
Read granular experimental data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | `DataFrame \| Table \| dict[Any, DataFrame]` | Experimental data. | required |
cols | `Sequence[str]` | Columns to read. | required |
variant | `str \| None` | Variant column name. | `None` |
Raises:
Type | Description |
---|---|
`ValueError` | The variant parameter is required but was not provided. |
`TypeError` | data is not an instance of DataFrame, Table, or a dictionary of DataFrames. |
Returns:
Type | Description |
---|---|
`dict[Any, DataFrame]` | Experimental data as a dictionary of DataFrames. |