tea-tasting: statistical analysis of A/B tests
tea-tasting is a Python package for the statistical analysis of A/B tests featuring:
- Student's t-test, Z-test, bootstrap, and quantile metrics out of the box.
- Extensible API: define and use statistical tests of your choice.
- Delta method for ratio metrics.
- Variance reduction using CUPED/CUPAC (which can also be combined with the delta method for ratio metrics).
- Confidence intervals for both absolute and percentage changes.
- Sample ratio mismatch check.
- Power analysis.
- Multiple hypothesis testing (family-wise error rate and false discovery rate).
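To give a feel for one of the listed techniques, here is a minimal sketch of the delta method for a ratio metric such as orders per session. It is a generic, standard-library illustration rather than the package's own implementation; the function name `ratio_of_means_variance` and the sample values are hypothetical.

```python
from statistics import fmean


def ratio_of_means_variance(numer, denom):
    """Delta-method variance of mean(numer) / mean(denom).

    `numer` and `denom` hold paired per-user values, for example
    orders and sessions of the same users. Hypothetical helper.
    """
    n = len(numer)
    mean_n, mean_d = fmean(numer), fmean(denom)
    # Unbiased sample variances and covariance.
    var_n = sum((x - mean_n) ** 2 for x in numer) / (n - 1)
    var_d = sum((y - mean_d) ** 2 for y in denom) / (n - 1)
    cov_nd = sum(
        (x - mean_n) * (y - mean_d) for x, y in zip(numer, denom)
    ) / (n - 1)
    ratio = mean_n / mean_d
    # First-order Taylor expansion of the ratio around the means.
    return (var_n - 2 * ratio * cov_nd + ratio**2 * var_d) / (mean_d**2 * n)


# Made-up per-user data, for illustration only.
orders = [0, 1, 2, 1, 0, 3]
sessions = [1, 2, 4, 2, 1, 5]
variance = ratio_of_means_variance(orders, sessions)
print(f"ratio = {fmean(orders) / fmean(sessions):.3f}, variance = {variance:.5f}")
```

The covariance term is what distinguishes this from treating the numerator and denominator as independent: when orders are exactly proportional to sessions, the ratio is constant and the estimated variance drops to zero.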
tea-tasting calculates statistics directly within data backends such as BigQuery, ClickHouse, DuckDB, PostgreSQL, Snowflake, Spark, and many other backends supported by Ibis. This approach eliminates the need to import granular data into a Python environment. tea-tasting also accepts dataframes supported by Narwhals: cuDF, Dask, Modin, pandas, Polars, PyArrow.
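The reason granular data does not have to leave the backend is that many tests need only a few aggregates per variant. The sketch below runs a two-sided Z-test from counts, means, and variances alone. It is an illustration under assumptions, not the package's code: the function name, the SQL in the comment, and the numbers are all hypothetical.

```python
import math
from statistics import NormalDist


def z_test_from_aggregates(n_c, mean_c, var_c, n_t, mean_t, var_t):
    """Two-sided Z-test for a difference in means from aggregates.

    Hypothetical helper. The inputs could come from a query such as:
        SELECT variant, COUNT(*), AVG(sessions), VAR_SAMP(sessions)
        FROM users GROUP BY variant
    (table and column names are made up for illustration).
    """
    # Standard error of the difference between two independent means.
    std_err = math.sqrt(var_c / n_c + var_t / n_t)
    z = (mean_t - mean_c) / std_err
    pvalue = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, pvalue


# Made-up aggregates: one row per variant is all the test needs.
z, pvalue = z_test_from_aggregates(
    n_c=2000, mean_c=2.0, var_c=1.0,
    n_t=2000, mean_t=2.2, var_t=1.0,
)
print(f"z = {z:.2f}, pvalue = {pvalue:.2g}")
```

Only six numbers cross the boundary between the database and Python, regardless of how many rows the experiment contains.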
Check out the blog post explaining the advantages of using tea-tasting for the analysis of A/B tests.
Installation
pip install tea-tasting
Basic example
>>> import tea_tasting as tt
>>> data = tt.make_users_data(seed=42)
>>> experiment = tt.Experiment(
...     sessions_per_user=tt.Mean("sessions"),
...     orders_per_session=tt.RatioOfMeans("orders", "sessions"),
...     orders_per_user=tt.Mean("orders"),
...     revenue_per_user=tt.Mean("revenue"),
... )
>>> result = experiment.analyze(data)
>>> print(result)
            metric control treatment rel_effect_size rel_effect_size_ci pvalue
 sessions_per_user    2.00      1.98          -0.66%      [-3.7%, 2.5%]  0.674
orders_per_session   0.266     0.289            8.8%      [-0.89%, 19%] 0.0762
   orders_per_user   0.530     0.573            8.0%       [-2.0%, 19%]  0.118
  revenue_per_user    5.24      5.73            9.3%       [-2.4%, 22%]  0.123
Learn more in the detailed user guide. Additionally, see the guides on data backends, power analysis, multiple hypothesis testing, and custom metrics.
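As a rough illustration of why multiple hypothesis testing matters when an experiment reports several metrics, the sketch below applies the Benjamini-Hochberg false discovery rate adjustment to the four p-values from the basic example. This is a generic, standard-library sketch; the function name is hypothetical, and the package's own adjustment functions may use a different procedure and defaults.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration only.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# p-values from the basic example, in the order they are printed.
pvalues = [0.674, 0.0762, 0.118, 0.123]
adjusted = [round(p, 3) for p in benjamini_hochberg(pvalues)]
print(adjusted)  # [0.674, 0.164, 0.164, 0.164]
```

After the adjustment, the smallest p-value rises from 0.0762 to about 0.164, which shows how a result that looks borderline on its own becomes weaker evidence once all four metrics are considered together.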
Roadmap
- A/A tests and simulations.
- More statistical tests:
    - Asymptotic and exact tests for frequency data.
    - Mann–Whitney U test.
    - Sequential testing.
Package name
The package name "tea-tasting" is a play on words that refers to two subjects:
- "Lady tasting tea" is a famous experiment devised by Ronald Fisher. In this experiment, Fisher developed the null hypothesis significance testing framework to analyze a lady's claim that she could discern whether the tea or the milk was added first to the cup.
- "tea-tasting" phonetically resembles "t-testing" or Student's t-test, a statistical test developed by William Gosset.