Understanding the Differences and Applications of ANOVA, ANCOVA, MANOVA, and MANCOVA in Machine Learning

ANOVA: Tests if the average of one outcome differs between groups.
Example: Compare the accuracy of 3 ML models.

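import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Two accuracy scores per model
data = pd.DataFrame({
    'model': ['A', 'A', 'B', 'B', 'C', 'C'],
    'accuracy': [0.8, 0.82, 0.85, 0.86, 0.9, 0.91]
})

# Fit a linear model with 'model' as a categorical predictor,
# then print the ANOVA table (F statistic and p-value for 'model')
model = ols('accuracy ~ model', data=data).fit()
print(sm.stats.anova_lm(model))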
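
To show where the F statistic in an ANOVA table comes from, here is a minimal sketch that recomputes it by hand for the same six accuracy values, using only the standard library. The helper name `one_way_f` is an illustrative choice, not a statsmodels function.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to its list of observations.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Between-group sum of squares: distance of each group mean from the grand mean
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group sum of squares: spread of observations around their own group mean
    ss_within = 0.0
    for values in groups.values():
        group_mean = mean(values)
        ss_within += sum((x - group_mean) ** 2 for x in values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

accuracy_by_model = {
    'A': [0.8, 0.82],
    'B': [0.85, 0.86],
    'C': [0.9, 0.91],
}
f_stat, df_between, df_within = one_way_f(accuracy_by_model)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The value printed here should agree with the F reported for `model` by `anova_lm`. A large F means the group means differ by more than the within-group noise would explain.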

ANCOVA: Like ANOVA, but adjusts the group comparison for one or more continuous covariates.
Example: Compare model accuracy while controlling for training time.

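# Add the covariate: training time for each run
data['train_time'] = [10, 11, 9, 10, 8, 9]

model = ols('accuracy ~ model + train_time', data=data).fit()

# typ=2 tests each term after adjusting for the other. The default (typ=1) is
# sequential, so 'model' would be tested before train_time is accounted for.
print(sm.stats.anova_lm(model, typ=2))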
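
What "controlling for training time" means can be made concrete by comparing two regressions: one with the covariate only, and one with the covariate plus the model groups. The sketch below does that arithmetic in plain Python for the same data. It assumes a single slope shared by all groups (the usual ANCOVA assumption), and the helper names `cross_dev` and `ancova_group_f` are illustrative, not library functions.

```python
from statistics import mean

def cross_dev(xs, ys):
    """Sum of products of deviations of xs and ys from their own means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

def ancova_group_f(groups):
    """F test for the group effect after adjusting for one covariate.

    `groups` maps a label to a list of (covariate, outcome) pairs.
    Assumes one regression slope shared by all groups.
    """
    pairs = [pair for rows in groups.values() for pair in rows]
    xs, ys = zip(*pairs)

    # Reduced model (covariate only): residual SS around one regression line
    rss_reduced = cross_dev(ys, ys) - cross_dev(xs, ys) ** 2 / cross_dev(xs, xs)

    # Full model (group + covariate): same formula on pooled within-group sums
    w_xx = w_xy = w_yy = 0.0
    for rows in groups.values():
        gx, gy = zip(*rows)
        w_xx += cross_dev(gx, gx)
        w_xy += cross_dev(gx, gy)
        w_yy += cross_dev(gy, gy)
    rss_full = w_yy - w_xy ** 2 / w_xx

    df_group = len(groups) - 1
    df_resid = len(pairs) - len(groups) - 1  # one extra df is spent on the slope
    f_stat = ((rss_reduced - rss_full) / df_group) / (rss_full / df_resid)
    return f_stat, df_group, df_resid

# (train_time, accuracy) pairs per model
runs = {
    'A': [(10, 0.8), (11, 0.82)],
    'B': [(9, 0.85), (10, 0.86)],
    'C': [(8, 0.9), (9, 0.91)],
}
f_stat, df_group, df_resid = ancova_group_f(runs)
print(f"Adjusted F({df_group}, {df_resid}) = {f_stat:.2f}")
```

The F value answers the question "how much does knowing the model improve the fit once training time is already in the regression?", which is the adjusted comparison ANCOVA is meant to provide.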

MANOVA: Tests differences in several outcomes at once.
Example: Compare models on accuracy and precision together.

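import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Two outcomes (accuracy and precision) per run
data = pd.DataFrame({
    'model': ['A', 'A', 'B', 'B', 'C', 'C'],
    'accuracy': [0.8, 0.82, 0.85, 0.86, 0.9, 0.91],
    'precision': [0.7, 0.72, 0.75, 0.76, 0.8, 0.82]
})

# Both outcomes go on the left-hand side of the formula
manova = MANOVA.from_formula('accuracy + precision ~ model', data=data)

# Prints multivariate test statistics for each term in the formula
print(manova.mv_test())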
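
One of the statistics a MANOVA reports is Wilks' lambda: the determinant of the within-group scatter matrix divided by the determinant of the total scatter matrix. As a rough sketch of that calculation for two outcomes, the following plain-Python code recomputes it for the data above. The helper names (`sscp`, `det2`, `wilks_lambda`) are illustrative and not part of statsmodels.

```python
from statistics import mean

def sscp(rows):
    """Sums of squares and cross-products of deviations from the column means."""
    columns = list(zip(*rows))
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centered]
            for ci in centered]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def wilks_lambda(groups):
    """Wilks' lambda for a one-way design with two outcome variables.

    `groups` maps a label to a list of (outcome1, outcome2) rows.
    """
    all_rows = [row for rows in groups.values() for row in rows]
    total = sscp(all_rows)

    # Pool the scatter of each group around its own mean vector
    within = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups.values():
        s = sscp(rows)
        for i in range(2):
            for j in range(2):
                within[i][j] += s[i][j]

    return det2(within) / det2(total)

# (accuracy, precision) rows per model
scores = {
    'A': [(0.8, 0.7), (0.82, 0.72)],
    'B': [(0.85, 0.75), (0.86, 0.76)],
    'C': [(0.9, 0.8), (0.91, 0.82)],
}
lam = wilks_lambda(scores)
print(f"Wilks' lambda = {lam:.4f}")
```

Values near 0 indicate that most of the joint variation in accuracy and precision lies between the groups; values near 1 indicate that the groups look alike on the outcomes taken together.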

MANCOVA: Like MANOVA, but adjusts for one or more covariates.
Example: Compare models on accuracy and precision while controlling for training time.

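# Add the covariate: training time for each run
data['train_time'] = [10, 11, 9, 10, 8, 9]

# The covariate goes on the right-hand side next to the grouping variable.
# With only six rows, few residual degrees of freedom remain, so treat the output as illustrative.
manova = MANOVA.from_formula('accuracy + precision ~ model + train_time', data=data)
print(manova.mv_test())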
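
The adjustment in MANCOVA can be sketched the same way as Wilks' lambda in a plain MANOVA, with one extra step: the covariate is first partialled out of the scatter matrices. The code below is a minimal plain-Python illustration for two outcomes and one covariate, assuming a common slope across groups. All helper names (`sscp`, `partial_out_last`, `det2`) are hypothetical, not statsmodels API.

```python
from statistics import mean

def sscp(rows):
    """Sums of squares and cross-products of deviations from the column means."""
    columns = list(zip(*rows))
    centered = []
    for col in columns:
        m = mean(col)
        centered.append([v - m for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centered]
            for ci in centered]

def partial_out_last(m):
    """Residual scatter of the other variables after regressing on the last one."""
    k = len(m) - 1
    return [[m[i][j] - m[i][k] * m[j][k] / m[k][k] for j in range(k)]
            for i in range(k)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# (accuracy, precision, train_time) rows per model; the covariate is last
runs = {
    'A': [(0.8, 0.7, 10), (0.82, 0.72, 11)],
    'B': [(0.85, 0.75, 9), (0.86, 0.76, 10)],
    'C': [(0.9, 0.8, 8), (0.91, 0.82, 9)],
}

all_rows = [row for rows in runs.values() for row in rows]
total = sscp(all_rows)

# Pooled within-group scatter of all three variables
within = [[0.0] * 3 for _ in range(3)]
for rows in runs.values():
    s = sscp(rows)
    for i in range(3):
        for j in range(3):
            within[i][j] += s[i][j]

resid_full = partial_out_last(within)    # model: group + covariate
resid_reduced = partial_out_last(total)  # model: covariate only
wilks = det2(resid_full) / det2(resid_reduced)
print(f"Adjusted Wilks' lambda = {wilks:.4f}")
```

The ratio compares what is left unexplained with and without the grouping variable once training time is already in the model, so a small value suggests the models still differ on accuracy and precision after the adjustment.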

Test	Definition	When to Use	Example (from earlier code)	Explanation of Example
ANOVA	Compare means of one dependent variable across groups	When comparing one outcome across different groups or models	Comparing accuracy of 3 ML models (accuracy ~ model)	Tests if average accuracy differs between models A, B, and C
ANCOVA	Like ANOVA but controls for one or more covariates	When you want to adjust the outcome for other variables	Comparing accuracy while controlling for training time (accuracy ~ model + train_time)	Checks if model accuracy differs between models after accounting for training time differences
MANOVA	Compare means of multiple dependent variables across groups	When comparing several related outcomes together	Comparing accuracy and precision across models (accuracy + precision ~ model)	Tests if model groups differ on both accuracy and precision together
MANCOVA	MANOVA with covariates included	When adjusting multiple outcomes for other influencing factors	Comparing accuracy and precision while controlling for training time (accuracy + precision ~ model + train_time)	Assesses differences in accuracy and precision between models after adjusting for training time