CCA-Zoo

Multiview Canonical Correlation Analysis for Python — scikit-learn compatible, research-grade, batteries included.

pip install cca-zoo

What is CCA?

Given two or more views of the same observations — brain imaging and behavioural scores, gene expression and phenotypic data, audio and video features — Canonical Correlation Analysis finds projections that maximise correlation between the projected views.
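
As a rough illustration of that objective, the two-view case can be sketched in plain NumPy by whitening each view and taking an SVD of the whitened cross-covariance. The function name `cca_numpy`, the jitter `eps`, and the synthetic data are assumptions made for this sketch; it is not CCA-Zoo's implementation.

```python
import numpy as np


def cca_numpy(X, Y, k, eps=1e-8):
    """Classical two-view CCA via whitening and SVD (illustrative sketch)."""
    # Centre each view so covariances reduce to scaled cross-products.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; eps is a small jitter for numerical stability.
    Sxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance T = Lx^{-1} Sxy Ly^{-T}.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular values of T are the canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    # Map singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return Wx, Wy, s[:k]


# Synthetic views sharing one latent signal z (assumed data for illustration).
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X1 = z @ np.ones((1, 5)) + 0.5 * rng.normal(size=(n, 5))
X2 = z @ np.ones((1, 4)) + 0.5 * rng.normal(size=(n, 4))

Wx, Wy, corrs = cca_numpy(X1, X2, k=2)
z1, z2 = X1 @ Wx, X2 @ Wy  # projected views
```

Because the two views share a single latent signal, the first entry of `corrs` should be large and the second close to zero; the correlation between the first columns of `z1` and `z2` matches `corrs[0]` up to the jitter.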

CCA-Zoo extends classical CCA in several directions:

Linear & regularised

Classical CCA, ridge-regularised rCCA, PLS, and seven sparse/elastic-net variants for high-dimensional settings.

Linear methods →

Kernel & nonparametric

KCCA, KGCCA, and KTCCA bring nonlinear relationships into reach via the kernel trick — no explicit feature map needed.

Kernel methods →

Deep learning

DCCA and variants (EY, NOI, SDL, DCCAE, DVCCA, DTCCA, BarlowTwins, VICReg) using your own nn.Module encoders with PyTorch Lightning.

Deep methods →

Probabilistic

Full Bayesian treatment of CCA via NUTS MCMC with NumPyro — posterior inference over latent variables and loadings.

Probabilistic →
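
The kernel trick mentioned for the kernel methods can be checked numerically on a small case. The sketch below uses a degree-2 polynomial kernel, chosen here only because its feature map is easy to write out, and compares the Gram matrix computed from inner products alone with the one built from explicit features. The helper names `poly2_kernel` and `poly2_features` are hypothetical and unrelated to CCA-Zoo's API.

```python
import numpy as np


def poly2_kernel(A, B):
    """Homogeneous degree-2 polynomial kernel: k(a, b) = (a . b)^2."""
    return (A @ B.T) ** 2


def poly2_features(A):
    """Explicit feature map for the same kernel: all products a_i * a_j."""
    return np.einsum("ni,nj->nij", A, A).reshape(A.shape[0], -1)


rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))      # 6 observations, 3 features

K_trick = poly2_kernel(X, X)     # n x n Gram matrix, no feature map built
Phi = poly2_features(X)          # n x 9 explicit features
K_explicit = Phi @ Phi.T         # same Gram matrix, computed the long way
```

Both routes give the same 6 x 6 Gram matrix. For kernels such as the RBF kernel the explicit feature space is infinite-dimensional, so only the first route is available, which is why kernel methods work with Gram matrices.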

Unified API

Every model follows the same three-step scikit-learn pattern:

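import numpy as np
from cca_zoo.linear import CCA, rCCA, PLS
from cca_zoo.nonparametric import KCCA

# Placeholder data (an assumption for illustration): two views of the same 100 observations
rng = np.random.default_rng(0)
X1 = rng.normal(size=(100, 10))
X2 = rng.normal(size=(100, 8))

# 1. construct
model = CCA(latent_dimensions=2)

# 2. fit: views is a list of arrays, one per dataset
model.fit([X1, X2])

# 3. use
z1, z2 = model.transform([X1, X2])
corrs  = model.score([X1, X2])    # canonical correlations, shape (2,)
W1, W2 = model.weights            # weight matrices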

Models are sklearn.base.BaseEstimator subclasses, so they work directly with GridSearchCV, Pipeline, and cross-validation utilities.
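
To make concrete what subclassing `BaseEstimator` provides, here is a minimal sketch with a hypothetical stand-in class. `ToyCCA` and its parameter `c` are assumptions for illustration and are not CCA-Zoo classes or arguments; only `BaseEstimator`, `clone`, `get_params`, and `set_params` are real scikit-learn API.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone


class ToyCCA(BaseEstimator):
    """Hypothetical stand-in for a CCA-style estimator (not a CCA-Zoo class)."""

    def __init__(self, latent_dimensions=1, c=0.0):
        # scikit-learn convention: __init__ only stores its arguments.
        self.latent_dimensions = latent_dimensions
        self.c = c

    def fit(self, views, y=None):
        # Record per-view feature counts; a real model would learn weights here.
        self.n_features_ = [np.asarray(v).shape[1] for v in views]
        return self


model = ToyCCA(latent_dimensions=2)
model.fit([np.zeros((5, 3)), np.zeros((5, 2))])

params = model.get_params()             # read from the __init__ signature
tuned = clone(model).set_params(c=0.1)  # unfitted copy with one setting changed
```

Search utilities such as `GridSearchCV` rely on this clone-and-set-params mechanism to build a fresh estimator for each candidate setting.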

Navigate the docs

Getting Started

Installation, quick start examples, and core concepts.

User Guide

In-depth explanations of each method family with usage guidance.

API Reference

Full class and method documentation auto-generated from source.

Contributing

Development setup, coding standards, and how to contribute.