I've been working on a compact Python library called cgmm for regression modelling with Conditional Gaussian Mixture Models. Instead of predicting a single value, it models the full conditional distribution p(y | x), enabling flexible, data-driven regression beyond Gaussian and linear assumptions.
It integrates with scikit-learn, comes with documentation and examples, and is available on PyPI.
Key features:
* model non-Gaussian conditional distributions
* capture non-linear dependencies
* handle heteroscedastic noise (variance that changes with inputs)
* provide full predictive distributions, not just point estimates (illustrated in the sketch after this list)
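To make "full predictive distributions" concrete, here is a minimal, self-contained sketch of the underlying idea, written against plain scikit-learn rather than cgmm's own API: fit a joint GMM over (x, y), then condition on x. The result is a Gaussian mixture over y whose weights, means, and variances all vary with x, which is what gives multi-modal, heteroscedastic predictions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustration of the underlying idea (not the cgmm API): fit a joint GMM on
# (x, y), then condition on x to obtain a full predictive mixture p(y | x).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=2000)
y = np.sin(x) + rng.normal(scale=0.1 + 0.2 * np.abs(x))  # heteroscedastic noise
gmm = GaussianMixture(n_components=8, covariance_type="full", random_state=0)
gmm.fit(np.column_stack([x, y]))

def conditional_mixture(gmm, x0):
    """Condition a 2-D joint GMM on x = x0; returns weights, means, variances of p(y | x0)."""
    mu = gmm.means_        # shape (K, 2)
    S = gmm.covariances_   # shape (K, 2, 2)
    # Per-component conditional Gaussian (standard conditioning formulas)
    mean_y = mu[:, 1] + S[:, 1, 0] / S[:, 0, 0] * (x0 - mu[:, 0])
    var_y = S[:, 1, 1] - S[:, 1, 0] ** 2 / S[:, 0, 0]
    # Re-weight components by the marginal likelihood of x0 under each component
    log_w = np.log(gmm.weights_) - 0.5 * (np.log(2 * np.pi * S[:, 0, 0])
                                          + (x0 - mu[:, 0]) ** 2 / S[:, 0, 0])
    w = np.exp(log_w - log_w.max())
    return w / w.sum(), mean_y, var_y

w, m, v = conditional_mixture(gmm, x0=1.5)
print("E[y | x=1.5]   =", np.sum(w * m))                              # point estimate
print("Var[y | x=1.5] =", np.sum(w * (v + m**2)) - np.sum(w * m)**2)  # predictive variance
```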
The latest release adds:
* Mixture of Experts (MoE): softmax-gated experts with linear mean functions (Jordan & Jacobs, “Hierarchical Mixtures of Experts and the EM Algorithm”, Neural Computation, 1994); see the sketch after this list
* Direct conditional likelihood optimization: an EM algorithm that maximizes the conditional likelihood directly (Salojärvi, Puolamäki & Kaski, “Expectation Maximization Algorithms for Conditional Likelihoods”, ICML 2005)
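For readers new to MoE, here is a toy numpy sketch of the density it defines (my own illustration, not cgmm's internals): a softmax over gating logits decides how much each linear-Gaussian expert contributes at a given x, and the fitting step then maximizes the conditional log-likelihood Σᵢ log p(yᵢ | xᵢ) over all gate and expert parameters.

```python
import numpy as np

# Toy illustration of the MoE conditional density (not cgmm's internals):
# p(y | x) = sum_k softmax(v x + c)_k * N(y; a_k x + b_k, s_k^2)
def moe_density(y, x, v, c, a, b, s):
    """v, c: gating slopes/intercepts; a, b, s: expert slopes, intercepts, stdevs (length K each)."""
    logits = v * x + c
    gates = np.exp(logits - logits.max())
    gates /= gates.sum()                               # softmax gating weights
    means = a * x + b                                  # linear expert means
    dens = np.exp(-0.5 * ((y - means) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))
    return float(np.sum(gates * dens))

# Two experts with different slopes; the gate hands x < 0 to expert 0 and x > 0 to expert 1
params = dict(v=np.array([-4.0, 4.0]), c=np.zeros(2),
              a=np.array([-1.0, 1.0]), b=np.zeros(2), s=np.array([0.3, 0.3]))
print(moe_density(y=0.9, x=1.0, **params))   # high density: expert 1 predicts ~1.0 at x = 1
print(moe_density(y=-0.9, x=1.0, **params))  # low density: far from the active expert's mean
```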
Examples now cover a range of applications:
* VIX volatility Monte Carlo simulation (non-linear, non-Gaussian SDEs; a stylised stand-in follows this list)
* Multivariate seasonal forecasts (temperature, wind speed, light intensity)
* Iris dataset + scikit-learn benchmarks
* Generative modelling of handwritten digits
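To give a flavour of the Monte Carlo example: the repo's version fits the conditional mixture from data, but a stylised, hand-parameterised stand-in (all numbers below are made up for illustration) shows the mechanism, repeatedly sampling the next value from a two-component conditional mixture with level-dependent volatility and occasional jumps.

```python
import numpy as np

# Stylised stand-in for the VIX example (parameters invented, not fitted):
# simulate a path by sampling y_{t+1} from a two-component conditional mixture.
rng = np.random.default_rng(1)

def sample_next(y_t):
    w = np.array([0.95, 0.05])                        # mixture weights: calm vs. spike regime
    means = np.array([y_t + 0.05 * (20.0 - y_t),      # mean-reverting component
                      y_t + 5.0])                     # upward jump component
    stds = np.array([0.10 * y_t, 2.0])                # level-dependent (heteroscedastic) vol
    k = rng.choice(2, p=w)
    return max(rng.normal(means[k], stds[k]), 1e-6)   # keep the level positive

path = [18.0]
for _ in range(250):                                  # roughly one year of daily steps
    path.append(sample_next(path[-1]))
print(min(path), max(path))
```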
Links:
Docs: https://cgmm.readthedocs.io/en/latest/
GitHub: https://github.com/sitmo/cgmm
PyPI: https://pypi.org/project/cgmm/
I'd love feedback from the community, especially on use cases involving non-Gaussian, non-linear data.