GAMbler

This is a growing collection of interesting and (hopefully) useful tips for ecological modeling, with (perhaps too much) emphasis on Generalized Additive Models.

Written by Nicholas Clark

Compositional modeling of plant communities with Dirichlet regression

Compositional data appears everywhere in scientific research, yet many analysts fall back on problematic approaches that ignore fundamental mathematical constraints. I demonstrate how Dirichlet regression with Gaussian process smooths provides a principled framework for modeling plant community composition across environmental gradients, using approximate Hilbert space methods that make these models computationally tractable for realistic datasets. Unlike separate binomial models that can predict impossible total abundances exceeding 100%, this approach automatically ensures predictions respect compositional constraints while capturing complex nonlinear environmental responses.
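To make the constraint concrete, here is a minimal sketch in plain Python (not the code from the post). It assumes a softmax link as a stand-in for the mean structure of a Dirichlet regression, and the linear-predictor values are hypothetical. It compares three independent inverse-logit predictions with a single softmax composition.

```python
import math

# Hypothetical linear-predictor values for three plant groups at one site
# (illustrative numbers only, not taken from the post).
eta = [0.8, 0.5, 0.2]

# Separate binomial-style models: each group gets its own inverse-logit,
# so nothing ties the three predictions together.
independent = [1.0 / (1.0 + math.exp(-e)) for e in eta]

# Softmax link (assumed here for the compositional mean): every prediction
# shares one normalising constant, so the proportions always sum to one.
denom = sum(math.exp(e) for e in eta)
composition = [math.exp(e) / denom for e in eta]

print(f"independent total: {sum(independent):.3f}")
print(f"softmax total:     {sum(composition):.3f}")
```

The independent predictions are each valid proportions, but their total is unconstrained. The softmax predictions sum to one by construction, whatever values the linear predictors take.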

GAMs for Customer Lifetime Value (CLV) prediction

Customer Lifetime Value models are critical for SaaS businesses, but standard regression approaches often predict impossible values like negative revenue or infinite growth. There are established methods to handle this (e.g. constrained optimization, truncation), but these can be complex to implement and maintain in production. Fewer still naturally incorporate business logic while remaining interpretable and deployable. This post demonstrates how to build CLV models that automatically respect business reality using GAMs, creating predictions that make sense without complex constraint matrices or manual bounds checking.

State-Space Vector Autoregressions in mvgam

Vector Autoregressions (VAR models), also known as Multivariate Autoregressions (MAR models), offer a way to model delayed and contemporaneous interactions among sets of multiple time series. These models are widely used in econometrics and psychology, among other fields, where they can be analyzed to ask many interesting questions about potential causality or stability. But software to fit these models to real-world time series, which often present as non-Gaussian counts, proportions or even binary observations with measurement error, is lacking. Here I show how to fit VARs in a State-Space format, and how to interrogate the models to ask meaningful questions about interactions and stability, using the mvgam package in R.
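As a rough illustration of what "stability" means here, a VAR(1) process is stable when every eigenvalue of its coefficient matrix has modulus below one. The sketch below checks this for a two-series system in plain Python. The function names and the coefficient matrices are hypothetical, and this is not how mvgam does it.

```python
import cmath

def var1_eigen_moduli(A):
    """Moduli of the eigenvalues of a 2x2 VAR(1) coefficient matrix."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    # Roots of the characteristic polynomial x^2 - trace * x + det;
    # cmath.sqrt handles the complex case (oscillating dynamics).
    disc = cmath.sqrt(trace * trace - 4 * det)
    return [abs((trace + disc) / 2), abs((trace - disc) / 2)]

def is_stable(A):
    """A VAR(1) is stable when all eigenvalue moduli are below one."""
    return max(var1_eigen_moduli(A)) < 1.0

# Hypothetical coefficient matrices: rows are responses, columns are
# lagged predictors, so off-diagonals are cross-series interactions.
A_stable = [[0.5, 0.2], [-0.1, 0.4]]
A_unstable = [[1.1, 0.0], [0.3, 0.6]]

print(is_stable(A_stable), is_stable(A_unstable))
```

For systems with more than two series the same check applies, but the eigenvalues would normally come from a linear algebra routine rather than a closed-form formula.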

Incorporating time-varying seasonality in forecast models

Many time series show repeated seasonal patterns, and fitting models that can capture this seasonality is a major focus of time series forecasting algorithms. There are a lot of useful, established methods to deal with this (e.g. SARIMA, harmonic regression), but sometimes the seasonal patterns change over time. Fewer time series and forecasting models can handle this feature. This post introduces some strategies for capturing time-varying seasonality and time-varying periodicity in Dynamic Generalized Additive Models, using the mvgam package in R.

First release of mvgam (v1.1.0) to CRAN

The mvgam package has been officially released to CRAN. This package fits Bayesian Dynamic Generalized Additive Models to sets of time series. Users can build dynamic nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation uses Hamiltonian Monte Carlo, a form of Markov Chain Monte Carlo, in the software Stan.

How to interpret and report nonlinear effects from Generalized Additive Models

Generalized additive models (GAMs) are incredibly flexible tools that fit penalized regression splines to data. But interpreting nonlinear effects from GAMs is not as easy as interpreting coefficients from linear models. In this post I provide three simple steps to help you understand and interpret nonlinear effects from GAMs using the mgcv R package.

Phylogenetic smoothing using mgcv

Use species’ phylogenetic or functional relationships to inform and partially pool hierarchical, nonlinear smooth functions in Generalized Additive Models with mgcv.

By Nicholas Clark in rstats mgcv

February 24, 2024