We address component-based regularization of a multivariate generalized linear model (GLM). A vector of random responses is assumed to depend, through a GLM, on a set of explanatory variables, as well as on a set of additional covariates. The explanatory variables are partitioned into conceptually homogeneous groups, viewed as explanatory themes. The variables in each theme are assumed to be many and redundant, so generalized linear regression demands dimension reduction and regularization with respect to each theme. By contrast, the additional covariates are assumed to be few and selected so as to demand no regularization. Regularization is performed by searching each theme for an appropriate number of orthogonal components that both contribute to modeling the responses and capture relevant structural information in that theme. To estimate a single-theme model, we first propose an enhanced version of Supervised Component Generalized Linear Regression (SCGLR), based on a flexible measure of the structural relevance of components and able to deal with mixed-type explanatory variables. Then, to estimate the multiple-theme model, we develop an algorithm encapsulating this enhanced SCGLR: THEME-SCGLR. The method is tested on simulated data and then applied to rainforest data in order to model the abundance of tree species.