Abstract
In regression modelling, categorical covariates have to be coded. Depending on the number
of categorical covariates and on the number of levels they have, the number of
coefficients can become huge. To reduce the model complexity, coefficients of similar
categories should be fused and coefficients of non-influential categories should be set to
zero. To this end, Lasso-type penalties on the differences of coefficients are a standard
approach. However, the clustering/selection performance of this approach is sometimes
poor, especially when the adaptive weights are badly conditioned or do not exist. In some
situations, there is no incentive to cluster similar categories. To overcome this, a
