This article is concerned with a subset of numerically stable and scalable algorithms that support computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians; it covers whitening transforms for dealing with correlated data, computational concepts for linear models, multivariable integration, and optimization techniques.
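As a minimal illustration of the first of these topics, the sketch below (not taken from the article; the data and variable names are hypothetical) whitens a correlated data matrix using the Cholesky factor of its sample covariance, so that the transformed variables have identity covariance up to rounding.

```python
# Illustrative sketch: Cholesky whitening of correlated data.
# All quantities here are simulated for demonstration only.
import numpy as np

rng = np.random.default_rng(0)

# Simulate n observations on p correlated variables.
n, p = 1000, 3
cov = np.array([[1.0, 0.6, 0.3],
                [0.6, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
x = rng.multivariate_normal(np.zeros(p), cov, size=n)

# Center the data, estimate the sample covariance S, and factor S = L L'.
xc = x - x.mean(axis=0)
s = np.cov(xc, rowvar=False)
L = np.linalg.cholesky(s)

# Whitened data z = xc L^{-T}; its sample covariance is the identity.
z = np.linalg.solve(L, xc.T).T
print(np.round(np.cov(z, rowvar=False), 2))  # ~ identity matrix
```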