In this paper we present a structured overview of methods for two-mode clustering, that is, methods that provide a simultaneous clustering of the rows and columns of a rectangular data matrix. Key structuring principles include the nature of the row, column and data clusters and the type of model structure or associated loss function. We illustrate the methods with analyses of symptom data on archetypal psychiatric patients.