We propose an extension of a systemic model for object recognition formulated by Rybak et al (1998 Vision Research 38 2387–2400), which is based on the functional organisation of the visual system in the primate brain. In contrast to the learning and recognition scheme of Rybak et al, we do not assume a behavioural paradigm, ie a visuomotor programmed scanpath that determines the sequence in which the different parts of the object are foveated. As in the basic architecture of Rybak et al, the system modules are separated into ‘what’-like subsystems, corresponding to the ventral occipito-inferotemporal visual pathway, and ‘where’-like complexes, analogous to the dorsal occipito-parietal visual pathway. As in Rybak et al, the ‘what’ system analyses local features at the current fixation. In our case, however, the ‘where’ memory, instead of programming a behavioural scanpath, scores the spatial relationship between successive fixations and the spatial relationship between their associated main edges. Recognition is thus based on the identification of parts and their spatial relationships. This gives the learning and recognition mechanisms more flexibility, in that several different fixation sequences may be accepted when recognising an object.
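The idea of scoring spatial relationships rather than replaying a fixed scanpath can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the model itself: the fixation record layout (part label, position, main-edge angle), the function names `relation`, `learn` and `score`, the tolerance values, and the simple hit-counting rule. The part labels stand in for the output of the ‘what’ system, and the stored pairwise relations stand in for the ‘where’ memory.

```python
import math
from itertools import permutations

# Hypothetical fixation record: (part_label, x, y, main_edge_angle_in_radians).

def relation(a, b):
    """Spatial relation from fixation a to fixation b, measured relative to
    the main edge at a: (distance, shift direction, edge-angle difference)."""
    _, ax, ay, a_theta = a
    _, bx, by, b_theta = b
    dx, dy = bx - ax, by - ay
    dist = math.hypot(dx, dy)
    direction = (math.atan2(dy, dx) - a_theta) % (2 * math.pi)
    edge_diff = (b_theta - a_theta) % (2 * math.pi)
    return dist, direction, edge_diff

def angle_gap(u, v):
    """Smallest absolute difference between two angles."""
    d = abs(u - v) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def learn(fixations):
    """Assumed 'where' memory: one relation per ordered pair of parts."""
    return {(a[0], b[0]): relation(a, b) for a, b in permutations(fixations, 2)}

def score(memory, sequence, dist_tol=0.2, angle_tol=0.3):
    """Fraction of successive fixation pairs whose relation matches memory.
    The tolerances are illustrative values, not fitted parameters."""
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    hits = 0
    for a, b in pairs:
        stored = memory.get((a[0], b[0]))
        if stored is None:
            continue  # this pair of parts was never learned
        dist, direction, edge_diff = relation(a, b)
        if (abs(dist - stored[0]) <= dist_tol * max(stored[0], 1e-9)
                and angle_gap(direction, stored[1]) <= angle_tol
                and angle_gap(edge_diff, stored[2]) <= angle_tol):
            hits += 1
    return hits / len(pairs)

# Toy object with three parts (labels and coordinates are made up).
model = [("corner", 0.0, 0.0, 0.0),
         ("junction", 3.0, 0.0, math.pi / 2),
         ("tip", 0.0, 4.0, math.pi / 4)]
memory = learn(model)
```

Because the memory in this sketch holds a relation for every ordered pair of parts, any visiting order of the same three parts is scored against the same stored relations, whereas a sequence in which one part has been displaced matches fewer of them. Storing all ordered pairs is a convenience of the sketch; a sparser memory would accept fewer sequences.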