Abstract
Image segmentation and registration are closely related image processing techniques that are often required simultaneously. In this work, we introduce an optimization-based approach to a joint registration and segmentation model for the deformation of multimodal images. The model combines an active contour variational term with a mutual information (MI) smoothing fitting term and thereby addresses the difficulties that arise when segmentation and registration of multimodal images are performed simultaneously. This combination takes into account both the boundaries of image structures and the movement of the objects, leading to a robust dynamic scheme that links object boundary information as it changes over time. Comparison of our model with the state of the art shows that our method leads to more consistent registrations and more accurate results.
Introduction
In recent decades, the development of imaging devices has significantly increased the volume of image data and, consequently, the need for human interpretation. To reduce the labour hours of specialists, new image processing techniques are being developed. Two main image processing techniques are image registration and image segmentation. Image registration provides an understanding of how the data behave in two or more scenarios taken at different times. Image segmentation, meanwhile, is the process of partitioning an image into objects or features based on the characteristics of each pixel. In a wide range of problems, such as the comparison of images taken at different times or with different modalities, image registration depends on image segmentation and vice versa. Erdt et al. 1 indicate that more than 20% of scientific research in the medical imaging area requires a combined registration and segmentation scheme. Combined registration and segmentation models can be divided into two categories: simultaneous registration and segmentation, and joint segmentation and registration, which has recently been referred to as regmentation. In recent years, regmentation functional models have been introduced in monomodal imaging frameworks. These models are variational rigid registration based models,2,3 variational nonrigid registration based models,4,5 or atlas based models. 6 Recently, Ibrahim-Rada-Chen 7 presented a linear curvature based variational regmentation model which benefits from both global and local deformation and can easily cope with images that contain more than one object.
In contrast to monomodal image processing, multimodal image processing requires estimating spatial correspondence or extracting certain information from images with different modalities (or protocols). An example is imaging with CT and MRI protocols, which bring different image perspectives to clinical and pre-clinical diagnostics. Even though assessing image similarity is challenging for multimodal images, such images are desirable because they provide different image perspectives and properties. Unlike monomodal image registration, where simple similarity measures between image pairs can provide a good spatial transformation, dealing with multiple modalities is substantially more difficult. Registering multimodal images, which are acquired through different mechanisms and have distinct modalities, involves aligning both images in terms of shapes and salient components while preserving the modality of one given image. For a comprehensive and systematic overview of registration techniques, we refer to the work of Sotiras et al. 8 By adding the active contour without edges model to the MI-based curvature registration proposed by Modersitzki, 9 a modified joint image segmentation and registration scheme is obtained, whose results are shown in Figure 1. The same images are processed with the newly proposed joint regmentation model, and the results shown in the last section indicate that combining the segmentation and registration tasks into a single framework is worthwhile, as it avoids the errors produced when the tasks are performed simultaneously. The existing registration models for monomodal images can be classified as unsupervised and supervised methods. Among the different registration similarity measures, the most popular is mutual information (MI), introduced by Viola et al. 10 Recently, learning-based approaches to measuring image similarity have been proposed as well. 
Techniques involving KL-divergence, 11 max-margin structured learning, 12 boosting, 13 or deep learning 14 have been investigated. However, their reliance on large training sets limits their generalization to previously unseen object classes, and the registration performance depends on the numerical optimization of the registration parameters. There are a few unsupervised works that address the joint segmentation and registration problem. The work of Wang and Vemuri 15 introduces a registration driven by the segmentation of a reference image without varying degrees of non-rigidity. The model applies cross cumulative residual entropy as a distance measure 15 and the piecewise constant Chan-Vese model for segmentation. 16 Although the algorithm can accommodate image pairs with very distinct intensity distributions, it fails in deforming large objects. Droske et al. 17 presented a variational method based on a local energy density which uses a generalized motion of image morphology for multimodal image registration. A similar idea was later proposed by Aganj and Fischl, 18 where the segmentation score was applied to register two multimodal brain images. This model is not designed for general image deformation, as it applies a rigid image transformation. Recently, a joint segmentation and registration model was proposed by Debroux and Le Guyader, 19 formulated as a combination of nonlocal total variation and nonlocal shape descriptors. The fitting term of this model uses a weighted sum of squared differences distance measure, similar to 7, and the model depends on many parameters such as window size, patch size, and number of neighboring pixels. To overcome the difficulties mentioned above, we propose a new variational model which combines linear curvature, known for its ability to generate smooth transformations,9,20 and MI as a distance measure10,21 for the spatial transform.
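To illustrate why a curvature regulariser favours smooth transformations, the following minimal sketch evaluates a discrete 1-D curvature energy, assuming the common form built from squared second differences of the displacement. The function name `curvature_energy`, the grid spacing `h`, and the two test displacements are illustrative assumptions and not the discretisation used in this paper.

```python
def curvature_energy(u, h=1.0):
    """Discrete 1-D curvature energy: 0.5 * h * sum of squared second differences.

    Assumed form, for illustration only; the two boundary points are skipped.
    """
    total = 0.0
    for i in range(1, len(u) - 1):
        second_diff = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / (h * h)
        total += second_diff ** 2
    return 0.5 * h * total


n = 16
# Affine displacement (translation plus linear stretch): no curvature penalty.
affine = [2.0 + 0.5 * i for i in range(n)]
# Oscillating displacement of comparable magnitude: heavily penalised.
oscillating = [2.0 + (1.0 if i % 2 == 0 else -1.0) for i in range(n)]

energy_affine = curvature_energy(affine)
energy_oscillating = curvature_energy(oscillating)
```

In this discrete form an affine displacement has zero energy, whereas the oscillating one is strongly penalised, which is the behaviour one expects from a regulariser that promotes smooth deformation fields.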

Simultaneous segmentation and linear curvature registration based on the MI model for a noisy synthetic image and a CT image. First column: segmented template image, T &
The outline of this paper is as follows. In the next section, we review the existing models for monomodal image regmentation and multimodal image registration. Then, we introduce our new model for multimodal image regmentation. In the last section, we present numerical tests, including comparisons.
Related work
Deformable multimodal registration attempts to find spatial correspondences between image pairs that are acquired with distinct scanning protocols. In the registration process, these correspondences are estimated by determining the spatial transformation which, applied to a template image, makes it similar to the reference image. Generally, image registration is driven by the chosen similarity measure and the chosen deformation model. For images taken with the same modality, the image similarity is usually considered to be high, so simple and direct similarity measures are used, for instance the sum of squared intensity differences (SSD). We start this section with an overview of the work introduced by Ibrahim-Rada-Chen 7 for the monomodal regmentation of two given images. The model merges segmentation and registration into a single variational functional expressed in a level set formulation. This method improves the Guyader-Vese model 22 by replacing the nonlinear elastic term with a linear curvature model and adding a weighted Heaviside SSD term.
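The limitation of SSD once the modalities differ can be seen in a small sketch. It compares SSD with a histogram-based MI estimate on a toy 1-D "image" and a contrast-inverted copy that stands in for a second modality. The helper names, the bin count, and the intensity values are illustrative assumptions, and this plain joint-histogram MI is not necessarily the estimator used in the models cited here.

```python
import math
from collections import Counter


def ssd(a, b):
    """Sum of squared intensity differences between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mutual_information(a, b, bins=4, levels=256):
    """Plain joint-histogram MI estimate (in nats) for integer intensities."""
    n = len(a)
    joint = Counter((x * bins // levels, y * bins // levels) for x, y in zip(a, b))
    marg_a, marg_b = Counter(), Counter()
    for (i, j), count in joint.items():
        marg_a[i] += count
        marg_b[j] += count
    # p_ij * log(p_ij / (p_i * p_j)), written with raw counts
    return sum(
        (count / n) * math.log(count * n / (marg_a[i] * marg_b[j]))
        for (i, j), count in joint.items()
    )


reference = [10, 10, 200, 200, 120, 120, 10, 200]
# Second "modality": same structure, inverted contrast, correctly aligned.
template_aligned = [255 - v for v in reference]
# The same template shifted cyclically by two pixels (misaligned).
template_shifted = template_aligned[2:] + template_aligned[:2]

ssd_aligned = ssd(reference, template_aligned)
ssd_shifted = ssd(reference, template_shifted)
mi_aligned = mutual_information(reference, template_aligned)
mi_shifted = mutual_information(reference, template_shifted)
```

In this toy case SSD is lower for the misaligned pair than for the aligned one, so minimising it would favour the wrong alignment, while MI is larger for the aligned pair because it depends only on the statistical relation between intensities and not on their values.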
Let R be a reference image and T a template image in a given domain
The term
The values of c1 and c2 in equation (1) represent the average intensity values inside and outside the boundary
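As a minimal sketch of how such region averages are obtained, the code below computes c1 and c2 from a level set function, assuming the convention that positive values mark the inside of the contour and using a sharp (unregularised) Heaviside function. The names `region_means`, `fitting_energy`, and `phi`, the unweighted fitting term, and the sample values are hypothetical.

```python
def region_means(image, phi):
    """Return (c1, c2): mean intensity where phi > 0 (inside) and elsewhere (outside)."""
    inside = [v for v, p in zip(image, phi) if p > 0]
    outside = [v for v, p in zip(image, phi) if p <= 0]
    c1 = sum(inside) / len(inside) if inside else 0.0
    c2 = sum(outside) / len(outside) if outside else 0.0
    return c1, c2


def fitting_energy(image, phi, c1, c2):
    """Piecewise-constant fitting term: squared deviation from c1 inside, c2 outside.

    Region weights are omitted here (both assumed equal to 1).
    """
    return sum((v - (c1 if p > 0 else c2)) ** 2 for v, p in zip(image, phi))


image = [8, 10, 9, 1, 2, 0]        # bright object followed by dark background
phi_good = [1, 1, 1, -1, -1, -1]   # contour placed on the true boundary
phi_bad = [1, 1, -1, -1, -1, -1]   # contour one pixel short of the boundary

c1, c2 = region_means(image, phi_good)
energy_good = fitting_energy(image, phi_good, c1, c2)
energy_bad = fitting_energy(image, phi_bad, *region_means(image, phi_bad))
```

The fitting energy is smaller when the contour coincides with the object boundary, which is what drives the segmentation part of an active contour without edges model.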
In the multilevel representation, the registration problem is solved using a quasi-Newton method by updating the deformation field
Regarding registration for multimodality, different techniques have been introduced.23–25 A significant work proposed by Modersitzki, 9 based on mutual information and a curvature regularizer, competes with the other registration models. The joint functional for multimodal registration is given as follows:
The normalised gradient field (NGF) measure can be considered as the L2 norm of R, the residual of the alignment of the normalised gradients of the two images at a pixel position
The proposed new joint regmentation model for multimodality
Our joint functional model for multimodal image regmentation combines an active contour without edges, a curvature regulariser, and a mutual information distance measure. Our proposed regmentation functional is given by:
The joint functional is solved with respect to the displacement field using a discretise-then-optimise approach based on the quasi-Newton method in a multilevel framework for faster implementation. The grid points are located at the center of the cell
Minimizing equation (19) yields a system of nonlinear equations with unknown
The gradient of the regularization term is computed as the multiplication of
Step 1. Initialize the level set function
Step 2. Find the deformation field using a multilevel strategy:
* Compute the deformation field on each level using the quasi-Newton method as follows:
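The multilevel quasi-Newton strategy of Step 2 can be sketched on a deliberately simplified problem. The code below estimates a single 1-D translation between two Gaussian profiles by minimising SSD on cell-centred grids of increasing resolution, passing the result of each level to the next as its starting guess. Everything here is an assumption made for illustration: the Gaussian test signals, the SSD objective (in place of the joint functional), the scalar unknown (in place of a displacement field), and the safeguarded secant update that stands in for a full quasi-Newton solver.

```python
import math


def profile(x, centre, width=6.0):
    """Smooth 1-D test signal (Gaussian bump)."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def make_objective(n_cells, length=64.0, c_template=35.0, c_reference=30.0):
    """SSD between T(x + u) and R(x), sampled at the centres of n_cells cells."""
    h = length / n_cells
    centres = [(i + 0.5) * h for i in range(n_cells)]

    def objective(u):
        return h * sum(
            (profile(x + u, c_template) - profile(x, c_reference)) ** 2
            for x in centres
        )

    return objective


def quasi_newton_1d(objective, u, max_iter=30, fd_step=1e-4):
    """Secant (1-D quasi-Newton) descent with Armijo backtracking."""

    def gradient(v):
        # Central finite difference; a real solver would use analytic gradients.
        return (objective(v + fd_step) - objective(v - fd_step)) / (2.0 * fd_step)

    g = gradient(u)
    curvature = 1.0  # initial approximation of the second derivative
    for _ in range(max_iter):
        if abs(g) < 1e-10:
            break
        direction = -g / curvature
        t, f0 = 1.0, objective(u)
        # Halve the step until the Armijo sufficient-decrease condition holds.
        while objective(u + t * direction) > f0 + 1e-4 * t * g * direction and t > 1e-8:
            t *= 0.5
        u_new = u + t * direction
        g_new = gradient(u_new)
        s, y = u_new - u, g_new - g
        if s * y > 1e-12:  # secant update, kept positive
            curvature = y / s
        u, g = u_new, g_new
    return u


# Coarse-to-fine loop: each level starts from the previous level's solution.
u = 0.0
for n_cells in (16, 32, 64):
    u = quasi_newton_1d(make_objective(n_cells), u)
```

With these settings the template bump is centred five units to the right of the reference, so the recovered shift `u` should be close to 5. Most of the iterations are spent on the coarsest grid, where each objective evaluation is cheapest, which is the usual motivation for a multilevel scheme.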
Numerical results and conclusions
In this section, we present several examples on synthetic as well as real data, in comparison with the MI-based linear curvature model applied to a synthetic image and a set of real images. In addition, the results of our proposed joint regmentation are compared with those of image registration using mutual information as a distance measure performed simultaneously with image segmentation. The image pairs used in all our experiments have significantly different intensity profiles, deformation, and noise. In each iteration we compute the Jacobian matrix of the transformation,

Comparison between monomodal image regmentation 7 and the proposed model. The monomodal joint regmentation fails to register with parameters

Comparison between the linear curvature models for synthetic images using MI and NGF28 (refer to (a),(b) and (c),(d) for the MI and NGF results of the deformation field and registered template image, respectively). The proposed model is compared with both of these models and successfully deforms the given template image. In contrast, the curvature model using MI gets stuck in local minima, as demonstrated by the distance measurements before registration
Figures 4 and 5 show the results on real multimodal image data. In the first row of both figures, we compare the proposed model and the linear curvature model28 on two chest images (T1 and T2 weighted images). The second and third rows show the thorax images (PET and CT images28) and the brain images, which exhibit high deformation and noise, with a T2 weighted CT image as the template and an MRI image as the reference. The proposed model clearly delivers good results (fifth, sixth, and seventh columns of Figure 4), whereas the linear curvature model28 shown in the third and fourth columns gets stuck in local minima owing to the high non-convexity of the MI functional. Our multimodal joint regmentation also provides accurate segmentation and registration in comparison with the linear curvature model based on NGF.28 These results are shown in Figure 5. The gradient field distance measure involves second order image derivatives; as a consequence, it can have problems dealing with noisy images. Table 1 shows the parameters and the distance measurements before and after applying the linear curvature model28 and our model. We note that the parameters α, λ1, λ2, μ, and γ are tuned according to the paired modalities. These parameters are similar for the chest images, where T1 and T2 weighted CT images are involved, and for the brain images, with a T2 weighted CT image as the template and an MRI image as the reference, whereas they change for the thorax images, where a PET template image is paired with a CT reference image. This drawback is expected, as in other inverse problems. In conclusion, the proposed model is suitable for jointly segmenting and registering images with intensity differences, severe deformation, and noise. 
In this way, the model avoids the need for pre- or post-processing, such as a pre-registration step that might be required for the segmentation task or vice versa.

Comparison between the linear curvature model using MI28 and the proposed model for real image data. T &

Comparison between the linear curvature model using NGF28 and the proposed model for real image data. Panel labels: T & 0 f (x) (a), R (b), F = –0.3094 (c), T(x + u(x)) (d), F = 0.56233 (e), R & 0 f (x+u(x)) (f), T(x + u(x)) (g), T & 0 f (x) (h), R (i)
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is partially supported by MOHE of Malaysia under grant RACER/1/2019/ICT01/UPNM/1.
