
Abstract

Modeling and simulation is a proven cost-efficient means for studying the behavioral dynamics of modern systems of systems. Our research is focused on evaluating the ability of neural networks to approximate multivariate, nonlinear, complex-valued functions. In order to evaluate the accuracy and performance of neural network approximations as a function of nonlinearity (NL), it is required to quantify the amount of NL present in the complex-valued function. In this paper, we introduce a metric for quantifying NL in multi-dimensional complex-valued functions. The metric is an extension of a real-valued NL metric into the k-dimensional complex domain. The metric is flexible as it uses discrete input–output data pairs instead of requiring closed-form continuous representations for calculating the NL of a function. The metric is calculated by generating a best-fit, least-squares solution (LSS) linear k-dimensional hyperplane for the function; calculating the L2 norm of the difference between the hyperplane and the function being evaluated; and scaling the result to yield a value between zero and one. The metric is easy to understand, generalizable to multiple dimensions, and has the added benefit that it does not require a closed-form continuous representation of the function being evaluated.

Keywords

Nonlinearity metric, complex-valued functions, modeling, simulation

1. Introduction

Modern systems are typically composed of collections of pre-existing subsystems (or components) that are independently developed, optimized, and fabricated for use in multiple applications. The benefits of this “systems of systems” approach are well understood in system design and manufacturing and result in significant cost reductions, improved reliability, and subsystem reuse across multiple systems. It is also true that these benefits extend into the modeling, simulation, and analysis of new systems as the constituent subsystem models have already been developed, refined, and abstracted to provide accurate and efficient system simulations.

Many scientific and engineering models contain functional blocks that process complex-valued signals and functions. For example, antenna design and analysis considers the antenna shape and sub-element arrangement to generate a design based on the associated frequency-domain characteristics, which include the complex amplitude of the desired operating range, nonlinearity (NL) in responses, and other design considerations.¹ Acoustic signal processing is another field where complex-valued signals are important.² Ultrasonic imaging can be used to estimate the sea depth by analyzing the echo generated by the rocks or sand on the seabed. Seabed objects can introduce distortions and nonlinearities in the returned signal, which decrease the quality of a seabed map created using this technology. Hirose et al. proposed a complex-valued Markov random field model to visualize the shape of distant boundaries and mitigate these distortions. Adaptive processing of interferometric synthetic aperture radar (InSAR) imaging involves complex-valued electromagnetic-wave signal processing.³ The InSAR amplitude data correspond to reflectance and the phase to variation in object height. Since the amplitude and phase are inseparable properties of the electromagnetic wave, they are treated as a single complex-valued data entity. In each of these applications, modeling the system requires the use of complex-valued signals and functional blocks.

Modern composite military systems of interest contain functional blocks that are nonlinear. The NL may result from internal states that change over time and apply different logic and mathematical models to signal processing (e.g., detecting signals in noise). For example, there may be some logic using a noise-riding threshold, T, with signals detected as values that are consistently above the noise over some continuous epoch of length t_min > 1, so that noise spikes (t = 1) that are above T are not detected as signals. In this case, the logic imparts discontinuity, and hence NL, between the input and output pairs. As the internal state (e.g., of the noise-riding threshold) evolves over time, so does the NL. Systems may also contain signal pre-processing blocks that produce time-invariant transformations of sensor data. The development and use of surrogate approximate models to mimic these transformations enable the study of behaviors over a wider range of input–output values than would be practical using the system itself, which may not exist yet, whose models may be restricted or unavailable for use, or whose models may be too computationally expensive for use in system simulation.⁴ In these important cases, when a sufficient set of representative input–output pairs can be obtained, it may be possible to use one or more surrogate models to approximate a functional block with sufficient accuracy as defined by the analyst.

Recent progress in the application of artificial intelligence technologies has motivated us to examine the use of machine learning, statistical learning, and neural networks in the surrogate modeling and simulation of systems which process complex-valued signals. Specifically, our current research focuses on investigating the viability of approximating nonlinear complex-valued functional blocks using neural networks. While neural networks are often used to classify and cluster real-world data, they can also be used for function approximation.⁵,⁶ Our hypothesis is that the use of neural networks for complex-valued function approximation may reduce computational resource requirements when simulating large systems of systems which process complex-valued signals. Our initial research revealed that the amount of NL contained within a complex-valued function impacts both the accuracy and the amount of computational resources required to approximate the function using a neural network. As a consequence, measuring the degree of NL present in a functional block is essential when approximating the block using neural networks. For this reason, it is necessary to develop a metric to quantify the amount of NL present in time-invariant, complex-valued functional blocks. An NL metric is an essential tool in military modeling and simulation as it enables the analyst to evaluate tradeoffs when using neural networks to approximate complex-valued functions.

In this paper, we introduce a metric for the quantification of NL in multi-dimensional complex-valued functions. The metric is calculated by generating a best-fit, least-squares solution (LSS) linear k-dimensional hyperplane for the function; calculating the L2 norm of the difference between this hyperplane and the function being evaluated; and scaling the result to yield a value between zero and one. The remainder of this paper is organized as follows. Section 2 presents the basic mathematical concepts of quantifying NL in a function of one variable which forms the basis of the proposed metric; section 3 introduces our least squares-based NL metric for k-dimensional complex-valued functions; section 4 demonstrates application of the metric to quantify NL in two complex-valued functional blocks, each of which contains four complex-valued functions; and section 5 provides conclusions and future research directions.

2. Quantification of NL of a function of one variable

A linear equation is one in which the outputs have a constant, multiplicative proportional relationship to the inputs. Conversely, a nonlinear equation is one in which the outputs cannot be simply related to the inputs by such a constant of proportionality. In engineering and the sciences, nonlinear systems are of great interest, as most real physical systems exhibit nonlinear behaviors.

The concept of a linear function of one variable is simple to visualize and is relatively easy to identify by inspection of a plot of a function’s input/output relationship. If we think of quantifiable NL as being the measure of divergence of a nonlinear function from a straight line over some domain, we can develop an associated metric which quantifies this divergence. Existing work has proposed a quantifiable measure of NL in the real domain.⁷⁻⁹ Emancipator and Kroll⁹ discuss the need for a quantitative measure of NL in the real domain. They define a “(dimensional) nonlinearity” method as the square root of the mean of the square of the deviation of the response curve from a straight line, where the straight line is chosen to minimize the NL. Furthermore, they propose that the NL metric should be some measure of the average deviation of a response function from a “best-fit” straight line over the interval of interest. This paper extends previous works to provide an alternative nonlinear metric approach in the more general case that includes complex-valued functions. The method of defining a function’s NL in terms of deviation from a linear function is useful even in multi-dimensional real and complex spaces where visualization becomes impractical.

2.1. Quantification of NL of a real function of one variable

For example, consider the periodic function $f (x) = \sin (x)$ over an interval of interest $[X_{L} = 1 / 2 rad, X_{U} = 3 / 2 rad]$ . The calculated best-fit linear approximation to the function over this interval is $g (x) = 0.52665 x + 0.27951$ as shown in Figure 1.

Figure 1.

Plot of a nonlinear function $f (x) = \sin (x)$ and the “best-fit” linear approximation $g (x) = 0.52665 x + 0.27951$ over the interval $[X_{L} = 1 / 2 radians, X_{U} = 3 / 2 radians]$ .

Determining the best-fit line $g (x)$ requires curve-fitting over the interval of interest to find the associated equation for $g (x)$ . It is important to note that choosing different upper and lower limits for the interval can dramatically change the resulting best-fit linear approximation. To quantify the NL of $f (x)$ in this case, we could integrate the difference $f (x) - g (x)$ over the interval of interest as a metric of NL. However, this approach falls short when the interval of interest covers regions where $f (x) - g (x)$ is negative. In this case, the metric would yield a lower than expected value since negative differences cancel positive differences. A solution to this problem is to create an NL metric by integrating the squared differences between the function and the generated best-fit line and then taking the square root of the average (i.e., the root-mean-square or RMS), as proposed in Emancipator and Kroll⁹ and shown in Equation (1). Note that this approach finds the RMS of the pointwise difference between $f (x)$ and $g (x)$ over the interval:

NL = \sqrt{\frac{1}{(X_{U} - X_{L})} \int_{X_{L}}^{X_{U}} {(f (x) - g (x))}^{2} dx}

(1)

This approach is tractable when $f$ and $g$ are known functions with closed-form antiderivatives. However, when $f$ and $g$ are complex-valued multivariate functions with unknown functional forms, obtaining the integrals necessary to implement this approach can be quite challenging. For this reason, we choose to calculate our NL metric using a finite number of evenly spaced discrete points. This approach is significantly easier to implement, especially when only discrete input/output data sets are available, such as those collected in a laboratory experiment. We also want the NL metric to be invariant to linear scaling. For example, if we measure the NL of two functions such as $f (x) = \sin (x)$ and $g (x) = 2 * \sin (x)$ over some interval of interest, we would like the resulting NL metrics to be equal, since the functions differ only by a linear factor of two. We can accomplish this goal using a modification to the scaling concept proposed in Emancipator and Kroll.⁹ Specifically, we divide all errors between the predicted and true values by the difference between the upper and lower bounds of the true values. This ensures that a nonlinear function scaled by a constant value results in the same calculated NL metric. Finally, since the NL metric grows monotonically with the associated function’s deviation from its least-squares approximation, we choose to bound the calculated NL metric to a value between 0.0 (low NL) and 1.0 (high NL) by normalizing the result. In addition to extending existing NL metrics to the complex domain, these three features of our NL metric (discrete evaluation, invariance to linear scaling, and a bounded range) make it suitable for quantifying NL in multi-dimensional complex-valued functions.
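The discrete calculation and the range scaling described above can be illustrated with a minimal sketch. It assumes 101 evenly spaced samples on the interval from 0.5 to 1.5 rad (the sample count is not stated in the text), and the helper names `fit_line` and `rms_deviation` are hypothetical. With this assumed sample count, the fitted slope and intercept come out close to the coefficients quoted for $g (x)$ in Section 2.1.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rms_deviation(xs, ys, scale_by_range=False):
    """Discrete analogue of Equation (1): RMS deviation from the best-fit line."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    if scale_by_range:
        # Divide by the range of the true values so that a constant
        # multiple of the function yields the same result.
        span = max(ys) - min(ys)
        residuals = [r / span for r in residuals]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

N_POINTS = 101  # assumed sample count, not taken from the source
xs = [0.5 + i / (N_POINTS - 1) for i in range(N_POINTS)]
ys = [math.sin(x) for x in xs]

slope, intercept = fit_line(xs, ys)
nl_raw = rms_deviation(xs, ys)
nl_scaled = rms_deviation(xs, ys, scale_by_range=True)
nl_scaled_doubled = rms_deviation(xs, [2.0 * y for y in ys], scale_by_range=True)
print(slope, intercept, nl_raw, nl_scaled, nl_scaled_doubled)
```

The last two printed values agree, which is the scale invariance the paragraph asks for: doubling the function doubles both the residuals and the range of the true values.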

2.2. Quantification of NL of a complex function of one variable

A complex number is an element of a number system that contains both real and imaginary components. Every complex number can be expressed in the rectangular coordinate form $a + bi$ , where $a$ and $b$ are real numbers and $i$ is the imaginary unit, which satisfies the equation $i^{2} = - 1$ . A complex number may also be represented in the polar form $z = r (\cos θ + i \sin θ)$ , where $r = | z | = \sqrt{a^{2} + b^{2}}$ , $a = r \cos θ$ , $b = r \sin θ$ and $θ = \tan^{- 1} (b / a)$ . For this reason, evaluating the NL of a complex function of one variable using the technique described above first requires the creation of a plane which best-fits the nonlinear function in the complex domain.
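As a small illustration of the two representations, the following sketch converts between rectangular and polar form using Python's standard `cmath` module. The sample values are arbitrary. It also shows that $\tan^{- 1} (b / a)$ gives the phase directly only when $a > 0$; a two-argument arctangent covers the other quadrants.

```python
import cmath
import math

z = complex(3.0, 4.0)          # a + bi with a = 3, b = 4
r, theta = cmath.polar(z)      # r = |z|, theta = phase in radians
z_back = cmath.rect(r, theta)  # r * (cos(theta) + i * sin(theta))

# For a point in the second quadrant, atan(b / a) lands in the wrong
# quadrant, while atan2(b, a) matches the phase of the complex number.
w = complex(-3.0, 4.0)
naive_theta = math.atan(w.imag / w.real)
full_theta = math.atan2(w.imag, w.real)
print(r, theta, z_back, naive_theta, full_theta)
```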

Consider the MATLAB code for an arbitrary complex-valued function $f (z)$ shown in Equation (2) as follows:

\begin{matrix} f (z) = (4.0 * z + 1.5 * 0.3678 * \exp (- conj (z)) \\ + 0.6422 * \tan (z) + 0.3679 * \exp (- z) \\ + 0.6422 * \tan (conj (z)) + 0.3679 * \exp (- z) \\ + 0.6422 * \tan (z) + 0.3679 * \exp (- z) \\ + 0.6422 * \tan (z) + 0.7071 * abs (z) \\ + 0.6919 * \sin (z) + 0.6919 * \sin (conj (z)) \\ + 0.6145 * z^{z} + 0.07145 * conj {(z)}^{conj (z)} \\ + 0.7147 * z^{z}) \end{matrix}

(2)

We can express the domain of a complex number $z$ , represented in rectangular form, by specifying both its real and imaginary intervals. For example, the domain of the function in Equation (2) can be specified by the real interval $[- 1, 1]$ and the imaginary interval $[- j, j]$ . If we divide both the real and imaginary intervals into 100 equally spaced samples, we can generate a discrete input set of 10,000 points that spans the domain of the function. Application of this input data set to Equation (2) yields a corresponding response surface. A plot of the real part of the response surface is shown in Figure 2 and the imaginary part of the response surface is shown in Figure 3.
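The sampling step can be sketched as follows. This is an illustrative Python transcription of Equation (2), not the authors' MATLAB code; the helper names `conj`, `linspace`, and `f` are hypothetical, and the grid is assumed to include both endpoints of each interval.

```python
import cmath

def conj(z):
    return z.conjugate()

def f(z):
    """Equation (2), transcribed term by term from the MATLAB-style expression."""
    return (4.0 * z + 1.5 * 0.3678 * cmath.exp(-conj(z))
            + 0.6422 * cmath.tan(z) + 0.3679 * cmath.exp(-z)
            + 0.6422 * cmath.tan(conj(z)) + 0.3679 * cmath.exp(-z)
            + 0.6422 * cmath.tan(z) + 0.3679 * cmath.exp(-z)
            + 0.6422 * cmath.tan(z) + 0.7071 * abs(z)
            + 0.6919 * cmath.sin(z) + 0.6919 * cmath.sin(conj(z))
            + 0.6145 * z ** z + 0.07145 * conj(z) ** conj(z)
            + 0.7147 * z ** z)

def linspace(lo, hi, n):
    """n equally spaced samples from lo to hi, endpoints included (assumption)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

N = 100
axis = linspace(-1.0, 1.0, N)
# Every combination of a real sample and an imaginary sample: 100 * 100 points.
grid = [complex(re, im) for im in axis for re in axis]
response = [f(z) for z in grid]
print(len(grid), response[0])
```

With an even number of samples per axis, the grid does not contain $z = 0$ , where the $z^{z}$ terms would need separate handling.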

Figure 2.

Surface plot of the real part of the nonlinear complex-valued function $f (z)$ shown in Equation (2).

Figure 3.

Surface plot of the imaginary part of the nonlinear complex-valued function $f (z)$ shown in Equation (2).

Determining the associated integrals of the underlying function from measured data is impractical in this case when compared to the case of a function of one real variable as discussed in Emancipator and Kroll.⁹ One possible method for estimating the required integrals would be to approximate the true response function $f (z)$ , say by a complex-valued polynomial interpolant, and perform numerical integration to approximate the integral of the polynomial.¹⁰ We could then find a best-fit plane $g (z)$ to the polynomial for the interval and follow the process of Emancipator and Kroll.⁹ However, since our interest here is only in generating a best-fit linear plane which approximates the function to aid in calculating an NL metric, rather than in finding a nonlinear solution, we can simply use the LSS method.¹¹ Using the general equation for a hyperplane, we can solve a set of simultaneous equations which yield the parameters of the LSS best-fit hyperplane. We can then adjust the metric shown in Equation (1) to quantify NL in complex functions. This approach is valid for the real domain, but more importantly for our purposes, it is valid in the complex domain.¹¹

3. Quantification of NL in k-dimensional complex-valued functions

Since the area of interest here is modeling of functions with more than one complex argument, we define an NL metric that is calculated as the absolute value of the sum of squared error between the best-fit k-dimensional hyperplane (where k is the number of arguments) and a discrete set of input–output “ground truth” data. The use of discrete data points eliminates the need to obtain the continuous functional representation. “Best-fit” in this case is determined by finding the hyperplane that minimizes the sum of the squared differences between the true values and the associated points on the hyperplane.

For example, in the three-dimensional case (k = 3), the general equation for a plane $y = g (z)$ is shown in Equation (3) as follows:

a_{1} z_{1} + a_{2} z_{2} + a_{3} z_{3} + b = y = g (z)

(3)

where the $a_{i}$ coefficients are non-zero and $b$ is an arbitrary constant. To find the parameters of the plane (the $a_{i}$ and $b$ ), we use the least-squares approach given the input $z$ and output $y$ vectors to solve the system of equations shown in Equation (4) as follows:

[\begin{matrix} 1 & z_{12} & z_{13} & z_{14} \\ 1 & z_{22} & z_{23} & z_{24} \\ 1 & z_{32} & z_{33} & z_{34} \\ ⋮ & ⋮ & ⋮ & ⋮ \\ 1 & z_{n 2} & z_{n 3} & z_{n 4} \end{matrix}] [\begin{matrix} b \\ a_{1} \\ a_{2} \\ a_{3} \end{matrix}] = [\begin{matrix} y_{1} \\ y_{2} \\ y_{3} \\ ⋮ \\ y_{n} \end{matrix}]

(4)

where the number of rows, $n$ , is the number of observations. Equation (4) can be rewritten in the more compact vector/matrix form as shown in Equation (5) as follows:

Ax = B

(5)

When we solve for the plane parameters $(x)$ , we have an over-determined problem (i.e., more equations than parameters). In this case, we can use the pseudoinverse $(A^{+} = (A^{H} A)^{- 1} A^{H})$ , where the superscript $H$ denotes the complex-conjugate transpose and the superscript $- 1$ denotes the matrix inverse. This results in Equation (6):

[\begin{matrix} b \\ a_{1} \\ a_{2} \\ a_{3} \end{matrix}] = {(A^{H} A)}^{- 1} A^{H} B

(6)
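A minimal sketch of Equation (6) in plain Python follows. It forms the normal equations $(A^{H} A) x = A^{H} B$ and solves them by Gaussian elimination, which matches the pseudoinverse only under the assumption that the columns of $A$ are linearly independent. The function name `fit_hyperplane` and the synthetic test data are hypothetical.

```python
import random

def fit_hyperplane(rows, ys):
    """Return [b, a_1, ..., a_k] solving (A^H A) x = A^H y."""
    A = [[1 + 0j] + list(r) for r in rows]  # leading column of ones for b
    p = len(A[0])
    # Normal equations built with the complex-conjugate transpose.
    M = [[sum(a[i].conjugate() * a[j] for a in A) for j in range(p)]
         for i in range(p)]
    v = [sum(a[i].conjugate() * y for a, y in zip(A, ys)) for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        v[c], v[piv] = v[piv], v[c]
        for r in range(c + 1, p):
            factor = M[r][c] / M[c][c]
            for k in range(c, p):
                M[r][k] -= factor * M[c][k]
            v[r] -= factor * v[c]
    # Back substitution.
    x = [0j] * p
    for i in reversed(range(p)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, p))
        x[i] = (v[i] - tail) / M[i][i]
    return x

# Synthetic check: data generated from a known complex plane (k = 3).
rng = random.Random(0)
true_params = [0.5 + 0.25j, 1.0 - 2.0j, 0.3j, -0.7 + 0.0j]
inputs = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
          for _ in range(50)]
outputs = [true_params[0] + sum(a * z for a, z in zip(true_params[1:], row))
           for row in inputs]
params = fit_hyperplane(inputs, outputs)
print(params)
```

For larger or poorly conditioned problems, a QR- or SVD-based least-squares routine from a numerical library would usually be preferred over explicit normal equations.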

After solving for the parameters with Equation (6), we construct the best-fit k-dimensional linear hyperplane using the form of Equation (3). Since we want to ensure the NL metric is invariant to scaling of the function by a constant, we first divide the residuals by the range of the true output values as shown in Equation (7):

scaled_{resid} = \frac{t_{ik} - a_{ik}}{\max (t_{ik}) - \min (t_{ik})}

(7)

where $t_{ik}$ is a matrix of the “ground truth” complex values $f (z)$ , and $a_{ik}$ is a matrix of the associated LSS best-fit linear hyperplane $g (z)$ . It is then straightforward to calculate the NL metric $(NL)$ as shown in Equation (8):

NL = \left| \sum_{i = 1}^{n} \left( \sum_{k = 1}^{c} Re {(scaled_{resid})}^{2} \right) + j * \sum_{i = 1}^{n} \left( \sum_{k = 1}^{c} Im {(scaled_{resid})}^{2} \right) \right|

(8)

where $n$ is the number of observations, $c$ is the number of output vectors, $Re$ is a function that takes the real part of the complex number in parentheses, $Im$ is a function that takes the imaginary part of the complex number in parentheses, and $j = \sqrt{- 1}$ . Finally, we calculate the bounded NL metric, $N L_{bounded}$ , as shown in Equation (9) to prevent the NL metric from growing without bound:

N L_{bounded} = \frac{1}{1 + \frac{1}{NL}}

(9)
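Equations (7) through (9) can be sketched for a single output column as shown below. Two points are assumptions rather than statements from the text: the maximum and minimum of the complex true values are selected by magnitude (mirroring MATLAB's default comparison for complex arrays), and Equation (9) is written in the algebraically equivalent form $NL / (1 + NL)$ so that $NL = 0$ maps to zero instead of dividing by zero. The function names and sample values are hypothetical.

```python
def scaled_residuals(truth, approx):
    """Equation (7); complex max/min are chosen by magnitude (assumption)."""
    t_max = max(truth, key=abs)
    t_min = min(truth, key=abs)
    span = t_max - t_min
    return [(t - a) / span for t, a in zip(truth, approx)]

def nl_metric(truth, approx):
    """Equation (8) for one output column (c = 1)."""
    s = scaled_residuals(truth, approx)
    real_sum = sum(v.real ** 2 for v in s)
    imag_sum = sum(v.imag ** 2 for v in s)
    return abs(complex(real_sum, imag_sum))

def nl_bounded(nl):
    """Equation (9), rewritten as NL / (1 + NL)."""
    return nl / (1.0 + nl)

# Arbitrary illustrative values, not data from the paper.
truth = [complex(1, 1), complex(2, -1), complex(-1, 3), complex(0.5, 0.5)]
approx = [complex(1.1, 0.9), complex(1.8, -1.2), complex(-0.9, 3.1), complex(0.4, 0.7)]
nl = nl_metric(truth, approx)
bounded = nl_bounded(nl)

# Multiplying truth and approximation by the same constant leaves NL unchanged.
c = 2 - 3j
nl_rescaled = nl_metric([c * t for t in truth], [c * a for a in approx])
print(nl, bounded, nl_rescaled)
```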

In the case where the relationship between the input and output is linear, the LSS provides an approximation that deviates from the true solution only at the level of IEEE 754 floating-point rounding error, which yields NL values on the order of 10⁻³⁰.¹² This approach is valid for both the real and complex domains as shown by Elgot.¹¹ Furthermore, as the input/output relationship becomes increasingly nonlinear, the accuracy of the LSS drops off in proportion to the number of observations (n). In the complex domain, this can be seen from the NL calculation $(NL)$ in Equation (8), since the sum of the magnitudes of the squared differences between the function $f (z)$ and the best-fit linear hyperplane $g (z)$ increases with n.

As a straightforward example, we can apply this method to $f (z)$ from Equation (2) which is a function of one complex variable $(k = 1)$ . In order to visualize the results using Equation (3) which is used when we have three variables (i.e., $k = 3$ ), we fix $z_{1} = z_{2} = z_{3} = z$ and find the plane parameters $a_{1}, a_{2}, a_{3},$ and $b$ which satisfy the equation. The resulting plane parameters are shown in Equation (10) as follows:

[\begin{matrix} b \\ a_{1} \\ a_{2} \\ a_{3} \end{matrix}] = [\begin{matrix} 0.1549 \\ 0.0754 \\ 0.0754 \\ 0.0754 \end{matrix}]

(10)

Note that $a_{1}$ through $a_{3}$ are equal since we are using one complex-valued variable in this example to simplify the graphic illustration. The equation of the best-fit linear hyperplane $y = g (z)$ constructed from Equation (3) is shown in Equation (11) as follows:

0.0754 z_{1} + 0.0754 z_{2} + 0.0754 z_{3} + 0.1549 = y

(11)

We can now create plots which help visualize the foundation of the NL metric. Figure 4 shows the surface plot for the real part of $f (z)$ and the real part of the least-squares plane $g (z)$ . Figure 5 shows the surface plot for the imaginary part of $f (z)$ and the imaginary part of the least-squares plane $g (z)$ .

Figure 4.

Surface plot of the real part of the nonlinear complex-valued function $f (z)$ and real part of least-squares solution (LSS) $g (z)$ .

Figure 5.

Surface plot of the imaginary part of the nonlinear complex-valued function $f (z)$ and imaginary part of least-squares solution (LSS) $g (z)$ .

The differences in the functional response between the best-fit LSS hyperplane and the actual function response for the real part (Figure 4) and the imaginary part (Figure 5) are the basis for the NL metric shown in Equation (8). Surface plots of the calculated differences between the best-fit LSS hyperplane and the function response for the real and imaginary parts are shown in Figures 6 and 7, respectively.

Figure 6.

Surface plot of the differences between best-fit LSS hyperplane and the actual function response for the real part.

Figure 7.

Surface plot of the differences between best-fit LSS hyperplane and the actual function response for the imaginary part.

Using actual values for $f (z)$ and the values determined from the LSS best-fit hyperplane $g (z)$ , we can use Equation (8) to calculate the NL metric, $NL$ , for the function in Equation (2). In this example, the calculated $NL$ is 77.324. Using the bounded $(N L_{bounded})$ NL metric of Equation (9), we get 0.987.
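As a check on the arithmetic, substituting the reported value $NL = 77.324$ into Equation (9) gives:

N L_{bounded} = \frac{1}{1 + \frac{1}{77.324}} = \frac{77.324}{78.324} \approx 0.987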

In some cases, it may be desirable to provide an NL measure of a complete functional block containing multiple inputs and outputs. In this case, visualization of the method becomes problematic; however, the underlying principles still hold. Assuming we have $m$ input/output data set pairs with $r$ input variables each, we then have $m$ sets of simultaneous equations to solve (one for each input/output set), each with r + 1 input columns which represent the input variables $z_{1}, z_{2}, z_{3}, \dots, z_{r}$ plus the column of ones. This results in $m$ sets of r + 1 hyperplane parameters each (the $r$ coefficients plus the constant $b$ ), which are then used to find the $m \times n$ (number of input/output sets times the number of observations) approximation values. We can then apply Equation (8) to obtain the NL metric, $NL$ , value in this general case. The following section provides examples of a four-dimensional complex-valued functional block scenario.

4. Application of NL quantification to complex-valued functional blocks

In this section, we demonstrate the use of the NL metric to quantify the NL in complex-valued functional blocks.

4.1. A functional block

In general, a functional block contains multiple functions of multiple variables. Consider, e.g., a four-input, four-output functional block as shown in Figure 8. In this case, we have four independent complex-valued inputs, $I n_{1}, I n_{2}, I n_{3},$ and $I n_{4}$ which are applied to the functional block and generate four complex-valued outputs $Ou t_{1}, Ou t_{2}, Ou t_{3},$ and $Ou t_{4}$ .

Figure 8.

Block diagram of the input/output relationships for a four-input, four-output complex-valued functional block.

Each of the four outputs is an independent function of the four input variables as shown in Equations (12)–(15) as follows:

Ou t_{1} = f 1 (I n_{1}, I n_{2}, I n_{3}, I n_{4})

(12)

Ou t_{2} = f 2 (I n_{1}, I n_{2}, I n_{3}, I n_{4})

(13)

Ou t_{3} = f 3 (I n_{1}, I n_{2}, I n_{3}, I n_{4})

(14)

Ou t_{4} = f 4 (I n_{1}, I n_{2}, I n_{3}, I n_{4})

(15)

4.2. Input and output data

In the two examples that follow, four independent input vectors of 1000 complex elements were generated from a truncated random normal distribution in the real interval $[- 1, 1]$ and the imaginary interval $[- j, j]$ . Figure 9 shows a scatter plot of one of the four inputs, $I n_{1}$ , to visualize the normal distribution of the input data. We can conceptualize the input data as a $1000 \times 4$ matrix containing 1000 rows (one row for each observation) and 4 columns (one column for each input).

Figure 9.

Input data pulled from a truncated random normal distribution from the interval $[- 1, 1]$ on the real axis and $[- j, j]$ on the imaginary axis for input $I n_{1}$ .

The application of the input data to the functional block yields a $1000 \times 4$ output data matrix. In the output matrix, each row represents a unique input data point sample and each column corresponds to one of the four outputs.
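One way to generate input data of this shape is rejection sampling, sketched below. The mean of zero and the standard deviation of 0.5 are assumptions, since the text specifies only the truncation interval; `truncated_gauss` and `make_inputs` are hypothetical helper names.

```python
import random

def truncated_gauss(rng, mu=0.0, sigma=0.5, lo=-1.0, hi=1.0):
    """Draw from a normal distribution, rejecting samples outside [lo, hi]."""
    while True:
        x = rng.gauss(mu, sigma)
        if lo <= x <= hi:
            return x

def make_inputs(n_obs=1000, n_inputs=4, seed=0):
    """Build an n_obs x n_inputs matrix of complex samples (list of rows)."""
    rng = random.Random(seed)
    return [[complex(truncated_gauss(rng), truncated_gauss(rng))
             for _ in range(n_inputs)]
            for _ in range(n_obs)]

inputs = make_inputs()
print(len(inputs), len(inputs[0]), inputs[0][0])
```

The real and imaginary parts are drawn independently here, which is also an assumption.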

4.3. Example 1: a linear four-input, four-output complex-valued functional block

In our first example, we present a four-input, four-output functional block that only contains linear operations as shown in Equations (16) through (19) as follows:

Ou t_{1} = f 1 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) = (0.5 + 1.0 * I n_{1} + 1.0 * I n_{2} + 1.0 * I n_{3} + 1.0 * I n_{4})

(16)

\begin{matrix} Ou t_{2} = f 2 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) = (0.2 * I n_{1} + 0.8 * I n_{2} \\ + 0.6 * I n_{3} + 0.9 * I n_{4} + 0.5) \end{matrix}

(17)

\begin{matrix} Ou t_{3} = f 3 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) = (0.8 * I n_{4} + 0.1 * I n_{3} \\ + 0.9 * I n_{2} + 0.5 * I n_{1} + 0.1) \end{matrix}

(18)

\begin{matrix} Ou t_{4} = f 4 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) = (0.2 * I n_{2} + 0.9 * I n_{4} \\ + 0.2 * I n_{1} + 0.4 * I n_{3}) \end{matrix}

(19)

Calculating $NL$ first involves recognizing that this example has four complex-valued arguments ( $I n_{1}$ through $I n_{4}$ ). This means the $A$ matrix in this case will have five columns. The first is a column of ones as shown in Equation (4) to solve for the $b$ parameter. Columns two through five are input vectors $I n_{1}$ through $I n_{4}$ which provide information needed to solve for hyperplane parameters $a_{1}$ , $a_{2}$ , $a_{3}$ , and $a_{4}$ . The $y$ vector is each of the output vectors $Ou t_{1}$ through $Ou t_{4}$ used individually to solve for the sets of parameters of four separate hyperplanes. Equation (6) is used to generate hyperplane parameters for each output function, Out1 through Out4, as shown in Table 1.

Table 1.

Calculated hyperplane parameters for approximating $Ou t_{1}$ , $Ou t_{2}$ , $Ou t_{3}$ , and $Ou t_{4}$ of the linear complex-valued functional block of Example 1.

	$Ou t_{1}$	$Ou t_{2}$	$Ou t_{3}$	$Ou t_{4}$
$b$	0.0263	0.0263	0.0053	0.0526
$a_{1}$	0.0526	0.0105	0.0263	0.0105
$a_{2}$	0.0526	0.0421	0.0474	0.0105
$a_{3}$	0.0526	0.0316	0.0053	0.0474
$a_{4}$	0.0526	0.0474	0.0421	0.0474

These parameters allow the construction of the linear hyperplane that best fits the function for each output $Ou t_{1}$ through $Ou t_{4}$ . Now with the true values $f (z)$ and the values determined from the best-fit LSS $g (z)$ , we can use a vector version of Equation (7) to determine $NL$ for each of the outputs ( $N L_{1}$ , $N L_{2}$ , $N L_{3}$ , and $N L_{4}$ ). To calculate the NL using Equation (8) for each of the individual outputs, we revisit Equation (7) and set $a_{ik}$ equal to the vector of approximations found using each of the sets of parameters listed in Table 1, while $t_{ik}$ is the associated vector of true values which come from $Ou t_{1}$ , $Ou t_{2}$ , $Ou t_{3}$ , and $Ou t_{4}$ . The individual NL values are shown in Table 2 and are effectively zero, which is expected for a set of linear functions. Approximately the same values are obtained for the bounded $(N L_{bounded})$ NL metric of Equation (9).

Table 2.

Calculated $N L_{1}$ , $N L_{2}$ , $N L_{3}$ , and $N L_{4}$ for $Ou t_{1}$ , $Ou t_{2}$ , $Ou t_{3}$ , and $Ou t_{4}$ of the linear complex-valued functional block of Example 1.

$N L_{1}$	$N L_{2}$	$N L_{3}$	$N L_{4}$
1.4163e−29	5135e−30	6.2673e−30	5.7850e−30

For completeness, Table 3 shows the associated values calculated using $N L_{bounded}$ .

Table 3.

Calculated $N L_{1 bounded}, N L_{2 bounded}, N L_{3 bounded},$ and $N L_{4 bounded}$ for $Ou t_{1}$ , $Ou t_{2}$ , $Ou t_{3}$ , and $Ou t_{4}$ of the linear complex-valued functional block of Example 1.

$N L_{1 bounded}$	$N L_{2 bounded}$	$N L_{3 bounded}$	$N L_{4 bounded}$
1.4163e−29	5135e−30	6.2673e−30	5.7850e−30

The individual NL values give insight into which outputs contribute the most to the overall NL. In this case, $Ou t_{1}$ gives the greatest contribution.

Figure 10 shows a scatter plot of the resulting complex-valued true responses labeled on the plot as “Data” with circles and the “LSS approximation” labeled with asterisks plotted with the real and imaginary values on the x- and y-axes, respectively. The plot illustrates the high level of accuracy of the calculated linear hyperplane using the LSS approach when the underlying multi-variable complex functions are linear.

Figure 10.

Plot of output from the linear complex-valued multi-variable function with the imaginary component of the output on the y-axis and the real component on the x-axis.
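The procedure of Example 1 can be sketched end to end in Python. This is an illustration under stated assumptions, not the authors' implementation: the input distribution parameters are assumed, complex maxima and minima are chosen by magnitude, and all function names are hypothetical. The sketch checks the fitted parameters against the coefficients of Equations (16)–(19); it does not attempt to reproduce the values in Table 1.

```python
import random

# [b, a1, a2, a3, a4] read from Equations (16)-(19).
TRUE_COEFFS = [
    [0.5, 1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.8, 0.6, 0.9],
    [0.1, 0.5, 0.9, 0.1, 0.8],
    [0.0, 0.2, 0.2, 0.4, 0.9],
]

def truncated_gauss(rng, sigma=0.5):
    """Normal sample restricted to [-1, 1]; sigma is an assumed value."""
    while True:
        x = rng.gauss(0.0, sigma)
        if -1.0 <= x <= 1.0:
            return x

def fit_hyperplane(rows, ys):
    """Solve the normal equations (A^H A) x = A^H y for [b, a_1, ..., a_k]."""
    A = [[1 + 0j] + list(r) for r in rows]
    p = len(A[0])
    M = [[sum(a[i].conjugate() * a[j] for a in A) for j in range(p)]
         for i in range(p)]
    v = [sum(a[i].conjugate() * y for a, y in zip(A, ys)) for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        v[c], v[piv] = v[piv], v[c]
        for r in range(c + 1, p):
            factor = M[r][c] / M[c][c]
            for k in range(c, p):
                M[r][k] -= factor * M[c][k]
            v[r] -= factor * v[c]
    x = [0j] * p
    for i in reversed(range(p)):
        x[i] = (v[i] - sum(M[i][k] * x[k] for k in range(i + 1, p))) / M[i][i]
    return x

def evaluate(params, row):
    """Hyperplane value b + sum(a_i * z_i) for one observation."""
    return params[0] + sum(a * z for a, z in zip(params[1:], row))

def nl_metric(truth, approx):
    """Equations (7) and (8) for one output; max/min by magnitude (assumption)."""
    span = max(truth, key=abs) - min(truth, key=abs)
    s = [(t - a) / span for t, a in zip(truth, approx)]
    return abs(complex(sum(v.real ** 2 for v in s), sum(v.imag ** 2 for v in s)))

rng = random.Random(0)
inputs = [[complex(truncated_gauss(rng), truncated_gauss(rng)) for _ in range(4)]
          for _ in range(1000)]

fitted, nl_values = [], []
for coeffs in TRUE_COEFFS:
    truth = [evaluate(coeffs, row) for row in inputs]
    params = fit_hyperplane(inputs, truth)
    approx = [evaluate(params, row) for row in inputs]
    fitted.append(params)
    nl_values.append(nl_metric(truth, approx))
print(nl_values)
```

Because each output is exactly linear in the inputs, the resulting NL values reflect only floating-point rounding.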

4.4. Example 2: a nonlinear four-input, four-output complex-valued functional block

In our second example, we present a four-input, four-output functional block that contains nonlinear operations. Nonlinear, time-invariant signal pre-processing blocks often multiply complex input signals by trigonometric functions (e.g., sin for frequency translation), exponential functions (e.g., Fourier transforms for generating image features), absolute value functions (e.g., as precursors to the application of logic in subsequent blocks), and other related nonlinear transformations. Without loss of generality, Equations (20)–(23) generate an illustrative four-input, four-output block represented in the format used in the MATLAB software package to generate the functions and data developed in this example:

\begin{array}{l} f 1 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) \\ = (0.3678 * \exp (- conj (I n_{1})) \\ + 0.6422 * \tan (I n_{1}) \\ + 0.3679 * \exp (- I n_{2}) \\ + 0.6422 * \tan (conj (I n_{2})) \\ + 0.3679 * \exp (- I n_{3}) \\ + 0.6422 * \tan (I n_{3}) \\ + 0.3679 * \exp (- I n_{4}) \\ + 0.6422 * \tan (I n_{4}) \\ + 0.7071 * abs (I n_{1}) \\ + 0.6919 * \sin (I n_{2}) \\ + 0.6919 * \sin (conj (I n_{3})) \\ + 0.7145 * I n_{4}^{I n_{3}} \\ + 0.07145 * conj {(I n_{1})}^{conj (I n_{2})} \\ + 0.7147 * I n_{3}^{I n_{4}}) \end{array}

(20)

\begin{matrix} f 2 (I n_{1}, I n_{2}, I n_{3}, I n_{4}) \\ = (0.6422 * \tan (conj (I n_{1})) \\ + 0.07145 * I {n_{1}}^{I n_{2}} \\ + 0.3679 * \exp (- I n_{2}) \\ + 0.6319 * \sin (conj (I n_{3})) \\ + 0.07141 * conj {(I n_{4})}^{conj (I n_{3})} \\ + 0.3679 * \exp (- conj (I n_{1})) \\ + 0.6422 * \tan (I n_{3}) \\ + 0.3679 * \exp (- I n_{1}) \\ + 0.6919 * \sin (I n_{2}) \\ + 0.7071 * abs (I n_{4}) \\ + 0.6422 * \tan (I n_{3}) \\ + 0.3678 * \exp (- I n_{1}) \\ + 0.6145 * I {n_{4}}^{I n_{2}} \\ + 0.5422 * \tan (I n_{1})) \end{matrix}

(21)

\begin{array}{l} f_{3}(In_{1}, In_{2}, In_{3}, In_{4}) \\ = (0.5311 * \tan(In_{1}) \\ + 0.08145 * In_{1}^{In_{2}} \\ + 0.2679 * \exp(-In_{1}) \\ + 0.5422 * \tan(In_{3}) \\ + 0.6071 * \operatorname{abs}(In_{1}) \\ + 0.5819 * \sin(In_{2}) \\ + 0.2679 * \exp(-In_{4}) \\ + 0.3579 * \tan(In_{3}) \\ + 0.3479 * \exp(-\operatorname{conj}(In_{4})) \\ + 0.5145 * \operatorname{conj}(In_{4})^{\operatorname{conj}(In_{2})} \\ + 0.4219 * \sin(\operatorname{conj}(In_{2})) \\ + 0.3679 * \exp(-In_{3}) \\ + 0.7145 * In_{4}^{In_{3}} \\ + 0.4422 * \tan(\operatorname{conj}(In_{4}))) \end{array}

(22)

\begin{array}{l} f_{4}(In_{1}, In_{2}, In_{3}, In_{4}) \\ = (0.3479 * \exp(-In_{4}) \\ + 0.4919 * \sin(In_{3}) \\ + 0.3071 * \operatorname{abs}(In_{2}) \\ + 0.2071 * \tan(In_{1}) \\ + 0.2679 * \exp(-In_{4}) \\ + 0.6145 * In_{4}^{In_{3}} \\ + 0.3422 * \tan(In_{2}) \\ + 0.5422 * \tan(\operatorname{conj}(In_{1})) \\ + 0.6145 * In_{1}^{In_{4}} \\ + 0.2679 * \exp(-In_{3}) \\ + 0.5919 * \sin(\operatorname{conj}(In_{2})) \\ + 0.6145 * \operatorname{conj}(In_{4})^{\operatorname{conj}(In_{1})} \\ + 0.2679 * \exp(-\operatorname{conj}(In_{4})) \\ + 0.3279 * \tan(In_{3})) \end{array}

(23)
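To make the structure of these functions concrete, the following is a minimal Python sketch of Equation (20), transcribed term by term. It is not the MATLAB code used to generate the data for this example; the function name `f1`, the sample inputs, and the use of the principal branch for complex powers (Python's default) are assumptions made for illustration.

```python
import cmath


def f1(in1, in2, in3, in4):
    """Evaluate Equation (20) term by term for four complex inputs."""
    c1, c2, c3 = in1.conjugate(), in2.conjugate(), in3.conjugate()
    return (0.3678 * cmath.exp(-c1)
            + 0.6422 * cmath.tan(in1)
            + 0.3679 * cmath.exp(-in2)
            + 0.6422 * cmath.tan(c2)
            + 0.3679 * cmath.exp(-in3)
            + 0.6422 * cmath.tan(in3)
            + 0.3679 * cmath.exp(-in4)
            + 0.6422 * cmath.tan(in4)
            + 0.7071 * abs(in1)
            + 0.6919 * cmath.sin(in2)
            + 0.6919 * cmath.sin(c3)
            + 0.7145 * in4 ** in3      # complex power, principal branch
            + 0.07145 * c1 ** c2
            + 0.7147 * in3 ** in4)


# Hypothetical sample inputs, chosen only to exercise the function.
sample = f1(0.5 + 0.2j, 0.4 - 0.1j, 0.3 + 0.3j, 0.2 - 0.4j)

# For positive real inputs every term is real, so the output is real too.
real_only = f1(0.5 + 0j, 0.4 + 0j, 0.3 + 0j, 0.2 + 0j)
```

The remaining outputs, Equations (21)–(23), can be transcribed in the same way.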

Figure 11 shows a scatter plot of the complex-valued responses, with the real and imaginary parts of the output on the x- and y-axes, respectively. Visual inspection shows that the LSS approximation is far less accurate for this nonlinear complex-valued function, just as in Figure 1, where the linear approximation of a nonlinear function was a poor predictor of the function.

Figure 11.

Plot of output from the nonlinear complex-valued multi-variable function with the imaginary component of the output on the y-axis and the real component on the x-axis. Notice that the plot shows that the LSS approximation is not a good predictor of the actual function.

The parameters shown in Table 4 allow the approximations of $f_{1}$ , $f_{2}$ , $f_{3}$ , and $f_{4}$ to be calculated using Equation (3). As before, calculating the $NL$ of each individual output with Equation (8) first requires returning to Equation (7): $a_{ik}$ is set to the vector of approximations obtained from the corresponding set of parameters in Table 4, and $t_{ik}$ is set to the vector of associated true values produced by the corresponding function $f_{1}$ through $f_{4}$ . The individual NL values are shown in Table 5.

Table 4.

Calculated hyperplane parameters for approximating $Out_{1}$ , $Out_{2}$ , $Out_{3}$ , and $Out_{4}$ of the nonlinear complex-valued functional block of Example 2.

	$f_{1}$	$f_{2}$	$f_{3}$	$f_{4}$
$b$	0.1690 +0.0008 i	0.1320 +0.0094 i	0.1460 −0.0002 i	0.1610 −0.0019 i
$a_{1}$	0.0360 +0.0029 i	−0.0090 −0.0012 i	0.0140 +0.0011 i	0.0130 +0.0033 i
$a_{2}$	0.0140 −0.0026 i	−0.0100 −0.0074 i	0.0250 −0.0030 i	0.0100 −0.0020 i
$a_{3}$	−0.0610 −0.0201 i	0.0710 +0.0039 i	−0.0060 −0.0092 i	−0.0010 −0.0107 i
$a_{4}$	0.0130 +0.0047 i	−0.0030 −0.0004 i	−0.0120 +0.0047 i	0.0580 +0.0040 i
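As an illustration of how the parameters in Table 4 are used, the Python sketch below evaluates the hyperplane for $f_{1}$ and measures its L2 distance from a set of true values. It assumes that Equation (3) has the affine form $b + \sum_{j} a_{j} In_{j}$; the helper names `hyperplane` and `l2_distance` are hypothetical, and the distance shown is the raw, unscaled norm rather than the NL of Equation (8).

```python
import math

# Column f_1 of Table 4: intercept b and slopes a_1 .. a_4.
B_F1 = 0.1690 + 0.0008j
A_F1 = [0.0360 + 0.0029j, 0.0140 - 0.0026j, -0.0610 - 0.0201j, 0.0130 + 0.0047j]


def hyperplane(b, slopes, inputs):
    """Affine approximation b + sum_j a_j * In_j (assumed form of Equation (3))."""
    return b + sum(a * x for a, x in zip(slopes, inputs))


def l2_distance(approx, true):
    """Raw L2 norm of the difference between approximated and true outputs."""
    return math.sqrt(sum(abs(a - t) ** 2 for a, t in zip(approx, true)))


# With all inputs at zero, the approximation reduces to the intercept b.
at_origin = hyperplane(B_F1, A_F1, [0j, 0j, 0j, 0j])

# A unit input on In_1 alone shifts the output by the slope a_1.
unit_in1 = hyperplane(B_F1, A_F1, [1 + 0j, 0j, 0j, 0j])

# Toy vectors: only the first entry differs, by 1j, so the distance is 1.0.
toy_distance = l2_distance([1 + 1j, 2 + 0j], [1 + 0j, 2 + 0j])
```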

Table 5.

Calculated $NL_{1}$ , $NL_{2}$ , $NL_{3}$ , and $NL_{4}$ for $f_{1}$ , $f_{2}$ , $f_{3}$ , and $f_{4}$ of the nonlinear complex-valued functional block of Example 2.

$N L_{1}$	$N L_{2}$	$N L_{3}$	$N L_{4}$
5.6060	2.0199	2.5424	4.3449

Table 6 shows the corresponding values of the bounded NL metric, $NL_{\mathrm{bounded}}$ , calculated using Equation (9).

Table 6.

Calculated $NL_{1,\mathrm{bounded}}$ , $NL_{2,\mathrm{bounded}}$ , $NL_{3,\mathrm{bounded}}$ , and $NL_{4,\mathrm{bounded}}$ for $f_{1}$ , $f_{2}$ , $f_{3}$ , and $f_{4}$ of the nonlinear complex-valued functional block of Example 2.

$NL_{1,\mathrm{bounded}}$	$NL_{2,\mathrm{bounded}}$	$NL_{3,\mathrm{bounded}}$	$NL_{4,\mathrm{bounded}}$
0.8486	0.6689	0.7177	0.8129
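As a quick consistency check, the values in Table 6 agree to four decimal places with the mapping $NL / (1 + NL)$ applied to the values in Table 5. The Python sketch below assumes that this mapping is the scaling of Equation (9); the function name `bound` is hypothetical.

```python
# Values copied from Table 5 (NL) and Table 6 (bounded NL), in order f_1 .. f_4.
NL = [5.6060, 2.0199, 2.5424, 4.3449]
NL_BOUNDED = [0.8486, 0.6689, 0.7177, 0.8129]


def bound(nl):
    """Map an NL value in [0, inf) onto [0, 1); assumed form of Equation (9)."""
    return nl / (1.0 + nl)


recomputed = [bound(v) for v in NL]

# Largest absolute disagreement between the recomputed and tabulated values.
max_error = max(abs(r - t) for r, t in zip(recomputed, NL_BOUNDED))

# Zero-based indices of the least and most nonlinear outputs.
least_nonlinear = min(range(4), key=lambda i: NL_BOUNDED[i])
most_nonlinear = max(range(4), key=lambda i: NL_BOUNDED[i])
```

Because the mapping is monotonic, the ranking of the four outputs is the same in Tables 5 and 6.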

The individual NL values give insight into which functions contribute the most to the overall NL. In this case, $f_{1}$ gives the greatest contribution of 0.8486, while the “least nonlinear” path through the block has a contribution of 0.6689 at $f_{2}$ . Intuitively, a simple neural network is unlikely to represent this block very well, since all of the signal transformations are highly nonlinear.

In addition, because the least-squares technique is highly sensitive to outliers (data points that do not follow the pattern of the other observations), such points disproportionately influence the NL metric, which then indicates a more nonlinear function than would be inferred without them. This is not typically a problem if the input data are drawn from a normal distribution over the interval discussed previously and the underlying true complex-valued nonlinear behavior is not ill-conditioned.¹³ Although real-world systems cannot constrain their inputs to a well-behaved normal distribution, NL metrics generated from such an input distribution still provide significant insight into the degree of NL present in the block being modeled.
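The effect of a single outlier can be seen in a small Python sketch. It fits a one-variable complex least-squares line to exactly linear data, then corrupts one sample and refits. The data, the helper names `fit_line` and `residual_norm`, and the use of the raw residual norm in place of the scaled NL metric are all assumptions made for illustration.

```python
import math


def fit_line(xs, ys):
    """Complex least-squares fit of y ~ a * x + b for one input variable."""
    n = len(xs)
    xm = sum(xs) / n
    ym = sum(ys) / n
    num = sum((x - xm).conjugate() * (y - ym) for x, y in zip(xs, ys))
    den = sum(abs(x - xm) ** 2 for x in xs)
    a = num / den
    return a, ym - a * xm


def residual_norm(xs, ys):
    """L2 norm of the difference between the data and its best-fit line."""
    a, b = fit_line(xs, ys)
    return math.sqrt(sum(abs(y - (a * x + b)) ** 2 for x, y in zip(xs, ys)))


# Ten samples of an exactly linear complex function (hypothetical data).
xs = [complex(0.1 * k, -0.05 * k) for k in range(1, 11)]
clean = [(2 - 1j) * x + (0.5 + 0.25j) for x in xs]

# The same data with one sample displaced by 3 + 3j.
spoiled = list(clean)
spoiled[4] += 3 + 3j

clean_res = residual_norm(xs, clean)      # essentially zero
spoiled_res = residual_norm(xs, spoiled)  # dominated by the single outlier
```

Although nine of the ten samples still lie on a line, the residual norm of the corrupted data is far from zero, so a metric built on that norm would report the function as nonlinear.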

5. Conclusion and future work

In this paper, we introduced a metric for quantifying NL in multi-dimensional complex-valued functions. The metric is based upon existing NL metrics developed for the real domain and is extended to quantify NL in k-dimensional complex-valued functions. The metric is easy to understand, generalizes to multiple dimensions, is invariant to linear scaling, and does not require a closed-form continuous representation of the function being evaluated.

Using the knowledge and experience gained through this work, our research is focused upon evaluating the strengths and weaknesses of various nonlinear complex-valued function approximation techniques. Specifically, we are focused upon understanding the tradeoffs in accuracy, performance, and resource utilization when approximating complex-valued, nonlinear functions using complex-valued neural networks. Future work will explore these tradeoffs, as a function of NL, when generating surrogate models to approximate complex-valued functional blocks. We expect these results to inform others who are seeking to use neural network-based surrogate models. Ideally, the use of our metric to quantify NL in complex-valued functional blocks will provide researchers a means to predict the neural network architecture, model parameters, and resources required to build an accurate surrogate model of a complex-valued functional block as a function of its NL.

Footnotes

Disclaimer

The views expressed in this paper are those of the authors and do not reflect the official policy or position of the US Air Force, the Department of Defense, or the US Government.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Laboratory for Telecommunication Sciences (grant number 5703400-000-2121).

ORCID iD

Michael R Grimaila

Author biographies

Larry C Llewellyn is a PhD candidate at the Air Force Institute of Technology, Department of Systems Engineering and Management, Wright-Patterson Air Force Base, OH, USA.

Michael R Grimaila is a full professor at the Air Force Institute of Technology, Department of Systems Engineering and Management, Wright-Patterson Air Force Base, OH, USA.

Douglas D Hodson is an associate professor at the Air Force Institute of Technology, Department of Electrical and Computer Engineering, Wright-Patterson Air Force Base, OH, USA.

Scott Graham is an associate professor at the Air Force Institute of Technology, Department of Electrical and Computer Engineering, Wright-Patterson Air Force Base, OH, USA.

References

1. Nishino, Yamaki, Hirose. Ultrasonic imaging for boundary shape generation by phase unwrapping with singular-point elimination based on complex-valued Markov random field model. IEICE T Fundam Electron Commun Comput Sci 2010; 93: 219–226.

2. Hirose. Complex-valued neural networks: advances and applications. Chicago, IL: John Wiley & Sons, 2013.

3. Suksmono, Hirose. Adaptive noise reduction of InSAR images based on a complex-valued MRF model and its application to phase unwrapping problem. IEEE T Geosci Remote Sens 2002; 40: 699–709.

4. Forrester AIJ, Sobester, Keane. Engineering design via surrogate modelling: a practical guide. Chicago, IL: John Wiley & Sons, 2008.

5. Cybenko. Continuous valued neural networks with two hidden layers are sufficient. Urbana, IL: Center for Supercomputing Research and Development, University of Illinois at Urbana–Champaign, 1988.

6. Voigtlaender. The universal approximation theorem for complex-valued neural networks. ArXiv, 2020, http://arxiv.org/abs/2012.03351v1

7. Passey. Evaluation of the linearity of quantitative analytical methods: proposed guideline. Wayne, PA: National Committee for Clinical Laboratory Standards (NCCLS), 1986.

8. Kroll, Emancipator. A theoretical evaluation of linearity. Clin Chem 1993; 39: 405–413.

9. Emancipator, Kroll. A quantitative measure of nonlinearity. Clin Chem 1993; 39: 766–772.

10. Novak. Numerical methods for scientific computing. Morrisville, NC: Lulu.com, 2017.

11. Elgot. Least squares over the complex field. White Oak, MD: Naval Ordnance Laboratory, 1954.

12. Stevenson. IEEE standard for binary floating-point arithmetic. New York: IEEE, 1985.

13. Trefethen, Bau D III. Numerical linear algebra, vol. 50. Philadelphia, PA: SIAM, 1997.

A metric for quantifying nonlinearity in k -dimensional complex-valued functions
