Abstract
The software Randentropy is designed to estimate inequality in a random system where several individuals interact moving among many communities and producing dependent random quantities of an attribute. The overall inequality is assessed by computing the Random Theil’s Entropy. Firstly, the software estimates a piecewise homogeneous Markov chain by identifying the change-points and the relative transition probability matrices. Secondly, it estimates the multivariate distribution function of the attribute using a copula function approach and finally, through a Monte Carlo algorithm, evaluates the expected value of the Random Theil’s Entropy. Possible applications are discussed as related to the fields of finance and human mobility.
Introduction
The issue of measuring inequality in a system has received extensive treatment in the literature. One interesting approach is based on entropic measures. Starting from the pioneering work by Shannon (1948) on the mathematical theory of communication, the concept of entropy has developed rapidly and spread across many scientific communities. Notable examples are statistics (see, e.g., Kullback and Leibler, 1951), statistical mechanics (see, e.g., Jaynes, 1957), economics (see, e.g., Theil, 1967) and ecology (see, e.g., Phillips et al., 2006), to name just a few.
Recent efforts have mainly been dedicated to introducing new entropies, such as the cumulative residual entropy (see Rao et al., 2004) or the cumulative past entropy (see Di Crescenzo and Longobardi, 2009). In the meantime, mainly motivated by economic problems, the notion of random entropy has emerged in terms of a normalization of a random process. The random entropy shares the same functional form as the classical entropy but is related to a random process (D’Amico and Di Biase, 2010). The authors called this more general entropy the Dynamic Theil Entropy. Nevertheless, we refer to it as Random Entropy, to avoid any possible confusion with other dynamic entropies that are expressed as deterministic functions, as in Di Crescenzo and Longobardi (2002), Asadi and Zohrevand (2007) and Calì et al. (2020).
The Random Entropy makes it possible to quantify uncertainty in a random system evolving in time, and it encompasses recent approaches and measures introduced in Curiel and Bishop (2016). In this paper, we consider the general model introduced in a previous work (D’Amico et al., 2019) and present software that computes the inequality in a general system composed of a number of interacting individuals. Each individual moves among several communities over time and produces an attribute according to its own membership and that of the other individuals. The dynamics of the individuals among the communities are described by a piecewise homogeneous Markov chain, which requires the identification of an unknown number of change-points (i.e. the times at which the Markov chain changes its dynamics). Conditional on the occupancy of the communities, the individuals produce an attribute in quantities described by a multivariate probability distribution whose dependence structure is managed by a copula function. Finally, using a Monte Carlo algorithm, we show how to compute the moments of the Random Entropy.
The main innovation brought by this research is the building of the software
The subsequent sections of this paper present the general mathematical model, relevant application scenarios and the main characteristics of the software; both the CLI (Command Line Interface) and the GUI (Graphical User Interface) are described.
Theory
The main function driving the development of the software we are presenting here (i.e.
Let denote the meta-community by
The sequence of the visited communities by any individual
Firstly, we assume that the dynamics of the individuals are independent of each other. Thus, the community process for every individual will be denoted simply by
Moreover, we assume that
The symbols
Intuitively, the term piecewise refers to the existence of some points in time where the dynamics change substantially. These times are called change-points. They break up the timeline into several sub-periods within which the Markov process is homogeneous.
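The idea of a piecewise homogeneous chain can be illustrated with a minimal Python sketch. It is not the Randentropy implementation: the function name `simulate_piecewise_chain`, the two example matrices and the convention that a change-point is the last time governed by the preceding matrix are all assumptions made for illustration.

```python
import random


def simulate_piecewise_chain(matrices, change_points, horizon, start, rng):
    """Simulate one path of a piecewise homogeneous Markov chain.

    matrices      : list of K + 1 row-stochastic matrices (lists of lists)
    change_points : sorted list of K change-point times
    horizon       : number of transitions to simulate
    start         : index of the initial community
    rng           : random.Random instance (for reproducibility)
    """
    if len(matrices) != len(change_points) + 1:
        raise ValueError("need exactly one more matrix than change-points")
    path = [start]
    regime = 0
    for t in range(1, horizon + 1):
        # Assumed convention: the transition into time t uses the next
        # matrix as soon as t is strictly beyond the current change-point.
        while regime < len(change_points) and t > change_points[regime]:
            regime += 1
        row = matrices[regime][path[-1]]
        path.append(rng.choices(range(len(row)), weights=row, k=1)[0])
    return path


# Hypothetical example: two communities and one change-point at t = 5.
P_before = [[0.9, 0.1], [0.2, 0.8]]
P_after = [[0.5, 0.5], [0.5, 0.5]]
path = simulate_piecewise_chain([P_before, P_after], [5], 12, 0, random.Random(0))
print(path)
```

Within each sub-period the same matrix is reused at every step, which is exactly the homogeneity property described above; only the index of the active matrix depends on time.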

Example of three change-points.
However, for the sake of clarity of presentation, consider the example illustrated in Fig. 1 where three change-points are considered at times
Next step concerns the specification of the processes describing the personal attributes, i.e.
This strategy is pursued first by assuming that the marginal distributions of the attributes of the individuals allocated in the same community at a given time share the same probability distribution function. Formally, let
Before stating our second main assumption, we need to introduce the concept of copula, which is a key ingredient of both the model and the software.
An N-dimensional copula C is any function
Additionally, a dependence structure is introduced through the application of a copula function. This is formally done advancing the second main assumption stating that: the conditional joint distribution of
A notable example of copula function is the Normal (or Gaussian) copula. Let
In general, the corresponding density of the copula is
Here, the parameters are represented by the correlation matrix
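As an illustration of how a Gaussian copula couples given marginals, the following sketch draws dependent pairs in the bivariate case using only the Python standard library. The function names, the correlation value and the exponential marginal are hypothetical choices made for the example; they are not taken from the Randentropy code.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula_2d(rho, n, rng):
    """Draw n pairs (u1, u2) from a bivariate Gaussian copula."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        # Correlated standard normals (Cholesky factor of a 2x2 matrix).
        z1 = g1
        z2 = rho * g1 + math.sqrt(1.0 - rho * rho) * g2
        # The normal CDF maps each coordinate to a uniform on [0, 1].
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_quantile(u, mean):
    """Inverse CDF of an exponential law (an assumed, illustrative marginal)."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0)
    return -mean * math.log(1.0 - u)


rng = random.Random(42)
uniforms = sample_gaussian_copula_2d(rho=0.8, n=2000, rng=rng)
attributes = [(exponential_quantile(u1, 100.0), exponential_quantile(u2, 100.0))
              for u1, u2 in uniforms]
```

The copula step fixes only the dependence; any marginal distribution can then be imposed through its quantile function, which is what makes the approach convenient for attributes whose marginals depend on the community.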
As we are interested in measuring the inequality of the distribution of attributes in the meta-community, we need to introduce a measure of inequality. In particular, the measure of inequality we consider allows the user to deal with stochastic processes. The measure is based on the Theil entropy (see Theil, 1967), closely related to the Shannon entropy (see Shannon, 1948). Given a probability distribution
The definition of the Theil index was extended to stochastic processes by D’Amico and Di Biase (2010), and subsequently applied and further investigated in D’Amico et al. (2012) and in D’Amico et al. (2014), where an additive decomposition of the index is considered. This random extension of the Theil index is introduced next.
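For a single realization of the attributes, the index can be computed in a few lines. The sketch below assumes the standard share-based form of the Theil index, T = Σ s_i ln(N s_i), where s_i is the share of the total attribute held by individual i and N is the number of individuals; the function name `theil_index` is hypothetical.

```python
import math


def theil_index(values):
    """Share-based Theil index of non-negative attribute values.

    Returns 0 for a perfectly equal distribution and ln(N) when a single
    individual holds the whole attribute (convention: 0 * ln(0) = 0).
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0 or any(v < 0 for v in values):
        raise ValueError("need non-negative values with a positive total")
    index = 0.0
    for v in values:
        share = v / total
        if share > 0.0:
            index += share * math.log(n * share)
    return index


print(theil_index([10.0, 10.0, 10.0, 10.0]))  # perfect equality
print(theil_index([1.0, 1.0, 1.0, 97.0]))     # strong concentration
```

When the attributes are random, each simulated scenario yields one such value, so the Random Entropy is a stochastic process rather than a number.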
Let
We denote the Random Entropy in the meta-community by the stochastic process
An explicit formula for the expected value of
The whole computational procedure consists of several steps; thus, to improve the readability of the reported pseudocode (see Algorithm 1), we omit some of the preliminary tasks, such as: the identification of the number K and location in time of the change points

Monte Carlo Simulation of the Random Entropy
For ease of notation, we adopt the following vector notation throughout Algorithm 1:
M denotes the time horizon of the simulation.
The Algorithm 1 generates a vector of observations
The result of Algorithm 1 is a sequence of values
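A simplified Monte Carlo loop in the spirit of Algorithm 1 can be sketched as follows. This is a reduced illustration rather than the published algorithm: it assumes a single homogeneous transition matrix (no change-points), exponential marginals whose mean depends on the community, an equicorrelated Gaussian copula generated through one common factor, and the share-based Theil index. All names and numerical values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def theil(values):
    """Share-based Theil index with the convention 0 * ln(0) = 0."""
    n, total = len(values), sum(values)
    return sum((v / total) * math.log(n * v / total) for v in values if v > 0)


def monte_carlo_entropy(P, means, start, horizon, rho, n_paths, rng):
    """Estimate the expected Theil entropy at times 0..horizon."""
    normal = NormalDist()
    sums = [0.0] * (horizon + 1)
    for _ in range(n_paths):
        states = list(start)
        for t in range(horizon + 1):
            if t > 0:
                # Individuals move independently among the communities.
                states = [rng.choices(range(len(P)), weights=P[s])[0]
                          for s in states]
            # One-factor Gaussian copula: pairwise correlation rho.
            common = rng.gauss(0.0, 1.0)
            attrs = []
            for s in states:
                z = math.sqrt(rho) * common + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0)
                u = min(normal.cdf(z), 1.0 - 1e-12)
                # Exponential marginal with community-specific mean.
                attrs.append(-means[s] * math.log(1.0 - u))
            sums[t] += theil(attrs)
    return [total / n_paths for total in sums]


P = [[0.90, 0.08, 0.02], [0.10, 0.80, 0.10], [0.05, 0.15, 0.80]]
means = [10.0, 50.0, 200.0]          # mean attribute per community
start = [0, 0, 1, 1, 2, 2]           # initial community of each individual
expected = monte_carlo_entropy(P, means, start, horizon=5, rho=0.3,
                               n_paths=200, rng=random.Random(7))
print([round(x, 3) for x in expected])
```

Averaging the simulated index over scenarios gives the first moment; higher moments follow by averaging the corresponding powers of the same simulated values.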
In this section we provide a short description of two possible domains of application of the model. Certainly, a variety of additional situations falls well within the described theoretical setting.
Financial Inequality in an Economic Area
This application was originally considered by D’Amico et al. (2018a, 2018b) and subsequently, in a more comprehensive way, in D’Amico et al. (2019). In this framework, the meta-community coincides with a given set of countries all belonging to a given Economic Area. A possible case is represented by the European Economic Area. In practice, every country receives an assessment of its financial creditworthiness, which is expressed in terms of a sovereign credit rating, see e.g. Trueck and Rachev (2009) and D’Amico et al. (2017). Credit ratings are measured on an ordinal scale and assigned by the rating agencies, of which Moody’s, Standard & Poor’s and Fitch are the three major ones.
Each rating class can be seen as a community, in which the countries are allocated at every time. According to its own riskiness (expressed by rating class), each country pays interest rates on its debt. When the interest rates are compared to a benchmark they define the so-called credit spreads. Thus, credit spreads can be seen as personal attributes held by each country in time.
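The passage from interest rates to credit spreads amounts to a subtraction followed by a unit conversion. The sketch below uses invented country labels and rates, and treats the benchmark rate as a given input; which benchmark is appropriate is an assumption left to the user.

```python
def credit_spreads_bp(rates_percent, benchmark_percent):
    """Spread of each rate over the benchmark, in basis points.

    rates_percent     : dict mapping a country label to its rate in percent
    benchmark_percent : benchmark rate in percent (assumed given)
    One percentage point corresponds to 100 basis points.
    """
    return {country: (rate - benchmark_percent) * 100.0
            for country, rate in rates_percent.items()}


# Invented figures, for illustration only.
rates = {"A": 3.5, "B": 2.0, "C": 7.25}
spreads = credit_spreads_bp(rates, benchmark_percent=2.0)
print(spreads)
```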
Empirical analysis has shown that credit spreads of European countries are positively correlated, with the exception of Denmark, Sweden and the United Kingdom. To model this complex correlation structure a copula function can be used according to our framework. Once the credit spreads are obtained, it is possible to compute the vector of attributes at time t,
Human Mobility and Environmental Implications
Another area in which the
For example, it would be possible to consider pollution as a variable depending on the specific location, and to measure by the index
Computational Details and Applications
The software we are presenting here has been engineered so that the main computational kernel is included in a single python module named
While the two mentioned CLIs have been specifically developed to perform separately the Markov reward computation (i.e.
The full software suite has been developed within the Linux OS environment. However, once the needed packages are downloaded and installed, it should also work, without restrictions, under Mac OS and Windows, thanks to the intrinsic portability of the Python programming language. The Python packages strictly needed to run the code, in addition to the aforementioned PyQT5, are:
The Randentropykernel Class and Related CLI
As already stated, the
The

CLI for the Markov reward approach.
Evidently, the CLI options reported in Fig. 2 reflect the cited
It is finally somehow interesting to report here that: in case one wants to perform the simulation using the stationary distribution π of the Markov chain
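For reference, the stationary distribution of a transition matrix can be approximated by repeatedly applying the matrix to an initial distribution. The sketch below is a generic power iteration, not the routine used in the software; it assumes the chain is ergodic, so that the iteration converges to a unique limit, and the function name is hypothetical.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi such that pi = pi P, by power iteration.

    P is a row-stochastic matrix given as a list of lists. Convergence
    is assumed (irreducible, aperiodic chain); otherwise an error is raised.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Two-state example whose stationary distribution is (2/3, 1/3).
pi = stationary_distribution([[0.9, 0.1], [0.2, 0.8]])
print(pi)
```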
As already stated within the
The most relevant methods within the class are needed to specify the transition matrix (i.e.

CLI for change-point detection algorithm.
Once again the CLI options, reported in Fig. 3, as expected, reflect the class capabilities. Thus, to run the code, the input transition matrix has to be specified, in terms of a Matlab or a CSV filename, as well as the matrix name within the file (options
Finally, we introduced some methods, together with the corresponding CLI options, that can also be used to distribute the computational burden among several processes, and thus CPUs. Indeed, when working with a huge amount of data it can be convenient to specify a range of time within which the algorithm is carried out, or to use a specific time distance between two change-points. Thus, the user has the ability to define a range of time for the first change-point (the same applies for the others) via the
All the previously illustrated functionalities have also been integrated into a GUI (Graphical User Interface). The GUI has been implemented using PyQT5, a comprehensive set of Python bindings for Qt v5 (PyQT, 2012). While we implemented two different CLIs, to fully cover the various aspects implemented within the

Dialog to specify the input matrices.

Dialog to specify the input parameters related to the Monte Carlo simulation.
The computation starts after choosing an input file, which can be either a Matlab or a CSV file, containing two matrices. The first matrix has to contain the data of the variable that is supposed to evolve according to a Homogeneous Markov Chain (HMC) (e.g. in the financial application the variable consists of the sovereign credit ratings, see Section 3.1). Accordingly, the first matrix is expected to be named “ratings” by default (see Fig. 4). The second matrix has to refer to the reward process describing the attribute that is driven by the HMC. In the case of the financial application, as illustrated in Section 3.1, this is the credit spread. As the code computes the credit spread directly from the interest rates, the second matrix collects the interest rate data. Indeed, by default, this matrix within the file is expected to be named “interest_rates” (see Fig. 4).
Once the two matrices have been specified, the user may start the computation:
Alternatively, the user can flag “Simulation using stationary distribution” to compute the asymptotic values of the Random Theil’s Entropy. After clicking the button

Output: dynamic inequality.

Histogram of the CS empirical distribution/transition probability matrix.

Options required for change-point detection.
Subsequently, by clicking on
Finally, with
In the case reported in Fig. 8, a single change-point is detected within a range of time spreading between

Output of the change-point detection algorithm.
After confirming the chosen options, the computation starts and the GUI returns the plot of the likelihood function estimated on the community data (see Fig. 9), together with the maximum value of the likelihood function and the corresponding position of the detected change-point. Also in this case, the resulting plot can be saved in a standard graphical file format.
Finally, we show how the described CLIs and GUI can be used to predict the financial inequality in the European Economic Area according to the theoretical model proposed in D’Amico et al. (2018a, 2018b). In this specific case, the meta-community coincides with all the countries within the European Community. Thus, each rating class, as assigned by rating agencies, can be seen as a community, in which the countries are allocated at every time step. As already stated in the previous section, the credit spread represents the personal attribute held by each country.
The results reported here have been obtained using the monthly ratings attributed by the Standard & Poor’s agency to 26 European countries (UK and Cyprus have been excluded from the current meta-community sample) from January 1998 to December 2016 (see D’Amico et al., 2018a, for further details on the data set considered here).
To detect the position of a change-point within the considered time horizon, we evaluate the likelihood function as a function of the position of the change point and fix the change point at the position that maximizes it. In the proposed software one can use both the

GUI results for the change-point detection, see text for details.
The result is reported in Fig. 10, where the likelihood function is computed depending on the position of the change point (measured on the X-axis). The software detects a change-point at time 158 (where the likelihood function attains its maximum value). The value 158 corresponds to a change point detected in January 2012. Indeed, at the beginning of 2012 the value of the total credit spread in Europe had a peak of about 10,000 basis points (bp), and this growth was driven by the rise of the securities yields of Greece (2,924 bp), Ireland (1,245 bp) and Portugal (1,385 bp); see D’Amico et al. (2018b) for more details about the evolution of the financial variables. Similarly, for the interested reader, financial examples with multiple change-points can be found in D’Amico et al. (2019).
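The search for a single change-point by likelihood maximization can be sketched as follows. This is a bare profile-likelihood scan written for illustration, not the algorithm shipped with the software: the transition matrices before and after each candidate position are estimated by relative transition frequencies, and the candidate with the largest total log-likelihood is returned. Function names and the toy data are hypothetical.

```python
import math


def transition_counts(paths, t_from, t_to, n_states):
    """Count transitions path[t-1] -> path[t] for t in (t_from, t_to]."""
    counts = [[0] * n_states for _ in range(n_states)]
    for path in paths:
        for t in range(t_from + 1, t_to + 1):
            counts[path[t - 1]][path[t]] += 1
    return counts


def segment_log_likelihood(counts):
    """Log-likelihood of a segment at the estimate p_ij = n_ij / n_i."""
    ll = 0.0
    for row in counts:
        total = sum(row)
        for c in row:
            if c > 0:
                ll += c * math.log(c / total)
    return ll


def detect_change_point(paths, n_states):
    """Return the best single change-point and the likelihood profile."""
    horizon = len(paths[0]) - 1
    profile = {}
    for tau in range(1, horizon):
        before = transition_counts(paths, 0, tau, n_states)
        after = transition_counts(paths, tau, horizon, n_states)
        profile[tau] = (segment_log_likelihood(before)
                        + segment_log_likelihood(after))
    best = max(profile, key=profile.get)
    return best, profile


# Toy data: both individuals keep their state up to t = 4, then alternate.
paths = [[0, 0, 0, 0, 0, 1, 0, 1, 0],
         [1, 1, 1, 1, 1, 0, 1, 0, 1]]
best, profile = detect_change_point(paths, n_states=2)
print(best, profile[best])
```

Plotting `profile` against the candidate position gives a curve of the same kind as the one shown in Fig. 10, with the detected change-point at its maximum.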
It is relevant to notice that the software (i.e. the
Equivalently, a user can forecast the financial inequality in an economic area and its evolution in time via the

Random Entropy. Results obtained using the CLI are reported on the left panel, while the ones obtained using the GUI have been reported on the right panel.
As a final remark, it is worth underlining that users can build their own code, to perform the same or similar computations to those just described, by directly accessing the functionalities implemented within the
The
Random Entropy evaluation, in the presented general framework, is a new and challenging subject of research and is not available in any other software; this makes our investigation a “unicum” in the literature on inequality assessment in stochastic systems.
