
Abstract

Cycling competition is highly interesting since the team ranking is based on the best performance of some subset of team members. The paper develops new inequality indicators, a methodology to construct them, and numerical illustrations providing operative arguments in their favor. The numerical illustrations deal with hierarchical ranking indicators of (female) cyclist teams competing in multi-stage races. For the illustration, the 2023 editions of the most famous long races for females are considered: 34th Giro d’Italia Donne, 2nd Tour de France Femmes, 9th Vuelta Femenina. Several classical ranking indicators are recalled and adapted to the study cases. The most usual indicator, $T_{L}$, is based on the riders’ arrival times for the various stages, i.e., according to Union Cycliste Internationale (UCI) standard rules. One also uses another indicator, $A_{L}$, which requires that the riders finish the race, whence each stage, in order to define the race best team. Another contribution of the paper derives from specific developments of these indicators, thereby leading to new measures: the “leadership gap”, based on $A_{L} - T_{L}$, and the “competition temperature”, based on entropy. It is argued that the numerical values point to differences in team strategy based on rider skill levels. The ranking of contributions to indicators allows one to observe the “crucial core” made of the most competitive teams.

Keywords

dynamics of social systems, entropy, racing strategy, rules of sport, strategy

Introduction

Rules have to be devised for providing a realistic hierarchy of choices.

Yet, the ranking methodology can lead to much debate,^1–7 among many others in social choice considerations, as in the case of tournament ranking methods.^1,8–14 In particular, sport activities seem to provide rather objective and quantitatively reliable data for academic studies.¹⁵

Among these, it appears that cyclist races contain much interesting data. Indeed, one can focus on the role of individuals within a team, since team hierarchy is based on the best performance of a subset of members of the team. Within this framework, incentives must be provided to teams and team members for showing some interesting race competition.¹⁶ Thus, relevant hierarchy indices are needed, and these are somewhat tied to money awards.^17,18

In the following, one considers the three most famous female races within the Union Cycliste Internationale (UCI) classifications: Giro d’Italia Donne, Tour de France Femmes, Vuelta Femenina.

According to UCI rules, the hierarchy of the teams, at the end of an $L$-long multi-stage (professional) cyclist race, depends on the cumulative time ($T_{L}$) of the team’s 3 fastest riders for every stage, not taking into account time bonuses or penalties. Such bonuses and penalties are relevant for the rider standing, but should not be taken into account for the team rank. However, for the final team ranking at the end of the race, the UCI team hierarchy does not even care whether such riders, relevant for some stages, do finish the whole race. This highly debatable measure has been discussed elsewhere as paradoxical, and proved to be highly biased.

Thus, one may introduce an “adjusted team final time” measure, $A_{L}$, based on the (3 fastest) riders of a team who have finished the whole race.^19,20 In so doing, one avoids a possible Cipollini effect,²¹ which occurs when riders are specifically selected, as sprinters, for the first few, usually easy, stages, but are withdrawn thereafter, yet still contribute, even though absent, to the overall team time classification.

Going beyond the above consideration, one may derive metrics that aim at measuring some team skill and also at attempting to quantify team global strategy, for a given race. For so doing, one proposes two new measures or indicators: (i) the “leadership gap index”, (ii) the “race temperature index”. The ranking of teams according to such indicators allows one to observe the “crucial core” made of the most competitive teams.

In brief, these two newly defined metrics complement the entropy approach¹⁹ and hopefully develop previous works^20,21 toward team management and coaching applications.

For completeness, let it be observed that this paper enters the framework of studies on cycling published in the International Journal of Sports Science & Coaching and other journals. Notice that most works pertain to the (physiological) characteristics of the cyclists.^22–25 Closer to the present aim, O’Grady et al. discuss, after interviews, tactical strategies that professional riders and coaches prepare at training time for application in races.²⁶

In Section “Research questions”, one poses the Research Questions and mentions the Data sources. One displays the fundamental characteristics of the races.

In Section “Methodology”, one introduces the methodology, including the formulae for $T_{L}$ and $A_{L}$ . Next, one explains that (i) the “leadership gap index”, in Section “Leadership gap index”, is based on [ $A_{L} - T_{L}$ ]; (ii) the “race temperature index” is defined through Shannon-Boltzmann-Gibbs entropy concepts, in Section “Stage and race temperature index”. It is argued why these indices are so called.

In Section “Other indicators”, other hierarchy measuring indices are considered for ready comparison, i.e., in order to point out some qualitative advantages and disadvantages of the newly proposed indicators.^12,27–31 There exist fundamentally different approaches in ranking methodology. It is pertinently emphasised that changing the ranking rules, in a multi-stage race, may change some tournament metrics; see, for example, scoring and ranking simulation by Csató.¹³

Here, two inequality indicators can be directly derived from the distribution characteristics in order to evaluate the dispersion of team “values”: the Atkinson index,³² and the Coefficient of Variation. Three classical indicators of dispersion can be next considered: the Herfindahl-Hirschman index, the Gini coefficient, and the Theil index.^33–35 These indicators show how dispersed the final times are, but are calculated without taking into account the ranking of teams. One may delve more into the hierarchy problem if one ranks the components. Moreover, one can calculate other ad hoc indicators: the Pietra-Hoover index,³⁶ and the Rosenbluth coefficient.³⁷

Section “Results and analysis” contains numerical results and some analysis. In Section “Discussion”, one delves deeper into team hierarchy, comparing teams in the various races. Conclusions follow in Section “Conclusions”, together with suggestions for further research due to obvious limitations of the present study.

Research questions

Due to the considerations outlined in the Introduction section, i.e., the unjustified and shockingly biased UCI constraints on the usual team value measures, one can select the following research questions as a guiding thread of the paper:

can one provide indicators with less biased constraints on team ranking?

are there strategic or coaching features which arise in studying and measuring team “competition” hierarchy, in cyclist races?

For finding proper answers, geared toward various disciplines but based on case studies, the following top multi-stage races with complex data are hereby used:

34th Giro d’Italia Donne;

2nd Tour de France Femmes;

9th Vuelta Femenina.

For the present exposé only a single year is examined: 2023.

For simplicity, the races will be called $T d F$ , $G d I$ , $V a E$ . A few fundamental characteristics for the 3 races are found in Table 1: dates of races, length, number of stages, number of riders and of teams, etc. The list of competing teams and their UCI code are given in Table 2, also emphasizing the team level according to UCI for those participating in a specific race: $W$ refers to Women’s World Teams; $C$ to UCI Women’s Continental Teams.

Table 1.

Characteristic values of the 2023 $G d I$, $T d F$, $V a E$ races; $L$: the number of stages; $n_{0}$: the allowed maximum number of starting riders in a team; $M_{0}$: the number of teams considered (distinguishing the number of W and C teams); $N_{0}$ and $N_{L}$: the number of starting and finishing riders; $M_{L}$: the number of teams finishing the race with at least 3 riders. The winning rider and the winning team (according to UCI rules) and the race dates are recalled. (*) N.B. The race was supposed to be 9 stages long, i.e., 968 km, but the 4.4 km long (time trial) 1st stage was cancelled due to weather conditions.

Notations	$G d I$	$T d F$	$V a E$
$L$	8(*)	8	7
$n_{0}$	7	7	7
$M_{0}$	24 (=15+9)	22 (=15+7)	23 (=12+11)
Distance ( $k m$ )	963.6	960.4	740.5
$N_{0}$	167	154	160
$N_{L}$	133	123	127
$M_{L}$	23	22	22
Winning rider	vanVleuten	Vollering	vanVleuten
Winning team	MOV	SDW	UAD
Race dates	06/30-07/09	07/23-07/30	05/01-05/07

Table 2.

UCI acronym of team sponsors, for competing female teams in the 3 main multi-stage 2023 races considered here ($G d I$, $T d F$, $V a E$); the 2022 ranking order is from https://www.procyclingstats.com/rankings.php?date=2023-12-26&nation=&level=&filter=Filter&p=we&s=uci-teams. The 2022 UCI code acronyms are used. The team participations in the specific races are summarized in the last 3 columns; W refers to Women’s World Teams; C to UCI Women’s Continental Teams, according to UCI 2023 levels. N.B. $^{(*)}$ change of sponsors during racing year; $^{(* *)}$ had no licence in 2022.

2022	UCI	Team	2023	2023	2023
Rank	Code	Sponsors	$G d I$	$T d F$	$V a E$
1	SDW	Team SD Worx	W	W	W
2 $^{(*)}$	TFS	Trek - Segafredo // Lidl - Trek	W	W	W
3	DSM	Team DSM - Firmenich	W	W	W
4	FST	FDJ - SUEZ	W	W	W
5	MOV	Movistar Team	W	W	W
6	CSR	Canyon-SRAM Racing	W	W	W
7	UAD	UAE Team ADQ	W	W	W
9	JAY	Team Jayco AlUla	W	W	W
10	JVW	Team Jumbo-Visma (TJV)	W	W	W
11	TIB	EF Education-TIBCO-SVB	W	W	W
13	LIV	Liv Racing TeqFind	W	W	W
14	WNT	CERATIZIT-WNT Pro Cycling		C
15	LPW	Lifeplus Wahoo		C
19	AGS	AG Insurance - Soudal Quick-Step	C	C
20	HPU	Team Coop - Hitec Products		C	C
23	AUB	St Michel - Mavic - Auber93 WE		C	C
24	UXT	Uno-X Pro Cycling Team	W	W
25	HPH	Human Powered Health	W	W
26	BPK	BePink - GOLD	C		C
28	COF	Cofidis Women Team		C
30	ARK	Arkéa Pro Cycling Team		C
31	MAT	Massi - Tactic Women Team			C
33	BDU	Bizkaia Durango	C		C
35	TOP	Top Girls Fassa Bortolo	C
36	EIC	Eneicat - CMTeam - Seguros Deportivos			C
39	SWT	Sopela Women’s Team			C
40	VAI	Aromitalia - Basso Bikes - Vaiano	C
42	FBW	Farto-BTC Women’s Cycling Team			C
51	LKF	Laboral Kutxa Fundación Euskadi			C
53	SBT	Isolmant - Premac - Vittoria	C
56	STC	Soltec Team			C
59	BTW	Born To Win G20 Ambedo	C
63	CDR	Cantabria Deporte - Rio Miera			C
64	MDS	Team Mendelspeck	C
112	FED	Fenix-Deceuninck	W	W
145	COG	Israel Premier Tech Roland	W	W	W
$^{(* *)}$	GBJ	GB Junior Team Piemonte Pedale Castanese A.S.D.	C

One can freely obtain relevant data from the organizers’ websites. However, the data are not all provided in a consistent way. Therefore, it is best to rely on professional websites, e.g., https://www.procyclingstats.com. Nevertheless, data cross-checking must be systematically done; one chosen method has been to use Wikipedia pages. Disagreements are still found; they have been manually resolved.

Methodology

One should recall that in an $L$-long multi-stage cyclist race, the UCI rules imply that the winning team is discovered as the team $(#)$ having the lowest sum of the cumulative times for the 3 fastest riders of the team for each stage, i.e., the lowest $T_{L}$:

$$T_{L}^{(#)} = \sum_{s = 1}^{L} t_{s}^{(#)} \tag{1}$$

where the team (finishing) time for stage $s$ is

$$t_{s}^{(#)} = \sum_{i = 1}^{3} t_{i, s}^{(#)} \tag{2}$$

and where $t_{i, s}^{(#)}$ is the finishing time of one of the 3 fastest riders ($i = 1, 2, 3$) of team $(#)$ for that stage.
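As an illustration of equations (1) and (2), the following minimal Python sketch computes a team time from per-stage rider times. The data layout (one dictionary per stage, mapping a rider label to a stage time in seconds), the function names, and the toy numbers are assumptions made here for illustration only; they are not taken from the race data.

```python
from typing import Dict, List


def stage_team_time(rider_times: List[float], n_best: int = 3) -> float:
    """Equation (2): sum of the n_best smallest rider times on one stage."""
    if len(rider_times) < n_best:
        raise ValueError("fewer riders than n_best finished this stage")
    return sum(sorted(rider_times)[:n_best])


def team_time_T(stages: List[Dict[str, float]], n_best: int = 3) -> float:
    """Equation (1): sum of the stage team times over all stages.

    Each stage maps a rider label to that rider's stage time in seconds;
    only riders who finished the stage appear in the mapping.
    """
    return sum(stage_team_time(list(stage.values()), n_best) for stage in stages)


# Hypothetical 2-stage race for one team (illustrative numbers only).
toy_stages = [
    {"a": 100.0, "b": 101.0, "c": 103.0, "d": 110.0},
    {"b": 200.0, "c": 202.0, "d": 205.0},  # rider "a" abandoned after stage 1
]
T_toy = team_time_T(toy_stages)  # (100 + 101 + 103) + (200 + 202 + 205)
```

With these toy numbers the team time is 911 s, and rider “a” contributes to it although this rider did not finish the race.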

It is re-emphasized that the fact that such riders do not necessarily finish the $L$-long race appears to be irrelevant for UCI. But one may rightly wonder thereafter whether the sum in equation (1) points to the “best team”. It seems that one should consider the adjusted team final time $A_{L}^{(#)}$ such that

$$A_{L}^{(#)} = \sum_{j = 1}^{3} t_{j, L}^{(#)} \tag{3}$$

where $j = 1, 2, 3$ refers to the 3 fastest riders having completed all $L$ stages for team $(#)$.²⁰ Let it be emphasized that these 3 “$j$” riders might be quite different from the 3 “$i$” riders having contributed to any $t_{s}^{(#)}$, whence to $T_{L}^{(#)}$.
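A minimal Python sketch of equation (3) may help to fix ideas. It assumes, for illustration only, that each stage is stored as a dictionary mapping a rider label to a stage time in seconds (bonuses and penalties excluded), and that a rider missing from a stage has abandoned; the function name and the toy numbers are hypothetical.

```python
from typing import Dict, List, Optional


def adjusted_team_time_A(
    stages: List[Dict[str, float]], n_best: int = 3
) -> Optional[float]:
    """Equation (3): sum of the cumulative times of the n_best fastest riders
    among those who completed every stage.

    Returns None when fewer than n_best riders finished the whole race.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    finishers = set(stages[0])
    for stage in stages[1:]:
        finishers &= set(stage)  # keep only riders present in every stage
    if len(finishers) < n_best:
        return None
    cumulative = sorted(sum(stage[r] for stage in stages) for r in finishers)
    return sum(cumulative[:n_best])


# Hypothetical 2-stage race for one team (illustrative numbers only).
toy_stages = [
    {"a": 100.0, "b": 101.0, "c": 103.0, "d": 110.0},
    {"b": 200.0, "c": 202.0, "d": 205.0},  # rider "a" abandoned after stage 1
]
A_toy = adjusted_team_time_A(toy_stages)  # riders b, c, d: 301 + 305 + 315
```

In this toy case only riders b, c and d finish, so the adjusted time is 921 s; a team left with 2 finishers gets no value, as for the teams marked “…” in the tables.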

One complexity has to be emphasized: according to UCI rules, the final time of a rider, at the end of a stage, whence at the end of the race, takes into account bonuses (and penalties); thus the true finishing time of a rider is apparently equal to the sum of “the reported final time + bonus - penalties”. However, UCI rules disregard such extra time measures in order to calculate the team time on a stage, whence for the whole race. Thus, the same “restriction” has been used for calculating the $A_{L}$ team final time.

Therefore, the methodological path goes as follows:

get each team $T_{L}^{(#)}$, according to the organizers’ published data;

rank riders in each team according to their true finishing race time, i.e., excluding bonuses and penalties (if they exist);

select the 3 fastest riders overall in each team, among those having completed each daily stage, and add their cumulative times to get $A_{L}$.

One obtains the values and hierarchy displayed in Tables 3 to 5, in increasing time order. The statistical characteristics of the relevant finishing times distributions are reported in Table 6.

Table 3.

Table displaying the final ranking of the ( $M_{L} = 23$ female) teams after the last $G d I$ 2023 stage, i.e., $T_{L}$ according to the UCI rules, or $A_{L}$ resulting from the sum of the finishing times of the 3 fastest riders (excluding their possible bonuses and penalties) of the team ( $#$ ); the last two columns report the team hierarchy according to the ascending value of $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ . BTW finished the multi-stage race with only 2 riders.

	$G d I$ 2023
Rank	Team	$T_{L}^{(#)}$	Team	$A_{L}^{(#)}$	Team	$Δ_{L}^{(#)}$
1	MOV	73:54:51	MOV	73:59:03	DFP	0:00:33
2	FST	73:55:37	FST	74:02:34	SDW	0:03:13
3	LTK	74:05:36	DFP	74:16:58	VAI	0:03:27
4	DFP	74:16:25	SDW	74:29:31	BPK	0:04:01
5	CSR	74:20:26	UAD	74:35:06	MOV	0:04:12
6	UAD	74:22:33	LTK	74:35:41	TJV	0:05:45
7	SDW	74:26:18	CSR	74:43:13	FST	0:06:57
8	JAY	74:32:19	FED	74:56:18	COG	0:07:46
9	FED	74:39:03	TIB	75:02:57	TOP	0:08:20
10	TIB	74:43:13	JAY	75:05:57	HPH	0:08:46
11	TJV	75:04:06	TJV	75:09:51	GBJ	0:09:06
12	HPH	75:08:40	HPH	75:17:26	UAD	0:12:33
13	AGS	75:22:19	COG	75:38:24	SBT	0:14:35
14	LIV	75:23:08	AGS	75:39:49	FED	0:17:15
15	COG	75:30:38	LIV	75:45:24	AGS	0:17:30
16	UXT	75:34:50	UXT	75:57:22	TIB	0:19:44
17	TOP	76:54:08	TOP	77:02:28	MDS	0:21:56
18	BPK	77:01:00	BPK	77:05:01	LIV	0:22:16
19	MDS	77:02:05	MDS	77:24:01	UXT	0:22:32
20	SBT	77:49:13	SBT	78:03:48	CSR	0:22:47
21	VAI	78:20:07	VAI	78:23:34	LTK	0:30:05
22	BDU	78:22:49	GBJ	79:18:17	JAY	0:33:38
23	GBJ	79:09:11	BDU	79:30:07	BDU	1:07:18
24	BTW	…	BTW	…	BTW	…

Table 4.

Table displaying the final ranking of the $M_{L} = 22$ teams after the last $T d F$ 2023 stage, i.e., $T_{L}$ according to the UCI rules, or $A_{L}$ resulting from the sum of the finishing times of the 3 fastest riders (excluding their possible bonuses and penalties) of the team( $#$ ); the last two columns report the team hierarchy according to the ascending value of $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ .

	$T d F$ 2023
Rank	Team	$T_{L}^{(#)}$	Team	$A_{L}^{(#)}$	Team	$Δ_{L}^{(#)}$
1	SDW	76:17:38	SDW	76:26:34	ARK	0:02:12
2	CSR	76:29:43	MOV	76:41:25	UAD	0:04:15
3	MOV	76:35:41	CSR	76:46:20	JAY	0:05:34
4	FST	76:37:18	UAD	76:53:19	COF	0:05:35
5	UAD	76:49:04	FST	76:54:55	COG	0:05:37
6	AGS	76:53:31	AGS	77:01:21	MOV	0:05:44
7	TJV	77:02:04	TJV	77:18:35	HPH	0:05:53
8	LTK	77:05:09	COG	77:20:44	AGS	0:07:50
9	DFP	77:07:58	DFP	77:24:11	WNT	0:08:26
10	COG	77:15:07	LTK	77:33:18	SDW	0:08:56
11	FED	77:27:11	WNT	77:37:40	AUB	0:13:51
12	LIV	77:27:47	AUB	77:55:05	DFP	0:16:13
13	WNT	77:29:14	FED	77:56:01	TJV	0:16:31
14	TIB	77:39:34	COF	77:57:04	CSR	0:16:37
15	AUB	77:41:14	HPH	78:00:33	FST	0:17:37
16	COF	77:51:29	JAY	78:06:15	LPW	0:20:49
17	HPH	77:54:40	ARK	78:17:01	LTK	0:28:09
18	LPW	77:58:42	LIV	78:17:50	FED	0:28:50
19	JAY	78:00:41	LPW	78:19:31	HPU	0:38:39
20	UXT	78:08:20	TIB	78:32:07	LIV	0:50:03
21	ARK	78:14:49	UXT	78:58:52	UXT	0:50:32
22	HPU	79:25:34	HPU	80:04:13	TIB	0:52:33

Table 5.

Table displaying the final ranking of the ( $M_{L} = 22$ ) teams after the last $V a E$ 2023 stage, i.e., $T_{L}$ according to the UCI rules, or $A_{L}$ resulting from the sum of the finishing times of the 3 fastest riders (excluding their possible bonuses and penalties) of the team ( $#$ ); the last two columns report the team hierarchy according to the ascending value of $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ . SWT finished the $V a E$ 2023 multi-stage race with only 2 riders.

	$V a E$ 2023
Rank	Team	$T_{L}^{(#)}$	Team	$A_{L}^{(#)}$	Team	$Δ_{L}^{(#)}$
1	UAD	56:39:07	FST	57:27:16	TJV	0:36:06
2	FST	56:45:07	UAD	57:29:18	SDW	0:36:39
3	CSR	56:46:21	SDW	57:33:51	TFS	0:37:11
4	SDW	56:57:12	CSR	57:33:55	AUB	0:38:04
5	MOV	57:04:05	TJV	57:45:23	TIB	0:38:21
6	TJV	57:09:17	MOV	57:46:37	FST	0:42:09
7	DSM	57:10:52	TFS	57:53:19	COG	0:42:26
8	TFS	57:16:08	DSM	57:53:29	LIV	0:42:31
9	JAY	57:32:00	JAY	58:15:12	MOV	0:42:32
10	COG	57:36:22	COG	58:18:48	DSM	0:42:37
11	LIV	57:43:19	LIV	58:25:50	JAY	0:43:12
12	LKF	57:50:32	TIB	58:33:59	EIC	0:46:11
13	TIB	57:55:38	AUB	58:53:35	CSR	0:47:34
14	EIC	58:07:34	EIC	58:53:45	UAD	0:50:11
15	AUB	58:15:31	LKF	59:06:29	CDR	0:51:51
16	BDU	58:31:27	BDU	59:40:22	FBW	0:52:25
17	MAT	59:05:37	MAT	60:00:16	MAT	0:54:39
18	BPK	59:15:03	BPK	60:17:18	STC	1:01:27
19	HPU	59:24:51	HPU	60:42:30	BPK	1:02:15
20	FBW	60:10:45	FBW	61:03:10	BDU	1:08:55
21	CDR	60:27:06	CDR	61:18:57	LKF	1:15:57
22	STC	61:17:50	STC	62:19:17	HPU	1:17:39
23	SWT	…	SWT	…	SWT	…

Table 6.

Table displaying a few statistical characteristics of the final times (as defined in the text) distributions of teams in the 3 races: all times in h:m:s; $μ$ is the arithmetic mean; $γ$ is the geometric mean; $σ$ is the standard deviation; Med. is the median; Skewn. the skewness; Kurt. the kurtosis.

	$G d I$			$T d F$			$V a E$
	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$
$M_{L}$	$23$			$22$			$22$
Min.	73:54:51	73:59:03	0:00:33	76:17:38	76:26:34	0:02:12	56:39:07	57:27:16	0:36:06
Max.	79:09:11	79:30:07	1:07:18	79:25:34	80:04:13	0:52:33	61:17:50	62:19:17	1:17:39
$μ$	75:39:04	75:54:54	0:15:50	77:26:01	77:44:41	0:18:39	58:08:16	58:57:51	0:49:35
$γ$	75:38:07	75:53:53	0:10:37	77:25:50	77:44:25	0:12:58	58:07:26	58:56:53	0:48:15
Med.	75:08:40	75:17:26	0:12:33	77:27:29	77:46:23	0:15:02	57:46:56	58:29:55	0:44:42
$σ$	1:35:42	1:39:20	0:14:27	0:43:14	0:50:44	0:16:04	1:18:06	1:25:21	0:12:29
Skewn.	0.84232	0.88415	2.03400	0.70337	0.76641	1.06135	0.93943	0.84955	1.00163
Kurt.	$-$ 0.59051	$-$ 0.38866	5.00210	0.77460	0.77392	$-$ 0.16262	$-$ 0.06882	$-$ 0.37660	$-$ 0.07262

Together with Table 1, Table 6 allows one to compare race difficulties a posteriori. It is easily observed that the time distributions of $T_{L}$ and $A_{L}$ are similar for $G d I$ and $T d F$, both races taking much more time than $V a E$, since indeed they are $≃ 1.3$ times longer. In the 3 cases, the skewness is positive, $≃ 0.8$, indicating a long tail in the final time distributions for the slowest teams. The negative kurtosis for $G d I$ and $V a E$ indicates a flatter distribution, whence a race where teams find a more balanced competition, in contrast to the $T d F$, which presents a distribution peaked at the mean, itself close to the median. The same deduction holds when observing $σ$, much smaller in $T d F$, indicating a fiercer competition between the top teams.

Leadership gap index

Since the measures $A_{L}^{(#)}$ and $T_{L}^{(#)}$ indicate a different team hierarchy, one can consider their difference,

$$Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)} \tag{4}$$

which measures a behavioral difference between teams and/or riders’ performances, which might be due to team members’ skills or to coaching strategy.

The values of $Δ_{L}^{(#)}$ are given in Tables 3 to 5. The smallest $Δ_{L}^{(#)}$ should correspond to the cases in which the 3 fastest riders in each stage remain so throughout the whole competition, and finish it. A large value, in contrast, indicates a team emphasis on a distributive role according to riders’ skills.

Indeed, the indicator reaches a large value if the riders are not much concerned by their final rank. In this case, the “team leaders” do not seem to be “pre-defined”. Thus, $Δ_{L}^{(#)}$ appears to be a measure of a specific rider leadership definition in a team; in other words, the value of $Δ_{L}^{(#)}$ measures a “gap” between team strategies, depending on riders’ skills and coaches’ mandates.
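The leadership gap can be checked directly on the tabulated times. The short Python sketch below, whose helper names are hypothetical, converts the h:mm:ss strings to seconds and takes the difference $A_{L}^{(#)} - T_{L}^{(#)}$; the two input pairs are the MOV and DFP entries of Table 3.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' string, as used in the tables, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a non-negative number of seconds back to 'h:mm:ss'."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def leadership_gap(t_team: str, a_team: str) -> str:
    """Leadership gap A_L - T_L for one team, both inputs given as 'h:mm:ss'."""
    return to_hms(to_seconds(a_team) - to_seconds(t_team))


gap_mov = leadership_gap("73:54:51", "73:59:03")  # MOV, GdI 2023 (Table 3)
gap_dfp = leadership_gap("74:16:25", "74:16:58")  # DFP, GdI 2023 (Table 3)
```

These two calls give 0:04:12 and 0:00:33, i.e., the values reported in the last column of Table 3 for MOV and DFP.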

Stage and race temperature index

Shannon and Boltzmann-Gibbs entropy are analogous measures of disorder in informatics and thermodynamics.^38,39 The maximum entropy value corresponds to a state of maximum uncertainty, i.e., when all outcomes are equally likely, pointing to a lack of structure or even predictability because of the absence of disorder. Thus, the concept seems of interest for measuring some operational effect in sport results.⁴⁰

Let $p_{k}^{(s)}$ be the relative time measure ($≃$ “contribution”) of the (3 fastest riders of a) team ($k$) to the total (cumulative) time that was needed by the 3 fastest riders of each team among all ($M_{0}$) teams in competition in order to finish the stage $s$:

$$p_{k}^{(s)} = \frac{t_{k}^{(s)}}{\sum_{k' = 1}^{M_{0}} t_{k'}^{(s)}}; \tag{5}$$

$t_{k}^{(s)}$ is defined in terms of $t_{i, s}^{(#)}$, the finishing time of one of the 3 fastest riders ($i = 1, 2, 3$) of team $k$ (or $#$, in terms of UCI codes) for that stage; see equation (2). This measure can be taken, per definition, like a probability of a team best finishing time among others. Thus, $p_{k}^{(s)}$ can serve to characterize how rare the occurrence of such an outcome is. The stochastic Shannon entropy reads

$$S_{k}^{(s)} = - p_{k}^{(s)} \ln (p_{k}^{(s)}) . \tag{6}$$

The average of such a $S_{k}^{(s)}$ over the total probability distribution leads to the Shannon information entropy

$$S_{s}^{(#)} \equiv < S_{k}^{(s)} > \equiv - \sum_{k = 1}^{M_{0}} p_{k}^{(s)} \ln (p_{k}^{(s)}) . \tag{7}$$

The whole race entropy $S_{L}$ derives from summing the stage entropies over all stages: $S_{L} = \sum_{s = 1}^{L} S_{s}^{(#)}$.

Thereafter, reconnecting the Shannon information entropy to the thermodynamic Boltzmann-Gibbs entropy, one can define a team dependent (“generalized”) temperature during the stage $s$ as

$$θ_{k}^{(s)} ≃ \frac{- 1}{p_{k}^{(s)} \ln (p_{k}^{(s)})} . \tag{8}$$

Mutatis mutandis, this is analogous to the “temperature of financial markets”.^41,42 In fact, one may propose a ($s$-th) “stage temperature index” as

$$θ^{(s)} ≃ \frac{- 1}{\sum_{k = 1}^{M_{0}} p_{k}^{(s)} \ln (p_{k}^{(s)})}; \tag{9}$$

the smaller it is, the cooler appears to be the competition during the stage. Indeed, recall that the Shannon entropy of a uniform distribution, i.e., if all $p_{k}$ are equal, thus in the absence of disorder, is the maximum entropy value which can occur. Randomness or disorder, in $p_{k}$’s, thereby corresponds to a high “(behavioral) temperature”, here seen as an intense competition.
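The chain from team times to shares, entropy and stage temperature, equations (5), (7) and (9), can be sketched in a few lines of Python. The function names and the two synthetic sets of team times are assumptions made for illustration; they are not race data.

```python
import math
from typing import List


def shares(team_times: List[float]) -> List[float]:
    """Equation (5): each team's time divided by the sum over all teams."""
    total = sum(team_times)
    return [t / total for t in team_times]


def stage_entropy(team_times: List[float]) -> float:
    """Equation (7): Shannon entropy of the shares (natural logarithm)."""
    return -sum(p * math.log(p) for p in shares(team_times) if p > 0.0)


def stage_temperature(team_times: List[float]) -> float:
    """Equation (9): inverse of the stage entropy (needs at least 2 teams)."""
    return 1.0 / stage_entropy(team_times)


# Synthetic stage with 23 teams: identical times versus spread-out times.
equal_times = [3600.0] * 23
spread_times = [3600.0 + 60.0 * k for k in range(23)]

entropy_equal = stage_entropy(equal_times)    # equals ln(23), the maximum
entropy_spread = stage_entropy(spread_times)  # strictly smaller
```

For 23 identical team times the entropy reaches its maximum, $\ln 23 ≃ 3.1355$, whence the lowest possible temperature, $≃ 0.3189$; any spread in the times lowers the entropy and raises the temperature.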

The “team temperature” is forecasted to be higher if the riders have much strategic freedom. It is readily expected that such a temperature is lower if leaders are well defined. In the present case, this occurs, as easily understood, if $Δ_{L}$ is small.

The overall race temperature is of course

$$θ_{L} = \frac{1}{S_{L}} = \frac{- 1}{\sum_{s = 1}^{L} \sum_{k = 1}^{M_{0}} p_{k}^{(s)} \ln (p_{k}^{(s)})} \tag{10}$$

where the latter $θ_{L}$ has been calculated from the time of riders irrespective of the fact that they might not have finished the whole race; thus, more exactly, one should have written $θ_{L}$ rather as $θ_{L} (T_{L})$. Recall that instead of $t_{k}^{(s)}$ in equation (5), one can consider that the pertinent time is that of riders finishing the whole race. Following the above path, one would obtain a final race temperature $θ_{L} (A_{L})$. Moreover, another temperature can be derived from $A_{L} - T_{L}$ data, leading to some $θ_{L} (A_{L} - T_{L})$. Due to the nonlinear data transformations, $θ_{L} (A_{L} - T_{L}) \neq θ_{L} (A_{L}) - θ_{L} (T_{L})$, in contrast to the entropy, which has an additivity property.³⁹

Again, one can justify the semantic validity for calling $θ_{L}$ a relative temperature index. Indeed, $θ_{L}$ appears to be a measure of the distribution of riders’ kinetic energy at the end of a race (or stage).

Other indicators

Atkinson index

The Atkinson index $A t_{L}$ can be used for evaluating the strength dispersion of teams in a race.³² Per definition,

$$A t_{L} = 1 - γ / μ \tag{11}$$

where $γ$ is the geometric mean and $μ$ is the arithmetic mean of the distribution, both reported in Table 6. The Atkinson index has previously been used in sport in order to measure competitive balance, with an application to English Premier League football.⁴³ One can obviously apply the notion to the present cases. It could also be considered for daily stages ($s$) rather than the whole race ($L$), but such an application is left for further work.
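Equation (11) is straightforward to code. The sketch below, with a hypothetical function name and toy inputs, takes the geometric mean through the mean of logarithms; this form of the index corresponds to the Atkinson family with inequality-aversion parameter equal to 1, which is an interpretation added here and is not stated in the text.

```python
import math
from typing import List


def atkinson_index(values: List[float]) -> float:
    """Equation (11): 1 - (geometric mean) / (arithmetic mean).

    All values must be strictly positive, as team times are.
    """
    if not values or any(v <= 0.0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    n = len(values)
    arithmetic_mean = sum(values) / n
    geometric_mean = math.exp(sum(math.log(v) for v in values) / n)
    return 1.0 - geometric_mean / arithmetic_mean


# Toy inputs (not race data): geometric mean 2, arithmetic mean 2.5.
at_toy = atkinson_index([1.0, 4.0])
at_equal = atkinson_index([5.0, 5.0, 5.0, 5.0])
```

The index vanishes when all values are equal and grows with their dispersion; for the toy pair it equals 0.2.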

Coefficient of variation

Among the indicators using statistical characteristics of distributions, the Coefficient of Variation ($C V$) measures the relative dispersion of the data, i.e., the dispersion ($σ$) around the mean ($μ$) of the distribution; thus, expressed as a percentage, it somewhat hints at inequalities. Per definition, one has

$$C V_{L} = σ / μ \tag{12}$$

easily obtained from the distribution characteristics.
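A minimal Python sketch of equation (12) follows. Whether $σ$ is the population or the sample standard deviation is not specified in the text, so the `population` flag below is an assumption left to the reader; the function name and toy data are hypothetical.

```python
import statistics
from typing import List


def coefficient_of_variation(values: List[float], population: bool = True) -> float:
    """Equation (12): standard deviation divided by the arithmetic mean.

    population=True uses the population standard deviation (divisor N);
    population=False uses the sample standard deviation (divisor N - 1).
    """
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values) if population else statistics.stdev(values)
    return sigma / mu


# Toy data with mean 5 and population standard deviation 2.
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
cv_population = coefficient_of_variation(toy)
cv_sample = coefficient_of_variation(toy, population=False)
```

For a small number of teams the two conventions differ by a factor $\sqrt{N / (N - 1)}$, which matters when comparing values obtained from different tools.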

Herfindahl-Hirschman index

One may also recall the Herfindahl-Hirschman index serving to measure the “amount of competition” between economic entities,^44–47 or, for our examples, between teams.¹⁸

The Herfindahl index, also known as Herfindahl-Hirschman index ($H H I$), is a measure of “concentration in a market”.³³ Formally, in obvious notations, it reads

$$H H I = \sum_{k = 1}^{N} {(\frac{y_{k}}{\sum_{j} y_{j}})}^{2} \tag{13}$$

where $y_{k}$ is some economic measure, like a company size, or its share (thus, a concentration) in a market. Thus, a $H H I$ index $\leq 0.01$ indicates a highly competitive market between $N$ firms. From a portfolio point of view, a low $H H I$ index implies a very diversified portfolio; a high concentration demands $H H I \geq 0.25$; a low concentration, $H H I \leq 0.15$; $H H I$ ranges between $1 / N$ and 1.⁴⁴

Adapted to the case of sport team ranking, $y_{k}$ can be considered as the finishing time ( $t_{k}$ ) of a team ( $k$ ), i.e., leading to $H H I = \sum H H I_{k} \equiv \sum p_{k}^{2}$ , as defined in equation (5), and where the relevant team times are selected depending on the chosen $A_{L}^{(#)}$ or $T_{L}^{(#)}$ scheme, or even $Δ_{L}$ .

As an extreme example, which sometimes occurs, if the 3 riders of each team arrive together, whence have the same finishing time for a stage, all terms in the equation (13) sum are equal, whence $H H I = 1 / N$, its lowest possible value, pointing to uniformity or in other words to a rather weakly competitive race. Conversely, an increase in $H H I$ represents a decrease in competitive balance.^48,49

One sometimes says that the “number of effectively important competitors” is the inverse of the Herfindahl index.

A normalized $H H I$ is sometimes used in order to attempt some universal definition:

$$H H I^{*} = \frac{(H H I - \frac{1}{N})}{1 - \frac{1}{N}}, \tag{14}$$

with the appropriate $N$; it ranges between 0 and 1.
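Equations (13) and (14) can be sketched as follows in Python; the function names and the toy inputs are hypothetical, and the inputs stand for any non-negative team measure (e.g., team times).

```python
from typing import List


def hhi(values: List[float]) -> float:
    """Equation (13): sum of the squared shares of each item in the total."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)


def normalized_hhi(values: List[float]) -> float:
    """Equation (14): HHI rescaled from [1/N, 1] to [0, 1]; needs N >= 2."""
    n = len(values)
    if n < 2:
        raise ValueError("normalization requires at least 2 items")
    return (hhi(values) - 1.0 / n) / (1.0 - 1.0 / n)


# Toy inputs (not race data): 4 equal items, then one item holding everything.
hhi_equal = hhi([1.0, 1.0, 1.0, 1.0])
hhi_single = hhi([1.0, 0.0, 0.0, 0.0])
```

Equal shares give the lower bound $1 / N$ (here 0.25) and a normalized value of 0, while a single dominant item gives 1 for both indices.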

Gini coefficient

The most popular way for quantifying inequality levels, in socio-economic systems, is through the Gini coefficient ($G i C$).^50,51 It reads

$$G i C = \frac{1}{< y >} \frac{\sum_{i = 1}^{N} \sum_{j = 1}^{N} | y_{i} - y_{j} |}{2 N^{2}} \tag{15}$$

where the $i$-th item has a measure $y_{i}$, and $< y >$ is the average value of this quantity over the whole set of $N$ elements. In the present case, $y_{i}$ can be the resulting time ($t_{k}$) of a team ($k$) due to 3 riders, as above, within the $T_{L}$ or $A_{L}$ schemes. The Gini coefficient should be equal to 0 if all teams are equivalent, but should approach 1 if one team is much above the others, i.e., in socio-economic terms, if it were “monopolizing the whole of the available resources”. One should expect $G i C ≃ 0$ if the competition has no winner, in other words if teams have equivalent final “values”.
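Equation (15) translates literally into a double loop. The Python sketch below, with a hypothetical function name and toy inputs, favors readability over speed, which is harmless for about 20 teams.

```python
from typing import List


def gini(values: List[float]) -> float:
    """Equation (15): mean absolute difference over all ordered pairs,
    divided by twice the mean."""
    n = len(values)
    mean = sum(values) / n
    pair_sum = sum(abs(yi - yj) for yi in values for yj in values)
    return pair_sum / (2.0 * n * n * mean)


# Toy inputs (not race data).
gini_equal = gini([3.0, 3.0, 3.0, 3.0])
gini_single = gini([0.0, 0.0, 0.0, 1.0])
```

With this normalization, the largest attainable value for $N$ items is $1 - 1 / N$ (0.75 in the second toy case), so the value 1 is only approached for large $N$.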

Theil index

For completeness, one can define the “final” Theil index. One has

$$T h_{L} = \frac{1}{M_{L}} \sum_{k = 1}^{M_{L}} \frac{x_{k}}{⟨ x_{k} ⟩} \ln \frac{x_{k}}{⟨ x_{k} ⟩} \tag{16}$$

summing over the different (finishing) teams $k$ in the race, and where $⟨ x_{k} ⟩$ is the mean value of the variable, which here can be any $t_{k}$. This transformation induces negative and positive values of the (log-transformed) data, depending on whether the ratio $x_{k} / ⟨ x_{k} ⟩$ is smaller or larger than 1. Whence, calling $T h_{k}$ the contribution of team $k$ to the sum in equation (16), $T h_{L} \equiv \sum_{k} T h_{k}$ can be very small.
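A minimal Python sketch of equation (16) is given below; the function name and the toy inputs are hypothetical. It also shows how terms of opposite signs partly cancel in the sum.

```python
import math
from typing import List


def theil_index(values: List[float]) -> float:
    """Equation (16): mean of r * ln(r), with r = value / arithmetic mean.

    Values must be strictly positive, as team times are.
    """
    if not values or any(v <= 0.0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    mean = sum(values) / len(values)
    ratios = [v / mean for v in values]
    return sum(r * math.log(r) for r in ratios) / len(values)


# Toy inputs (not race data): ratios 0.5 and 1.5 around the mean 2.
theil_toy = theil_index([1.0, 3.0])
theil_equal = theil_index([2.0, 2.0, 2.0])
```

In the toy case the term for the ratio 0.5 is negative and the term for 1.5 is positive; their sum remains non-negative and vanishes only when all values are equal.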

Pietra-Hoover index

It seems of interest, for emphasizing the structure, like the maximum position and the corresponding percentage of the relevant population, to display the data as the difference between the Lorenz curve ($L o C$) and the line of perfect equality.⁵²,⁵³ One has

$$δ h_{k} = \frac{1}{N} [\sum_{j = 1}^{k} j y_{j} - k] \tag{17}$$

with $k \leq N$. In fact, this is the Pietra-Hoover (inequality) index,^30,54

$$P H I = \frac{\sum_{i}^{N} | g_{i} - < g_{i} > |}{2 \sum_{i}^{N} g_{i}} . \tag{18}$$

It indicates how the variable values should be (re)distributed in order for them to create a perfect equality in times. High values of the index obviously represent a high inequality level since a greater redistribution of values is required in order to achieve equality; vice-versa, lower values of the index represent a lower inequality level.
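The redistribution reading of the Pietra-Hoover index can be illustrated by a short Python sketch. It assumes the usual form of the index, i.e., half the sum of the absolute deviations from the mean divided by the total; the function name and the toy inputs are hypothetical.

```python
from typing import List


def pietra_hoover_index(values: List[float]) -> float:
    """Half the sum of absolute deviations from the mean, over the total.

    This is the fraction of the total that would have to be moved
    in order to make all values equal.
    """
    total = sum(values)
    mean = total / len(values)
    return sum(abs(v - mean) for v in values) / (2.0 * total)


# Toy inputs (not race data): moving 1 unit out of a total of 4 equalizes [1, 3].
phi_toy = pietra_hoover_index([1.0, 3.0])
phi_equal = pietra_hoover_index([2.0, 2.0, 2.0])
```

For the toy pair the index is 0.25: one quarter of the total must be transferred from the larger to the smaller value to reach equality.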

Rosenbluth coefficient

The Rosenbluth Coefficient is defined as

$$R o C = \frac{1}{2 \sum i s_{i} - 1} \tag{19}$$

where the symbol $i$ usually indicates a firm’s rank position on economic markets.³⁷ Thereafter, $s_{i}$ can be taken as the rank of the percentage of a size measure, like some $t_{k}$ ratio.

Practically, the Rosenbluth index assigns more weight to weaker competitors. Such a measure which weights each competitor by its rank rather than by its “share” seems very appealing for our purpose.

The Rosenbluth coefficient is related to the Gini coefficient through

$$R o C = \frac{1}{N (1 - G i C)} . \tag{20}$$
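The relation between equations (19) and (20) can be verified numerically. The Python sketch below assumes the usual convention that rank 1 is given to the largest share and that $s_{i}$ is the share of the item of rank $i$; the function names and the toy inputs are hypothetical.

```python
from typing import List


def gini(values: List[float]) -> float:
    """Equation (15): pairwise absolute differences over 2 N^2 times the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(a - b) for a in values for b in values) / (2.0 * n * n * mean)


def rosenbluth(values: List[float]) -> float:
    """Equation (19), with shares ranked in decreasing order (rank 1 = largest)."""
    total = sum(values)
    shares_desc = sorted((v / total for v in values), reverse=True)
    weighted = sum(rank * s for rank, s in enumerate(shares_desc, start=1))
    return 1.0 / (2.0 * weighted - 1.0)


# Toy inputs (not race data): shares 0.5, 0.3, 0.15, 0.05.
toy = [5.0, 3.0, 1.5, 0.5]
roc_direct = rosenbluth(toy)                           # equation (19)
roc_from_gini = 1.0 / (len(toy) * (1.0 - gini(toy)))   # equation (20)
```

Both routes give 0.4 for the toy data; for equal shares the coefficient reduces to $1 / N$.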

Results and analysis

Numerical results should be examined along two perspectives: (i) one takes into account new indices based on imposing the constraint that a team evaluation and ranking depends on the members at the valuation time (here, at the end of the race); (ii) besides global statistical values, i.e., irrespective of the team rank, one distinguishes values taking into account team ranks, as a weight. The global and the rank-dependent values are found in the top and bottom parts of Table 7, respectively. Most of the outputs arise from freely accessing https://www.wessa.net/desc.wasp.

Table 7.

Table displaying a few concentration coefficients resulting from the final ranking of the ( $M_{L}$ ) teams at the end of the $G d I$ , $T d F$ , and $V a E$ 2023 race, for $T_{L}$ , $A_{L}$ , and $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ . $M_{L}$ is the number of finishing teams; $S_{L}$ is the final Entropy; $A t_{L}$ the Atkinson index; $C V_{L}$ the Coefficient of Variation; $H H I$ the Herfindahl index; $H H I *$ the Normalized Herfindahl index; $G i C$ the Gini coefficient; $T h_{L}$ the final Theil Index; $P H I$ the Pietra-Hoover index; $R o C$ the Rosenbluth coefficient. One distinguishes global measures (top of Table) from those depending on the ranking order (bottom of Table).

	$G d I$			$T d F$			$V a E$
	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$	$T_{L}^{(#)}$	$A_{L}^{(#)}$	$Δ_{L}^{(#)}$
$M_{L}$	$23$			$22$			$22$
$θ_{L}$	0.31895	0.31895	0.35652	0.32352	0.32352	0.36126	0.32354	0.32354	0.32652
$S_{L}$	3.13528	3.13527	2.80487	3.09100	3.09099	2.76806	3.09080	3.09077	3.06258
$A t_{L}$	1.05e-04	1.13e-04	0.16414	2.06e-05	2.81e-05	0.15940	1.18e-04	1.38e-04	0.01386
$C V_{L}$	0.02062	0.02133	0.89212	0.00909	0.01062	0.84168	0.02187	0.02357	0.24586
$H H I$	0.04350	0.04350	0.07808	0.04546	0.04546	0.07766	0.04548	0.04548	0.04820
$H H I *$	1.93e-05	2.07e-05	0.03618	3.94e-06	5.38e-06	0.03373	2.28e-05	2.65e-05	0.00288
$G i C$	0.01120	0.01160	0.43649	0.00500	0.00582	0.44368	0.01192	0.01291	0.13211
$T h_{L}$	2.11e-04	2.26e-04	0.33062	4.12e-05	5.63e-05	0.32298	2.38e-04	2.76e-04	0.02846
$P H I$	0.00867	0.00883	0.32024	0.00359	0.00422	0.33864	0.00889	0.00984	0.09997
$R o C$	0.04397	0.04399	0.07716	0.04568	0.04572	0.08171	0.04600	0.04605	0.05237

One can remark that the orders of magnitude for $T_{L}$ and $A_{L}$ do not differ much, but both differ from those for $Δ_{L}$ . The entropy and the leadership gap temperature significantly differ from race to race. This can be traced to the similar order of magnitude of the $p_{k}$ , implying similarities in $θ_{L}$ . This indicates that the overall distribution of rankings does not have much influence on global characteristics; in fact, one should expect that the team hierarchies do not differ much from race to race. Concerning the rank effect as a weight for calculating indicators, one observes the largest effects in the “unweighted” indicators. This suggests providing displays based on ranks.

The first display of interest should be the variation of the new $Δ_{L}$ indicator as a function of the rank for the different races. Figure 1 shows a plot of $Δ_{L}$ for teams ranked in increasing order of final time, according to UCI rules. Note, once and for all, the meaning of the colors: they correspond to the Giro d’Italia ( $G d I$ , green triangles), Tour de France ( $T d F$ , blue circles), and Vuelta a España ( $V a E$ , red triangles). For $T d F$ , the (best OLS) fit leads to a smooth exponential behavior. In contrast, a simple curve cannot fit the $G d I$ and $V a E$ data, in which steps appear, suggesting team clustering, as discussed below.

Figure 1.

Plot of $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ (h:m:s), measuring the difference between $A_{L}^{(#)}$ , the suggestion²⁰ demanding team rankings when only considering riders who finish the race, and $T_{L}^{(#)}$ , the team final time according to the UCI rules; teams are ranked in time increasing order, according to UCI ( $T_{L}$ ) rules; colors correspond to Giro d’Italia $G d I$ (green), Tour de France $T d F$ (blue), Vuelta a España $V a E$ (red), 2023 female races. For $T d F$ , the (best OLS) fit leads to $y$ = 162.49 $e^{(0.1362 x)}$ ; $R^{2}$ = 0.9752.
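An exponential fit of the kind quoted for $T d F$ can be obtained by an ordinary least squares regression on the logarithm of the data. The sketch below is an assumption about the procedure (the text does not say whether the fit was made in log space or by nonlinear least squares), and the data are synthetic, generated from the quoted coefficients rather than taken from the race results; the function name is hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """OLS fit of ln(y) = ln(a) + b*x; returns (a, b, r2), r2 measured in log space."""
    n = len(xs)
    log_y = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, log_y))
    b = sxy / sxx
    log_a = mean_l - b * mean_x
    ss_res = sum((l - (log_a + b * x)) ** 2 for x, l in zip(xs, log_y))
    ss_tot = sum((l - mean_l) ** 2 for l in log_y)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Synthetic, noise-free points for ranks 1..22 built from the quoted fit.
ranks = list(range(1, 23))
gaps = [162.49 * math.exp(0.1362 * r) for r in ranks]
a, b, r2 = fit_exponential(ranks, gaps)
print(a, b, r2)
```

On real, noisy data the recovered coefficients and the $R^{2}$ value depend on whether the residuals are measured on $y$ or on $\ln y$ , so the numbers need not coincide with those of the caption.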

For the best ranked teams ( $r \leq 16$ ), the $Δ_{L}$ values for $G d I$ and $T d F$ are very similar. However, the $V a E$ $Δ_{L}$ values are very different from those in the other two races, pointing to different difficulties and/or to different types of skills of team members, and subsequently strategies, as hinted in the previous Sections.

Other new indicators involve the $p_{k}$ rank distribution. Figure 2 presents the plot of $p_{k} \ln (p_{k})$ , a term appearing in the calculation of $S_{L}$ , for the $Δ_{L}$ distribution, in ranked time increasing order. A marked difference is found between the data for $V a E$ and those for $G d I$ or $T d F$ . The evolution is nevertheless rather similar, presenting 3 inflexion points. The dashed line is a 4-th order polynomial, used as a guide to the eye only, for distinguishing the rank dependence of the $p_{k} \ln (p_{k})$ indicator in the 3 races.

Figure 2.

Plot of the stochastic Shannon entropy of a team, $p_{k} \ln (p_{k})$ , see equation (5), for the $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ (h:m:s) distribution, in ranked time increasing order; colors correspond to Giro d’Italia $G d I$ (green), Tour de France $T d F$ (blue), Vuelta a España $V a E$ (red) 2023 female races. The dashed line is a 4-th order polynomial, used as a guide to the eye only, for better distinguishing the rank evolution of the $p_{k} \ln (p_{k})$ index in the 3 races.
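The per-team terms plotted in Figure 2 can be sketched as follows. The code assumes that $p_{k}$ is the share of team $k$ in the sum of the considered values and that the entropy is $S = - Σ_{k} p_{k} \ln (p_{k})$ ; both are assumptions made for the illustration, since equation (5) is not reproduced here, and the function names are hypothetical.

```python
import math


def probabilities(values):
    """Shares p_k of each team in the total of the considered quantity."""
    total = float(sum(values))
    return [v / total for v in values]


def entropy_terms(values):
    """Per-team terms p_k * ln(p_k); a term is set to 0 when p_k = 0."""
    return [p * math.log(p) if p > 0 else 0.0 for p in probabilities(values)]


def shannon_entropy(values):
    """S = - sum_k p_k ln(p_k)."""
    return -sum(entropy_terms(values))


# 23 teams with identical values: the entropy reaches its upper bound ln(23).
equal_teams = [1.0] * 23
print(shannon_entropy(equal_teams), math.log(23))
```

For 23 equal shares the entropy equals $\ln 23 ≃ 3.1355$ , the upper bound which the $G d I$ values of $S_{L}$ for $T_{L}$ and $A_{L}$ in Table 7 come close to.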

Similarly, one can study the contribution of each team to the Theil index through $T h_{k}$ $\equiv$ $x_{k} / < x_{k} >$ ,

Figure 3 shows the plot of the $T h_{k}$ values for the $Δ_{L}$ distribution, in ranked time increasing order. The dashed line is a 4-th order polynomial, as a guide to the eye only, pointing to the crucial (core) rank ( $r_{c}$ ) at the minimum of the $T h_{k}$ distribution. Indeed, a marked “difference” occurs for teams below $r ≃ 15$ , in the 3 races. A similar behavioral aspect is found for the variation of the $H H I_{k}$ values, not shown in order to save space.

Figure 3.

Plot of the team Theil index contribution, $T h_{k}$ , to the final Theil index, - see equation (16), for the $Δ_{L}$ distribution, in ranked time increasing order; colors correspond to Giro d’Italia $G d I$ (green), Tour de France $T d F$ (blue), Vuelta a España $V a E$ (red) 2023 female races. The dashed line is a 4-th order polynomial, as a guide to the eye only, and for better pointing the crucial (core) rank ( $r_{c}$ ) at the minimum of the $T h_{k}$ indicator.

The most classical indicator of inequalities is the Gini coefficient; it is often presented as resulting from a ratio of surfaces, which is somewhat difficult to estimate at first sight. In order to provide a better visualisation, one displays the evolution of the distance $δ h$ to the equality distribution line as a function of the rank: Figure 4 displays $δ h$ for the distribution of $T_{L}$ and $A_{L}$ times only for the 2023 $G d I$ finishing teams; in Figure 5, one finds the distance $δ h$ for the distributions of $T_{L}$ and $A_{L}$ times for the 2023 $T d F$ and $V a E$ female races, respectively.

Figure 4.

Display of the distance $δ h$ between the Lorenz curve and the line of perfect equality, on a Gini coefficient graph, for the distribution of $T_{L}$ or $A_{L}$ times of the 2023 $G d I$ finishing teams; the maximum of each curve gives the relevant crucial (core) rank; $p$ =1 corresponds to $M_{L}$ = 23 for $G d I$ . Recall that for the $T d F$ and $V a E$ , $M_{L} = 22$ , for which see Figure 5.

Figure 5.

Display of the distance $δ h$ between the Lorenz curve and the line of perfect equality, on a Gini coefficient graph, for the distribution of $T_{L}$ or $A_{L}$ times; the maximum of each curve gives the crucial (core) rank; $p$ =1 corresponds to $M_{L}$ = 22 for $T d F$ and $V a E$ ; symbols correspond to Tour de France $T d F$ (blue; circles) and Vuelta a España $V a E$ (red; triangles) respectively, 2023 female races. Recall that for $G d I$ , $M_{L} = 23$ , for which see Figure 4.

Figure 6 displays the distance $δ h$ between the Lorenz curve and the line of perfect equality, for the distribution of relative times. The maximum of each curve gives the crucial (core) rank, since $p$ =1 corresponds to the highest rank, $M_{L}$ = 23 for $G d I$ and = 22 for $T d F$ and $V a E$ . It is remarkable that the maximum of each curve occurs near the crucial (core) rank $r_{c}$ in all cases.

Figure 6.

Display of the distance $δ h$ between the Lorenz curve and the line of perfect equality, on a Gini coefficient graph, strictly for the $Δ_{L}$ distribution of relative times; the maximum of each curve indicates the crucial (core) rank between the top and bottom teams thereby suggesting different team values; $p$ =1 corresponds to $M_{L}$ = 23 for $G d I$ and = 22 for $T d F$ and $V a E$ ; colors correspond to Giro d’Italia $G d I$ (green), Tour de France $T d F$ (blue), Vuelta a España $V a E$ (red), - 2023 female races.
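The distance $δ h$ and the rank at which it is largest can be computed along the following lines. This is a minimal sketch under stated assumptions: the Lorenz curve is built from the values sorted in increasing order, $δ h$ is taken as the equality line minus the Lorenz curve, and the function names are hypothetical. The maximum of $δ h$ coincides with the Pietra-Hoover index, which the last function computes directly as a cross-check.

```python
def lorenz_distance(values):
    """Return a list of (p, delta_h) with p = k/N and delta_h = p - L(p)."""
    xs = sorted(values)
    n = len(xs)
    total = float(sum(xs))
    cumulative = 0.0
    points = []
    for k, x in enumerate(xs, start=1):
        cumulative += x
        p = k / n
        points.append((p, p - cumulative / total))
    return points


def core_rank(values):
    """Rank (1-based, increasing order) at which delta_h is largest, with that value."""
    points = lorenz_distance(values)
    k = max(range(len(points)), key=lambda i: points[i][1])
    return k + 1, points[k][1]


def pietra_hoover(values):
    """Half the sum of absolute deviations of the shares from 1/N."""
    n = len(values)
    total = float(sum(values))
    return 0.5 * sum(abs(v / total - 1.0 / n) for v in values)


# Toy data (not race data).
toy = [1, 2, 3, 5, 9]
print(core_rank(toy), pietra_hoover(toy))
```

In this toy case the largest distance is reached at the third of five ranks, i.e., at the last value lying below the mean.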

One may propose some interpretation of the existence of a team core by analogy with an anharmonic oscillator. A few teams, the main ones, have definite goals and aims, with ad hoc team composition, but the other teams anticipate or respond to the strategy of the leading teams, which have well defined and expectedly well performing leaders. Besides the differences in rider skills, whence different anticipated levels of performance, one can also imagine that the response of the lesser teams introduces some non-linearity in the description of the overall race dynamics.

In all these Figures, in particular in Figures 2 and 6, it might be noticed that the $V a E$ data points behave somewhat differently from those for $G d I$ and $T d F$ . Such different behaviors can, on one hand, be traced back to quite different time distributions: observe the orders of magnitude of $G i C$ and $H H I *$ , and a fortiori $T h_{L}$ , $A t_{L}$ , and $C V_{L}$ , even taking into account the length difference of the races. On the other hand, this behavior likely reflects the difference in team approaches for the races, i.e., the team compositions and the team levels (see Table 2), leading to differences in time distributions.

Figure 7 is a plot of $Δ_{L}$ vs. the $T_{L}$ team final time, according to the UCI rule. N.B. The $V a E$ - $Δ_{L}$ and $V a E$ - $T_{L}$ data have been rescaled (divided and multiplied, respectively) by a factor 1.25, in order to display the data on the same figure. Such a scaling factor ( $≃ 960 / 740$ ) roughly corresponds to the ratio between the lengths of the relevant races; see Table 1. A very similar figure can be made for illustrating the $Δ_{L}$ and $T_{L}$ relationship, without bringing anything specifically more interesting; it is omitted to save space.

Figure 7.

Scatter plot of $Δ_{L}^{(#)} \equiv A_{L}^{(#)} - T_{L}^{(#)}$ , measuring the difference between $A_{L}^{(#)}$ , the suggestion²⁰ demanding team rankings when only considering riders who finish the race, and $T_{L}^{(#)}$ , the team final time according to the UCI rules, vs. the $T_{L}$ team final time; $Δ_{L}$ and $T_{L}$ are given in hours:minutes (h:m), and days:hours:minutes (d:h:m), respectively, the latter ranked in time increasing order, according to the UCI ( $T_{L}$ ) rule; colors correspond to Giro d’Italia $G d I$ (green), Tour de France $T d F$ (blue), Vuelta a España $V a E$ (red), 2023 female races. The $V a E$ data have been rescaled, see text, to make the figure readable.

In the same line of thought, one could compare the differences in team ranks in $T_{L}$ and $A_{L}$ with respect to each other, besides discussing both vectors with respect to $Δ_{L}$ . This can be done through the Kendall $τ$ coefficient or by measuring their relative Kemeny distance. These considerations are left for Appendix A, in order to refer to the “weighted preferences” notions, which are of somewhat wider interest than the above measures.

Observing the Tables and the Figures, one is rightly tempted to look for team clustering, within the perspective of this study.

In Tables 3 to 5, one can observe clusters, admitting rank swapping:

$G d I$ : the best 2 teams are swapped in $T_{L}$ and $A_{L}$ ; but the internal ranking is much scrambled between the 3-rd and 8-th team; the following 15 teams are equally ranked, except the last 2;

$T d F$ : scrambling of the best 2nd and 5-th teams; much scrambling thereafter, but with mere small swapping, in the center of the ranks, up to the last ranks;

$V a E$ : the first 8 teams swap ranks regularly by pairs; with almost no scrambling thereafter.

In Figures 1 and 7, one can observe clusters of teams for

$G d I$ : a group of 16 below $T_{L}$ $≃$ 3:04:30 and a group of 8 above; Figure 7;

$G d I$ : more precisely, below $r_{c} = 11$ and above the rank for $r_{e} \geq 21$ ; but see also a step at $r_{d} \geq 16$ separating 2 clusters; Figure 1;

$T d F$ : below $r_{c} ≃ 10$ and others below and above $r_{d} = 17$ ; Figure 1;

$T d F$ : a cluster of 16 teams centered on $T_{L} ≃ 3 : 02 : 00$ ; Figure 7;

$T d F$ : a cluster of 4 teams appears above 00:31 from $Δ_{L}$ values; Figure 7;

$V a E$ : 5 clusters seem to appear below $r_{c} \leq 5$ , $r_{d} \leq 11$ , and for $r_{e} \leq 17$ , on Figure 1;

$V a E$ : a cluster of 14 teams, around [ $T_{L}; Δ_{L}$ ] $≃$ [ $2 : 23 : 00; 0 : 32$ ], Figure 7.

These observations are reminiscent of self-organized complex systems, often amounting to 3 clusters (high, medium, low ranks/classes) under collaboration-competition rules as found in many societies.^55,56

Discussion

Before discussing features, let it be recalled that one looks for (new) indicators containing (new) filters, in particular for stressing the contribution input of team members to a team rank, - due to manager selections and strategies. Thus, one introduces the “leadership gap” $Δ_{L}$ , equation (4), and the “race temperature” $θ_{L}$ , equation (10), - beside classical ranking indices. From the numerical values of interest in a set of study cases, one expects to deduce qualitative aspects of wider insights. Indeed, the classically used indicators (Section “Atkinson index” - Rosenbluth Coefficient) provide hypotheses (or assumptions) to managers devising strategies. The new indicators (Section “Leadership gap index” - Stage and Race Temperature index) increase perspectives.

From all Figures and Tables, one can notice that the (new) indicators reflect different contents: the former makes more precise the role of team members through their entire participation in the hierarchical procedure, while the latter emphasizes the strength of the competition leading to the final ranking.

An additional contribution stems from the weight given by the ranking to the indicators, when displaying them as a function of the team ranks (Figures 1 to 6).

In fact, one observes the existence of clusters of teams, more explicitly in Figure 7, as in many examples of socio-economic populations: a high, a medium and a low class of teams. Further investigation might be pursued through a recently proposed cluster stability indicator, the Unit Relevance Index (URI).⁵⁷

Let us recall that (alas) it is very difficult to move from one class to another, except through the introduction of external fields, most likely money as the incentive. Inequalities and concentrations are inevitable, but one can observe how extreme cases are concerned. As examples, one has noticed that the two best teams may interchange their ranks according to the filtering, see Tables 3 to 5. The same holds for the worst teams. In both cases, very generally, the matter is very relevant for team management and race strategies. However, in order to maintain some form of competition, one must not neglect the “middle class” as the needed ballast. Nevertheless, the most relevant features to be considered are those containing extreme values, since they are expected to lead the search toward the main differences in team ranking, and specifically here disentangling characteristics.

Whence one enters into considerations about collusion, or competition-collaboration,⁵⁵ which might change from stage to stage. Further studies should concern whether such collusions can be observed in each stage, in other words how the teams move in the range around the extremum in the displayed curves, particularly in the $δ h$ plots. Clearly, the best teams do not have much interest in bringing upward “middle class” teams, nor do “middle class” teams in bringing upward the “low class” teams. Needless to say, strategies allowing a team to be better off by exerting a lower effort may be hidden when examining classical indicators,^58,59 but could be highlighted through indicators based on differences between reasonably unbiased criteria.

Whence the study of “ranks resulting from strategies and skills” through the classical indices, but further taking into account the rank as weight, could add further value to the reliability of hierarchy findings, and promote attractive competition.

Conclusions

Therefore, before optimizing strategies, one should be convinced of the validity of efficiency criteria. In the present report, one focuses on team ranking when the valuation outcome depends much on the performance of a subset of team members. This is argued and demonstrated through data found in cycling races. It is suggested that new indicators be compared to classical ones. This leads one to observe features, like inequalities, clusters, “amount of competition”, “race temperature”, i.e., measures which provide quantified meanings outlining possible strategic goals arising in many team competitions. It is interesting to point out that one can observe management strategies through the comparison of indicator values in a given race or when comparing races. Thus, even within such case studies, quantitative measures suggest considerations for further empirical modeling.

In conclusion, the indicators based on (i) the concept of difference between the distributions of data points, and (ii) on a probabilistic reasoning take into account the team final competition measure in an information-like approach, - the Shannon entropy. The proposed methodology is practical, simple, and useful: the study emphasizes that the method is based on scientific rationality and logical principles. A desirable characteristic of inequality measures is the existence of a graphical analogy with the indicator. This can enhance interpretability and help communicate results to non-experts. The study reported here above presents such graphical characteristics allowing the valuation of team performance, - Figures 3 to 6. The new notion of “crucial core rank”, emphasising the main teams is well illustrated, and original.

In summary, the analyses of the proposed indicators point to a few practical features. For example, Tables 1 and 6 allow one to envisage differences in race difficulty. The displayed values emphasize that the time distributions of $T_{L}$ and $A_{L}$ are similar for $G d I$ and $T d F$ , but differ from those of $V a E$ ; moreover, each skewness is positive, indicating wide distributions for the slowest teams. Each kurtosis value indicates whether balanced competition occurs in races. The same deduction holds when observing the $σ$ , much smaller in $T d F$ , indicating a fiercer competition between the top teams. The values of $Δ_{L}^{(#)}$ , in Tables 3 to 5, when large, indicate that there is team emphasis on a distributive role according to rider skills. The $θ_{L}$ as a relative temperature index appears to be a measure of the distribution of the riders’ kinetic energy at the end of a race (or stage). An interesting output for race organizers and team managers arises from the ranking plots of indicators; those allow one to observe the crucial race core made of the most competitive teams.

In brief, the numerical values point to differences in team strategy and goals in a given race based on rider skill levels. Thus, it seems admissible that the indicators are new useful measures, but surely need to be further examined and developed.

Notice that the definitions of $T_{L}$ , $A_{L}$ , $Δ_{L}$ , and $θ_{L}$ can be used not only for measures after the final stage, but also for every stage; see Appendix B. The same holds true for most of the coefficients defined and calculated above, but not in the first stage of the race, of course. Extensions, e.g., to the team members’ finishing place rather than their time, can be easily done. (This might be valuable when discussing colorful jerseys in such races.) The selection of team members including the leader(s) can be based on previous statistics relying on the indicators. This enhances the answer to the question about designing and further optimizing strategies in a competitive environment.

Finally, it might be interesting to further discuss the Stage and Race Temperature indicators, e.g., through illustrative examples in order to appreciate the practical and theoretical meanings of the “intensity” measures. It can be suggested, as further research, that one constructs hypothetical races with only a very small number of teams and chooses sets of $p_{k}^{(s)}$ values to see the behaviour of the stage and race temperature indicators. However, this would demand a different focus, though surely essential to practical coaching.

The suggestion might clearly demand longer sets of investigations, in order to grasp a meaningful discussion on intensity sizes; this obviously demands several simulations covering several cases.

Nevertheless, races with only a few competing teams are not common. World Rally Championship (WRC) competitions might be of interest, - but the teams ranking rules are very different from those in cyclist races.

Thus, last but not least, even though the paper contains an original contribution to a special type of sporting activity, the lessons learned can be of more general use for many other activities, i.e., as long as the outcomes depend not only on the team effort but also on the performance of a subset of members of the team. Open questions remain on the collaboration-competition aspect of such races.

Footnotes

Acknowledgements

Thanks to reviewers and editor for their patience and comments. Thanks to Prof. J. Miśkiewicz for much help on coding.

Ethical considerations

Not applicable.

Data availability

Data is freely available, see text.

Declaration of conflicting interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Neither relevant financial nor non-financial competing interest has to be mentioned.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: Work was partially supported by the project “A better understanding of socio-economic systems using Quantitative Methods from Physics”, funded by the European Union—Next generation EU and the Romanian Government under the National Recovery and Resilience Plan for Romania, contract no.760034/23.05.2023, code PNRR-C9-I8-CF 255/29.11.2022, through the Romanian Ministry of Research, Innovation and Digitalization, within Component 9, “Investment I8”. Moreover, P.K. acknowledges the support of ‘Digital Finance - Reaching New Frontiers’ (Horizon Marie Sklodowska-Curie Actions Industrial Doctoral Network), Ref. Number 101119635.

ORCID iD

Marcel Ausloos

Appendix A

Considerations on the Kendall $τ$ coefficient and the weighted Kemeny distance.

How teams are ranked might have a substantial effect: in sport, the prize money is higher for the first teams than for the others. Sometimes the last teams face relegation and may lose sponsors. Thus, a swap in positions, due to different ruling, may be crucial.

The difference in hierarchies, including the scattering of the results, derived from ranking rules, can be classically measured through the Kendall $τ$ rank-rank correlation coefficient, or equivalently through the Kemeny distance in terms of the number ( $N$ ) of competing teams, which reads $K = N (N - 1) (1 - τ) / 4$ , when there is no ex aequo.^60,61 However, Can⁶² and Csató¹¹ point out that the Kendall $τ$ coefficient does not take into account the precise position of dissimilarities when comparing two linear ranking sets. In particular, there is no discrimination about the (teams) relative position in each list.

In order to weight the position of discordant pairs, Csató has proposed a hyperbolic function: $w_{C} = 1 / r$ , $r \in [1, r_{M} - 1]$ , based on the lowest rank $r$ of an item of a discordant pair.¹¹ A smoother weight distribution, $w_{A} = \sqrt{1 / r}$ , has also been proposed.²⁰ Obviously, the classical Kendall coefficient $τ$ corresponds to choosing a $w_{K} = 1$ for permutations forcing one of the vectors to become identical to the other. The procedure can be repeated, appropriately weighting the various swaps, whence obtaining a “weighted Kemeny distance” between pairs of ranks; they are called $K_{C}$ , $K_{A}$ , and $K_{K}$ respectively.

For completeness, one can observe the number of concordant $C$ and discordant pairs $D$ , obtain the “score” $S \equiv C - D$ , and thereafter the Kendall $τ$ coefficient from $S / (C + D)$ , when there is no ex aequo in the considered vectors. In the present cases, $C + D \equiv M_{L} (M_{L} - 1) / 2$ . The results are reported in Table A1.
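The counting of concordant and discordant pairs, the Kendall $τ$ coefficient and the (unweighted) Kemeny distance $K = N (N - 1) (1 - τ) / 4$ can be sketched as follows, assuming two rank vectors without ex aequo, listed in the same team order. The function names and the toy ranks are hypothetical and are not data from Table A1.

```python
from itertools import combinations


def kendall_counts(rank_a, rank_b):
    """Numbers (C, D) of concordant and discordant pairs of teams."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return concordant, discordant


def kendall_tau(rank_a, rank_b):
    """tau = S / (C + D), with S = C - D (no ties assumed)."""
    c, d = kendall_counts(rank_a, rank_b)
    return (c - d) / (c + d)


def kemeny_distance(rank_a, rank_b):
    """K = N (N - 1) (1 - tau) / 4, which equals the number of discordant pairs."""
    n = len(rank_a)
    return n * (n - 1) * (1.0 - kendall_tau(rank_a, rank_b)) / 4.0


# Toy example: five teams, two adjacent swaps between the two rankings.
ranks_first = [1, 2, 3, 4, 5]
ranks_second = [2, 1, 3, 5, 4]
print(kendall_counts(ranks_first, ranks_second),
      kendall_tau(ranks_first, ranks_second),
      kemeny_distance(ranks_first, ranks_second))
```

Without ties, $K$ is simply the number $D$ of discordant pairs, which provides a quick check of any implementation.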

Notice that practically, in order to compare the ranks of pairs of teams, it is first useful to organize the teams in alphabetical order, giving them the appropriate rank for a given indicator. In all studied cases in the main text, the ranking chosen is that corresponding to $T_{L}$ , as in Tables 3 - 5.

Notice that the indicators distances are necessarily ordered: $K_{K} \geq K_{A} \geq K_{C}$ , and $A_{L}^{(#)}$ is closer to $Δ_{L}^{(#)}$ than $T_{L}^{(#)}$ .
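The ordering $K_{K} \geq K_{A} \geq K_{C}$ follows from $1 \geq \sqrt{1 / r} \geq 1 / r$ for $r \geq 1$ . A minimal sketch is given below under a simplifying assumption: each discordant pair is weighted by the smaller of the two ranks of its items in the reference ranking (here the first vector). This is only one possible reading of the weighting and not necessarily the swap-based procedure of the cited works; the function names and toy ranks are hypothetical.

```python
import math
from itertools import combinations


def weight_kendall(r):
    """w_K = 1: classical, unweighted case."""
    return 1.0


def weight_sqrt(r):
    """w_A = sqrt(1 / r)."""
    return math.sqrt(1.0 / r)


def weight_hyperbolic(r):
    """w_C = 1 / r."""
    return 1.0 / r


def weighted_kemeny(ref_rank, other_rank, weight):
    """Sum of weights over discordant pairs; r is the smaller reference rank (assumption)."""
    total = 0.0
    for i, j in combinations(range(len(ref_rank)), 2):
        if (ref_rank[i] - ref_rank[j]) * (other_rank[i] - other_rank[j]) < 0:
            total += weight(min(ref_rank[i], ref_rank[j]))
    return total


# Toy example: swaps at the top (ranks 1, 2) and near the bottom (ranks 4, 5).
reference = [1, 2, 3, 4, 5]
other = [2, 1, 3, 5, 4]
for w in (weight_kendall, weight_sqrt, weight_hyperbolic):
    print(w.__name__, weighted_kemeny(reference, other, w))
```

In this reading, a swap among the top teams counts fully in all three distances, while a swap among lower-ranked teams is progressively discounted by $w_{A}$ and $w_{C}$ .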

Appendix B

Considerations for extensions to daily stages.

One can sketch how to specify the main text considerations toward a daily ranking mechanism, within a multi-stage race.

Recall that $t_{i, s}^{(#)}$ is the finishing time of one of the 3 fastest riders ( $i = 1, 2, 3$ ) of team $(#)$ for stage $s$ . The team final time $T_{s}^{(#)}$ is

T_{s}^{(#)} = Σ_{i = 1}^{3} t_{i, s}^{(#)} .

(21)

For the first stage, $s = 1$ , of course, and $T_{1}^{(#)} \equiv A_{1}^{(#)}$ . After the second stage, one claims according to UCI rules that the cumulative team time (written, by a slight abuse of notation, with the same symbol on the left-hand side) is

T_{2}^{(#)} \equiv T_{1}^{(#)} + T_{2}^{(#)} .

(22)

After the 3rd stage,

T_{3}^{(#)} \equiv T_{1}^{(#)} + T_{2}^{(#)} + T_{3}^{(#)}

(23)

etc.

Next, consider the adjusted team time after two stages $s = 1$ and $s = 2$ , i.e., $A_{2}^{(#)}$ . This team adjusted time is not equal to $Σ_{i = 1}^{3} t_{i, 1}^{(#)} + Σ_{i = 1}^{3} t_{i, 2}^{(#)}$ , but is equal to the cumulative time of the best 3 ( $j$ ) riders of the team after the $s$ -th stage, i.e., $= Σ_{j = 1}^{3} t_{j, 2}^{(#)}$ . Etc.

This seems to represent better the team time evolution and leads to avoid Cipollini-like effects.²¹
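The difference between the two accumulation rules can be illustrated with a small sketch. It assumes that each rider has one time per stage, that a missing time (`None`) marks a rider who did not finish that stage, and that at least three riders remain in the race; the rider labels, the times and the function names are hypothetical.

```python
def uci_team_time(times, s):
    """T after stage s: sum over stages of the 3 best rider times of each stage."""
    total = 0
    for stage in range(s):
        finished = sorted(t[stage] for t in times.values() if t[stage] is not None)
        total += sum(finished[:3])
    return total


def adjusted_team_time(times, s):
    """A after stage s: sum of the 3 best cumulative times among riders still in the race."""
    cumulative = sorted(
        sum(t[:s]) for t in times.values()
        if all(x is not None for x in t[:s])
    )
    return sum(cumulative[:3])


# Hypothetical team: rider "e" is fastest in stage 1, then does not finish stage 2.
team = {
    "a": [100, 130],
    "b": [101, 105],
    "c": [102, 104],
    "d": [110, 103],
    "e": [99, None],
}
for s in (1, 2):
    t_s = uci_team_time(team, s)
    a_s = adjusted_team_time(team, s)
    print(s, t_s, a_s, a_s - t_s)
```

After the first stage both rules agree, as stated above; after the second stage the adjusted time is larger, since it no longer benefits from the stage-1 time of the rider who left the race nor from mixing different riders across stages.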

References

Kulakowski

. Understanding the analytic hierarchy process. Boca Raton, FL: CRC Press, 2020.

Fritz

Moretti

Staudacher

. Social ranking problems at the interplay between social choice theory and coalitional games. Mathematics 2023; 11: 4905.

Ausloos

Rotundo

Cerqueti

. A theory of best choice selection through objective arguments grounded in linear response theory concepts. Physics 2024; 6: 468–482.

Kossi

. Tournois séquentiels et compétition pour la prime d’excellence scientifique. Rev Fr Econ 2017; 32: 57–94. [in French]. Available from: https://doi.org/10.3917/rfe.174.0057.

Sanz-Menéndez

Cruz-Castro

. University academics’ preferences for hiring and promotion systems. Eur J High Educ 2019; 9: 153–171.

Jose

VRR

Nau

Winkler

. Scoring rules, generalized entropy, and utility maximization. Oper Res 2008; 56: 1146–1157.

Csató

. Some impossibilities of ranking in generalized tournaments. Int Game Theory Rev 2019; 21: 1940002.

Chebotarev PY and Shamis E. Characterizations of scoring methodsfor preference aggregation. Ann Oper Res 1998; 80: 299–332.

Fainmesser

Fershtman

Gandal

. A consistent weighted ranking scheme with an application to NCAA college football rankings. J Sport Econ 2009; 10: 582–600.

10.

González-Díaz

Hendrickx

Lohmann

. Paired comparisons analysis: an axiomatic approach to ranking methods. Soc Choice Welf 2014; 42: 139–169.

11.

Csató

. On the ranking of a Swiss system chess team tournament. Ann Oper Res 2017; 254: 17–36.

12.

Vaziri

Dabadghao

Yih

, et al. Properties of sports ranking methods. J Oper Res Soc 2018; 69: 776–787.

13.

Csató

. A comparative study of scoring systems by simulations. J Sport Econ 2023; 24: 526–545.

14.

Leiva-Bertrán

. Ranking in incomplete tournaments: The generalized win percentage method, efficiency, and NCAA football. J Sport Econ 2025; 26: 3–34.

15.

Jianu

Isaic-Maniu

Brandas

, et al. Testing Benford and universal laws on gambling and betting data in Romania. Ann Oper Res 2023; 342: 1765–1779.

16.

Csató

. Quantifying incentive (in)compatibility: A case study from sports. Eur J Oper Res 2022; 302: 717–726.

17.

Ausloos

. Hint of a universal law for the financial gains of competitive sport teams. The case of Tour de France cycle race. Front Phys 2017; 5: 59.

18.

Ausloos

. Rank–size law, financial inequality indices and gain concentrations by cyclist teams. The case of a multiple stage bicycle race, like Tour de France. Physica A 2020; 540: 123161.

19.

Ausloos

. Shannon entropy and Herfindahl-Hirschman index as team’s performance and competitive balance indicators in cyclist multi-stage races. Entropy 2023; 25: 955.

20.

Ausloos

. Hierarchy selection: New team ranking indicators for cyclist multi-stage races. Eur J Oper Res 2024; 314: 807–816.

21.

Ausloos

. Should one (be allowed to) replace the Cipollini’s? Ann Oper Res 2025; in press. doi: 10.1007/s10479-024-06206-y.

22.

Mostaert

Laureys

Vansteenkiste

, et al. Discriminating performance profiles of cycling disciplines. Int J Sports Sci Coach 2021; 16: 110–122.

23.

Pinedo-Jauregi

Romarate

. Assessing ambient temperature measurements in road cycling races. Int J Sports Sci Coach 2025; 20: 742–747.

24.

Smith

. Assessment influence on peak power output and road cycling performance prediction. Int J Sports Sci Coach 2008; 3: 211–226.

25.

Stessens

Gielen

Meeusen

, et al. Physical performance estimation in practice: A systematic review of advancements in performance prediction and modeling in cycling. Int J Sports Sci Coach 2024; 19: 2222–2243.

26.

O’Grady

Worn

Owens

, et al. Race craft: A qualitative exploration of the development, implementation and reflection of tactical decision making in road cycling. Int J Sports Sci Coach 2023; 18: 2160–2170.

27.

Eliazar

Sokolov

. Measuring statistical evenness: A panoramic overview. Physica A 2012; 391: 1323–1353.

28.

Subramanian

Ramanathan

. A review of applications of analytic hierarchy process in operations management. Int J Prod Econ 2012; 138: 215–241.

29.

Dimitrova

Ausloos

. Primacy analysis in the system of Bulgarian cities. Open Phys 2015; 13: 218–225.

30.

Josa

Aguado

. Measuring unidimensional inequality: Practical framework for the choice of an appropriate measure. Soc Indic Res 2020; 149: 541–570.

31.

Bednay

Fleiner

Tasnádi

. An indifference result for social choice rules in large societies. Eur J Oper Res 2025; 321: 208–213.

32.

Atkinson

. On the measurement of inequality. J Econ Theory 1970; 2: 244–263.

33.

Hirschman

. The paternity of an index. Am Econ Rev 1964; 54: 761–762.

34.

Gini

. Measurement of inequality of incomes. Econ J 1921; 31: 124–125.

35.

Theil

. The information approach to demand analysis. In: Advanced Studies in Theoretical and Applied Econometrics. Dordrecht: Springer Netherlands; 1992. pp.627–651.

36.

Mages

Rohner

. Quantifying redundancies and synergies with measures of inequality. PLoS ONE 2024; 19: e0313281.

37.

Hall

Tideman

. Measures of concentration. J Am Stat Assoc 1967; 62: 162–168.

38.

Shannon

. A mathematical theory of communication. Bell Syst Techn J 1948; 27: 379–423.

39.

Tsallis

. Beyond Boltzmann–Gibbs–Shannon in physics and elsewhere. Entropy 2019; 21: 696.

40.

Silva

Duarte

Esteves

, et al. Application of entropy measures to analysis of performance in team sports. Int J Perform Anal Sport 2016; 16: 753–768.

41.

Kozuki

Fuchikami

. Dynamical model of financial markets: fluctuating ‘temperature’ causes intermittent behavior of price changes. Physica A 2003; 329: 222–230.

42.

Xiao

Polukarov

, et al. Thermodynamic analysis of financial markets: Measuring order book dynamics with temperature and entropy. Entropy 2023; 26: 24.

43.

Borooah

Mangan

. Measuring competitive balance in sports using generalized entropy with an application to English premier league football. Appl Econ 2012; 44: 1093–1102.

44.

Brezina

Pekár

Čičková

, et al. Herfindahl–Hirschman index level of concentration values modification and analysis of their change. Cent Eur J Oper Res 2016; 24: 49–72.

45. Yiğit and Tür. Relationship between diversification strategy applications and organizational performance according to Herfindahl index criteria. Procedia Soc Behav Sci 2012; 58: 118–127.

46. Oladimeji and Udosen. The effect of diversification strategy on organizational performance. J Compet 2019; 11: 120–131.

47. Handoyo, Suharman, Ghani, et al. A business strategy, operational efficiency, ownership structure, and manufacturing performance: The moderating role of market uncertainty and competition intensity and its implication on open innovation. J Open Innov 2023; 9: 100039.

48. Owen, Ryan and Weatherston. Measuring competitive balance in professional team sports using the Herfindahl-Hirschman index. Rev Ind Organ 2007; 31: 289–302.

49. Owen. Simulation evidence on Herfindahl-Hirschman measures of competitive balance in professional sports leagues. J Oper Res Soc 2022; 73: 285–300.

50. Marmani, Ficcadenti, Kaur, et al. Entropic analysis of votes expressed in Italian elections between 1948 and 2018. Entropy 2020; 22: 523.

51. Cerqueti and Ausloos. Statistical assessment of regional wealth inequalities: the Italian case. Qual Quant 2015; 49: 2307–2323.

52. Lorenz. Methods of measuring the concentration of wealth. Publ Am Stat Assoc 1905; 9: 209–219.

53. Ausloos and Cerqueti. Studies on regional wealth inequalities: The case of Italy. Acta Phys Pol A 2016; 129: 959–964.

54. Frosini. Approximation and decomposition of Gini, Pietra–Ricci and Theil inequality measures. Empir Econ 2012; 43: 175–197.

55. Caram, Caiafa, Proto, et al. Dynamic peer-to-peer competition. Physica A 2010; 389: 2628–2636.

56. Ausloos, Cloots, Gadomski, et al. Ranking structures and rank–rank correlations of countries: The FIFA and UEFA cases. Int J Mod Phys C 2014; 25: 1450060.

57. Cerqueti and Mattera. Measuring unit relevance and stability in hierarchical spatio-temporal clustering. Spat Stat 2025; 66: 100880.

58. Csató. Was Zidane honest or well-informed? How UEFA barely avoided a serious scandal. Econ Bull 2018; 38: 152–158.

59. Csató. How to avoid uncompetitive games? The importance of tie-breaking rules. Eur J Oper Res 2023; 307: 1260–1269.

60. Kemeny and Snell. Mathematical models in the social sciences. Cambridge, MA: MIT Press, 1962.

61. Heiser and D’Ambrosio. Clustering and prediction of rankings within a Kemeny distance framework. In: Lausen B, Van den Poel D and Ultsch A (eds) Algorithms from and for Nature and Life. Cham: Springer International Publishing, 2013, pp.19–31.

62. Can. Weighted distances between preferences. J Math Econ 2014; 51: 109–111.

New inequality indicators for team ranking in multi-stage female professional cyclist races
