
Income distributions are commonly unimodal and skew with a heavy right tail. Different skew models, such as the lognormal and the Pareto, have been proposed as suitable descriptions of income distribution and applied in specific empirical situations. More wide-ranging tools have been introduced as measures for general comparisons. In this study, we review the income analysis methods and apply them to specific Lorenz models.

Income distributions are commonly unimodal and skew with a heavy right tail. Therefore, different skew models, such as the lognormal and the Pareto, have been proposed as suitable descriptions of income distribution, but they are usually applied in specific empirical situations [

Fellman [

The Lorenz curve. The most commonly used theory is based on the Lorenz curve. Lorenz [

“Plot along one axis accumulated per cents of the population from poorest to richest, and along the other, wealth held by these per cents of the population”.

Consequently, $L(p)$ is the accumulated amount of income (wealth) defined as a function of the proportion $p$ of the population. It satisfies the condition $L(p) \le p$ because the income share of the poor is less than their proportion of the population. The increase $\Delta L$ caused by a fixed increase $\Delta p$ of the population is a growing function of $p$, and accordingly, the derivative $L'(p)$ is an increasing function of $p$ and $L(p)$ is a convex function [

Consider the income distribution $F_X(x)$ of a non-negative variable $X$. Let $f_X(x)$ be the corresponding frequency distribution and let the mean of $X$ be $\mu_X = \int_0^\infty x f_X(x)\,dx$. Then, the Lorenz curve $L_X(p)$ is

$$L_X(p) = \frac{1}{\mu_X}\int_0^{x_p} x f_X(x)\,dx, \tag{1}$$

where $x_p$ is the $p$ quantile, that is, $F_X(x_p) = p$. The Lorenz curve is not defined if the mean is zero or infinite. A Lorenz curve always starts at $(0, 0)$ and ends at $(1, 1)$. The higher the Lorenz curve, the lower the inequality of the income distribution. The diagonal $L(p) = p$ is commonly interpreted as the Lorenz curve for complete equality between income receivers, but according to [

On the other hand, increasing inequality lowers the Lorenz curve, and theoretically, it can converge towards the lower right corner of the square. A sketch of a Lorenz curve is given in

Variable transformations. Consider a transformed variable $Y = g(X)$, where $g(\cdot)$ is positive and monotone increasing. Then, for $y = g(x)$, the distribution $F_Y(y)$ is

$$F_Y(y) = P(Y \le y) = P(g(X) \le g(x)) = P(X \le x) = F_X(x). \tag{2}$$

For the transformed variable $Y$, the $p$ quantile $y_p$ satisfies $F_Y(y_p) = p$, that is, $y_p = g(x_p)$.

Now

$$f_Y(y) = \frac{dF_Y(y)}{dy} = \frac{dF_X(x)}{dx}\frac{dx}{dy} = f_X(x)\frac{dx}{dy}. \tag{3}$$

Hence,

$$\mu_Y = \int_0^\infty y f_Y(y)\,dy = \int_0^\infty g(x) f_X(x)\frac{dx}{dy}\,dy = \int_0^\infty g(x) f_X(x)\,dx \tag{4}$$

and

$$L_Y(p) = \frac{1}{\mu_Y}\int_0^{x_p} g(x) f_X(x)\,dx. \tag{5}$$

If the transformation is linear, $g(x) = \theta x$ with $\theta > 0$, then $Y = \theta X$, $\mu_Y = \theta\mu_X$,

$$L_Y(p) = \frac{1}{\theta\mu_X}\int_0^{x_p} \theta x f_X(x)\,dx = L_X(p), \tag{6}$$

and consequently, the Lorenz curve is invariant under linear transformations of this form. A simple example of this property is that the Lorenz curve of an income distribution is independent of the currency used. A less obvious result is that proportional income increases and flat tax policies are linear transformations and do not influence the Lorenz curve. Consequently, the Lorenz curve satisfies the general rules [

To every distribution $F(x)$ with finite mean corresponds a unique Lorenz curve, $L_X(p)$. The converse does not hold because every curve $L_X(p)$ is a common Lorenz curve for a whole class of distributions $F(\theta x)$, where $\theta$ is an arbitrary positive constant.
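The scale invariance can be illustrated with a minimal Python sketch that computes an empirical Lorenz curve for a small sample and for the same sample multiplied by a constant. The function name `lorenz_curve`, the income values and the factor `theta` are hypothetical choices for this example, and the discrete cumulative-share estimator is an assumption standing in for the integral definition (1).

```python
def lorenz_curve(incomes):
    """Return the points (p_i, L_i) of the empirical Lorenz curve, starting at (0, 0)."""
    xs = sorted(incomes)                  # order receivers from poorest to richest
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("the Lorenz curve needs a positive, finite mean")
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x                      # accumulated income of the i poorest
        points.append((i / n, running / total))
    return points


incomes = [12.0, 30.0, 18.0, 75.0, 45.0]  # hypothetical incomes
theta = 9.5                               # hypothetical currency conversion factor

original = lorenz_curve(incomes)
rescaled = lorenz_curve([theta * x for x in incomes])

for (p, l_x), (_, l_y) in zip(original, rescaled):
    print(f"p = {p:.1f}   L_X(p) = {l_x:.4f}   L_Y(p) = {l_y:.4f}")
```

Both columns should agree, since the factor `theta` cancels between the accumulated income and the total income.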

Consider two variables $X$ and $Y$, their distributions $F_X(x)$ and $F_Y(y)$, and their Lorenz curves $L_X(p)$ and $L_Y(p)$. If $L_X(p) \ge L_Y(p)$ for all $p$, then measured by the Lorenz curves, the distribution $F_X(x)$ has lower inequality than the distribution $F_Y(y)$ and $F_X(x)$ is said to Lorenz dominate $F_Y(y)$. We denote this relation $F_X(x) \succ_L F_Y(y)$ [

Income inequalities can be of different types, and the corresponding Lorenz curves may intersect; for such curves no Lorenz ordering can be identified (cf.

very poor among the poor and rich who are not so rich. On the other hand, Lorenz curve $L_2(p)$ corresponds to a population where the poor are relatively not so poor and the rich are relatively rich. For intersecting Lorenz curves, alternative inequality measures have to be defined.

Properties of Lorenz curves. The Lorenz curve has the following general properties:

i) $L(p)$ is monotone increasing,

ii) $L(p) \le p$,

iii) $L(p)$ is convex,

iv) $L(0) = 0$ and $L(1) = 1$.

If the Lorenz curve is differentiable, the derivative has the following properties. Let $L_X(p) = \frac{1}{\mu_X}\int_0^{x_p} x f_X(x)\,dx$, where $F_X(x_p) = p$ and $f_X(x)$ is the density function. When we differentiate the equation $F_X(x_p) = p$ with respect to $p$, we obtain

$$\frac{dF_X(x_p)}{dp} = \frac{dF_X(x_p)}{dx_p}\frac{dx_p}{dp} = 1,$$

that is,

$$f_X(x_p)\frac{dx_p}{dp} = 1 \tag{7}$$

and

$$\frac{dx_p}{dp} = \frac{1}{f_X(x_p)}. \tag{8}$$

The differentiation of $L_X(p) = \frac{1}{\mu_X}\int_0^{x_p} x f_X(x)\,dx$ yields

$$\frac{dL_X(p)}{dp} = \frac{1}{\mu_X}\frac{d}{dx_p}\left(\int_0^{x_p} x f_X(x)\,dx\right)\frac{dx_p}{dp} = \frac{1}{\mu_X}x_p f_X(x_p)\frac{dx_p}{dp} = \frac{x_p}{\mu_X}, \tag{9}$$

and consequently,

$$\frac{dL_X(p)}{dp} = \frac{x_p}{\mu_X}. \tag{10}$$

If the Lorenz curve is twice differentiable, then the second derivative is

$$\frac{d^2L_X(p)}{dp^2} = \frac{1}{\mu_X}\frac{dx_p}{dp} = \frac{1}{\mu_X}\frac{1}{f_X(x_p)}.$$

Hence,

$$\frac{d^2L_X(p)}{dp^2} = \frac{1}{\mu_X f_X(x_p)}. \tag{11}$$
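The two derivative formulas can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the Pareto distribution $F(x) = 1 - x^{-\alpha}$ that is introduced later in the text, a hypothetical value `alpha = 2.5`, and central finite differences with hand-picked step sizes `h`.

```python
alpha = 2.5                       # hypothetical Pareto shape parameter (alpha > 1)
mu = alpha / (alpha - 1.0)        # mean of F(x) = 1 - x**(-alpha), x >= 1


def lorenz(p):
    """Pareto Lorenz curve L(p) = 1 - (1 - p)**((alpha - 1) / alpha)."""
    return 1.0 - (1.0 - p) ** ((alpha - 1.0) / alpha)


def quantile(p):
    """p quantile x_p = (1 / (1 - p))**(1 / alpha)."""
    return (1.0 - p) ** (-1.0 / alpha)


def density(x):
    """Pareto density f(x) = alpha * x**(-alpha - 1)."""
    return alpha * x ** (-alpha - 1.0)


def first_derivative(p, h=1e-5):
    """Central difference approximation of L'(p)."""
    return (lorenz(p + h) - lorenz(p - h)) / (2.0 * h)


def second_derivative(p, h=1e-4):
    """Central difference approximation of L''(p)."""
    return (lorenz(p + h) - 2.0 * lorenz(p) + lorenz(p - h)) / h ** 2


for p in (0.1, 0.5, 0.9):
    x_p = quantile(p)
    print(f"p = {p}")
    print(f"  L'(p)  ~ {first_derivative(p):.6f}   x_p / mu       = {x_p / mu:.6f}")
    print(f"  L''(p) ~ {second_derivative(p):.6f}   1 / (mu f(x_p)) = {1.0 / (mu * density(x_p)):.6f}")
```

Within the finite-difference error, the left and right columns should coincide, in line with (10) and (11).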

If $\lim_{p \uparrow 1}$ denotes the limit from the left, we can prove the following theorem [

Theorem 1. If $\mu_X$ exists, then $\lim_{p \uparrow 1} L'(p)(1 - p) = 0$.

Proof. Consider the integral $\int_x^\infty t f_X(t)\,dt$. If $\mu_X$ exists, then $\int_0^\infty t f_X(t)\,dt = \mu_X$ and for every $\varepsilon > 0$ there exists an $x'$ such that $\int_x^\infty t f_X(t)\,dt < \varepsilon$ if $x > x'$. Choose $p$ so that $x_p > x'$, then

$$\varepsilon > \int_{x_p}^\infty t f_X(t)\,dt \ge x_p\int_{x_p}^\infty f_X(t)\,dt = x_p(1 - p) \tag{12}$$

and $\lim_{p \uparrow 1} x_p(1 - p) = 0$.

As a consequence of (12),

$$\lim_{p \uparrow 1} L'_X(p)(1 - p) = \lim_{p \uparrow 1}\frac{x_p}{\mu_X}(1 - p) = \frac{1}{\mu_X}\lim_{p \uparrow 1} x_p(1 - p) = 0.$$

Consider a one-parameter class of cumulative distribution functions $F(x, \theta)$, defined on the positive $x$-axis. If we assume that $F(x, \theta) = F(\theta x)$, i.e. it depends only on the product $\theta x$, then the following theorem holds [

Theorem 2. Let $F(x, \theta)$ be a one-parameter class of distributions with the properties

i) $F(x, \theta) = F(\theta x)$,

ii) $F(\theta x)$ is defined on the positive $x$-axis,

iii) $F(\theta x)$ and its derivative are continuous,

iv) $\mu_X = E(X)$ exists.

Let $T = \theta X$, then

$$x_p(\theta) = \frac{t_p}{\theta} \tag{13}$$

and

$$\mu_X(\theta) = \frac{c}{\theta}, \tag{14}$$

where $t_p$ and $c$ are independent of $\theta$.

Proof. Let $\theta$ be an arbitrary positive parameter. Then the quantile $x_p(\theta)$ is defined by the equation $F(\theta x_p) = p$. If we define $t_p$ by the equation $F(t_p) = p$, then $t_p$ does not depend on $\theta$ and $\theta x_p(\theta) = t_p$, and (13) is proved. The formula (14) and the statement that $L(p) = \frac{1}{\mu(\theta)}\int_0^{x_p(\theta)} x\,dF(\theta x)$ is independent of $\theta$ are proved by using the substitution $t = \theta x$ in the integrals $E(X) = \int_0^\infty x\,dF(\theta x)$ and $L(p) = \frac{1}{\mu(\theta)}\int_0^{x_p(\theta)} x\,dF(\theta x)$.
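A small numerical illustration of Theorem 2 follows. It is a sketch under an assumption not made in the text: the class is taken to be the exponential family $F(x, \theta) = 1 - e^{-\theta x}$, which satisfies $F(x, \theta) = F(\theta x)$. The function names and the midpoint-rule integration are choices made for the example only.

```python
import math


def quantile(p, theta):
    """p quantile of F(x, theta) = 1 - exp(-theta * x)."""
    return -math.log(1.0 - p) / theta


def mean(theta):
    """Mean of the exponential distribution with rate theta."""
    return 1.0 / theta


def lorenz(p, theta, n=20_000):
    """L(p) from definition (1), integrating x * f(x) up to x_p with the midpoint rule."""
    upper = quantile(p, theta)
    h = upper / n
    integral = sum(
        x * theta * math.exp(-theta * x)
        for x in ((i + 0.5) * h for i in range(n))
    ) * h
    return integral / mean(theta)


p = 0.5
for theta in (0.5, 2.0, 10.0):
    print(
        f"theta = {theta:5.1f}   theta * x_p = {theta * quantile(p, theta):.6f}"
        f"   theta * mu = {theta * mean(theta):.6f}   L(p) = {lorenz(p, theta):.6f}"
    )
```

The products `theta * x_p` and `theta * mu` play the roles of $t_p$ and $c$ in (13) and (14), and the value of `L(p)` should not change with `theta`.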

Furthermore, we can prove the following [

Theorem 3. Consider a function $L(p)$ defined on the interval $[0, 1]$ with the properties

i) $L(p)$ is monotone increasing and convex to the $p$-axis,

ii) $L(0) = 0$ and $L(1) = 1$,

iii) $L(p)$ is differentiable,

iv) $\lim_{p \uparrow 1} L'(p)(1 - p) = 0$,

then $L(p)$ is a Lorenz curve of a distribution with finite mean.

Proof. If we denote the unknown distribution $F(x)$ and its derivative $f(x)$, then necessarily $L'(p) = \frac{x_p}{\mu}$. The derivative $L'(p)$ is a monotone-increasing function. If its inverse is denoted $M(\cdot)$, we get the necessary relation

$$F(x_p) = p = M\left(\frac{x_p}{\mu}\right).$$

If $\theta = \frac{1}{\mu}$, then $F(x) = M(\theta x)$. Now we shall prove the sufficiency, that is, that $M(\theta x)$ is a distribution whose mean is $\mu = \frac{1}{\theta}$ and whose Lorenz curve is $L(p)$. We denote $M(\theta x) = F(x)$, then $f(x) = F'(x) = \theta M'(\theta x)$. After observing that the property (iv) indicates that $L'(p)$ is integrable from 0 to 1, we introduce the variable transformation

$$y = M(\theta x), \qquad dy = \theta M'(\theta x)\,dx, \qquad x = \frac{1}{\theta}L'(y).$$

We obtain

$$\mu = \lim_{t \to \infty}\int_0^t x\,\theta M'(\theta x)\,dx = \lim_{p \uparrow 1}\int_0^p \frac{1}{\theta}L'(y)\,dy = \frac{1}{\theta}\lim_{p \uparrow 1}\int_0^p L'(y)\,dy = \frac{1}{\theta}.$$

Hence, the inverse function $M(\cdot)$ of the derivative $L'(p)$ is monotone increasing, and $M(\theta x)$ is a distribution whose mean is $\mu$.

Using the same transformation, we obtain that the Lorenz curve $\tilde{L}(p)$ of $F(x) = M(\theta x)$ is

$$\tilde{L}(p) = \theta\int_0^{x_p} x\,\theta M'(\theta x)\,dx = \int_0^p L'(y)\,dy = L(p),$$

and the theorem is proved.

These results have been collected in the following theorem [

Theorem 4. Consider a given function $L(p)$ with the properties

(i) $L(p)$ is monotone increasing and convex to the $p$-axis,

(ii) $L(0) = 0$ and $L(1) = 1$,

(iii) $L(p)$ is differentiable,

(iv) $\lim_{p \uparrow 1} L'(p)(1 - p) = 0$,

then $L(p)$ is the Lorenz curve of a whole class of distribution functions $F(\theta x)$, where $\theta$ is an arbitrary positive constant and the function $F(\cdot)$ is the inverse function to $L'(p)$.

Fellman [

Theorem 5. A class of continuous distributions $F(x, \theta)$ with finite mean has a common Lorenz curve if and only if $F(x, \theta) = F(\theta x)$.

Additional properties of the Lorenz curves. Consider the vertical difference $D$ between the diagonal and the Lorenz curve:

$$D = p - L_X(p),$$

$$\frac{dD}{dp} = 1 - L'_X(p) = 1 - \frac{x_p}{\mu_X},$$

$$\frac{d^2D}{dp^2} = -L''_X(p) = -\frac{1}{\mu_X}\frac{dx_p}{dp} = -\frac{1}{\mu_X f_X(x_p)} < 0.$$

The maximum of $D$ implies $1 - \frac{x_p}{\mu_X} = 0$, that is, $x_p = \mu_X$.

For $x_p = \mu_X$, $L'_X(p) = \frac{\mu_X}{\mu_X} = 1$ and at the point $p_\mu = F_X(\mu_X)$ the tangent is parallel to the line of perfect equality. This is also the point at which the vertical distance between the Lorenz curve and the egalitarian line attains its maximum $D_{\max} = F(\mu_X) - L_X(F(\mu_X))$. This maximum is defined as the Pietra index, in this study denoted $P$, and discussed below [

Kleiber and Kotz have outlined a progressive development of how the income distributions can be characterized by their Lorenz curves [

Income inequality indices. When Lorenz curves intersect, the corresponding distributions cannot be compared by the Lorenz curves. Consequently, the distributions have to be compared by numerical indices mainly based on the Lorenz curves.

Gini index. The most frequently used index is the Gini coefficient, G [

$$G = 1 - 2\int_0^1 L(p)\,dp. \tag{15}$$

This definition yields Gini coefficients satisfying the inequalities $0 < G < 1$. The higher the $G$ value, the lower the Lorenz curve and the stronger the inequality. If $G_X < G_Y$, then the distribution $F_X(x)$, measured by the Gini coefficient, has lower inequality than the distribution $F_Y(y)$ and we say that $F_X(x)$ Gini dominates $F_Y(y)$, and denote this relation $F_X(x) \succ_G F_Y(y)$ [

The coefficient allows direct comparison of the income distributions of two populations, regardless of their sizes. The main limitation of the Gini coefficient is that it is not easily decomposable or additive. Also, it does not respond in the same way to income transfers between people in opposite tails of the income distribution as it does to transfers in the middle of the distribution. The Gini coefficient is popular because it is easy to compute, being a ratio of two areas in the Lorenz curve diagram. As a disadvantage, the Gini index only maps a number to the properties of a diagram, but the diagram itself is not based on any model of a distribution process. The “meaning” of the Gini index can only be understood empirically. Moreover, the Gini index does not capture where in the distribution the inequality occurs. As a result, two very different income distributions with different Lorenz curves can have the same Gini index.
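The last remark can be made concrete with a short sketch. It evaluates (15) by the midpoint rule for two Lorenz curves that appear later in the text, the Pareto curve $L(p) = 1 - (1 - p)^{(\alpha - 1)/\alpha}$ and the power curve $L(p) = p^{\alpha}$, both with the hypothetical choice $\alpha = 2$. The function names and the number of grid points are assumptions of this example.

```python
def gini(lorenz, n=100_000):
    """Approximate G = 1 - 2 * integral_0^1 L(p) dp with the midpoint rule."""
    h = 1.0 / n
    area = sum(lorenz((i + 0.5) * h) for i in range(n)) * h
    return 1.0 - 2.0 * area


def pareto_lorenz(p, alpha=2.0):
    """Pareto Lorenz curve; its Gini coefficient is 1 / (2 * alpha - 1)."""
    return 1.0 - (1.0 - p) ** ((alpha - 1.0) / alpha)


def power_lorenz(p, alpha=2.0):
    """Power-form Lorenz curve; its Gini coefficient is (alpha - 1) / (alpha + 1)."""
    return p ** alpha


print("Gini, Pareto curve:", round(gini(pareto_lorenz), 6))
print("Gini, power curve: ", round(gini(power_lorenz), 6))
print("L(0.5), Pareto curve:", round(pareto_lorenz(0.5), 4))
print("L(0.5), power curve: ", round(power_lorenz(0.5), 4))
```

Both curves give a Gini coefficient of about 1/3, yet the income share of the poorer half differs between them, so the curves intersect and a single index cannot rank them in every respect.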

Using the Gini coefficient presented in the text, one can compare the Gini coefficients for $L_1(p)$ and $L_2(p)$ in

There are other inequality measures defined by the Gini coefficient. Yitzhaki [

$$G(\nu) = 1 - \nu(\nu - 1)\int_0^1 (1 - p)^{\nu - 2}L(p)\,dp, \tag{16}$$

where $\nu > 1$. For $\nu = 2$, (16) reduces to the Gini coefficient (15). Different values of $\nu$ are used in order to identify different inequality properties. For low values of $\nu$, greater weights are associated with the rich, and for high values of $\nu$, greater weights are associated with the poor.
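The effect of the parameter $\nu$ can be explored numerically. The sketch below is an assumption-laden illustration: it writes the generalized index with the weight $\nu(\nu - 1)$, the sign convention under which $\nu = 2$ reproduces the ordinary Gini coefficient (15), and it uses the power-form Lorenz curve $L(p) = p^2$ as a hypothetical test case. The midpoint rule is adequate here for $\nu \ge 2$; for $1 < \nu < 2$ the integrand is singular at $p = 1$ and a finer method would be needed.

```python
def generalized_gini(lorenz, nu, n=100_000):
    """Approximate G(nu) = 1 - nu * (nu - 1) * integral_0^1 (1 - p)**(nu - 2) * L(p) dp."""
    if nu <= 1.0:
        raise ValueError("nu must exceed 1")
    h = 1.0 / n
    integral = 0.0
    for i in range(n):
        p = (i + 0.5) * h                      # midpoint of the i-th subinterval
        integral += (1.0 - p) ** (nu - 2.0) * lorenz(p)
    return 1.0 - nu * (nu - 1.0) * integral * h


def power_lorenz(p):
    """Hypothetical test curve L(p) = p**2."""
    return p ** 2


for nu in (2.0, 3.0, 5.0):
    print(f"nu = {nu}:  G(nu) = {generalized_gini(power_lorenz, nu):.6f}")
```

For this curve the exact values are 1/3, 1/2 and 2/3, so the index increases as $\nu$ puts more weight on the lower part of the distribution.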

Using the mean income ( μ ) and the Gini coefficient (G), Sen [

$$W = \mu(1 - G). \tag{17}$$

Pietra index. The Pietra index $P$ is defined as the maximum $D_{\max} = F(\mu_X) - L_X(F(\mu_X))$, presented above. Graphically, it is the longest vertical distance between the diagonal, that is, the 45-degree line representing perfect equality, and the Lorenz curve, which gives the cumulative share of total income held below a given income percentile. The definitions yield Pietra coefficients satisfying the inequality $0 \le P < 1$. The lower bound of $P$ is obtained when there is total income equality, that is, when the Lorenz curve coincides with the diagonal. The upper bound is approached when the Lorenz curve converges towards the lower right corner. The limits in the inequalities can be obtained, and this is outlined in

If $P_X < P_Y$, then the distribution $F_X(x)$ measured by the Pietra index has lower inequality than the distribution $F_Y(y)$, and we say that $F_X(x)$ Pietra dominates $F_Y(y)$. We denote this relation $F_X(x) \succ_P F_Y(y)$. For the Lorenz curves in

obvious result $F_X(x) \succ_L F_Y(y) \Rightarrow F_X(x) \succ_P F_Y(y)$, that is, Lorenz dominance implies Pietra dominance.
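The implication can be illustrated with a minimal sketch that approximates the Pietra index as the largest vertical distance $p - L(p)$ on a uniform grid. The function name `pietra`, the grid size and the two power-form Lorenz curves $L_X(p) = p^{1.5}$ and $L_Y(p) = p^{3}$ are hypothetical choices for this example; since $p^{1.5} \ge p^{3}$ on $[0, 1]$, the first curve Lorenz dominates the second.

```python
def pietra(lorenz, n=100_000):
    """Approximate P = max over p of (p - L(p)) on a uniform grid of n + 1 points."""
    return max(i / n - lorenz(i / n) for i in range(n + 1))


def lorenz_x(p):
    """Hypothetical Lorenz curve L_X(p) = p**1.5 (lies above L_Y on [0, 1])."""
    return p ** 1.5


def lorenz_y(p):
    """Hypothetical Lorenz curve L_Y(p) = p**3."""
    return p ** 3


p_x = pietra(lorenz_x)
p_y = pietra(lorenz_y)
print(f"P_X = {p_x:.6f}   P_Y = {p_y:.6f}   P_X < P_Y: {p_x < p_y}")
```

Because $p - L_X(p) \le p - L_Y(p)$ at every grid point, the maximum for the dominating curve cannot exceed that of the dominated one.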

An alternative definition of the Pietra index has also been given. It can be defined as twice the area of the largest triangle inscribed in the area between the Lorenz curve and the diagonal line [

tangent is parallel to the diagonal. The height of the triangle is $h = \frac{P}{\sqrt{2}}$, and the base is the diagonal, $b = \sqrt{2}$. Twice the area is $2 \cdot \text{area} = 2 \cdot \frac{1}{2} \cdot \frac{P}{\sqrt{2}} \cdot \sqrt{2} = P$.

In comparison, the Gini index, $G$, is twice the area between the Lorenz curve and the diagonal, and the Pietra index is twice the area of the triangle inscribed in this area. Hence, the inequality $G \ge P$ holds generally [

In this section, we collect some examples in order to elucidate the theory. The models Pareto [

the Gini coefficient. If one of these properties is estimated, the other is fixed. We consider these three theoretical Lorenz curve models. We present how the Lorenz curves and the Gini and the Pietra indices depend on the model parameters. In addition, we compare the Lorenz curves of the models when their Gini indices are equal.

Pareto model. We define the Pareto distribution as $F(x) = 1 - x^{-\alpha}$, where $x \ge 1$ and $\alpha > 1$.

The frequency function is $f(x) = \alpha x^{-\alpha - 1}$, the mean is $\mu = \frac{\alpha}{\alpha - 1}$, the quantiles are $x_p = \left(\frac{1}{1 - p}\right)^{1/\alpha}$, the Lorenz curve is $L(p) = 1 - (1 - p)^{\frac{\alpha - 1}{\alpha}}$ and the Gini coefficient is $G = \frac{1}{2\alpha - 1}$. If $\alpha \to 1$, $G \to 1$ and if $\alpha \to \infty$, $G \to 0$. In

Finally, the Pietra index is $P = \frac{1}{\alpha}\left(\frac{\alpha - 1}{\alpha}\right)^{\alpha - 1}$. According to the general theory, the inequality $G \ge P$ holds for all parameter values, and consequently, $P \to 0$ when $\alpha \to \infty$. Let $\beta = \alpha - 1$, then $P = \frac{1}{\beta + 1}\left(\frac{\beta}{\beta + 1}\right)^{\beta}$. When $\beta \to 0$ ($\alpha \to 1$), then $P \to 1$. In

Simplified Rao-Tam model. Consider the simplified Rao-Tam model whose Lorenz curve is $L(p) = p^{\alpha}$ ($\alpha > 1$) [

when $\alpha \to \infty$, $L(p) \to 0$ for all $0 \le p < 1$ and the Lorenz curve converges towards the lower right corner of the square. In

The Gini coefficient is $G = \frac{\alpha - 1}{\alpha + 1}$. When $\alpha \to 1$, then $G \to 0$, and when $\alpha \to \infty$, then $G \to 1$. The Pietra index is $P = \left(\frac{1}{\alpha}\right)^{\frac{1}{\alpha - 1}} - \left(\frac{1}{\alpha}\right)^{\frac{\alpha}{\alpha - 1}}$. Using the vertical difference $D = p - p^{\alpha}$, the index inequalities $D \le P < G$ hold, and for $\alpha \to 1$, $G \to 0$, and consequently, $P \to 0$. For increasing $\alpha$ values, the supremum of $D = p - p^{\alpha}$ is one. This must also hold for the supremum of $P = \left(\frac{1}{\alpha}\right)^{\frac{1}{\alpha - 1}} - \left(\frac{1}{\alpha}\right)^{\frac{\alpha}{\alpha - 1}}$. Consequently, the interval $0 < P < 1$ cannot be shortened. In

Chotikapanich. Chotikapanich [

The limits of the fractions studied below result in indeterminate forms $\frac{0}{0}$ or $\frac{\infty}{\infty}$, and l’Hospital’s rule $\lim_{k \to a}\frac{f(k)}{g(k)} = \lim_{k \to a}\frac{f'(k)}{g'(k)}$ has to be applied several times. For $k \to 0$ and arbitrary $0 < p < 1$, we obtain $\lim_{k \to 0}\frac{e^{kp} - 1}{e^{k} - 1} = \lim_{k \to 0}\frac{p e^{kp}}{e^{k}} = p$. This means that the Lorenz curve converges towards the diagonal. For $k \to \infty$, one obtains that for all $0 < p < 1$

$$\lim_{k \to \infty}\frac{e^{kp} - 1}{e^{k} - 1} = \lim_{k \to \infty}\frac{p e^{kp}}{e^{k}} = \lim_{k \to \infty} p e^{-k(1 - p)} = 0.$$

This means that the Lorenz curve converges towards the lower right corner of the square.

The extreme Lorenz curves can be obtained by the limit studies $k \to 0$ and $k \to \infty$, and the Lorenz curves as functions of the parameter $k$ are presented in

The Gini index for the Chotikapanich model is

$$G = 1 - 2\left(\frac{1}{k}\frac{e^{k} - 1}{e^{k} - 1} - \frac{1}{e^{k} - 1}\right) = 1 - 2\left(\frac{e^{k} - k - 1}{k(e^{k} - 1)}\right).$$

When $k \to 0$, one obtains

$$\lim_{k \to 0} G = \lim_{k \to 0}\left(1 - 2\left(\frac{e^{k} - k - 1}{k(e^{k} - 1)}\right)\right) = 1 - 2\lim_{k \to 0}\left(\frac{e^{k} - 1}{(e^{k} - 1) + k e^{k}}\right) = 1 - 2\lim_{k \to 0}\left(\frac{e^{k}}{e^{k} + k e^{k} + e^{k}}\right) = 1 - 2 \cdot \frac{1}{2} = 0.$$

When $k \to \infty$, one obtains

$$\lim_{k \to \infty} G = \lim_{k \to \infty}\left(1 - 2\left(\frac{e^{k} - k - 1}{k(e^{k} - 1)}\right)\right) = 1 - 2\lim_{k \to \infty}\left(\frac{e^{k} - 1}{(e^{k} - 1) + k e^{k}}\right) = 1 - 2\lim_{k \to \infty}\left(\frac{e^{k}}{e^{k} + k e^{k} + e^{k}}\right) = 1 - 2\lim_{k \to \infty}\left(\frac{1}{2 + k}\right) = 1 - 0 = 1.$$

Consequently, for $G$, the inequalities $0 < G < 1$ hold and the range cannot be shortened.

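The Pietra index is

$$P = \frac{1}{k}\ln\left(\frac{e^{k} - 1}{k}\right) - \frac{1}{k} + \frac{1}{e^{k} - 1} = \frac{1}{k}\ln\left(\frac{e^{k} - 1}{k}\right) - \frac{e^{k} - 1 - k}{k(e^{k} - 1)} = \frac{1}{k}k + \frac{1}{k}\ln\left(\frac{1 - e^{-k}}{k}\right) - \frac{e^{k} - 1 - k}{k(e^{k} - 1)} = 1 + \frac{1}{k}\ln\left(\frac{1 - e^{-k}}{k}\right) - \frac{e^{k} - 1 - k}{k(e^{k} - 1)}.$$

In general, $P < G$, and consequently, $P \to 0$ when $k \to 0$. When $k \to \infty$, one obtains

$$\lim_{k \to \infty}\left(1 + \frac{1}{k}\ln\left(\frac{1 - e^{-k}}{k}\right) - \frac{e^{k} - 1 - k}{k(e^{k} - 1)}\right) = \lim_{k \to \infty}\left(1 + \frac{1}{k}\ln\left(\frac{1 - e^{-k}}{k}\right)\right) - \lim_{k \to \infty}\left(\frac{e^{k} - 1}{(e^{k} - 1) + k e^{k}}\right) = \lim_{k \to \infty}\left(1 - \frac{\ln k}{k} + \frac{1}{k}\ln\left(1 - e^{-k}\right)\right) - 0 = 1.$$

Hence, $\lim_{k \to \infty} P = 1$ and the inequalities $0 < P < 1$ hold and cannot be shortened.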
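The closed-form expressions for the Chotikapanich model can be evaluated for a few parameter values to see the limits emerge. The sketch below assumes the Lorenz curve $L(p) = \frac{e^{kp} - 1}{e^{k} - 1}$, the expression whose limits are studied above; the function names and the chosen values of $k$ are hypothetical, and `math.expm1` is used only to keep $e^{k} - 1$ accurate for small $k$.

```python
import math


def lorenz(p, k):
    """Assumed Chotikapanich Lorenz curve L(p) = (exp(k p) - 1) / (exp(k) - 1)."""
    return math.expm1(k * p) / math.expm1(k)


def gini(k):
    """G = 1 - 2 * (exp(k) - k - 1) / (k * (exp(k) - 1))."""
    return 1.0 - 2.0 * (math.exp(k) - k - 1.0) / (k * math.expm1(k))


def pietra(k):
    """P = p_mu - L(p_mu), where p_mu solves L'(p) = 1."""
    p_mu = math.log(math.expm1(k) / k) / k
    return p_mu - lorenz(p_mu, k)


for k in (0.01, 1.0, 5.0, 50.0):
    print(f"k = {k:6.2f}   G = {gini(k):.6f}   P = {pietra(k):.6f}")
```

For small $k$ both indices are close to zero, for large $k$ both approach one, and $P$ stays below $G$ throughout, in agreement with the limits derived above.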

The indices $G$ and $P$ as functions of the parameter $k$ are presented in

Above, we made the general remark that different distributions can result in the same Gini index. In

In general, the step from the Lorenz curve to the income distribution starts from the formula

$$L'(p) = \frac{x_p}{\mu}, \tag{18}$$

where $x_p$ is the $p$ quantile and $\mu$ is the mean of the corresponding distribution $F(x)$. We define $M(\cdot)$ as the inverse function of the derivative $L'(\cdot)$. From (18), we obtain

$$p = M\left(\frac{x_p}{\mu}\right). \tag{19}$$

Equation (19) indicates that $M(\cdot)$ is the income distribution function corresponding to the given Lorenz curve, that is, $F(x) = M\left(\frac{x}{\mu}\right)$. This connection between the Lorenz curve and the distribution function is easily defined, but for most of the exact Lorenz curves, it is difficult or even impossible to mathematically obtain the distribution.
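When the inverse $M(\cdot)$ is awkward to write down, the relation $x_p = \mu L'(p)$ can still be used to simulate incomes, because it gives the quantile function directly. The following sketch is a hypothetical illustration: it draws $U$ uniformly on $(0, 1)$, sets $X = \mu L'(U)$, and compares the empirical income shares with the given Lorenz curve. The curve $L(p) = p^{\alpha}$ with $\alpha = 2$, the mean `mu`, the sample size and the seed are assumptions of the example.

```python
import random


def sample_incomes(lorenz_derivative, mu, size, seed=0):
    """Draw incomes x = mu * L'(u), u uniform on (0, 1), i.e. invert F(x) = M(x / mu)."""
    rng = random.Random(seed)
    return [mu * lorenz_derivative(rng.random()) for _ in range(size)]


def empirical_lorenz(sorted_incomes, p):
    """Income share of the poorest fraction p in a sample sorted in ascending order."""
    cutoff = int(p * len(sorted_incomes))
    return sum(sorted_incomes[:cutoff]) / sum(sorted_incomes)


alpha = 2.0       # hypothetical parameter of L(p) = p**alpha
mu = 30_000.0     # hypothetical mean income

incomes = sorted(sample_incomes(lambda u: alpha * u ** (alpha - 1.0), mu, 200_000))

print("sample mean:", round(sum(incomes) / len(incomes), 1))
for p in (0.25, 0.5, 0.75):
    print(f"p = {p}   empirical L(p) = {empirical_lorenz(incomes, p):.4f}   p**alpha = {p ** alpha:.4f}")
```

For this particular curve the inverse is available in closed form, $F(x) = \frac{x}{2\mu}$ on $0 \le x \le 2\mu$, so the simulated incomes are uniformly distributed; the same sampling step applies to any differentiable Lorenz curve whose derivative can be evaluated.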

Primary income data yield the most exact estimates of the income inequality coefficients, such as Gini and Pietra. Fellman [

This work was supported in part by a grant from the Magnus Ehrnrooth Foundation.

Fellman, J. (2018) Income Inequality Measures. Theoretical Economics Letters, 8, 557-574. https://doi.org/10.4236/tel.2018.83039