Finding the average of interval data. Example: the arithmetic mean for an interval series

The values of a characteristic differ from one unit of a statistical population to another: for example, the wages of workers in the same profession at one enterprise are not the same over the same period, and neither are market prices for the same products or crop yields across the farms of a district. Therefore, to determine a value of the characteristic that is representative of the entire population under study, average values are calculated.
The average value is a generalizing characteristic of a set of individual values of some quantitative characteristic.

The population studied on a quantitative basis consists of individual values; they are influenced by both general causes and individual conditions. In the average value, deviations characteristic of individual values are canceled out. The average, being a function of a set of individual values, represents the entire aggregate with one value and reflects what is common to all its units.

An average calculated for a population of qualitatively homogeneous units is called a typical average. For example, you can calculate the average monthly salary of an employee of a particular professional group (miner, doctor, librarian). Of course, the monthly wages of individual miners differ from each other and from the average wage because of differences in qualifications, length of service, time worked per month and many other factors. However, the average level reflects the main factors that influence the level of wages and cancels out the differences that arise from the individual characteristics of each employee. The average salary reflects the typical level of remuneration for a given type of worker. Before a typical average is calculated, the population should be analyzed to see how qualitatively homogeneous it is. If the population consists of distinct parts, it should be divided into typical groups; otherwise the result is as uninformative as the proverbial "average temperature across the hospital".

Average values used as characteristics of heterogeneous populations are called system averages. Examples are the average gross domestic product (GDP) per capita, the average per capita consumption of various groups of goods, and other similar values that characterize the state as a unified economic system.

The average must be calculated for populations consisting of a sufficiently large number of units. This condition is necessary for the law of large numbers to take effect, so that random deviations of individual values from the general trend cancel each other out.

Types of averages and methods for calculating them

The choice of the type of average is determined by the economic content of the indicator and by the source data. However, any average must be calculated so that, when it replaces each variant of the averaged characteristic, the final, generalizing, or, as it is commonly called, defining indicator, which is associated with the averaged indicator, does not change. For example, when the actual speeds on individual sections of a route are replaced with their average speed, the total distance traveled by the vehicle in the same time should not change; when the actual wages of individual employees of an enterprise are replaced with the average wage, the wage fund should not change. Consequently, in each specific case, depending on the nature of the available data, there is only one true average value of the indicator that is adequate to the properties and essence of the socio-economic phenomenon being studied.
The most commonly used are the arithmetic mean, harmonic mean, geometric mean, quadratic mean (root mean square) and cubic mean.
The listed averages belong to the class of power means and are combined by the general formula:
$\bar{x} = \sqrt[m]{\frac{\sum x^{m}}{n}}$,
where $\bar{x}$ is the average value of the characteristic being studied;
m – exponent of the power mean;
x – current value (variant) of the characteristic being averaged;
n – number of values being averaged.
Depending on the value of the exponent m, the following types of power means are distinguished:
m = -1 – harmonic mean;
m = 0 – geometric mean (obtained as the limit of the formula as m approaches 0);
m = 1 – arithmetic mean;
m = 2 – quadratic mean (root mean square);
m = 3 – cubic mean.
When the same initial data are used, the larger the exponent m in the above formula, the larger the average value:
$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{sq} \le \bar{x}_{cub}$.
This property of power means, which increase as the exponent of the defining function increases, is called the majorant rule of means.
Each of the listed averages can take two forms: simple and weighted.
The simple form is used when the average is calculated from primary (ungrouped) data; the weighted form, when the average is calculated from secondary (grouped) data.
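The general power-mean formula and the ordering of the five means can be checked numerically. The following is a minimal sketch, assuming positive input values; the function name `power_mean` is introduced here for illustration only, and the sample is the set of ten work-experience values used in this lecture's example.

```python
import math

def power_mean(values, m):
    """Power mean of positive values with exponent m.

    m = 0 is treated as the limiting case, the geometric mean.
    """
    n = len(values)
    if m == 0:
        # Geometric mean computed through logarithms.
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** m for x in values) / n) ** (1 / m)

data = [5, 3, 5, 4, 3, 4, 5, 4, 2, 4]
exponents = (-1, 0, 1, 2, 3)
means = [power_mean(data, m) for m in exponents]

for m, value in zip(exponents, means):
    print(f"m = {m:2d}: {value:.3f}")
```

The printed values should not decrease from one line to the next, which is the majorant rule in action.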

Arithmetic mean

The arithmetic mean is used when the volume of the population is the sum of all individual values of a varying characteristic. If the type of average is not specified, the arithmetic mean is assumed. Its logical formula is:
arithmetic mean = total volume of the characteristic / number of units in the population.
The simple arithmetic mean is calculated from ungrouped data using the formula:
$\bar{x} = \frac{x_1 + x_2 + \dots + x_N}{N}$ or $\bar{x} = \frac{\sum_{j=1}^{N} x_j}{N}$,
where $x_j$ are the individual values of the characteristic;
j is the serial number of the observation unit, which is characterized by the value $x_j$;
N – number of observation units (volume of the population).
Example. The lecture “Summary and grouping of statistical data” examined the results of observing the work experience of a team of 10 people: 5, 3, 5, 4, 3, 4, 5, 4, 2, 4. Let's calculate the average work experience of the team's workers:
$\bar{x} = \frac{5+3+5+4+3+4+5+4+2+4}{10} = \frac{39}{10} = 3.9$
The simple arithmetic mean formula can also be used to calculate averages in chronological series, provided the time intervals for which the values are given are equal.
Example. The volume of products sold amounted to 47 monetary units in the first quarter, 54 in the second, 65 in the third and 58 in the fourth. The average quarterly turnover is (47+54+65+58)/4 = 56 monetary units.
If a chronological series gives moment-in-time indicators, then for each period they are replaced by the half-sum of the values at the beginning and end of the period before averaging.
If there are more than two moments and the intervals between them are equal, the average is calculated using the chronological mean formula:

$\bar{x} = \frac{\frac{1}{2}x_1 + x_2 + \dots + x_{n-1} + \frac{1}{2}x_n}{n-1}$,
where n is the number of time points.
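The chronological mean can be sketched in a few lines. The function name `chronological_mean` and the stock levels below are hypothetical and serve only to illustrate the formula for equally spaced moments.

```python
def chronological_mean(levels):
    """Mean of moment-in-time levels observed at equally spaced moments.

    The first and last levels enter with weight 1/2, and the sum is
    divided by the number of periods, n - 1.
    """
    n = len(levels)
    if n < 2:
        raise ValueError("at least two moments are needed")
    inner = sum(levels[1:-1])
    return (levels[0] / 2 + inner + levels[-1] / 2) / (n - 1)

# Hypothetical stock levels at the start of four consecutive months.
stock = [100, 120, 110, 130]
print(chronological_mean(stock))
```

With only two moments the function reduces to the half-sum of the two values, as described in the text.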
When the data are grouped by values of the characteristic (i.e., a discrete variation series has been constructed), the weighted arithmetic mean is calculated using either the frequencies $f_i$ or the relative frequencies $w_i$ of the specific values of the characteristic, whose number (k) is significantly smaller than the number of observations (N):
$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{\sum_{i=1}^{k} f_i}$,
$\bar{x} = \frac{\sum_{i=1}^{k} x_i w_i}{\sum_{i=1}^{k} w_i}$,
where k is the number of groups of the variation series,
i – group number of the variation series.
Since $\sum_{i=1}^{k} f_i = N$ and $\sum_{i=1}^{k} w_i = 1$, we obtain the formulas used for practical calculations:
$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{N}$ and $\bar{x} = \sum_{i=1}^{k} x_i w_i$
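Example. Let's calculate the average length of service of the team's workers from the grouped series (in the data 5, 3, 5, 4, 3, 4, 5, 4, 2, 4 the values 2, 3, 4 and 5 occur 1, 2, 4 and 3 times, respectively).
a) using frequencies:
$\bar{x} = \frac{2 \cdot 1 + 3 \cdot 2 + 4 \cdot 4 + 5 \cdot 3}{10} = \frac{39}{10} = 3.9$
b) using relative frequencies:
$\bar{x} = 2 \cdot 0.1 + 3 \cdot 0.2 + 4 \cdot 0.4 + 5 \cdot 0.3 = 3.9$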
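The two weighted calculations can be reproduced with a short sketch. It groups the ten work-experience values from the example with `collections.Counter`; the variable names are chosen here for illustration.

```python
from collections import Counter

observations = [5, 3, 5, 4, 3, 4, 5, 4, 2, 4]

# Build the discrete variation series: value -> frequency.
freq = Counter(observations)
total = sum(freq.values())

# a) weighting by frequencies
mean_by_freq = sum(x * f for x, f in freq.items()) / total

# b) weighting by relative frequencies, which sum to 1
rel_freq = {x: f / total for x, f in freq.items()}
mean_by_share = sum(x * w for x, w in rel_freq.items())

print(sorted(freq.items()))
print(mean_by_freq, mean_by_share)
```

Both results agree with the simple arithmetic mean of the ungrouped data up to floating-point rounding, because grouping by exact values loses no information.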

When the data are grouped by intervals, i.e. presented as an interval distribution series, the middle of the interval is taken as the value of the attribute when calculating the arithmetic mean. This rests on the assumption that the population units are distributed uniformly within each interval. The calculation is carried out using the formulas:
$\bar{x} = \frac{\sum_{i=1}^{k} x'_i f_i}{N}$ and $\bar{x} = \sum_{i=1}^{k} x'_i w_i$,
where $x'_i$ is the middle of the interval: $x'_i = \frac{x_i^{low} + x_i^{up}}{2}$,
where $x_i^{low}$ and $x_i^{up}$ are the lower and upper boundaries of the intervals (provided that the upper boundary of a given interval coincides with the lower boundary of the next interval).

Example. Let's calculate the arithmetic mean of the interval variation series constructed based on the results of a study of the annual wages of 30 workers (see lecture “Summary and grouping of statistical data”).
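Table 1 – Interval variation series of the distribution.

Intervals, UAH	Frequency f, people	Relative frequency w	Middle of the interval x'	x'f	x'w
600-700	3	0.10	(600+700):2=650	1950	65
700-800	6	0.20	(700+800):2=750	4500	150
800-900	8	0.267	850	6800	226.95
900-1000	9	0.30	950	8550	285
1000-1100	3	0.10	1050	3150	105
1100-1200	1	0.033	1150	1150	37.95
Total	30	1.00		26100	869.9

$\bar{x} = \frac{26100}{30} = 870$ UAH or $\bar{x} = 869.9$ UAH (the small difference is due to rounding of the relative frequencies)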
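The interval-series calculation can be sketched as follows, using the intervals and frequencies of Table 1. The function name `interval_mean` is an illustrative choice, and the sketch assumes closed intervals with known boundaries.

```python
# (lower boundary, upper boundary, frequency) for each interval, in UAH.
wage_intervals = [
    (600, 700, 3),
    (700, 800, 6),
    (800, 900, 8),
    (900, 1000, 9),
    (1000, 1100, 3),
    (1100, 1200, 1),
]

def interval_mean(rows):
    """Weighted arithmetic mean that uses interval midpoints as variants."""
    total = sum(f for _, _, f in rows)
    weighted_sum = sum((low + high) / 2 * f for low, high, f in rows)
    return weighted_sum / total

print(interval_mean(wage_intervals))
```

Because the midpoints stand in for the unknown individual values, the result is an approximation of the mean of the original data.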
Arithmetic means calculated from the source data and from an interval variation series may not coincide, because the attribute values are distributed unevenly within the intervals. In this case, for a more accurate calculation of the weighted arithmetic mean, one should use not the middles of the intervals but the simple arithmetic means calculated for each group (group averages). The average calculated from group means using the weighted formula is called the general average.
The arithmetic mean has a number of properties.
1. The sum of the deviations of the variants from the average is zero:
$\sum (x_i - \bar{x}) f_i = 0$.
2. If all variants are increased or decreased by the amount A, then the average increases or decreases by the same amount A:
$\frac{\sum (x_i \pm A) f_i}{\sum f_i} = \bar{x} \pm A$
3. If each variant is increased or decreased by a factor of B, then the average also increases or decreases by the same factor:
$\frac{\sum (x_i B) f_i}{\sum f_i} = \bar{x} B$ or $\frac{\sum (x_i / B) f_i}{\sum f_i} = \frac{\bar{x}}{B}$
4. The sum of the products of the variants and their frequencies is equal to the product of the average and the sum of the frequencies:
$\sum x_i f_i = \bar{x} \sum f_i$
5. If all frequencies are divided or multiplied by the same nonzero number c, the arithmetic mean does not change:
$\frac{\sum x_i (f_i / c)}{\sum (f_i / c)} = \bar{x}$
6. If the frequencies in all intervals are equal to each other, then the weighted arithmetic mean is equal to the simple arithmetic mean:
$\frac{\sum x_i f_i}{\sum f_i} = \frac{\sum x_i}{k}$,
where k is the number of groups of the variation series.

Using the properties of the average makes it possible to simplify its calculation.
Let us assume that all variants (x) are first reduced by the same number A and then divided by B. The greatest simplification is achieved when the middle of the interval with the highest frequency is chosen as A, and the width of the interval (for series with identical intervals) is chosen as B. The quantity A is called the origin, so this method of calculating the average is called the method of counting from a conditional zero, or the method of moments.
After such a transformation, we obtain a new variation series whose variants are equal to $x^{*}_i = \frac{x_i - A}{B}$. Their arithmetic mean, called the first-order moment, is expressed by the formula $m_1 = \frac{\sum x^{*}_i f_i}{\sum f_i}$ and, according to the second and third properties, is equal to the mean of the original variants reduced first by A and then divided by B, i.e. $m_1 = \frac{\bar{x} - A}{B}$.
To obtain the actual average (the average of the original series), multiply the first-order moment by B and add A:
$\bar{x} = m_1 B + A$
The calculation of the arithmetic mean using the method of moments is illustrated by the data in Table 2.
Table 2 – Distribution of factory shop workers by length of service

Length of service, years	Number of workers f	Middle of the interval x	x - A	(x - A)/B	((x - A)/B)·f
0 – 5	12	2.5	-15	-3	-36
5 – 10	16	7.5	-10	-2	-32
10 – 15	23	12.5	-5	-1	-23
15 – 20	28	17.5	0	0	0
20 – 25	17	22.5	5	1	17
25 – 30	14	27.5	10	2	28
Total	110				-46

We find the first-order moment: $m_1 = \frac{-46}{110} \approx -0.418$. Then, knowing that A = 17.5 and B = 5, we calculate the average length of service of the workshop workers:
$\bar{x} = m_1 B + A = -0.418 \cdot 5 + 17.5 \approx 15.4$ years
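The method of moments can be cross-checked against the direct weighted mean. The sketch below uses the midpoints and frequencies of Table 2 with A = 17.5 and B = 5; the function name `mean_by_moments` is an assumption made for this illustration.

```python
def mean_by_moments(midpoints, freqs, a, b):
    """Return (first-order moment, mean) using a conditional zero a and scale b."""
    total = sum(freqs)
    m1 = sum((x - a) / b * f for x, f in zip(midpoints, freqs)) / total
    return m1, m1 * b + a

midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5]
workers = [12, 16, 23, 28, 17, 14]

m1, mean = mean_by_moments(midpoints, workers, a=17.5, b=5)

# Direct weighted mean for comparison.
direct = sum(x * f for x, f in zip(midpoints, workers)) / sum(workers)

print(round(m1, 3), round(mean, 2), round(direct, 2))
```

The choice of A and B does not affect the result; it only makes the manual arithmetic easier by turning the variants into small integers.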

Harmonic mean
As shown above, the arithmetic mean is used to calculate the average value of a characteristic when its variants x and their frequencies f are known.
If the statistical information does not contain the frequencies f for individual variants x of the population but gives only their products xf, the weighted harmonic mean formula is applied. To calculate the average, let us denote $x_i f_i = M_i$, whence $f_i = \frac{M_i}{x_i}$. Substituting these expressions into the formula for the weighted arithmetic mean, we obtain the formula for the weighted harmonic mean:
$\bar{x} = \frac{\sum_{i=1}^{k} M_i}{\sum_{i=1}^{k} \frac{M_i}{x_i}}$,
where $M_i$ is the volume (weight) of the values of the characteristic in the interval numbered i (i=1,2, …, k).

Thus, the harmonic mean is used when it is not the variants themselves that are summed but their reciprocals: $\frac{1}{x_i}$.
When the weight of each variant is equal to one, i.e. individual values of the inverse characteristic occur once, the simple harmonic mean is applied:
$\bar{x} = \frac{N}{\sum_{j=1}^{N} \frac{1}{x_j}}$,
where $x_j$ are the individual variants of the inverse characteristic, each occurring once;
N – number of variants.
If there are harmonic averages for two parts of a population, then the overall average for the entire population is calculated using the formula:

and is called weighted harmonic mean of group means.

Example. During trading on the currency exchange, three transactions were concluded in the first hour of operation. Data on the amount of hryvnia sold and the hryvnia exchange rate against the US dollar are given in Table 3 (columns 2 and 3). Determine the average exchange rate of the hryvnia against the US dollar for the first hour of trading.
Table 3 – Data on the progress of trading on the currency exchange

The average dollar exchange rate is the ratio of the amount of hryvnia sold in all transactions to the amount of dollars acquired in the same transactions. The total amount of hryvnia sold is known from column 2 of the table, and the number of dollars purchased in each transaction is found by dividing the amount of hryvnia sold by its exchange rate (column 4). A total of $22 million was purchased in the three transactions. The average exchange rate of the hryvnia per dollar is therefore the total amount of hryvnia sold divided by $22 million, which is exactly the weighted harmonic mean of the transaction rates.
The resulting value is the true average, because replacing the actual hryvnia exchange rates in the transactions with it does not change the total amount of hryvnia sold, which serves as the defining indicator.
If the arithmetic mean of the rates were used instead, then at that rate the purchase of $22 million would have required 110.66 million UAH, which is not the case.
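The logic of the exchange example can be sketched with invented numbers. The rates and amounts below are hypothetical and are NOT the values of Table 3, which is not reproduced in this text; the function name `weighted_harmonic_mean` is likewise an illustrative choice.

```python
def weighted_harmonic_mean(rates, volumes):
    """Weighted harmonic mean, where each volume is M = x * f (rate times quantity)."""
    return sum(volumes) / sum(v / r for r, v in zip(rates, volumes))

# Hypothetical transactions: exchange rate (local units per dollar)
# and amount of local currency sold in each transaction.
rates = [5.0, 4.0]
sold = [100.0, 40.0]

dollars_bought = sum(v / r for r, v in zip(rates, sold))   # 20 + 10
harmonic = weighted_harmonic_mean(rates, sold)
arithmetic = sum(rates) / len(rates)

# Only the harmonic mean reproduces the total amount actually sold.
print(harmonic * dollars_bought, sum(sold))
print(arithmetic * dollars_bought)
```

The check on the last lines mirrors the defining-indicator argument: the average rate times the dollars bought must equal the local currency actually sold.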

Geometric mean
The geometric mean is used to analyze the dynamics of phenomena and makes it possible to determine the average growth coefficient. When the geometric mean is calculated, the individual values of the characteristic are relative indicators of dynamics, constructed as chain values, i.e. as the ratio of each level to the previous one.
The simple geometric mean is calculated using the formula:
$\bar{x} = \sqrt[N]{x_1 \cdot x_2 \cdot \ldots \cdot x_N} = \sqrt[N]{\prod_{j=1}^{N} x_j}$,
where $\prod$ is the product sign,
N – number of averaged values.
Example. The number of registered crimes increased 1.57 times over 4 years: 1.08 times in the 1st year, 1.1 times in the 2nd, 1.18 times in the 3rd and 1.12 times in the 4th. Then the average annual growth coefficient of the number of crimes is $\bar{x} = \sqrt[4]{1.08 \cdot 1.1 \cdot 1.18 \cdot 1.12} = \sqrt[4]{1.57} \approx 1.12$, i.e. the number of registered crimes grew by an average of 12% per year.
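The crime-growth example can be reproduced in a few lines. This is a minimal sketch using `math.prod` (available in Python 3.8 and later); the variable names are illustrative.

```python
import math

# Chain growth coefficients for the four years from the example.
growth = [1.08, 1.10, 1.18, 1.12]

total_growth = math.prod(growth)                   # overall growth over 4 years
average_growth = total_growth ** (1 / len(growth)) # simple geometric mean

print(round(total_growth, 2), round(average_growth, 2))
```

Raising the average coefficient to the fourth power returns the overall growth, which is the defining indicator preserved by the geometric mean.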

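Deviation from the norm, x	Number of items, f	x²	x²f
1.8	1	3.24	3.24
-0.8	3	0.64	1.92
0.2	4	0.04	0.16
1.0	1	1	1
1.4	1	1.96	1.96
Total	10		8.28

To calculate the weighted mean square, we determine $x^2$ and $x^2 f$ and enter them into the table. Then the average deviation of the length of the products from the given norm is equal to:
$\bar{x}_{sq} = \sqrt{\frac{\sum x^2 f}{\sum f}} = \sqrt{\frac{8.28}{10}} \approx 0.91$
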
The arithmetic average would be unsuitable in this case, because as a result we would get zero deviation.
The use of the mean square will be discussed further in terms of variation.
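The weighted mean square from the table can be checked with a short sketch. The deviations and counts are taken as they survive in the table above, with signs as printed, which is an assumption about the original data.

```python
import math

deviations = [1.8, -0.8, 0.2, 1.0, 1.4]   # deviation of product length from the norm
counts = [1, 3, 4, 1, 1]                  # number of items with each deviation

sum_sq = sum(x ** 2 * f for x, f in zip(deviations, counts))
mean_square = math.sqrt(sum_sq / sum(counts))

print(round(sum_sq, 2), round(mean_square, 2))
```

Squaring removes the sign of each deviation, so positive and negative deviations cannot cancel each other as they can in an arithmetic mean.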

Example: Determine the average age of correspondence-course students from the data in the following table:

Age of students, years (X)	Number of students, people (f)	Midpoint of the interval (x′)	x′f





26 and older
Total:

To calculate the average in interval series, first determine the average value of the interval as the half-sum of the upper and lower limits, and then calculate the average using the arithmetic weighted average formula.

Above is an example with equal intervals, with the 1st and last being open.

Answer: The average student age is 22.6 years, or approximately 23 years.

The harmonic mean has a more complex structure than the arithmetic mean. It is used when the statistical information does not contain the frequencies f for individual values of the attribute but gives the product of the attribute value and the frequency, M = xf. As a type of power mean, the harmonic mean looks like this:
$\bar{x}_{harm} = \left(\frac{\sum x^{-1}}{n}\right)^{-1}$

Depending on how the source data are presented, the harmonic mean can be calculated as simple or weighted. If the source data are not grouped, the simple harmonic mean is used:
$\bar{x}_{harm} = \frac{n}{\sum \frac{1}{x}}$
It is used, for example, to determine the average cost of labor, materials, etc. per unit of production across several enterprises.

When working with grouped data, use the weighted harmonic mean:
$\bar{x}_{harm} = \frac{\sum M}{\sum \frac{M}{x}}$
The geometric mean is used when the total volume of the averaged characteristic is a multiplicative quantity, i.e. it is determined not by summing but by multiplying the individual values of the characteristic.

The weighted form of the geometric mean is not used in practical calculations.

The mean square is used when, on replacing the individual values of a characteristic with the average value, the sum of squares of the original values must remain unchanged.

Its main area of use is measuring the degree of fluctuation of individual values of a characteristic around the arithmetic mean (the standard deviation). In addition, the mean square is used when it is necessary to calculate the average value of a characteristic expressed in square or cubic units of measurement (for example, the average size of square plots or the average diameter of pipes and tree trunks).

The mean square is calculated in two forms: simple, $\bar{x}_{sq} = \sqrt{\frac{\sum x^2}{n}}$, and weighted, $\bar{x}_{sq} = \sqrt{\frac{\sum x^2 f}{\sum f}}$.

All power means differ from each other in the value of the exponent. The higher the exponent, the greater the value of the average:
$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{sq}$
This property of power means is called the majorant property of means.

The most common type of average is the arithmetic mean.

Simple arithmetic mean

The simple arithmetic mean is the value obtained when the total volume of a given attribute in the data is distributed equally among all units of the population. Thus, the average annual output per worker is the amount of output that each employee would account for if the entire volume of output were distributed equally among all employees of the organization. The simple arithmetic mean is calculated using the formula:
$\bar{x} = \frac{\sum x}{n}$
The simple arithmetic mean is equal to the ratio of the sum of the individual values of a characteristic to the number of values in the population.

Example 1. The six workers of a team receive 3, 3.2, 3.3, 3.5, 3.8 and 3.1 thousand rubles per month.

Find the average salary.
Solution: (3 + 3.2 + 3.3 + 3.5 + 3.8 + 3.1) / 6 = 19.9 / 6 ≈ 3.32 thousand rubles.

Weighted arithmetic mean

If the data set is large and is presented as a distribution series, the weighted arithmetic mean is calculated. This is how the weighted average price per unit of production is determined: the total cost of production (the sum of the products of the quantity and the unit price) is divided by the total quantity of production.

This can be written as the following formula:
$\bar{x} = \frac{\sum x f}{\sum f}$
The weighted arithmetic mean is equal to the ratio of the sum of the products of each value of the characteristic and its frequency to the sum of the frequencies of all values. It is used when the variants of the population under study occur an unequal number of times.

Example 2 . Find the average salary of workshop workers per month

The average salary can be obtained by dividing the total wage bill by the total number of workers:

Answer: 3.35 thousand rubles.

Arithmetic mean for interval series

When calculating the arithmetic mean for an interval variation series, first determine the mean for each interval as the half-sum of the upper and lower limits, and then the mean of the entire series. In the case of open intervals, the value of the lower or upper interval is determined by the size of the intervals adjacent to them.

Averages calculated from interval series are approximate.

Example 3. Determine the average age of evening students.

The degree of approximation of such averages depends on how closely the actual distribution of population units within each interval approaches a uniform distribution.

When calculating averages, not only absolute values (frequencies) but also relative values (relative frequencies w) can be used as weights: $\bar{x} = \frac{\sum x w}{\sum w}$

The arithmetic mean has a number of properties that reveal its essence more fully and simplify calculations:

1. The product of the average and the sum of the frequencies is always equal to the sum of the products of the variants and their frequencies, i.e. $\bar{x} \sum f = \sum x f$

2. The arithmetic mean of a sum of varying quantities is equal to the sum of the arithmetic means of these quantities: $\overline{x + y} = \bar{x} + \bar{y}$

3. The algebraic sum of the deviations of the individual values of a characteristic from the average is equal to zero: $\sum (x - \bar{x}) f = 0$

4. The sum of the squared deviations of the variants from the average is less than the sum of the squared deviations from any other value a, i.e. $\sum (x - \bar{x})^2 f < \sum (x - a)^2 f$ for $a \ne \bar{x}$
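These properties are easy to verify numerically. The sketch below uses a small hypothetical variation series (values and frequencies chosen only for illustration) and checks properties 1, 3 and 4 for it.

```python
# Hypothetical discrete variation series: variants and their frequencies.
x = [2, 3, 4, 5]
f = [1, 2, 4, 3]

total_f = sum(f)
mean = sum(xi * fi for xi, fi in zip(x, f)) / total_f

# Property 1: mean * sum of frequencies equals sum of x * f.
product_check = mean * total_f

# Property 3: weighted deviations from the mean sum to zero.
deviation_sum = sum((xi - mean) * fi for xi, fi in zip(x, f))

def squared_deviations(a):
    """Weighted sum of squared deviations of the variants from a."""
    return sum((xi - a) ** 2 * fi for xi, fi in zip(x, f))

# Property 4: the sum is smallest when a equals the mean.
print(mean, deviation_sum)
print(squared_deviations(mean), squared_deviations(mean + 0.5), squared_deviations(mean - 1))
```

A check on one data set is an illustration rather than a proof, but it shows how each property behaves in practice.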

In statistics, when a phenomenon or process is analyzed, it is often necessary to take into account not only the average levels of the indicators being studied but also the scatter, or variation, in the values of individual units, which is an important characteristic of the population being studied.

Stock prices, supply and demand, and interest rates over different periods of time and in different places are subject to the greatest variation.

The main indicators characterizing variation are the range, the variance, the standard deviation and the coefficient of variation.

The range of variation is the difference between the maximum and minimum values of the characteristic: R = Xmax – Xmin. The disadvantage of this indicator is that it evaluates only the boundaries of variation of a characteristic and does not reflect its variability within these boundaries.

The variance does not have this shortcoming. It is calculated as the average squared deviation of the values of the characteristic from their average (in simple and weighted form):
$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$ and $\sigma^2 = \frac{\sum (x - \bar{x})^2 f}{\sum f}$

A simplified way of calculating the variance uses the following formulas (simple and weighted):
$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$ and $\sigma^2 = \frac{\sum x^2 f}{\sum f} - \left(\frac{\sum x f}{\sum f}\right)^2$

Examples of the application of these formulas are presented in Problems 1 and 2.

An indicator widely used in practice is the standard deviation:
$\sigma = \sqrt{\sigma^2}$

The standard deviation is defined as the square root of the variance and has the same dimension as the characteristic being studied.

The indicators considered give the absolute value of the variation, i.e. they evaluate it in the units of measurement of the characteristic being studied. Unlike them, the coefficient of variation measures variability in relative terms, relative to the average level, which in many cases is preferable.

Formula for calculating the coefficient of variation: $V = \frac{\sigma}{\bar{x}} \cdot 100\%$
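The four indicators of variation can be computed together. The following sketch works on ungrouped data and uses the six salaries from Example 1 as input; the function name `variation_indicators` and the dictionary keys are assumptions made for this illustration.

```python
import math

def variation_indicators(values):
    """Range, variance (definition and shortcut form), standard deviation and CV."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    # Shortcut form: mean of squares minus square of the mean.
    shortcut = sum(x * x for x in values) / n - mean ** 2
    sigma = math.sqrt(variance)
    return {
        "range": max(values) - min(values),
        "variance": variance,
        "shortcut_variance": shortcut,
        "std": sigma,
        "cv_percent": sigma / mean * 100,
    }

salaries = [3, 3.2, 3.3, 3.5, 3.8, 3.1]   # thousand rubles, from Example 1
stats = variation_indicators(salaries)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

The definition and the shortcut form give the same variance up to floating-point rounding; the shortcut is mainly a convenience for manual calculation.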

Examples of solving problems on the topic “Indicators of variation in statistics”

Problem 1 . When studying the influence of advertising on the size of the average monthly deposit in banks in the region, 2 banks were examined. The following results were obtained:

Determine:
1) for each bank: a) the average monthly deposit; b) the variance of the deposit;
2) the average monthly deposit for the two banks together;
3) the variance of the deposit for the two banks that depends on advertising;
4) the variance of the deposit for the two banks that depends on all factors except advertising;
5) the total variance using the addition rule;
6) the coefficient of determination;
7) the empirical correlation ratio.

Solution

1) Let's create a calculation table for a bank with advertising . To determine the average monthly deposit, we will find the midpoints of the intervals. In this case, the value of the open interval (the first) is conditionally equated to the value of the interval adjacent to it (the second).

We will find the average deposit size using the weighted arithmetic average formula:

29,000/50 = 580 rub.

We find the variance of the deposit using the formula:

23 400/50 = 468

We will perform similar actions for a bank without advertising :

2) Let's find the average deposit size for the two banks together: $\bar{x}$ = (580×50 + 542.8×50)/100 = 561.4 rub.

3) We find the variance of the deposit for the two banks that depends on advertising using the formula σ² = pq (the formula for the variance of an alternative attribute). Here p = 0.5 is the proportion of factors dependent on advertising and q = 1 − 0.5, so σ² = 0.5×0.5 = 0.25.

4) Since the share of the other factors is 0.5, the variance of the deposit for the two banks that depends on all factors except advertising is also 0.25.

5) Determine the total variance using the addition rule.

Average of the within-group variances: σ²rest = (468×50 + 636.16×50)/100 = 552.08

Between-group variance: σ²fact = [(580 − 561.4)²×50 + (542.8 − 561.4)²×50] / 100 = 34,596/100 = 345.96

σ² = σ²rest + σ²fact = 552.08 + 345.96 = 898.04

6) Coefficient of determination: η² = σ²fact / σ² = 345.96/898.04 = 0.39 = 39%, i.e. 39% of the variation in the size of the deposit is explained by advertising.

7) Empirical correlation ratio: η = √η² = √0.39 = 0.62, i.e. the relationship is fairly close.
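Steps 2 and 5 to 7 can be reproduced from the group sizes, means and variances stated in the solution. The sketch below takes those six numbers as given (the underlying deposit tables are not reproduced in this text); the variable names are illustrative.

```python
import math

# (number of deposits, mean deposit, variance) for the bank with advertising
# and the bank without advertising, as stated in the solution.
groups = [
    (50, 580.0, 468.0),
    (50, 542.8, 636.16),
]

n = sum(size for size, _, _ in groups)
overall_mean = sum(size * mean for size, mean, _ in groups) / n

# Average of within-group variances (residual variation).
within = sum(size * var for size, _, var in groups) / n
# Between-group variance (variation explained by the grouping factor).
between = sum(size * (mean - overall_mean) ** 2 for size, mean, _ in groups) / n

total = within + between            # variance addition rule
eta_squared = between / total       # coefficient of determination
eta = math.sqrt(eta_squared)        # empirical correlation ratio

print(round(overall_mean, 1), round(within, 2), round(between, 2), round(total, 2))
print(round(eta_squared, 2), round(eta, 2))
```

The printed values should match the figures in the solution: 561.4, 552.08, 345.96, 898.04, 0.39 and 0.62.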

Problem 2 . There is a grouping of enterprises according to the size of marketable products:

Determine: 1) the dispersion of the value of marketable products; 2) standard deviation; 3) coefficient of variation.

Solution

1) The problem gives an interval distribution series. It must be converted to a discrete one, that is, the middle of each interval (x′) must be found. For closed intervals, the middle is found as a simple arithmetic mean of the boundaries. For a group that has only an upper limit, it is the difference between this upper limit and half the width of the next interval (200-(400-200):2=100).

For a group that has only a lower limit, it is the sum of this lower limit and half the width of the previous interval (800+(800-600):2=900).

We calculate the average value of marketable products using the formula:

$\bar{x}$ = k×((Σ((x′-a):k)×f):Σf)+a. Here a = 500 is the variant with the highest frequency and k = 600-400 = 200 is the width of the interval with the highest frequency. The results are entered in the table:

So the average value of marketable output for the period under study is $\bar{x}$ = (-5:37)×200+500 = 472.97 thousand rubles.

2) We find the variance using the following formula:

σ² = (33/37)×200² - (472.97-500)² = 35,675.67 - 730.62 = 34,945.05

3) Standard deviation: σ = √σ² = √34,945.05 ≈ 186.94 thousand rubles.

4) Coefficient of variation: V = (σ/$\bar{x}$)×100 = (186.94/472.97)×100 = 39.52%
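The arithmetic of Problem 2 can be retraced from the two column sums quoted in the solution. Since the calculation table itself is not reproduced in this text, the sketch below takes Σ((x′-a)/k)·f = -5, Σ((x′-a)/k)²·f = 33 and Σf = 37 as given; the variable names are illustrative.

```python
import math

a = 500        # conditional zero: variant with the highest frequency
k = 200        # width of the interval with the highest frequency
n = 37         # total number of enterprises (sum of frequencies)
sum_u_f = -5   # sum of ((x' - a) / k) * f, as quoted in the solution
sum_u2_f = 33  # sum of ((x' - a) / k)**2 * f, as quoted in the solution

mean = k * sum_u_f / n + a
variance = sum_u2_f / n * k ** 2 - (mean - a) ** 2
sigma = math.sqrt(variance)
cv_percent = sigma / mean * 100

print(round(mean, 2), round(variance, 2), round(sigma, 2), round(cv_percent, 2))
```

The variance printed here differs from 34,945.05 in the second decimal place only because the solution rounds the mean to 472.97 before squaring; the standard deviation and the coefficient of variation agree after rounding.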

When the results of various kinds of research are processed statistically, the values obtained are often grouped into a sequence of intervals. To calculate summary characteristics of such sequences, it is sometimes necessary to find the midpoint of an interval, the “central variant”. The methods for calculating it are quite simple, but they have some peculiarities that depend both on the measurement scale used and on the nature of the grouping (open or closed intervals).

Instructions

1. If the interval is a section of a continuous number sequence, use the ordinary arithmetic mean to find its midpoint. Add the minimum value of the interval (its start) to the maximum (its end) and divide the sum by two. This rule applies, for example, to age intervals: the midpoint of the age interval from 21 to 33 years is 27 years, because (21+33)/2=27.

2. Sometimes it is more convenient to calculate the arithmetic mean of the upper and lower limits of the interval in another way. First determine the width of the range by subtracting the minimum value from the maximum value. Then divide the result by two and add it to the minimum value of the range. For example, if the lower limit is 47.15 and the upper limit is 79.13, the width of the range is 79.13-47.15 = 31.98. The midpoint of the interval is then 63.14, because 47.15+(31.98/2) = 47.15+15.99 = 63.14.

3. If the interval is not part of an ordinary number sequence, calculate its midpoint according to the periodicity and units of the measurement scale used. For example, for a historical period the midpoint of the interval is a specific calendar date. Thus, for the interval from January 1, 2012 to January 31, 2012, the midpoint is January 16, 2012.

4. In addition to ordinary (closed) intervals, statistical methods also work with “open” ones, for which one of the boundaries is not defined. For example, an open interval can be specified by the wording “50 years and older”. The midpoint in this case is determined by analogy: if all other intervals of the sequence have the same width, the open interval is assumed to have that width as well. Otherwise, you need to determine how the widths of the intervals preceding the open one change and derive its conditional width from the resulting trend.
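The four steps above can be sketched in code. The helper names `midpoint` and `open_upper_midpoint` are introduced here for illustration, and the width of 10 years used for the open interval “50 years and older” is a hypothetical neighbouring width, not a value from the text.

```python
from datetime import date

def midpoint(lower, upper):
    """Midpoint as lower + half the width; works for numbers and for dates."""
    return lower + (upper - lower) / 2

def open_upper_midpoint(lower, neighbour_width):
    """Midpoint of an interval open at the top, borrowing the neighbour's width."""
    return lower + neighbour_width / 2

print(midpoint(21, 33))                                   # age interval
print(round(midpoint(47.15, 79.13), 2))                   # decimal limits
print(midpoint(date(2012, 1, 1), date(2012, 1, 31)))      # calendar interval
print(open_upper_midpoint(50, 10))                        # "50 years and older"
```

Writing the midpoint as "lower plus half the width" rather than "(lower + upper) / 2" lets the same function handle dates, since two dates can be subtracted but not added.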

Occasionally in everyday life you may need to find the midpoint of a straight line segment, for example to make a pattern or a sketch of a product, or simply to saw a wooden block into two equal parts. Geometry and a little everyday ingenuity come to the rescue.

You will need

Compass, ruler; pin, pencil, thread

Instructions

1. Use ordinary length-measuring tools. This is the easiest way to find the midpoint of a segment. Measure the length of the segment with a ruler or tape measure, divide the result by two and mark off the resulting distance from one end of the segment. You will get the point corresponding to the middle of the segment.

2. There is a more accurate method for finding the midpoint of a segment, familiar from the school geometry course. For this, take a compass and a ruler; the ruler can be replaced by any object of suitable length with a straight edge.

3. Set the distance between the legs of the compass equal to the length of the segment, or at least greater than half of it. Then place the compass needle at one end of the segment and draw a semicircle that intersects the segment. Move the needle to the other end of the segment and, without changing the span of the compass, draw a second semicircle in exactly the same way.

4. You now have two points where the semicircles intersect, one on each side of the segment whose midpoint you want to find. Connect these two points using a ruler or a straight block. The connecting line passes exactly through the middle of the segment.

5. If you don’t have a compass at hand, or the length of the segment significantly exceeds the possible span of its legs, you can use a simple improvised device made from an ordinary pin, thread and pencil. Tie the ends of the thread to the pin and the pencil; the length of the thread should slightly exceed the length of the segment. With this improvised substitute for a compass, all that remains is to follow the steps described above.


Helpful advice
You can locate the middle of a board or block quite accurately using an ordinary thread or cord. Cut the thread so that it matches the length of the board or block, then fold it in half and cut it into two equal parts. Place one end of the resulting piece at the end of the object being measured; the other end will mark its middle.