What is a random variable? The concept of a random variable. The distribution law of a random variable. Tasks for independent work

Let a continuous random variable X be specified by the distribution function F(x), with distribution density f(x) = F'(x). Let us assume that all possible values of the random variable belong to the segment [a, b].

Definition. The mathematical expectation of a continuous random variable X whose possible values belong to the segment [a, b] is the definite integral

$M(X) = \int_a^b x f(x)\,dx,$

where f(x) is the distribution density of X.

If the possible values of the random variable are considered on the entire number line, the mathematical expectation is found by the formula

$M(X) = \int_{-\infty}^{\infty} x f(x)\,dx.$

In this case it is, of course, assumed that the improper integral converges absolutely.

Definition. The variance of a continuous random variable is the mathematical expectation of the square of its deviation from the mathematical expectation:

$D(X) = \int_{-\infty}^{\infty} [x - M(X)]^2 f(x)\,dx.$

By analogy with the variance of a discrete random variable, in practice the variance is calculated by the formula

$D(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - [M(X)]^2.$

Definition. The standard deviation is the square root of the variance:

$\sigma(X) = \sqrt{D(X)}.$
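The integral definitions above can be checked numerically. The following is a minimal sketch, not a general-purpose routine: the density f(x) = 2x on [0, 1] is a hypothetical example chosen for illustration, and the helper `integrate` is a simple midpoint rule written for this sketch.

```python
import math

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Hypothetical density (an assumption for illustration): f(x) = 2x on [0, 1].
a, b = 0.0, 1.0

def f(x):
    return 2.0 * x

total = integrate(f, a, b)                          # a density must integrate to 1
mean = integrate(lambda x: x * f(x), a, b)          # M(X)
# Variance by definition: expectation of the squared deviation.
var_def = integrate(lambda x: (x - mean) ** 2 * f(x), a, b)
# Variance by the practical formula: integral of x^2 f(x) minus M(X)^2.
var_short = integrate(lambda x: x * x * f(x), a, b) - mean ** 2
sigma = math.sqrt(var_short)                        # standard deviation

print(total, mean, var_def, var_short, sigma)
```

For this density the exact values are M(X) = 2/3 and D(X) = 1/18, so both variance formulas should agree with each other and with these values up to the integration error.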

Definition. The mode M_0 of a discrete random variable is its most probable value. For a continuous random variable, the mode is the value at which the distribution density has a maximum.

If the distribution polygon of a discrete random variable or the distribution curve of a continuous random variable has two or more maxima, the distribution is called bimodal or multimodal.

If a distribution has a minimum but no maximum, it is called antimodal.

Definition. The median M_D of a random variable X is the value relative to which it is equally probable that a larger or a smaller value of the random variable will be obtained: P(X < M_D) = P(X > M_D).

Geometrically, the median is the abscissa of the point at which the area bounded by the distribution curve is divided in half.

Note that if the distribution is unimodal and symmetric, then the mode and median coincide with the mathematical expectation.

Definition. The initial moment of order k of a random variable X is the mathematical expectation of the quantity X^k:

$\alpha_k = M[X^k].$

For a discrete random variable: $\alpha_k = \sum_{i=1}^{n} x_i^k p_i$.

For a continuous random variable: $\alpha_k = \int_{-\infty}^{\infty} x^k f(x)\,dx$.

The initial moment of the first order is equal to the mathematical expectation.

Definition. The central moment of order k of a random variable X is the mathematical expectation of the quantity $[X - M(X)]^k$:

$\mu_k = M\big[(X - M(X))^k\big].$

For a discrete random variable: $\mu_k = \sum_{i=1}^{n} (x_i - M(X))^k p_i$.

For a continuous random variable: $\mu_k = \int_{-\infty}^{\infty} (x - M(X))^k f(x)\,dx$.

The first-order central moment is always equal to zero, and the second-order central moment is equal to the variance. The third-order central moment characterizes the asymmetry of the distribution.

Definition. The ratio of the third-order central moment to the cube of the standard deviation is called the asymmetry coefficient (skewness): $a_x = \mu_3 / \sigma^3$.

Definition. To characterize the peakedness or flatness of a distribution, a quantity called the excess (kurtosis) is used: $C_x = \mu_4 / \sigma^4 - 3$.

In addition to the quantities considered, the so-called absolute moments are also used:

Absolute initial moment: $\beta_k = M[|X|^k]$.

Absolute central moment: $\nu_k = M[|X - M(X)|^k]$.

The absolute central moment of the first order is called the arithmetic mean deviation.
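As an illustration of the moment definitions, the sketch below computes initial, central and absolute central moments of a small discrete distribution, together with the asymmetry coefficient and the excess (taken here as the fourth central moment divided by the fourth power of the standard deviation, minus 3). The values and probabilities are hypothetical and chosen only to keep the arithmetic simple.

```python
import math

# Hypothetical distribution series (an assumption for illustration).
values = [1, 2, 3, 4]
probs = [0.1, 0.2, 0.3, 0.4]

def initial_moment(k):
    """Initial moment of order k: sum of x_i^k * p_i."""
    return sum(x ** k * p for x, p in zip(values, probs))

mean = initial_moment(1)  # the first initial moment is the expectation

def central_moment(k):
    """Central moment of order k: sum of (x_i - mean)^k * p_i."""
    return sum((x - mean) ** k * p for x, p in zip(values, probs))

def abs_central_moment(k):
    """Absolute central moment of order k: sum of |x_i - mean|^k * p_i."""
    return sum(abs(x - mean) ** k * p for x, p in zip(values, probs))

variance = central_moment(2)
sigma = math.sqrt(variance)
skewness = central_moment(3) / sigma ** 3      # asymmetry coefficient
excess = central_moment(4) / sigma ** 4 - 3    # excess (kurtosis)
mean_deviation = abs_central_moment(1)         # arithmetic mean deviation

print(mean, variance, skewness, excess, mean_deviation)
```

For this distribution the mean is 3 and the variance is 1, so the asymmetry coefficient equals the third central moment itself; its negative sign reflects the longer tail toward small values.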

Example. For the example discussed above, determine the mathematical expectation and variance of the random variable X.

Example. An urn contains 6 white and 4 black balls. A ball is drawn from it five times in a row; each time the drawn ball is returned and the balls are mixed. Taking the number of white balls drawn as the random variable X, construct the distribution law of this variable and determine its mathematical expectation and variance.

Since the ball is returned and the balls are mixed after each draw, the trials can be considered independent (the result of one trial does not affect the probability that the event occurs or does not occur in another trial).

Thus, the probability of a white ball appearing in each trial is constant and equal to p = 6/10 = 0.6, and the probability of a black ball is q = 1 - p = 0.4.

As a result of five consecutive trials, the white ball may not appear at all, or may appear once, twice, three, four or five times.

To construct the distribution law, we need to find the probability of each of these events using Bernoulli's formula $P_5(k) = C_5^k p^k q^{5-k}$.

1) The white ball does not appear at all: $P_5(0) = 0.4^5 = 0.01024$.

2) The white ball appears once: $P_5(1) = 5 \cdot 0.6 \cdot 0.4^4 = 0.0768$.

3) The white ball appears twice: $P_5(2) = 10 \cdot 0.6^2 \cdot 0.4^3 = 0.2304$.

4) The white ball appears three times: $P_5(3) = 10 \cdot 0.6^3 \cdot 0.4^2 = 0.3456$.
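The urn example can be checked numerically. The sketch below assumes the binomial model implied by the problem (5 independent draws with replacement, 6 white balls out of 10) and computes the whole distribution series with the standard library function `math.comb`, then the mathematical expectation and variance.

```python
from math import comb

n = 5        # number of draws
p = 6 / 10   # probability of a white ball in one draw (6 white out of 10)
q = 1 - p

# Distribution series: pmf[k] is the probability of exactly k white balls.
pmf = [comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum(k * k * pk for k, pk in enumerate(pmf)) - mean ** 2

for k, pk in enumerate(pmf):
    print(k, round(pk, 5))
print("M(X) =", round(mean, 5), "D(X) =", round(variance, 5))
```

The results should agree with the closed-form values for a binomial distribution, M(X) = np = 3 and D(X) = npq = 1.2.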

The random variable, as a fundamental concept of probability theory, is of great importance in its applications. It is an abstract expression of a random event. Moreover, it is sometimes more convenient to operate with random variables than with random events.

A random variable is a quantity that, as a result of an experiment, can take one or another value (but only one), and it is not known before the experiment which one.

Events are usually denoted by capital letters of the Latin alphabet and probability by the letter P, for example P(A). Realizations of a random variable are denoted by small letters: a_1, a_2, …, a_n.

Since probability theory and mathematical statistics deal with mass phenomena, a random variable is usually characterized by its possible values and their probabilities.

Among the random variables encountered in practice, discrete and continuous ones can be distinguished.

Discrete random variables are those that take only values separated from each other, which can be enumerated in advance. Examples: the number of cars on a given kilometer of road at a specific moment; the number of defective parts in a batch of n items.

For example, the number of cars on a given road section can take only the integer values 0, 1, 2, ..., n; it depends on the time of day and traffic intensity.

There are other types of random variables that are more common and have great practical significance.

A continuous random variable is one whose possible values continuously fill a certain interval of the number line. The interval can be finite or infinite. Examples of continuous random variables are the failure-free operating time of a car under given road conditions, the speed of a car on a given road, and measurement error.

Unlike those of a discrete variable, the possible values of a continuous random variable cannot be listed in advance, since they continuously fill a certain interval.

Random variables are usually denoted by capital letters of the Latin alphabet (X, Y, Z, T), and their possible values by the corresponding small letters x_i, y_i, z_i, t_i, where i = 1, 2, ..., n.

Consider a discrete random variable X with possible values x_1, x_2, …, x_n. As a result of repeated experiments, X can take each of the values x_i, i.e.:

X = x_1; X = x_2; ...; X = x_n.

Let us denote the probabilities of these events by the letter p with the corresponding indices:

P(X = x_1) = p_1; P(X = x_2) = p_2; ...; P(X = x_n) = p_n.

Since the events X = x_i form a complete group of mutually exclusive events (no other outcomes are possible), the sum of the probabilities of all possible values of the random variable X is equal to one: $\sum_{i=1}^{n} p_i = 1$.

This total probability is distributed in some way among the individual values of the random variable.

A discrete random variable is fully described from a probabilistic point of view if the probability of each of these events is indicated exactly, i.e. if this distribution is specified. This establishes the distribution law of the random variable.

The distribution law of a random variable is any relation that establishes a connection between the possible values of the random variable and their corresponding probabilities. Knowing it, one can judge before the experiment which values of the random variable will appear more often and which less often. The distribution law can be represented in different forms.

The simplest form of specifying the distribution law of a discrete random variable X is a table that lists the possible values of the variable and their corresponding probabilities:

x_i    x_1    x_2    ...    x_n
p_i    p_1    p_2    ...    p_n

Such a table is called the distribution series of the random variable X.

Distribution function

The distribution series is a complete and exhaustive characteristic of a discrete random variable. However, it is not universal, since it cannot be applied to continuous random variables. A continuous random variable takes an infinite (uncountable) number of values filling a certain interval, so it is impossible to compile a table listing all of its values. Consequently, for a continuous random variable there is no distribution series in the sense in which it exists for a discrete random variable.

How, then, can a continuous random variable be described?

For this purpose one uses not the probability of the event X = x but the probability of the event X < x, where x is some variable. The probability of this event depends on x and is a function of x.

This function is called the distribution function of the random variable X and is denoted F(x):

F(x) = P(X < x).

The distribution function is a universal characteristic of a random variable. It exists for any random variables: discrete and continuous.

Properties of the distribution function:

1. F(x) is a non-decreasing function: if x_1 > x_2, then F(x_1) ≥ F(x_2).

2. F(-∞) = 0.

3. F(+∞) = 1.

The distribution function of a discrete random variable is a discontinuous step function; jumps occur at points corresponding to possible values ​​of the random variable and are equal to the probability of these values. The sum of these jumps is equal to one.
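The step behaviour described above can be sketched in a few lines. The example below follows the convention used in this text, F(x) = P(X < x) with a strict inequality; the distribution series is hypothetical and serves only as an illustration.

```python
def make_cdf(values, probs):
    """Return F with F(x) = P(X < x) for the given distribution series."""
    pairs = list(zip(values, probs))

    def cdf(x):
        # Add up the probabilities of all possible values strictly below x.
        return sum(p for v, p in pairs if v < x)

    return cdf

# Hypothetical distribution series (an assumption for illustration).
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]

F = make_cdf(values, probs)

# The function is constant between possible values and jumps right after each one.
for x in [-1, 0, 0.5, 1, 1.5, 2, 3]:
    print(x, F(x))
```

With the strict inequality, the jump at a possible value x_i becomes visible only to the right of x_i: here F(1) is 0.25 and F(1.5) is 0.75, so the jump equals the probability 0.5 of the value 1.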

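[Figure: graph of the distribution function F(x) of a discrete random variable, a step function rising from 0 to 1.]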





Numerical characteristics of random variables.

The main forms of the distribution law for a discrete random variable are:

· the distribution function;

· the distribution series;

for a continuous random variable:

· the distribution function;

· the distribution density.

Each distribution law is a function, and specifying this function completely describes the random variable.

However, when solving a number of practical problems, it is not always necessary to characterize a random variable in full. It is enough to indicate only some numerical parameters characterizing the random variable.

Such characteristics, the purpose of which is to represent in a concentrated form the most significant features of the distribution, are called numerical characteristics of a random variable.

Position characteristics

(mathematical expectation, mode, median)

Of all the numerical characteristics of random variables, the most frequently used are those that describe the position of the random variable on the number line, i.e. that indicate some average value around which the possible values of the random variable are grouped.

The following characteristics are used for this purpose:

· mathematical expectation;

· mode;

· median.

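The mathematical expectation (mean value) is calculated as follows:

$M[X] = \frac{x_1 p_1 + x_2 p_2 + \dots + x_n p_n}{p_1 + p_2 + \dots + p_n} = \frac{\sum_{i=1}^{n} x_i p_i}{\sum_{i=1}^{n} p_i}.$

Since $\sum_{i=1}^{n} p_i = 1$, the mathematical expectation is $M[X] = \sum_{i=1}^{n} x_i p_i$.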

The mathematical expectation of a random variable is the sum of the products of all possible values ​​of a random variable and the probabilities of these values.

The above formulation is valid only for discrete random variables.

For continuous random variables,

$M[X] = \int_{-\infty}^{\infty} x f(x)\,dx$, where f(x) is the distribution density of X.

There are different ways to calculate the average. The most common forms of representing averages are arithmetic mean, median and mode.

The arithmetic mean is obtained by dividing the total value of a given characteristic for the entire homogeneous statistical population by the number of units of this population. To calculate the arithmetic average, the formula is used:

$\bar{X} = (X_1 + X_2 + \dots + X_n) / n,$

where X_i is the value of the characteristic for the i-th unit of the population and n is the number of units in the population.

The mode of a random variable is its most probable value.


Median is the value that is located in the middle of the ordered series. For an odd number of units in a series, the median is unique and is located exactly in the middle of the series; for an even number, it is defined as the average value of two adjacent units of the population occupying the middle position.
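The three averages can be illustrated with Python's standard `statistics` module. The two series below are hypothetical data chosen to show the odd and even cases of the median rule.

```python
import statistics

# Hypothetical observed series (assumptions for illustration).
odd_series = [3, 1, 4, 1, 5]        # odd number of units
even_series = [3, 1, 4, 1, 5, 9]    # even number of units

mean_odd = statistics.mean(odd_series)        # arithmetic mean: sum divided by n
median_odd = statistics.median(odd_series)    # middle element of the ordered series
median_even = statistics.median(even_series)  # average of the two middle elements
mode_odd = statistics.mode(odd_series)        # most frequent value

print(mean_odd, median_odd, median_even, mode_odd)
```

Ordered, the first series is 1, 1, 3, 4, 5, so its median is the middle element 3; the second is 1, 1, 3, 4, 5, 9, so its median is the average of 3 and 4.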

Statistics is a branch of science that studies the quantitative side of mass phenomena of public life that consist of individual elements, or units. The set of such elements constitutes a statistical population. The purpose of the study is to establish quantitative patterns in the development of the phenomenon. It is based on the application of probability theory and the law of large numbers. The essence of this law is that, despite the random fluctuations of individual elements of the population, a certain pattern characteristic of the population as a whole manifests itself in the total mass. The greater the number of individual elements considered, the more clearly the pattern inherent in the phenomenon is revealed.

Crime is a social, mass phenomenon; it is a statistical aggregate of numerous facts of individual criminal manifestations. This gives grounds to use methods of statistical theory to study it.

In statistical studies of social phenomena, three stages can be distinguished:

1) statistical observation, i.e. collection of primary statistical material;

2) summary processing of the collected data, during which the results are calculated, summary (summarizing) indicators are calculated and the results are presented in the form of tables and graphs;

3) analysis, during which the patterns of the statistical population under study are identified, the relationships between its various components, and a meaningful interpretation of generalizing indicators is carried out.

The first stage of statistical research is statistical observation. It plays a special role, since errors made during the data collection process are almost impossible to correct at further stages of work, which ultimately entails incorrect conclusions about the properties of the phenomenon being studied and their incorrect interpretation.

According to the method of recording facts, statistical observation is divided into continuous and discontinuous. Continuous, or current, observation is observation in which facts are established and recorded as they arise. With discontinuous observation, facts are recorded either regularly at certain intervals or as needed.

Based on the coverage of units of the surveyed population, complete and incomplete observation are distinguished. Complete observation is observation in which all units of the population being studied are subject to recording. For example, the registration of crimes theoretically represents complete observation. However, in practice a certain part of crimes, called latent, remains outside the statistical population under study, and therefore such observation is in fact incomplete. Incomplete observation is observation in which not all units of the population being studied are subject to registration. It is divided into several types: observation of the main array, sample observation and some others.

Observation of the main array is a type of incomplete observation in which the part of the units observed constitutes the overwhelming share of the entire set. This method is used when complete coverage of all units of the population involves special difficulties and, at the same time, excluding a certain number of units from observation does not significantly affect conclusions about the properties of the entire population. Therefore, the registration of crimes can most likely be attributed to this type of observation.

The most refined type of incomplete observation is sample observation, in which, in order to characterize the entire population, only a certain part of it is examined, selected according to certain rules. The main condition for correct sample observation is a selection such that the selected units characterize the entire population sufficiently accurately with respect to all the characteristics studied. Sample observation is used most often in sociological research. The rules and methods for selecting units in sample observation will be considered later.

After the primary material has been collected and verified, the second stage of statistical research is carried out. Statistical observation provides material characterizing individual units of the object of study. The purpose of the summary is to systematize and generalize the results of observation so that it becomes possible to identify the characteristic features and essential properties of the phenomena and processes being studied and to discover their patterns.

The simplest example of a summary is the summation of all reported crimes. However, such a generalization does not give a complete picture of all the properties of the crime situation. To characterize crime deeply and comprehensively, it is necessary to know how the total number of crimes is distributed by type, time, place and method of commission, etc.

The distribution of units of the object under study into homogeneous groups according to their essential characteristics is called statistical grouping. Objects studied by statistics are usually characterized by many properties and relationships, expressed by various characteristics. Therefore, the grouping of objects under study can be done depending on the objectives of the statistical study according to one or more of these characteristics. Thus, the personnel of the body can be grouped by positions, special ranks, age, length of service, marital status, etc.

As a result of processing and systematizing primary statistical materials, series of numerical indicators are obtained that characterize individual aspects of the phenomena or processes being studied, or their changes. These series are called statistical series. According to their content, statistical series are divided into two types: distribution series and dynamic series. Distribution series characterize the distribution of units of the original population according to some one characteristic, the varieties of which are arranged in a certain order. For example, the distribution of the total number of crimes by type, or of all personnel by position, are distribution series.

Dynamic series are series that characterize changes in the size of social phenomena over time. A detailed consideration of such series and their use in the analysis and forecast of the crime situation is the subject of a separate lecture.

The results of statistical observation and of the summary of its materials are expressed primarily in absolute values (indicators). Absolute values show the size of a social phenomenon under given conditions of place and time, for example, the number of crimes committed or the number of persons who committed them, the actual number of personnel or the number of vehicles. Absolute values are divided into individual and total. Individual absolute values express the size of quantitative characteristics of individual units of a particular set of objects (for example, the number of victims or the material damage in a specific criminal case, the age or length of service of a given employee, his salary, etc.). They are obtained directly in the process of statistical observation and are recorded in primary accounting documents. Individual absolute values serve as the basis of any statistical study.

In contrast to individual values, total absolute values characterize the final value of a characteristic for a certain set of objects covered by statistical observation. They are obtained either by directly counting the number of observation units (for example, the number of crimes of a certain type) or by summing the values of the characteristic over the individual units of the population (for example, the damage caused by all crimes).

However, absolute values ​​taken by themselves do not always provide a proper idea of ​​the phenomena and processes being studied. Therefore, along with absolute values, relative values ​​are of great importance in statistics.

Comparison is the main technique for assessing statistical data and an integral part of all methods of their analysis. However, a simple comparison of two quantities is not enough to accurately assess their relationship. This ratio must also be measured. The role of the measure of such a relationship is performed by relative quantities.

Unlike absolute values, relative values ​​are derived indicators. They are obtained not as a result of simple summation, but by relative (multiple) comparison of absolute values ​​with each other.

Depending on the nature of the phenomenon being studied and the specific objectives of the study, relative values may be expressed in different forms. The simplest form of a relative value is a number (integer or fraction) showing how many times one quantity is greater than another, taken as the base of comparison, or what part of it it makes up.

Most often, in the analytical activities of internal affairs bodies, another form of relative value is used: the percentage ratio, in which the base value is taken as 100. To determine the percentage ratio, the result of dividing one absolute value by another (the base) is multiplied by 100.

An important role in the summary processing of statistical data belongs to the average value. Since each individual unit of a statistical population has individual features that differ quantitatively from those of any other unit, the average value is used to characterize the properties of the statistical population as a whole. In statistics, the average value is understood as an indicator that reflects the level of a varying characteristic per unit of a homogeneous population.

To characterize the homogeneity of a statistical population with respect to a given characteristic, various indicators are used: variation, variance, standard deviation. These indicators make it possible to assess the extent to which the corresponding average value reflects the properties of the entire population, and whether it can be used at all as a generalizing characteristic of the given statistical population. A detailed consideration of these indicators is a separate issue.

Discrete random variable and the law of its distribution

Along with the concept of a random event, probability theory also uses the more convenient concept of a random variable.

Definition. A random variable is a quantity that, as a result of an experiment, takes one of its possible values, and it is not known in advance which one.

We will denote random variables by capital letters of the Latin alphabet (X, Y, Z, …) and their possible values by the corresponding small letters (x_i, y_i, …).

Examples: the number of points obtained when throwing a die; the number of heads in 10 coin tosses; the number of shots until the first hit on the target; the distance from the center of the target to the bullet hole.

Note that the sets of possible values of these random variables are of different types: for the first two variables the set is finite (6 and 11 values, respectively); for the third it is infinite and is the set of natural numbers; for the fourth it consists of all points of a segment whose length equals the radius of the target. Thus, for the first three variables the set of values consists of separate (discrete) values isolated from each other, and for the fourth it is a continuous region. By this criterion random variables are divided into two groups: discrete and continuous.

Definition. A random variable is called discrete if it takes separate, isolated possible values with certain probabilities. The number of possible values of a discrete random variable can be finite or infinite.

Definition. A random variable is called continuous if the set of its possible values completely fills some finite or infinite interval. The number of possible values of a continuous random variable is infinite.

To specify a discrete random variable, one needs to know its possible values and the probabilities with which these values are taken. The correspondence between them is called the distribution law of the random variable. It can be given as a table, a formula or a graph.

A table that lists the possible values of a discrete random variable and their corresponding probabilities is called a distribution series:

x_i    x_1    x_2    ...    x_n    (possible values)
p_i    p_1    p_2    ...    p_n    (probabilities of the possible values)

Note that the event that the random variable takes one of its possible values is certain; therefore $\sum_{i=1}^{n} p_i = 1$, or $p_1 + p_2 + \dots + p_n = 1$.

Task. A coin is tossed 5 times. The random variable X is the number of heads. Construct the distribution series of the random variable X.

Solution. Obviously, X can take 6 values: X = 0, 1, 2, 3, 4, 5. By the condition, n = 5 and p = q = 1/2. Let us calculate the probability of each value using Bernoulli's formula: $P_n(k) = C_n^k p^k q^{n-k}$.

Heads does not appear even once (k = 0): $P_5(0) = C_5^0 (1/2)^0 (1/2)^5 = 1/32$.

Heads appears once (k = 1): $P_5(1) = C_5^1 (1/2)^1 (1/2)^4 = 5/32$.

Heads appears twice (k = 2): $P_5(2) = C_5^2 (1/2)^2 (1/2)^3 = 10/32$.

Heads appears three times (k = 3): $P_5(3) = C_5^3 (1/2)^3 (1/2)^2 = 10/32$.

Heads appears four times (k = 4): $P_5(4) = C_5^4 (1/2)^4 (1/2)^1 = 5/32$.

Heads appears five times (k = 5): $P_5(5) = C_5^5 (1/2)^5 (1/2)^0 = 1/32$.

Therefore, the distribution series is:

x_i    0       1       2        3        4       5
p_i    1/32    5/32    10/32    10/32    5/32    1/32

The sum of the probabilities is equal to one: 1/32 + 5/32 + 10/32 + 10/32 + 5/32 + 1/32 = 1.
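The distribution series for the coin task can be reproduced exactly with rational arithmetic. This is a minimal sketch assuming a fair coin (p = 1/2) and five independent tosses; it uses `fractions.Fraction` so that the probabilities come out as exact fractions rather than rounded decimals.

```python
from fractions import Fraction
from math import comb

n = 5                  # number of tosses
p = Fraction(1, 2)     # probability of heads for a fair coin (assumed)
q = 1 - p

# Bernoulli's formula: P_n(k) = C(n, k) * p^k * q^(n - k)
series = {k: comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)}

for k, pk in series.items():
    print(k, pk)

# The probabilities of a distribution series must add up to one.
print("sum =", sum(series.values()))

# Points (x_i, p_i) of the distribution polygon.
polygon = sorted(series.items())
```

Note that `Fraction` reduces results automatically, so the probabilities for k = 2 and k = 3 are printed as 5/16 rather than 10/32.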

Graphically, the distribution law of a discrete random variable can be represented as a distribution polygon: a broken line connecting the points of the plane with coordinates (x_i, p_i). That is, the possible values of the random variable are plotted along the abscissa axis, and the probabilities of these values along the ordinate axis. For clarity, the resulting points are connected by straight segments. The distribution polygon, like the distribution series, completely characterizes the random variable and is one of the forms of the distribution law.

Educational institution "Belarusian State Agricultural Academy"

Department of Higher Mathematics

Guidelines for studying the topic “Random Variables” by students of the Faculty of Accounting for Correspondence Education (NISPO)

Gorki, 2013

Random variables

    Discrete and continuous random variables

One of the main concepts of probability theory is that of a random variable. A random variable is a quantity that, as a result of a trial, takes only one of its many possible values, and it is not known in advance which one.

Random variables are either discrete or continuous. A discrete random variable (DRV) is a random variable that takes a finite or countable number of values isolated from each other, i.e. one whose possible values can be enumerated. A continuous random variable (CRV) is a random variable whose possible values completely fill a certain interval of the number line.

Random variables are denoted by capital letters of the Latin alphabet X, Y, Z, etc. Possible values of random variables are denoted by the corresponding small letters.

The notation P(X = 5) = 0.28 means "the probability that the random variable X takes the value 5 is 0.28."

Example 1. A die is thrown once. The numbers from 1 to 6 may appear, indicating the number of points. Let us denote the random variable X = {number of points rolled}. As a result of the trial this random variable can take only one of six values: 1, 2, 3, 4, 5 or 6. Therefore, the random variable X is a DRV.

Example 2. When a stone is thrown, it travels a certain distance. Let us denote the random variable X = {flight distance of the stone}. This random variable can take any, but only one, value from a certain interval. Therefore, the random variable X is a CRV.

    Distribution law of a discrete random variable

A discrete random variable is characterized by the values it can take and the probabilities with which these values are taken. The correspondence between the possible values of a discrete random variable and their probabilities is called the distribution law of the discrete random variable.

If all possible values x_1, x_2, ..., x_n of the random variable X and the probabilities p_1, p_2, ..., p_n of these values are known, then the distribution law of the DRV X is considered known and can be written as a table:

x_i    x_1    x_2    ...    x_n
p_i    p_1    p_2    ...    p_n

The distribution law of a DRV can be depicted graphically by plotting the points (x_1, p_1), (x_2, p_2), …, (x_n, p_n) in a rectangular coordinate system and connecting them with straight line segments. The resulting figure is called a distribution polygon.

Example 3. Grain intended for cleaning contains 10% weeds. Four grains are selected at random. Let us denote the random variable X = {number of weeds among the four selected}. Construct the distribution law of the DRV X and the distribution polygon.

Solution. By the conditions of the example, n = 4, p = 0.1 and q = 0.9. Then, by Bernoulli's formula $P_4(k) = C_4^k p^k q^{4-k}$:

P(X = 0) = 0.9^4 = 0.6561; P(X = 1) = 4 · 0.1 · 0.9^3 = 0.2916; P(X = 2) = 6 · 0.1^2 · 0.9^2 = 0.0486; P(X = 3) = 4 · 0.1^3 · 0.9 = 0.0036; P(X = 4) = 0.1^4 = 0.0001.

Let us write down the distribution law of the DRV X as a table; the distribution polygon is obtained by connecting the points (x_i, p_i):

x_i    0         1         2         3         4
p_i    0.6561    0.2916    0.0486    0.0036    0.0001

    Expectation of a discrete random variable

The most important properties of a discrete random variable are described by its numerical characteristics. One of them is the mathematical expectation of the random variable.

Let the distribution law of the DRV X be known:

x_i    x_1    x_2    ...    x_n
p_i    p_1    p_2    ...    p_n

The mathematical expectation of the DRV X is the sum of the products of each value of this variable and the corresponding probability:
$M(X) = x_1 p_1 + x_2 p_2 + \dots + x_n p_n$.

The mathematical expectation of a random variable is approximately equal to the arithmetic mean of its observed values over a large number of trials. Therefore, in practical problems the average value of the random variable is often taken as its mathematical expectation.

Example 8. A shooter scores 4, 8, 9 and 10 points with probabilities 0.1, 0.45, 0.3 and 0.15. Find the mathematical expectation of the number of points scored with one shot.

Solution. Let us denote the random variable X = {number of points scored}. Then M(X) = 4 · 0.1 + 8 · 0.45 + 9 · 0.3 + 10 · 0.15 = 8.2. Thus, the expected average number of points scored with one shot is 8.2, and with 10 shots it is 82.
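The calculation in Example 8 can be written as a short function. The sketch below uses the values and probabilities given in the example; the function name `expectation` is introduced here only for illustration.

```python
def expectation(values, probs):
    """Mathematical expectation of a DRV: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

# Distribution law from Example 8: points scored and their probabilities.
values = [4, 8, 9, 10]
probs = [0.1, 0.45, 0.3, 0.15]

per_shot = expectation(values, probs)   # expected points for one shot
ten_shots = 10 * per_shot               # expected total for 10 shots

print(round(per_shot, 2), round(ten_shots, 2))
```

Multiplying by 10 relies on the additivity of the mathematical expectation: the expected total over 10 shots is the sum of 10 equal per-shot expectations.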

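The main properties of the mathematical expectation are:

1. M(C) = C, where C is a constant.

2. M(CX) = C · M(X).

3. M(aX + b) = a · M(X) + b, where a and b are constants.

4. M(X ± Y) = M(X) ± M(Y).

5. M(XY) = M(X) · M(Y), where X and Y are independent random variables.

The difference X - M(X) is called the deviation of the random variable X from its mathematical expectation. This difference is itself a random variable, and its mathematical expectation is zero, i.e. M[X - M(X)] = 0.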

    Variance of a discrete random variable

To characterize a random variable, in addition to the mathematical expectation, the variance is also used; it makes it possible to estimate the spread of the values of the random variable around its mathematical expectation. When comparing two homogeneous random variables with equal mathematical expectations, the “better” one is considered to be the one with the smaller spread, i.e. the smaller variance.

The variance of a random variable X is the mathematical expectation of the squared deviation of the random variable from its mathematical expectation: $D(X) = M\big[(X - M(X))^2\big]$.

In practical problems, the equivalent formula $D(X) = M(X^2) - [M(X)]^2$ is used to calculate the variance.

The main properties of the variance are:

1. D(C) = 0, where C is a constant.

2. D(CX) = C^2 · D(X).

3. D(X ± Y) = D(X) + D(Y), where X and Y are independent random variables.

The variance characterizes the spread of a random variable around its mathematical expectation and, as can be seen from the formula, is measured in squared units relative to the units of the random variable itself. Therefore, to bring the units of the measure of spread into line with the units of the variable itself, the standard deviation is introduced:
$\sigma(X) = \sqrt{D(X)}$.
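The two ways of computing the variance can be compared on a concrete distribution. As an illustration, the sketch below reuses the shooter's distribution from Example 8 (this choice of data is an assumption made here, not part of the variance section) and also checks the property D(CX) = C^2 · D(X) for a constant C = 3.

```python
import math

# Distribution law reused from Example 8 for illustration.
values = [4, 8, 9, 10]
probs = [0.1, 0.45, 0.3, 0.15]

def expectation(vals, ps):
    """Mathematical expectation: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(vals, ps))

mean = expectation(values, probs)

# Definition: expectation of the squared deviation from the mean.
var_def = expectation([(x - mean) ** 2 for x in values], probs)

# Equivalent practical formula: M(X^2) - [M(X)]^2.
var_short = expectation([x * x for x in values], probs) - mean ** 2

sigma = math.sqrt(var_def)  # standard deviation, in the units of X

# Property check: scaling X by a constant C scales the variance by C^2.
C = 3
scaled = [C * x for x in values]
var_scaled = expectation([x * x for x in scaled], probs) - expectation(scaled, probs) ** 2

print(round(var_def, 4), round(var_short, 4), round(sigma, 4), round(var_scaled, 4))
```

Both formulas should give the same variance, and the variance of 3X should be nine times larger.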

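Example 9. Find the variance and standard deviation of the DRV X given by the distribution law:

Solution. The variance of the DRV X is calculated by the formula $D(X) = M(X^2) - [M(X)]^2$.

First, find the mathematical expectation M(X) of this random variable. Then write down the distribution law of the random variable X^2 and find M(X^2). Finally, the variance is $D(X) = M(X^2) - [M(X)]^2$ and the standard deviation is $\sigma(X) = \sqrt{D(X)}$.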

Questions for self-control of knowledge

    What is a random variable?

    Which random variable is called discrete and which is called continuous?

    What is the distribution law of a discrete random variable called?

    What is the mathematical expectation of a discrete random variable and what are its main properties?

    What is the deviation of a random variable from its mathematical expectation?

    What is called the variance of a discrete random variable and what are its main properties?

    Why is the standard deviation introduced and how is it calculated?

Tasks for independent work