Uniform continuous distribution in MS EXCEL. Uniform and exponential laws of distribution of a continuous random variable

This issue has long been studied in detail, and the most widely used approach is the polar method, based on the transform published by George Box and Mervin Muller in 1958 and later refined into its polar form by George Marsaglia. This method yields a pair of independent normally distributed random variables with mean 0 and variance 1 as follows:

Z0 = u·√(−2·ln(s) / s),    Z1 = v·√(−2·ln(s) / s)

where Z0 and Z1 are the desired values, s = u² + v², and u and v are random variables uniformly distributed on the interval (−1, 1), chosen so that the condition 0 < s < 1 is satisfied.
Many people use these formulas without a second thought, and many do not even suspect their existence, since they rely on ready-made implementations. But some have questions: “Where did this formula come from? And why do you get a pair of values at once?” In what follows, I will try to give a clear answer to these questions.


To begin with, let me remind you what the probability density, the distribution function of a random variable and the inverse function are. Suppose there is some random variable, the distribution of which is given by the density function f(x), which has the following form:

This means that the probability that the value of this random variable falls in the interval (A, B) equals the area of the shaded region. Consequently, the area of the entire region under the curve must equal one, since the value of the random variable always falls within the domain of f.
The distribution function of a random variable is an integral of the density function. And in this case, its approximate form will be as follows:

Here the meaning is that the value of the random variable will be less than A with probability B. Consequently, the function never decreases, and its values lie in the interval [0, 1].

An inverse function is a function that returns the argument of the original function when you pass it the value of the original function. For example, for the function x² the inverse is the square root, for sin(x) it is arcsin(x), and so on.

Since most pseudo-random number generators give only a uniform distribution at the output, it often becomes necessary to convert it to some other one. In this case, to a normal Gaussian:

The basis of all methods for transforming a uniform distribution into any other one is the inverse transformation method. It works as follows. A function inverse to the distribution function of the required distribution is found, and a random variable uniformly distributed on the interval (0, 1) is passed to it as an argument. The output is a value with the required distribution. For clarity, here is the following picture.
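As an illustration (my own sketch, not code from the article), the inverse transformation method is easiest to see with the exponential distribution, whose inverse function is simple:

```javascript
// Inverse transformation method for the exponential distribution.
// Its CDF is F(x) = 1 - exp(-lambda * x), so the inverse is
// F^{-1}(p) = -ln(1 - p) / lambda.
function exponential(lambda) {
    // Math.random() is uniform on [0, 1); using 1 - Math.random() avoids ln(0)
    return -Math.log(1 - Math.random()) / lambda;
}

// Quick sanity check: the mean of an exponential variable is 1 / lambda.
var n = 100000, sum = 0;
for (var i = 0; i < n; i++) sum += exponential(2.0);
console.log(sum / n); // should be close to 0.5
```

The same recipe works for any distribution whose inverse distribution function can be written down; the difficulty with the Gaussian is precisely that it cannot.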

Thus, a uniform segment is, as it were, smeared in accordance with the new distribution, being projected onto another axis through an inverse function. But the problem is that the integral of the density of the Gaussian distribution is not easy to calculate, so the above scientists had to cheat.

There is a chi-square distribution (Pearson distribution), which is the distribution of the sum of squares of k independent normal random variables. In the case k = 2, this distribution is exponential (with parameter λ = 0.5).

This means that if a point in a rectangular coordinate system has random X and Y coordinates distributed normally, then after converting these coordinates to the polar system (r, θ), the square of the radius (the distance from the origin to the point) will be distributed exponentially, since the square of the radius is the sum of the squares of the coordinates (by the Pythagorean theorem). The distribution density of such points on the plane will look like this:

f(x, y) = (1/(2π))·e^(−(x² + y²)/2)

Since it is equal in all directions, the angle θ will have a uniform distribution in the range from 0 to 2π. The converse is also true: if you specify a point in the polar coordinate system using two independent random variables (the angle distributed uniformly and the radius distributed exponentially), then the rectangular coordinates of this point will be independent normal random variables. And the exponential distribution from the uniform distribution is already much easier to obtain, using the same inverse transformation method. This is the essence of the Box-Muller polar method.
Now let's get the formulas.

X = r·cos(θ),    Y = r·sin(θ)    (1)

To obtain r and θ, it is necessary to generate two random variables uniformly distributed on the interval (0, 1) (let's call them u and v), and to convert the distribution of one of them (say v) to exponential to obtain the radius. The exponential distribution function looks like this:

F(x) = 1 − e^(−λx)
Its inverse function:

x = −ln(1 − F) / λ
Since the uniform distribution is symmetric, the transformation works just as well with the function

x = −ln(v) / λ
It follows from the chi-square distribution formula that λ = 0.5. Substituting λ and v into this function, we get the square of the radius, and then the radius itself:

r² = −2·ln(v),    r = √(−2·ln(v))
We obtain the angle by stretching the unit interval to 2π:

θ = 2π·u
Now we substitute r and θ into formulas (1) and get:

X = cos(2π·u)·√(−2·ln(v)),    Y = sin(2π·u)·√(−2·ln(v))    (2)

These formulas are ready to use. X and Y will be independent and normally distributed with a variance of 1 and a mean of 0. To get a distribution with other characteristics, it is enough to multiply the result of the function by the standard deviation and add the mean.
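As a sketch of mine (not the article's code; boxMuller is a name I chose), formulas (2) together with the scaling just described translate into JavaScript as follows:

```javascript
// Basic (trigonometric) Box-Muller transform: two uniform values in,
// two independent N(mean, dev^2) values out.
function boxMuller(mean, dev) {
    var u = Math.random();              // drives the angle
    var v = Math.random();              // drives the radius
    if (v === 0) v = Number.MIN_VALUE;  // guard against ln(0)
    var r = Math.sqrt(-2.0 * Math.log(v));
    var x = r * Math.cos(2 * Math.PI * u);
    var y = r * Math.sin(2 * Math.PI * u);
    // multiply by the standard deviation and add the mean, as in the text
    return [x * dev + mean, y * dev + mean];
}
```

Calling boxMuller(0, 1) yields a pair of standard normal values; boxMuller(10, 2) yields a pair with mean 10 and standard deviation 2.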
But it is possible to get rid of trigonometric functions by specifying the angle not directly, but indirectly through the rectangular coordinates of a random point in a circle. Then, through these coordinates, it will be possible to calculate the length of the radius vector, and then find the cosine and sine by dividing x and y by it, respectively. How and why does it work?
We choose a random point uniformly distributed in the circle of unit radius and denote the square of the length of this point's radius vector by the letter s:

s = x² + y²
The choice is made by assigning random rectangular coordinates x and y uniformly distributed in the interval (−1, 1), and discarding points that do not lie inside the circle, as well as the central point, at which the angle of the radius vector is undefined. That is, the condition 0 < s < 1 must hold. Then, as in the case of the Gaussian distribution on the plane, the angle θ will be distributed uniformly. This is obvious: the number of points in each direction is the same, so every angle is equally likely. But there is a less obvious fact — s will also be distributed uniformly. The resulting s and θ will be independent of each other. Therefore, we can use the value of s to obtain the exponential distribution without generating a third random variable. Now we substitute s into formulas (2) instead of v, and replace the trigonometric functions with their computation by dividing a coordinate by the length of the radius vector, which in this case is the square root of s:

X = u·√(−2·ln(s) / s),    Y = v·√(−2·ln(s) / s)
We get the formulas given at the beginning of the article. The disadvantage of this method is the rejection of points that do not fall inside the circle, that is, only 78.5% of the generated random variables are used. On older computers, avoiding trigonometric functions was still a big advantage. Now, when a single processor instruction computes sine and cosine simultaneously in an instant, I think these methods can still compete.

Personally, I have two more questions:

  • Why is the value of s uniformly distributed?
  • Why is the sum of squares of two normal random variables exponentially distributed?
Since s is the square of the radius (for simplicity, the radius is the length of the radius vector that specifies the position of a random point), let us first find out how the radii are distributed. Since the circle is filled uniformly, the number of points with radius r is proportional to the circumference of a circle of radius r, and the circumference is proportional to the radius. This means that the distribution density of the radii grows linearly from the center of the circle to its edge, and the density function has the form f(x) = 2x on the interval (0, 1); the coefficient 2 makes the area under the graph equal to one. When such a quantity is squared, its density becomes uniform, because by the change-of-variables rule the density must be divided by the derivative of the transformation function (that is, of x²). Visually it happens like this:
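This can also be checked empirically. The sketch below (my code, not the article's) buckets the squared radius s of accepted points into ten bins and shows that each bin gets roughly a tenth of the points:

```javascript
// Empirical check that s = x^2 + y^2 is uniform on (0, 1) for points
// uniformly distributed in the unit circle.
var bins = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
var total = 0;
while (total < 100000) {
    var x = 2 * Math.random() - 1;
    var y = 2 * Math.random() - 1;
    var s = x * x + y * y;
    if (s === 0 || s >= 1) continue; // the same rejection as in the polar method
    bins[Math.floor(s * 10)]++;
    total++;
}
// every bin should contain roughly 10% of the points
console.log(bins);
```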

If a similar transformation is done for a normal random variable, the density function of its square turns out to resemble a hyperbola. The sum of two squares of normal random variables is a much more complex matter involving double integration, and the fact that the result is an exponential distribution is something I can only verify empirically or accept as an axiom. For those who are interested, I suggest exploring the topic further in these books:

  • Wentzel E.S. Probability Theory
  • Knuth D.E. The Art of Computer Programming, Volume 2
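The empirical check mentioned above is easy to carry out. The sketch below (my code, not from the books listed) builds approximately normal values from the central limit theorem — sums of 12 uniform values minus 6 — so it does not rely on Box-Muller itself, and then confirms that the sum of two squared normals behaves like an exponential variable with mean 2:

```javascript
// Sum of 12 uniform variables minus 6 is approximately N(0, 1)
// (mean 0, variance 12 * 1/12 = 1).
function cltNormal() {
    var t = 0;
    for (var i = 0; i < 12; i++) t += Math.random();
    return t - 6;
}

var n = 200000, mean = 0, tail = 0;
for (var i = 0; i < n; i++) {
    var x = cltNormal(), y = cltNormal();
    var s = x * x + y * y;          // chi-square with k = 2
    mean += s;
    if (s > 2) tail++;              // for Exp with mean 2: P(s > 2) = e^(-1)
}
console.log(mean / n);              // close to 2
console.log(tail / n);              // close to 1/e ≈ 0.368
```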

In conclusion, I will give an example of the implementation of a normally distributed random number generator in JavaScript:

function Gauss() {
    this.ready = false;   // is a cached second value available?
    this.second = 0.0;
    this.next = function(mean, dev) {
        mean = mean === undefined ? 0.0 : mean;
        dev = dev === undefined ? 1.0 : dev;
        if (this.ready) {
            this.ready = false;
            return this.second * dev + mean;
        } else {
            var u, v, s;
            do {
                u = 2.0 * Math.random() - 1.0;
                v = 2.0 * Math.random() - 1.0;
                s = u * u + v * v;
            } while (s > 1.0 || s === 0.0);
            var r = Math.sqrt(-2.0 * Math.log(s) / s);
            this.second = r * u;   // cache the second value of the pair
            this.ready = true;
            return r * v * dev + mean;
        }
    };
}

g = new Gauss(); // create an object
a = g.next();    // generate a pair of values and get the first one
b = g.next();    // get the second
c = g.next();    // generate a pair of values again and get the first one
The mean (mathematical expectation) and dev (standard deviation) parameters are optional. I draw your attention to the fact that the logarithm is natural.

The distribution function in this case, according to (5.7), will take the form:

F(x) = (1/(s·√(2π))) · ∫ from −∞ to x of exp(−(t − m)²/(2s²)) dt
where: m is the mathematical expectation, s is the standard deviation.

The normal distribution is also called Gaussian, after the German mathematician Gauss. The fact that a random variable X has a normal distribution with parameters m and s is denoted N(m, s), where m = a = M[X].

Quite often in formulas the mathematical expectation is denoted by a. If a random variable is distributed according to the law N(0, 1), it is called a normalized or standardized normal variable. Its distribution function has the form:

F₀(x) = (1/√(2π)) · ∫ from −∞ to x of exp(−t²/2) dt.

The graph of the density of the normal distribution, which is called the normal curve or Gaussian curve, is shown in Fig. 5.4.

Fig. 5.4. Normal distribution density

Determining the numerical characteristics of a random variable from its density is considered in the following example.

Example 6.

A continuous random variable is given by the distribution density: f(x) = (1/(3√(2π)))·exp(−(x − 4)²/18).

Determine the type of distribution, find the mathematical expectation M(X) and the variance D(X).

Comparing the given distribution density with (5.16), we conclude that this is the normal distribution law with m = 4. Therefore, the mathematical expectation is M(X) = 4 and the variance is D(X) = 9.

Standard deviation s=3.

The Laplace function, which has the form:

Ф(x) = (1/√(2π)) · ∫ from 0 to x of exp(−t²/2) dt,

is related to the normal distribution function (5.17) by the relation:

F₀(x) = Ф(x) + 0.5.

The Laplace function is odd:

Ф(−x) = −Ф(x).

The values of the Laplace function Ф(x) are tabulated and looked up by the value of x (see Appendix 1).

The normal distribution of a continuous random variable plays an important role in probability theory and in describing reality; it is very widespread in natural random phenomena. In practice, one very often encounters random variables formed precisely as the sum of many random terms. In particular, analysis of measurement errors shows that they are sums of errors of various kinds, and practice shows that the probability distribution of measurement errors is close to the normal law.

Using the Laplace function, one can solve problems of calculating the probability of falling into a given interval and a given deviation of a normal random variable.
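As a sketch (my own code; laplace and probabilityInInterval are names I chose, and the trapezoid rule stands in for the printed table), the Laplace function can be computed numerically and used for exactly such interval problems:

```javascript
// Laplace function: (1 / sqrt(2*pi)) * integral from 0 to x of exp(-t^2/2) dt,
// approximated with the trapezoid rule. The function is odd: laplace(-x) = -laplace(x).
function laplace(x) {
    var sign = x < 0 ? -1 : 1;
    x = Math.abs(x);
    var n = 1000, h = x / n, sum = 0;
    for (var i = 0; i <= n; i++) {
        var t = i * h;
        var f = Math.exp(-t * t / 2) / Math.sqrt(2 * Math.PI);
        sum += (i === 0 || i === n) ? f / 2 : f; // half weight at the endpoints
    }
    return sign * sum * h;
}

// P(a < X < b) for X ~ N(m, s): difference of Laplace function values.
function probabilityInInterval(a, b, m, s) {
    return laplace((b - m) / s) - laplace((a - m) / s);
}

// For Example 6 (m = 4, s = 3): probability of falling within one sigma.
console.log(probabilityInInterval(1, 7, 4, 3)); // ≈ 0.6827
```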

Consider the uniform continuous distribution. Let us calculate its mathematical expectation and variance, generate random values using the MS EXCEL function RAND() and the Analysis ToolPak add-in, and estimate the mean and standard deviation.
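The same estimate can be sketched outside of Excel; in this illustration of mine, Math.random() in JavaScript stands in for RAND(), which is likewise uniform on [0, 1):

```javascript
// For a uniform variable on [a, b]: mean (a + b) / 2, variance (b - a)^2 / 12.
// For RAND() and Math.random(), a = 0 and b = 1, so mean 0.5 and variance 1/12.
var n = 100000, sum = 0, sumSq = 0;
for (var i = 0; i < n; i++) {
    var r = Math.random();
    sum += r;
    sumSq += r * r;
}
var mean = sum / n;
var variance = sumSq / n - mean * mean; // sample variance, biased form
console.log(mean);     // close to 0.5
console.log(variance); // close to 1/12 ≈ 0.0833
```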

A random variable uniformly distributed on the interval [a; b] has:

M(X) = (a + b)/2,    D(X) = (b − a)²/12
Let's generate an array of 50 numbers from the range )