Probability

Probability (from the Latin probare, "to prove", "to test") is an assessment of how likely an event is. In mathematics, the study of probability is a subject of great importance with many applications. Relatively recent in the history of mathematics, it has seen many developments over the last three centuries. The study of the random and partly unpredictable aspect of certain phenomena, in particular games of chance, led mathematicians to develop a theory which went on to have implications in fields as varied as meteorology, finance and chemistry. This article is a simplified introduction to the important concepts and results of probability.

History

Contrary to what one might think, the scientific study of probability is relatively recent in the history of mathematics. Other fields such as geometry, arithmetic, algebra and astronomy were subjects of mathematical study in Antiquity, but one finds no trace of mathematical texts on probability.

The concept of "risk" appeared only in the 12th century, for the valuation of commercial contracts, with the Treatise on Contracts of Pierre de Jean Olivi, and developed in the 16th century with the spread of marine insurance contracts. It was nevertheless not until the 17th century, and the correspondence between Blaise Pascal and Pierre de Fermat, that the subject began to be treated scientifically, around minor problems on games of chance.

Probability from the 17th to the 19th century

Apart from some elementary considerations by Girolamo Cardano at the beginning of the 16th century and by Galileo at the beginning of the 17th century, the true beginning of the theory of probability dates from the correspondence between Pierre de Fermat and Blaise Pascal in 1654. They began to work out the foundations of the mathematical treatment of probability through the study of games of chance suggested, among others, by the Chevalier de Méré. Although they are regarded as the founders of the treatment of probability, they published nothing of their work, and it was Huygens who produced the first work on the subject.

Encouraged by Pascal, Christiaan Huygens published De ratiociniis in ludo aleae (On Reasoning in Games of Dice) in 1657. It was the first important work on probability. In it he defines the concept of expectation and uses it to treat several problems on the division of winnings in games and on draws from urns. Two founding works should also be noted: Ars Conjectandi by Jacques Bernoulli (posthumous, 1713), which defines the concept of random variable and gives the first version of the law of large numbers, and The Doctrine of Chances by Abraham de Moivre (1718), which generalizes the use of combinatorics.

The "theory of errors", which seeks to quantify the difference between the measurement one makes of a variable and its true value, and which prefigures the central limit theorems, first appeared in the Opera Miscellanea of Roger Cotes (posthumous, 1722). The first to apply it to errors of observation was Thomas Simpson, in 1755.

Pierre-Simon Laplace gave a first version of the central limit theorem in 1812, which applied only to a variable with two states, for example heads or tails, but not to a six-sided die.

Under the impetus of Quételet, who in 1841 opened the first statistical office, the Superior Council of Statistics, statistics developed and became a fully fledged field of mathematics, one which builds on probability but is no longer part of it.

Birth of the modern theory of probability

The modern theory of probability really took off only with the concepts of measure and measurable sets, which Émile Borel introduced in 1897. This notion of measure was completed by Henri Léon Lebesgue and his theory of integration. The first modern version of the central limit theorem was given by Aleksandr Lyapunov in 1901, and the first proof of the modern theorem by Paul Lévy in 1910. In 1902 Andrei Markov introduced Markov chains in order to generalize the law of large numbers to a sequence of experiments that depend on one another. Markov chains have since found many applications, among others in modelling diffusion and in the indexing of web sites by Google.

It was not until 1933 that the theory of probability ceased to be a collection of assorted methods and examples and became a true theory, axiomatized by Kolmogorov.

Kiyoshi Itô established a theory, and a lemma which bears his name, in the 1940s. These make it possible to connect stochastic calculus with partial differential equations, thus establishing the link between analysis and probability. The mathematician Wolfgang Doeblin had for his part sketched a similar theory before taking his own life upon the defeat of his battalion in June 1940. His work was sent to the Academy of Sciences in a sealed envelope, which was opened only in 2000.

Applications

Games of chance are the most natural application of probability, but many other fields rely on it or make use of it. Among others:
  • statistics, a vast field which relies on probability for the processing and interpretation of data;
  • game theory, which relies heavily on probability and is useful in economics, more precisely in microeconomics;
  • optimal estimation using Bayes' law, which underlies most applications of automatic decision-making (medical imaging, astronomy, character recognition, junk email filters);
  • in physics as in molecular biology, the study of Brownian motion for small particles, and the Fokker-Planck equation, which use concepts resting on stochastic calculus and the random walk;
  • epidemiology, which uses many aspects of probability and statistics for the study and forecasting of epidemics;
  • the study of random matrices, with implications in quantum mechanics and string theory;
  • financial mathematics, which makes broad use of the theory of probability for the study of stock market prices and derivative products, for example through the Black-Scholes model;
  • cryptography, which uses many probabilistic aspects, among others for cryptanalysis (the deciphering of encrypted texts), as well as probabilistic tests for determining whether a number is prime, which are used to generate encryption keys;
  • and even music.

Basic principles

The probability of an event A, $\mathbb{P}(A)$, is represented by a number between 0 and 1. An impossible event has probability 0 and a certain event has probability 1. The converse is not necessarily true: an event of probability 0 may very well occur if infinitely many different outcomes are possible. This is detailed in the article Negligible set, and an example of an event of probability 0 that can nevertheless occur is briefly sketched in the section "Law of large numbers".

$A \cup B$ is the union of A and B. $A \cap B$ is the intersection of A and B. $\mathbb{P}(A|B)$ is called the conditional probability of A given B: it is the probability of having A when one knows that B holds. When $\mathbb{P}(B) > 0$ it is given by $\mathbb{P}(A|B) = \mathbb{P}(A \cap B) / \mathbb{P}(B)$. For example, for a six-sided die, the probability of having a 2 (A) when it is known that the result is even (B) is $\frac{1/6}{1/2} = 1/3$, because the probability of having both a 2 and an even number is 1/6 and the probability of having an even number is 1/2. Note that here $A \cap B = A$, because the result is always even when it is 2.
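The die example above can be checked with a few lines of Python. This is only a sketch: the names OMEGA, prob and cond_prob are chosen here for illustration, and the die is assumed to be fair so that every face has probability 1/6.

```python
from fractions import Fraction

# Universe of a six-sided die, assumed fair: each outcome has probability 1/6.
OMEGA = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of OMEGA) under the uniform law."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability 0")
    return prob(a & b) / prob(b)

A = {2}          # "the result is 2"
B = {2, 4, 6}    # "the result is even"
print(cond_prob(A, B))  # prints 1/3
```

Exact fractions are used rather than floating-point numbers so that the result can be compared directly with the value 1/3 in the text.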

Concept of independence

See also: Independence (probabilities)

Two events A and B are independent if and only if the occurrence of one does not influence that of the other. In mathematical terms, this is formalized as follows:

Two events A and B are independent if and only if they satisfy
$$\mathbb{P}(A \cap B) = \mathbb{P}(A) \cdot \mathbb{P}(B)$$
This concept of independence appears in many theorems, for example in the law of large numbers and the central limit theorem presented below.
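As an illustration of the product criterion, the following sketch tests it on a few events of a six-sided die. The die is assumed to be fair, and the events EVEN, AT_MOST_TWO and TWO are examples chosen here, not taken from the article.

```python
from fractions import Fraction

# Universe of a six-sided die, assumed fair.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Probability of a subset of OMEGA under the uniform law."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def independent(a, b):
    """True when P(A ∩ B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

EVEN = frozenset({2, 4, 6})       # probability 1/2
AT_MOST_TWO = frozenset({1, 2})   # probability 1/3
TWO = frozenset({2})              # probability 1/6

# EVEN and AT_MOST_TWO: P(intersection) = 1/6 = 1/2 * 1/3, so independent.
# EVEN and TWO: P(intersection) = 1/6 but 1/2 * 1/6 = 1/12, so not independent.
print(independent(EVEN, AT_MOST_TWO), independent(EVEN, TWO))  # prints True False
```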

Random variable

See also: Random variable

An important concept in probability is that of the random variable. A random variable is a mapping which associates a value with each possible outcome of the experiment. A random variable thus takes one value or another according to the outcome obtained. What is random is neither the set of possible values of the variable, nor the value it takes once the outcome of the experiment is known, but the value it will take, considered before the experiment has been carried out.

Random variables were originally introduced to represent a gain. For example, consider the following experiment: toss a coin, and win ten euros if the result is tails or lose one euro if it is heads. Let G be the random variable which takes the value 10 when we obtain tails and the value -1 when we obtain heads. G represents the gain after one toss of the coin. More generally, a random variable is a function, usually denoted X, which depends on the outcome of a random experiment, in this case the result of the coin toss.

Distribution function and density

See also: Probability density

In probability, the distribution function of a random variable X is the function $F_X$ which associates with any real number x the value $F_X(x) = \mathbb{P}(X \le x)$. It is the probability that the variable X is less than or equal to x. The distribution function of a variable is a non-decreasing function going from 0 to 1.

For continuous variables one then defines the probability density, or law, of a variable as the function $f_X$ which is the derivative of $F_X$ with respect to x: $f_X(x) = \frac{d\,\mathbb{P}(X \le x)}{dx}$. Knowledge of the density then makes it possible to calculate, for example, the probability that X lies between a and b by integrating: $\mathbb{P}(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a)$.
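The relation between density and distribution function can be checked numerically. The sketch below assumes, purely as an example not taken from the article, the exponential law with density exp(-x) on x >= 0. It compares the integral of the density over [a, b] with the difference of the distribution function at b and a. The helper integrate is a simple midpoint rule written for this illustration.

```python
import math

def density(x):
    """Density f(x) = exp(-x) for x >= 0 (exponential law, used only as an example)."""
    return math.exp(-x) if x >= 0 else 0.0

def cdf(x):
    """Distribution function F(x) = P(X <= x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

a, b = 0.5, 2.0
by_integration = integrate(density, a, b)   # integral of the density over [a, b]
by_cdf = cdf(b) - cdf(a)                    # F(b) - F(a)
print(by_integration, by_cdf)               # the two values agree up to numerical error
```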

Expectation

See also: Expectation

The expectation is a number which often coincides with the mean of a variable; on this subject see the law of large numbers below. It is defined by:

$\mathbb{E}(X) = \sum_{i=1}^{k} p_i \, x_i$ for a variable with a finite number of possible values.
For example, for a six-sided die, each face has probability 1/6 of appearing and the expectation is then $\frac{1+2+3+4+5+6}{6} = 3.5$.
$\mathbb{E}(X) = \int_{\mathbb{R}} x \, f(x) \, dx$ for a continuous variable with density f.
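The finite formula can be applied directly in code. The sketch below computes the expectation of the six-sided die and of the gain G from the coin example in the section on random variables. Both the die and the coin are assumed to be fair, which the coin example does not state explicitly, and the function name expectation is chosen here for illustration.

```python
from fractions import Fraction

def expectation(law):
    """E(X) = sum of p_i * x_i, for a law given as a dict {value: probability}."""
    assert sum(law.values()) == 1, "probabilities must sum to 1"
    return sum(p * x for x, p in law.items())

# Six-sided die, assumed fair.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Gain G: +10 on tails, -1 on heads, coin assumed fair.
gain = {10: Fraction(1, 2), -1: Fraction(1, 2)}

print(expectation(die), expectation(gain))  # prints 7/2 9/2
```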

Two basic theorems of probability

Two mathematical theorems hold a special place in probability: the law of large numbers and the central limit theorem. They are presented here briefly, to give an idea of their interest and use.

Law of large numbers

See also: Law of large numbers

Only the strong law of large numbers is presented here, but other versions of the law of large numbers exist.

For independent random variables $X_i$ with the same law and whose expectation exists:

$$\frac{X_1 + X_2 + \dots + X_n}{n} \underset{n \to \infty}{\longrightarrow} \mathbb{E}(X)$$

Concretely, this law tells us that the empirical mean of a variable tends towards its expectation. For example, for a six-sided die thrown many times in a row, the average of the throws tends towards the expectation 3.5.

"Tends towards" is to be understood in the almost sure sense, as is very often the case in probability, i.e. the probability that the convergence takes place is equal to 1. As sketched in the basic principles, it may very well happen that "exceptionally" this average does not tend towards the expectation. One could, for example, throw nothing but 1s, so that the average would be 1, but this "never" happens. In general, if one throws a die enough times, each of the 6 faces will come up about equally often. The theorem formalizes this common-sense remark.
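The behaviour described for the die can be observed by simulation. The following is a minimal sketch: it assumes a fair die simulated with Python's pseudo-random generator, and the function name mean_of_throws and the seed are arbitrary choices made here. A simulation illustrates the law but does not prove it.

```python
import random

def mean_of_throws(n_throws, seed=0):
    """Empirical mean of n_throws simulated throws of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    total = 0
    for _ in range(n_throws):
        total += rng.randint(1, 6)  # uniform integer between 1 and 6 inclusive
    return total / n_throws

# The empirical mean should get closer to the expectation 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_throws(n))
```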

Central limit theorem

See also: Central limit theorem

The central limit theorem describes how the gap between the observed values of a variable and its mean value behaves. The law of large numbers shows that the average of the observations tends towards the expectation; the central limit theorem, for its part, shows in what way this average tends towards the expectation. A simple, though not very rigorous, way of writing the theorem helps to understand its usefulness:

$$\frac{\sum_{i=1}^{n} \left(X_i - \mathbb{E}(X)\right)}{n} \underset{n \to \infty}{\longrightarrow} \mathcal{N}\left(0, \frac{\sigma}{\sqrt{n}}\right)$$

$\mathcal{N}\left(0, \frac{\sigma}{\sqrt{n}}\right)$ is the centered normal law with standard deviation $\frac{\sigma}{\sqrt{n}}$, also called Gaussian, where σ is the standard deviation of X. This theorem is very useful, in physics for example. It can be understood as "the average of the observed errors tends towards a normal law". The sum of a large number of errors on observations, for example, is almost Gaussian; it would be exactly Gaussian if infinitely many errors were summed, but in practice that is rarely the case. The Gaussian law then provides an approximation to the law of the error that is often easier to use than the exact law, which is not always known. Moreover, a good number of natural phenomena result from the superposition of many more or less independent causes which add together. It follows that the normal law represents them reasonably well.

To be more correct, the central limit theorem should be written as follows:

$$\frac{1}{\sigma \sqrt{n}} \sum_{i=1}^{n} \left(X_i - \mathbb{E}(X)\right) \underset{n \to \infty}{\longrightarrow} \mathcal{N}(0, 1)$$

where the limit is taken in the sense of convergence in law, i.e. the distribution of the left-hand term tends towards the distribution of a Gaussian.
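The normalized form of the theorem can be illustrated by simulation. The sketch below assumes a fair six-sided die, for which the expectation is 3.5 and the variance is 35/12, and draws many standardized sums of 50 throws. If the Gaussian approximation is good, the samples should have mean close to 0, standard deviation close to 1, and about 95% of them should fall within [-1.96, 1.96]. The sample sizes and the seed are arbitrary choices made here.

```python
import math
import random
import statistics

MEAN = 3.5                    # expectation of a fair six-sided die
SIGMA = math.sqrt(35 / 12)    # its standard deviation (variance 35/12)

def standardized_sum(n, rng):
    """(1 / (sigma * sqrt(n))) * sum of (X_i - E(X)) over n simulated throws."""
    s = sum(rng.randint(1, 6) - MEAN for _ in range(n))
    return s / (SIGMA * math.sqrt(n))

rng = random.Random(0)  # fixed seed for reproducibility
samples = [standardized_sum(50, rng) for _ in range(5_000)]

# Fraction of samples inside the interval that holds about 95% of a N(0, 1) law.
inside = sum(abs(z) <= 1.96 for z in samples) / len(samples)
print(statistics.mean(samples), statistics.pstdev(samples), inside)
```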

There also exist many generalizations of this theorem, among others for variables that are not identically distributed (Lyapunov's conditions or Lindeberg's conditions) or for variables with infinite variance (due to Gnedenko and Kolmogorov).

The use of combinatorics in probability

See also: Combinatorics

Many probability problems reduce to a counting calculation, in particular those in which there is a finite number of outcomes and every outcome has the same probability. The difficulty of calculating probabilities then lies only in determining the number of possible cases and the number of cases favourable to the occurrence of an event. The probability of the event is then the ratio of these two numbers.
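A counting calculation of this kind is easy to carry out by enumeration. As an example chosen here (it does not appear in the article), the sketch below computes the probability that two six-sided dice, assumed fair, sum to 7, by counting favourable and possible cases.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs of faces for two dice: 6 * 6 = 36 equally likely cases.
outcomes = list(product(range(1, 7), repeat=2))

# Favourable cases: pairs whose sum is 7.
favourable = [pair for pair in outcomes if sum(pair) == 7]

# Probability = number of favourable cases / number of possible cases.
p_seven = Fraction(len(favourable), len(outcomes))
print(len(favourable), len(outcomes), p_seven)  # prints 6 36 1/6
```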

Probability space and the axioms of probability

See also: Axioms of probability

See also: Probability space

The study of probability in mathematics takes place in a probability space, described below. With this probability space one can then define the axioms of probability developed by Kolmogorov, which serve as the basis for the mathematical study of probability.

A probability space has three parts:

  • a universe $\Omega$: the universe is the set of all possible outcomes of the random experiment. For example, for a six-sided die the universe is Ω = {1, 2, 3, 4, 5, 6}.
  • a set of events $\mathcal{B}$: it is a σ-algebra (a "tribe") on the universe Ω. This set contains all the possible events, in the broad sense. For example, for a six-sided die it contains the possibility of getting a 1 or a 2: {1, 2}, the possibility of getting no result at all: the empty set $\emptyset$, and the possibility of getting any face of the die: {1, 2, 3, 4, 5, 6}. In general, in probability one is content to take the Borel σ-algebra. As an example, the Borel σ-algebra for the result of a four-sided die is given here (the one for the six-sided die is even larger but follows the same principle):
{ø, {1}, {2}, {3}, {4}, {1,2}, {1,3}, {1,4}, {2,3}, {2,4}, {3,4}, {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, {1,2,3,4}}. Note that this σ-algebra contains the empty set ø and Ω = {1,2,3,4}. This is the case for every σ-algebra.
  • a measure $\mathbb{P}$: this measure, or probability, gives the probability of realizing each of the elements of $\mathcal{B}$. This probability lies between 0 and 1 for every element of $\mathcal{B}$: this is the first axiom of probability. For example, for a six-sided die: the probability of {1} is 1/6; the probability of Ω = {1, 2, 3, 4, 5, 6}, i.e. of getting any one of the 6 faces, is 1 (this is always the case: it is the second axiom of probability); the probability of the empty set ø is 0. This too is always the case; it is a consequence of the axioms of probability.

Finally, for pairwise disjoint events (i.e. events whose pairwise intersections are empty) $A_1, A_2, A_3, \dots$, the probability of their union is the sum of their probabilities, or, in mathematical notation,

$$\mathbb{P}\left(A_1 \cup A_2 \cup A_3 \cup \cdots\right) = \mathbb{P}(A_1) + \mathbb{P}(A_2) + \mathbb{P}(A_3) + \cdots$$

This is the third and last axiom of probability. For example, still for a six-sided die, the probability of getting a 1 or a 2 is $\mathbb{P}(\{1,2\}) = \mathbb{P}(\{1\}) + \mathbb{P}(\{2\}) = 2/6$.
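The three parts of a probability space can be written out explicitly for the four-sided die. The sketch below is an illustration under stated assumptions: the die is fair, the set of events is taken to be the set of all subsets of Ω (the 16 sets listed above), and the names power_set and prob are chosen here. On a finite universe the third axiom reduces to additivity for two disjoint events, which is what the code checks.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})  # universe of a four-sided die

def power_set(universe):
    """All subsets of the universe, from the empty set up to the universe itself."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def prob(event):
    """Uniform probability measure: each face has probability 1/4 (fair die assumed)."""
    return Fraction(len(event), len(OMEGA))

events = power_set(OMEGA)

# Axiom 1: every event has a probability between 0 and 1.
axiom1 = all(0 <= prob(e) <= 1 for e in events)
# Axiom 2: the whole universe has probability 1.
axiom2 = prob(OMEGA) == 1
# Axiom 3 (finite form): probabilities add up over disjoint events.
axiom3 = all(prob(a | b) == prob(a) + prob(b)
             for a in events for b in events if not (a & b))

print(len(events), axiom1, axiom2, axiom3)  # prints 16 True True True
```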

Stochastic calculus

See also: Stochastic calculus

A stochastic process is a random process which depends on time. Stochastic processes take the place of ordinary differential equations when randomness comes into play. In discrete time, these processes are known as time series and are useful, among other things, in econometrics.

A random process X is a family of random variables indexed by a subset of $\mathbb{R}$ or $\mathbb{N}$, often identified with time. It is thus a function of two variables: time and the state of the world (an outcome ω). The set of states of the world, the universe, is traditionally denoted Ω. The mapping which associates X(ω, t) with t is called the trajectory of the process. Brownian motion is a particularly simple example of a random process indexed by $\mathbb{R}$. It can also be seen as the limit of a random walk when the time step tends towards 0.

Examples of the use of stochastic processes include econometrics, Brownian motion, stock market fluctuations and voice recognition.

Markov chains

See also: Markov chain

A Markov chain is a stochastic process having the Markov property. In such a process, predicting the future from the present does not require knowledge of the past. Only Markov chains in discrete time are considered here, but a generalization to continuous time exists. A chain in discrete time is a sequence $X_0, X_1, X_2, \dots$ of random variables, the value $X_n$ being the state of the process at time n. The conditional probability distribution of $X_{n+1}$ given the past states is a function of $X_n$ only:

$$P(X_{n+1} = x \mid X_0, X_1, X_2, \ldots, X_n) = P(X_{n+1} = x \mid X_n)$$

where x is any state of the process. The identity above is the Markov property for the particular case of a chain in discrete time. The probability $P(X_{n+1} = x \mid X_n = y)$ is called the transition probability from y to x: it is the probability of going from y to x at time n.

The Markov property is the opposite of the concept of hysteresis, in which the current state depends on the history and not only on the current position. Markov chains appear in the study of the random walk and have many fields of application: anti-spam filters, Brownian motion, the ergodic hypothesis, information theory, pattern recognition, the Viterbi algorithm used in mobile telephony, etc.

See also: Random walk

Among the particular cases of Markov chains is the random walk, which is used in particular in the study of diffusion and of the game of heads or tails. A random walk is a Markov chain in which the transition probability depends only on x − y. In other words, it is a Markov chain for which $P(X_{n+1} = x \mid X_n = y) = f(x - y)$.

A game of heads or tails in which one stakes 1 on each toss is, for example, a random walk. If one has y after n tosses, $P(X_{n+1} = x \mid X_n = y) = 1/2$ if $x - y = +1$ or $-1$, and 0 otherwise (one has one chance in two of winning 1 and one chance in two of losing 1).
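The heads-or-tails walk can be sketched in a few lines. The code below is an illustration only: the function names, the starting value 0 and the seed are assumptions made here. The transition probability is written as a function of x − y alone, and each step of the simulation uses only the current state, in line with the Markov property.

```python
import random

def transition_prob(x, y):
    """P(X_{n+1} = x | X_n = y) for the +1/-1 walk: it depends only on x - y."""
    return 0.5 if x - y in (1, -1) else 0.0

def simulate_walk(n_steps, start=0, seed=0):
    """Simulate the fortune of a player who wins or loses 1 on each toss."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    path = [start]
    for _ in range(n_steps):
        # The next state is drawn from the current state only (Markov property).
        path.append(path[-1] + rng.choice((1, -1)))
    return path

path = simulate_walk(20)
print(path)
```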

Stochastic differential equations

See also: Stochastic differential equation

Stochastic differential equations are a form of differential equation incorporating a white noise term. For example, at first order: $\frac{dX(t)}{dt} = \mu(X(t)) + \sigma(X(t))\,\xi(t)$

By analogy with physics, $\mu(X(t))$ is the mean velocity at the point X(t) and σ is related to the diffusion coefficient. Itô's lemma and the Itô integral then make it possible to pass from these stochastic equations to classical partial differential equations or to integral equations. For example, using Itô's lemma one obtains, for the probability p(x, t) of being at the point x at time t:

$$\frac{\partial p(x,t)}{\partial t} = \frac{\sigma(x)^2}{2} \frac{\partial^2 p(x,t)}{\partial x^2} + \mu(x) \frac{\partial p(x,t)}{\partial x}$$

This lemma is particularly important because it establishes the link between the study of stochastic equations and the partial differential equations of analysis. Among other things, it makes it possible to obtain the Fokker-Planck equation in physics, to treat Brownian motion by means of classical partial differential equations, and to model stock market prices in financial mathematics.
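A trajectory of such an equation can be approximated numerically. The sketch below uses the Euler-Maruyama scheme, a standard discretization that is not described in this article, and it assumes that the white noise term is read in the Itô sense, so that over a small time step dt the noise contributes a Gaussian increment of variance dt. The drift μ(x) = −x and the constant σ = 0.5 are hypothetical choices made only for the example.

```python
import math
import random

def euler_maruyama(mu, sigma, x0, t_end, n_steps, seed=0):
    """Approximate one trajectory of dX = mu(X) dt + sigma(X) dW on [0, t_end]."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment over one step
        x = x + mu(x) * dt + sigma(x) * dw    # drift step plus noise step
        path.append(x)
    return path

# Hypothetical coefficients: drift pulling X back towards 0, constant noise level.
path = euler_maruyama(mu=lambda x: -x, sigma=lambda x: 0.5,
                      x0=1.0, t_end=1.0, n_steps=1_000)
print(path[-1])
```

With σ set to zero the scheme reduces to the ordinary Euler method for dX/dt = μ(X), which gives a simple way to check the drift part against a known solution.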

Probability in epistemology

The theory of probability is an important branch of mathematics which is used to describe and quantify uncertainty. Uncertainty may arise from our ignorance, be due to confusion or misunderstanding, or be caused by the fundamentally random character of nature. In every case, we measure the uncertainty of events on a scale from zero (for impossible events) to one (for certain events).

There are two ways of considering probabilities. The first, historically, consisted in carrying out combinatorial calculations in the case of games of chance (Pascal, Bernoulli, Pólya…); this approach can be described as objectivist. The second, which began to spread around 1974, is founded on the Cox-Jaynes theorem, which shows under reasonable assumptions that any learning mechanism is either isomorphic to the theory of probability or inconsistent. In this second approach, probability is regarded as the numerical translation of a state of knowledge and is thus a subjective value (though one obtained by a rational process); the subjectivity is explained by the fact that the context in which an event is interpreted differs from one person to another. This is the Bayesian school.

The idea of probability is generally separated into two concepts:

  1. aleatory probability, which represents the probability of future events whose occurrence depends on some random physical phenomenon, such as getting an ace when throwing a die or getting a certain number when spinning a wheel;
  2. epistemic probability, which represents the uncertainty we have about assertions when we lack complete knowledge of the circumstances and causes. Such propositions may have been verified on past events or may turn out to be true in the future, but they are not themselves verified. Some examples of epistemic probabilities are:
* assigning a probability to the assertion that a proposed law of physics is true;
* determining how "probable" it is that a suspect committed a crime, on the basis of the evidence presented.

Is probability reducible to our inability to predict precisely which forces might affect a phenomenon, or is it part of the nature of reality itself, as quantum mechanics suggests? The question remains open to this day (see also Uncertainty principle).

Although the same mathematical rules apply whatever interpretation is chosen, the choice has important philosophical implications: do we ever speak about the real world (and do we have the right to speak about it?) or simply about the representations we have of it? Since we cannot by definition distinguish the real world from what we know of it, it is of course impossible to settle the matter: the question is for us, by nature, subjective (see also free will).

Rigorous mathematical descriptions of this type of problem have emerged only recently, in particular since

To give one possible, and admittedly reductive, mathematical meaning to a probability, consider a coin that you toss. Intuitively, we take the probability of getting heads on any toss of the coin to be 1/2; but what does this sentence mean operationally? If we toss the coin 9 times in a row, it obviously cannot land "four and a half times" on each side; it is even possible to get 6 heads and 3 tails, or even 9 heads in a row. What does the ratio 1/2 mean in this context, and what exactly can we do with it?

Frequentist approaches

An initial approach was to use what would later be formalized as the law of large numbers: we suppose that we carry out a certain number of tosses of a coin, each toss being independent, which means that the outcome of each toss is not affected by the preceding tosses. This is what is called the frequentist model.

If we carry out N tosses of the coin and $N_F$ denotes the number of times the coin gives heads, then for any N we can consider the proportion $N_F / N$.

As N becomes larger and larger, we expect in our example that the ratio $N_F / N$ gets closer and closer to 1/2. This suggests defining the probability P(F) of getting heads as the limit, as N tends to infinity, of the sequence of proportions:

$$P(F) = \lim_{N \to \infty} \frac{N_F}{N}$$

In practice, we cannot of course toss a coin infinitely many times; so in general this formula applies to situations in which we have already assigned a probability a priori to a particular outcome (in this case, we assumed that the coin was fair and thus that the probability of getting heads was 1/2). The law of large numbers then says that, for a given probability P(F) and any strictly positive real number ε, however small, there exists (almost surely) a number n such that for all N > n one has,

$$\left| P(F) - \frac{N_F}{N} \right| < \varepsilon$$

In other words, by saying that "the probability of getting heads is 1/2", we mean that the ratio of the number of heads to the total number of tosses will become arbitrarily close to 1/2 as the number of tosses increases.

Limits of the frequentist approach

Restriction to repeatable phenomena

Suppose a coin or a die is made of ice: one assigns to "tails" in the one case, and to the ace in the other, a probability of 1/2 and 1/6 respectively. Yet one can hope to throw them, at room temperature at any rate, only a very limited number of times, and there is no hope of making any observation related to the law of large numbers. Must we therefore refrain from using probabilities in their case?

Admittedly, one can imagine throws with thousands of other similar coins or dice in order to recover large numbers, but since these exist only in our mental representation, we are indeed dealing with states of knowledge.

It is clear that in reliability engineering one is entitled to speak of a probability of 10^-9 without carrying out a billion tests. That does not mean that there will be no failure at the second or third test… The meaning of probability, in this case, is problematic.

The mistaken idea that a probability is necessarily objective

This is illustrated by the paradox of the prospecting trucks:

Since oil drilling is expensive, it is preceded by prospecting campaigns that estimate the probability of finding oil when drilling at a given place. Depending on its value, on the costs and on the estimated reserves (themselves probabilistic), this probability leads to the decision to drill or not.

Imagine two prospecting trucks. One, working for company A and at the beginning of its series of measurements, estimates the probability that oil is present at 57%. The other, just across the way, working for company B and at the end of its series of measurements, has brought this probability down to 24%. Both are right given the measurements they have.

As for the oil, either it is there or it is not. From the point of view of the rock, the probability is 1 or 0, nothing else.

Thus multiple truths coexist, sometimes between individuals, sometimes within the same individual. The Cox-Jaynes theorem leads us to regard any probability as in fact subjective, or more exactly as specific to the personal experience of the observer, and as evolving as that observer's knowledge is refined.

It can also be particularly difficult to draw meaningful conclusions from calculated probabilities. An amusing probability puzzle, the Monty Hall problem, highlights some of the pitfalls.
