Probability Distributions
Given a probability space $(\Omega, \mathbb{P})$ and a random variable $X$, the distribution of $X$ tells us how $X$ distributes probability mass on the real number line. Loosely speaking, the distribution tells us where we can expect to find $X$ and with what probabilities.
Definition (Distribution of a random variable)
The distribution (or law) of a random variable $X$ is the probability measure on $\mathbb{R}$ which maps a set $A \subset \mathbb{R}$ to $\mathbb{P}(X \in A)$.
Exercise
Suppose that $X$ represents the amount of money you're going to win with the lottery ticket you just bought. Suppose that $\nu$ is the law of $X$. Then
We can think of $X$ as pushing forward the probability mass from $\Omega$ to $\mathbb{R}$ by sending the probability mass at $\omega$ to $X(\omega)$ for each $\omega \in \Omega$. The probability masses at multiple $\omega$'s can stack up at the same point on the real line if $X$ maps those $\omega$'s to the same value.
The distribution of a discrete random variable is the measure on $\mathbb{R}$ obtained by pushing forward the probability masses at elements of the sample space to the locations of their images on the real line.
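The push-forward picture can be made concrete with a few lines of Python. The following is a minimal sketch, not part of the original text: it assumes a sample space of two fair coin flips and a hypothetical random variable `X` that counts heads, and the helper name `pushforward` is chosen here purely for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair coin flips, each outcome with mass 1/4.
omega = ["".join(flips) for flips in product("HT", repeat=2)]
mass = {w: Fraction(1, 4) for w in omega}


def X(w):
    """Hypothetical random variable: the number of heads in outcome w."""
    return w.count("H")


def pushforward(mass, rv):
    """Send the mass at each outcome w to the point rv(w) on the real line.

    Masses from different outcomes that land on the same point are added.
    """
    dist = defaultdict(Fraction)
    for w, p in mass.items():
        dist[rv(w)] += p
    return dict(dist)


law_of_X = pushforward(mass, X)
for value in sorted(law_of_X):
    print(value, law_of_X[value])
```

Here the outcomes `HT` and `TH` both map to the value 1, so their masses stack up at that point.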
Exercise
A problem on a test requires students to match molecule diagrams to their appropriate labels. Suppose there are three labels and three diagrams and that a student guesses a matching uniformly at random. Let $X$ denote the number of diagrams the student correctly labels. What is the probability mass function of the distribution of $X$?
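Solution. The number of correctly labeled diagrams is an integer between 0 and 3 inclusive. Suppose the labels are $a$, $b$, and $c$, and suppose the correct labeling sequence is $abc$ (the final result would be the same regardless of the correct labeling sequence). The sample space consists of all six possible labeling sequences, and each of them is equally likely since the student applies the labels uniformly at random. So we have
$$X(abc) = 3, \quad X(acb) = 1, \quad X(bac) = 1, \quad X(bca) = 0, \quad X(cab) = 0, \quad X(cba) = 1.$$
The probability mass function of the distribution of $X$ is therefore
$$\mathbb{P}(X = 0) = \tfrac{2}{6} = \tfrac{1}{3}, \qquad \mathbb{P}(X = 1) = \tfrac{3}{6} = \tfrac{1}{2}, \qquad \mathbb{P}(X = 3) = \tfrac{1}{6},$$
and $\mathbb{P}(X = 2) = 0$, since labeling two diagrams correctly forces the third to be correct as well.
All together, we have
$$\mathbb{P}(X = k) = \begin{cases} 1/3 & \text{if } k = 0 \\ 1/2 & \text{if } k = 1 \\ 1/6 & \text{if } k = 3 \\ 0 & \text{otherwise.} \end{cases}$$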
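The counting in this solution can be checked by brute force. The sketch below is an illustration rather than part of the original solution: it assumes the labels are named `a`, `b`, `c` with `abc` as the correct sequence, enumerates all six labeling sequences, and tallies how many diagrams each one gets right.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

labels = ("a", "b", "c")  # assumed label names
correct = labels          # assumed correct labeling sequence


def num_correct(guess):
    """Count the positions where the guessed label matches the correct one."""
    return sum(g == c for g, c in zip(guess, correct))


# Each of the six labeling sequences is equally likely.
outcomes = list(permutations(labels))
counts = Counter(num_correct(guess) for guess in outcomes)

# Probability mass at each possible value 0, 1, 2, 3.
pmf = {k: Fraction(counts[k], len(outcomes)) for k in range(len(labels) + 1)}
for k, p in pmf.items():
    print(k, p)
```

Using `Fraction` keeps the probabilities exact, so the masses can be compared directly with the hand computation.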
Cumulative distribution function
The distribution of a random variable $X$ can also be described by its cumulative distribution function (CDF).
Definition (Cumulative distribution function)
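If $X$ is a random variable, then its cumulative distribution function $F_X$ is the function from $\mathbb{R}$ to $[0,1]$ defined by $F_X(x) = \mathbb{P}(X \le x)$.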
A probability mass function and its corresponding CDF
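For a discrete random variable, the value of the CDF at $x$ is the total probability mass sitting at points less than or equal to $x$. The following minimal sketch illustrates this with an assumed probability mass function (the values are chosen here for illustration only); the function name `cdf` is likewise hypothetical.

```python
from fractions import Fraction

# Assumed probability mass function: value -> probability mass at that value.
pmf = {0: Fraction(1, 3), 1: Fraction(1, 2), 3: Fraction(1, 6)}


def cdf(pmf, x):
    """Return P(X <= x): the sum of the masses at points not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


# The CDF is a step function: it jumps at each point carrying mass
# and stays flat in between.
for x in [-1, 0, 0.5, 1, 2, 3]:
    print(x, cdf(pmf, x))
```

The size of each jump equals the probability mass at that point, which is how a probability mass function can be read off from the graph of its CDF.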
Exercise
Consider a random variable
Solution. The first one is true, since the CDF goes from about 0.1 at
The second one is also true, since there is no probability mass past 2.
The third one is false: there is no probability mass in the interval from
Exercise
Suppose that
Solution. By definition of
where the last step follows since
Exercise
Random variables with the same cumulative distribution function are not necessarily equal as random variables, because the probability mass sitting at each point on the real line can come from different $\omega$'s.
For example, consider the two-fair-coin-flip experiment and let
Solution. If we define
(In fact, we can express
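The distinction between equal distributions and equal random variables can be checked numerically. The sketch below is an illustration under stated assumptions, not necessarily the pair of random variables intended by the exercise: it takes the two-fair-coin-flip sample space and compares two hypothetical random variables, the number of heads and the number of tails.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Two fair coin flips: four equally likely outcomes.
omega = ["".join(flips) for flips in product("HT", repeat=2)]
mass = {w: Fraction(1, 4) for w in omega}


def heads(w):
    """Assumed random variable: number of heads in outcome w."""
    return w.count("H")


def tails(w):
    """Assumed random variable: number of tails in outcome w."""
    return w.count("T")


def law(rv):
    """Distribution of rv: total mass landing on each value of rv."""
    dist = defaultdict(Fraction)
    for w, p in mass.items():
        dist[rv(w)] += p
    return dict(dist)


same_law = law(heads) == law(tails)
equal_everywhere = all(heads(w) == tails(w) for w in omega)
print(same_law, equal_everywhere)
```

The two laws agree, so the CDFs agree as well, yet the functions differ on the outcome `HH`, where one takes the value 2 and the other the value 0.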