Sahithyan's S2 — Methods of Mathematics

Joint Distribution

Specifies the probability of observing a combination of values for two or more random variables. Characterizes the relationship between multiple random variables, including their dependencies and correlations.

Definition

For two random variables X and Y, the joint probability distribution gives the probability that X and Y simultaneously take on specific values. This can be expressed as:

  • For discrete random variables: P(X = x, Y = y) or P_{X,Y}(x, y)
  • For continuous random variables: f_{X,Y}(x, y)

For events A and B, the joint probability is P(A \cap B); P(A) and P(B) are the marginal probabilities.
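As a minimal sketch, a discrete joint pmf can be stored as a table keyed by (x, y); the probability values below are assumed for illustration:

```python
# A discrete joint pmf stored as a dict keyed by (x, y).
# The probabilities are assumed for illustration.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# P(X = 1, Y = 0) is read straight off the table.
p_10 = joint_pmf[(1, 0)]

# All probabilities must sum to 1 (the total-probability property).
p_total = sum(joint_pmf.values())
```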

Covariance

Denoted by \text{Cov}(X,Y) or \sigma_{XY}. Measures the linear relationship between two random variables.

\text{Cov}(X, Y) = \sum_{x} \sum_{y} (x - \mu_X)(y - \mu_Y) \, P(X = x, Y = y)
  • Positive covariance: higher-than-mean values of one variable tend to be paired with higher-than-mean values of the other variable.
  • Negative covariance: higher-than-mean values of one variable tend to be paired with lower-than-mean values of the other variable.
  • If the two random variables are independent, the covariance is zero. The converse does not hold: zero covariance does not imply independence.

Properties:

  • \text{Cov}(X,Y) = E\Big[\big(X-E(X)\big)\big(Y-E(Y)\big)\Big]
  • \text{Cov}(X,Y) = E(XY) - E(X)E(Y)
  • \text{Cov}(X,a) = 0 for any constant a
  • \text{Cov}(X,X) = \text{Var}(X)
  • \text{Cov}(aX,bY) = ab\,\text{Cov}(X,Y)
  • \text{Cov}(X+Y,Z) = \text{Cov}(X,Z) + \text{Cov}(Y,Z)
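The two equivalent covariance formulas above (the definition and the E(XY) - E(X)E(Y) shortcut) can be checked numerically on a small assumed joint pmf:

```python
# Assumed joint pmf for illustration.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# Means of X and Y from the joint table.
mu_x = sum(x * p for (x, y), p in joint_pmf.items())
mu_y = sum(y * p for (x, y), p in joint_pmf.items())

# Definition: sum of (x - mu_x)(y - mu_y) P(X = x, Y = y).
cov_def = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint_pmf.items())

# Shortcut: E(XY) - E(X) E(Y).
e_xy = sum(x * y * p for (x, y), p in joint_pmf.items())
cov_short = e_xy - mu_x * mu_y
```

Both routes give the same value, here a small negative covariance.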

Correlation

\text{Corr}(X,Y) = \rho_{XY} = \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}}

\rho_{XY} is the Pearson correlation coefficient.

Sample correlation coefficient

Denoted by r \in [-1, 1].

r = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sqrt{\left( \sum x_i^2 - n \bar{x}^2 \right) \left( \sum y_i^2 - n \bar{y}^2 \right)}}
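A sketch of this formula applied to a small data set; the xs and ys values are made up for illustration (roughly linear, so r should be close to 1):

```python
# Assumed sample data, roughly linear.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n

# Numerator: sum(x_i y_i) - n xbar ybar.
num = sum(x * y for x, y in zip(xs, ys)) - n * xbar * ybar
# Denominator: sqrt((sum x_i^2 - n xbar^2)(sum y_i^2 - n ybar^2)).
den = ((sum(x * x for x in xs) - n * xbar ** 2)
       * (sum(y * y for y in ys) - n * ybar ** 2)) ** 0.5
r = num / den
```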

Properties

Non-negativity

  • Discrete case: \forall x,y\;\; P(X = x, Y = y) \geq 0
  • Continuous case: \forall x,y\;\; f_{X,Y}(x, y) \geq 0

Total probability equals 1

  • Discrete case: \sum_x \sum_y P(X = x, Y = y) = 1
  • Continuous case: \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy \, dx = 1

Marginal distributions

The distribution of an individual variable can be derived from the joint distribution:

  • Discrete case: P(X = x) = \sum_y P(X = x, Y = y)
  • Continuous case: f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy
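In the discrete case, marginalization amounts to summing the joint table over the other variable. A sketch with an assumed joint pmf:

```python
from collections import defaultdict

# Assumed joint pmf for illustration.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# P(X = x): sum over y. P(Y = y): sum over x.
marginal_x = defaultdict(float)
marginal_y = defaultdict(float)
for (x, y), p in joint_pmf.items():
    marginal_x[x] += p
    marginal_y[y] += p
```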

Conditional distributions

The distribution of one variable given a specific value of the other:

  • Discrete case: P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}
  • Continuous case: f_{X|Y}(x|y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}
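A sketch of the discrete conditional formula, using an assumed joint pmf: divide the joint cell by the marginal of the conditioning value.

```python
# Assumed joint pmf for illustration.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

# Marginal P(Y = 0), summing over x.
p_y0 = sum(p for (x, y), p in joint_pmf.items() if y == 0)

# P(X = 1 | Y = 0) = P(X = 1, Y = 0) / P(Y = 0).
p_x1_given_y0 = joint_pmf[(1, 0)] / p_y0
```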

Independence

Random variables X and Y are independent iff:

  • Discrete case: \forall x,y\;\; P(X = x, Y = y) = P(X = x) \cdot P(Y = y)
  • Continuous case: \forall x,y\;\; f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y)
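The discrete criterion can be checked cell by cell against the product of the marginals. A sketch with two assumed tables, one dependent and one built as a product of marginals:

```python
import itertools

def is_independent(joint_pmf, tol=1e-12):
    """Check P(X=x, Y=y) == P(X=x) P(Y=y) for every (x, y) cell."""
    xs = sorted({x for x, _ in joint_pmf})
    ys = sorted({y for _, y in joint_pmf})
    # Marginals from the joint table.
    px = {x: sum(joint_pmf.get((x, y), 0.0) for y in ys) for x in xs}
    py = {y: sum(joint_pmf.get((x, y), 0.0) for x in xs) for y in ys}
    return all(
        abs(joint_pmf.get((x, y), 0.0) - px[x] * py[y]) <= tol
        for x, y in itertools.product(xs, ys)
    )

# Assumed tables for illustration: the first is dependent, the second
# is a product of the marginals P(X) = (0.3, 0.7), P(Y) = (0.4, 0.6).
dependent = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40}
product_table = {(0, 0): 0.12, (0, 1): 0.18, (1, 0): 0.28, (1, 1): 0.42}
```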

Representation

Joint distributions can be represented in various ways:

  • For discrete variables: probability mass tables or matrices
  • For continuous variables: joint density functions or contour plots
  • Copulas: functions that describe the dependence structure between variables

Types

For Discrete Variables

For a joint probability mass function, if X and Y are independent, P(x,y) = P(x)\,P(y).

Cumulative probability:

P(X \le x, Y \le y) = \sum_{u \le x} \sum_{v \le y} P(u, v)

The marginal probability of X = a is \sum_y P(a, y).
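A sketch of the cumulative sum over an assumed joint pmf: add up every cell at or below the given (x, y).

```python
# Assumed joint pmf for illustration.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def joint_cdf(x, y):
    """P(X <= x, Y <= y): sum P(u, v) over cells with u <= x and v <= y."""
    return sum(p for (u, v), p in joint_pmf.items() if u <= x and v <= y)
```

For example, joint_cdf(0, 1) sums only the x = 0 column, and joint_cdf(1, 1) covers the whole table.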

For Continuous Variables

Suppose f is the joint probability density function. The probability that (X, Y) falls in a region A of the x-y plane is:

P\big[(X,Y) \in A \big] = \iint_A f(x,y)\; \text{d}x\,\text{d}y

The cumulative distribution function,

F(a,b) = P(X \le a, Y \le b) = \int_{-\infty}^{b} \int_{-\infty}^{a} f(x,y)\; \text{d}x\,\text{d}y

The marginal probability density function of X is

g(x) = \int_{-\infty}^{\infty} f(x,y)\, \text{d}y
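The continuous formulas can be sanity-checked numerically. A sketch using the assumed density f(x, y) = x + y on the unit square (zero elsewhere), which integrates to 1 and whose exact marginal is g(x) = x + 1/2:

```python
# Assumed joint density for illustration: f(x, y) = x + y on [0, 1]^2.
def f(x, y):
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

N = 400          # grid resolution for the midpoint rule
h = 1.0 / N      # cell width

# Total probability: double midpoint-rule integral over [0, 1] x [0, 1].
total = sum(
    f((i + 0.5) * h, (j + 0.5) * h) * h * h
    for i in range(N) for j in range(N)
)

# Marginal density g(x) = integral of f(x, y) dy, evaluated at x = 0.5;
# the exact value is 0.5 + 1/2 = 1.0.
x0 = 0.5
g_x0 = sum(f(x0, (j + 0.5) * h) * h for j in range(N))
```

The midpoint rule is exact for this linear density, so both results match the closed forms up to floating-point error.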