Chebyshev’s Inequality
“In probability theory, Chebyshev’s inequality (also called the Bienaymé–Chebyshev inequality) guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean.
Specifically, no more than $1/k^2$ of the distribution’s values can be more than k standard deviations away from the mean;
equivalently, at least $1 - 1/k^2$ of the distribution’s values are within k standard deviations of the mean.
In statistics, the inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined.”
Ref: https://en.wikipedia.org/wiki/Chebyshev%27s_inequality
Let X be an integrable random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,
$$\Pr(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}.$$
Only the case $k > 1$ is useful. When $k \le 1$, the right-hand side $1/k^2 \ge 1$ and the inequality is trivial, as all probabilities are ≤ 1.
As an example, using $k = \sqrt{2}$ shows that the probability that values lie outside the interval $(\mu - \sqrt{2}\,\sigma,\ \mu + \sqrt{2}\,\sigma)$ does not exceed $1/2$.
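The bound can be checked numerically. The following is a minimal sketch, assuming an exponential distribution as the test case; the helper names `chebyshev_bound` and `tail_fraction` are made up for this illustration and do not come from the quoted source. It compares the observed fraction of samples lying at least k standard deviations from the mean with the bound $1/k^2$.

```python
import random
import statistics


def chebyshev_bound(k: float) -> float:
    """Chebyshev upper bound on P(|X - mu| >= k * sigma), capped at 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k**2)


def tail_fraction(samples: list[float], k: float) -> float:
    """Fraction of samples at least k standard deviations from the sample mean."""
    mu = statistics.fmean(samples)
    # Population standard deviation of the sample, so that the inequality
    # applies to the empirical distribution itself.
    sigma = statistics.pstdev(samples)
    return sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)


random.seed(0)  # fixed seed so the run is repeatable
# Assumption: Exponential(1) is just a convenient skewed example.
cheb_samples = [random.expovariate(1.0) for _ in range(100_000)]

for k in (1.5, 2.0, 3.0):
    observed = tail_fraction(cheb_samples, k)
    print(f"k={k}: observed {observed:.4f} <= bound {chebyshev_bound(k):.4f}")
```

The observed fractions are typically far below the bound, which is the price paid for an inequality that holds for every distribution with a finite mean and variance.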
“Markov’s inequality (and other similar inequalities) relate probabilities to expectations, and provide (frequently loose but still useful) bounds for the cumulative distribution function of a random variable.”
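“If X is a nonnegative random variable and a > 0, then the probability that X is at least a is at most the expectation of X divided by a:
$$\Pr(X \ge a) \le \frac{\operatorname{E}(X)}{a}.$$
Let $a = \tilde{a} \cdot \operatorname{E}(X)$ (where $\tilde{a} > 0$); then we can rewrite the previous inequality as
$$\Pr(X \ge \tilde{a} \cdot \operatorname{E}(X)) \le \frac{1}{\tilde{a}}.$$”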
Ref: https://en.wikipedia.org/wiki/Markov%27s_inequality
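Markov’s inequality can be checked in the same spirit. This is a rough sketch under stated assumptions: the exponential distribution with mean 2 is an arbitrary choice of nonnegative variable, and `markov_bound` is a hypothetical helper name, not something from the quoted article.

```python
import random
import statistics


def markov_bound(mean: float, a: float) -> float:
    """Markov upper bound on P(X >= a) for a nonnegative X, capped at 1."""
    if a <= 0:
        raise ValueError("a must be positive")
    if mean < 0:
        raise ValueError("the mean of a nonnegative variable cannot be negative")
    return min(1.0, mean / a)


random.seed(1)  # fixed seed so the run is repeatable
# Assumption: Exponential with rate 0.5 (mean 2) stands in for "any nonnegative X".
markov_samples = [random.expovariate(0.5) for _ in range(50_000)]
sample_mean = statistics.fmean(markov_samples)

for a in (1.0, 4.0, 10.0):
    observed = sum(x >= a for x in markov_samples) / len(markov_samples)
    print(f"a={a}: observed {observed:.4f} <= bound {markov_bound(sample_mean, a):.4f}")
```

For small a the bound exceeds 1 and says nothing; it only becomes informative once a is larger than the mean.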
For the null hypothesis concept and the chi-square test, see also: http://bangla.salearningschool.com/recent-posts/important-basic-concepts-statistics-for-big-data/
Chi-Square Statistic:
“A chi square ($\chi^2$) statistic is a test that measures how expectations compare to actual observed data (or model results).”
https://www.investopedia.com/terms/c/chi-square-statistic.asp
“What does chi square test tell you?
The Chi–square test is intended to test how likely it is that an observed distribution is due to chance. It is also called a “goodness of fit” statistic, because it measures how well the observed distribution of data fits with the distribution that is expected if the variables are independent.”
https://www.ling.upenn.edu/~clight/chisquared.htm
“In probability theory and statistics, the chi-square distribution (also chi-squared or $\chi^2$-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. The chi-square distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics, notably in hypothesis testing and in construction of confidence intervals. When it is being distinguished from the more general noncentral chi-square distribution, this distribution is sometimes called the central chi-square distribution.”: https://en.wikipedia.org/wiki/Chi-squared_distribution
“A chi-squared test, also written as $\chi^2$ test, is any statistical hypothesis test where the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. Without other qualification, ‘chi-squared test’ often is used as short for Pearson’s chi-squared test. The chi-squared test is used to determine whether there is a significant difference between the expected frequencies and the observed frequencies in one or more categories.”: https://en.wikipedia.org/wiki/Chi-squared_test
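To make the comparison of observed and expected frequencies concrete, here is a small sketch of Pearson’s statistic, $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$. The counts below are invented for illustration. With three categories there are 2 degrees of freedom, and in that special case the chi-square distribution has the closed-form tail probability $e^{-x/2}$, so no statistics library is needed; for other degrees of freedom a library routine would be the practical choice.

```python
import math


def pearson_chi_square(observed: list[float], expected: list[float]) -> float:
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_two_df(statistic: float) -> float:
    """Tail probability P(chi2 >= statistic), valid only for 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Made-up example: 100 observations in three categories, with hypothesised
# proportions 0.4, 0.4 and 0.2 under the null hypothesis.
observed_counts = [50, 30, 20]
expected_counts = [40.0, 40.0, 20.0]

chi2_stat = pearson_chi_square(observed_counts, expected_counts)
p_value = p_value_two_df(chi2_stat)
print(f"chi-square = {chi2_stat:.3f}, p-value = {p_value:.4f}")
```

Here the statistic is 5.0 and the p-value is about 0.082, so at the conventional 5% level this made-up sample would not be enough to reject the null hypothesis.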
Probability Axioms (I am not convinced that the following is the best way to state them)
https://plus.maths.org/content/maths-minute-axioms-probability
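1. Probability is non-negative: P(A) ≥ 0 for every event A
2. P(S) = 1, where S is the whole sample space
3. Probability is additive
If A and B are two mutually exclusive (disjoint) events, then
P(A ∪ B) = P(A) + P(B)
Mutually exclusive means A ∩ B = ∅ [nothing in common], so P(A ∩ B) = 0. This is not the same as independence, which means P(A ∩ B) = P(A) P(B).
P(A) = 1 − P(A′), where A′ is the complement of A
P(∅) = 0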
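These rules can be verified on a tiny example. The sketch below assumes a single roll of a fair six-sided die, where every outcome is equally likely; the function name `prob` and the events `A` and `B` are chosen for illustration only.

```python
from fractions import Fraction

# Sample space S for one roll of a fair die (an assumed example).
sample_space = frozenset(range(1, 7))


def prob(event) -> Fraction:
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


A = {1, 2}  # roll is 1 or 2
B = {5, 6}  # roll is 5 or 6; shares no outcome with A

assert prob(A) >= 0                          # axiom 1: non-negativity
assert prob(sample_space) == 1               # axiom 2: P(S) = 1
assert A.isdisjoint(B)                       # A and B are mutually exclusive
assert prob(A | B) == prob(A) + prob(B)      # axiom 3: additivity
assert prob(A & B) == 0                      # empty intersection has probability 0
assert prob(A) == 1 - prob(sample_space - A) # complement rule
assert prob(set()) == 0                      # P(empty set) = 0
print(prob(A), prob(B), prob(A | B))
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point tolerance.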
What does the probability density function mean?
“Probability density function (PDF) is a statistical expression that defines a probability distribution for a continuous random variable as opposed to a discrete random variable. When the PDF is graphically portrayed, the area under the curve will indicate the interval in which the variable will fall” https://www.investopedia.com/terms/p/pdf.asp
“A probability density function is most commonly associated with absolutely continuous univariate distributions. A random variable $X$ has a density $f_X$, where $f_X$ is a non-negative Lebesgue-integrable function, if:
$$\Pr[a \le X \le b] = \int_a^b f_X(x)\,dx.$$
Hence, if $F_X$ is the cumulative distribution function of $X$, then:
$$F_X(x) = \int_{-\infty}^{x} f_X(u)\,du,$$
and, if $f_X$ is continuous at $x$,
$$f_X(x) = \frac{d}{dx} F_X(x).$$
Intuitively, one can think of $f_X(x)\,dx$ as being the probability of $X$ falling within the infinitesimal interval $[x, x + dx]$.”
https://en.wikipedia.org/wiki/Probability_density_function
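The relationship between a density and probabilities can be illustrated numerically. The sketch below assumes the standard normal distribution as the example; the trapezoidal helper `integrate` is a simple stand-in written for this note, not a library routine. It integrates the density over an interval and compares the result with the difference of the cumulative distribution function, then checks that the slope of the CDF matches the density.

```python
import math


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate(f, a: float, b: float, n: int = 10_000) -> float:
    """Trapezoidal approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# P(-1 <= X <= 1) as the area under the density ...
area = integrate(normal_pdf, -1.0, 1.0)
# ... and as a difference of CDF values.
cdf_difference = normal_cdf(1.0) - normal_cdf(-1.0)
print(f"area = {area:.6f}, CDF difference = {cdf_difference:.6f}")

# The density is the derivative of the CDF (central difference at x = 0.5).
step = 1e-5
slope = (normal_cdf(0.5 + step) - normal_cdf(0.5 - step)) / (2.0 * step)
print(f"slope of CDF = {slope:.6f}, density = {normal_pdf(0.5):.6f}")
```

Note that the value of the density at a single point is not itself a probability; only areas under the curve are.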
The graph of a probability mass function. All the values of this function must be non-negative and sum up to 1.
“In probability and statistics, a probability mass function (PMF) is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.
A probability mass function differs from a probability density function (PDF) in that the latter is associated with continuous rather than discrete random variables. A PDF must be integrated over an interval to yield a probability.
The value of the random variable having the largest probability mass is called the mode.” https://en.wikipedia.org/wiki/Probability_mass_function
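As a small illustration of a probability mass function, the sketch below assumes a binomial distribution with n = 10 trials and success probability p = 0.3; both numbers are arbitrary choices for the example. It checks that the masses are non-negative and sum to 1, and finds the mode.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


n_trials, p_success = 10, 0.3  # assumed example parameters
pmf = {k: binomial_pmf(k, n_trials, p_success) for k in range(n_trials + 1)}

total_mass = sum(pmf.values())   # should be 1 up to rounding
mode = max(pmf, key=pmf.get)     # value with the largest probability mass

print(f"total mass = {total_mass:.12f}")
print(f"mode = {mode}, P(X = mode) = {pmf[mode]:.4f}")
```

Unlike a density, each value of the PMF is a probability in its own right, so no integration is needed.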
Here, we will discuss mixed random variables. These are random variables that are neither discrete nor continuous, but are a mixture of both. In particular, a mixed random variable has a continuous part and a discrete part.
https://www.probabilitycourse.com/chapter4/4_3_1_mixed.php . Also check the examples from here
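A simple way to see a mixed random variable is to cap a continuous one. The sketch below is an assumed example and is not taken from the linked page: X is uniform on (0, 1) and Y = min(X, 0.5). Y is continuous on (0, 0.5) and has a discrete point mass of 0.5 at the value 0.5, so its expected value combines an integral and a sum: $E(Y) = \int_0^{0.5} y\,dy + 0.5 \cdot 0.5 = 0.375$.

```python
import random

CAP = 0.5  # hypothetical cut-off chosen for this illustration


def sample_mixed(rng: random.Random) -> float:
    """Draw Y = min(X, CAP) with X uniform on (0, 1)."""
    return min(rng.random(), CAP)


rng = random.Random(42)  # fixed seed so the run is repeatable
draws = [sample_mixed(rng) for _ in range(200_000)]

# Discrete part: share of draws sitting exactly on the cap.
point_mass = sum(y == CAP for y in draws) / len(draws)
# Expected value estimated from the simulation.
mean_y = sum(draws) / len(draws)

print(f"estimated P(Y = {CAP}) = {point_mass:.4f}  (theory: 0.5)")
print(f"estimated E(Y) = {mean_y:.4f}  (theory: 0.375)")
```

A purely continuous variable would never hit one exact value with positive probability, which is what makes the spike at 0.5 the discrete part.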
Expected values of a random variable
The expected value of a discrete random variable is the probability-weighted average of all its possible values. In other words, each possible value the random variable can assume is multiplied by its probability of occurring, and the resulting products are summed to produce the expected value.
https://en.wikipedia.org/wiki/Expected_value
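The probability-weighted average is easy to compute directly. The following sketch assumes a fair six-sided die as the discrete random variable; the name `expected_value` is just a helper for this note.

```python
from fractions import Fraction


def expected_value(distribution: dict[int, Fraction]) -> Fraction:
    """Sum of value * probability over all possible values."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in distribution.items())


# Assumed example: a fair die, each face with probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean_roll = expected_value(fair_die)
print(mean_roll)  # 7/2
```

The result 7/2 = 3.5 is not a value the die can actually show, which is a useful reminder that the expected value is an average, not a most likely outcome.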
The “moments” of a random variable
The “moments” of a random variable (or of its distribution) are expected values of powers or related functions of the random variable. The rth moment of X is $E(X^r)$. In particular, the first moment is the mean, $\mu_X = E(X)$. The mean is a measure of the “center” or “location” of a distribution.
http://homepages.gac.edu/~holte/courses/mcs341/fall10/documents/sect3-3a.pdf
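As a worked illustration, the sketch below computes the first two moments of a fair six-sided die (an assumed example, not from the linked notes) and combines them into the variance using the standard identity $\operatorname{Var}(X) = E(X^2) - (E(X))^2$.

```python
from fractions import Fraction


def moment(distribution: dict[int, Fraction], r: int) -> Fraction:
    """r-th moment E(X^r) of a discrete distribution given as {value: probability}."""
    return sum(value**r * p for value, p in distribution.items())


# Assumed example: a fair die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

first_moment = moment(die, 1)                 # the mean
second_moment = moment(die, 2)                # E(X^2)
variance = second_moment - first_moment**2    # Var(X) = E(X^2) - E(X)^2

print(first_moment, second_moment, variance)  # 7/2 91/6 35/12
```

The first moment locates the distribution, while the second moment, once centred in this way, measures its spread.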
Joint distributions
“Joint distributions Notes: Below X and Y are assumed to be continuous random variables. This case is, by far, the most important case. Analogous formulas, with sums replacing integrals and p.m.f.’s instead of p.d.f.’s, hold for the case when X and Y are discrete r.v.’s. Appropriate analogs also hold for mixed cases (e.g., X discrete, Y continuous), and for the more general case of n random variables X1, . . . , Xn.
• Joint cumulative distribution function (joint c.d.f.): F(x, y) = P(X ≤ x, Y ≤ y)”
https://faculty.math.illinois.edu/~hildebr/461/jointdistributions.pdf
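The joint c.d.f. is easiest to see in the discrete analog mentioned in the notes. The sketch below is an assumed example: two fair dice are rolled, X is the first die and Y is the sum of both, so X and Y are dependent. The function evaluates F(x, y) = P(X ≤ x, Y ≤ y) by counting equally likely outcomes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))


def joint_cdf(x: int, y: int) -> Fraction:
    """F(x, y) = P(X <= x, Y <= y) with X = first die and Y = sum of both dice."""
    favourable = sum(1 for d1, d2 in outcomes if d1 <= x and d1 + d2 <= y)
    return Fraction(favourable, len(outcomes))


print(joint_cdf(3, 5))    # 1/4
print(joint_cdf(3, 12))   # 1/2, the marginal P(X <= 3) since Y <= 12 always holds
print(joint_cdf(6, 12))   # 1, the whole sample space
```

Setting one argument to its largest possible value recovers the marginal c.d.f. of the other variable, as the second line shows.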
The above material is mostly taken from the Internet and reproduced as is.