QERM 514/Notation
Statistics is all about confusing notation. What gets called what varies from source to source, and terms are often used interchangeably, which adds to the confusion. Below is a list of some of the notation used in QERM 514, along with other terms to assist in reading. Feel free to expand on anything below.
General math
Sets and numbers
The set of all real numbers (that is, ordinary one-dimensional numbers like 3.224 or -8.7) is denoted in just about every modern source with a capital R (in the notes, set in a nifty font called blackboard bold, which is not available on this wiki). The collection of all length-n vectors is likewise denoted R^n. Integers (negative and positive whole numbers, plus zero) are nearly always called Z (again, often in blackboard bold), while natural numbers (non-negative integers) are denoted N or Z^+ (although this is not as universal, and some books use either symbol for the positive integers, that is, excluding zero, rather than the non-negative integers).
The element symbol ∈ is used to indicate that a particular variable is in a set. For instance, x ∈ R means x is a real number. Some (in my experience, older) books use an element symbol that looks remarkably close to an epsilon (ε), although the two symbols are distinct.
Sums and products
The sum of a collection of numbers a_1, a_2, ..., a_n is usually written with a (large) capital sigma, Σ.
The product of the numbers is written with a capital pi, Π.
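Written out in LaTeX, the two notations look roughly like this. The index letter i is an assumption made for illustration; sources also use j, k, and others.

```latex
\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n
\qquad
\prod_{i=1}^{n} a_i = a_1 \cdot a_2 \cdots a_n
```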
Matrix notation
The dimension of a matrix is always written rows by columns. A matrix is generally written as a capital (in many sources, bold) letter, while an element of the matrix is written with the lowercase of the same letter and its row and column as subscripts. For instance, in matrix A, the first row, second column element would be written a_12.
Vectors are perhaps the exception. In 514 (and most books), a vector can be thought of as a matrix with a column dimension of 1. Sources vary on how to denote a vector, although the notation usually differs from that of a matrix. A vector is often written with a lowercase letter (as in the 514 notes), sometimes with an arrow over it, such as x⃗, or sometimes just in bold. Some sources (particularly in stats) underscore a vector with a tilde (which I have not been able to replicate on the wiki).
Matrices have their own operations, which get special notation. The transpose of a matrix A is usually written A' or A^T. The determinant of a matrix is denoted either det(A) or |A|, while the inverse is written A^-1. The identity matrix is usually written I_n for an n by n matrix, or I if the dimension is clear from context. Trace and rank are usually written tr(A) and rank(A).
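To connect the notation to something concrete, here is a minimal Python sketch using plain lists of rows. The matrices A and B and the helper names (transpose, identity, trace, det2, inverse2) are made up for illustration and are not from the 514 notes. The determinant and inverse helpers only handle the 2 by 2 case, and rank is left out.

```python
# A 2-by-3 matrix stored as a list of rows (dimension is rows by columns).
A = [[1, 2, 3],
     [4, 5, 6]]

n_rows, n_cols = len(A), len(A[0])  # 2 rows, 3 columns

# Math notation counts from 1, Python counts from 0, so a_12 is A[0][1].
a_12 = A[0][1]


def transpose(M):
    """Return M^T (also written M'), swapping rows and columns."""
    return [list(col) for col in zip(*M)]


def identity(n):
    """Return I_n, the n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def trace(M):
    """Return tr(M), the sum of the diagonal elements of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))


def det2(M):
    """Return det(M), also written |M|, for a 2-by-2 matrix only."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def inverse2(M):
    """Return M^-1 for an invertible 2-by-2 matrix only."""
    d = det2(M)
    if d == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]


# A small square matrix (hypothetical values) for the square-only operations.
B = [[2, 1],
     [1, 1]]

A_T = transpose(A)    # 3-by-2
B_inv = inverse2(B)   # B^-1
```

The main thing to watch is the off-by-one between subscripts in the notation and indices in code.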
Probability
Almost universally, random variables are given a capital English letter (a few exceptions exist, such as denoting errors with ε). The pdf or pmf for a random variable X is usually written with a lowercase function name and the random variable subscripted, such as f_X(x). If the random variable being described is clear from context, the subscript is dropped. Cumulative distribution functions (cdfs) are written with the uppercase letter that corresponds to the lowercase pdf/pmf. Thus the cdf of X would be written F_X(x). For either pdfs/pmfs or cdfs, it is sometimes useful to denote the parameters of the distribution. When this is needed, the notation f_X(x | θ) is used to indicate the dependence on the parameter θ. Parameters are nearly always Greek letters.
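As a reminder of how the lowercase and uppercase functions relate, the cdf accumulates the pdf or pmf up to x. The dummy variable t below is an assumption made for illustration.

```latex
F_X(x \mid \theta) = P(X \le x) = \int_{-\infty}^{x} f_X(t \mid \theta)\, dt
\quad \text{(continuous case)},
\qquad
F_X(x \mid \theta) = \sum_{t \le x} f_X(t \mid \theta)
\quad \text{(discrete case)}.
```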
For philosophical reasons, when describing a pdf f_X(x | θ) as a function of θ rather than x, most books write l(θ | x) and call it the likelihood. The extra symbol is not strictly necessary, since f(x | θ) = l(θ | x). This is not done in the 514 notes.
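A small Python sketch may make the point clearer. It assumes, purely for illustration, a normal density with unknown mean θ and known standard deviation 1. The function names, the observed value, and the grid of θ values are all hypothetical.

```python
import math


def normal_pdf(x, theta, sigma=1.0):
    """f(x | theta): normal density with mean theta and standard deviation sigma."""
    z = (x - theta) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(theta, x, sigma=1.0):
    """l(theta | x): the same number as f(x | theta), read as a function of theta."""
    return normal_pdf(x, theta, sigma)


x_obs = 1.3                            # hypothetical observed value, held fixed
thetas = [0.0, 0.5, 1.0, 1.3, 2.0]     # hypothetical candidate parameter values

# Hold x fixed and vary theta: this is the "likelihood" view of the same formula.
values = [likelihood(t, x_obs) for t in thetas]
best = thetas[values.index(max(values))]   # candidate theta with the largest likelihood
```

Only the argument being varied changes between the two views. The formula being evaluated is identical.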
Normal density
Since it comes up all the time, the standard normal (mean 0 and variance 1) pdf and cdf are denoted φ(x) and Φ(x) respectively. This is pretty standard notation found in most textbooks.
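For anyone who wants to compute these, here is a minimal Python sketch using only the standard library. The function names phi and Phi are chosen here to mirror the notation and are not from the 514 notes. The cdf is obtained from the error function math.erf.

```python
import math


def phi(x):
    """Standard normal pdf: exp(-x^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


density_at_zero = phi(0.0)   # height of the bell curve at its peak
prob_below_zero = Phi(0.0)   # half the probability lies below the mean
```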
Probability
The probability of an event is written in one of several ways, depending on the book. The 514 notes (and many other sources) use P(event), while other books use Pr{event} or some variation on it.
Commonly used letters
Custom and tradition result in some letters being used for certain things. For instance, in geometry, π is virtually always used to denote 3.14159.... Of course, even pi gets recycled for other functions (the prime counting function from number theory is, for example, also always denoted with a pi). So a list like the one below is a bit dangerous, since there is not really any rule that these must be followed, but they often are.
| Symbol | Common use |
|---|---|
| ε | Normal error, or just error |
| f | function, often a pdf in 514 |
| g | function, often a link function in glms |
| Γ | Gamma function |
| L | Likelihood or log likelihood |
| λ | Likelihood ratio |
| η | Often used to denote Xβ in glms |
| θ | Often an arbitrary vector of parameters |
| σ^2 | Variance; σ is then usually the standard deviation |
| Σ | Sum, covariance matrix |
| w | predictor (generally a different predictor than x). |
| x | predictor, Pearson's statistic |
| y | response |
| z | Normal random variable, standard normal |
