μ (mu) | mean value, a measure of location |
π, Π (pi) | lower-case: ratio of circumference to diameter for a circle; upper-case: product operator |
σ, Σ (sigma) | as a variable: standard deviation (lower-case) or covariance matrix (upper-case), measuring dispersion; as an operator: summation |
2D | Two-Dimensional |
ABM | Adaptive Basis-function Model |
AdaBoost | Adaptive Boosting |
adaline | adaptive linear element |
ADF | Assumed Density Filter |
AIC | Akaike Information Criterion |
ALS | Alternating Least Squares |
ANOVA | ANalysis Of VAriance |
ARD | Automatic Relevance Determination |
AUC | Area Under the Curve |
AWS | Amazon Web Services |
BART | Bayesian Additive Regression Trees |
BFGS | Broyden, Fletcher, Goldfarb, Shanno |
BIC | Bayesian Information Criterion |
BMA | Bayesian Model Averaging |
BP | Belief Propagation |
BUGS | Bayesian inference Using Gibbs Sampling |
C4.5 | Classifier (tree) 4.5: successor to ID3 |
CART | Classification And Regression Trees |
CD | Contrastive Divergence |
CDF | Cumulative Distribution Function |
CG | Conjugate Gradient |
CI | Central Interval |
CI | Confidence Interval |
CI | Credible Interval |
CIFAR | Canadian Institute For Advanced Research |
CNN | Convolutional Neural Network |
COLT | COmputational Learning Theory |
CPD | Conditional Probability Distribution |
CPU | Central Processing Unit |
CRC | Canada Research Chair |
CRF | Conditional Random Field |
CUDA | Compute Unified Device Architecture |
CV | Cross Validation |
d-separation | directed separation |
DAG | Directed Acyclic Graph |
DBM | Deep Boltzmann Machine |
DBN | Deep Belief Network |
DBN | Dynamic Bayesian Network |
DCM | Dirichlet Compound Multinomial |
DDN | Deep Directed Network |
DGM | Directed Graphical Model |
DNA | DeoxyriboNucleic Acid |
DNN | Deep Neural Network |
dof | degrees of freedom |
DP | Dirichlet Process |
EB | Empirical Bayes |
EC2 | Elastic Compute Cloud |
ECOC | Error Correcting Output Code |
EER | Equal Error Rate |
EM | Expectation Maximization |
EP | Expectation Propagation |
ERM | Empirical Risk Minimization |
exp | exponential function |
FA | Factor Analysis |
FDR | False Discovery Rate |
FLDA | Fisher's Linear Discriminant Analysis |
FNR | False Negative Rate |
FPR | False Positive Rate |
GAM | Generalized Additive Model |
GaP | Gamma Poisson |
GCV | Generalized Cross Validation |
GDA | Gaussian Discriminant Analysis |
GGM | Gaussian Graphical Model |
GLM | Generalized Linear Model |
GLMM | Generalized Linear Mixed Model |
GM | Graphical Model |
GMM | Gaussian Mixture Model |
GP | Gaussian Process |
GPU | Graphics Processing Unit |
HDI | Highest Density Interval |
HLDA | Heteroscedastic Linear Discriminant Analysis |
HME | Hierarchical Mixture of Experts |
HMM | Hidden Markov Model |
HPD | Highest Posterior Density |
HTTP | Hyper Text Transfer Protocol |
ICA | Independent Component Analysis |
ICML | International Conference on Machine Learning |
ID3 | Iterative Dichotomiser (tree) 3 |
iff | if and only if |
IID | Independent, Identically Distributed |
IP | Imputation Posterior |
IPF | Iterative Proportional Fitting |
IRLS | Iteratively Reweighted Least Squares |
JAGS | Just Another Gibbs Sampler |
JTA | Junction Tree Algorithm |
k | a popular variable name for a count; e.g. the number of nearest neighbors or the number of clusters |
KDE | Kernel Density Estimate |
KL | Kullback-Leibler |
KNN | "k" Nearest Neighbor, where "k" is the number of neighbors |
l0, l1, l2 | Lebesgue space, where the norm is defined as the "p"-th root of the sum of absolute values raised to the "p"-th power |
L1VM | l1 regularized Vector Machine |
LARS | Least Angle Regression |
LASSO | Least Absolute Shrinkage and Selection Operator |
LaTeX | Lamport TeX typesetting system [TeX: pronounced "tech"; an abbreviated form of tau, epsilon, chi] |
L-BFGS | Limited-memory Broyden Fletcher Goldfarb Shanno |
LBP | Loopy Belief Propagation |
LDA | Latent Dirichlet Allocation |
LDA | Linear Discriminant Analysis |
LeNet5 | LeCun convolutional neural Network 5 |
LG-SSM | Linear Gaussian State Space Model |
LMS | Least Mean Squares |
log | logarithm |
LOOCV | Leave One Out Cross Validation |
LSTM | Long Short-Term Memory |
LVM | Latent Variable Model |
MAP | Maximum A Posteriori |
MAP | Mean Average Precision |
MAR | Missing At Random |
MARS | Multivariate Adaptive Regression Splines |
MART | Multiple Additive Regression Trees |
MATLAB | MATrix LABoratory |
MC | Monte Carlo |
MCAR | Missing Completely At Random |
MCMC | Markov Chain Monte Carlo |
MDL | Minimum Description Length |
MEMM | Maximum Entropy Markov Model |
MH | Metropolis Hastings |
MI | Mutual Information |
MIT | Massachusetts Institute of Technology |
ML | Machine Learning |
ML | Maximum Likelihood |
MLE | Maximum Likelihood Estimate |
MLP | Multi-Layer Perceptron |
MNIST | Modified National Institute of Standards and Technology data |
MPCA | Multinomial Principal Component Analysis |
MRF | Markov Random Field |
MSE | Mean Squared Error |
MVN | Multi-Variate Normal |
NaN | Not a Number |
NB | Nota Bene |
NBC | Naive Bayes Classifier |
NHST | Null Hypothesis Significance Testing |
NIG | Normal Inverse Gamma |
NIPS | Neural Information Processing Systems conference |
NLL | Negative Log Likelihood |
NMAR | Not Missing At Random |
NP | Non-deterministic Polynomial time |
NSERC | Natural Sciences and Engineering Research Council |
OLS | Ordinary Least Squares |
P | Polynomial time |
p-value | the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed |
PAC | Probably Approximately Correct |
PCA | Principal Component Analysis |
PDF | Probability Density Function |
PMI | Pointwise Mutual Information |
PMTK | Probabilistic Modeling ToolKit |
PPCA | Probabilistic Principal Component Analysis |
PR | Precision Recall |
QDA | Quadratic Discriminant Analysis |
QQ | Quantile-Quantile |
RBF | Radial Basis Function |
RBM | Restricted Boltzmann Machine |
RBPF | Rao-Blackwellized Particle Filtering |
RKHS | Reproducing Kernel Hilbert Space |
RNN | Recurrent Neural Network |
ROC | Receiver Operating Characteristic |
RRM | Regularized Risk Minimization |
RSS | Residual Sum of Squares |
RVM | Relevance Vector Machine |
SAT | Scholastic Aptitude Test |
SBL | Sparse Bayesian Learning |
SdA | Stacked denoising Autoencoder |
SGD | Stochastic Gradient Descent |
SIR | Sampling Importance Resampling |
SLAM | Simultaneous Localization And Mapping |
SLT | Statistical Learning Theory |
SpAM | Sparse Additive Model |
SSE | Sum of Squared Errors |
SSM | State Space Model |
SSVM | Structural Support Vector Machine |
SVD | Singular Value Decomposition |
t | a popular variable name for a test statistic; as in "t" test or "t" distribution |
TNR | True Negative Rate |
TPR | True Positive Rate |
UCB | Upper Confidence Bound |
UGM | Undirected Graphical Model |
UKF | Unscented Kalman Filter |
VB | Variational Bayes |
VC | Vapnik-Chervonenkis |
VE | Variable Elimination |
VIBES | Variational Inference for BayEsian networkS |
XOR | eXclusive OR |