13B  13 Billion (parameters) 
 405B  405 Billion (parameters) 
 7B  7 Billion (parameters) 
 70B  70 Billion (parameters) 
 A100  Ampere 100 Nvidia GPU 
 A2C  Advantage Actor Critic 
 A3C  Asynchronous Advantage Actor Critic (predates A2C) 
 AAAI  Association for the Advancement of Artificial Intelligence 
 ACL  Association for Computational Linguistics 
 ACM  Association for Computing Machinery 
 AdaM  Adaptive Moment estimation (momentum) 
 AdaMW  Adaptive Moment estimation with Weight decay 
 ADMM  Alternating Direction Method of Multipliers 
 AGI  Artificial General Intelligence 
 AGIEval  Artificial General Intelligence Evaluation (exams dataset) 
 AI  Artificial Intelligence 
 AI2  Allen Institute for Artificial Intelligence 
 aka  also known as 
 AMI  Amazon Machine Image 
 AMD  Advanced Micro Devices 
 ANI  Artificial Narrow Intelligence 
 ANN  Artificial Neural Network 
 ANSI  American National Standards Institute 
 AP  Advanced Placement (exams) 
 APE  Automated Prompt Engineering 
 API  Application Programming Interface 
 AR  Augmented Reality 
 ARC  Abstraction and Reasoning Corpus 
 ARC  AI2 Reasoning Challenge 
 ARC-C  AI2 Reasoning Challenge - Challenge set 
 ARC-E  AI2 Reasoning Challenge - Easy set 
 ARES  Automated RAG Evaluation System 
 ASCII  American Standard Code for Information Interchange 
 ASI  Artificial Super Intelligence 
 ASR  Automatic Speech Recognition (speech-to-text) 
 AST  Automatic Speech Translation 
 AT  Added Toxicity 
 AUC  Area Under the Curve (curve could be ROC, DET, PR, etc) 
 AVX2  Advanced Vector eXtensions version 2 (256-bit: eight 32-bit single-precision numbers) 
 AVX512  Advanced Vector eXtensions 512 (512-bit: sixteen 32-bit single-precision numbers) 
 AWS  Amazon Web Services 
 B  Billion 
 B100  Blackwell 100 Nvidia GPU 
 BAIR  Berkeley AI Research 
 BART  Bidirectional and Auto-Regressive Transformers 
 BB  BIG Benchmark 
 BBH  Beyond the Imitation Game (BIG) Bench Hard suite 
 BBQ  Bias Benchmark for Question answering 
 BCE  Binary Cross Entropy 
 BERT  Bidirectional Encoder Representations from Transformers 
 BEST-RQ  BErt-based Speech pre-Training with Random-projection Quantizer 
 BF16  16-bit Brain Floating-point format = ((-1) ** sign_bit) * (2 ** (128 * exponent_bit[7] + 64 * exponent_bit[6] + 32 * exponent_bit[5] + 16 * exponent_bit[4] + 8 * exponent_bit[3] + 4 * exponent_bit[2] + 2 * exponent_bit[1] + 1 * exponent_bit[0] - 127)) * (1 + 1/2 * mantissa_bit[6] + 1/4 * mantissa_bit[5] + 1/8 * mantissa_bit[4] + 1/16 * mantissa_bit[3] + 1/32 * mantissa_bit[2] + 1/64 * mantissa_bit[1] + 1/128 * mantissa_bit[0]) for normal numbers [example: bin(torch.tensor(-1.5, dtype = torch.bfloat16).view(torch.uint16)) = 0b1011111111000000] 
 BFCL  Berkeley Function Calling Leaderboard 
 BFGS  Broyden Fletcher Goldfarb Shanno optimization 
 BFS  Breadth-First Schedule (or Search) 
 Bi-LSTM  Bidirectional LSTM 
 BIG  Beyond the Imitation Game (the Turing Test is known as the Imitation Game) 
 BLAS  Basic Linear Algebra Subprograms 
 BLEU  BiLingual Evaluation Understudy 
 BLOOM  Bigscience Large Open-science Open-access Multilingual language model 
 BM25  Best Match 25 (an extension of TF*IDF with length normalization and term frequency saturation) 
 BN  Batch Normalization (center and scale) 
 BOLD  Bias in Open-ended Language generation Dataset 
 BoolQ  Boolean (yes/no) Questions (dataset) 
 BPE  Byte Pair Encoding 
 BPTT  Back Propagation Through Time 
 BSD  Berkeley Software Distribution license 
 C4  Colossal, Cleaned Common Crawl 
 CAM  Class Activation Map 
 CBOW  Continuous Bag Of Words 
 CBRNE  Chemical, Biological, Radiological, Nuclear, and high-yield Explosives (threats) 
 CD  Contrastive Divergence 
 CelebA  Celebrity faces with Attributes 
 CERN  Conseil Européen pour la Recherche Nucléaire 
 cGAN  conditional GAN 
 ChartQA  Chart Question Answering 
 CI  Confidence Interval, where confidence is the probability that the interval construction method will generate an interval that contains the true value of the parameter of interest [if there is no overlap between a pair of confidence intervals, we assume there is a statistically significant difference between the parameters being compared] 
 CIFAR  Canadian Institute For Advanced Research 
 CLEVR  Compositional Language and Elementary Visual Reasoning 
 CLIP  Contrastive Language-Image Pretraining 
 CLM  Causal Language Modeling 
 CLS  CLaSsification token 
 CNN  Convolutional Neural Network 
 CNTK  Cognitive ToolKit 
 CO2  Carbon diOxide (emissions) 
 COCO  Common Objects in Context 
 CoLA  Corpus of Linguistic Acceptability 
 CoNLL  Conference on Natural Language Learning 
 ConvNet  Convolutional Network 
 CoQA  Conversation Question Answering 
 CoT  Chain of Thought 
 CP  Context Parallelism (input sequence chunks are processed in parallel) 
 CPU  Central Processing Unit 
 CR  Customer Reviews dataset 
 CRF  Conditional Random Field 
 CSAM  Child Sexual Abuse Material 
 CSI  Control Sequence Introducer: an ANSI sequence for controlling foreground and background colors for text, e.g. f'\x1b[0;30;48;2;{red};{green};{blue}m' contains 0 for reset; 30 for black foreground color; 48 for background color; and 2 for red, green, and blue components for background color 
 CSS  Cascading Style Sheet 
 CSV  Comma Separated Values 
 CUDA  Compute Unified Device Architecture 
 cuDNN  CUDA DNN library 
 CV  Cross Validation; also Computer Vision 
 CVF  Computer Vision Foundation 
 CVPR  Computer Vision and Pattern Recognition 
 DAG  Directed Acyclic Graph 
 DCGAN  Deep Convolutional GAN 
 DCQCN  Data Center Quantized Congestion Notification 
 DDDQN  Dueling Double Deep Quality estimation Network [to be fair, I've not seen others abbreviate this] 
 DDPG  Deep Deterministic Policy Gradient 
 DDQN  Double Deep Quality estimation Network (as in two networks) 
 DeBERTa  Decoding-enhanced BERT with disentangled attention 
 DET  Detection Error Trade-off 
 distilBERT  distilled (smaller) version of larger BERT model 
 df  degrees of freedom 
 DFS  Depth-First Schedule (or Search) 
 DL  Deep Learning 
 DM Mathematics  DeepMind Mathematics dataset 
 DNN  Deep Neural Network 
 DocQA  Document Question Answering (dataset) 
 DocVQA  Document Visual Question Answering (dataset) 
 DP  Data Parallelism (observations are processed in parallel) 
 DPO  Direct Preference Optimization 
 DQN  Deep Quality estimation Network 
 DRAM  Dynamic Random Access Memory 
 DRL  Deep Reinforcement Learning 
 DROP  Discrete Reasoning Over the content of Paragraphs 
 DSO  Dynamic Shared Object 
 DSPy  Demonstrate Search Predict for python (pipeline optimization) 
 DSVM  Data Science Virtual Machine 
 DTD  Describable Textures Dataset 
 DUC  Document Understanding Conference 
 EC2  Elastic Compute Cloud 
 EACL  European Chapter of the ACL 
 ECCV  European Conference on Computer Vision 
 ECMP  Equal Cost Multi-Path (routing) 
 ELECTRA  Efficiently Learning an Encoder that Classifies Token Replacements Accurately 
 ELBO  Evidence Lower BOund 
 ELMo  Embeddings from Language Models 
 Elo  Arpad Elo's last name (pronounced "ee lou"): devised rating system where player's initial rating moves up or down based on rating of opponent 
 ELRA  European Language Resources Association 
 ELU  Exponential Linear Unit 
 EM  Exact Match 
 EM  Expectation Maximization 
 EMA  Exponential Moving Average 
 EMNLP  Empirical Methods in Natural Language Processing 
 ETA  Estimated Time of Arrival (of completion) 
 EuroSAT  European Satellite 
 EWMA  Exponentially Weighted Moving Average 
 EXAMS  multi-subject high-school EXAMinationS (dataset) 
 exp  exponential function [base is 'e' (Euler's number ~ 2.71828)] 
 F score  Function returning the harmonic mean of precision and recall (always less than or equal to arithmetic mean) 
 F-beta  TP / (TP + (FP + beta ** 2 * FN) / (1 + beta ** 2)) 
 F1  TP / (TP + (FP + FN) / 2) 
 f8_e4m3  8-bit floating-point format, with 4-bit exponent and 3-bit mantissa 
 FAISS  Facebook Artificial Intelligence Similarity Search 
 FER  Facial Expression Recognition 
 FFN  Feed Forward Network 
 FFT  Fast Fourier Transform 
 FGVC  Fine-Grained Visual Classification 
 FID  Frechet Inception Distance 
 FLaN  Finetuned Language Network 
 FLEURS  Few-shot Learning Evaluation of Universal Representations of Speech 
 FLOPs  FLoating-point Operations (Per Second) 
 FMA  Fused Multiply-Add 
 FN  False Negative [Actual = Positive; Prediction = Negative] 
 FNR  FN Rate 
 FP  False Positive [Actual = Negative; Prediction = Positive] 
 FP8  8-bit Floating Point representation (see f8_e4m3) 
 FPR  FP Rate 
 FRR  False Refusal Rate (false positive rate for safety) 
 FSDP  Fully Sharded Data Parallelism 
 FT  Fine Tuning 
 GAE  Generalized Advantage Estimation 
 GAIA  General AI Assistants (benchmark) 
 GAN  Generative Adversarial Network 
 GAT  Graph ATtention network 
 GB  GigaBytes 
 GCN  Graph Convolutional Network 
 GCP  Google Cloud Platform 
 GELU  Gaussian Error Linear Unit 
 GEMM  GEneral Matrix Multiplication 
 gensim  generate similar 
 GGML  Georgi Gerganov Machine Learning (tensor library and model file format) 
 GGUF  GPT-Generated Unified Format 
 GLM  General Language Model 
 GLM  Generalized Linear Model 
 GloVe  Global Vectors for word representation 
 GLUE  General Language Understanding Evaluation 
 GMAT  Graduate Management Admission Test 
 GNN  Graph Neural Network 
 Gov  Government 
 GPQA  Graduate-level Google-Proof Question Answering (dataset) 
 GPT  Generative Pre-trained Transformer 
 GPTQ  GPT Quantization 
 GPU  Graphics Processing Unit 
 GQA  Generalized Query Attention 
 GQA  Grouped Query Attention 
 GRE  Graduate Record Examination 
 GRU  Gated Recurrent Unit cell (a set of 3 or 6 matrices) 
 GSM8K  Grade School Math 8000 problems dataset 
 GTSRB  German Traffic Sign Recognition Benchmark dataset 
 GTX  Giga Texel shader eXtreme 
 HBM  High-Bandwidth Memory 
 HDF5  Hierarchical Data Format version 5 
 HellaSwag  Harder Endings, Longer contexts, and Lowshot Activities for Situations With Adversarial Generations 
 HELM  Holistic Evaluation of Language Models 
 HH  Helpful and Harmless dialogue dataset 
 HMM  Hidden Markov Model 
 HNSW  Hierarchical Navigable Small Worlds 
 HTML  Hyper Text Markup Language 
 HTTP  Hyper Text Transfer Protocol 
 HTTPS  Hyper Text Transfer Protocol Secure 
 HSV  Hue, Saturation, and Value 
 HumanEval  Human (code) Evaluation (dataset) 
 I  Identity matrix 
 I  Informational message 
 ICASSP  International Conference on Acoustics, Speech, and Signal Processing 
 ICCV  International Conference on Computer Vision 
 ICD  Insecure Code Detector 
 ICLR  International Conference on Learning Representations 
 ICML  International Conference on Machine Learning 
 IDF  Inverse Document Frequency 
 IDSIA  Istituto Dalle Molle di Studi sull'Intelligenza Artificiale 
 IEEE  Institute of Electrical and Electronics Engineers 
 IFEval  Instruction Following Evaluation (benchmark) 
 IFT  Instruction Fine Tuning 
 IID  Independent and Identically Distributed 
 IJCAI  International Joint Conference on Artificial Intelligence 
 IJCNLP  International Joint Conference on Natural Language Processing 
 ILSVRC  Imagenet Large Scale Visual Recognition Challenge 
 IMDB  Internet Movie DataBase 
 IML  Instruction Meta Learning 
 IO  Input Output 
 IOU  Intersection Over Union 
 IRA  Irish Republican Army (referenced by a paper, regarding safety) 
 IS  Inception Score 
 ISBN  International Standard Book Number 
 ISSN  International Standard Serial Number 
 ITN  Inverse Text Normalization 
 JSON  JavaScript Object Notation 
 k  A variable often used to represent a count, as in k-fold CV or k-means 
 K80  Kepler 80 Nvidia GPU 
 KITTI  Karlsruhe Institute of Technology and Toyota Technological Institute 
 KL  Kullback - Leibler divergence (relative entropy) 
 KTO  Kahneman-Tversky Optimization 
 l1, l2  Lebesgue space norm, defined as the "p"-th root of the sum of absolute values raised to the "p"-th power 
 L-BFGS  Limited-memory Broyden Fletcher Goldfarb Shanno optimization 
 LAMB  Layerwise Adaptive Moments optimizer for Batch training 
 LaMDA  Language Model for Dialog Applications 
 LCFT  Long Context Fine Tuning 
 LG  LLaMA Guard 
 LHC  Large Hadron Collider 
 libROSA  library for the Recognition and Organization of Speech and Audio 
 LID  Language IDentification 
 LLaMA  Large Language model Meta AI 
 LLaVA  Large Language and Vision Assistant 
 LLM  Large Language Model 
 LM  Language Model 
 LMSys  Large Model Systems (organization) 
 log  logarithm [base is 'e' (Euler's number), unless specified otherwise] 
 LoRA  Low Rank Adaptation 
 LR  Learning Rate 
 LREC  Language Resources and Evaluation Conference 
 LSAT  Law School Admission Test 
 LSTM  Long Short-Term Memory cell (a set of 4 or 8 matrices) 
 LT  Lost Toxicity 
 M4T  Massively Multilingual and Multimodal Machine Translation 
 M60  Maxwell 60 Nvidia GPU 
 MAE  Mean Absolute Error 
 MAP  Maximum A Posteriori 
 MAP@k  Mean Average Precision for 'k' recommendations 
 MAP-Elites  Multi-dimensional Archive of Phenotypic Elites 
 MAST  ML Application Scheduler on Twine (Twine is Meta's cluster management system) 
 MATH  Mathematics Aptitude Test of Heuristics (dataset) 
 MB  Mega Bytes 
 MBPP  Mostly Basic Python Problems (dataset) 
 MC  Monte Carlo 
 MC  Multiple Choice 
 MCMC  Markov Chain Monte Carlo 
 MCQ  Multiple Choice Question 
 MCTS  Monte Carlo Tree Search 
 MDP  Markov Decision Process 
 METEOR  Metric for Evaluation of Translation with Explicit ORdering 
 MFU  Model FLOPs Utilization 
 MFCC  Mel(ody) Frequency Cepstral Coefficients 
 MGSM  Multilingual Grade School Math 
 MHR  Modularity - Hierarchy - Reuse 
 MIPRO  Multi-prompt Instruction PRoposal Optimizer 
 MIT  Massachusetts Institute of Technology license 
 ML  Machine Learning 
 MLE  Maximum Likelihood Estimate 
 MLM  Masked Language Modeling 
 MLP  Multi-Layer Perceptron (stack of "dense" layers) 
 MLS  Multilingual LibriSpeech 
 MMDialog  Multi-Modal Dialog 
 MMLU  Massive Multi-task Language Understanding 
 MMLU-Pro  Massive Multi-task Language Understanding - Professional 
 MMMU  Massive Multi-discipline Multimodal Understanding (benchmark) 
 MNIST  Modified NIST 
 MNLI  Multi-genre Natural Language Inference dataset 
 MoE  Mixture of Experts 
 MPNet  Masked and Permuted pre-training Network 
 MPQA  Multi-Perspective Question Answering dataset 
 MPT  MosaicML Pretrained Transformer (Databricks) 
 MR  Movie Reviews dataset 
 MRI  Magnetic Resonance Imaging 
 MRPC  Microsoft Research Paraphrase Corpus 
 MSE  Mean Squared Error 
 MT  Machine Translation 
 MT-Bench  Multi-Turn Benchmark 
 MuSR  Multi-step Soft Reasoning 
 MXNet  Mixing eager and graph mode for Networks 
 n  A variable often used for a count of something; e.g. n-dimensional or n-gram 
 NAACL  North American chapter of the ACL 
 NaN  Not a Number 
 NAS  Neural Architecture Search 
 NCCL  Nvidia Collective Communications Library 
 NCCLX  Nvidia Collective Communications Library eXtension (Meta) 
 NDCG@k  Normalized Discounted Cumulative Gain for 'k' recommendations 
 NER  Named Entity Recognition 
 NeurIPS  Neural Information Processing Systems 
 NExT-QA  Next generation of VQA models to Explain Temporal actions 
 NF4  Normal Float 4 (4-bits) 
 NIC  Network Interface Card 
 NIH  National Institutes of Health 
 NIH  Needle In a Haystack 
 NIST  National Institute of Standards and Technology 
 NLG  Natural Language Generation 
 NLI  Natural Language Inference (entailment: if A then B; contradiction: if A then not B) 
 NLL  Negative Log Likelihood 
 NLLB  No Language Left Behind (translation) 
 NLP  Natural Language Processing 
 NLTK  Natural Language ToolKit 
 NMS  Non Max Suppression 
 NMT  Neural Machine Translation 
 NN  Nearest Neighbor 
 NN  Neural Network 
 NPC  Non-Playable Character (a character controlled by a computer) 
 NSFW  Not Safe For Work 
 NUMA  Non-Uniform Memory Access 
 NumPy  Numeric library for Python 
 Nvidia  "invidia" is Latin for "envy", which sounds like a pronunciation of NV (Next Vision) 
 OBQA  Open Book Question Answering 
 OCR  Optical Character Recognition 
 OGB  Open Graph Benchmark 
 OGBN  OGB Node property prediction task 
 OOV  Out Of Vocabulary 
 OPRO  Optimization by PROmpting 
 OPT  Open Pre-trained Transformer 
 P40  Pascal 40 Nvidia GPU 
 PAIR  Prompt Automatic Iterative Refinement 
 PaLM  Pathways Language Model 
 PAWS  Paraphrase Adversaries from Word Scrambling 
 PB  Peta Bytes 
 PCA  Principal Component Analysis 
 PCam  Patch Camelyon 
 PCI  Peripheral Component Interconnect 
 PDF  Portable Document Format 
 PDF  Probability Density Function 
 PEFT  Parameter Efficient Fine Tuning 
 PG  Policy Gradient 
 PHP  Personal Home Page 
 PHP  PHP: Hypertext Preprocessor 
 PhotoDNA  Photo DeoxyriboNucleic Acid (image identification) 
 PII  Personally Identifiable Information 
 PIL  Python Imaging Library 
 PIQA  Physical Interaction Question Answering 
 PMLR  Proceedings of Machine Learning Research 
 POMDP  Partially Observable Markov Decision Process 
 POS  Part Of Speech 
 PER  Prioritized Experience Replay 
 PID  Process Identifier 
 Pixel  Picture element 
 PLM  Permuted Language Modeling 
 PM  Prosody Model 
 PNG  Portable Network Graphics image format 
 PP  Pipeline Parallelism 
 PPO  Proximal Policy Optimization 
 PR  Precision vs Recall curve 
 ProLog  Programming Logic language 
 PTB  Penn TreeBank 
 PubMed  indexed Published Medical literature 
 PUE  Power Usage Effectiveness (GPUs require cooling) 
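 pvalue  probability, assuming the null hypothesis is true, of a test statistic at least as extreme as the one observed (rejecting when pvalue < alpha limits the false-reject rate to alpha) 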
 QA  Question Answering 
 QKV  Query Key Value 
 QLoRA  Quantized Low Rank Adaptation 
 QNLI  Question-answering Natural Language Inference dataset 
 QQP  Quora Question Pairs (dataset) 
 QuAC  Question Answering in Context dataset 
 QuALITY  Question Answering with Long Input Texts, Yes! 
 QT  Quality Tuning 
 R-CNN  Region-based CNN 
 RaCE  Reading Comprehension dataset from Examinations 
 RAG  Retrieval Augmented Generation 
 RAGAS  RAG ASsessment (framework) 
 RAM  Random Access Memory 
 Rand  Random 
 RDMA  Remote Direct Memory Access 
 ReAct  Reasoning and Acting (agent loop) 
 REINFORCE  REward Increment = Nonnegative Factor times Offset Reinforcement times Characteristic Eligibility 
 ReLU  Rectified Linear Unit 
 RESISC  Remote Sensing Image Scene Classification 
 ResNet  Residual Network 
 REST  REpresentational State Transfer 
 RGB  Red, Green, and Blue 
 RL  Reinforcement Learning 
 RLAIF  Reinforcement Learning from AI Feedback 
 RLHF  Reinforcement Learning from Human Feedback 
 RM  Reward Model 
 RMSnorm  Root Mean Square normalization 
 RMSprop  Root Mean Square gradient propagation 
 RNN  Recurrent Neural Network 
 RoBERTa  Robustly optimized BERT approach 
 ROC  Receiver Operating Characteristic curve 
 RoCE  RDMA over Converged Ethernet 
 ROI  Region Of Interest 
 RoPE  Rotary Position Embeddings 
 ROUGE  Recall-Oriented Understudy for Gisting Evaluation 
 RS  Rejection Sampling 
 RT  RunTime; also RealTime 
 RTE  Recognizing Textual Entailment dataset 
 RTX  Ray-tracing Texel eXtreme 
 RWKV  Receptance Weighted Key Value (architecture) 
 SAC  Soft Actor Critic 
 SARSA  State Action Reward State Action 
 SAT  Scholastic Aptitude Test 
 SBERT  Sentence BERT 
 SciPy  Scientific library for Python 
 SDPA  Scaled Dot Product Attention 
 SELU  Scaled Exponential Linear Unit 
 SentEval  Sentence Evaluation 
 seq2seq  sequence-to-sequence 
 SFT  Supervised Fine Tuning 
 SG  Skip-Gram 
 SGD  Stochastic Gradient Descent 
 SGM  Standard Generalized Markup text format 
 SICK-R  Sentences Involving Compositional Knowledge - Relatedness 
 SIGCOMM  Special Interest Group on data COMMunications 
 SIGIR  Special Interest Group on Information Retrieval 
 SiLU  Sigmoid Linear Unit (activation function); aka Swish 
 SIQA  Social Interaction Question Answering 
 SLM  Strange Loop Machine (MDP loop) 
 SLT  Spoken Language Technology 
 SLT  Statistical Learning Theory 
 SME  Subject Matter Expert 
 SMI  System Management Interface 
 SMoE  Sparse Mixture of Experts 
 SNAP  Stanford Network Analysis Platform 
 SNLI  Stanford Natural Language Inference dataset 
 SO  Shared Object 
 spaCy  syntactic parser using C-extensions for python (Cython) 
 SQL  Structured Query Language 
 SRAM  Static Random Access Memory 
 SRN  Simple Recurrent Network [refers to SimpleRNN() layer] 
 SSCD  Self-Supervised Copy Detection 
 SSD  Single Shot multibox Detector 
 SSD  Solid State Drive 
 SST  Stanford Sentiment Treebank 
 STEM  Science, Technology, Engineering, and Mathematics 
 STL  Self-Taught Learning 
 STSb  Semantic Text Similarity benchmark 
 SUN  Scene Understanding dataset 
 SUTLM  Speech Unit and Text Language Model 
 SVHN  Street View House Numbers dataset 
 SVM  Support Vector Machine 
 SWA  Sliding Window Attention 
 SWAG  Situations With Adversarial Generations 
 swin  shifted window (transformer) 
 SwiGLU  Swish Gated Linear Unit (activation function) 
 SXM#  Server PCI eXpress Module, with version number 
 t  A variable often used for a test statistic, as in t statistic, t distribution, t test 
 T5  Text-To-Text Transfer Transformer 
 tanh  hyperbolic tangent 
 TB  Tera Bytes 
 tCO2eq  tonnes of carbon dioxide equivalent 
 TD  Temporal Difference 
 TDP  Thermal Design Power 
 TD3  Twin Delayed Deep Deterministic policy gradient 
 TDNN-OPGRU  Time-Delay Neural Network with Output-gate Projected GRU 
 Texel  Texture element 
 TextVQA  Text Visual Question Answering 
 TF  Term Frequency 
 TF-IDF  Term Frequency - Inverse Document Frequency 
 TL;DR  Too Long; Didn't Read: a prefix for a summary 
 TN  Text Normalization 
 TN  True Negative [Actual = Negative; Prediction = Negative] 
 TP  Tensor Parallelism (feature chunks processed in parallel) 
 TP  True Positive [Actual = Positive; Prediction = Positive] 
 TPR  True Positive Rate 
 TPU  Tensor Processing Unit 
 TReC  Text Retrieval Conference 
 TRL  Transformer Reinforcement Learning 
 TRPO  Trust Region Policy Optimization 
 TSNE  T-distributed Stochastic Neighbor Embedding 
 TSV  Tab Separated Values 
 TTS  Text To Speech 
 TV  Television 
 TVQA  Television Question Answering (dataset) 
 UCB  Upper Confidence Bound 
 UCF  University of Central Florida 
 ULMFiT  Universal Language Model Fine Tuning 
 UMAP  Uniform Manifold Approximation and Projection 
 URL  Uniform Resource Locator 
 US  United States 
 USA  United States of America 
 USE  Universal Sentence Encoder 
 USENIX  Unix Users Group (organization) 
 UTF-8  Unicode Transformation Format - 8-bit, where a character can be represented by a 1-byte, 2-byte, 3-byte, or 4-byte sequence; the first byte of a character determines how many bytes are used to represent the character [0-127 are 1-byte ASCII values] 
 V100  Volta 100 Nvidia GPU 
 VAD  Voice Activity Detection 
 VAE  Variational AutoEncoder 
 VGG-16  Oxford University Visual Geometry Group 16-layer network 
 VI  Variational Inference 
 ViP-LLaVA  Visual Prompt - LLaVA 
 ViT  Vision Transformer 
 vLLM  virtual LLM (inference engine) 
 VM  Virtual Machine 
 VOC  Visual Object Classes (challenge) 
 vocoder  voice encoder 
 VPG  Vanilla Policy Gradient 
 VQA  Visual Question Answering 
 VR  Violation Rate (false negative rate for safety) 
 VR  Virtual Reality 
 VRAM  Video RAM 
 VTAB  Visual Task Adaptation Benchmark 
 W&B  Weights and Biases 
 WACV  Winter conference on Applications of Computer Vision 
 Wav  Waveform audio format 
 WER  Word Error Rate 
 WinoGrande  adversarial Winograd Schema challenge (identify the antecedent of an ambiguous term) 
 WNLI  Winograd Natural Language Inference dataset 
 WuPS  Wu and Palmer Similarity 
 XAI  X (formerly Twitter) AI 
 XGBoost  eXtreme Gradient Boosting 
 XLA  accelerated Linear Algebra 
 XLM-R  cross-lingual Language Model - RoBERTa, where 'X' represents a cross 
 XS Test  eXaggerated Safety behaviors Test 
 YFCC  Yahoo Flickr Creative Commons 
 YOLO  You Only Look Once 
 ZeroSCROLLS  Zero-shot CompaRison Over Long Language Sequences 
 ZIP  Zone Improvement Plan
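
The bit layout described in the BF16 entry can be checked without PyTorch. The sketch below is illustrative only: the helper names float_to_bf16_bits and bf16_bits_to_float are hypothetical, the conversion truncates a float32 encoding instead of rounding it (an assumption that is exact for values such as -1.5 that fit in 7 mantissa bits), and only normal numbers are decoded.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return a 16-bit BF16 pattern for x by truncating its float32 encoding.

    Assumption: plain truncation of the low 16 bits, no rounding.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits: int) -> float:
    """Decode a normal (finite, non-zero, non-subnormal) BF16 bit pattern."""
    sign_bit = (bits >> 15) & 0x1
    exponent = (bits >> 7) & 0xFF   # 8 exponent bits, bias 127
    mantissa = bits & 0x7F          # 7 mantissa bits, weights 1/2 ... 1/128
    if exponent in (0, 0xFF):
        raise ValueError("zero, subnormal, infinity, and NaN are not handled")
    return (-1) ** sign_bit * 2.0 ** (exponent - 127) * (1 + mantissa / 128)


bits = float_to_bf16_bits(-1.5)
print(bin(bits))                  # 0b1011111111000000
print(bf16_bits_to_float(bits))   # -1.5
```

The printed pattern splits into sign 1, exponent 01111111 (127, so a scale of 2 ** 0), and mantissa 1000000 (1 + 1/2), which matches the example given in the BF16 entry.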