Repository: blmoistawinde/ml_equations_latex (https://github.com/blmoistawinde/ml_equations_latex), language: HTML 99.9%

# Classical ML Equations in LaTeX

A collection of classical ML equations in LaTeX. Some of them are provided with simple notes and paper links. Hoping to help with writing such as papers and blogs. Better viewed at https://blmoistawinde.github.io/ml_equations_latex/

## Model

### RNNs (LSTM, GRU)

Encoder hidden state $h_t$ at time step $t$, with input token embedding $x_t$:

$$h_t = RNN_{enc}(x_t, h_{t-1})$$

Decoder hidden state $s_t$ at time step $t$, with input token embedding $y_t$:

$$s_t = RNN_{dec}(y_t, s_{t-1})$$
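The recurrence above can be sketched with a vanilla RNN cell in NumPy. This is an illustrative toy, not code from the repo: `rnn_step`, the weight names, and the dimensions are all assumptions.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One vanilla RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

rng = np.random.default_rng(0)
d_in, d_h = 4, 3                        # toy sizes, chosen for illustration
W_xh = rng.normal(size=(d_h, d_in))
W_hh = rng.normal(size=(d_h, d_h))
b_h = np.zeros(d_h)

h = np.zeros(d_h)                       # h_0
for x_t in rng.normal(size=(5, d_in)):  # 5 input token embeddings
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

An LSTM or GRU replaces this single `tanh` update with gated updates, but the interface (current input and previous state in, new state out) is the same.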
### Attentional Seq2seq

The attention weight $\alpha_{ij}$, the $i$th decoder step over the $j$th encoder step, resulting in context vector $c_i$:

$$e_{ij} = a(s_{i-1}, h_j)$$

$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$$

$$c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j$$
$a$ is a specific attention function, which can be:

#### Bahdanau Attention

Paper: Neural Machine Translation by Jointly Learning to Align and Translate

$$e_{ij} = v_a^T \tanh(W_a s_{i-1} + U_a h_j)$$
#### Luong (Dot-Product) Attention

Paper: Effective Approaches to Attention-based Neural Machine Translation

If $s_{i-1}$ and $h_j$ have the same number of dimensions:

$$e_{ij} = s_{i-1}^T h_j$$

otherwise:

$$e_{ij} = s_{i-1}^T W_a h_j$$
Finally, the output $o_t$ is produced by:

$$s_t = \tanh(W[s_{t-1}; y_t; c_t])$$

$$o_t = \mathrm{softmax}(V s_t)$$
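The Bahdanau scoring, softmax normalization, and context-vector steps above can be sketched in NumPy for a single decoder step. All names and dimensions here are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

def bahdanau_attention(s_prev, H, W_a, U_a, v_a):
    # scores e_j = v_a^T tanh(W_a s_{i-1} + U_a h_j), one per encoder state h_j
    e = np.tanh(s_prev @ W_a.T + H @ U_a.T) @ v_a
    alpha = softmax(e)        # attention weights over the encoder steps
    c = alpha @ H             # context vector c_i = sum_j alpha_ij h_j
    return alpha, c

rng = np.random.default_rng(0)
d_h, d_s, d_a, T = 4, 3, 5, 6          # toy sizes
H = rng.normal(size=(T, d_h))          # encoder hidden states h_1..h_T
s_prev = rng.normal(size=d_s)          # previous decoder state s_{i-1}
W_a = rng.normal(size=(d_a, d_s))
U_a = rng.normal(size=(d_a, d_h))
v_a = rng.normal(size=d_a)
alpha, c = bahdanau_attention(s_prev, H, W_a, U_a, v_a)
```

The weights `alpha` sum to 1 over the encoder steps, so `c` is a convex combination of the encoder states.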
### Transformer

Paper: Attention Is All You Need

#### Scaled Dot-Product Attention

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$
where $d_k$ is the dimension of the key vector $k$ and query vector $q$.

#### Multi-head Attention

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)W^O$$

where

$$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q, KW_i^K, VW_i^V)$$
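Scaled dot-product attention maps directly to a few matrix operations. A minimal NumPy sketch (the `softmax` helper and the toy shapes are assumptions for illustration):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    return softmax(scores) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 8))   # 2 queries, d_k = 8 (toy sizes)
K = rng.normal(size=(5, 8))   # 5 keys, same dimension as the queries
V = rng.normal(size=(5, 4))   # 5 matching values, d_v = 4
out = scaled_dot_product_attention(Q, K, V)
```

Multi-head attention runs this routine $h$ times on learned projections of $Q$, $K$, $V$ and concatenates the results before the final $W^O$ projection.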
### Generative Adversarial Networks (GAN)

Paper: Generative Adversarial Networks

Minmax game objective:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
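In practice the two expectations are estimated over mini-batches of discriminator outputs. A small NumPy sketch of the empirical objective value (the function name and inputs are assumptions, not repo code):

```python
import numpy as np

def gan_objective(d_real, d_fake):
    # Empirical V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))],
    # with d_real = D(x) on real samples and d_fake = D(G(z)) on fakes
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

v = gan_objective(np.array([0.5, 0.5]), np.array([0.5, 0.5]))
```

At the theoretical optimum $D(x) = 1/2$ everywhere, the value is $2\log\frac{1}{2} = -\log 4$.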
### Variational Auto-Encoder (VAE)

Paper: Auto-Encoding Variational Bayes

#### Reparameterization trick

To produce a latent variable $z$ such that $z \sim \mathcal{N}(\mu, \sigma^2)$, we sample $\epsilon \sim \mathcal{N}(0, 1)$, then $z$ is produced by

$$z = \mu + \epsilon \cdot \sigma$$
Above is for the 1-D case. For a multi-dimensional (vector) case we use:

$$\epsilon \sim \mathcal{N}(0, I)$$

$$z = \mu + \epsilon \odot \sigma$$
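The point of the trick is that the randomness is moved into $\epsilon$, so $\mu$ and $\sigma$ stay differentiable. A minimal NumPy sketch of the sampling step (the helper name is an assumption):

```python
import numpy as np

def reparameterize(mu, sigma, rng):
    # z = mu + epsilon * sigma, epsilon ~ N(0, I); sampling is pushed into
    # epsilon so gradients can flow through mu and sigma
    eps = rng.standard_normal(np.shape(mu))
    return mu + eps * sigma

rng = np.random.default_rng(0)
mu = np.array([0.0, 2.0])
sigma = np.array([1.0, 0.5])
samples = np.array([reparameterize(mu, sigma, rng) for _ in range(20000)])
```

Drawing many samples and checking their empirical mean and standard deviation against `mu` and `sigma` confirms that `z` has the intended distribution.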
## Activations

### Sigmoid

Related to Logistic Regression. For single-label/multi-label binary classification.

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
### Tanh

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$
### Softmax

For multi-class single-label classification.

$$\sigma(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
### ReLU

$$\mathrm{ReLU}(x) = \max(0, x)$$
### GELU

$$\mathrm{GELU}(x) = x \cdot \Phi(x)$$

where $\Phi(x)$ is the cumulative distribution function of the standard Gaussian distribution.
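The activations above are one-liners in NumPy. A sketch for quick reference; note that GELU is implemented here with its widely used tanh approximation of $x \cdot \Phi(x)$ rather than the exact Gaussian CDF:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def softmax(z):
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

def relu(x):
    return np.maximum(0.0, x)

def gelu(x):
    # tanh approximation of x * Phi(x)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))
```

In practice `np.tanh` is preferred over the explicit exponential form of `tanh`, which overflows for large `|x|`.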
## Loss

### Regression

Below, $x$ and $y$ are $D$-dimensional vectors, and $x_i$ denotes the value on the $i$th dimension of $x$.

#### Mean Absolute Error (MAE)

$$\mathrm{MAE}(x, y) = \sum_{i=1}^{D} |x_i - y_i|$$
#### Mean Squared Error (MSE)

$$\mathrm{MSE}(x, y) = \sum_{i=1}^{D} (x_i - y_i)^2$$
#### Huber loss

Less sensitive to outliers than the MSE, as it treats the error as squared only inside an interval.

$$L_\delta(a) = \begin{cases} \frac{1}{2}a^2 & \text{for } |a| \le \delta \\ \delta\left(|a| - \frac{1}{2}\delta\right) & \text{otherwise} \end{cases}$$

where $a = x_i - y_i$ is the per-dimension error.
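The three regression losses can be sketched in NumPy (summing over dimensions, matching the equations above; the function names are illustrative):

```python
import numpy as np

def mae(x, y):
    return np.abs(x - y).sum()

def mse(x, y):
    return ((x - y) ** 2).sum()

def huber(x, y, delta=1.0):
    r = np.abs(x - y)
    return np.where(r <= delta,
                    0.5 * r ** 2,                   # quadratic inside the interval
                    delta * r - 0.5 * delta ** 2    # linear outside it
                    ).sum()

x = np.array([0.0, 0.0])
y = np.array([1.0, 2.0])
```

With these inputs and `delta=1.0`, the second dimension's error of 2 falls in the linear branch, which is exactly where Huber is gentler than MSE.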
### Classification

#### Cross Entropy

$$\mathrm{CE}(x, y) = -\sum_{c=1}^{M} y_c \log(x_c)$$

where $y$ is the ground-truth distribution (often one-hot) and $x$ the predicted probabilities over $M$ classes.
#### Negative Loglikelihood

$$\mathrm{NLL}(x, c) = -\log(x_c)$$

Minimizing negative loglikelihood is equivalent to Maximum Likelihood Estimation (MLE). Here $x_c$ is a scalar instead of a vector: it is the predicted probability on the single dimension $c$ where the ground truth lies. It is thus equivalent to cross entropy (see wiki).
#### Hinge loss

Used in Support Vector Machines (SVM).

$$\mathrm{Hinge}(y, \hat{y}) = \max(0, 1 - y \cdot \hat{y})$$
#### KL/JS divergence

$$D_{KL}(p \,\|\, q) = \sum_i p_i \log \frac{p_i}{q_i}$$

$$D_{JS}(p \,\|\, q) = \frac{1}{2} D_{KL}\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} D_{KL}\left(q \,\Big\|\, \frac{p+q}{2}\right)$$
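Cross entropy and the two divergences translate directly into NumPy. A sketch, assuming dense probability vectors with no zero entries in the denominators:

```python
import numpy as np

def cross_entropy(y_true, x_pred):
    # CE = -sum_c y_c log(x_c); y_true is often a one-hot ground truth
    return -np.sum(y_true * np.log(x_pred))

def kl_divergence(p, q):
    return np.sum(p * np.log(p / q))

def js_divergence(p, q):
    # symmetrized, smoothed KL against the mixture m = (p + q) / 2
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

Unlike KL, the JS divergence is symmetric in its arguments, which is one reason it appears in the GAN analysis.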
## Regularization

The $Error$ below can be any of the above losses.

### L1 regularization

A regression model that uses the L1 regularization technique is called Lasso Regression.

$$Loss = Error(y, \hat{y}) + \lambda \sum_{i=1}^{N} |w_i|$$
### L2 regularization

A regression model that uses the L2 regularization technique is called Ridge Regression.

$$Loss = Error(y, \hat{y}) + \lambda \sum_{i=1}^{N} w_i^2$$
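Both penalties are a one-line addition on top of any base loss. A NumPy sketch (the function names and the `kind` switch are illustrative assumptions):

```python
import numpy as np

def l1_penalty(w, lam):
    # Lasso term: lambda * sum_i |w_i|
    return lam * np.abs(w).sum()

def l2_penalty(w, lam):
    # Ridge term: lambda * sum_i w_i^2
    return lam * (w ** 2).sum()

def regularized_loss(base_loss, w, lam, kind="l2"):
    penalty = l1_penalty(w, lam) if kind == "l1" else l2_penalty(w, lam)
    return base_loss + penalty
```

L1 tends to drive weights exactly to zero (sparse solutions), while L2 shrinks them smoothly toward zero.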
## Metrics

Some of them overlap with losses, like MAE and KL-divergence.

### Classification

#### Accuracy, Precision, Recall, F1

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
#### Sensitivity, Specificity and AUC

$$\mathrm{Sensitivity} = \mathrm{Recall} = \frac{TP}{TP + FN}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
AUC is calculated as the Area Under the $\mathrm{Sensitivity}$ (TPR)-$(1 - \mathrm{Specificity})$ (FPR) Curve.

### Regression

MAE, MSE: see the equations above.
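The classification metrics above all derive from the four confusion-matrix counts. A minimal sketch (the function name is illustrative; it assumes no count combination divides by zero):

```python
def classification_metrics(tp, fp, tn, fn):
    # All metrics follow from the confusion-matrix counts
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also sensitivity / TPR
    specificity = tn / (tn + fp)       # 1 - FPR
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}
```

AUC is the one metric here that cannot be computed from a single confusion matrix: it requires sweeping the decision threshold and integrating TPR over FPR.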