Open-source project: ritheshkumar95/im2latex-tensorflow
Repository URL: https://github.com/ritheshkumar95/im2latex-tensorflow
Language breakdown: HTML 69.7%

# im2latex tensorflow implementation

This is a TensorFlow implementation of the HarvardNLP paper [What You Get Is What You See: A Visual Markup Decompiler](http://arxiv.org/pdf/1609.04938v1.pdf), which provides the technical details of the model. It is also a potential solution to OpenAI's Requests for Research problem, im2latex. The original Torch implementation of the paper is available at https://github.com/harvardnlp/im2markup/.
This is a general-purpose, deep-learning-based system for decompiling an image into presentational markup. For example, it can infer the LaTeX or HTML source from a rendered image. An example input is a rendered LaTeX formula; the goal is to infer the LaTeX source that renders exactly that image.
## Sample results from this implementation

For more results, see the results_validset.html and results_testset.html files.

## Prerequisites

Most of the code is written in TensorFlow, with Python for preprocessing.

## Preprocess

The preprocessing for this dataset exactly reproduces that of the original Torch implementation by the HarvardNLP group. It requires:

- Python
- Optional: Node.js and KaTeX, used for preprocessing
- pdflatex, used for rendering LaTeX during evaluation
- ImageMagick convert, used for rendering LaTeX during evaluation
- webkit2png, used for rendering HTML during evaluation

### Preprocessing instructions

The images in the dataset contain a LaTeX formula rendered on a full page. To accelerate training, the images need to be preprocessed. Download the training data from https://zenodo.org/record/56198#.WFojcXV94jA and extract it into the source (master) folder.
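A minimal sketch of the cropping step, borrowing the script name and flags from the original im2markup preprocessing; the directory names here are illustrative:

```bash
# Crop the formula area and downsample the images (im2markup preprocessing script;
# input/output directory names are illustrative).
python scripts/preprocessing/preprocess_images.py \
    --input-dir formula_images \
    --output-dir images_processed
```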
The above command crops the formula area and groups images of similar sizes to facilitate batching. Next, the LaTeX formulas need to be tokenized and normalized.
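A sketch of the normalization step, again assuming the im2markup script and flags; the file names are illustrative:

```bash
# Tokenize/normalize the LaTeX formulas (file names are illustrative).
python scripts/preprocessing/preprocess_formulas.py \
    --mode normalize \
    --input-file im2latex_formulas.lst \
    --output-file formulas.norm.lst
```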
The above command normalizes the formulas. Note that it will emit some error messages, since some formulas cannot be parsed by the KaTeX parser. Next, we prepare the train, validation, and test files: large images are excluded from the training and validation sets, and formulas with too many tokens or with grammar errors are ignored as well.
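A sketch of the filtering step, assuming the im2markup filter script; run it once per split (train, validation, test), with illustrative file names:

```bash
# Filter out oversized images and formulas with too many tokens or grammar errors.
# Repeat for the validation and test lists (file names are illustrative).
python scripts/preprocessing/preprocess_filter.py \
    --filter \
    --image-dir images_processed \
    --label-path formulas.norm.lst \
    --data-path im2latex_train.lst \
    --output-path train_filter.lst
```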
Finally, we generate the vocabulary from the training set. All tokens occurring at most once are excluded from the vocabulary.
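A sketch of the vocabulary step, assuming the im2markup vocabulary script and illustrative file names:

```bash
# Build the LaTeX token vocabulary from the filtered training list; tokens that
# occur at most once are dropped (file names are illustrative).
python scripts/preprocessing/generate_latex_vocab.py \
    --label-path formulas.norm.lst \
    --data-path train_filter.lst \
    --output-file latex_vocab.txt
```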
Train, test, and validation images need to be segmented into buckets based on image size (height, width) to facilitate batch processing. train_buckets.npy, valid_buckets.npy, and test_buckets.npy can be generated with the DataProcessing.ipynb notebook.
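The notebook can also be run non-interactively; a minimal sketch, assuming Jupyter is installed:

```bash
# Execute DataProcessing.ipynb headlessly; the notebook generates
# train_buckets.npy, valid_buckets.npy, and test_buckets.npy.
jupyter nbconvert --to notebook --execute DataProcessing.ipynb
```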
## Train
The default hyperparameters were used.
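A minimal sketch of launching training, assuming attention.py (the training and scoring script named below) is the entry point and needs no extra flags:

```bash
# Start training with the default hyperparameters; attention.py also scores the
# train and validation sets after each epoch (the bare invocation is an assumption).
python attention.py
```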
The training NLL drops to 0.08 after 18 epochs of training on a 24 GB Nvidia M40 GPU.

## Test

The predict() function in the attention.py script can be called to predict from the validation or test sets. The Predict.ipynb notebook displays and renders the results saved by the predict() function.

## Evaluate

attention.py scores the train set and validation set after each epoch (it measures mean train NLL and perplexity).

## Scores from this implementation

## Weight files

## Visualizing the attention mechanism