
matze-dd/YaLafi: Yet another LaTeX filter


Open-source software name:

matze-dd/YaLafi

Open-source software URL:

https://github.com/matze-dd/YaLafi

Open-source programming language:

Python 94.9%

Open-source software introduction:

YaLafi: Yet another LaTeX filter

Notice. The library of supported LaTeX macros, environments, document classes, and packages is still rather restricted; compare the list of macros. Please don't hesitate to raise an issue if you would like something to be added.

Summary. This Python package extracts plain text from LaTeX documents. The software may be integrated with a proofreading tool and an editor. It provides

  • mapping of character positions between LaTeX and plain text,
  • simple inclusion of user-defined LaTeX macros and environments with tailored treatment,
  • careful conservation of text flows,
  • some parsing of displayed equations to detect included “normal” text and punctuation problems,
  • support of multi-language documents (experimental).

The sample Python application yalafi.shell from section Example application integrates the LaTeX filter with the proofreading software LanguageTool. It sends the extracted plain text to the proofreader, maps position information in returned messages back to the LaTeX text, and generates results in different formats. You may easily

  • create a proofreading report in text or HTML format for a complete document tree,
  • check LaTeX texts in the editors Vim, Emacs and Atom via several plugins,
  • run the script as emulation of a LanguageTool server with integrated LaTeX filtering.

For instance, the LaTeX input

Only few people\footnote{We use
\textcolor{red}{redx colour.}}
is lazy.

will lead to the text report

1.) Line 2, column 17, Rule ID: MORFOLOGIK_RULE_EN_GB
Message: Possible spelling mistake found
Suggestion: red; Rex; reds; redo; Red; Rede; redox; red x
Only few people is lazy.    We use redx colour. 
                                   ^^^^
2.) Line 3, column 1, Rule ID: PEOPLE_VBZ[1]
Message: If 'people' is plural here, don't use the third-person singular verb.
Suggestion: am; are; aren
Only few people is lazy.    We use redx colour. 
                ^^

This is the corresponding HTML report (for an example with a Vim plugin, see here):

[Screenshot: HTML report]

The tool builds on results from project Tex2txt, but differs in the internal processing method. Instead of using recursive regular expressions, a simple tokeniser and a small machinery for macro expansion are implemented; see sections Differences to Tex2txt and Remarks on implementation.
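To make the idea of a tokeniser with position tracking more concrete, here is a minimal toy sketch. It is not YaLafi's implementation: the names TOKEN, tokenise and plain_with_positions are hypothetical, and unlike the real filter the toy knows nothing about macro arguments (it simply drops macro names and braces and keeps everything else).

```python
import re

# Toy token pattern (an assumption for illustration): control word,
# control symbol, single brace, or a run of other characters.
TOKEN = re.compile(r"\\[A-Za-z]+|\\.|[{}]|[^\\{}]+", re.DOTALL)


def tokenise(latex):
    """Return (position, text) pairs; position is the offset in the input."""
    return [(m.start(), m.group()) for m in TOKEN.finditer(latex)]


def plain_with_positions(latex):
    """Drop macro names and braces, keep text, remember source offsets."""
    chars, positions = [], []
    for pos, tok in tokenise(latex):
        if tok.startswith("\\") or tok in ("{", "}"):
            continue  # macro name or brace: not part of the plain text
        for i, ch in enumerate(tok):
            chars.append(ch)
            positions.append(pos + i)  # offset of this character in the LaTeX
    return "".join(chars), positions


plain, positions = plain_with_positions(r"is \emph{lazy}.")
# positions[k] is the LaTeX offset of plain[k]; this is what allows a
# proofreader message on the plain text to be mapped back to the source.
```

The list of offsets is the essential part: a message reported at plain-text index k can be shown at LaTeX offset positions[k].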

Besides the interface from section Python package interface, Python application scripts like yalafi/shell/shell.py from section Example application can access an interface emulating tex2txt.py from repository Tex2txt via 'from yalafi import tex2txt'. The pure LaTeX filter can be used directly in scripts via a command-line interface, which is described in section Command-line of pure filter.

If you use this software and encounter a bug or have other suggestions for improvement, please leave a note under category Issues, or initiate a pull request. Many thanks in advance.

Happy TeXing!

Contents

Installation
Example application
Interfaces to Vim
Interface to Emacs
Interface to Atom
Usage under Windows
Related projects

Filter actions
Fundamental limitations
Adaptation of LaTeX and plain text
Extension modules for LaTeX packages
Inclusion of own macros

Multi-file projects
Handling of displayed equations
Multi-language documents
Python package interface
Command-line of pure filter
Differences to Tex2txt
Remarks on implementation

Installation

YaLafi (requires at least Python version 3.6). Choose one of the following possibilities.

  • Use python -m pip install [--user] yalafi. This installs the latest version uploaded to PyPI. Module pip itself can be installed with python -m ensurepip.
  • Run python -m pip install [--user] git+https://github.com/matze-dd/YaLafi.git@master. This installs the current snapshot from the GitHub repository.
  • Download the archive from here and unpack it. Place yalafi/ in the working directory, or in a standard directory like /usr/lib/python3.8/ or ~/.local/lib/python3.8/site-packages/. You can also locate it somewhere else and set environment variable PYTHONPATH accordingly.

LanguageTool. On most systems, you have to install the software “manually” (1). At least under Arch Linux, you can also use a package manager (2). Please note that, for example under Ubuntu, sudo snap install languagetool will not install the components required here.

  1. The LanguageTool zip archive, for example LanguageTool-5.0.zip, can be obtained from the LanguageTool download page. Option --lt-directory of application yalafi.shell from section Example application has to point to the directory created after uncompressing the archive at a suitable place. For instance, the directory has to contain file 'languagetool-server.jar'.

  2. Under Arch Linux, you can simply say sudo pacman -S languagetool. In this case, it is not necessary to set option --lt-directory from variant 1. Instead, you have to specify --lt-command languagetool.
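Before passing a directory to --lt-directory, it can be worth checking that it looks like an unpacked LanguageTool archive. The following sketch only tests for the file 'languagetool-server.jar' named above; the function name looks_like_lt_directory is hypothetical and not part of YaLafi.

```python
from pathlib import Path


def looks_like_lt_directory(directory):
    """Return True if 'languagetool-server.jar' exists in the directory.

    This mirrors only the single requirement stated in the text; it does
    not verify that the LanguageTool installation is complete or usable.
    """
    jar = Path(directory).expanduser() / "languagetool-server.jar"
    return jar.is_file()
```

For instance, looks_like_lt_directory("~/lib/LanguageTool-5.0") expands the home directory and reports whether the server archive is present there.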

Back to contents

Example application

Remark. You can find examples for tool integration with Bash scripts in Tex2txt/README.md.

Example Python script yalafi/shell/shell.py will generate a proofreading report in text or HTML format from filtering the LaTeX input and application of LanguageTool (LT). It is best called as module as shown below, but can also be placed elsewhere and invoked as script. A simple invocation producing an HTML report could be:

python -m yalafi.shell --lt-directory ~/lib/LT --output html t.tex > t.html

With option '--server lt', LT's Web server is contacted. Otherwise, Java has to be present, and the path to LT has to be specified with --lt-directory or --lt-command. Note that from version 4.8 on, LT no longer fully supports 32-bit systems. Both LT and the script print some progress messages to stderr. They can be suppressed with python ... 2>/dev/null.

python -m yalafi.shell [OPTIONS] latex_file [latex_file ...] [> text_or_html_file]

Option names may be abbreviated. If present, options are also read from a configuration file designated by script variable 'config_file' (one option per line, possibly with argument), unless --no-config is given. Default option values are set at the beginning of the Python script.
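If the report is to be generated from another Python script, the command line from the synopsis above can be assembled as an argument list. This is a sketch under assumptions: the helper name yalafi_shell_command and its parameters are hypothetical, and only the options --output, --language and --lt-directory documented in this section are used.

```python
import os
import sys


def yalafi_shell_command(latex_files, lt_directory=None, output="plain",
                         language="en-GB", extra_options=()):
    """Build the argument list for 'python -m yalafi.shell ...'."""
    cmd = [sys.executable, "-m", "yalafi.shell",
           "--output", output, "--language", language]
    if lt_directory is not None:
        # No shell is involved later, so '~' has to be expanded here.
        cmd += ["--lt-directory", os.path.expanduser(str(lt_directory))]
    cmd += list(extra_options)   # e.g. ["--server", "my"]
    cmd += list(latex_files)     # LaTeX files come last, as in the synopsis
    return cmd


cmd = yalafi_shell_command(["t.tex"], lt_directory="~/lib/LT", output="html")
```

The list can then be passed to subprocess.run(cmd, stdout=report_file), where report_file is an open file object for the HTML or text report; this requires YaLafi and, depending on the options, LanguageTool to be installed.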

  • --lt-directory dir
    Directory of the “manual” local LT installation (for variant 1 in section Installation). May be omitted with options '--server lt' and '--textgears apikey', or if script variable 'ltdirectory' has been set appropriately. See also the script comment at variable 'ltdirectory'.
  • --lt-command cmd
    Base command to call LT (for variant 2 in section Installation). For instance, this is '--lt-command languagetool'. If an LT server has to be started, the command is invoked with option --http. Note that option '--server stop' for stopping a local LT server will not work in this case.
  • --as-server port
    Emulate an LT server listening on the given port; for an example, see section Interface to Emacs. The fields of received HTTP requests (settings for language, rules, categories) overwrite option values given in the command line. The internally used proofreader is influenced by options like --server. Other options like --single-letters remain effective.
  • --output mode
    Mode is one of 'plain', 'html', 'xml', 'xml-b', 'json' (default: 'plain' for text report). Variant 'html' generates an HTML report, see below for further details. Modes 'xml', 'xml-b' and 'json' are intended for Vim plugins, compare section Interfaces to Vim.
  • --link
    In an HTML report, left-click on a highlighted text part opens a Web link related to the problem, if provided by LT.
  • --context number
    Number of context lines displayed around each marked text region in HTML report (default: 2). A negative number shows the whole text.
  • --include
    Track file inclusions like \input{...}. Script variable 'inclusion_macros' contains a list of the corresponding LaTeX macro names.
  • --skip regex
    Skip files matching the given regular expression. This is useful, e.g., for the exclusion of figures on option --include.
  • --plain-input
    Assume plain-text input, do not evaluate LaTeX syntax. This cannot be used together with options --include or --replace.
  • --list-unknown
    Only print a list of unknown macros and environments seen outside of maths parts. Compare, for instance, Issue #183.
  • --language lang
    Language code as expected by LT (default: 'en-GB').
  • --encoding ienc
    Encoding for LaTeX input and files from options --define and --replace (default: UTF-8).
  • --replace file
    File with phrase replacements to be performed after the conversion to plain text; see section Phrase replacement in the plain text.
  • --define file
    Read macro definitions as LaTeX code (using \newcommand or \def). If the code invokes \documentclass or \usepackage, then the corresponding modules are loaded.
  • --documentclass class
    Load extension module for this class. See section Extension modules for LaTeX packages.
  • --packages modules
    Load these extension modules for LaTeX packages, given as comma-separated list (default: '*'). See section Extension modules for LaTeX packages.
  • --add-modules file
    Parse the given LaTeX file and prepend all modules included by macro \usepackage to the list provided in option --packages. Value of option --documentclass is overridden by macro \documentclass.
  • --extract macros
    Only check first mandatory argument of the LaTeX macros whose names are given as comma-separated list. The option only works properly for predefined macros, including those imported by options --documentclass, --define, and --packages. This is useful for check of foreign-language text, if marked accordingly. Internally used for detection of file inclusions on --include.
  • --simple-equations
    Replace a displayed equation only with a single placeholder from collections 'math_repl_display*' in file yalafi/parameters.py; append trailing punctuation, if present.
  • --no-specials
    Revert changes from special macros and magic comments described in section Modification of LaTeX text.
  • --disable rules
    Comma-separated list of ignored LT rules, is passed as --disable to LT (default: 'WHITESPACE_RULE').
  • --enable rules
    Comma-separated list of added LT rules, is passed as --enable to LT (default: '').
  • --disablecategories cats
    --enablecategories cats
    Disable / enable LT rule categories, directly passed to LT (default for both: '').
  • --lt-options opts
    Pass additional options to LT, given as single string in argument 'opts'. The first character of 'opts' will be skipped and must not be '-'. Example: --lt-options '~--languagemodel ../Ngrams --disablecategories PUNCTUATION'. Some options are included into HTTP requests to an LT server, see script variable 'lt_option_map'.
  • --single-letters accept
    Check for single letters, accepting those in the patterns given as list separated by '|'. Example: --single-letters 'A|a|I|e.g.|i.e.||' for an English text, where the trailing '||' causes the addition of equation and language-change replacements from collections 'math_repl_*' and 'lang_change_repl_*' in file yalafi/parameters.py. All characters except '|' are taken verbatim, but '~' and '\,' are interpreted as UTF-8 non-breaking space and narrow non-breaking space.
  • --equation-punctuation mode
    This is an experimental hack for the check of punctuation after equations in English texts, compare section Equation replacements in English documents. An example is given in section Differences to Tex2txt. The abbreviatable mode values indicate the checked equation type: 'displayed', 'inline', 'all'.
    The check generates a message, if an element of an equation is not terminated by a dot '.', and at the same time is not followed by a lower-case word or another equation element, both possibly separated by a punctuation mark from ',;:'. Patterns for equation elements are given by collections 'math_repl_display*' and 'math_repl_inline*' in file yalafi/parameters.py.
  • --server mode
    Use LT's Web server (mode is 'lt') or a local LT server (mode is 'my') instead of LT's command-line tool. Stop the local server (mode is 'stop', currently only works under Linux and Cygwin).
    • LT's server: Server address is set in script variable 'ltserver'. For conditions and restrictions, please refer to https://dev.languagetool.org/public-http-api.
    • Local server: If not yet running, then start it according to script variable 'ltserver_local_cmd'. On option --lt-command, the specified command is invoked with option --http. Additional server options can be passed with --lt-server-options. See also https://dev.languagetool.org/http-server. This may be faster than the command-line tool used otherwise, especially when checking many LaTeX files or together with an editor plugin. The server will not be stopped at the end (use '--server stop').
  • --lt-server-options opts
    Pass additional options when starting a local LT server. Syntax is as for --lt-options.
  • --textgears apikey
    Use the TextGears server, see https://textgears.com. Language is fixed to American English. The access key 'apikey' can be obtained on page https://textgears.com/signup.php?givemethatgoddamnkey=please, but the key 'DEMO_KEY' seems to work for short input. The server address is given by script variable 'textgears_server'.
  • --multi-language
    Activate support of multi-language documents; compare section Multi-language documents for further related options.
  • --no-config
    Do not read config file, whose name is set in script variable 'config_file'.
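The rule described for option --equation-punctuation can be sketched with a regular expression. This is only an illustration of the stated rule, not the code used by yalafi.shell; the placeholder strings in PLACEHOLDERS are hypothetical stand-ins for the entries of the collections 'math_repl_display*' and 'math_repl_inline*'.

```python
import re

# Hypothetical equation placeholders (assumption for this sketch).
PLACEHOLDERS = ("U-U-U", "V-V-V")

ELEMENT = "|".join(re.escape(p) for p in PLACEHOLDERS)
# Acceptable continuation after an equation element: a dot, or an
# optional mark from ',;:' followed by a lower-case word or by another
# equation element.
OK_AFTER = re.compile(r"\.|[,;:]?\s*(?:[a-z]|" + ELEMENT + r")")


def unterminated_elements(text):
    """Return start offsets of equation elements that violate the rule."""
    hits = []
    for m in re.finditer(ELEMENT, text):
        if not OK_AFTER.match(text, m.end()):
            hits.append(m.start())
    return hits
```

With this sketch, 'We get U-U-U, where ...' and 'We get U-U-U. Then ...' pass, whereas 'We get U-U-U Then ...' is reported at the offset of the placeholder.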

Dictionary adaptation. LT evaluates the two files 'spelling.txt' and 'prohibit.txt' in directory

.../LanguageTool-?.?/org/languagetool/resource/<lang-code>/hunspell/

Additional words and words that shall raise an error can be appended here. LT version 4.8 introduced additional files 'spelling_custom.txt' and 'prohibit_custom.txt'.
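Appending words by hand works, but a small helper avoids duplicate entries. The sketch below assumes a plain list with one word per line and ignores any further syntax the LT files may support; the function name add_words is hypothetical, and the file path has to be adapted to the local LT installation.

```python
from pathlib import Path


def add_words(spelling_file, words, encoding="utf-8"):
    """Append words not yet listed in the file; return the added words."""
    path = Path(spelling_file)
    text = path.read_text(encoding=encoding) if path.exists() else ""
    known = {line.strip() for line in text.splitlines()}
    new_words = []
    for word in words:
        if word not in known:
            known.add(word)          # also filters duplicates in 'words'
            new_words.append(word)
    if new_words:
        with path.open("a", encoding=encoding) as f:
            if text and not text.endswith("\n"):
                f.write("\n")        # do not glue onto the last old line
            f.write("\n".join(new_words) + "\n")
    return new_words
```

A call like add_words(".../hunspell/spelling_custom.txt", ["YaLafi"]) would then extend the custom dictionary, where the elided path prefix is the LT resource directory shown above.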

HTML report. The idea of an HTML report goes back to Sylvain Hallé, who developed TeXtidote. Opened in a Web browser, the report displays excerpts from the original LaTeX text, highlighting the problems indicated by LT. The corresponding LT messages can be viewed when hovering the mouse over these marked places, see the introductory example above. With option --link, Web links provided by LT can be directly opened with left-click. Script option --context controls the number of lines displayed around each tagged region; a negative option value will show the complete LaTeX input text. If the location of a problem is uncertain, highlighting will use yellow instead of orange colour. For simplicity, marked text regions that overlap with other ones are repeated separately at the end. In case of multiple input files, the HTML report starts with an index.

Back to contents

Interfaces to Vim

As Vim is a great editor, there are several possibilities that build on existing Vim plugins or use Vim's compiler interface:

plugin vimtex | “plain Vim” | plugin vim-grammarous | plugin vim-LanguageTool | plugin ALE

Plugin vimtex

The Vim plugin vimtex provides comprehensive support for writing LaTeX documents. It includes an interface to YaLafi; documentation is available with :help vimtex-grammar-vlty. A copy of the corresponding Vim compiler script is editors/vlty.vim.

The following snippet demonstrates a basic vimrc setting and some useful values for vlty option field 'shell_options'.

map <F9> :w <bar> compiler vlty <bar> make <bar> :cw <cr><esc>
let g:tex_flavor = 'latex'
set spelllang=de_DE
let g:vimtex_grammar_vlty = {}
let g:vimtex_grammar_vlty.lt_directory = '~/lib/LanguageTool-5.0'
" let g:vimtex_grammar_vlty.lt_command = 'languagetool'
let g:vimtex_grammar_vlty.server = 'my'
let g:vimtex_grammar_vlty.show_suggestions = 1
let g:vimtex_grammar_vlty.shell_options =
        \   ' --multi-language'
        \ . ' --packages "*"'
        \ . ' --define ~/vlty/defs.tex'
        \ . ' --replace ~/vlty/repls.txt'
        \ . ' --equation-punctuation display'
        \ . ' --single-letters "i.\,A.\|z.\,B.\|\|"'
  • Function key 'F9' saves the file, starts the compiler, and opens the quickfix window.
  • Uncomment the line with g:vimtex_grammar_vlty.lt_command, if LanguageTool has been installed by variant 2 in section Installation. In this case, specification of g:vimtex_grammar_vlty.lt_directory is not necessary.
  • The option g:vimtex_grammar_vlty.server = 'my' usually results in faster checks for small to medium LaTeX files. Start-up time is saved, and speed benefits from the internal sentence caching of the server.
  • Saying let g:vimtex_grammar_vlty.show_suggestions = 1 causes display of LanguageTool's replacement suggestions.
  • With option --multi-language, commands from LaTeX package 'babel' switch the language for the proofreading program. See section Multi-language documents.
  • By default, the vlty compiler passes names of all necessary LaTeX packages to YaLafi, which may result in annoying warnings. In multi-file projects, these are suppressed by --packages "*" that simply loads all packages known to the filter.
  • YaLafi's expansion of project-specific macros can be controlled via option --define .... Example for defs.tex (Note that the first three lines are only necessary, if the currently edited file does not directly contain these definitions.):
    \newcommand{\zB}{z.\,B. }   % LanguageTool correctly insists on
                                % narrow space in this German abbreviation
    \newtheorem{Satz}{Satz}     % correctly expand \begin{Satz}[siehe ...]
    \LTinput{main.glsdefs}      % read database of glossaries package
  • Replacement of phrases may be performed via --replace ..., compare section Phrase replacement in the plain text.
  • Option --equation-punctuation display enables some additional interpunction checking for displayed equations in English texts, see section Example application and this example.
  • Option --single-letters ... activates search for isolated single letters. Note that only the '|' signs need to be escaped here; compare section Example application.

Here is the introductory example from above:

[Screenshot: Vim plugin vim-vimtex]

“Plain Vim”

File editors/ltyc.vim proposes a simple application to Vim's compiler interface. The file has to be copied to a directory like ~/.vim/compiler/.

For a Vim session, the component is activated with :compiler ltyc. Command :make invokes yalafi.shell, and the cursor is set to the first indicated problem. The related error message is displayed in the status line. Navigation between errors is possible with :cn and :cp, an error list is shown with :cl. The quickfix window appears on :cw.

The following snippet demonstrates a basic vimrc setting and some useful values for option 'ltyc_shelloptions'. Please refer to section Plugin vimtex for related comments.

map <F9> :w <bar> compiler ltyc <bar> make <bar> :cw <cr><esc>
let g:ltyc_ltdirectory = '~/lib/LanguageTool-5.0'
" let g:ltyc_ltcommand = 'languagetool'
let g:ltyc_server = 'my'
let g:ltyc_showsuggestions = 1
let g:ltyc_language = 'de-DE'
let g:ltyc_shelloptions =
        \   ' --multi-language'
        \ . ' --replace ~/ltyc/repls.txt'
        \ . ' --define ~/ltyc/defs.tex'
        \ . ' --equation-punctuation display'
        \ . ' --single-letters "i.\,A.\|z.\,B.\|\|"'
compiler ltyc

The screenshot resembles that from section Plugin vimtex.

Plugin vim-grammarous

For the Vim plugin vim-grammarous, it is possible to provide an interface for checking LaTeX texts. With an entry in ~/.vimrc, one may simply replace the command that invokes LanguageTool. For instance, you can add to ~/.vimrc

let g:grammarous#languagetool_cmd = '/home/foo/bin/yalafi-grammarous'
map <F9> :GrammarousCheck --lang=en-GB<CR>

A proposal for Bash script /home/foo/bin/yalafi-grammarous (replace foo with username ;-) is given in editors/yalafi-grammarous. It has to be made executable with chmod +x .... Please adapt script variable ltdir, compare option --lt-directory in section Example application. If you do not want a local LT server to be started, comment out the line defining script variable use_server.

In order to avoid the problem described in Issue #89@vim-grammarous (shifted error highlighting, if after non-ASCII character on same line), you can set output=xml-b in yalafi-grammarous.

Troubleshooting for Vim interface. If Vim reports a problem with running LT, you can do the following. In ~/bin/yalafi-grammarous, comment out the final ... 2>/dev/null. For instance, you can just place a '#' in front: ... # 2>/dev/null. Then start, with a test file t.tex,

$ ~/bin/yalafi-grammarous t.tex

This should display some error message, if the problem goes back to running the script, Python, yalafi.shell or LanguageTool.

Here is the introductory example from above:

[Screenshot: Vim plugin vim-grammarous]

Plugin vim-LanguageTool

The Vim plugin vim-LanguageTool relies on the same XML interface to LanguageTool as the variant in section Plugin vim-grammarous. Therefore, one can reuse the Bash script editors/yalafi-grammarous. You can add to ~/.vimrc

let g:languagetool_cmd = '$HOME/bin/yalafi-grammarous'
let g:languagetool_lang = 'en-GB'
let g:languagetool_disable_rules = 'WHITESPACE_RULE'
map <F9> :LanguageToolCheck<CR>

Please note the general problem indicated in Issue #17. Here is again the introductory example from above. Navigation between highlighted text parts is possible with :lne and :lp.

[Screenshot: Vim plugin vim-LanguageTool]

Plugin ALE

With ALE, the proofreader ('linter') is by default invoked as a background task whenever one leaves insert mode. You might add to ~/.vimrc

" if not yet set:
filetype plugin on
" F9: show detailed LT message for error under cursor, is left with 'q'
map <F9> :ALEDetail<CR>
" this turns off all other tex linters
let g:ale_linters = { 'plaintex': ['lty'], 'tex': ['lty'] }
" default place of LT installation: '~/lib/LanguageTool'
let g:ale_tex_lty_ltdirectory = '~/lib/LanguageTool-4.7'
" uncomment the following assignment, if LT has been installed via package
" manager; in this case, g:ale_tex_lty_ltdirectory hasn't to be specified
" let g:ale_tex_lty_command = 'languagetool'
" set to '' to disable server usage or to 'lt' for LT's Web server
let g:ale_tex_lty_server = 'my'
" default language: 'en-GB'
let g:ale_tex_lty_language = 'en-GB'
" default disabled LT rules: 'WHITESPACE_RULE'
let g:ale_tex_lty_disable = 'WHITESPACE_RULE'

Similarly to setting 'g:ale_tex_lty_disable', one can specify LT's options --enable, --disablecategories, and --enablecategories. Further options for yalafi.shell (compare section Plugin vimtex) may be passed like

let g:ale_tex_lty_shelloptions = '--single-letters "A|a|I|e.g.|i.e.||"'
                \ . ' --equation-punctuation display'

Additionally, one has to install ALE and copy or link file editors/lty.vim to directory ~/.vim/bundle/ale/ale_linters/tex/, or a similar location.

Here is again the introductory example from above. The complete message for the error at the cursor is displayed on F9, together with LT's rule ID, replacement suggestions, and the problem context (left with q). Navigation between highlighted text parts is possible with :lne and :lp, an error list is shown with :lli.

[Screenshot: Vim plugin ALE]

Back to contents

Interface to Emacs

The Emacs plugin Emacs-langtool may be used in two variants. First, you can add to ~/.emacs

(setq langtool-bin "/home/foo/bin/yalafi-emacs")
(setq langtool-default-language "en-GB")
(setq langtool-disabled-rules "WHITESPACE_RULE")
(require 'langtool)

A proposal for Bash script /home/foo/bin/yalafi-emacs (replace foo with username ;-) is given in editors/yalafi-emacs. It has to be made executable with chmod +x .... Please adapt script variable ltdir, compare option --lt-directory in section Example application. If you do not want a local LT server to be started, comment out the line defining script variable use_server.

Troubleshooting for Emacs interface. If Emacs reports a problem with running LT, you can apply the steps from paragraph Troubleshooting for Vim interface to ~/bin/yalafi-emacs.

Server interface. This variant may result in better tracking of character positions. In order to use it, you can write in ~/.emacs

(setq langtool-http-server-host "localhost"
      langtool-http-server-port 8082)
(setq langtool-default-language "en-GB")
(setq langtool-disabled-rules "WHITESPACE_RULE")
(require 'langtool)

and start yalafi.shell as server in another terminal with

$ python -m yalafi.shell --as-server 8082 [--lt-directory /path/to/LT]

The server will print some progress messages and can be stopped with CTRL-C. Further script arguments from section Example application may be given. If you add, for instance, '--server my', then a local LT server will be used. It is started on the first HTTP request received from Emacs-langtool, if it is not yet running.
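For a quick test without Emacs, a request can also be sent from Python. The sketch below follows LanguageTool's public HTTP API (POST to /v2/check with form fields 'language' and 'text', JSON reply with a 'matches' list). Whether the emulation in yalafi.shell accepts requests other than those sent by Emacs-langtool is an assumption here, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_check_request(text, language="en-GB", host="localhost", port=8082):
    """Create (but do not send) a POST request in LT's HTTP API format."""
    data = urllib.parse.urlencode({"language": language, "text": text})
    return urllib.request.Request(
        f"http://{host}:{port}/v2/check",
        data=data.encode("utf-8"),
        method="POST",
    )


def check(text, **kwargs):
    """Send the request; needs a server started with --as-server."""
    with urllib.request.urlopen(build_check_request(text, **kwargs)) as reply:
        return json.loads(reply.read().decode("utf-8"))["matches"]
```

With the server from above running, check(latex_text) should return one entry per problem; in LT's API each entry carries fields such as 'offset', 'length' and 'message', which here would refer to positions in the LaTeX text.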

Installation of Emacs-langtool. Download and unzip Emacs-langtool. Place file langtool.el in directory ~/.emacs.d/lisp/. Set in your ~/.profile or ~/.bash_profile (and log in again)

export EMACSLOADPATH=~/.emacs.d/lisp:

Here is the introductory example from above:

[Screenshot: Emacs plugin Emacs-langtool]

Back to contents

Interface to Atom

For the editor Atom, you can use the plugin linter-yalafi. Please note that we have not yet tested this interface.

Back to contents

Usage under Windows

Both yalafi.shell and yalafi can be directly used in a Windows command script or console. For example, this could look like

py -3 -m yalafi.shell --server lt --output html t.tex > t.html

or

"c:\Program Files\Python\Python37\python.exe" -m yalafi.shell --server lt --output html t.tex > t.html

if the Python launcher has not been installed.

Files with Windows-style line endings (CRLF) are accepted, but the text output of the pure LaTeX filter will be Unix style (LF only), unless a Windows Python interpreter is used.

Python's version for Windows by default prints Latin-1 encoded text to standard output. As this ensures proper work in a Windows command console, we do not change it for yalafi.shell when generating a text report. All other output is fixed to UTF-8 encoding.

Back to contents

Related projects

This project relates to software like

OpenDetex | pandoc | plasTeX | pylatexenc | TeXtidote | tex2txt | vscode-ltex

From these examples, currently (March 2020) only TeXtidote and vscode-ltex provide position mapping between the LaTeX input text and the plain text that is sent to the proofreading software. Both use (simple) regular expressions for plain-text extraction and are easy to install. YaLafi, on the other hand, aims to achieve high flexibility and good filtering quality with a minimal number of false positives from the proofreading software.

Back to contents

Filter actions

Here is a list of the most important filter operations. When the filter encounters a LaTeX problem like a missing end of equation, a message is printed to stderr. Additionally, the mark from 'Parameters.mark_latex_error' in file yalafi/parameters.py is included into the filter output. This mark should raise a spelling error from the proofreader at the place where the problem was detected.

  • A collection of standard LaTeX macros and environments is already included, but very probably it has to be complemented. Compare variables 'Parameters.macro_defs_latex', 'Parameters.macro_defs_python', and 'Parameters.environment_defs' in file yalafi/parameters.py.
  • The macros \documentclass and \usepackage load extension modules that define important macros and environments provided by the corresponding LaTeX packages. For other activation methods of these modules, see also section Extension modules for LaTeX packages.
  • Macro definitions with \(re)newcommand and \def (the latter only roughly approximated) in the input text are processed. Statement \LTinput{file.tex} reads macro definitions from the given file. Further own macros with arbitrary arguments can be defined on Python level, see section Inclusion of own macros.
  • Unknown macros are silently ignored, keeping their arguments with enclosing {} braces removed. They can be listed with options --unkn and --list-unknown for yalafi and yalafi.shell, respectively.
  • Environment frames \begin{...} and \end{...} are deleted. We implement tailored behaviour for environment types listed in 'Parameters.environment_defs' in file yalafi/parameters.py, see section Inclusion of own macros. For instance, environment bodies can be removed or replaced by fixed text.
  • Text in heading macros such as \section{...} is extracted with added punctuation, see variable 'Parameters.heading_punct' in file yalafi/parameters.py. This suppresses false positives from LanguageTool.
  • For macros such as \ref, \eqref, \pageref, and \cite, suitable placeholders are inserted.
  • Arguments of macros like \footnote are appended to the main text, separated by blank lines. This preserves text flows.
  • Inline maths material $...$ and \(...\) is replaced with text from the rotating collections 'math_repl_inline*' in file yalafi/parameters.py. Trailing punctuation from 'Parameters.math_punctuation' is appended.
  • Equation environments are resolved in a way suitable for checking punctuation and spacing. The arguments of macros like \mbox and \text are included in the output text. Versions \[...\] and $$...$$ are handled like environment displaymath. See also sections Handling of displayed equations and Parser for maths material.
  • We generate numbered default \item labels for environment enumerate.
  • For \item with specified [...] label, some treatment is provided. If the preceding text ends with a punctuation mark from collection 'Parameters.item_punctuation' in file yalafi/parameters.py, then this mark is appended to the label. This works well for German texts; it is turned off with the setting 'item_punctuation = []'.
  • Letters with text-mode accents such as '\`' or '\v' are translated to the corresponding UTF-8 characters.
  • Things like double quotes '``' and dashes '--' are replaced with the corresponding UTF-8 characters. Additionally, we replace '~' and '\,' by UTF-8 non-breaking space and narrow non-breaking space.
  • For language 'de', suitable replacements for macros like '"`' and '"=' are inserted, see method 'Parameters.init_parser_languages()' in file yalafi/parameters.py.
  • Macro \verb and environment verbatim are processed. Environment verbatim can be replaced or removed like other environments with an appropriate entry in 'Parameters.environment_defs' in yalafi/parameters.py.
  • Rare warnings from the proofreading program can be suppressed using \LTadd{...}, \LTskip{...}, \LTalter{...}{...} in the LaTeX text; compare section Adaptation of LaTeX and plain text.
  • Complete text sections, for instance parts of the LaTeX preamble, may be skipped with the special LaTeX comment '%%% LT-SKIP-BEGIN'; see section Adaptation of LaTeX and plain text.
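The character replacements mentioned in the list (quotes, dashes, '~' and '\,') can be pictured as a simple table. The following is a toy sketch, not the filter's code: the table REPLACEMENTS is hypothetical, and the entries for '---' and for closing quotes are added here as usual LaTeX conventions rather than taken from the text above.

```python
NBSP = "\u00a0"    # non-breaking space
NNBSP = "\u202f"   # narrow non-breaking space

# Longer patterns first, so that '---' is not consumed as '--' plus '-'.
REPLACEMENTS = [
    ("---", "\u2014"),   # em dash (assumed convention)
    ("--", "\u2013"),    # en dash
    ("``", "\u201c"),    # opening double quote
    ("''", "\u201d"),    # closing double quote (assumed convention)
    ("\\,", NNBSP),
    ("~", NBSP),
]


def replace_specials(text):
    """Apply the replacement table from top to bottom."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text
```

A real filter has to apply such replacements per token and keep track of positions; plain string replacement as above would, for example, also touch verbatim material.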

Back to contents

Fundamental limitations

The implemented parsing mechanism can only roughly approximate the behaviour of a real LaTeX system. We assume that only “reasonable” macros are used; lower-level TeX operations are not supported. If necessary, they should be enclosed in \LTskip{...} (see section Adaptation of LaTeX and plain text) or be placed in a LaTeX file “hidden” from the filter (compare option --skip of yalafi.shell in section Example application). With little additional work, it might be possible to include some plain-TeX features like parsing of elastic length specifications. A list of remaining incompatibilities must contain at least the following points.

  • Mathematical material is represented by simple replacements. As the main goal is application of a proofreading software, we have deliberately taken this approach.
  • Parsing does not cross file boundaries. Tracking of file inclusions is possible though.
  • Macros depending on (spacing) lengths may be treated incorrectly.
  • Character '@' always has category 'letter'. See Issue #183.

Back to contents

Adaptation of LaTeX and plain text

In order to suppress unsuitable but annoying messages from the proofreading tool, it is sometimes necessary to modify the input text. You can do that in the LaTeX code, or after filtering in the plain text.

Modification of LaTeX text

The following operations can be deactivated with options --nosp and --no-specials of yalafi and yalafi.shell, respectively. For instance, macro \LTadd will be defined, but it will not add its argument to the plain text.

Special macros. Small modifications, for instance concerning interpunction, can be made with the predefined macros \LTadd, \LTalter and \LTskip. In order to add a full stop for the proofreader only, you would write

... some text\LTadd{.}

For LaTeX itself, the macros also have to be defined. A good place is the document preamble. (For the last line, compare section Inclusion of own macros.)

\newcommand{\LTadd}[1]{}
\newcommand{\LTalter}[2]{#1}
\newcommand{\LTskip}[1]{#1}
\newcommand{\LTinput}[1]{}

The LaTeX filter will ignore these statements. In turn, it will include the argument of \LTadd, use the second argument of \LTalter, and neglect the argument of \LTskip. The macro names for \LTadd etc. are defined by variables 'Parameters.macro_filter_add' etc. in file yalafi/parameters.py.
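The different views of LaTeX and of the filter on these macros can be sketched as follows. This toy version handles only arguments without nested braces and is not how YaLafi parses macros; the names ARG and apply_special_macros are hypothetical.

```python
import re

# One brace-delimited argument without nested braces (toy restriction).
ARG = r"\{([^{}]*)\}"


def apply_special_macros(latex):
    """Mimic the filter's view of \\LTadd, \\LTalter and \\LTskip."""
    latex = re.sub(r"\\LTadd" + ARG, r"\1", latex)          # keep argument
    latex = re.sub(r"\\LTalter" + ARG + ARG, r"\2", latex)  # use 2nd argument
    latex = re.sub(r"\\LTskip" + ARG, "", latex)            # drop argument
    return latex
```

For the example from above, apply_special_macros(r"... some text\LTadd{.}") yields the text with the added full stop, while LaTeX itself, using the \newcommand definitions, typesets nothing for \LTadd{.}.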

Special comments. The document preamble in particular often contains statements that are not properly processed “out-of-the-box”. Placing the critical parts in \LTskip{...} may lead to problems, as the statements are then executed slightly differently by the TeX system. As a “brute-force” variant, the LaTeX filter therefore ignores input enclosed between comments starting with %%% LT-SKIP-BEGIN and %%% LT-SKIP-END. Note that the single space after %%% is significant. The opening special comment is given in variable 'Parameters.comment_skip_begin' of file yalafi/parameters.py.

A preamble could look as follows.

\documentclass{article}
%%% LT-SKIP-BEGIN
... disturbing stuff ...
%%% LT-SKIP-END
\title{A paper}
\begin{document}
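The effect of the special comments on a preamble like this one can be illustrated with a line-based toy function. It assumes that the special comments start at the beginning of a line, which the text above does not state explicitly; the function name drop_skipped_lines is hypothetical and the code is not taken from YaLafi.

```python
def drop_skipped_lines(latex):
    """Remove everything between the LT-SKIP-BEGIN and LT-SKIP-END lines."""
    kept, skipping = [], False
    for line in latex.splitlines(keepends=True):
        if line.startswith("%%% LT-SKIP-BEGIN"):
            skipping = True      # comment line itself is dropped as well
        elif line.startswith("%%% LT-SKIP-END"):
            skipping = False
        elif not skipping:
            kept.append(line)
    return "".join(kept)
```

Applied to the preamble above, only the lines with \documentclass, \title and \begin{document} remain for further filtering.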
