Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to install module for BeautifulSoup XML parsing?

In this answer, I was told not to use BeautifulSoup(xmlData, 'html.parser') for XML parsing and to use BeautifulSoup(xmlData, 'xml') instead. That parser, however, does not come with BeautifulSoup.

As per one of the comments, I tried:

python -m pip install lxml

But got:

Collecting lxml
  Using cached lxml-3.6.4.tar.gz
Installing collected packages: lxml
  Running setup.py install for lxml ... error
    Complete output from command D:\SOFT\Python3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\myuser\\AppData\\Local\\Temp\\pip-build-hl9fxzny\\lxml\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\myuser\AppData\Local\Temp\pip-ivemv19a-record\install-record.txt --single-version-externally-managed --compile:
    Building lxml version 3.6.4.
    Building without Cython.
    ERROR: b"'xslt-config' is not recognized as an internal or external command,\r\noperable program or batch file.\r\n"
    ** make sure the development packages of libxml2 and libxslt are installed **

    Using build configuration of libxslt
    running install
    running build
    running build_py
    creating build
    creating build\lib.win32-3.5
    creating build\lib.win32-3.5\lxml
    copying src\lxml\builder.py -> build\lib.win32-3.5\lxml
    copying src\lxml\cssselect.py -> build\lib.win32-3.5\lxml
    copying src\lxml\doctestcompare.py -> build\lib.win32-3.5\lxml
    copying src\lxml\ElementInclude.py -> build\lib.win32-3.5\lxml
    copying src\lxml\pyclasslookup.py -> build\lib.win32-3.5\lxml
    copying src\lxml\sax.py -> build\lib.win32-3.5\lxml
    copying src\lxml\usedoctest.py -> build\lib.win32-3.5\lxml
    copying src\lxml\_elementpath.py -> build\lib.win32-3.5\lxml
    copying src\lxml\__init__.py -> build\lib.win32-3.5\lxml
    creating build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\__init__.py -> build\lib.win32-3.5\lxml\includes
    creating build\lib.win32-3.5\lxml\html
    copying src\lxml\html\builder.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\clean.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\defs.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\diff.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\ElementSoup.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\formfill.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\html5parser.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\soupparser.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\usedoctest.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_diffcommand.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_html5builder.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\_setmixin.py -> build\lib.win32-3.5\lxml\html
    copying src\lxml\html\__init__.py -> build\lib.win32-3.5\lxml\html
    creating build\lib.win32-3.5\lxml\isoschematron
    copying src\lxml\isoschematron\__init__.py -> build\lib.win32-3.5\lxml\isoschematron
    copying src\lxml\lxml.etree.h -> build\lib.win32-3.5\lxml
    copying src\lxml\lxml.etree_api.h -> build\lib.win32-3.5\lxml
    copying src\lxml\includes\c14n.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\config.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\dtdvalid.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\etreepublic.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\htmlparser.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\relaxng.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\schematron.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\tree.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\uri.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xinclude.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlerror.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlparser.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xmlschema.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xpath.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\xslt.pxd -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\etree_defs.h -> build\lib.win32-3.5\lxml\includes
    copying src\lxml\includes\lxml-version.h -> build\lib.win32-3.5\lxml\includes
    creating build\lib.win32-3.5\lxml\isoschematron\resources
    creating build\lib.win32-3.5\lxml\isoschematron\resources\rng
    copying src\lxml\isoschematron\resources\rng\iso-schematron.rng -> build\lib.win32-3.5\lxml\isoschematron\resources\rng
    creating build\lib.win32-3.5\lxml\isoschematron\resources\xsl
    copying src\lxml\isoschematron\resources\xsl\RNG2Schtrn.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl
    copying src\lxml\isoschematron\resources\xsl\XSD2Schtrn.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl
    creating build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_abstract_expand.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_dsdl_include.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_message.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_schematron_skeleton_for_xslt1.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\iso_svrl_for_xslt1.xsl -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    copying src\lxml\isoschematron\resources\xsl\iso-schematron-xslt1\readme.txt -> build\lib.win32-3.5\lxml\isoschematron\resources\xsl\iso-schematron-xslt1
    running build_ext
    building 'lxml.etree' extension
    error: Unable to find vcvarsall.bat

    ----------------------------------------
Command "D:\SOFT\Python3\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\myuser\\AppData\\Local\\Temp\\pip-build-hl9fxzny\\lxml\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\myuser\AppData\Local\Temp\pip-ivemv19a-record\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\myuser\AppData\Local\Temp\pip-build-hl9fxzny\lxml

I am using Python 3.5.2 and would like something that works straight out of pip, meaning it won't need to be compiled separately.

1 Reply

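pip only found a source archive (lxml-3.6.4.tar.gz) and is trying to build it. On Windows that requires a C compiler (hence "Unable to find vcvarsall.bat") as well as the libxml2 and libxslt development files, so you would need a compiler to install lxml through pip this way.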

Some unofficial builds are available here: http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
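
To work out which of those builds matches your interpreter, a small sketch like the following may help. It assumes you are on Windows and that the wheel file names carry the usual tags (something like cp35 for CPython 3.5, and win32 or win_amd64 for the architecture). The function name wheel_tags is made up for illustration.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_tag) for the running interpreter.

    Assumption: CPython on Windows. Other platforms use different
    platform tags, which this sketch does not try to cover.
    """
    # e.g. "cp35" for CPython 3.5
    python_tag = "cp{}{}".format(sys.version_info[0], sys.version_info[1])
    # Pointer size tells us whether this is a 32-bit or 64-bit Python,
    # which is what matters (not the bitness of Windows itself).
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


if __name__ == "__main__":
    print(wheel_tags())
```

The build log in the question mentions win32-3.5, so for that setup you would be looking for a wheel tagged cp35 and win32.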

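Find the URL of the wheel that matches your Python version and architecture (the log above shows win32-3.5, so a 32-bit Python 3.5 build), then this should work: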

pip install http://url_to_wheel
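
Once the wheel is installed, you can check that BeautifulSoup actually picks up the XML parser with something along these lines. This is only a sketch: the helper names xml_parser_available and root_tag_name and the sample document are made up for illustration, and it simply returns None when bs4 or lxml is missing instead of raising.

```python
import importlib.util


def xml_parser_available():
    """True when both bs4 and lxml can be found by this interpreter."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("bs4", "lxml")
    )


def root_tag_name(xml_data):
    """Parse xml_data with the 'xml' feature and return the root tag name.

    Returns None when bs4 or lxml is not installed.
    """
    if not xml_parser_available():
        return None
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(xml_data, "xml")
    root = soup.find(True)  # True matches any tag, so this is the first element
    return root.name if root is not None else None


if __name__ == "__main__":
    sample = "<catalog><item id='1'>tea</item></catalog>"  # made-up sample
    print("xml parser available:", xml_parser_available())
    print("root tag:", root_tag_name(sample))
```

If the first line prints False, pip installed the wheel into a different Python than the one running the script; running python -m pip install with the same python.exe avoids that.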
