Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Task output encoding in VSCode

I'm learning BeautifulSoup with Visual Studio Code, and when I run this script:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

ua = UserAgent()
header = {'user-agent':ua.chrome}
google_page = requests.get('https://www.google.com',headers=header)

soup = BeautifulSoup(google_page.content,'lxml') # html.parser

print(soup.prettify())

And I'm getting the following error:

Traceback (most recent call last):
  File "c: ... intro-to-soup-2.py", line 13, in <module>
    print(soup.prettify())
  File "C: ... Local\Programs\Python\Python36-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f440' in position 515: character maps to <undefined>

If I force UTF-8 encoding on the soup variable, I can't use prettify, since it doesn't work on strings. I also tried putting # -*- coding: utf-8 -*- on the first line of the script, without success.

Here is my tasks.json for this project:

{
    // See https://go.microsoft.com/fwlink/?LinkId=733558
    // for the documentation about the tasks.json format
    "version": "0.1.0",
    "command": "python",
    "isShellCommand": true,
    "args": ["${file}"],
    "files.encoding": "utf8",
    // Controls after how many characters the editor will wrap to the next line. Setting this to 0 turns on viewport width wrapping (word wrapping). Setting this to -1 forces the editor to never wrap.
    "editor.wrappingColumn": 0, // default value is 300
    // Controls the font family.
    "editor.fontFamily": "Consolas, 'Malgun Gothic', '?? ??','Courier New', monospace",
    // Controls the font size.
    "editor.fontSize": 15,
    "showOutput": "always"
}

The exact same code runs in PyCharm without any problems. Any ideas on how I can fix this in Visual Studio Code?

Here's my "pip freeze" result:

astroid==1.5.3
beautifulsoup4==4.5.3
colorama==0.3.9
fake-useragent==0.1.7
html5lib==0.999999999
isort==4.2.15
lazy-object-proxy==1.3.1
lxml==3.7.2
mccabe==0.6.1
pylint==1.7.1
requests==2.12.5
selenium==3.4.3
six==1.10.0
webencodings==0.5
wrapt==1.10.10
xlrd==1.0.0
XlsxWriter==0.9.6

Thank you for your time,

Eunito.


Fight a dragon for too long and you become a dragon yourself; gaze into the abyss for too long and the abyss gazes back…

1 Reply


The problem here seems to be the encoding that the Python interpreter believes stdout/stderr support. For some reason (arguably a bug in VSCode), this is set to a platform-specific value (cp1252 on Windows for you; I was able to reproduce the issue on OS X and got ascii) instead of utf-8, which the VSCode output window supports. To address this, you can modify your tasks.json to look something like the following, which sets an environment variable that forces the Python interpreter to use utf-8 for output.

{
    // See https://go.microsoft.com/fwlink/?LinkId=733558
    // for the documentation about the tasks.json format
    "version": "0.1.0",
    "command": "python3",
    "isShellCommand": true,
    "args": ["${file}"],
    "showOutput": "always",
    "options": {
        "env": {
            "PYTHONIOENCODING":"utf-8"
        }
    }
}

The relevant bit is the "options" dictionary.
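
To see why the environment variable matters, here is a minimal Python sketch (not part of the original answer) that checks whether the character from the traceback can be encoded with a given codec. The helper names can_encode and utf8_writer are hypothetical, introduced only for illustration. The sketch also shows one possible in-code alternative to the task setting: wrapping a binary stream so that text is always written as UTF-8. It assumes the standard streams expose a .buffer attribute, which is normally the case for the regular interpreter.

```python
import io
import sys

# The character named in the traceback (U+1F440).
EYES = "\U0001f440"


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def utf8_writer(binary_stream):
    """Wrap a binary stream so text written to it is encoded as UTF-8 (hypothetical helper)."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", line_buffering=True)


if __name__ == "__main__":
    # Shows which encoding the interpreter picked for stdout in this environment.
    print("stdout encoding:", sys.stdout.encoding)
    # cp1252 has no mapping for this character, which is what triggers the error.
    print("cp1252 can encode it:", can_encode(EYES, "cp1252"))
    print("utf-8 can encode it:", can_encode(EYES, "utf-8"))
```

If changing the task configuration is not an option, a script could, under the assumption above, start with sys.stdout = utf8_writer(sys.stdout.buffer) before printing. Setting PYTHONIOENCODING in the task, as in the answer, achieves the same effect without touching the script.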

