Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Parsing UTF8 encoded data from a Web Service

I'm parsing the data from http://toutankharton.com/ws/localisations.php?l=75

As you can see, the accented characters come out garbled (<name>Paris 2ème</name> shows up as <name>Paris 2Ã¨me</name>).

My code is the following:

using (var reader = new StreamReader(stream, Encoding.UTF8))
{
    var contents = reader.ReadToEnd();

    XElement cities = XElement.Parse(contents);

    var t = from city in cities.Descendants("city")
            select new City
            {
                Name = city.Element("name").Value,
                Insee = city.Element("ci").Value,
                Code = city.Element("code").Value,
            };
}

Isn't new StreamReader(stream, Encoding.UTF8) sufficient?


1 Reply


That looks like what happens when UTF-8 bytes are decoded with an incompatible encoding such as ISO-8859-1. Do you know what the real character is? Reversing the process (encoding the garbled text as ISO-8859-1 to get the byte array, then decoding those bytes as UTF-8) gives "è".

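var input = "Ã¨"; // garbled form: the UTF-8 bytes of "è" decoded as ISO-8859-1
var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(input); // recovers the original bytes 0xC3 0xA8
var realString = Encoding.UTF8.GetString(bytes); // "è"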
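
The round trip described above can be wrapped in a pair of small helpers, which makes it easier to see both how the garbling arises and how it is undone. This is only a sketch: the class and method names (MojibakeDemo, Garble, Repair) are made up for illustration, and it assumes the text was mis-decoded as ISO-8859-1 exactly once. If the service went through a different encoding (Windows-1252, for instance), the same idea applies but the encoding name would have to change.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the original question or answer.
static class MojibakeDemo
{
    // ISO-8859-1 maps every byte 0x00-0xFF to the code point with the same value,
    // so a decode/encode round trip through it loses nothing.
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Reproduces the problem: UTF-8 bytes read as if they were ISO-8859-1.
    public static string Garble(string text) =>
        Latin1.GetString(Encoding.UTF8.GetBytes(text));

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    public static string Repair(string garbled) =>
        Encoding.UTF8.GetString(Latin1.GetBytes(garbled));
}
```

With these helpers, MojibakeDemo.Repair("Paris 2Ã¨me") returns "Paris 2ème". Repair should only be applied to text known to be garbled: running it on a string that is already correct would turn its accented characters into replacement characters, so fixing the decoding at the source is preferable when that is possible.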
