Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
c++ - string conversion with boost locale: different behaviour on windows and linux

This is my sample code:

#pragma execution_character_set("utf-8")

#include <boost/locale.hpp>
#include <boost/algorithm/string/case_conv.hpp>
#include <iostream>

int main()
{
    std::locale loc = boost::locale::generator().generate("");
    std::locale::global(loc);

#ifdef MSVC
    std::cout << boost::locale::conv::from_utf("grüßen vs ", "ISO8859-15");
    std::cout << boost::locale::conv::from_utf(boost::locale::to_upper("grüßen"), "ISO8859-15") << std::endl;
    std::cout << boost::locale::conv::from_utf(boost::locale::fold_case("grüßen"), "ISO8859-15") << std::endl;
    std::cout << boost::locale::conv::from_utf(boost::locale::normalize("grüßen", boost::locale::norm_nfd), "ISO8859-15") << std::endl;
#else
    std::cout << "grüßen vs ";
    std::cout << boost::locale::to_upper("grüßen") << std::endl;
    std::cout << boost::locale::fold_case("grüßen") << std::endl;
    std::cout << boost::locale::normalize("grüßen", boost::locale::norm_nfd) << std::endl;
#endif

    return 0;
}

Output on Windows 7 is:

grüßen vs GRüßEN
grüßen
grußen

Output on Linux (openSuSE 12.3) is:

grü?en vs GRüSSEN
grüssen
grüßen

On Linux the German letter 'ß' is converted to 'SS' as expected, while it remains unchanged on Windows.

Question: why is this so? How can I correct the conversion?

Some notes: The Windows console code page is set to 1252. In both cases the locale is set to de_DE. I tried replacing the default locale in the listing above with "de_DE.UTF-8", without any effect. On Windows the code is compiled with Visual Studio 2013, on Linux with GCC 4.7 with C++11 enabled.

Any suggestions are appreciated - thanks in advance for your support!


1 Reply


Windows doesn't do this conversion because "it would be too confusing" for developers if the string length suddenly changed. Boost presumably just delegates all Unicode conversions to the underlying Windows APIs.

Source

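I guess the robust way to handle it would be to use a third-party Unicode library such as ICU. Boost.Locale itself can be built with an ICU backend, which applies full case mapping (including 'ß' to 'SS') regardless of platform; without ICU, it falls back to the Windows API on Windows.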
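
To illustrate the length issue mentioned above, here is a minimal sketch in plain C++ with no Boost or ICU involved. The helpers `to_upper_simple` and `to_upper_full` are hypothetical names made up for this example, and they cover only ASCII and the Latin-1 letters U+00E0..U+00FE in a UTF-8 string. The "simple" variant maps every character to exactly one character, so 'ß' has nowhere to go, which is roughly the behaviour seen on Windows. The "full" variant adds the special case 'ß' to "SS", so the result has one character more than the input. A real program should leave this to ICU rather than a hand-written table.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>

namespace {

// Hypothetical helper for illustration only (not a Boost or ICU function).
// Uppercases a UTF-8 string, handling ASCII a-z and the two-byte sequences
// C3 A0..C3 BE (U+00E0..U+00FE, except U+00F7 which is the division sign).
// If expand_sharp_s is true, C3 9F (U+00DF, sharp s) becomes "SS".
std::string upper_impl(const std::string& utf8, bool expand_sharp_s)
{
    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (c >= 'a' && c <= 'z') {
            out += static_cast<char>(c - 'a' + 'A');
        } else if (c == 0xC3 && i + 1 < utf8.size()) {
            const unsigned char next = static_cast<unsigned char>(utf8[++i]);
            if (expand_sharp_s && next == 0x9F) {
                out += "SS";                            // one character in, two out
            } else if (next >= 0xA0 && next <= 0xBE && next != 0xB7) {
                out += static_cast<char>(0xC3);
                out += static_cast<char>(next - 0x20);  // e.g. U+00FC -> U+00DC
            } else {
                out += static_cast<char>(0xC3);         // no one-to-one mapping: copy
                out += static_cast<char>(next);
            }
        } else {
            out += static_cast<char>(c);                // anything else: copy as-is
        }
    }
    return out;
}

} // namespace

// One-to-one mapping: the sharp s is left untouched.
std::string to_upper_simple(const std::string& utf8) { return upper_impl(utf8, false); }

// Mapping with the sharp-s special case: the character count may grow.
std::string to_upper_full(const std::string& utf8) { return upper_impl(utf8, true); }
```

With UTF-8 input, a check such as `assert(to_upper_full("straße") == "STRASSE");` should hold, while `to_upper_simple` keeps the 'ß' in place.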

