Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

UTF-8 encoding of umlauts/acute accents in popovers/tooltips of kableExtra in R Markdown

I want to display words with umlauts (e.g. äöü) and accents (e.g. éè) in a tooltip in a kableExtra table. However, something seems to go wrong with the encoding. See:

---
title: "R Markdown - Test umlaut"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(kableExtra)
library(dplyr)
```

If I create this simple table, I get a warning and Chinese (?) letters in the tooltip. If I do the same with popover = "café" I get the same warning and no popover at all.

```{r kableextra}

x <- tibble(a = "Motörhead", b = "Motörfeet", c = "café", d = "olé") %>% 
  kbl() %>%
  kable_paper(full_width = FALSE)

x %>% column_spec(3, tooltip = "café")

```

## Warning in `xml_attr<-.xml_node`(`*tmp*`, t, value = tooltip_list[t]): string is
## not in UTF-8 [1303]

What puzzles me is that the umlauts/accents are displayed correctly in the cells of the table but not in the tooltip/popover.

Now I found that the problem can be solved using enc2utf8:

```{r kableextra2}

x %>% column_spec(3, tooltip = enc2utf8("café"))

```

What I find strange is that the string is typed in RStudio, so shouldn't it be encoded in UTF-8 anyway? I also tried File -> Save with Encoding... -> UTF-8, which did not help.

Is the problem with kableExtra? Is there a more elegant way to solve it? I do not really like my solution.

Session info:

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_1.0.2      kableExtra_1.3.1

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13   knitr_1.30        xml2_1.3.2        magrittr_2.0.1    tidyselect_1.1.0 
 [6] rvest_0.3.6       munsell_0.5.0     colorspace_2.0-0  viridisLite_0.3.0 R6_2.5.0         
[11] rlang_0.4.9       highr_0.8         stringr_1.4.0     httr_1.4.2        tools_4.0.3      
[16] webshot_0.5.2     xfun_0.19         tinytex_0.28      ellipsis_0.3.1    htmltools_0.5.0  
[21] yaml_2.2.1        digest_0.6.27     tibble_3.0.4      lifecycle_0.2.0   crayon_1.3.4     
[26] purrr_0.3.4       vctrs_0.3.5       rsconnect_0.8.16  glue_1.4.2        evaluate_0.14    
[31] rmarkdown_2.5     stringi_1.5.3     pillar_1.4.7      compiler_4.0.3    generics_0.1.0   
[36] scales_1.1.1      pkgconfig_2.0.3 

RStudio Version 1.3.1093
Question from: https://stackoverflow.com/questions/65857334/utf-8-encoding-of-umlauts-acute-accent-in-popovers-tooltips-of-kableextra-in-r-m

1 Reply

This does appear to be a bug in kableExtra, fixed here: https://github.com/haozhu233/kableExtra/pull/584. The warning message points to the cause: kableExtra sets some XML attributes from your input, and the xml2 package expects those strings to be UTF-8 encoded. By default, though, most Windows systems use some other native encoding (Windows-1252, according to the locale in your session info).
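The mismatch can be reproduced in plain R. The following is a minimal sketch, not kableExtra's actual code: it builds the string from raw bytes to simulate how a Windows-1252 (latin1) session might store "café", then compares it with the result of `enc2utf8()`. The variable names are made up for illustration.

```r
# Simulate "café" as stored in a latin1 / Windows-1252 session:
# the single byte 0xe9 stands for "é" in that encoding.
s_native <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(s_native) <- "latin1"

# Convert the same text to UTF-8, as in the workaround from the question.
s_utf8 <- enc2utf8(s_native)

Encoding(s_native)    # "latin1"
Encoding(s_utf8)      # "UTF-8"

# Same text, different bytes: "é" takes one byte in latin1, two in UTF-8.
charToRaw(s_native)   # 63 61 66 e9
charToRaw(s_utf8)     # 63 61 66 c3 a9

# R translates before comparing, so the two strings still compare as equal.
s_native == s_utf8    # TRUE
```

The two strings compare as equal, but only the second one holds the UTF-8 bytes that xml2 asks for in the warning.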

Maybe this should be fixed in xml2 instead, but at least with that patch, you can work around the issue.
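If you cannot install a patched kableExtra, one possible way to tidy up the `enc2utf8()` workaround from the question is a thin wrapper. This is only a sketch: `as_utf8()` and `column_spec_utf8()` are hypothetical helpers, not part of kableExtra, and it assumes `column_spec()` takes the `tooltip` and `popover` arguments used in the question.

```r
# Hypothetical helper: convert character input to UTF-8 and pass
# anything else (e.g. NULL or a non-character object) through unchanged.
as_utf8 <- function(x) {
  if (is.character(x)) enc2utf8(x) else x
}

# Hypothetical wrapper around kableExtra::column_spec() that converts
# the tooltip and popover text before xml2 ever sees it.
# kableExtra is only needed when the wrapper is actually called.
column_spec_utf8 <- function(kable_input, column, ...,
                             tooltip = NULL, popover = NULL) {
  kableExtra::column_spec(
    kable_input, column, ...,
    tooltip = as_utf8(tooltip),
    popover = as_utf8(popover)
  )
}

# Intended use with the table `x` from the question:
# x %>% column_spec_utf8(3, tooltip = "café")
```

The conversion is kept in one place, so the individual `column_spec()` calls stay free of `enc2utf8()`, and the wrapper can be dropped once the fix is available in your installed version.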

