charsets                package:tools                R Documentation

Conversion Tables between Character Sets

Description:

     'charset_to_Unicode' is a matrix of Unicode points with columns
     for the common 8-bit encodings.

     'Adobe_glyphs' is a data frame which gives Adobe glyph names for
     Unicode points.  It has two character columns, '"adobe"' and
     '"unicode"' (a 4-digit hex representation).

Usage:

     charset_to_Unicode
     Adobe_glyphs

Details:

     'charset_to_Unicode' is an integer matrix of class 'c("noquote",
     "hexmode")', so it prints in hexadecimal.  The mappings are those
     used by 'libiconv': there are differences in the way quotes and
     minus/hyphen are mapped between sources (and the postscript
     encoding files use a different mapping).

     'Adobe_glyphs' includes all the Adobe glyph names which correspond
     to single Unicode characters.  It is sorted by Unicode point and,
     within a point, alphabetically on the glyph name (there can be
     more than one name for a Unicode point).  The data are in the
     file 'R_HOME/share/encodings/Adobe_glyphlist'.

Source:

Examples:

     ## find Adobe names for ISOLatin2 chars.
     latin2 <- charset_to_Unicode[, "ISOLatin2"]
     ## convert the 4-digit hex strings to numbers
     aUnicode <- as.numeric(paste0("0x", Adobe_glyphs$unicode))
     ## keep only the glyphs whose point occurs in ISOLatin2
     keep <- aUnicode %in% latin2
     aUnicode <- aUnicode[keep]
     aAdobe <- Adobe_glyphs[keep, 1]
     ## first match
     aLatin2 <- aAdobe[match(latin2, aUnicode)]
     ## all matches
     bLatin2 <- lapply(seq_along(latin2), function(x) aAdobe[aUnicode == latin2[x]])
     format(bLatin2, justify = "none")
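
     The two tables can also be combined for a single-byte lookup.
     The following is a minimal sketch, not part of 'tools':
     'glyphs_for_byte' is a hypothetical helper name, and the sketch
     assumes that row i of 'charset_to_Unicode' describes byte i - 1
     (the matrix has 256 rows, as the example above relies on).  It
     uses 'strtoi' with base 16 as an alternative to pasting a '"0x"'
     prefix onto the hex strings.

```r
library(tools)

## Hypothetical helper (not part of 'tools'): Adobe glyph names for the
## Unicode point that one byte maps to in a given charset column.
glyphs_for_byte <- function(byte, charset = "ISOLatin2") {
    ## Assumption: row i of the matrix describes byte i - 1.
    ## as.integer() drops the "noquote"/"hexmode" classes.
    point <- as.integer(charset_to_Unicode[byte + 1L, charset])
    ## The 'unicode' column holds hex strings; parse them as base 16.
    points <- strtoi(Adobe_glyphs$unicode, base = 16L)
    ## which() drops NA comparisons, so an unmapped byte gives character(0).
    Adobe_glyphs$adobe[which(points == point)]
}

## Byte 0x41 is the ASCII letter A in the 8-bit charsets.
glyphs_for_byte(0x41)
```

     Because a Unicode point can have more than one glyph name, the
     helper returns a character vector rather than a single string,
     in the same spirit as the "all matches" step of the example.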