RVHTML and UTF-8

martindholmes
Posts: 131
Joined: Mon Aug 29, 2005 12:03 pm

Post by martindholmes »

Hi there,

When I import an HTML file (using RVHTML) with a UTF-8 byte-order mark at the beginning (EF BB BF), the BOM is shown in my TRichViewEdit component as if it were a sequence of single-byte characters. Any Unicode text is also garbled. The file contains not only the BOM, but also a meta tag specifying the encoding:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Can anyone tell me how to get RVHTML to treat Unicode text correctly?

All help appreciated,
Martin
Sergey Tkachenko
Site Admin
Posts: 17565
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

RVHTML cannot autodetect UTF-8 encoding. The Encoding property must be set to rvhtmleUTF8.
Make sure that you have the latest version of RVHTML (0.0024).
martindholmes
Posts: 131
Joined: Mon Aug 29, 2005 12:03 pm

Post by martindholmes »

Does RVHTML detect any encodings at all, or do I need to parse the file and figure out the encoding before passing it to RVHTML?

Cheers,
Martin
Sergey Tkachenko
Site Admin
Posts: 17565
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko »

No, it does not detect any encoding. It assumes that the file is either in UTF-8 or in DEFAULT_CHARSET.
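
Since the importer does not detect the encoding itself, the calling code has to decide before loading. A minimal, language-neutral sketch of that pre-check follows, written in Python rather than Delphi. The function name `sniff_html_encoding` is hypothetical and not part of RVHTML. It only looks for the UTF-8 BOM (EF BB BF) and for a `charset=` value inside a meta tag, which are the two hints mentioned in this thread. Mapping the result onto the Encoding property (for example, rvhtmleUTF8 when the result is "utf-8") is left to the caller.

```python
import codecs
import re

# Matches charset=... inside a <meta> tag, for example
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
_META_CHARSET = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)",
    re.IGNORECASE,
)


def sniff_html_encoding(data: bytes):
    """Return (encoding_name_or_None, data_without_utf8_bom).

    Hypothetical helper: checks the UTF-8 BOM first, then a meta charset
    declaration in the first 4096 bytes. Returns None as the encoding
    when neither hint is present, so the caller can fall back to its
    default charset.
    """
    # A UTF-8 BOM is the strongest hint; strip it so it is not displayed.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", data[len(codecs.BOM_UTF8):]

    # Otherwise look for a charset declaration near the top of the file.
    match = _META_CHARSET.search(data[:4096])
    if match:
        return match.group(1).decode("ascii").lower(), data

    return None, data
```

With a check like this, a file that starts with EF BB BF or declares `charset=utf-8` would be loaded with the Encoding property set to rvhtmleUTF8, and anything else would be left to the default charset.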