GB18030 -> Unicode Codepoint Conversion

Hello,

Does anyone have experience of doing this? I gather that a lookup table is needed, since there is no neat way of converting a GB18030 codepoint (which can be 1, 2 or 4 bytes long) to a Unicode codepoint. We have been supplied with glyphs that cover the GB18030 requirements, but they are ordered as per Unicode. I don't believe reordering the glyphs (into GB18030 order) is an option, because we need to support UTF-16 reception for some markets.

Regards, Richard.
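
For anyone following along, the 1/2/4-byte structure mentioned in the question can be recognised from the byte values alone, before any table lookup. The following is only a rough sketch in Python: the function name `gb18030_seq_len` is made up for illustration, and the byte ranges are the ones usually quoted for GB18030 (single byte 0x00-0x7F; lead byte 0x81-0xFE followed either by a trail byte 0x40-0xFE other than 0x7F, or by digit / 0x81-0xFE / digit for the four-byte form).

```python
def gb18030_seq_len(buf: bytes, pos: int = 0) -> int:
    """Return the length (1, 2 or 4) of the GB18030 sequence starting at buf[pos].

    Hypothetical helper for illustration; raises ValueError on a truncated
    or malformed sequence.
    """
    b1 = buf[pos]
    if b1 <= 0x7F:
        return 1  # single-byte form (ASCII range)
    if not 0x81 <= b1 <= 0xFE:
        raise ValueError(f"invalid lead byte 0x{b1:02X}")
    if pos + 1 >= len(buf):
        raise ValueError("truncated sequence")
    b2 = buf[pos + 1]
    if 0x30 <= b2 <= 0x39:
        # Four-byte form: lead, digit (0x30-0x39), 0x81-0xFE, digit (0x30-0x39)
        if pos + 3 >= len(buf):
            raise ValueError("truncated four-byte sequence")
        b3, b4 = buf[pos + 2], buf[pos + 3]
        if 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39:
            return 4
        raise ValueError("malformed four-byte sequence")
    if 0x40 <= b2 <= 0xFE and b2 != 0x7F:
        return 2  # two-byte form
    raise ValueError(f"invalid second byte 0x{b2:02X}")
```

A decoder would call this first to find out how many bytes to consume, then use those bytes as the key into whatever lookup table is chosen.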

Richard Phillips

In the past I've used the codepage support in Windows .NET to produce translation tables from various multibyte character sets to/from Unicode.

Peter

Peter Dickerson

I have absolutely no idea how to do that! Any hints?

I'll look into it tomorrow, though.

Cheers, Richard.

Richard Phillips

Try this sort of thing...

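Encoding enc_cn = Encoding.GetEncoding(936);        // needs "using System.Text;"; 936 is the Windows code page for Simplified Chinese (GBK)
byte[] bytes = enc_cn.GetBytes("A".ToCharArray());  // the bytes that represent "A" in that code page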

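More useful in your case would be to feed in individual characters (which are Unicode) and get out an array of bytes that are the codepage 936 equivalent. Code page 936 is GBK (extended GB2312-80) rather than full GB18030; Windows exposes GB18030 itself as code page 54936, so that is the number to try if you need the four-byte range as well.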
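
The "feed in each Unicode character, collect the bytes" idea can be sketched as follows. This is an illustration only, and it swaps in Python's built-in `gb18030` codec in place of the .NET code page support described above; the function name `build_gb18030_to_unicode_table` is hypothetical. It walks the Basic Multilingual Plane, encodes each code point, and stores the reverse mapping, which is the direction the original question needs.

```python
def build_gb18030_to_unicode_table(first: int = 0x0000, last: int = 0xFFFF) -> dict:
    """Map GB18030 byte sequences to Unicode code points for first..last.

    Hypothetical helper: relies on Python's built-in "gb18030" codec,
    not on the .NET code page support mentioned in the thread.
    """
    table = {}
    for cp in range(first, last + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate halves are not characters; skip them
        try:
            gb = chr(cp).encode("gb18030")
        except UnicodeEncodeError:
            continue  # skip anything the codec declines to encode
        table[gb] = cp
    return table


if __name__ == "__main__":
    table = build_gb18030_to_unicode_table()
    # Dump in a form that is easy to turn into a C array for the target.
    for gb, cp in sorted(table.items())[:8]:
        print(f"{{0x{gb.hex().upper()}, 0x{cp:04X}}},")
```

For an embedded target the dictionary would presumably be written out as a sorted array and searched with a binary search; code points above U+FFFF are left out here because the table would become very large.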

see also

formatting link

OK, and now I have to wash my mouth out...

Peter

Peter Dickerson
