Mailing List archive


[linux-dvb] Re: DVB character coding...



From: "Jesper Sörensen" <jesper@datapartner.se>
> The wording in annex A isn't that good and I had some problems figuring
> out what they meant too. Anyway, I wouldn't look too carefully at those
> tables. I think what they mean is that unless some other coding is
> specified you should use Latin-1 (ISO 8859-1) which makes sense since it
> is the most widely used coding in the west and on the net. The 0xE9 will
> then indeed be mapped into "é" like expected.

I think the wording is pretty clear:

| Annex A (normative):
| Coding of text characters
[...]
| if the first byte of the text field has a value in the range "0x20"
| to "0xFF" then this and all subsequent bytes in the text item are
| coded using the default character coding table (table 00 - Latin
| alphabet) of figure A.1

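Figure A.1 is a superset of ISO/IEC 6937, *not* any of the ISO/IEC 8859-x
tables. In this table, 0xC2 is the non-spacing acute accent, and it precedes
the letter it modifies. The character "é" therefore has to be composed with
the two-byte sequence 0xC2 0x65; a lone 0xE9 does not map to "é".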
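
To make the difference concrete, here is a minimal Python sketch of how a
decoder for the default table could handle composed characters. It is only
an illustration under stated assumptions: the name decode_default_table and
the NON_SPACING map are hypothetical, only five of the non-spacing
diacritics are listed, bytes 0x20 to 0x7E are treated as plain ASCII, and
everything else in figure A.1 is rejected rather than mapped.

```python
import unicodedata

# Partial, illustrative map: ISO/IEC 6937 non-spacing diacritic byte
# -> Unicode combining mark. The real table has more entries.
NON_SPACING = {
    0xC1: "\u0300",  # grave
    0xC2: "\u0301",  # acute
    0xC3: "\u0302",  # circumflex
    0xC4: "\u0303",  # tilde
    0xC8: "\u0308",  # diaeresis
}


def decode_default_table(data: bytes) -> str:
    """Decode a DVB text item that uses the default table (sketch only)."""
    if not data:
        return ""
    if data[0] < 0x20:
        # A first byte below 0x20 selects a different character table.
        raise ValueError("text item does not use the default table")

    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in NON_SPACING:
            # The diacritic comes first, the base letter follows it.
            if i + 1 >= len(data) or not 0x20 <= data[i + 1] <= 0x7E:
                raise ValueError("diacritic without a base letter")
            # Unicode puts the combining mark after the base letter.
            out.append(chr(data[i + 1]) + NON_SPACING[b])
            i += 2
        elif 0x20 <= b <= 0x7E:
            # Assumption: this range is treated as plain ASCII.
            out.append(chr(b))
            i += 1
        else:
            raise ValueError(f"byte 0x{b:02X} not covered by this sketch")

    # Fold base letter + combining mark into a precomposed character.
    return unicodedata.normalize("NFC", "".join(out))


composed = decode_default_table(b"caf\xc2\x65")
misread = b"caf\xc2\x65".decode("latin-1")
```

Here composed holds the word with a proper "é", while misread shows what a
decoder gets if it wrongly assumes Latin-1 for the same bytes: 0xC2 turns
into a separate "Â" in front of a plain "e".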

Note that this is a _normative_ Annex, i.e. it is part of the standard,
not an option. It does appear, though, that not even professional tools
implement character encoding/decoding in full compliance with this
standard...

Regards,
--
Robert Schlabbach
e-mail: robert_s@gmx.net
Berlin, Germany




