Mailing List archive


[linux-dvb] Re: DVB character coding...



Robert Schlabbach wrote:
> From: "Jesper Sörensen" <jesper@datapartner.se>
> > The wording in annex A isn't that good and I had some problems figuring
> > out what they meant too. Anyway, I wouldn't look too carefully at those
> > tables. I think what they mean is that unless some other coding is
> > specified you should use Latin-1 (ISO 8859-1) which makes sense since it
> > is the most widely used coding in the west and on the net. The 0xE9 will
> > then indeed be mapped to "é" as expected.
> 
> I think the wording is pretty clear:
> 
> | Annex A (normative):
> | Coding of text characters
> [...]
> | if the first byte of the text field has a value in the range "0x20"
> | to "0xFF" then this and all subsequent bytes in the text item are
> | coded using the default character coding table (table 00 - Latin
> | alphabet) of figure A.1
> 
> Figure A.1 is a superset of ISO/IEC 6937, *not* any of the ISO/IEC 8859-x
> tables. Using this table, the character "é" would have to be composed with
> the sequence 0xC2 0x65.
> 
> Note that this is a _normative_ Annex, i.e. this is part of the standard,
> not an option. It does appear, though, that not even professional tools
> properly implement character encoding/decoding that fully complies with
> this standard...

In my experience, a lot of broadcasters violate the
standard and simply assume ISO 8859-1. Sad.

Johannes
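
To illustrate the rule quoted from Annex A, here is a minimal Python sketch of decoding a text field that uses the default table. It is not a complete decoder: the function name decode_default_table and the NON_SPACING map are made up for this example, only a handful of the non-spacing diacritic bytes are listed, bytes 0x20-0x7E are assumed to match ASCII, and everything else is replaced with U+FFFD.

```python
import unicodedata

# Partial, illustrative map of non-spacing diacritic bytes (ISO/IEC 6937
# style) to Unicode combining marks. The real table has more entries.
NON_SPACING = {
    0xC1: "\u0300",  # grave accent
    0xC2: "\u0301",  # acute accent
    0xC3: "\u0302",  # circumflex
    0xC4: "\u0303",  # tilde
    0xC8: "\u0308",  # diaeresis
}


def decode_default_table(data: bytes) -> str:
    """Decode a text field whose first byte is in the range 0x20..0xFF."""
    if not data:
        return ""
    if data[0] < 0x20:
        # A first byte below 0x20 selects another table; not handled here.
        raise ValueError("first byte selects a non-default character table")

    out = []
    i = 0
    while i < len(data):
        b = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else None
        if b in NON_SPACING and nxt is not None and 0x20 <= nxt < 0x7F:
            # The diacritic byte comes first, then the base letter.
            # Unicode puts the combining mark after the base, so swap them
            # and let NFC normalization produce the precomposed character.
            out.append(unicodedata.normalize("NFC", chr(nxt) + NON_SPACING[b]))
            i += 2
        elif 0x20 <= b < 0x7F:
            # Assumption of this sketch: this range is treated as ASCII.
            out.append(chr(b))
            i += 1
        else:
            # Anything this sketch does not cover becomes U+FFFD.
            out.append("\ufffd")
            i += 1
    return "".join(out)
```

With this sketch, the two-byte sequence 0xC2 0x65 decodes to "é", while a lone 0xE9 does not, which is the difference between the default table and ISO 8859-1 discussed above.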



