[linux-dvb] Re: DVB character coding...



Gerd Knorr wrote:
> "Robert Schlabbach" <robert_s@gmx.net> writes:
> 
> > But the codings 0x12 and 0x13 bring up another problem: In KSC5601 and
> > GB2312, the codes 0x80 through 0x9F are used as lead bytes - but DVB
> > defines them as control codes. Obviously you can't have the same byte serve
> > two different meanings. I suppose there simply are no control codes for
> > these character codings?
> 
> My code does control code handling for codings < 0x10 only.  I don't
> remember why I did it that way, though.  Maybe because the specs said
> so, but it also might be because it's unclear how control codes are
> supposed to work with the multibyte encodings.

Hm, for 0x13 I found the following comment in our code:

FIXME: document 595.doc on dvb.org states:
1. If the value of leading byte is "0x13"; then the remaining bytes are
   coded in pairs with the Big5 subset of Unicode 3.0. This Big5 subset can
   be round-trip transcoded to the Big5 character standard [5] without loss
   of information. This Big5 subset of Unicode 3.0 contains all 13,053
   characters of Big5 character standard [5].

(I don't know who wrote that or what 595.doc is.)

For double-byte character sets the control codes are 0xe080 ... 0xe09f,
i.e. the single-byte codes 0x80 ... 0x9f with a 0xe0 prefix.
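
To make the difference concrete, here is a minimal Python sketch of
stripping control codes in both cases. The function name
strip_control_codes and the two_byte flag are made up for illustration,
and the payload is assumed to have had its leading coding-selector byte
removed already. It is not taken from any existing driver or tool.

```python
def strip_control_codes(payload: bytes, two_byte: bool) -> bytes:
    """Drop DVB control codes from a text payload.

    Assumption: `payload` no longer contains the leading selector byte.
    """
    if not two_byte:
        # Single-byte tables: 0x80 ... 0x9F are control codes.
        return bytes(b for b in payload if not 0x80 <= b <= 0x9F)

    out = bytearray()
    # Two-byte tables: walk the payload in pairs and drop 0xE080 ... 0xE09F.
    for i in range(0, len(payload) - 1, 2):
        hi, lo = payload[i], payload[i + 1]
        if hi == 0xE0 and 0x80 <= lo <= 0x9F:
            continue
        out += payload[i:i + 2]
    if len(payload) % 2:
        # Keep a dangling odd byte untouched rather than guessing.
        out.append(payload[-1])
    return bytes(out)
```

Note that in the two-byte case a byte in 0x80 ... 0x9F that is not preceded
by 0xE0 in the same pair is left alone, which is what avoids the lead-byte
clash discussed above.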

> > <RTL> Television - Long name is "RTL Television", short name is "RTL"
> > <S>uper< RTL>    - Long name is "Super RTL", short name is "S RTL"
> > 
> > This is something I didn't know before...
> 
> Interesting, I didn't know either ;)

I did ;-) However, I was too lazy to implement it in dvbscan
and instead put the following comment there:
        /* remove control characters (FIXME: handle short/long name) */
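
For reference, a rough Python sketch of what handling the short/long name
could look like for single-byte codings. It assumes the "<" and ">" in the
examples above stand for the control codes 0x86 (emphasis on) and 0x87
(emphasis off); split_names is a hypothetical helper, not what dvbscan
does. Falling back to the long name when nothing is marked is a design
choice of this sketch only.

```python
EMPHASIS_ON = 0x86   # assumed: start of the short-name part
EMPHASIS_OFF = 0x87  # assumed: end of the short-name part


def split_names(raw: bytes):
    """Return (long_name, short_name) for a single-byte coded name."""
    long_name = bytearray()
    short_name = bytearray()
    emphasized = False
    for b in raw:
        if b == EMPHASIS_ON:
            emphasized = True
        elif b == EMPHASIS_OFF:
            emphasized = False
        elif 0x80 <= b <= 0x9F:
            # Any other control code is simply dropped.
            continue
        else:
            long_name.append(b)
            if emphasized:
                short_name.append(b)
    # No emphasized part at all: use the long name as the short name too.
    return bytes(long_name), bytes(short_name or long_name)
```

With the two examples quoted above, b"\x86RTL\x87 Television" gives
"RTL Television" / "RTL", and b"\x86S\x87uper\x86 RTL\x87" gives
"Super RTL" / "S RTL".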

> BTW: I've seen ^Z (0x1a) in EIT descriptions, what the heck does that
> mean?  I noticed because XML parsers refuse to accept the files if
> you stuff that as-is into an XML file ...

Probably a M$-DOS EOF?
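
Whatever the ^Z is supposed to mean, XML 1.0 does not allow it (nor most
other C0 control characters), so the text has to be filtered before it is
written out. A small Python sketch, assuming the description has already
been decoded to a Unicode string; xml_safe is an illustrative name:

```python
import re

# Everything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_XML_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_safe(text: str) -> str:
    """Remove characters that an XML 1.0 parser would reject."""
    return _XML_INVALID.sub("", text)
```

This only removes the offending characters; escaping of "<", ">" and "&"
still has to be done separately.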

Johannes



