[vdr] VDR-1.3.41: speedup for cVideoRepacker
rnissl at gmx.de
Wed Feb 1 23:59:52 CET 2006
Jon Burgess wrote:
>> I don't think that it is worth a try as it tests every byte while the
>> above code tests most of the time only every third byte.
> I agree that your algorithm is clever and does greatly cut down the
> number of comparisons as compared to the old code.
> The glibc memchr() implementation does the comparisons 4 bytes at a time
> using a clever algorithm. It also has assembler optimised variants for
> some CPUs. I don't think that only doing a comparison of every 3rd byte
> wins you anything over memchr().
> I believe the bulk of the time taken by the routine is transferring all
> the data from memory into the CPU. Every byte of the data will have to
> be read into the CPU caches due to cacheline effects. I believe that the
> asm optimisations will take into account the possibilities of
> speculative readahead etc. I've not looked into the assembler to see
> whether it actually exploits this.
> I've attached the quickly hacked up test program that I wrote. The output
> is the time taken for many iterations of the 2 different algorithms.
> For me the difference is within the measurement noise. It certainly
> isn't any slower. I'd be interested to know whether it makes any
> difference on your EPIA, both in the test program and in VDR.
You were right. Using memchr() reduces CPU load on my 600 MHz EPIA
system by 1 % for channel ZDF and by 4 % for the HDTV channel HDFORUM.
The numbers were taken by just running VDR in transfer mode on the
respective channel (i.e. with no xine attached to VDR).
I also gave memmem() a try, but that change increased the CPU load.
Attached you'll find an updated patch according to your suggestion.
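For anyone reading this without the attachment, here is a rough sketch of the two approaches discussed above. It is not the code from the patch or from cVideoRepacker. It assumes the scan is looking for the MPEG start code prefix 00 00 01, and the function names find_start_code_memchr() and find_start_code_stride() are made up for illustration. Both return the offset of the first 0x00 of the prefix, or -1 if there is none.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: let memchr() find candidate 0x01 bytes, then
 * check whether the two preceding bytes are both 0x00. */
static long find_start_code_memchr(const unsigned char *data, size_t len)
{
    if (len < 3)
        return -1;

    const unsigned char *end = data + len;
    /* A 0x01 can complete the prefix no earlier than at offset 2. */
    const unsigned char *p = data + 2;

    while (p < end) {
        p = memchr(p, 0x01, (size_t)(end - p));
        if (p == NULL)
            return -1;
        if (p[-1] == 0x00 && p[-2] == 0x00)
            return (long)(p - 2 - data);
        p++; /* false candidate, continue behind it */
    }
    return -1;
}

/* Hypothetical helper: the "mostly every third byte" idea. i is the
 * position where the 0x01 of the prefix would have to be. */
static long find_start_code_stride(const unsigned char *data, size_t len)
{
    size_t i = 2;

    while (i < len) {
        unsigned char b = data[i];
        if (b == 0x00) {
            /* May be the 1st or 2nd byte of a prefix: step by one. */
            i += 1;
        } else if (b == 0x01 && data[i - 1] == 0x00 && data[i - 2] == 0x00) {
            return (long)(i - 2);
        } else {
            /* Neither of the next two positions can hold the closing
             * 0x01, because that would need data[i] == 0x00. */
            i += 3;
        }
    }
    return -1;
}
```

Calling either function on the bytes 47 00 00 01 E0 should give offset 1. On data with few zero bytes the second variant only touches every third byte, while the first one leaves the byte-by-byte search to the libc.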
Dipl.-Inform. (FH) Reinhard Nissl
mailto:rnissl at gmx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 15722 bytes
Desc: not available
Url : http://www.linuxtv.org/pipermail/vdr/attachments/20060201/69a084df/vdr-1.3.41-remux2.bin