Mailing List archive


[vdr] Re: Display scanning while access recordings



Hi Klaus,

> I tried using the ftw() function, but it would appear that this takes
> a lot longer than the 'find' command.
> 
> I took the following test program:
> 
> ------------------------------------------
> #include <ftw.h>
> #include <stdio.h>
>   
> int Filter(const char *Name, const struct stat *Stat, int Status)
> {
>   printf("%s\n", Name);
>   return 0;
> }
> 
> int main(void)
> { 
>   ftw("/", Filter, 10);
>   return 0;
> } 
> ------------------------------------------
> 
> and compiled it with 'g++' into 'a.out'. Then I did this:
> 
> ------------------------------------------
> kls@panther:/home/kls/vdr/VDR > time find / 2> /dev/null | wc -l
>  144488
> 
> real    0m0.780s
> user    0m0.260s
> sys     0m0.520s
> kls@panther:/home/kls/vdr/VDR > time find / 2> /dev/null | wc -l
>  144518
> 
> real    0m0.772s
> user    0m0.260s
> sys     0m0.520s
> kls@panther:/home/kls/vdr/VDR > time a.out | wc -l
>  144423
> 
> real    0m2.765s
> user    0m0.520s
> sys     0m2.050s
> kls@panther:/home/kls/vdr/VDR > time a.out | wc -l
>  144423
> 
> real    0m2.576s
> user    0m0.340s
> sys     0m2.240s
> ------------------------------------------

The reason for this is that ftw() on Linux always does the stat() call
with the full pathname, so the kernel has to resolve every path
component again for each file, whereas find actually changes into each
directory and does the stat() call there.

> 
> To make sure both calls find the entire directory structure in memory
> I repeated the calls several times.
> 
> Looking at these numbers I don't think it is a good idea to use ftw()
> instead of 'find'. Or am I missing something here?

IMHO, CPU time is not the problem here. Look at the above results: the
find over the complete filesystem finished in less than a second. The
video subtree is normally much smaller, yet a lot of people have noticed
that it sometimes takes several seconds to display the recording menu.

The problem is the buffer cache of the system. When reading/writing the
large video files, the buffers containing the cached inode information
are thrown out of the cache. When you then run the find for the first
time, the find command itself has to be paged in and then all the disk
blocks containing the inode information have to be read in again. This
increases the real time considerably, which is then a factor of 100
higher than the combined user+sys times. Once you have done the read,
exit the menus and enter the recording menu again immediately, it is
much faster because the data is already in the buffers and doesn't have
to be read again.

> 
> Maybe the time difference won't hurt when there are only few directories
> to traverse (as is usually the case with VDR's /video), but since we are
> looking for ways to speed up things, I guess it's not a good idea to switch
> to a method that is known to be slower...

The question is whether ftw() is really slower in the situation where
we actually have this problem. Just saving the exec of find may more
than compensate for the stat() calls with the long pathname.

> 
> At first I liked the idea of an internal function instead of an external
> command, because it would get rid of another dependency, but after these
> tests I guess I'll stay with the 'find' - unless somebody can tell me that
> I did something wrong.

You did nothing wrong, except that you didn't compare the results in the
situation where the delay happens. As you know I am using
"ls -d */* */*/*" instead of the find command and this is noticeably
faster. The reason for this is that the leaves of the tree (video files)
are not read in and checked. When we know the structure of the tree, as
we do with the video directory, it may be even better to use this
knowledge to optimize the situation. What about reading only those parts
of the tree with getdents that are relevant?

Of course, none of this will fully solve the problem of having to read
the information from the disk again. For this you will either have to
keep this information cached and updated in memory, which may leave you
with stale information if someone manipulates the tree by hand, or
start the find command in the background in advance, e.g. when entering
the main menu or at some other place, to re-cache the tree information
in the buffer cache. When the recording menu is entered, the find
started there will then be much faster.

Emil


