Mailing List archive


[vdr] Re: system hangs after deleting records in vdr



>>>>My system is a 1 GHz Thunderbird, NMC 8TAX+ MB, 256 MB RAM,
>>>>Hauppauge Rev 1.3 Sat card, 80 Maxtor disk, SuSE 7.2.
>>SuSE 7.2 comes with kernel 2.4.4, and that is the one I use.

--- cut ---

>>>What kind of IDE cable do you have?
>>>How long is the cable?
>>UDMA cable, that comes with the mainboard, very short
>
>>>Doesn't the NMC have the buggy VIA IDE chipset? (I don't know; some
>>>days ago there were some problems.)
>>I have the newest BIOS update.
>
>>>Did you try to build kernels 10 times or more in a script as a test?
>>>(On that disk)
>>Yes, it ran for one day with no problems.
>>I checked the memory with the Linux memtest program; no problems.
>
>If you can, swap the memory module to make sure that
>it is not weak.
I checked it with very good memtest programs; it's OK.


>>Do you really think, that this could be a problem with the
>>disk or chipset?
>
>Your log file ("unprintables") shows that the machine
>crashed very hard.
>The corruption of the FAT could also be an indicator
>of hardware problems.
>Sometimes this is caused by IDE cables that are too long.
They are very short!

>It is very unlikely that an application like vdr
>would be able to crash the box so hard just by deleting.
>But:
>Deleting such big files takes really long
>(sometimes more than the "famous" 10 seconds a user will wait).

Until last month I used the vfat filesystem. There I often had the problem that
it became read-only, so I wrote a script that checks the state of the filesystem
every second and, if it is mounted ro, remounts it rw.
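
A watchdog of this kind could look roughly like the following Python sketch. It is not the original script: the mount point /video is only assumed from the paths in the log, and the function names are made up for illustration. It reads /proc/mounts once per interval and calls "mount -o remount,rw" when the entry carries the ro option.

```python
import subprocess
import time


def is_read_only(mount_point, mounts_text):
    """Return True if mount_point appears with the 'ro' option.

    mounts_text is the content of /proc/mounts (device, mount point,
    fs type, options, ...). If a mount point is listed more than once,
    the last entry wins, since it is the most recent mount.
    """
    read_only = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            read_only = "ro" in fields[3].split(",")
    return read_only


def watch(mount_point="/video", interval=1.0):
    """Poll forever and remount read-write whenever the fs went read-only.

    "/video" is an assumption taken from the log paths; needs root.
    """
    while True:
        with open("/proc/mounts") as f:
            mounts_text = f.read()
        if is_read_only(mount_point, mounts_text):
            # Failures are ignored on purpose; the next poll tries again.
            subprocess.run(["mount", "-o", "remount,rw", mount_point], check=False)
        time.sleep(interval)

# Usage (runs until interrupted): watch("/video")
```

Note that such a watchdog only hides the symptom: a filesystem that keeps switching to read-only is usually reporting errors it has already detected.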

When the system crashed, I ran the Windows 2000 check-disk program, which
found cross-linked files and restored directories that vdr had previously deleted.

But the strange thing is: I formatted the disk with reiserfs and did all the tests,
like day-long kernel compiling, filling the disk with data, and so on. Nothing
happened. And then the same thing happened again (as shown in my first posting):

Oct 12 23:36:20 video vdr[745]: removing /video/jag/2001-09-16.14.58.99.99.del/marks.vdr
Oct 12 23:36:20 video vdr[745]: removing /video/jag/2001-09-16.14.58.99.99.del/index.vdr
Oct 12 23:36:20 video vdr[745]: removing /video/jag/2001-09-16.14.58.99.99.del/001.vdr
Oct 12 23:36:23 video vdr[745]: removing /video/jag/2001-09-16.14.58.99.99.del/resume.vdr
Oct 12 23:36:23 video vdr[745]: removing /video/jag/2001-09-16.14.58.99.99.del
Oct 12 23:36:23 video vdr[745]: removing /video/jag
Oct 12 23:36:23 video vdr[745]: max. latency time 3 seconds
### here unprintable characters, something dies! ###
Oct 12 23:58:23 video sshd[376]: Server listening on :: port 22.
Oct 12 23:58:26 video /usr/sbin/cron[594]: (CRON) STARTUP (fork ok) 
Oct 12 23:58:26 video kernel: klogd 1.3-3, log source = /proc/kmsg started.
Oct 12 23:58:26 video kernel: Inspecting /System.map
Oct 12 23:58:26 video kernel: Loaded 10487 symbols from /System.map.

and at 23:58:23 the machine was up again.

To find out something about these strange things, I created another script that
writes the current time to a logfile every second. When the machine is restarted,
the last logfile (and so the last time the system was alive) is kept.
And in this case, the last logfile entry was at 23:47:45!?
So for some minutes, parts of the system were still working.
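
For anyone who wants to reproduce this measurement, a heartbeat logger along these lines might look like the Python sketch below. It is not the original script; the file name /var/log/heartbeat.log and the ".last" suffix are assumptions chosen for illustration. Each beat overwrites the file and forces it to disk with fsync, so the timestamp found after a reboot is the last moment at which writes still reached the disk.

```python
import os
import time


def rotate(path):
    """Keep the previous run's heartbeat as path + '.last' (call once at boot)."""
    if os.path.exists(path):
        os.replace(path, path + ".last")


def beat(path, now=None):
    """Overwrite path with a timestamp and flush it to disk.

    now is seconds since the epoch; None means the current time.
    Returns the timestamp string that was written.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "w") as f:
        f.write(stamp + "\n")
        f.flush()
        os.fsync(f.fileno())  # without this the entry may sit in the page cache
    return stamp


def run(path="/var/log/heartbeat.log", interval=1.0):
    """Hypothetical entry point: rotate once, then beat every interval seconds."""
    rotate(path)
    while True:
        beat(path)
        time.sleep(interval)

# Usage (runs until interrupted): run()
```

With the times from this crash, the last vdr entry is at 23:36:23 and the last heartbeat at 23:47:45, so the heartbeat kept running for 11 minutes and 22 seconds after syslog stopped producing readable entries.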

I saw that Axel knows this problem. I don't think it's a hardware problem either,
but where can it come from?

I think one possible problem is that deleting such huge files (about 2 GB)
takes some time and the deletion of the directory comes too fast (in my case,
deleting the vdr file took 3 seconds, then the directory was deleted).
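
One way to test this hypothesis outside of vdr would be to delete a recording by hand with an explicit sync between the two steps. The Python sketch below does that; it is only an experiment under stated assumptions, not how vdr deletes recordings, and the function names are made up. It removes the files, calls fsync on the directory so the unlinks reach the disk, and only then removes the directory itself.

```python
import os


def fsync_dir(path):
    """Flush a directory's own metadata (its entries) to disk. Linux only."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def remove_recording(rec_dir):
    """Delete all files in rec_dir, sync, then delete rec_dir and sync its parent.

    Assumes rec_dir contains only plain files, as in the log above
    (marks.vdr, index.vdr, 001.vdr, resume.vdr).
    """
    for name in sorted(os.listdir(rec_dir)):
        os.remove(os.path.join(rec_dir, name))
    fsync_dir(rec_dir)  # make sure the unlinks are on disk first
    parent = os.path.dirname(os.path.abspath(rec_dir))
    os.rmdir(rec_dir)
    fsync_dir(parent)  # then make the directory removal durable
```

If the crash still occurs with this ordering, the timing between file and directory deletion is probably not the cause.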

One hint is that in the vfat case, check disk restored the deleted directory
again. I ran reiserfsck on my disk, now formatted with reiserfs, and found errors
after the crash like

"block 142213xx is not marked as used in the disk bitmap" for many blocks
or
"bad_indirect_item: block 20383: item 150 153 0x51e07001 IND, len 4048, entry
count 0, fsck need 0, format old has a pointer 44 to the block 14221330 which is
in tree already"

and they were definitely caused by that crash, because some old recordings show a few seconds of a new one (cross-linked blocks)!

Does anybody have any ideas?

Harald



