[linux-dvb] [PATCH] dvb-core: Fix
several locking related problems.
simon at fire.lp0.eu
Mon Mar 5 19:04:44 CET 2007
On 05/03/07 17:19, Oliver Endriss wrote:
> Simon Arlott wrote:
>> On Mon, March 5, 2007 11:19, Johannes Stezenbach wrote:
>>> On Mon, Mar 05, 2007 at 01:58:14AM +0100, Oliver Endriss wrote:
>>>> Simon Arlott wrote:
>>>>> Is any part of the patch going to be applied? I mentioned this
>>>>> problem in September last year and it looks like it's existed for
>>>>> years (the semaphore locking did the same thing).
>>>> Well, I hoped that someone more familiar with the demuxer stuff would
>>>> comment on the patch. I am not very happy about using non-interruptible
>>>> lock operations...
>> Why? If there are deadlocks these should be fixed, not ignored.
> Yes, they should be fixed, but they should not require a reboot.
What!? The fix for a deadlock is not a reboot... perhaps you misunderstood
what I meant - whatever ends up holding the lock forever should be fixed.
> From 'Linux Device Drivers' (replace 'down' by 'mutex_lock'):
> | ...
> | down decrements the value of the semaphore and waits as long as need
> | be. down_ interruptible does the same, but the operation is
> | interruptible. The interruptible version is almost always the one you
> | will want; it allows a user-space process that is waiting on a
> | semaphore to be interrupted by the user. You do not, as a general
> | rule, want to use noninterruptible operations unless there truly is no
> | alternative. Noninterruptible operations are a good way to create
> | unkillable processes (the dreaded D state seen in ps), and annoy
> | your users. Using down_interruptible requires some extra care,
I don't see this advice in Documentation/mutex-design.txt (although it
doesn't advise either way) or in Documentation/ at all.
> | however, if the operation is interrupted, the function returns a
> | nonzero value, and the caller does not hold the semaphore. Proper use
> | of down_interruptible requires always checking the return value and
Until you do this, you can't use down_interruptible.
> | responding accordingly.
> | ...
>>>> Anyway, I'll apply the patch to HG master if you submit an updated patch:
>>>> - Please add a line of comment to each mutex_lock() stating _why_ the
>>>> non-interruptible lock has to be used at this place.
>> What's the point of doing that?
> The point is to document why we do not use the interruptible version.
> Next year someone might submit a patch replacing mutex_lock by
> mutex_lock_interruptible, and no one remembers why mutex_lock is
> required at this place.
There is no danger of this, if someone submits a patch replacing it with
mutex_lock_interruptible and doesn't take into account every possible
calling function's check of the return value then their patch needs
changing before it can be accepted.
>>> IMHO using mutex_lock_interruptible() is almost always wrong.
>>> The only use it has in dvb-core is to recover from locking bugs --
>>> if it deadlocks you can Ctrl-C out of it
>>> (instead of being left with a non-killable program -> reboot).
The key phrase here is "locking *bugs*", if the code can lock forever
it needs to be fixed. The real cause of bugs will never be found if
interruptible locking is used everywhere when when it's not necessary :(
Also, this is in dvb-core not actual card drivers - why would there be
locking bugs in dvb-core itself?
>> This is what lockdep is for.
> Is lockdep activated in distribution kernels?
No, but developers could use it while testing. It only reports locking
bugs by checking how locks depend on other locks (see lockdep-design.txt),
I don't think it would spot code that just locked a mutex and then waited
forever (which is really bad code anyway).
More information about the linux-dvb