<br><br><div class="gmail_quote">On Sun, Dec 28, 2008 at 3:36 PM, Andy Walls <span dir="ltr"><<a href="mailto:awalls@radix.net">awalls@radix.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="Wj3C7c">On Sat, 2008-12-27 at 10:40 -0600, Mark Jenks wrote:<br>
> G'morning all! (at least it's morning here.)<br>
><br>
> I have a running Mythtv server that is running Suse 10.3 with a<br>
> hvr-1250 just fine on Kernel 2.6.24, and haven't had any problems at<br>
> all.<br>
><br>
> I tried to install a hvr-1800 in it yesterday, and I get a kernel oops<br>
> on it and X won't start. I compiled up a 2.6.27.10 kernel for it,<br>
> and moved to that, and I still get the oops. Checked my vmalloc and<br>
> I am fine, but increased it anyways to 384 just for grins.<br>
><br>
> I compiled v4l-dvb-cae6de452897 up against the 2.6.24, and the 2.6.27<br>
> kernels without any changes. Server boots just fine without the<br>
> 1800, but with it I get the oops.<br>
><br>
> The only thing that I can see, is that the 1250 and the 1800 look to<br>
> be using the same interrupt.<br>
><br>
> Here is more than enough debug info, I hope. :)<br>
><br>
> Thanks!<br>
><br>
> -Mark<br>
><br>
><br>
> BUG: unable to handle kernel NULL pointer dereference at 000001a0<br>
> IP: [<f8e5a594>] :cx23885:video_open+0x2c/0x150<br>
> *pde = 00000000<br>
> Oops: 0000 [#1] SMP<br>
> Modules linked in: iptable_filter ip_tables ip6_tables x_tables<br>
> cpufreq_conservative cpufreq_userspace cpufreq_powersave powernow_k8<br>
> xfs loop dm_mod cx25840 mt2131 s5h1409 nvidia(P) cx23885<br>
> v4l2_compat_ioctl32 cx2341x videobuf_dma_sg button videobuf_dvb<br>
> dvb_core videobuf_core v4l2_common snd_hda_intel snd_usb_audio<br>
> snd_usb_lib snd_mpu401 snd_cs4232 snd_opl3_lib snd_cs4231_lib snd_pcm<br>
> ohci1394 videodev v4l1_compat osst agpgart btcx_risc rtc_cmos<br>
> i2c_nforce2 snd_timer ieee1394 snd_mpu401_uart tveeprom sr_mod<br>
> snd_hwdep i2c_core rtc_core rtc_lib parport_pc parport st lirc_mceusb2<br>
> snd_rawmidi snd_seq_device snd k8temp hwmon cdrom forcedeth soundcore<br>
> snd_page_alloc lirc_dev sg usbhid hid ff_memless ohci_hcd ehci_hcd<br>
> usbcore sd_mod edd ext3 mbcache jbd fan aic7xxx scsi_transport_spi<br>
> sata_nv pata_amd libata scsi_mod dock thermal processor thermal_sys<br>
><br>
> Pid: 3178, comm: X Tainted: P (2.6.27.10-default #3)<br>
> EIP: 0060:[<f8e5a594>] EFLAGS: 00013287 CPU: 1<br>
> EIP is at video_open+0x2c/0x150 [cx23885]<br>
> EAX: 00000000 EBX: 00000000 ECX: f7a9f000 EDX: f7a0e000<br>
> ESI: 00000000 EDI: 00000000 EBP: 00000000 ESP: f764de90<br>
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068<br>
> Process X (pid: 3178, ti=f764c000 task=f7398c00 task.ti=f764c000)<br>
> Stack: f7a6e540 00000000 f7b16538 00000000 f7bc30a0 c016bee5 f7a6e540<br>
> 00000000<br>
> f7a6e540 f7bc30a0 00000000 c016bdd9 c01683cd f701ebc0 f6d756c0<br>
> f764df14<br>
> f7a6e540 f764df14 00000003 c01684d8 f7a6e540 00000000 00000000<br>
> f764df14<br>
> Call Trace:<br>
> [<c016bee5>] chrdev_open+0x10c/0x122<br>
> [<c016bdd9>] chrdev_open+0x0/0x122<br>
> [<c01683cd>] __dentry_open+0x10d/0x1fc<br>
> [<c01684d8>] nameidata_to_filp+0x1c/0x2c<br>
> [<c0172986>] do_filp_open+0x33d/0x63e<br>
> [<f9b7d8ce>] _nv004117rm+0x9/0x12 [nvidia]<br>
> [<c01582f8>] handle_mm_fault+0x2b3/0x5dd<br>
> [<c017ab2d>] alloc_fd+0x57/0xd3<br>
> [<c01681e8>] do_sys_open+0x3f/0xb8<br>
> [<c01682a5>] sys_open+0x1e/0x23<br>
> [<c01037ad>] sysenter_do_call+0x12/0x21<br>
> =======================<br>
> Code: 31 ed 57 31 ff 56 31 f6 53 83 ec 04 89 14 24 8b 58 34 e8 16 18<br>
> 46 c7 8b 15 d0 ad e6 f8 81 e3 ff ff 0f 00 eb 49 8b 82 84 0d 00 00 <39><br>
> 98 a0 01 00 00 75 07 89 d6 bf 01 00 00 00 8b 82 88 0d 00 00<br>
> EIP: [<f8e5a594>] video_open+0x2c/0x150 [cx23885] SS:ESP 0068:f764de90<br>
> ---[ end trace c26ff07c077248e0 ]---<br>
<br>
</div></div>Mark,<br>
<br>
Using the same interrupt isn't the problem.<br>
<br>
Here's the gory translation of the Oops data:<br>
<br>
<br>
The problem is tripped in cx23885-video.c:video_open():<br>
<br>
777 static int video_open(struct inode *inode, struct file *file)<br>
778 {<br>
779 int minor = iminor(inode);<br>
780 struct cx23885_dev *h, *dev = NULL;<br>
781 struct cx23885_fh *fh;<br>
782 struct list_head *list;<br>
783 enum v4l2_buf_type type = 0;<br>
784 int radio = 0;<br>
785<br>
786 lock_kernel();<br>
787 list_for_each(list, &cx23885_devlist) {<br>
788 h = list_entry(list, struct cx23885_dev, devlist);<br>
789 if (h->video_dev->minor == minor) {<br>
790 dev = h;<br>
791 type = V4L2_BUF_TYPE_VIDEO_CAPTURE;<br>
792 }<br>
793 if (h->vbi_dev &&<br>
794 h->vbi_dev->minor == minor) {<br>
795 dev = h;<br>
796 type = V4L2_BUF_TYPE_VBI_CAPTURE;<br>
797 }<br>
[...]<br>
<br>
Also note the list_entry() & list_for_each() macro definitions:<br>
<br>
425 #define list_entry(ptr, type, member) \<br>
426 container_of(ptr, type, member)<br>
[...]<br>
444 #define list_for_each(pos, head) \<br>
445 for (pos = (head)->next; prefetch(pos->next), pos != (head); \<br>
446 pos = pos->next)<br>
<br>
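For anyone unfamiliar with these macros, here is a rough user-space sketch<br>
of what they do; it is not the kernel code. The list node is embedded in the<br>
device structure, and list_entry() steps back from the node to the structure<br>
that contains it. The names demo_dev, demo_list_add_tail() and demo_sum_ids()<br>
are made up for illustration, and the prefetch() hint is left out.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_for_each(pos, head) \
	for (pos = (head)->next; pos != (head); pos = pos->next)

/* Hypothetical device: the list node lives inside the structure. */
struct demo_dev {
	int id;
	struct list_head devlist;
};

/* Link node in just before head, i.e. at the tail of the circular list. */
static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	assert(node != NULL && head != NULL);
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Walk the list and recover each enclosing demo_dev from its node. */
static int demo_sum_ids(struct list_head *head)
{
	struct list_head *pos;
	int sum = 0;

	list_for_each(pos, head) {
		struct demo_dev *d = list_entry(pos, struct demo_dev, devlist);
		sum += d->id;
	}
	return sum;
}
```

An empty list is a head whose next and prev point at itself, so the loop<br>
body never runs for it.<br>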
<br>
<br>
The code bytes dumped in the Oops disassemble to:<br>
<br>
1: 31 ed xor %ebp,%ebp<br>
3: 57 push %edi<br>
4: 31 ff xor %edi,%edi<br>
6: 56 push %esi<br>
7: 31 f6 xor %esi,%esi<br>
9: 53 push %ebx<br>
a: 83 ec 04 sub $0x4,%esp<br>
d: 89 14 24 mov %edx,(%esp)<br>
10: 8b 58 34 mov 0x34(%eax),%ebx <--- line 779: minor = iminor(inode);<br>
13: e8 16 18 46 c7 call 0xc746182e <--- line 786: lock_kernel()<br>
18: 8b 15 d0 ad e6 f8 mov 0xf8e6add0,%edx <--- line 445: list = (&cx23885_devlist)->next;<br>
1e: 81 e3 ff ff 0f 00 and $0xfffff,%ebx <--- line 779: minor = iminor(inode);<br>
24: eb 49 jmp 0x6f <--- jmp to for loop condition check: line 445: prefetch(list->next), list != &cx23885_devlist;<br>
26: 8b 82 84 0d 00 00 mov 0xd84(%edx),%eax <--- line 426 & 789: h = container_of(list, struct cx23885_dev, devlist); if (h->video_dev...<br>
2c: 39 98 a0 01 00 00 cmp %ebx,0x1a0(%eax) <--- Oops occurs here: line 789: if (h->video_dev->minor == minor) {<br>
32: 75 07 jne 0x3b<br>
34: 89 d6 mov %edx,%esi <--- line 790: dev = h;<br>
36: bf 01 00 00 00 mov $0x1,%edi <--- line 791: type = V4L2_BUF_TYPE_VIDEO_CAPTURE;<br>
3b: 8b 82 88 0d 00 00 mov 0xd88(%edx),%eax <--- line 793: if (h->vbi_dev ...<br>
<br>
<br>
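The faulting address in the Oops fits this disassembly. As a small sketch,<br>
assuming a hypothetical structure demo_video_device whose "minor" sits 0x1a0<br>
bytes in (the offset is taken from the cmp instruction above, not from the<br>
real struct video_device definition), a read of "minor" through a NULL<br>
pointer touches address 0x1a0, which is what the BUG line reports:<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the offset of "minor" matters here. */
struct demo_video_device {
	unsigned char other_fields[0x1a0]; /* placeholder for earlier members */
	int minor;
};

/* Compile-time check that the placeholder really puts minor at 0x1a0. */
static_assert(offsetof(struct demo_video_device, minor) == 0x1a0,
	      "minor is expected at offset 0x1a0 in this sketch");

/*
 * Address that a load of base->minor would touch, computed with plain
 * integer arithmetic so that nothing is actually dereferenced.
 */
static uintptr_t demo_fault_address(uintptr_t base)
{
	return base + offsetof(struct demo_video_device, minor);
}
```

Calling demo_fault_address(0) models the NULL case: base 0 plus the member<br>
offset.<br>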
So "h->video_dev" (I think) was "NULL" in this call to video_open():<br>
EAX is 0 in the register dump, and the faulting address 000001a0 is just<br>
the 0x1a0 displacement from the cmp instruction.<br>
<br>
That means one of the "struct cx23885_dev" entries on "cx23885_devlist"<br>
has no "video_dev", and the list iteration in<br>
cx23885-video.c:video_open() dereferences it without checking.<br>
<br>
If one of these devices only has DVB support and no analog V4L support,<br>
then it would make sense why one of them would have "h->video_dev" set<br>
to NULL. The device shouldn't have a V4L2 "video_dev" if it doesn't<br>
support analog (V4L2) devices. I believe the 1800 supports analog video<br>
but the 1250 does not (someone correct me on this if I'm wrong - I'm no<br>
expert on these devices).<br>
<br>
The iteration loop in video_open() needs to be careful about NULL<br>
pointer dereference of h->video_dev for DVB only devices.<br>
<br>
Try this patch:<br>
<br>
diff -r cae6de452897 linux/drivers/media/video/cx23885/cx23885-video.c<br>
--- a/linux/drivers/media/video/cx23885/cx23885-video.c Fri Dec 26 08:07:39 2008 -0200<br>
+++ b/linux/drivers/media/video/cx23885/cx23885-video.c Sun Dec 28 16:34:04 2008 -0500<br>
@@ -786,7 +786,8 @@ static int video_open(struct inode *inod<br>
lock_kernel();<br>
list_for_each(list, &cx23885_devlist) {<br>
h = list_entry(list, struct cx23885_dev, devlist);<br>
- if (h->video_dev->minor == minor) {<br>
+ if (h->video_dev &&<br>
+ h->video_dev->minor == minor) {<br>
dev = h;<br>
type = V4L2_BUF_TYPE_VIDEO_CAPTURE;<br>
}<br>
<br>
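The effect of the extra check can be tried outside the kernel. This is a<br>
minimal sketch, assuming a plain array in place of cx23885_devlist and<br>
made-up types (demo_dev, demo_video_device) with only the fields the loop<br>
looks at; demo_find_by_minor() is a hypothetical helper, not a driver<br>
function.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the V4L2 device node. */
struct demo_video_device { int minor; };

/* Hypothetical stand-in for struct cx23885_dev. */
struct demo_dev {
	const char *name;
	struct demo_video_device *video_dev; /* NULL on a DVB-only card */
	struct demo_video_device *vbi_dev;   /* may also be NULL */
};

/*
 * Return the device that owns the given minor, or NULL if none does.
 * Both pointers are checked before use, as in the patched loop.
 */
static const struct demo_dev *demo_find_by_minor(const struct demo_dev *devs,
						 size_t n, int minor)
{
	const struct demo_dev *found = NULL;
	size_t i;

	assert(devs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		const struct demo_dev *h = &devs[i];

		if (h->video_dev && h->video_dev->minor == minor)
			found = h;
		if (h->vbi_dev && h->vbi_dev->minor == minor)
			found = h;
	}
	return found;
}
```

With a DVB-only entry (video_dev == NULL) ahead of an analog-capable one,<br>
the lookup skips the first entry and still finds the second.<br>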
<br>
<br>
If it doesn't work, you'll need to find someone with access to an HVR-1250<br>
and HVR-1800 in the same machine to do more interactive debugging (Andy<br>
Walls' thought experiments can only take one so far....).<br>
<br>
I can't help further since I don't have any CX23885 based cards.<br>
<br>
Regards,<br>
Andy<br>
<div><div></div><div class="Wj3C7c"></div></div></blockquote><div><br>Andy,<br> </div><div>You are correct. They are both cx23885 cards, and only one of them has an analog input. The 1250 is DVB only; the 1800 is also DVB, but it is an MCE card with analog inputs (S-Video, etc.).<br>
<br>I will give your patch a try tomorrow. I'm kind of tired of pulling my media computer out and swapping the card in and out 20 times in one day trying to figure this out.<br><br>Thanks!<br><br>-Mark <br></div></div>
<br>