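I. Test Environment
1. Software
a) VMware version: VMware Workstation 12.5.7
b) Ubuntu version: 9.10
c) Kernel version: 2.6.31.14
d) gcc version: 4.4.1
e) gdb version: 7.0
2. Camera hardware
A UVC camera made in-house by 百问网 (100ask)
3. Tools used during troubleshooting
a) printk
b) objdump
c) strace
d) gdb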
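II. Preface
When writing a C program, if you define a function with a return value but leave out the return statement at the end of the function body, the program still compiles and runs, yet it can sometimes produce unexpectedly serious consequences. I had only ever read about this in textbooks, and what is learned from paper always feels shallow, so I never took it seriously. This time, while working through Wei Dongshan's embedded training videos (Phase 3 hands-on project: USB camera monitoring), I learned the lesson the hard way. I am recording here how I fell into the pit and how I climbed out, in the hope that it helps both me and others.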
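III. Symptoms
Following the video tutorial, I wrote a simplified UVC camera driver of my own. After running insmod my_uvc.ko and then starting xawtv, a kernel Oops occurred. The details are as follows: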
[ 657.966482] BUG: unable to handle kernel paging request at fffffff4
[ 657.966486] IP: [
[ 657.966491] *pde = 0081d067 *pte = 00000000
[ 657.966493] Oops: 0002 [#1] SMP
[ 657.966495] last sysfs file: /sys/devices/virtual/video4linux/video0/dev
[ 657.966498] Modules linked in: my_uvc nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc snd_ens1371 gameport ...
(several irrelevant lines omitted)
[ 657.966519]
[ 657.966522] Pid: 5059, comm: xawtv.bin Not tainted (2.6.31-14-generic #48-Ubuntu) VMware Virtual Platform
[ 657.966523] EIP: 0060:[
[ 657.966525] EIP is at __ticket_spin_lock+0x8/0x20
[ 657.966527] EAX: fffffff4 EBX: 00200282 ECX: fffffff4 EDX: 00000100
[ 657.966528] ESI: deb69bf4 EDI: defcba00 EBP: deb69ad0 ESP: deb69ad0
[ 657.966529] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 657.966531] Process xawtv.bin (pid: 5059, ti=deb68000 task=df319920 task.ti=deb68000)
[ 657.966532] Stack:
[ 657.966533] deb69ad8 c0127c38 deb69aec c05707da fffffff4 deb69bf4 defcba00 deb69b00
[ 657.966536] <0> c015c58d 00000000 defcba00 defcba00 deb69b24 c01f64e0 fffffff4 00012f50
[ 657.966539] <0> deb47b80 deb69b24 deb69bd0 ffffffa8 defcba00 deb69b34 e0a8b234 de979a00
[ 657.966543] Call Trace:
[ 657.966545] [
[ 657.966548] [
[ 657.966550] [
[ 657.966553] [
[ 657.966556] [
[ 657.966559] [
[ 657.966562] [
...
(several irrelevant lines omitted)
[ 657.966650] [
[ 657.966652] [
[ 657.966653] Code: ff ff 90 b9 5a 7a 12 c0 b8 5d 7a 12 c0 e9 59 ff ff ff 90 b9 60 7a 12 c0 b8 63 7a 12 c0 e9 49 ff ff ff 90 55 ba 00 01 00 00 89 e5 <3e> 66 0f c1 10 38 f2 74 06 f3 90 8a 10 eb f6 5d c3 8d b4 26 00
[ 657.966672] EIP: [
[ 657.966676] CR2: 00000000fffffff4
[ 657.966682] ---[ end trace 672c8069f4e9d743 ]---
IV. Troubleshooting
1. I recalled that this my_uvc driver had previously worked with xawtv, and that the only change in the code that now triggered the kernel Oops was one line added to myuvc_vidioc_try_fmt_vid_cap:
printk(KERN_CRIT "frames[frame_idx].width:%d, frames[frame_idx].height:%d\n", frames[frame_idx].width, frames[frame_idx].height);
But why would a single printk cause a kernel Oops? I was completely baffled.
2. Clue 1: inspecting the disassembly with objdump
a) The code that lacks the return statement and leads to the kernel Oops
0000036d
 36d: 55                      push %ebp
 36e: 89 e5                   mov %esp,%ebp
 370: 53                      push %ebx
 371: 83 ec 0c                sub $0xc,%esp
 374: 89 cb                   mov %ecx,%ebx
 376: 83 39 01                cmpl $0x1,(%ecx)
 379: 75 51                   jne 3cc
 37b: 81 79 0c 4d 4a 50 47    cmpl $0x47504a4d,0xc(%ecx)
 382: 75 48                   jne 3cc
 384: c7 41 04 40 01 00 00    movl $0x140,0x4(%ecx) ;f->fmt.pix.width = frames[frame_idx].width;
 38b: c7 41 08 f0 00 00 00    movl $0xf0,0x8(%ecx) ;f->fmt.pix.height = frames[frame_idx].height;
 392: c7 44 24 08 f0 00 00    movl $0xf0,0x8(%esp)
 399: 00
 39a: c7 44 24 04 40 01 00    movl $0x140,0x4(%esp)
 3a1: 00
 3a2: c7 04 24 00 02 00 00    movl $0x200,(%esp)
 3a9: e8 fc ff ff ff          call 3aa
 3ae: c7 43 14 00 00 00 00    movl $0x0,0x14(%ebx) ;f->fmt.pix.bytesperline = 0;
 3b5: c7 43 18 00 2e 01 00    movl $0x12e00,0x18(%ebx) ;f->fmt.pix.sizeimage = dwMaxVideoFrameSize;
 3bc: c7 43 10 01 00 00 00    movl $0x1,0x10(%ebx) ;f->fmt.pix.field = V4L2_FIELD_NONE;
 3c3: c7 43 1c 08 00 00 00    movl $0x8,0x1c(%ebx) ;f->fmt.pix.colorspace = V4L2_COLORSPACE_SRGB;
 3ca: eb 05                   jmp 3d1
 3cc: b8 00 00 00 00          mov $0x0,%eax ;return 0;
 3d1: 83 c4 0c                add $0xc,%esp
 3d4: 5b                      pop %ebx
 3d5: 5d                      pop %ebp
 3d6: c3                      ret
b) The code with the return statement added, which runs correctly
0000036d
 36d: 55                      push %ebp
 36e: 89 e5                   mov %esp,%ebp
 370: 53                      push %ebx
 371: 83 ec 0c                sub $0xc,%esp
 374: 89 cb                   mov %ecx,%ebx
 376: 83 39 01                cmpl $0x1,(%ecx)
 379: 75 4f                   jne 3ca
 37b: 81 79 0c 4d 4a 50 47    cmpl $0x47504a4d,0xc(%ecx)
 382: 75 46                   jne 3ca
 384: c7 41 04 40 01 00 00    movl $0x140,0x4(%ecx) ;f->fmt.pix.width = frames[frame_idx].width;
 38b: c7 41 08 f0 00 00 00    movl $0xf0,0x8(%ecx) ;f->fmt.pix.height = frames[frame_idx].height;
 392: c7 44 24 08 f0 00 00    movl $0xf0,0x8(%esp)
 399: 00
 39a: c7 44 24 04 40 01 00    movl $0x140,0x4(%esp)
 3a1: 00
 3a2: c7 04 24 00 02 00 00    movl $0x200,(%esp)
 3a9: e8 fc ff ff ff          call 3aa
 3ae: c7 43 14 00 00 00 00    movl $0x0,0x14(%ebx) ;f->fmt.pix.bytesperline = 0;
 3b5: c7 43 18 00 2e 01 00    movl $0x12e00,0x18(%ebx) ;f->fmt.pix.sizeimage = dwMaxVideoFrameSize;
 3bc: c7 43 10 01 00 00 00    movl $0x1,0x10(%ebx) ;f->fmt.pix.field = V4L2_FIELD_NONE;
 3c3: c7 43 1c 08 00 00 00    movl $0x8,0x1c(%ebx) ;f->fmt.pix.colorspace = V4L2_COLORSPACE_SRGB;
 3ca: b8 00 00 00 00          mov $0x0,%eax ;because the C code now has the return statement, the assembly no longer bypasses this mov $0x0,%eax
 3cf: 83 c4 0c                add $0xc,%esp
 3d2: 5b                      pop %ebx
 3d3: 5d                      pop %ebp
 3d4: c3                      ret
Comparing the two listings shows the difference: in version a), the jmp at 3ca skips the mov $0x0,%eax at 3cc, so on the normal path eax is never set to 0 before the function returns. Because the C code lacks the return statement, the caller of myuvc_vidioc_try_fmt_vid_cap therefore receives a wrong return value, namely whatever happened to be left in eax.
So who calls myuvc_vidioc_try_fmt_vid_cap? A search turned up two callers:
i) myuvc_vidioc_s_fmt_vid_cap inside the my_uvc driver
ii) xawtv, which reaches the driver function indirectly through the ioctl(VIDIOC_TRY_FMT) system call
The log recorded with strace -o /dev/ttyS1 xawtv confirms that xawtv did call myuvc_vidioc_try_fmt_vid_cap through ioctl(VIDIOC_TRY_FMT) and did receive a wrong return value. The relevant log entry is:
ioctl(4, VIDIOC_TRY_FMT, 0xbfff95b0) = 79 //myuvc_vidioc_try_fmt_vid_cap lacks a final return statement, so it returned a nonzero value (as for why it is 79, see below)
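To make the comparison above concrete, here is a minimal user-space sketch of the corrected pattern, in which every path through the function ends in an explicit return. It is not the driver's actual source: the struct demo_format type, the DEMO_* constants, the function name demo_try_fmt and the choice of -EINVAL for the rejected cases are all assumptions made for illustration. Only the field order and the values 320, 240 and 0x47504a4d are taken from the disassembly.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the V4L2 definitions; not the real kernel types. */
enum { DEMO_BUF_TYPE_VIDEO_CAPTURE = 1 };
#define DEMO_PIX_FMT_MJPEG 0x47504a4dU /* the constant compared at offset 0xc in the listing */

struct demo_format {
    unsigned int type;        /* offset 0x0 */
    unsigned int width;       /* offset 0x4 */
    unsigned int height;      /* offset 0x8 */
    unsigned int pixelformat; /* offset 0xc */
};

/*
 * Fills in the only supported frame size and reports success.
 * Returning -EINVAL on rejection is an assumption of this sketch.
 */
int demo_try_fmt(struct demo_format *f)
{
    if (f->type != DEMO_BUF_TYPE_VIDEO_CAPTURE)
        return -EINVAL;
    if (f->pixelformat != DEMO_PIX_FMT_MJPEG)
        return -EINVAL;

    f->width = 320;  /* 0x140 in the listing */
    f->height = 240; /* 0xf0 in the listing */

    /* Stands in for the printk; its return value is left in the return register. */
    printf("width:%u, height:%u\n", f->width, f->height);

    /*
     * Without this line the function would fall off its end and the caller
     * would read whatever the last call left behind, as in listing a).
     */
    return 0;
}
```

If the final return 0 is deleted, gcc with -Wall (which enables -Wreturn-type) warns that control reaches the end of a non-void function, and -Werror=return-type turns that warning into a build error, so this class of mistake can be caught before the module is ever loaded.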