Linux USB frequent reconnects – workaround

I’ve been running into problems recently (for several months at least) with USB hardware on my Thinkpad T40 running Ubuntu Dapper; in particular, every time I plug in my iPod or one of my USB hard disks nowadays, I get this:

[5008549.187000] usb 4-3: USB disconnect, address 14
[5008550.143000] usb 4-3: new high speed USB device using ehci_hcd and address 18
[5008552.643000] usb 4-3: new high speed USB device using ehci_hcd and address 27
[5008557.393000] usb 4-3: new high speed USB device using ehci_hcd and address 43
[5008557.893000] usb 4-3: new high speed USB device using ehci_hcd and address 44
[5008558.643000] usb 4-3: new high speed USB device using ehci_hcd and address 46
[5008558.895000] ehci_hcd 0000:00:1d.7: port 3 reset error -110
[5008558.896000] hub 4-0:1.0: hub_port_status failed (err = -32)
[5008559.893000] usb 4-3: new high speed USB device using ehci_hcd and address 48
[5008562.643000] usb 4-3: new high speed USB device using ehci_hcd and address 58
[5008563.143000] usb 4-3: new high speed USB device using ehci_hcd and address 59
[5008563.643000] usb 4-3: new high speed USB device using ehci_hcd and address 60
[5008570.143000] usb 4-3: new high speed USB device using ehci_hcd and address 85

This repeats ad infinitum until the USB device is disconnected.

I had this down as a hardware issue (since it started happening just after warranty expiration ;), but some accidental googling revealed several other cases – and a workaround:

sudo modprobe -r ehci-hcd

Run that repeatedly, each time replugging the device and monitoring dmesg via watch -n 1 'dmesg | tail' in a window, until the device is finally recognised as a USB hard disk. It generally seems to take 3 or 4 attempts, in my experience.
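To put a number on how often the device re-enumerates between attempts, something like the following shell sketch may help. It assumes the kernel log lines look exactly like the ones quoted above; the function name count_enumerations and the port name 4-3 are only illustrative.

```shell
# Count "new high speed USB device" lines for one port in dmesg-style
# text read from stdin. The port name (e.g. 4-3) is an assumption taken
# from the log quoted above; yours may differ.
count_enumerations() {
    port="$1"
    # -F: match the text literally; -c: print the number of matching lines.
    # grep exits non-zero when the count is 0, so mask that with "|| true".
    grep -cF "usb ${port}: new high speed USB device" || true
}

# Example (uncomment to run against the live kernel log):
# dmesg | count_enumerations 4-3
```

A count that keeps climbing while the device sits plugged in suggests the reconnect loop is still running.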

This LKML thread suggests hardware changes can cause it, but this hardware hasn’t changed in years. Annoying.

Anyway, this is ongoing. This tip seems to help, but it might be just treating a symptom, I don’t know — just posting for google and posterity… and to moan, of course :(


31 Comments

  1. Posted December 26, 2006 at 17:52 | Permalink

    I also see this on my T42. I think it started happening when the kernel was upgraded from 2.6.16 to .17. I am running .18 right now in Etch and it’s still the same problem. I see this happening in Kubuntu Edgy and Etch. I would be curious if anyone has different results with a non-Debian-based distro. I remember having a problem with the ehci_hcd module a long time ago (probably with the 2.4 kernel), so I just unloaded it, found that things worked, and have not looked much further. This workaround means that you are only using USB 1.1, which is OK for small stuff but gets annoying with big transfers.

  2. Posted December 27, 2006 at 14:13 | Permalink

    thanks for the info, dhuv. I think I’ll try downgrading, or fixing this bug in my kernel config; this is driving me nuts, esp since the workaround frequently doesn’t work, anyway. After a time, the messages start reappearing and the device is lost. (this is especially bad news for my USB-connected backup hard drive, which gives write errors. I don’t want to have to deal with corruption there! gulp)

    this ubuntu bug hints that it might have had something to do with this –

    The problem was introduced in 2.6.15-24.41, with this change:

    • usb: Enable CONFIG_USB_EHCI_SPLIT_ISO and CONFIG_USB_EHCI_ROOT_HUB_TT
      • Malone #28840
      • Malone #49367

    I’m going to try that.

  3. Posted December 27, 2006 at 17:55 | Permalink

    Please update this and let us know what works for you.

  4. bda
    Posted December 27, 2006 at 19:15 | Permalink

    I have an iriver iHP-120 and it is detected without problems, but the connection is lost when transferring more than a couple of files:

    usb 4-5: reset high speed USB device using ehci_hcd and address 3
    sd 0:0:0:0: scsi: Device offlined - not ready after error recovery
    sd 0:0:0:0: SCSI error: return code = 0x00050000
    end_request: I/O error, dev sda, sector 1844085
    sd 0:0:0:0: rejecting I/O to offline device
    FAT: Directory bread(block 1842577) failed

    etc etc…

    At first I thought the player was dying, but it seems to work fine under windows XP. I noticed that I enabled this in 2.6.19:

    CONFIG_USB_EHCI_TT_NEWSCHED=y

    I’m recompiling without this to see if it fixes the problem

  5. Posted December 28, 2006 at 12:02 | Permalink

    update: I recompiled without CONFIG_USB_EHCI_SPLIT_ISO, and CONFIG_USB_EHCI_ROOT_HUB_TT — with no luck. The issue still appears. :(

    CONFIG_USB_EHCI_TT_NEWSCHED=y doesn’t appear in my Ubuntu Dapper 2.6.15-27 kernel, so that’s not it for me at least.

    I know it used to work with my hand-built vanilla 2.6.10, and no longer works with the dapper default 2.6.15 kernel — so I’d guess something between those two, or their configs…

  6. bda
    Posted December 28, 2006 at 16:47 | Permalink

    I had no luck with CONFIG_USB_EHCI_TT_NEWSCHED anyway. I ran a bad block check under windows, and it found tons of them. I guess windows just happened to be copying to good parts of the drive when I was testing. Anyway it appears to be unrelated to the problem you are having.

  7. Posted January 3, 2007 at 23:03 | Permalink

    Okay, this is mostly OT but I just have to know: what’s “accidental googling”? Did you spill some coffee on the keyboard and it just came up, or something?

    INQUIRING BOTS^H^H^H^HMINDS WANT TO KNOW

  8. Posted January 4, 2007 at 12:20 | Permalink

    hey Yoz! ah you know what I mean!

    Basically, I had assumed it was a hardware problem, and was googling for workarounds on that assumption, but the googling revealed its likely software origins. Not quite “accidental” really, but it sounds better that way ;)

  9. w
    Posted January 6, 2007 at 12:37 | Permalink

    Hi all,

    I had the same problem and it seems that rmmod ehci_hcd;modprobe ehci_hcd fixes it for me now.

    all info is here: http://erik.ok.ee/blog/?op=show&id=470

  10. Alejandro
    Posted February 15, 2007 at 14:32 | Permalink

    Hello

    Kubuntu Edgy+security updates on an IBM T40. This is a FAT+USB problem. 100% sure. I have a Fujitsu 2.5” 80GB hdd in a Thermaltake USB 2.0 enclosure. I have 2 partitions, one with FAT32 and one with ext3. If I have the FAT32 partition mounted and copy something (a few megabytes) to the FAT32 or ext3 partition, the drive disconnects:

    [17260974.972000] EXT3-fs error (device sda2): ext3_find_entry: reading directory #2 offset 0
    …
    [17260975.804000] FAT: Directory bread(block 32815) failed
    [17260975.908000] 7:0:0:0: rejecting I/O to dead device

    If the FAT32 partition is unmounted and I copy GBs of data to the Ext3 partition, everything is ok. Hours, days of stability. No problem at all.

    Sometimes the drive does not die, but the data written to the FAT partition has random corruption: every 3-4 GB, one file is wrong (md5sum check). When the drive does not die, the same data written to the ext3 partition is not corrupted.

    The impractical solution is to not use FAT with USB devices.

  11. Posted February 15, 2007 at 15:15 | Permalink

    Well, I can workaround the problem by removing the ehci_hcd module. Then I have no problems with the FAT32 drive.

    You may be correct that it’s a problem with FAT32, but I think you will also find stability with FAT32 if you do NOT use the ehci_hcd module.

    I will test things out by formatting my USB drive today with ext3 and trying again.

  12. Alejandro
    Posted February 15, 2007 at 18:22 | Permalink

    Just tried with 2.6.20 vanilla and the problem remains. Not using ehci_hcd is not an option for me; I move a lot of files, some very big, with my disk. Waiting a day to copy a file is not much fun.

  13. PeterJordan
    Posted February 17, 2007 at 20:25 | Permalink

    I have the same problem since 2.6.19, currently using 2.6.20, with my Fingerprintreader on ThinkPad R60 under debian etch.

  14. Marcelo
    Posted April 20, 2007 at 04:16 | Permalink

    Same problem here too. Gentoo with kernel 2.6.20-gentoo-r6. I have a USB external disk with 2 partitions on it. The first is FAT32 and the other is ext3. I don’t have problems copying files, only when reading, with this kind of error:

    reset high speed USB device ehci_hcd
    attempt to access beyond end of device
    sda2: rw=0, want=4261850736, limit=127042020

  15. Maxei
    Posted May 5, 2007 at 17:38 | Permalink

    I am running SLED 10 and I have a similar problem with my USB hard drive. When I copy lots of music files to the USB HD, it disconnects after some time and only a fraction of the files are transferred. So I copy them a few at a time, in small bundles. This way the USB HD does not disconnect, but it stalls very, very frequently, probably after every file copied, so copying even a small number of files takes long, long minutes. The USB HD is FAT32 formatted. I guess I should format it to ext3? Anyway, there seems to be a bug in the kernel, according to this: https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/61235

  16. Alejandro
    Posted May 7, 2007 at 14:49 | Permalink

    Before, I said that it was a FAT problem. That is not true. I have problems without the vfat module loaded too. I ran the same tests on my personal Dell Inspiron 8600 (Pentium M 2GHz) with the same Kubuntu distro and copied GBs of data with no problem, using the same devices that give me problems on an IBM Thinkpad T40 (Pentium M 1.5GHz), both with and without a USB hub (I used the same hub to test). So I think it is a problem with the chip that the IBM T40 has.

  17. Posted May 8, 2007 at 12:43 | Permalink

    yes Alejandro — I’m starting to think this is a problem with the hardware. :(

  18. Posted May 8, 2007 at 14:21 | Permalink

    Here is the relevant section of lspci -vvv on my T42. I wonder if we can figure out what specifically is causing the problem.

    http://pastebin.ca/477065

  19. Posted May 8, 2007 at 16:06 | Permalink

    here’s my lspci -vvv USB-specific components: http://pastebin.ca/477230

    pretty similar (although note that my IRQs are unshared).

  20. Posted May 8, 2007 at 16:13 | Permalink

    has anyone tried setting the max_sectors on the block device to 128, as described at this bug: http://bugs.gentoo.org/show_bug.cgi?id=177266 ?

  21. Teo
    Posted May 14, 2007 at 21:27 | Permalink

    Hi, I’ve the same problem with debian testing. Kernel is 2.6.18 … I can’t think of a solution since I’ve tried all you said with no luck. I just wonder now: has any of you got a wi-fi router close to the hdd? Is there some concrete possibility of interference between the radio source and cable? It would be quite easy to test… but for the simple fact that my losing the device is random. Thanks a lot for any comment, Teo

  22. Jansen Sena
    Posted June 4, 2007 at 14:06 | Permalink

    == PROBLEM SOLVED FOR ME ==

    Hi friends,

    I was having the same problem. My external hard drive stopped working after I tried to copy a lot of files to it. I got this message in “dmesg”:

    “rejecting I/O to dead device”

    I am able to copy a couple of large files (like DVD images) without problems. The problem just happens when I try to copy a lot of files (small or large ones).

    I am using a Compaq Presario V2000 laptop with Ubuntu Feisty Fawn (7.04) and an Iomega external HD (320GB). At first, I removed the ehci_hcd module using the following command, as mentioned in earlier comments:

    modprobe -r ehci_hcd

    This kernel module is responsible for USB 2.0 support. After running this command, I was able to copy a lot of files to my external hard drive and it no longer died suddenly. But with USB 2.0 support removed, the transfer ran at a very low rate, so it was not a definitive solution, because I wanted to copy more than 120GB in one go.

    Then, I loaded the ehci_hcd module again and tried changing the max_sectors value from 240 (the default) to 128, 64 and 32. On my desktop (an IBM ThinkCentre with Ubuntu Feisty Fawn) this solution worked perfectly! But on my laptop the problem kept happening. Then, I tried changing the file system from VFAT to EXT3, but that did not solve the problem either.
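For anyone who wants to try the max_sectors change, here is a minimal shell sketch. It assumes the drive shows up as a block device such as sda and that usb-storage exposes the setting at /sys/block/<device>/device/max_sectors; the function name set_max_sectors and its optional third argument (an alternative sysfs root, handy for a dry run) are made up for illustration. Writing to the real file needs root, and the value most likely has to be set again after the drive is replugged.

```shell
# Write a new max_sectors value for a USB storage block device.
# Usage: set_max_sectors <device> <value> [sysfs-root]
# The device name (e.g. sda) is an assumption; check dmesg for yours.
set_max_sectors() {
    dev="$1"
    value="$2"
    sysroot="${3:-/sys}"    # defaults to the real sysfs mount point
    attr="${sysroot}/block/${dev}/device/max_sectors"

    # Bail out if the attribute is missing or not writable.
    if [ ! -w "$attr" ]; then
        echo "cannot write $attr (wrong device name, or not root?)" >&2
        return 1
    fi

    echo "previous value: $(cat "$attr")"
    echo "$value" > "$attr"
    echo "new value: $(cat "$attr")"
}

# Example (as root, uncomment to apply to the real device):
# set_max_sectors sda 128
```

Stepping down through 128, 64 and 32, with a test copy after each change, mirrors what is described above.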

    Finally, on last sunday night, I recompiled my kernel following these steps:

    1) I installed the 2.6.20 kernel source code using apt-get:

         # apt-get install linux-source-2.6.20
    

    2) I started from my running kernel’s configuration by copying /boot/config-2.6.20-15-generic into the kernel source directory as “.config”, then ran “make oldconfig” so that the options selected in my running kernel are carried over:

         # cp /boot/config-2.6.20-15-generic /usr/src/linux-source-2.6.20/.config
         # cd /usr/src/linux-source-2.6.20
         # make oldconfig
    

    3) I ran menuconfig and enabled and disabled some options as shown below:

        # cd /usr/src/linux-source-2.6.20
        # make menuconfig
    
       3.1) Disabled features (Device Drivers --> USB Support):
    
               * Full speed ISO transactions (USB_EHCI_SPLIT_ISO)
               * Root Hub Transaction Translators (USB_EHCI_ROOT_HUB_TT)
               * Improved Transaction Translator scheduling (USB_EHCI_TT_NEWSCHED)
    
       3.2) Enabled features (Device Drivers --> USB Support):
    
               * USB verbose debug messages (USB_DEBUG)
               * Enforce USB bandwidth allocation (USB_BANDWIDTH)
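
Before rebuilding, it can be worth checking how these options are set in the kernel you are already running. The following shell sketch assumes the usual config file layout (lines such as CONFIG_FOO=y or CONFIG_FOO=m, with unset options commented out) and a config file at /boot/config-$(uname -r), which not every distro ships; the function name kconfig_state is made up for illustration.

```shell
# Print the state of one kernel option in a config file:
# "y" (built in), "m" (module), or "n" (not set / not present).
# Usage: kconfig_state <option-without-CONFIG_-prefix> <config-file>
kconfig_state() {
    option="$1"
    config_file="$2"
    # Unset options appear as "# CONFIG_FOO is not set", so they never
    # match this anchored pattern.
    line=$(grep -E "^CONFIG_${option}=" "$config_file" | head -n 1)
    if [ -n "$line" ]; then
        # Strip everything up to and including the first "=".
        printf '%s\n' "${line#*=}"
    else
        printf 'n\n'
    fi
}

# Example (uncomment; the config path is an assumption):
# for opt in USB_EHCI_HCD USB_EHCI_SPLIT_ISO USB_EHCI_ROOT_HUB_TT USB_EHCI_TT_NEWSCHED; do
#     echo "$opt: $(kconfig_state "$opt" "/boot/config-$(uname -r)")"
# done
```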
    

    After recompiling the kernel and rebooting my system, I connected my Iomega external drive and everything is working fine, and I was finally able to go to bed and rest for some hours! :-)

    I don’t know whether this solution will work for other people… But here is the solution that worked for me. I will be happy to know whether these steps help you solve the problem as well!

    Best regards,

    Jansen Sena jansen@comunidadesol.org

  23. EddyP
    Posted August 6, 2007 at 10:20 | Permalink

    I tried the workaround proposed by Jansen Sena with a Debian Etch modified kernel on a NSLU2 and I still have the resets.

    Now I have lowered the max-sectors value to 32, after going through 128 and 64.

    Maybe I’ll try a newer kernel since 2.6.18-4 doesn’t have the USB_EHCI_TT_NEWSCHED (so I suppose there might be another scheduler instead). It might be moronic what I’m saying….

    I don’t believe that enabling the debug option has anything to do with it (unless the frequent disk writes slow down the device…), and if it does, this is a crude hack…

    I just hope I’ll find a fix for this issue…

  24. Posted September 27, 2007 at 18:25 | Permalink

    @EddyP: I think the debug option is only there so you can check the log to see whether the problem is still present. Hence, I’m compiling a kernel right now with the debug option disabled.

    @Jansen Sena: Thank you very much for the info!

  25. EddyP
    Posted September 27, 2007 at 19:19 | Permalink

    Well, I got rid of the problem by replacing the case with an external USB hdd.

  26. radar
    Posted November 23, 2007 at 19:20 | Permalink

    I had the same problems in my case, and found that this is a hardware problem (my usb 2.0 pci card was messed up).

  27. -rb
    Posted June 13, 2008 at 05:00 | Permalink

    I agree with the hardware theory. I had a WD Passport drive plugged into the USB 2.0 ports on my Dell 2001FP monitor, and couldn’t keep the drive connected. When I disconnected the Dell monitor (aka convenient usb hub), and plugged the WD drive into the same USB port on my mobo that the Dell had been plugged into, the drive worked.

    Therefore I don’t think it was the USB hardware/chips in my Shuttle mobo, but rather, the USB hardware/chips in the Dell 2001FP monitor.

  28. Posted June 13, 2008 at 11:23 | Permalink

    Yeah, I’ve come around to the same conclusion :(

  29. Ralf
    Posted August 26, 2008 at 19:47 | Permalink

    Here is the solution: I also had the problem with my Thinkpad T40p. You must know that this laptop only has USB 1.1 ports. I run Slackware 12.1 with kernel 2.6.24.5-smp. The problem is that the standard kernel has ehci-hcd compiled as a module AND automatic module loading by the kernel is enabled. That is the point: modprobe -r ehci-hcd and rmmod ehci-hcd have no effect, because the kernel always loads the module again automatically (so it is clear why I did not see it in the lsmod list…)
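
A quick way to check whether the module really is loaded at a given moment is to look in /proc/modules, which is the same information lsmod shows. This is only a sketch: the function name module_loaded is made up, and the optional second argument (an alternative file to read) exists just so it can be tried without touching the live system. Note that modprobe accepts ehci-hcd and ehci_hcd interchangeably, while /proc/modules lists the underscore form, and that a driver built into the kernel (rather than built as a module) does not appear there at all.

```shell
# Succeed (exit 0) if the named module is listed as loaded.
# Usage: module_loaded <module-name> [modules-file]
module_loaded() {
    # /proc/modules uses underscores, so normalise hyphens first.
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file="${2:-/proc/modules}"
    # Each line starts with the module name followed by a space.
    grep -q "^${name} " "$modules_file"
}

# Example (uncomment to check the running system):
# if module_loaded ehci-hcd; then echo "ehci_hcd is loaded"; else echo "ehci_hcd is not loaded"; fi
```

Running the check right after modprobe -r, and again after replugging the drive, should show whether the module comes back by itself.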

    The solution that worked for me:

    rm -rf /lib/modules/2.6.24.5/kernel/*
    rm -rf /lib/modules/2.6.24.5-smp/kernel/*

    Configure the kernel with ehci (USB 2.0) completely deactivated (not only “M” = module!):

    cd /usr/src/linux
    make clean
    make gconfig
      • Device Drivers --> USB support --> Support for Host-side USB --> EHCI HCD (USB 2.0) support (USB_EHCI_HCD): DEACTIVATE!!! …
    make
    make modules
    make modules_install
    make install
    reboot

    After the reboot, plugging in my usb harddisk worked fine:

    Aug 26 19:49:06 geckobox kernel: usb 2-2: new full speed USB device using uhci_hcd and address 2
    Aug 26 19:49:06 geckobox kernel: usb 2-2: configuration #1 chosen from 1 choice
    Aug 26 19:49:06 geckobox kernel: scsi2 : SCSI emulation for USB Mass Storage devices
    Aug 26 19:49:11 geckobox kernel: scsi 2:0:0:0: Direct-Access IC25N040 ATCS05-0 0811 PQ: 0 ANSI: 0
    Aug 26 19:49:11 geckobox kernel: sd 2:0:0:0: [sda] 78140160 512-byte hardware sectors (40008 MB)
    Aug 26 19:49:11 geckobox kernel: sd 2:0:0:0: [sda] 78140160 512-byte hardware sectors (40008 MB)
    Aug 26 19:49:11 geckobox kernel: sda: sda1
    Aug 26 19:49:11 geckobox kernel: sd 2:0:0:0: [sda] Attached SCSI disk
    Aug 26 19:49:11 geckobox kernel: sd 2:0:0:0: Attached scsi generic sg0 type 0
    Aug 26 19:49:23 geckobox hald: mounted /dev/sda1 on behalf of uid 0

    YEAH!

  30. harry
    Posted September 14, 2008 at 13:27 | Permalink

    Thanks for that tip, Ralf.

    It worked perfectly.

    On my distro (Slackware 12.1), I recompiled the kernel without USB_EHCI_HCD because it was compiled into the kernel (not as a module) – and now the usb external drives are rock-solid without any resets etc.

    Thanks very much!

    Harry.

  31. Mr. M
    Posted April 17, 2009 at 05:51 | Permalink

    I’ve had the problems you describe here too with a little ThinClient and a VIA-USB chipset.

    I have tried:

    - the max_sectors setting in sysfs
    - the allow_restart thing in sysfs
    - compiling a new kernel
    - changing cables
    - removing ehci for USB 1.x support only, which made the device slow as hell

    Finally I bought a new PCI USB card with a NEC chipset. That was the only thing that helped.