I recently ran into an issue on an Ubuntu 24.04 server where the system would intermittently hang and become unresponsive. The entries in /var/log/syslog pointed to the onboard Intel network card as the problem:
May 14 01:38:09 server-1 kernel: e1000e 0000:00:1f.6 eno1: Detected Hardware Unit Hang:
...
May 14 01:38:25 server-1 kernel: workqueue: e1000_print_hw_hang [e1000e] hogged CPU for >13333us 4 times, consider switching to WQ_UNBOUND
This repeats every 2 seconds, filling the logs and hogging CPU time. The NIC in question is the onboard Intel Ethernet controller, which uses the e1000e driver, and this is a known issue with certain Intel NICs on that driver. The kernel repeatedly reports a “Hardware Unit Hang” when the NIC’s transmit queue stalls. Apparently, it shows up more often after the system has been up for a while, usually under I/O or network load. Power-saving features and offloads like ASPM and TSO seem to trigger or worsen it.
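To get a rough idea of how often the hang is being logged, something like the following could help. This is only a minimal sketch: count_hangs is a made-up helper name, and it assumes the kernel messages end up in a plain-text log such as /var/log/syslog, as they did on this server.

```shell
#!/bin/sh
# count_hangs (hypothetical helper): print how many lines in the given
# log file contain the e1000e "Detected Hardware Unit Hang" message.
count_hangs() {
    grep -c 'Detected Hardware Unit Hang' "$1"
}

# Example (reading the log may require sudo depending on permissions):
#   count_hangs /var/log/syslog
```

A count that keeps climbing between runs suggests the transmit queue is still stalling.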
Fixing e1000_print_hw_hang
1. Disable TSO (TCP Segmentation Offload)
This stops the NIC from offloading TCP segmentation, which can misbehave on this driver.
sudo ethtool -K eno1 tso off
But this won’t persist after reboot, so we’ll make it stick with a systemd service…
Make TSO Setting Persistent with systemd
Create a service file:
sudo nano /etc/systemd/system/disable-tso.service
Paste this in:
[Unit]
Description=Disable TSO on eno1
After=network.target
[Service]
Type=oneshot
ExecStart=/sbin/ethtool -K eno1 tso off
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
Enable it:
sudo systemctl daemon-reload
sudo systemctl enable disable-tso.service
Optional: start it now without rebooting:
sudo systemctl start disable-tso.service
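To confirm that TSO is really off once the service has run, the offload state can be read back with ethtool -k (lowercase k lists the current feature settings). The snippet below is a small sketch, and tso_state is a hypothetical helper name, not part of ethtool.

```shell
#!/bin/sh
# tso_state (hypothetical helper): read `ethtool -k <iface>` output on
# stdin and print the state of tcp-segmentation-offload ("on" or "off").
tso_state() {
    awk '$1 == "tcp-segmentation-offload:" { print $2 }'
}

# Example:
#   ethtool -k eno1 | tso_state
```

If this still prints "on" after a reboot, checking systemctl status disable-tso.service is a reasonable next step.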
2. Disable PCIe ASPM via GRUB
Edit the GRUB config:
sudo nano /etc/default/grub
Find this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
And change it to:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pcie_aspm=off"
Then update grub:
sudo update-grub
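The kernel parameter only takes effect on the next boot. After rebooting, one way to check that it was picked up is to look for it in /proc/cmdline. The following is a minimal sketch; has_aspm_off is a hypothetical helper name.

```shell
#!/bin/sh
# has_aspm_off (hypothetical helper): succeed if the kernel command line
# passed as the first argument contains the pcie_aspm=off parameter.
has_aspm_off() {
    case " $1 " in
        *" pcie_aspm=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example, after rebooting:
#   has_aspm_off "$(cat /proc/cmdline)" && echo "pcie_aspm=off is active"
```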
After these changes and a reboot (needed for the pcie_aspm=off kernel parameter to take effect), the e1000e driver should become stable again.