Bug#1086861: e2fsprogs: filesystem repair at boot fails
Anees
2024-11-06 17:50:01 UTC
Package: e2fsprogs
Version: 1.47.1-1+b1
Severity: important

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

* What led up to the situation?
Test unclean boot

* What exactly did you do (or not do) that was effective (or
ineffective)?
Start the Chromium browser, or any other program that keeps the disk busy, and then cause a kernel panic by issuing "echo c > /proc/sysrq-trigger"

* What was the outcome of this action?
Computer boots in busybox showing the following message:
/dev/sda2: recovering journal
/dev/sda2: Clearing orphaned inode 3408176 (uid=1000, gid=1000, mode=0100600, size=64)
/dev/sda2: clean, 150916/3645440 files, 1460364/14572800 blocks
[ 3.764111] EXT4-fs error (device sda2): ext4_orphan_get:1421: comm mount: bad orphan inode 3408176
[ 3.764137] ext4_test_bit(bit=303, block=13631504) = 0
EXT4-fs error (device sda2): ext4_orphan_get:1421: comm mount: bad orphan inode 3408176
ext4_test_bit(bit=303, block=13631504) = 0
[ 3.764655] EXT4-fs error (device sda2): ext4_mark_recovery_complete:6229: comm mount: Orphan file not empty on read-only fs.
[ 3.764746] EXT4-fs (sda2): mount failed
mount: mounting /dev/sda2 on /root failed: Structure needs cleaning
Failed to mount /dev/sda2 as root file system.
EXT4-fs error (device sda2): ext4_mark_recovery_complete:6229: comm mount: Orphan file not empty on read-only fs.
EXT4-fs (sda2): mount failed
BusyBox v1.37.0 (Debian 1:1.37.0-4) built-in shell (ash)

* What outcome did you expect instead?



-- System Information:
Debian Release: trixie/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.11.5-amd64 (SMP w/4 CPU threads; PREEMPT)
Locale: LANG=en_CA.UTF-8, LC_CTYPE=en_CA.UTF-8 (charmap=UTF-8), LANGUAGE=en_CA:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages e2fsprogs depends on:
ii libblkid1 2.40.2-9
ii libc6 2.40-3
ii libcom-err2 1.47.1-1+b1
ii libext2fs2t64 1.47.1-1+b1
ii libss2 1.47.1-1+b1
ii libuuid1 2.40.2-9
ii logsave 1.47.1-1+b1

Versions of packages e2fsprogs recommends:
pn e2fsprogs-l10n <none>

Versions of packages e2fsprogs suggests:
pn e2fsck-static <none>
pn fuse2fs <none>
pn gpart <none>
pn parted <none>

-- no debconf information
Anees Ahmad
2024-11-06 18:30:01 UTC
Hello,

The issue described initially does not happen when both packages
*e2fsprogs_1.47.0-2.4_amd64.deb*
<https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/e2fsprogs_1.47.0-2.4_amd64.deb>
and *libext2fs2t64_1.47.0-2.4_amd64.deb*
<https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/libext2fs2t64_1.47.0-2.4_amd64.deb>
were downgraded to 1.47.0-2.4 on the same computer.
The bug appeared in 1.47.1~rc1-1.

Thank you
Theodore Ts'o
2024-11-06 22:30:01 UTC
Post by Anees Ahmad
Hello,
The issue described initially does not happen when both packages
*e2fsprogs_1.47.0-2.4_amd64.deb*
<https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/e2fsprogs_1.47.0-2.4_amd64.deb>
and *libext2fs2t64_1.47.0-2.4_amd64.deb*
<https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/libext2fs2t64_1.47.0-2.4_amd64.deb>
were downgraded to 1.47.0-2.4 on the same computer.
The bug appeared in 1.47.1~rc1-1.
How reproducible is the problem with 1.47.1~rc1-1?

And can you describe the hardware where this is happening --- what
kind of storage device, etc.? And can you reproduce it on some other
hardware? And can you send the output of dumpe2fs -h /dev/sda2?

This is not a problem that I've seen on any of my hardware, or on my
regression test suites, so without a clean reproducer it's going to be
very hard to work the problem.

Also, note that the error messages which you reported:

/dev/sda2: recovering journal
/dev/sda2: Clearing orphaned inode 3408176 (uid=1000, gid=1000, mode=0100600, size=64)
/dev/sda2: clean, 150916/3645440 files, 1460364/14572800 blocks
[ 3.764111] EXT4-fs error (device sda2): ext4_orphan_get:1421: comm mount: bad orphan inode 3408176
[ 3.764137] ext4_test_bit(bit=303, block=13631504) = 0
EXT4-fs error (device sda2): ext4_orphan_get:1421: comm mount: bad orphan inode 3408176
ext4_test_bit(bit=303, block=13631504) = 0
[ 3.764655] EXT4-fs error (device sda2): ext4_mark_recovery_complete:6229: comm mount: Orphan file not empty on read-only fs.
[ 3.764746] EXT4-fs (sda2): mount failed
mount: mounting /dev/sda2 on /root failed: Structure needs cleaning
Failed to mount /dev/sda2 as root file system.
EXT4-fs error (device sda2): ext4_mark_recovery_complete:6229: comm mount: Orphan file not empty on read-only fs.
EXT4-fs (sda2): mount failed
BusyBox v1.37.0 (Debian 1:1.37.0-4) built-in shell (ash)


are emitted by the kernel, and happen before any program in e2fsprogs
has a chance to run. Hence, it is highly unlikely that the version of
e2fsprogs would make a difference in this message.

The specific error message indicates that there is an inode on the
orphan list (in this case, inode #3408176) which is marked as unused
(i.e., free) in the block allocation bitmap. This is a file system
corruption problem, and is generally indicative of a kernel bug, or
some kind of hardware problem/failure/bug.
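
As a rough cross-check of the numbers in that kernel message, here is a
minimal Python sketch that maps the inode number to a block group and a bit
offset within that group's bitmap. It is only a sketch under assumptions: the
dumpe2fs -h output has not been posted, so the value of 32768 blocks per
group (the usual default for 4 KiB blocks) is assumed, and the inodes-per-group
figure is inferred from the totals in the e2fsck summary line. The function
names are made up for illustration.

```python
import math


def infer_inodes_per_group(total_inodes, total_blocks, blocks_per_group=32768):
    """Guess inodes per group from the fsck totals.

    blocks_per_group=32768 is an ASSUMPTION (default for 4 KiB blocks);
    the authoritative value comes from `dumpe2fs -h`.
    """
    groups = math.ceil(total_blocks / blocks_per_group)
    return total_inodes // groups


def locate_inode(ino, inodes_per_group):
    """Return (block group, bit index within the group) for an inode number."""
    index = ino - 1  # inode numbers start at 1, bitmap bits start at 0
    return index // inodes_per_group, index % inodes_per_group


# Totals taken from the "clean, 150916/3645440 files, 1460364/14572800 blocks" line.
ipg = infer_inodes_per_group(3645440, 14572800)
group, bit = locate_inode(3408176, ipg)
print(ipg, group, bit)  # prints: 8192 416 303
```

Under those assumptions the computed bit index, 303, matches the bit=303
reported by ext4_test_bit, and the reported block 13631504 lies inside group
416 (which would start at block 416 * 32768 = 13631488).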

The reason why I'm a bit dubious that it is a kernel bug is (a) no one
else has reported anything like this, and (b) I am regularly running
kernel regression tests which exercise the unclean shutdown code path, and
I haven't seen a failure in this area of the kernel code for years.

Cheers,

- Ted
