Discussion:
Bug#868450: Make usage of fstab-decode optional
Michael Biebl
2017-07-15 13:40:02 UTC
Source: open-iscsi
Version: 2.0.874-4
Severity: wishlist

Hi,

I'm currently investigating whether it would be possible to make
fstab-decode non-essential and move it out of sysvinit-utils into the
initscripts package where it is used by /etc/init.d/umountfs and
/etc/init.d/umountnfs.sh.

According to codesearch.d.n, open-iscsi is the only other package which
makes use of fstab-decode.

Please consider dropping this dependency on fstab-decode or making it
optional.

Regards,
Michael

-- System Information:
Debian Release: buster/sid
APT prefers unstable
APT policy: (500, 'unstable'), (200, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.11.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.utf8, LC_CTYPE=de_DE.utf8 (charmap=UTF-8), LANGUAGE=de_DE.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Christian Seiler
2017-07-15 14:50:02 UTC
Control: tags -1 + moreinfo

Hi Michael,
Post by Michael Biebl
I'm currently investigating whether it would be possible to make
fstab-decode non-essential and move it out of sysvinit-utils into the
initscripts package where it is used by /etc/init.d/umountfs and
/etc/init.d/umountnfs.sh.
According to codesearch.d.n, open-iscsi is the only other package which
makes use of fstab-decode.
Which probably means there are a few packages out there that are broken
in corner cases. ;-)

fstab-decode makes sure that arguments passed to commands in scripts
are properly decoded according to the fstab encoding rules.

If you look at open-iscsi, it only uses fstab-decode in the script
umountiscsi.sh that is run on shutdown. Usages:

fstab-decode mountpoint -q "$path"
fstab-decode umount "$path"

So if you have a space in your mountpoint (which appears as \040 in
/proc/self/mountinfo and /etc/fstab) not using fstab-decode will make
this fail.
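
To illustrate the encoding in question, here is a minimal Python sketch of the decoding step. It is not the fstab-decode implementation: the function name decode_fstab_field is made up for this example, and it simply assumes that every backslash followed by three octal digits stands for one character (as \040 does for a space).

```python
import re

# Matches a backslash followed by exactly three octal digits, e.g. \040.
_OCTAL_ESCAPE = re.compile(r"\\([0-7]{3})")


def decode_fstab_field(field: str) -> str:
    """Replace octal escapes such as \\040 with the characters they encode."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), field)


if __name__ == "__main__":
    # A mount point containing a space, as it would be written in /etc/fstab.
    print(decode_fstab_field(r"/mnt/my\040disk"))
```

Passing the still-encoded string straight to umount would look for a directory whose name literally contains the four characters \040, which is why the decoding has to happen first.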
Post by Michael Biebl
Please consider dropping this dependency on fstab-decode or making it
optional.
While I don't think spaces are all that nice in path names, we do
currently support them, and I really don't want to drop that support
later on. So I don't really see how to get rid of that dependency
easily (without reimplementing it, which I don't think is a useful
use of my time) without potentially breaking people's existing
systems.

That all said:

1. I wouldn't be pleased with it, but I wouldn't mind depending on
'initscripts' in open-iscsi, if you do decide to move the binary.

This is not great, as open-iscsi provides native systemd services
(since Stretch), so I wouldn't be too happy about it depending on
initscripts on systemd systems, but I don't have a huge dog in that
fight.

2. More importantly, I consider umountiscsi.sh to be a hack. A
necessary one, because there's no alternative at the moment, but a
hack nonetheless.

In the long term I'd really, really like to not have to use
umountiscsi.sh on systemd systems - because it's a huge layering
violation.

I really think this is something that systemd should provide
natively. To explain the background of the script:

At boot we log in to iSCSI targets after networking has been set up.
This causes the kernel to make new block devices (e.g. /dev/sdb)
available in the system. Then, if LVM is used, lvmetad will pick up LVM
volumes on these devices and new LV devices (/dev/VG/LV) will then
appear. Once block devices required for /etc/fstab entries have
appeared, systemd will mount those block devices and make the
filesystems available.

On shutdown we need to umount all of that stuff. Now for things in
/etc/fstab systemd will do that for us (and we do order properly
against remote-fs-pre.target) - but that still leaves two problems:

a) systemd usually doesn't recognize manually mounted iSCSI
filesystems as network filesystems unless the administrator
explicitly specifies -o _netdev for the mount command (which
nobody actually does), so anything not in /etc/fstab manually
mounted by the administrator will not be unmounted at the right
time by systemd. (The reason is that iSCSI devices look like
normal block devices, so you can use e.g. ext4 or btrfs on top of
them.)

b) If one uses LVM, for example, the LVM volumes do not get properly
shut down. Same goes for LUKS, dm-raid, multipath, and any other
block device layering you can think of.

That's why we still run that script: to make sure that the underlying
iSCSI block devices are not used anymore at all when we issue the
logout command. Furthermore I had started to implement a mechanism
to not log out from iSCSI if something failed to dismantle properly,
but that still lacks integration into e.g. ifupdown to actually work
properly. (The "don't log out if we couldn't dismantle everything" part
already works, but ifupdown will still kill the networking.)
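
The "only log out if everything was dismantled" rule can be sketched roughly as follows. This is an illustration in Python, not the logic of umountiscsi.sh itself: the function name, the (name, callable) step format, and the choice to keep going after a failed step are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Tuple

# A teardown step: a label plus a callable that raises on failure.
Step = Tuple[str, Callable[[], None]]


def dismantle_then_logout(steps: Iterable[Step],
                          logout: Callable[[], None]) -> List[str]:
    """Run all teardown steps; log out only if none of them failed.

    Returns the names of the steps that failed (empty on full success).
    """
    failed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # Assumption: continue, so that as much as possible is dismantled.
            failed.append(name)
    if not failed:
        logout()
    return failed
```

If the returned list is non-empty, the iSCSI session stays logged in; keeping the network up in that case is the part that would still need cooperation from whatever manages the interfaces.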


What I would really like to see here in the long run is the following:


- There's some way for open-iscsi to tell systemd that block devices
that come from iSCSI are dependent on the 'open-iscsi.service'
unit.

- There's some way for LVM, LUKS, etc. to tell systemd how each
logical device they generate actually relates to other devices on
the system (e.g. the LUKS volume depends on the underlying block
device, the LVM LVs depend on their corresponding PVs, etc.).

- There's some way for LVM, LUKS, etc. to tell systemd how to
dismantle a specific device.

- On shutdown systemd then will properly order all mounts and
storage dismantling operations according to this dependency logic.

- We don't need to run umountiscsi.sh on systemd systems anymore,
but can rely on systemd itself to properly provide that, so
open-iscsi.service just does the logout from iSCSI volumes.

- Ideally some kind of error logic that can tell systemd "yeah,
something didn't shut down properly, so keep iSCSI logins active
and don't shut down networking in that case", because keeping a
network interface active will be better than yanking the block device
away forcibly, which does happen if you pull the networking plug.
(As that currently doesn't work properly either, I consider this
to be non-essential for me to switch, but I _would_ like to see
something like that.)
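
The dependency-based ordering described in this list amounts to a topological sort of the storage stack, run in reverse at shutdown. The following Python sketch shows the idea on a hypothetical stack; the node labels and the /srv/data mount point are invented for the example, and nothing here reflects an interface that systemd actually offers.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical storage stack: each key is built on top of the listed items.
DEPENDS_ON = {
    "mount:/srv/data": {"lv:/dev/VG/LV"},
    "lv:/dev/VG/LV": {"pv:/dev/sdb"},
    "pv:/dev/sdb": {"open-iscsi.service"},
    "open-iscsi.service": set(),
}


def dismantle_order(depends_on):
    """Return the items so that nothing is torn down before its users."""
    # static_order() yields dependencies first, i.e. the setup order.
    setup_order = list(TopologicalSorter(depends_on).static_order())
    # Teardown is the setup order reversed.
    return list(reversed(setup_order))


if __name__ == "__main__":
    for item in dismantle_order(DEPENDS_ON):
        print(item)
```

With such a graph available, the iSCSI logout would simply be the last step, after the mount, the LV and the PV have been released.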

And _then_ I wouldn't need fstab-decode anymore, and would not have
any layering violation anymore, because any system such as LVM or
LUKS could take care of this layering business themselves, and then
systemd could just do the right thing. For example, LUKS support is
new in Stretch, because someone opened a bug about LUKS not working
properly, but we really don't want to special-case each and every
one of these solutions.

For local devices, systemd-shutdown tries to pick up all the pieces that
the current, not quite so precise dependency logic doesn't catch
properly. But iSCSI (and other network block device systems, such as
NBD [1]) becomes unusable once networking goes away, so we must be able
to properly dismantle anything that is still active after
remote-fs-pre.target is reached at shutdown; otherwise people will
experience DATA LOSS, which I would consider to be unacceptable.

For now, that's umountiscsi.sh - and that currently requires
fstab-decode to properly support the corner cases it already supports.
But if a better solution becomes available, I'd be happy to switch to
that.

Regards,
Christian

PS: Please keep in mind that while apparently other Debian packages
don't know about fstab-decode, many admins may have used it in
their own scripts, especially if those admins were following best
practices. If it does become non-essential, it should definitely go
into the release notes.

PPS: In case you were wondering: enterprise-y distributions don't
appear to care at all about this - last time I checked, for
configurations without iSCSI on the rootfs (where the network
has to stay up anyway) they just did a logout command at shutdown
and hoped for the best.

[1] I don't know how much the NBD maintainers have invested in making
sure the shutdown procedure is sane.
Michael Biebl
2017-07-15 21:20:02 UTC
Hi Christian,

thanks a lot for your detailed reply!

I've CCed the pkg-systemd-maintainers m-l and Wouter, as the maintainer
of nbd.

You raise some good points. Maybe this is something you could bring up
on the upstream mailing list? It seems like it should be discussed there
and it would be really helpful if someone knowledgeable in this area has
this conversation there.
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Christian Seiler
2017-07-15 21:30:01 UTC
Hi Michael,
Post by Michael Biebl
thanks a lot for your detailed reply!
I've CCed the pkg-systemd-maintainers m-l and Wouter, as the maintainer
of nbd.
You raise some good points. Maybe this is something you could bring up
on the upstream mailing list?
I assume you mean the upstream systemd mailing list?

Ok, I'll do that in the next couple of days once I find some time
to write this up better from a more generic perspective. (Because
I think this could also be useful in other cases.)

Regards,
Christian
Michael Biebl
2017-07-15 21:30:01 UTC
Post by Christian Seiler
Post by Michael Biebl
You raise some good points. Maybe this is something you could bring up
on the upstream mailing list?
I assume you mean the upstream systemd mailing list?
Indeed, I meant the upstream systemd mailing list.
Wouter Verhelst
2017-07-17 10:50:02 UTC
Post by Michael Biebl
Hi Christian,
thanks a lot for your detailed reply!
I've CCed the pkg-systemd-maintainers m-l and Wouter, as the maintainer
of nbd.
Right, thanks.

I had gotten nbd working properly in combination with systemd in stretch
during dc16, but somehow it stopped working before the release (and I
only found out like a week or so before the release, way too late to fix
it).

As chance would have it, I was on an airplane yesterday and spent some
time debugging things. I managed to get things to almost work on a VM
(the block device gets yanked away at the wrong time during shutdown
still, and there's another issue for which you must just have seen a bug
report appear), but at least it's almost back to functional now.

What I found works is that ***@.service has this:

(...)
[Unit]
After=network-online.target
DefaultDependencies=no
Conflicts=shutdown.target
[Install]
RequiredBy=dev-%i.device
RequiredBy=dev-%ip1.device
RequiredBy=dev-%ip2.device
(... etc, for fifteen partitions ...)

It *should* also need a "Before=dev-%i.device", which *used* to work
(back at dc16), but stopped working some time between then and now.
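
To make the [Install] section above more concrete, here is a small Python sketch that prints what those RequiredBy= lines expand to for one instance. The helper name required_by_lines is made up, "nbd0" is only an example instance, and the sketch assumes an instance name that needs no systemd escaping, so that %i can be substituted literally.

```python
from typing import List


def required_by_lines(instance: str, partitions: int = 15) -> List[str]:
    """Expand the RequiredBy= lines of the template for one instance name."""
    # The whole device first, then one unit per partition (p1 .. pN).
    units = [f"dev-{instance}.device"]
    units += [f"dev-{instance}p{n}.device" for n in range(1, partitions + 1)]
    return [f"RequiredBy={unit}" for unit in units]


if __name__ == "__main__":
    print("\n".join(required_by_lines("nbd0")))
```

The fixed upper bound of fifteen partitions is taken from the snippet above; a device with more partitions would not be covered by this scheme.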

However, that only supports nbd devices that are configured in fstab and
nbdtab. Things that are configured manually don't work that way.

I agree that it is necessary to make systemd aware of the connection
that exists between some block devices and the userspace setup that is
necessary for them to appear. Currently it isn't; I managed to
get it to work somewhat before, but essentially that was just a hack.
This support would be necessary for AoE devices too, btw.

Ideally, the link between the userspace setup and the block device would
be something that could be discovered by udev and signalled to systemd
by way of a variable there (like, say, the SYSTEMD_READY thing that is
set on unconnected nbd devices now). That way, you could set a udev rule
declaring the link between a service and a device, and it would work
regardless of whether the mount is done manually.
Post by Michael Biebl
You raise some good points. Maybe this is something you could bring up
on the upstream mailing list? It seems like it should be discussed there
and it would be really helpful if someone knowledgeable in this area has
this conversation there.
I was actually considering doing something like this myself, since as
per the above I'd come to a similar conclusion. Systemd currently
doesn't seem to understand that it is necessary, and that's a problem.
Thanks for bringing it up first.

(actually, I was also hoping to catch you during dc17 in Montreal, if
you're going, so we can possibly discuss this in person)
--
Could you people please use IRC like normal people?!?

-- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
Hacklab