Prevent clobbering data partition when reprovisioning #1257

@bgilbert

When reprovisioning existing systems, we assume we can rewrite the blocks at the beginning of the disk without impinging on any user-created data partitions immediately after the root partition. However, per coreos/fedora-coreos-tracker#1465, FCOS and RHCOS will eventually increase the size of the /boot partition and thus of the install image. As a result, if an older system is reprovisioned with a newer OS image, any such data partition that starts inside the larger image's footprint will be clobbered.

We can't transparently fix disks that don't have enough space for the OS image, but we should try to prevent data loss. Historically we've intentionally ignored any partitions that aren't saved with --save-partindex or --save-partlabel, on the theory that the user knows what they're doing. However, we shouldn't provide footguns. Without changing our general policy, we could check the target disk for an existing CoreOS installation, and if present, add all data partitions to the saved-partition list. (We can't just save the first data partition because only the saved partition table entries are restored after an install failure.)
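To make the proposed policy concrete, here is a rough sketch of how the saved-partition list could be extended. It is not coreos-installer's actual code: the `Partition` struct, the `partitions_to_save` function, and the `OS_LABELS` list are all hypothetical, and the label list ignores RAID variants such as `boot-1`.

```rust
/// Hypothetical, minimal view of one partition table entry on the target disk.
#[derive(Debug, Clone, PartialEq)]
pub struct Partition {
    pub index: u32,
    pub label: String,
}

/// Assumption: labels of the partitions that belong to the OS image itself.
/// A real implementation would need the per-architecture and RAID variants.
const OS_LABELS: &[&str] = &["BIOS-BOOT", "EFI-SYSTEM", "boot", "root"];

/// Return the sorted, deduplicated partition indexes to preserve.
///
/// Partitions the user saved explicitly are always kept. If an existing
/// CoreOS installation was detected, every partition whose label is not
/// an OS label is treated as a data partition and saved as well.
pub fn partitions_to_save(
    parts: &[Partition],
    explicitly_saved: &[u32],
    coreos_detected: bool,
) -> Vec<u32> {
    let mut saved: Vec<u32> = explicitly_saved.to_vec();
    if coreos_detected {
        saved.extend(
            parts
                .iter()
                .filter(|p| !OS_LABELS.contains(&p.label.as_str()))
                .map(|p| p.index),
        );
    }
    saved.sort_unstable();
    saved.dedup();
    saved
}
```

When no CoreOS installation is detected, the result is exactly the user's explicit list, so the general policy stays unchanged.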

Detecting CoreOS isn't trivial. The root filesystem may be encrypted. /boot isn't, but might be on a RAID volume. We'd need to:

  1. Detect the boot partition label, or boot-<n> in the case of a Butane-created RAID config. Not all s390x systems have partition tables that support these labels.
  2. If present, start the RAID, read-only and possibly degraded.
  3. Mount the filesystem, read-only, keeping in mind that the boot partition label may not be unique on the host system.
  4. Check the filesystem for some combination of properties indicating a CoreOS volume. We need to successfully detect every FCOS and RHCOS release.
  5. Unmount the filesystem and stop the RAID.
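Step 1 is the only part that can be shown without touching the system, so here is a small sketch of just the label matching. The function names are hypothetical, and the sketch assumes the RAID label form is exactly `boot-` followed by decimal digits. It returns every match rather than the first, since the label may not be unique.

```rust
/// Return true if `label` is `boot`, or `boot-<n>` where `<n>` is one or
/// more ASCII digits (the form assumed here for Butane-created RAID configs).
pub fn is_boot_label(label: &str) -> bool {
    if label == "boot" {
        return true;
    }
    match label.strip_prefix("boot-") {
        Some(n) => !n.is_empty() && n.bytes().all(|b| b.is_ascii_digit()),
        None => false,
    }
}

/// Given `(partition index, label)` pairs, return the indexes of all
/// partitions that might hold a CoreOS /boot filesystem. Each candidate
/// still has to be mounted read-only and inspected (steps 2 to 5).
pub fn boot_candidates(parts: &[(u32, &str)]) -> Vec<u32> {
    parts
        .iter()
        .filter(|(_, label)| is_boot_label(label))
        .map(|(index, _)| *index)
        .collect()
}
```

An empty result is not proof that CoreOS is absent: as noted in step 1, some s390x partition tables cannot carry these labels at all.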

In the overlap case, coreos-installer will download (if necessary) and write most of the image, notice the overlap before reaching the data partition, restore the saved partition table entries, and fail. We'll need to ensure the error message provides clear advice.
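The overlap condition itself is a simple comparison, sketched below. This is only an illustration: `SavedPartition` and `check_image_fits` are hypothetical names, the check is written as a standalone function rather than as part of the write loop described above, and the error wording is a placeholder for whatever advice we settle on.

```rust
/// Hypothetical record of a saved partition's position on the target disk.
#[derive(Debug, Clone, PartialEq)]
pub struct SavedPartition {
    pub index: u32,
    pub start_sector: u64,
}

/// Check whether an image of `image_bytes` bytes, written from sector 0,
/// would reach into any saved partition. On overlap, report the saved
/// partition with the lowest start sector.
pub fn check_image_fits(
    image_bytes: u64,
    sector_size: u64,
    saved: &[SavedPartition],
) -> Result<(), String> {
    assert!(sector_size > 0, "sector size must be nonzero");
    // Round up to whole sectors without risking overflow.
    let image_sectors =
        image_bytes / sector_size + u64::from(image_bytes % sector_size != 0);
    // The image occupies sectors 0..image_sectors (end exclusive).
    let first_hit = saved
        .iter()
        .filter(|p| p.start_sector < image_sectors)
        .min_by_key(|p| p.start_sector);
    match first_hit {
        None => Ok(()),
        Some(p) => Err(format!(
            "OS image needs {} sectors, but saved partition {} starts at sector {}; \
             refusing to overwrite it. Back up its data and repartition the disk \
             before reinstalling.",
            image_sectors, p.index, p.start_sector
        )),
    }
}
```

Because the image size and the saved partitions' start sectors are both known before any data is written, a comparison like this could in principle run before the write begins, though that depends on when the final image size is available.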
