Always validate checksums for Direct I/O reads #16598

Open
wants to merge 1 commit into
base: master

Commits on Oct 2, 2024

  1. Always validate checksums for Direct I/O reads

    This fixes an oversight in the Direct I/O PR. Nothing stops a process
    from manipulating the contents of a buffer for a Direct I/O read while
    the I/O is in flight. This can lead to checksum verification failures
    even though the disk contents are still correct, which would result in
    false reports of checksum validation failures.
    
    To remedy this, all Direct I/O reads that have a checksum verification
    failure are treated as suspicious. In the event a checksum validation
    failure occurs for a Direct I/O read, then the I/O request will be
    reissued through the ARC. This allows for actual validation to happen and
    removes any possibility of the buffer being manipulated after the I/O
    has been issued.
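
The retry behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from this commit: `struct block`, `struct vdev_stats`, `toy_cksum()`, `dio_read()`, and `flip_first_byte()` are hypothetical names, the checksum is a simplified stand-in, and a private bounce buffer stands in for the ARC.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Hypothetical on-disk block: payload plus the checksum stored for it. */
struct block {
	uint8_t data[BLOCK_SIZE];
	uint32_t cksum;
};

/* Hypothetical counter, standing in for the DIO column. */
struct vdev_stats {
	unsigned dio_verify_rd;
};

/* Toy checksum; the real ZFS checksum algorithms are not modeled here. */
static uint32_t
toy_cksum(const uint8_t *buf, size_t len)
{
	uint32_t a = 0, b = 0;

	for (size_t i = 0; i < len; i++) {
		a += buf[i];
		b += a;
	}
	return ((b << 16) | (a & 0xffff));
}

/* Models a user thread scribbling on the buffer while the I/O is in flight. */
typedef void (*scribble_fn)(uint8_t *buf, size_t len);

/*
 * Read "directly" into the caller's buffer and verify it. On a mismatch,
 * count the failure and retry into a private buffer the caller cannot
 * touch. Returns true if the caller ends up with verified data.
 */
static bool
dio_read(const struct block *blk, uint8_t *ubuf, scribble_fn scribble,
    struct vdev_stats *vs)
{
	uint8_t priv[BLOCK_SIZE];

	memcpy(ubuf, blk->data, BLOCK_SIZE);
	if (scribble != NULL)
		scribble(ubuf, BLOCK_SIZE);

	if (toy_cksum(ubuf, BLOCK_SIZE) == blk->cksum)
		return (true);

	/* Suspicious failure: record it, then reissue out of user reach. */
	vs->dio_verify_rd++;
	memcpy(priv, blk->data, BLOCK_SIZE);
	if (toy_cksum(priv, BLOCK_SIZE) != blk->cksum)
		return (false);	/* the stored data really is bad */
	memcpy(ubuf, priv, BLOCK_SIZE);
	return (true);
}

/* Example scribbler: invert the first byte of the buffer. */
static void
flip_first_byte(uint8_t *buf, size_t len)
{
	if (len > 0)
		buf[0] ^= 0xff;
}
```

In this model, calling `dio_read()` with `flip_first_byte` still hands back correct data, and the only lasting trace is the incremented counter. A read of a block whose stored data no longer matches its checksum fails on both attempts, which is the case that should still be treated as a real error.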
    
    Just as with Direct I/O write checksum validation failures, Direct I/O
    read checksum validation failures are reported through zpool status -d in
    the DIO column. Also, the zevent has been updated to have both:
    1. dio_verify_wr -> Checksum verification failure for writes
    2. dio_verify_rd -> Checksum verification failure for reads.
    This allows for determining what I/O operation was the culprit for the
    checksum verification failure. All DIO errors are reported only on the
    top-level VDEV.
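
As a rough illustration of the reporting rule above, the following sketch attributes a Direct I/O verify error to the top-level VDEV and picks one of the two identifiers by operation type. It is a toy model, not the ZFS implementation: `struct vdev`, `enum dio_op`, `top_level()`, and `report_dio_error()` are hypothetical, and only the strings `dio_verify_wr` and `dio_verify_rd` come from this commit message.

```c
#include <stddef.h>

/* Hypothetical operation type for a Direct I/O request. */
enum dio_op { DIO_READ, DIO_WRITE };

/* Hypothetical vdev node; the root vdev has a NULL parent. */
struct vdev {
	struct vdev *parent;
	unsigned dio_verify_errors;
};

/* Map the failing operation to the identifier named in this commit. */
static const char *
dio_event_name(enum dio_op op)
{
	return (op == DIO_WRITE ? "dio_verify_wr" : "dio_verify_rd");
}

/* Walk upward until the node is a direct child of the root. */
static struct vdev *
top_level(struct vdev *vd)
{
	while (vd->parent != NULL && vd->parent->parent != NULL)
		vd = vd->parent;
	return (vd);
}

/*
 * Count the error on the top-level vdev only, leaving the leaf untouched,
 * and return the identifier that tells reads apart from writes.
 */
static const char *
report_dio_error(struct vdev *leaf, enum dio_op op)
{
	top_level(leaf)->dio_verify_errors++;
	return (dio_event_name(op));
}
```

With a root, a mirror under it, and a disk under the mirror, reporting an error against the disk increments only the mirror's counter in this model.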
    
    Even though FreeBSD can write protect pages (stable pages), it still has
    the same issue as Linux in this department.
    
    This commit updates the following:
    1. Propagates checksum failures for reads all the way up to the
       top-level VDEV.
    2. Reports errors through zpool status -d as DIO.
    3. Has two zevents for checksum verify errors with Direct I/O. One for
       read and one for write.
    4. Updates FreeBSD ABD code to also check for ABD_FLAG_FROM_PAGES and
       handle ABD buffer contents validation the same as Linux.
    5. Moves the declaration of nbytes in zfs_read() to the top of the
       function and outside of the while loop. This was needed due to a
       compilation failure in FreeBSD.
    6. Updates manipulate_user_buffer.c to also manipulate a buffer while a
       Direct I/O read is taking place.
    7. Adds a new test case dio_read_verify that stress tests the new code.
    
    This issue was first observed when installing a Windows 11 VM on a ZFS
    dataset with the dataset property direct set to always. The zpool
    devices would report checksum failures, but a subsequent zpool scrub
    would repair no data and report no errors.
    
    Signed-off-by: Brian Atkinson <[email protected]>
    bwatkinson committed Oct 2, 2024
    2d48b54