@whoschek whoschek commented Dec 9, 2025

This PR is in response to #17998

Motivation and Context

OpenZFS 2.2 introduced the snapshots_changed dataset property as a cheap, persistent way to tell whether the snapshot set for a dataset has changed without having to run zfs list -t snapshot on every poll. Replication and monitoring tools use this as a cache key to avoid repeatedly walking large snapshot trees.

Because snapshots_changed is exposed only with integer-second granularity, multiple snapshot creations or deletions that occur within the same second share the same value. Tools that rely on it as a change detector can therefore miss snapshot updates that happen in the same second, causing cache-based fast paths to skip zfs list -t snapshot when they should refresh their snapshot metadata. This is particularly problematic for near real-time replication and monitoring systems that take frequent snapshots.
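
The failure mode can be sketched in a few lines of Python. This is an illustrative simulation only, not code from any real replication tool: the `needs_refresh` helper and the sample timestamps are hypothetical, and the "cache" simply compares the last-seen stamp against the current one.

```python
# Hypothetical simulation: a tool caches the last-seen change stamp and
# re-lists snapshots only when the stamp differs from the cached one.

def needs_refresh(cached_stamp, current_stamp):
    """Return True when the cached snapshot list must be re-read."""
    return cached_stamp != current_stamp

# Two snapshot-set changes 0.4 s apart, written as (seconds, nanoseconds).
first_change = (1765300000, 100_000_000)
second_change = (1765300000, 500_000_000)

# Seconds-only granularity: both changes map to the same cache key,
# so the second change goes unnoticed.
missed = not needs_refresh(first_change[0], second_change[0])

# Nanosecond granularity: the keys differ, so the change is detected.
detected = needs_refresh(first_change, second_change)

print(missed, detected)
```

With the seconds-only key the second change is missed; with the (seconds, nanoseconds) key it is detected.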

This change adds a companion read-only dataset property, snapshots_changed_nsecs, which exposes the nanosecond (tv_nsec) component of the same internal timestamp that backs snapshots_changed. Together, (snapshots_changed, snapshots_changed_nsecs) provide a reliable, nanosecond-granularity indicator of snapshot-set changes without requiring any on-disk format changes.

The change is intentionally minimal and aligned with existing patterns:

  • Internally, OpenZFS already stores the snapshot change time as an inode_timespec_t with tv_sec and tv_nsec.
  • The nanoseconds are already stored alongside the seconds in the extensible dataset ZAP; we are simply exposing them via a new read-only property.
  • No on-disk format changes are required, and no user-settable properties are added; this is a read-only, introspection-only interface.
  • The seconds/nanoseconds pair mirrors the existing ZFS event time split (ZEVENT_TIME_SECS / ZEVENT_TIME_NSECS).

Description

  • Introduce a new dataset property snapshots_changed_nsecs with enum ZFS_PROP_SNAPSHOTS_CHANGED_NSECS in include/sys/fs/zfs.h and lib/libzfs/libzfs.abi.
  • Register the property in module/zcommon/zfs_prop.c as a numeric, read-only dataset property for filesystems and volumes:
    • Name: snapshots_changed_nsecs
    • Type: PROP_TYPE_NUMBER
    • Display: raw integer value (SNAPSHOTS_CHANGED_NSECS column, no humanized formatting)
  • Populate the property from the existing dsl_dir_snap_cmtime() timestamp in dsl_dataset_stats() (module/zfs/dsl_dataset.c):
    • Cache dsl_dir_snap_cmtime(ds->ds_dir) into a local inode_timespec_t snap_cmtime.
    • Continue to export snap_cmtime.tv_sec as snapshots_changed.
    • When snap_cmtime.tv_sec != 0, export snap_cmtime.tv_nsec as snapshots_changed_nsecs.
    • This reuses the existing extensible-dataset ZAP timestamp and does not change any on-disk structures.
  • Expose the property in userland via libzfs (lib/libzfs/libzfs_dataset.c):
    • Add a ZFS_PROP_SNAPSHOTS_CHANGED_NSECS case to zfs_prop_get() which retrieves the numeric value and formats it as a decimal string.
  • Make the property available to channel programs (module/zfs/zcp_get.c):
    • Extend get_special_prop() to return dsl_dir_snap_cmtime(ds->ds_dir).tv_nsec for ZFS_PROP_SNAPSHOTS_CHANGED_NSECS.
  • Document the new property in zfsprops(7) (man/man7/zfsprops.7):
    • snapshots_changed_nsecs is the nanosecond component corresponding to snapshots_changed.
    • Its value is in the range [0, 999999999] and is only meaningful when snapshots_changed is not -.
    • The full nanosecond timestamp since the Unix epoch can be reconstructed as snapshots_changed * 1000000000 + snapshots_changed_nsecs.
  • Extend the functional test snapshot_018_pos (tests/zfs-tests/tests/functional/snapshot/snapshot_018_pos.ksh):
    • Verify that both snapshots_changed and snapshots_changed_nsecs are - before any snapshots exist, via both zfs get and zfs list -o.
    • After snapshot creation, check that snapshots_changed is greater than or equal to a reference time captured before the snapshot was taken, and that snapshots_changed_nsecs is in [0, 1000000000).
    • Assert that zfs get and zfs list -o report identical values for both properties on the pool and filesystem.
    • Confirm that the properties retain their values correctly across unmount/mount cycles.
    • Verify that destroying snapshots updates snapshots_changed and snapshots_changed_nsecs and that the .zfs/snapshot directory mtime matches snapshots_changed.
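
As a rough sketch of how a consumer might apply the reconstruction formula from the man page text above, the following Python combines the two raw property strings into one nanosecond timestamp. The helper names `parse_prop` and `combine` are hypothetical, and the inputs are assumed to be the raw strings a tool has already obtained (for example from zfs get -p), with - meaning "unset" as described above.

```python
NSECS_PER_SEC = 1_000_000_000

def parse_prop(value):
    """Parse a raw property string; '-' means the property is unset."""
    return None if value == "-" else int(value)

def combine(secs_str, nsecs_str):
    """Return nanoseconds since the Unix epoch, or None if unset."""
    secs = parse_prop(secs_str)
    nsecs = parse_prop(nsecs_str)
    if secs is None or nsecs is None:
        return None
    # The description states the nanosecond part lies in [0, 999999999].
    if not 0 <= nsecs < NSECS_PER_SEC:
        raise ValueError(f"nanosecond component out of range: {nsecs}")
    return secs * NSECS_PER_SEC + nsecs

print(combine("1765300000", "123456789"))  # 1765300000123456789
print(combine("-", "-"))                   # None
```

The range check is a defensive choice in this sketch; it is not something the property itself requires callers to do.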

How Has This Been Tested?

  • Extended tests/zfs-tests/tests/functional/snapshot/snapshot_018_pos.ksh to cover both snapshots_changed and snapshots_changed_nsecs for:
    • Initial pool and filesystem creation (no snapshots present).
    • Snapshot creation on the pool and filesystem.
    • Snapshot creation while filesystems are unmounted.
    • Mount/unmount cycles of the pool and filesystem.
    • Snapshot destruction on both the filesystem and the pool.
  • The test validates that:
    • snapshots_changed and snapshots_changed_nsecs are - when no snapshots exist.
    • snapshots_changed_nsecs is always in the range [0, 1000000000) when snapshots_changed is set.
    • zfs get -p and zfs list -p -o return consistent values for both properties.
    • The .zfs/snapshot directory mtime remains consistent with snapshots_changed.
  • Ran these test commands:
    ./scripts/zfs-tests.sh -v -t snapshot_018_pos
    ./scripts/zfs-tests.sh -v -T snapshot

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the "Status: Design Review Needed" label (Architecture or design is under discussion) Dec 9, 2025
@whoschek whoschek force-pushed the 17998-Add_snapshots_changed_nsecs_dataset_property branch 3 times, most recently from 36e9976 to 1626909 Compare December 10, 2025 20:19
amotin (Member) commented Dec 10, 2025

While I understand your motivation, it still looks weird to me to report time in two properties. Sure, it may work as a change notification, but for any other use it would be quite useless, being potentially incoherent with the seconds part.

whoschek (Author) commented Dec 10, 2025

> While I understand your motivation, it still looks weird to me to report time in two properties. Sure, it may work as a change notification, but for any other use it would be quite useless, being potentially incoherent with the seconds part.

It's a bit weird, yes, but it is consistent with the seconds part: the full nanosecond timestamp since the Unix epoch can be reconstructed as snapshots_changed * 1000000000 + snapshots_changed_nsecs.

It's also consistent with the existing design elsewhere in the ZFS codebase. FWIW, I figured I should design it so the change is as minimal as possible and "fits in" with the existing codebase, i.e. the ZFS event time split, which does the same thing - it reports ZEVENT_TIME_SECS and ZEVENT_TIME_NSECS with the same semantics: The full nanosecond timestamp = ZEVENT_TIME_SECS * 1000000000 + ZEVENT_TIME_NSECS

@amotin Having said that, I'd be fine with changing snapshots_changed_nsecs to represent the full timestamp in nanoseconds in a uint64 if that's what you'd prefer (the number will overflow uint64 in the year 2554 but I'm not too worried about that yet :-)
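
The overflow year can be checked with a short calculation. This is only a sanity check of the arithmetic, assuming the value is an unsigned 64-bit count of nanoseconds since the Unix epoch; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

UINT64_MAX = 2**64 - 1
NSECS_PER_SEC = 1_000_000_000

# Largest whole number of seconds that fits in a uint64 nanosecond count.
max_secs = UINT64_MAX // NSECS_PER_SEC

# Add that many seconds to the Unix epoch to find the last representable date.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_representable = epoch + timedelta(seconds=max_secs)

print(last_representable.year)  # 2554
```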

@whoschek whoschek force-pushed the 17998-Add_snapshots_changed_nsecs_dataset_property branch from 1626909 to bf3d8da Compare December 11, 2025 00:43
amotin (Member) commented Dec 11, 2025

> since the Unix epoch can be reconstructed

It can't, if a new snapshot is created between reading two properties.
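
The race can be illustrated with a deterministic simulation. Everything here is hypothetical: `FakeDataset` is a stand-in object rather than any ZFS API, and the change between the two reads is injected by hand to model a snapshot landing at exactly the wrong moment.

```python
NSECS_PER_SEC = 1_000_000_000

class FakeDataset:
    """Hypothetical stand-in: holds one change time as a nanosecond count."""

    def __init__(self, change_ns):
        self.change_ns = change_ns

    def get_secs(self):
        return self.change_ns // NSECS_PER_SEC

    def get_nsecs(self):
        return self.change_ns % NSECS_PER_SEC

    def get_full_ns(self):
        return self.change_ns

t1 = 1765300000 * NSECS_PER_SEC + 900_000_000  # change time before the race
t2 = 1765300001 * NSECS_PER_SEC + 100_000_000  # change time after a new snapshot

ds = FakeDataset(t1)
secs = ds.get_secs()    # first property read sees t1
ds.change_ns = t2       # a snapshot is created between the two reads
nsecs = ds.get_nsecs()  # second property read sees t2
torn = secs * NSECS_PER_SEC + nsecs

# The recombined value matches neither real change time, and here it is
# even earlier than t1.
print(torn in (t1, t2), torn < t1)

# A single read of the full value cannot be torn in this way.
print(ds.get_full_ns() == t2)
```

In this model, the two-read reconstruction yields a timestamp that never existed, while a single full-precision value is always one of the real change times.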

> Having said that, I'd be fine with changing snapshots_changed_nsecs to represent the full timestamp in nanoseconds in a uint64 if that's what you'd prefer

I can't say that I like having two properties for the same thing just with different precision, but this would make more sense to me.

@whoschek whoschek force-pushed the 17998-Add_snapshots_changed_nsecs_dataset_property branch 2 times, most recently from 5606481 to b23fbc8 Compare December 11, 2025 03:52
whoschek (Author) commented

> > Having said that, I'd be fine with changing snapshots_changed_nsecs to represent the full timestamp in nanoseconds in a uint64 if that's what you'd prefer

> I can't say that I like having two properties for the same thing just with different precision, but this would make more sense to me.

@amotin Ok, I've now changed the code (and pushed it) such that snapshots_changed_nsecs represents the full timestamp in nanoseconds in a uint64. Let me know if you find any other issues!

Signed-off-by: Wolfgang Hoschek <wolfgang.hoschek@mac.com>
@whoschek whoschek force-pushed the 17998-Add_snapshots_changed_nsecs_dataset_property branch from b23fbc8 to 98e725c Compare December 11, 2025 05:05