-
Notifications
You must be signed in to change notification settings - Fork 174
fix(csi): clean up the mount point while node failed to publish volume #4268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix(csi): clean up the mount point while node failed to publish volume #4268
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughPrepare and validate bind-mount targets for block volumes, add ensureBindMountPoint and lazy unmount handling, change mount failure handling to call unmountAndCleanupMountPoint with explicit cleanup/logging, and update NodePublishBlockVolume messages to report bind-mount failure details. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
87abeed to
69f056f
Compare
69f056f to
d8cd841
Compare
d8cd841 to
3f953da
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
csi/util.go (1)
467-481: Update misleading comment formakeFile.The comment at Line 468 states "If pathname already exists, whether a file or directory, no error is returned," but this is inaccurate. When
pathnameis an existing directory,os.OpenFilewithos.O_CREATEreturns an error (typicallyEISDIR), which is notos.IsExist, so the function returns that error rather thannil.Apply this diff to correct the comment:
// makeFile creates an empty regular file. -// If pathname already exists, whether a file or directory, no error is returned. +// If pathname already exists as a regular file, no error is returned. +// If pathname already exists as a directory or other non-regular file type, an error is returned. func makeFile(pathname string) error { f, err := os.OpenFile(pathname, os.O_CREATE, os.FileMode(0644)) if f != nil { err = f.Close() return err } if err != nil { if !os.IsExist(err) { return err } } return nil }
🧹 Nitpick comments (1)
csi/util.go (1)
341-364: Good addition for bind mount point validation.The new
ensureBindMountPointfunction correctly validates and prepares bind mount points by removing corrupt entries and ensuring a regular file exists. The error handling and logging are appropriate.However, note that if
pathis an existing directory,makeFilewill fail (Line 360) becauseos.OpenFileon a directory returns anEISDIRerror. This is likely the intended behavior (requiring a regular file for bind mounts), but consider adding an explicit check and a clearer error message for this case to improve debuggability.Optional enhancement:
func ensureBindMountPoint(path string) error { logrus.Infof("Trying to ensure bind mount point %v", path) info, statErr := os.Lstat(path) if statErr != nil { if !os.IsNotExist(statErr) { return errors.Wrapf(statErr, "failed to check bind mount at %s", path) } } else { + if info.Mode().IsDir() { + return errors.Errorf("bind mount point at %s is a directory, expected regular file", path) + } if !info.Mode().IsRegular() && !info.Mode().IsDir() { - logrus.WithField("fileMode", info.Mode().String()).Infof("Bind mount point %v exists but is not a valid file, clean up", path) + logrus.WithField("fileMode", info.Mode().String()).Infof("Bind mount point %v exists but is not a regular file (type: %s), cleaning up", path, info.Mode().Type()) if rmErr := os.Remove(path); rmErr != nil { return errors.Wrapf(rmErr, "failed to remove corrupt bind mount point at %s", path) } } } if makeFileErr := makeFile(path); makeFileErr != nil { return errors.Wrapf(makeFileErr, "failed to create regular file for bind mount at %s", path) } return nil }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
csi/node_server.go(1 hunks)csi/util.go(2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: csi/controller_server.go:1070-1071
Timestamp: 2024-10-30T07:07:32.174Z
Learning: In the method `cleanupBackupVolume` in `csi/controller_server.go`, errors are returned directly and handled at the upper-level method `DeleteSnapshot`. Therefore, it's acceptable to return errors directly in such helper methods without wrapping them.
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 3648
File: upgrade/v18xto190/upgrade.go:67-86
Timestamp: 2025-04-21T06:56:13.427Z
Learning: In `upgrade/v18xto190/upgrade.go`, changes made to volumes in the `upgradeVolumes` function after calling `ListAndUpdateVolumesInProvidedCache` are expected to be persisted by the caller, not within the function itself.
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: manager/volume.go:1180-1182
Timestamp: 2024-10-29T12:51:34.772Z
Learning: Ensure that when reviewing the `restoreBackingImage` function in `manager/volume.go`, verify that variables referenced in error messages are actually undefined before suggesting corrections.
📚 Learning: 2024-10-30T12:47:32.309Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3627-3636
Timestamp: 2024-10-30T12:47:32.309Z
Learning: In the file `controller/volume_controller.go`, avoid suggesting additional logging statements if they are unnecessary or redundant. Only add logging when it provides significant value.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T08:47:14.968Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:404-404
Timestamp: 2024-10-29T08:47:14.968Z
Learning: In `controller/backup_controller.go`, be aware that after setting the backup status to error when a volume is not found, the code already includes `return nil`, so no additional return statement is needed.
Applied to files:
csi/node_server.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build binaries
- GitHub Check: Summary
🔇 Additional comments (1)
csi/node_server.go (1)
384-390: Improved mount failure cleanup.The change to use
unmountAndCleanupMountPointinstead ofos.Removeis the correct approach for cleaning up failed mount attempts. This ensures the mount point is properly unmounted before removal, preventing the original mount error from being shadowed by cleanup errors. The logging of cleanup failures is also appropriate.
WebberHuang1118
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few questions and suggestions — please check it, thanks.
958a8c2 to
928b613
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
csi/node_server.go(1 hunks)csi/util.go(5 hunks)
🧰 Additional context used
🧠 Learnings (14)
📚 Learning: 2024-10-29T08:47:14.968Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:404-404
Timestamp: 2024-10-29T08:47:14.968Z
Learning: In `controller/backup_controller.go`, be aware that after setting the backup status to error when a volume is not found, the code already includes `return nil`, so no additional return statement is needed.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:47:32.309Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3627-3636
Timestamp: 2024-10-30T12:47:32.309Z
Learning: In the file `controller/volume_controller.go`, avoid suggesting additional logging statements if they are unnecessary or redundant. Only add logging when it provides significant value.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:51:18.255Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3652-3669
Timestamp: 2024-10-30T12:51:18.255Z
Learning: In `volume_controller.go`, within the `getBackupVolumeInfo` method, avoid adding `nil` checks for the `volume` variable `v` because it is already guaranteed to be non-`nil` by an earlier check at the beginning of the `syncVolume` method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T12:51:34.772Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: manager/volume.go:1180-1182
Timestamp: 2024-10-29T12:51:34.772Z
Learning: Ensure that when reviewing the `restoreBackingImage` function in `manager/volume.go`, verify that variables referenced in error messages are actually undefined before suggesting corrections.
Applied to files:
csi/node_server.go
📚 Learning: 2025-04-21T06:56:13.427Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 3648
File: upgrade/v18xto190/upgrade.go:67-86
Timestamp: 2025-04-21T06:56:13.427Z
Learning: In `upgrade/v18xto190/upgrade.go`, changes made to volumes in the `upgradeVolumes` function after calling `ListAndUpdateVolumesInProvidedCache` are expected to be persisted by the caller, not within the function itself.
Applied to files:
csi/node_server.go
📚 Learning: 2024-11-07T08:45:46.186Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: webhook/resources/backuptarget/validator.go:0-0
Timestamp: 2024-11-07T08:45:46.186Z
Learning: In the `validateDRVolume` function within `webhook/resources/backuptarget/validator.go`, prefer comparing backup target names via labels instead of URLs when validating DR volumes to account for potential URL formatting differences.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-28T13:21:51.980Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: api/backup.go:123-128
Timestamp: 2024-10-28T13:21:51.980Z
Learning: In `api/backup.go`, the method `s.m.DeleteBackupTarget` already checks for the existence of the backup target and handles errors. Therefore, it's unnecessary to add an existence check before calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-24T12:04:44.187Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:325-326
Timestamp: 2024-10-24T12:04:44.187Z
Learning: In `controller/backup_controller.go`, `backupTargetClient` is guaranteed to be non-`nil` when there is no error, so additional `nil` checks before its usage are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:31:25.782Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: upgrade/v17xto180/upgrade.go:68-110
Timestamp: 2024-10-30T12:31:25.782Z
Learning: In `upgrade/v17xto180/upgrade.go`, the `backupTargetName` can be trusted to be validated if it is not an empty string. Therefore, additional validation in `ensureBackupTargetSet` and `ensureLabel` is unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-28T01:25:02.898Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backing_image_data_source_controller.go:714-714
Timestamp: 2024-10-28T01:25:02.898Z
Learning: In `controller/backing_image_data_source_controller.go`, the method `GetBackupTargetRO` already handles error checking for `bids.Spec.Parameters[longhorn.DataSourceTypeRestoreParameterBackupTargetName]`, so explicit checks before calling it are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:42:20.034Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:4776-4778
Timestamp: 2024-10-30T12:42:20.034Z
Learning: In the `controller/volume_controller.go` file's `ReconcileBackupVolumeState` function, when using `errors.Wrapf(err, ...)`, avoid including `err` in the format string, since `errors.Wrapf` already includes the original error `err`. This helps to prevent redundancy in error messages.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T04:23:30.091Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_volume_controller.go:240-242
Timestamp: 2024-10-29T04:23:30.091Z
Learning: In `controller/backup_volume_controller.go`, the `BackupVolumeDelete` method of `backupTargetClient` returns `nil` if the backup volume is not found, so it's unnecessary to check for 'not found' errors after calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:46:06.911Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3884-3894
Timestamp: 2024-10-30T12:46:06.911Z
Learning: In `controller/volume_controller.go`, within the `restoreVolumeRecurringJobs` function, avoid adding redundant checks for `v.Spec.BackupTargetName == ""`, as `GetBackupVolumeWithBackupTargetAndVolumeRO` already performs this validation, and `v.Spec.BackupTargetName` defaults to `default` when the custom resource is created.
Applied to files:
csi/node_server.go
📚 Learning: 2024-11-25T23:55:02.080Z
Learnt from: derekbit
Repo: longhorn/longhorn-manager PR: 3282
File: controller/monitor/node_upgrade_monitor.go:351-357
Timestamp: 2024-11-25T23:55:02.080Z
Learning: `GetVolumeRO` guarantees that `volume` is non-nil when `err == nil`, so explicit nil checks after error handling are not needed.
Applied to files:
csi/node_server.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build binaries
- GitHub Check: Summary
928b613 to
ad77400
Compare
WebberHuang1118
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments, thanks.
a80ea57 to
3fb706f
Compare
|
PR updated per review comment. Please help review again, thanks. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
csi/util.go (1)
383-416: unmount still returns the original error even after successful lazy unmountIf the initial
Unmount/UnmountWithForcecall returns an error andIsMountPointreports the path is still mounted, a lazy unmount withumount -lis performed. When that lazy unmount succeeds, the mount is released, but the function still returns the earlier error at line 416. This has two effects:
unmountAndCleanupMountPointreceives an error and skips the subsequent Lstat+Remove cleanup (lines 428-440), leaving the mount point file unremovedMicrosoft.- Node operations like
NodeUnpublishVolumepropagate this error and fail, reporting a failure despite the mount point having been successfully detached.Fix by returning
nilinstead of the earliererrafter a successful lazy unmount (line 416).
🧹 Nitpick comments (1)
csi/node_server.go (1)
365-385: Block publish flow now cleans up correctly and surfaces the real mount errorThe updated
nodePublishBlockVolumelogic looks solid:
ensureDirectory(filepath.Dir(targetPath))makes sure the parent directory exists before any bind work.ensureBindMountPoint(targetPath, mounter)centralizes “clear and recreate bind target” logic, reusing the new unmount/cleanup helpers and guaranteeing a regular file.- On mount failure you:
- Log
mountErr.- Call
unmountAndCleanupMountPointto clean up the target, logging but ignoring cleanup failures.- Return a
codes.Internalerror that includesdevicePath,targetPath, andmountErr, so the underlying bind-mount failure is visible to kubelet instead of being hidden by cleanup.That matches the issue’s goal of cleaning up the target path without letting cleanup errors shadow the original mount failure. The only minor nit is that the cleanup path uses
mount.New("")rather than the passed-inmounter, but for block volumes today that’s effectively the same implementation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
csi/node_server.go(1 hunks)csi/util.go(5 hunks)
🧰 Additional context used
🧠 Learnings (16)
📚 Learning: 2024-10-30T12:47:32.309Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3627-3636
Timestamp: 2024-10-30T12:47:32.309Z
Learning: In the file `controller/volume_controller.go`, avoid suggesting additional logging statements if they are unnecessary or redundant. Only add logging when it provides significant value.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T08:47:14.968Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:404-404
Timestamp: 2024-10-29T08:47:14.968Z
Learning: In `controller/backup_controller.go`, be aware that after setting the backup status to error when a volume is not found, the code already includes `return nil`, so no additional return statement is needed.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2025-04-21T06:56:13.427Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 3648
File: upgrade/v18xto190/upgrade.go:67-86
Timestamp: 2025-04-21T06:56:13.427Z
Learning: In `upgrade/v18xto190/upgrade.go`, changes made to volumes in the `upgradeVolumes` function after calling `ListAndUpdateVolumesInProvidedCache` are expected to be persisted by the caller, not within the function itself.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T12:51:34.772Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: manager/volume.go:1180-1182
Timestamp: 2024-10-29T12:51:34.772Z
Learning: Ensure that when reviewing the `restoreBackingImage` function in `manager/volume.go`, verify that variables referenced in error messages are actually undefined before suggesting corrections.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2024-10-28T13:21:51.980Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: api/backup.go:123-128
Timestamp: 2024-10-28T13:21:51.980Z
Learning: In `api/backup.go`, the method `s.m.DeleteBackupTarget` already checks for the existence of the backup target and handles errors. Therefore, it's unnecessary to add an existence check before calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-24T12:04:44.187Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:325-326
Timestamp: 2024-10-24T12:04:44.187Z
Learning: In `controller/backup_controller.go`, `backupTargetClient` is guaranteed to be non-`nil` when there is no error, so additional `nil` checks before its usage are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:31:25.782Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: upgrade/v17xto180/upgrade.go:68-110
Timestamp: 2024-10-30T12:31:25.782Z
Learning: In `upgrade/v17xto180/upgrade.go`, the `backupTargetName` can be trusted to be validated if it is not an empty string. Therefore, additional validation in `ensureBackupTargetSet` and `ensureLabel` is unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-28T01:25:02.898Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backing_image_data_source_controller.go:714-714
Timestamp: 2024-10-28T01:25:02.898Z
Learning: In `controller/backing_image_data_source_controller.go`, the method `GetBackupTargetRO` already handles error checking for `bids.Spec.Parameters[longhorn.DataSourceTypeRestoreParameterBackupTargetName]`, so explicit checks before calling it are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:42:20.034Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:4776-4778
Timestamp: 2024-10-30T12:42:20.034Z
Learning: In the `controller/volume_controller.go` file's `ReconcileBackupVolumeState` function, when using `errors.Wrapf(err, ...)`, avoid including `err` in the format string, since `errors.Wrapf` already includes the original error `err`. This helps to prevent redundancy in error messages.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T04:23:30.091Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_volume_controller.go:240-242
Timestamp: 2024-10-29T04:23:30.091Z
Learning: In `controller/backup_volume_controller.go`, the `BackupVolumeDelete` method of `backupTargetClient` returns `nil` if the backup volume is not found, so it's unnecessary to check for 'not found' errors after calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:46:06.911Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3884-3894
Timestamp: 2024-10-30T12:46:06.911Z
Learning: In `controller/volume_controller.go`, within the `restoreVolumeRecurringJobs` function, avoid adding redundant checks for `v.Spec.BackupTargetName == ""`, as `GetBackupVolumeWithBackupTargetAndVolumeRO` already performs this validation, and `v.Spec.BackupTargetName` defaults to `default` when the custom resource is created.
Applied to files:
csi/node_server.go
📚 Learning: 2024-11-25T23:55:02.080Z
Learnt from: derekbit
Repo: longhorn/longhorn-manager PR: 3282
File: controller/monitor/node_upgrade_monitor.go:351-357
Timestamp: 2024-11-25T23:55:02.080Z
Learning: `GetVolumeRO` guarantees that `volume` is non-nil when `err == nil`, so explicit nil checks after error handling are not needed.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2024-10-29T08:51:32.394Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:310-310
Timestamp: 2024-10-29T08:51:32.394Z
Learning: In `controller/backup_controller.go`, when retrieving `backupVolume` using `GetBackupVolumeWithBackupTargetAndVolume`, it can be `nil` without being an error, and the code should handle this case appropriately.
Applied to files:
csi/util.go
📚 Learning: 2024-10-30T12:51:18.255Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3652-3669
Timestamp: 2024-10-30T12:51:18.255Z
Learning: In `volume_controller.go`, within the `getBackupVolumeInfo` method, avoid adding `nil` checks for the `volume` variable `v` because it is already guaranteed to be non-`nil` by an earlier check at the beginning of the `syncVolume` method.
Applied to files:
csi/util.go
📚 Learning: 2024-10-28T15:09:08.570Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: webhook/resources/volume/mutator.go:115-117
Timestamp: 2024-10-28T15:09:08.570Z
Learning: In `webhook/resources/volume/mutator.go`, within the `Create` method of `volumeMutator`, it's acceptable for `bv` to be `nil` after calling `v.ds.GetBackupVolumeWithBackupTargetAndVolumeRO`, as a `nil` `bv` is handled by the volume and engine controllers. Explicit `nil` checks in this context are not necessary.
Applied to files:
csi/util.go
📚 Learning: 2024-10-28T14:41:50.732Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/uninstall_controller.go:530-535
Timestamp: 2024-10-28T14:41:50.732Z
Learning: The duplicate backup target deletion code in the `UninstallController` is intentional, and the `deleteRecreatedCRs` method may be removed instead.
Applied to files:
csi/util.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build binaries
- GitHub Check: Summary
🔇 Additional comments (3)
csi/util.go (3)
342-358: ensureBindMountPoint behavior looks correct and idempotentThe flow of unconditionally calling
unmountAndCleanupMountPointand thenmakeFilegives a clean, regular-file bind target on first and subsequent calls, while treating non-existent or already-unmounted paths as success via the underlying helpers. This matches the block bind-mount requirements and avoids leaking stale symlinks or files.
419-442: Cleanup path correctly handles broken symlinks and ENOENTThe updated
unmountAndCleanupMountPointuses the newunmountbehavior and thenos.Lstat+os.Remove, treatingos.IsNotExistas success and logging the file mode when removal is needed. This is a good fit for:
- Handling broken symlinks reliably.
- Keeping Node* calls idempotent when the path is already gone.
Once
unmount’s lazy-unmount return behavior is adjusted as suggested, this function should fully honor the “best-effort cleanup without hiding the original mount failure” goal.
487-501: makeFile doc comment matches current usageThe clarified comment (“creates an empty regular file”) aligns with how
makeFileis used afterunmountAndCleanupMountPointto materialize a bind-mount target. The behavior of tolerating existing paths without error remains compatible with previous callers.
08dfe31 to
fa1a17e
Compare
WebberHuang1118
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the comments, thanks.
fa1a17e to
372da43
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
372da43 to
f7bb24e
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
csi/util.go (2)
373-389: Consider validating mount state before blind cleanup.The function unconditionally unmounts and removes the target path, even if it's already a valid bind mount point. While this ensures a clean state, it may cause unnecessary work and potential disruption if the mount point is already correctly configured. Consider checking the current state first (e.g., whether the path exists and is a regular file, and whether it's already mounted to the correct device) before performing cleanup.
However, if the "blind cleanup" approach is intentional to ensure a completely fresh state (aligning with the PR objective of thorough cleanup), this is acceptable.
414-424: Consider using utilexec for consistency and testability.The function uses
exec.Commanddirectly, while the rest of the codebase uses theutilexecpackage (e.g., line 23 importsutilexec, and it's used ingetDiskFormatat line 485). Usingutilexecwould improve consistency and testability through mock executors.Apply this diff if you want to align with the existing pattern:
-func lazyUnmount(path string) (err error) { - lazyUnmountCmd := exec.Command("umount", "-l", path) - output, err := lazyUnmountCmd.CombinedOutput() +func lazyUnmount(path string, exec utilexec.Interface) (err error) { + lazyUnmountCmd := exec.Command("umount", "-l", path) + output, err := lazyUnmountCmd.CombinedOutput() if err != nil { if isNotMountedError(err) { return nil } return errors.Wrapf(err, "failed to lazy unmount a mount point %v, output: %s", path, output) } return nil }Then update the caller in
unmountto passutilexec.New()or the appropriate executor.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
csi/node_server.go(1 hunks)csi/util.go(7 hunks)
🧰 Additional context used
🧠 Learnings (16)
📚 Learning: 2025-04-21T06:56:13.427Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 3648
File: upgrade/v18xto190/upgrade.go:67-86
Timestamp: 2025-04-21T06:56:13.427Z
Learning: In `upgrade/v18xto190/upgrade.go`, changes made to volumes in the `upgradeVolumes` function after calling `ListAndUpdateVolumesInProvidedCache` are expected to be persisted by the caller, not within the function itself.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T08:47:14.968Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:404-404
Timestamp: 2024-10-29T08:47:14.968Z
Learning: In `controller/backup_controller.go`, be aware that after setting the backup status to error when a volume is not found, the code already includes `return nil`, so no additional return statement is needed.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2024-10-30T12:47:32.309Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3627-3636
Timestamp: 2024-10-30T12:47:32.309Z
Learning: In the file `controller/volume_controller.go`, avoid suggesting additional logging statements if they are unnecessary or redundant. Only add logging when it provides significant value.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-28T13:21:51.980Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: api/backup.go:123-128
Timestamp: 2024-10-28T13:21:51.980Z
Learning: In `api/backup.go`, the method `s.m.DeleteBackupTarget` already checks for the existence of the backup target and handles errors. Therefore, it's unnecessary to add an existence check before calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-24T12:04:44.187Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:325-326
Timestamp: 2024-10-24T12:04:44.187Z
Learning: In `controller/backup_controller.go`, `backupTargetClient` is guaranteed to be non-`nil` when there is no error, so additional `nil` checks before its usage are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:31:25.782Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: upgrade/v17xto180/upgrade.go:68-110
Timestamp: 2024-10-30T12:31:25.782Z
Learning: In `upgrade/v17xto180/upgrade.go`, the `backupTargetName` can be trusted to be validated if it is not an empty string. Therefore, additional validation in `ensureBackupTargetSet` and `ensureLabel` is unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-28T01:25:02.898Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backing_image_data_source_controller.go:714-714
Timestamp: 2024-10-28T01:25:02.898Z
Learning: In `controller/backing_image_data_source_controller.go`, the method `GetBackupTargetRO` already handles error checking for `bids.Spec.Parameters[longhorn.DataSourceTypeRestoreParameterBackupTargetName]`, so explicit checks before calling it are unnecessary.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-30T12:42:20.034Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:4776-4778
Timestamp: 2024-10-30T12:42:20.034Z
Learning: In the `controller/volume_controller.go` file's `ReconcileBackupVolumeState` function, when using `errors.Wrapf(err, ...)`, avoid including `err` in the format string, since `errors.Wrapf` already includes the original error `err`. This helps to prevent redundancy in error messages.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T04:23:30.091Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_volume_controller.go:240-242
Timestamp: 2024-10-29T04:23:30.091Z
Learning: In `controller/backup_volume_controller.go`, the `BackupVolumeDelete` method of `backupTargetClient` returns `nil` if the backup volume is not found, so it's unnecessary to check for 'not found' errors after calling this method.
Applied to files:
csi/node_server.go
📚 Learning: 2024-10-29T12:51:34.772Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: manager/volume.go:1180-1182
Timestamp: 2024-10-29T12:51:34.772Z
Learning: Ensure that when reviewing the `restoreBackingImage` function in `manager/volume.go`, verify that variables referenced in error messages are actually undefined before suggesting corrections.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2024-10-30T12:46:06.911Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3884-3894
Timestamp: 2024-10-30T12:46:06.911Z
Learning: In `controller/volume_controller.go`, within the `restoreVolumeRecurringJobs` function, avoid adding redundant checks for `v.Spec.BackupTargetName == ""`, as `GetBackupVolumeWithBackupTargetAndVolumeRO` already performs this validation, and `v.Spec.BackupTargetName` defaults to `default` when the custom resource is created.
Applied to files:
csi/node_server.go
📚 Learning: 2024-11-25T23:55:02.080Z
Learnt from: derekbit
Repo: longhorn/longhorn-manager PR: 3282
File: controller/monitor/node_upgrade_monitor.go:351-357
Timestamp: 2024-11-25T23:55:02.080Z
Learning: `GetVolumeRO` guarantees that `volume` is non-nil when `err == nil`, so explicit nil checks after error handling are not needed.
Applied to files:
csi/node_server.gocsi/util.go
📚 Learning: 2024-10-29T08:51:32.394Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/backup_controller.go:310-310
Timestamp: 2024-10-29T08:51:32.394Z
Learning: In `controller/backup_controller.go`, when retrieving `backupVolume` using `GetBackupVolumeWithBackupTargetAndVolume`, it can be `nil` without being an error, and the code should handle this case appropriately.
Applied to files:
csi/util.go
📚 Learning: 2024-10-30T12:51:18.255Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/volume_controller.go:3652-3669
Timestamp: 2024-10-30T12:51:18.255Z
Learning: In `volume_controller.go`, within the `getBackupVolumeInfo` method, avoid adding `nil` checks for the `volume` variable `v` because it is already guaranteed to be non-`nil` by an earlier check at the beginning of the `syncVolume` method.
Applied to files:
csi/util.go
📚 Learning: 2024-10-28T15:09:08.570Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: webhook/resources/volume/mutator.go:115-117
Timestamp: 2024-10-28T15:09:08.570Z
Learning: In `webhook/resources/volume/mutator.go`, within the `Create` method of `volumeMutator`, it's acceptable for `bv` to be `nil` after calling `v.ds.GetBackupVolumeWithBackupTargetAndVolumeRO`, as a `nil` `bv` is handled by the volume and engine controllers. Explicit `nil` checks in this context are not necessary.
Applied to files:
csi/util.go
📚 Learning: 2024-10-28T14:41:50.732Z
Learnt from: mantissahz
Repo: longhorn/longhorn-manager PR: 2182
File: controller/uninstall_controller.go:530-535
Timestamp: 2024-10-28T14:41:50.732Z
Learning: The duplicate backup target deletion code in the `UninstallController` is intentional, and the `deleteRecreatedCRs` method may be removed instead.
Applied to files:
csi/util.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build binaries
- GitHub Check: Summary
🔇 Additional comments (6)
csi/util.go (3)
426-440: LGTM! Enhanced unmount with lazy fallback.The addition of lazy unmount as a fallback for stuck mount points aligns well with the PR objectives. The use of
isNotMountedErrorto treat "not mounted" conditions as success is appropriate, and the logging provides good visibility for troubleshooting.
442-465: LGTM! Improved cleanup with explicit symlink handling.The refactored cleanup logic using
os.Lstatcorrectly handles broken symlinks (whichos.Statwould fail on), and the explicitos.Removecall with error checking ensures thorough cleanup. The logging of file mode information is helpful for troubleshooting.
575-581: String-based error detection is acceptable for mount errors.The function uses string matching to detect "not mounted" conditions, which is somewhat fragile but reasonable given that mount utilities typically return string-based errors. The checked strings ("not mounted" and "no mount point specified") cover the common cases.
Note: If future maintenance reveals additional error patterns that should be treated as "not mounted", consider adding them to this helper.
csi/node_server.go (3)
367-367: LGTM! More precise error message.The updated message "prepare mount point directory" correctly distinguishes the parent directory preparation from the bind mount point file preparation that follows.
370-373: LGTM! Clean integration of bind mount point preparation.The call to
ensureBindMountPointproperly prepares the target path for bind mounting, and the error message clearly distinguishes "mount point file" from the earlier "mount point directory" message, providing good clarity for troubleshooting.
376-384: Excellent error handling that preserves the mount failure.The updated logic correctly:
- Returns immediately on success (lines 377-379)
- Attempts cleanup on failure without shadowing the original mount error (lines 380-383)
- Returns the mount error (
mountErr) with full context (line 384)This perfectly aligns with the PR objective to "ensure the original mount error is not shadowed by subsequent cleanup logic." The error message provides comprehensive debugging information with both the device path and target path.
f7bb24e to
5f20e98
Compare
WebberHuang1118
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a nit, thanks for the enhancement.
longhorn-12006 Signed-off-by: Raphanus Lo <yunchang.lo@suse.com>
5f20e98 to
1b80f32
Compare
Which issue(s) this PR fixes:
Issue longhorn/longhorn#12006
What this PR does / why we need it:
Ensure the target mount point is unmounted and removed while node server handling a failed publish volume request.
Special notes for your reviewer:
Additional documentation or context