
Console View


Jorgen Lundman
Upstream: Memory leak in dsl_destroy_snapshots_nvl error case

Signed-off-by: Jorgen Lundman <lundman@lundman.net>

Pull-request: #10366 part 1/1
Brian Behlendorf
Revert "Let zfs mount all tolerate in-progress mounts"

This reverts commit a9cd8bf which introduced a segfault when running
`zfs mount -a` multiple times when there are mountpoints which are
not empty.  This segfault is now seen frequently by the CI after the
mount code was updated to directly call mount(2).

The original reason this logic was added is described in #8881.
Since then the systemd `zfs-share.target` has been updated to run
"After" the `zfs-mount.server` which should avoid this issue.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #9560

Pull-request: #10364 part 1/1
Brian Behlendorf
ZTS: Fix zfs_mount.kshlib cleanup

Update cleanup_filesystem to use destroy_dataset when performing
cleanup.  This ensures the destroy is retried if the pool is busy,
preventing occasional failures.
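
A minimal sketch of that pattern (helper and variable names assumed from the
ZTS library; illustrative only, not the exact change):

    # Retry-aware cleanup: destroy_dataset re-attempts the destroy while the
    # pool is busy instead of failing immediately.
    function cleanup_filesystem
    {
        datasetexists $TESTPOOL/$TESTFS && \
            destroy_dataset $TESTPOOL/$TESTFS -f
    }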

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10358
Brian Behlendorf
Changes

* Impose a maximum rebuild segment size which is less than the
  maximum block size.  This resolves the performance discrepancy
  between resilvers and rebuilds.  Allocating very large ABDs can
  be expensive and outweighs the marginal performance gain of
  >1MB IOs.

* Harden the import logic to gracefully handle the expected cases
  where the rebuild status might be damaged on disk.  We don't
  want this to ever prevent a pool import.

* Account for unflushed allocations and frees when calculating
  the ranges to be rebuilt.  Failure to include these ranges
  was resulting in a handful of checksum errors detected by a
  follow up verification scrub.

* Update the attach_rebuild, attach_resilver, replace_rebuild,
  and replace_resilver tests to verify the pool with a scrub
  after the rebuild completes.

* Fix lock inversion caused by calling spa_notify_waiters(spa)
  under the rebuild lock.  The locks are taken in the reverse
  order by spa_wait().

* Update ztest to occasionally perform rebuilds instead of
  resilvers for additional stress testing.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Pull-request: #10349 part 2/2
Brian Behlendorf
Device rebuild feature

The sequential device rebuild feature adds a more traditional
RAID rebuild mechanism to ZFS.  Specifically, it allows for mirror
vdevs to be rebuilt in LBA order.  Depending on the pool's average
block size, overall fragmentation, and performance characteristics
of the devices (e.g. SMR), a rebuild may restore redundancy in less
time than a resilver.  However, it cannot verify block checksums as
part of the rebuild, therefore it's recommended that a scrub be
run after the rebuild completes.

The new '-r' option has been added to the `zpool attach` and
`zpool replace` commands to request a rebuild instead of a resilver.

    zpool attach -r <pool> <existing vdev> <new vdev>
    zpool replace -r <pool> <old vdev> <new vdev>

The `zpool status` output was updated to report the progress of
a rebuild in a similar way to a normal resilver or scrub.  The
one notable difference is that multiple rebuilds may be in progress
concurrently as long as they're operating on different top-level
vdevs.  Rebuilds, resilvers, and scrubs are mutually exclusive
operations and only one at a time is permitted.

Additionally, the `zpool wait` command was updated with the
'rebuild' type to allow waiting on rebuild operations.
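
For example, a rebuild started with the new '-r' option could be waited on
as follows (illustrative only; -t selects the `zpool wait` activity type):

    zpool wait -t rebuild <pool>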

Device rebuilds cannot be supported for RAIDZ, but are compatible
with the dRAID feature being developed.

As part of this change the resilver_restart_* tests were moved
into the functional/replacement directory.  Additionally, the
replacement tests were renamed and extended to verify both
resilvering and rebuilding.

Original-patch-by: Isaac Huang <he.huang@intel.com>
Co-authored-by: Isaac Huang <he.huang@intel.com>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Pull-request: #10349 part 1/2
Alek Pinchuk
fix dnode eviction typo in arc_evict_state()

In addition, decrease the arc_prune_taskq max thread count to avoid
consuming all the cores when the system is under memory pressure and
is trying to reclaim memory from the dentry/inode caches.
Closes #7559

Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>

Pull-request: #10331 part 1/1
Michael Niewöhner
Rework stdio and printk compatibility

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 20/20
Michael Niewöhner
Another FreeBSD build fix

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 19/19
Michael Niewöhner
malloc

Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 18/18
  • Debian 8 arm (BUILD): cloning zfs -  stdio
  • Debian 8 ppc (BUILD): cloning zfs -  stdio
Michael Niewöhner
malloc

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 18/18
  • Ubuntu 18.04 x86_64 (STYLE): cloning zfs -  stdio
Michael Niewöhner
oops...

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 18/18
Michael Niewöhner
Fix codecov

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 17/17
Michael Niewöhner
aarch64: disable SIMD for zstd kernel space

Co-authored-by: Michael Niewöhner <foss@mniewoehner.de>
Co-authored-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10278 part 14/14
Michael Niewöhner
aarch64: disable SIMD for zstd kernel space

Add the disabling code to zstdlib-in.c and recombine zstdlib.c

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 14/14
Michael Niewöhner
aarch64: disable SIMD for zstd kernel space

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 14/14
Michael Niewöhner
another test

Signed-off-by: Michael Niewöhner <foss@mniewoehner.de>

Pull-request: #10278 part 14/14
Sebastian Gottschall
replace malloc wrapper with dummy

malloc / free / calloc are not actually called, so we don't need such a
wrapper.  Replace them with dummies.

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 24/24
Sebastian Gottschall
merge changes from PR10278

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 23/23
Sebastian Gottschall
some bsd fixes

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 26/26
  • Amazon 2 x86_64 (BUILD): cloning zfs -  stdio
  • Debian 8 arm (BUILD): cloning zfs -  stdio
  • Debian 8 ppc (BUILD): cloning zfs -  stdio
  • Ubuntu 16.04 aarch64 (BUILD): cloning zfs -  stdio
  • Kernel.org Built-in x86_64 (BUILD): cloning zfs -  stdio
  • Ubuntu 18.04 x86_64 (STYLE): cloning zfs -  stdio
Sebastian Gottschall
fix cstyle

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 26/26
Sebastian Gottschall
disable aarch64 neon assembly instructions

this is only required for kernel mode

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 25/25
Sebastian Gottschall
disable aarch64 neon assembly instructions

this is only required for kernel mode

Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>

Pull-request: #10277 part 25/25
  • Amazon 2 x86_64 (BUILD): cloning zfs -  stdio
  • Debian 8 arm (BUILD): cloning zfs -  stdio
  • Debian 8 ppc64 (BUILD): cloning zfs -  stdio
  • Debian 8 ppc (BUILD): cloning zfs -  stdio
  • Ubuntu 16.04 aarch64 (BUILD): cloning zfs -  stdio
  • Ubuntu 16.04 i386 (BUILD): cloning zfs -  stdio
  • Kernel.org Built-in x86_64 (BUILD): cloning zfs -  stdio
Brian Behlendorf
Fix dRAID sparing issues

* Fix unsparing for dRAID hot spares.  The distributed hot spares
  are integrated with the dRAID vdev and cannot be shared or removed.
  Therefore they must never be removed from the hot spares list.

* Update construct_spec() to only allow configurations where dRAID
  spares are used to replace primary vdevs, never special vdev types.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Pull-request: #10102 part 4/4
Brian Behlendorf
[WIP] Distributed Parity (dRAID) Feature

WARNING: This is still work in progress.  The user interface
and on-disk format have changed from previous versions of this
PR.  It is not compatible with previous versions.  The on-disk
format is not finalized and may continue to change in future
versions.

This patch adds a new top-level vdev type called dRAID, which
stands for Distributed parity RAID.  This pool configuration
allows all dRAID vdevs to participate when rebuilding to a hot
spare device.  This can substantially reduce the total time
required to restore full parity to a pool with a failed device.

A dRAID pool can be created using the new top-level `draid` type.
Like `raidz`, the desired redundancy is specified after the type:
`draid[1,2,3]`.  No additional information is required to create
the pool and reasonable default values will be chosen based on
the number of child vdevs in the dRAID vdev.

    zpool create <pool> draid[1,2,3] <vdevs...>

Unlike raidz, additional optional dRAID configuration values can
be provided as part of the draid type as colon separated values.
This allows administrators to fully specify a layout for either
performance or capacity reasons.  The supported options include:

  - draid[:<groups>g] - Redundancy groups
  - draid[:<spares>s] - Distributed hot spares (default 1)
  - draid[:<data>d]   - Data devices per group
  - draid[:<iter>i]   - Iterations performed when generating
                        a valid dRAID mapping (default 3)
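
For illustration only (all values are made up, and the syntax remains subject
to the compatibility warning above), a fully specified layout might be
requested as:

    zpool create <pool> draid2:4g:2s:8d:5i <vdevs...>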

As part of adding test coverage for the new dRAID vdev type,
the following options were added to the ztest command.  These
options are leveraged by the zloop.sh test script to test a
wide range of dRAID configurations.

  -K draid|raidz|random -- kind of RAID to test
  -D <value> -- dRAID data drives per redundancy group
  -G <value> -- dRAID redundancy group count
  -S <value> -- dRAID distributed spare drives
  -R <value> -- RAID parity (raidz or dRAID)
  -L        -- (Existing -G (dump log option) was renamed -L)
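
An illustrative invocation exercising the new flags (values chosen
arbitrarily; the remaining ztest defaults apply):

    ztest -K draid -D 4 -G 2 -S 1 -R 2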

The zpool_create, zpool_import, redundancy, replacement and
fault test groups have all been updated to provide test coverage
for the dRAID feature.

TODO:
- [x] - Rebased on master, will be frequently rebased from now on.
- [x] - Add dRAID config validation functionality.
- [x] - Enforced reasonable defaults to prevent harmful configs.
- [x] - Move common dRAID functions to zcommon.
- [x] - Replaced `draidcfg` command with `zpool create ...`.
- [x] - Add functionality to load/save known dRAID layouts.
- [x] - Convert custom dRAID debugging to normal ZFS debugging.
- [x] - Cleaned up 'zpool status' output.
- [x] - Permutations for 255 device pool reduce to fit in label.
- [x] - Rebuild works with virtual hot spare and physical device.
- [x] - Allow adding new top-level dRAID vdevs (no removal).
- [x] - Logical spares are now mandatory for dRAID.
- [x] - Update `ztest` to add dRAID pools to its pool layouts.
- [x] - User commands updated to detect dRAID kmod support.
- [x] - Resolve checksum errors for non-uniform groups in pools
- [x] - Investigate reducing the permutations size in the label.
- [x] - Review and update the sequential rebuild code.
- [x] - Debug (or remove) dRAID mirror code (currently disabled).
- [x] - Add `zpool replace` option to request rebuild or resilver.
- [x] - Add new and extend existing ZTS test cases.
- [x] - Investigate stale labels on disk preventing pool import.
- [x] - Verify checksum errors are reported correctly (zinject).
- [x] - Support 'zpool detach/attach' for rebuild during sparing.
- [x] - Add support for ZED to kick in logical spares.
- [x] - Developer documentation for vdev_rebuild.c
- [x] - Verify gang block handling works correctly.
- [x] - Verify rebuild/resilver `zpool status` reporting.
- [x] - Update packaging as needed.
- [x] - Add new and extend existing ZTS test cases.
- [x] - Documentation updates (man pages, comments, wiki, etc).
- [ ] - Verify corruption repair works correctly.
- [ ] - Implement vdev_xlate() to support initialize and trim.
- [ ] - Developer documentation for vdev_draid.c
- [ ] - Performance optimization / analysis.
- [ ] - Evaluate the existing rebuild throttle

Future work:
- [ ] - Extend `zdb` to print 2d array of permutation tables.
- [ ] - Add a utility to generate known balanced layouts.
- [ ] - Generate known balanced dRAID layouts for common configurations.

Co-authored-by: Isaac Huang <he.huang@intel.com>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Co-authored-by: Don Brady <don.brady@delphix.com>
Co-authored-by: Srikanth N S <nsrikanth@cray.com>
Co-authored-by: Stuart Maybee <smaybee@cray.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

External-issue: ZFS-12 ZFS-35 ZFS-36 ZFS-17 ZFS-56 ZFS-95 ZFS-96
External-issue: ZFS-100 ZFS-103 ZFS-106 ZFS-110 ZFS-111 ZFS-117
External-issue: ZFS-137 ZFS-139 ZFS-202
Issue #9558

Pull-request: #10102 part 3/4
Brian Behlendorf
Changes

* Impose a maximum rebuild segment size which is less than the
  maximum block size.  This resolves the performance discrepancy
  between resilvers and rebuilds.  Allocating very large ABDs can
  be expensive and outweighs the marginal performance gain of
  >1MB IOs.

* Harden the import logic to gracefully handle the expected cases
  where the rebuild status might be damaged on disk.  We don't
  want this to ever prevent a pool import.

* Account for unflushed allocations and frees when calculating
  the ranges to be rebuilt.  Failure to include these ranges
  was resulting in a handful of checksum errors detected by a
  follow up verification scrub.

* Update the attach_rebuild, attach_resilver, replace_rebuild,
  and replace_resilver tests to verify the pool with a scrub
  after the rebuild completes.

* Fix lock inversion caused by calling spa_notify_waiters(spa)
  under the rebuild lock.  The locks are taken in the reverse
  order by spa_wait().

* Update ztest to occasionally perform rebuilds instead of
  resilvers for additional stress testing.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Pull-request: #10102 part 2/4
Brian Behlendorf
Device rebuild feature

The sequential device rebuild feature adds a more traditional
RAID rebuild mechanism to ZFS.  Specifically, it allows for mirror
vdevs to be rebuilt in LBA order.  Depending on the pool's average
block size, overall fragmentation, and performance characteristics
of the devices (e.g. SMR), a rebuild may restore redundancy in less
time than a resilver.  However, it cannot verify block checksums as
part of the rebuild, therefore it's recommended that a scrub be
run after the rebuild completes.

The new '-r' option has been added to the `zpool attach` and
`zpool replace` commands to request a rebuild instead of a resilver.

    zpool attach -r <pool> <existing vdev> <new vdev>
    zpool replace -r <pool> <old vdev> <new vdev>

The `zpool status` output was updated to report the progress of
a rebuild in a similar way to a normal resilver or scrub.  The
one notable difference is that multiple rebuilds may be in progress
concurrently as long as they're operating on different top-level
vdevs.  Rebuilds, resilvers, and scrubs are mutually exclusive
operations and only one at a time is permitted.

Additionally, the `zpool wait` command was updated with the
'rebuild' type to allow waiting on rebuild operations.

Device rebuilds cannot be supported for RAIDZ, but are compatible
with the dRAID feature being developed.

As part of this change the resilver_restart_* tests were moved
into the functional/replacement directory.  Additionally, the
replacement tests were renamed and extended to verify both
resilvering and rebuilding.

Original-patch-by: Isaac Huang <he.huang@intel.com>
Co-authored-by: Isaac Huang <he.huang@intel.com>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

Pull-request: #10102 part 1/4