
Console View


Matthew Macy
async COW part 2 - factor out dbuf_sync_bonus

Signed-off-by: Matt Macy <mmacy@FreeBSD.org>

Pull-request: #9208 part 2/2
Matthew Macy
Convert dbuf dirty record list to a list_t

Additionally pull in state machine comments about
upcoming async cow work.

Signed-off-by: Matt Macy <mmacy@FreeBSD.org>

Pull-request: #9208 part 1/2
Matthew Macy
openzfs restructuring part 2

move platform specific module source files
and update makefiles accordingly

Signed-off-by: Matthew Macy <mmacy@FreeBSD.org>

Pull-request: #9206 part 2/2
  • Ubuntu 18.04 x86_64 (TEST): zfstests failed -  stdiotests
Matthew Macy
openzfs restructuring part 1
move platform specific headers

Signed-off-by: Matthew Macy <mmacy@FreeBSD.org>

Pull-request: #9206 part 1/2
Matthew Macy
openzfs restructuring part

move platform specific module source files
and update makefiles accordingly

Signed-off-by: Matthew Macy <mmacy@FreeBSD.org>

Pull-request: #9206 part 2/2
Matthew Macy
openzfs restructuring part 1
move platform specific headers

Signed-off-by: Matthew Macy <mmacy@FreeBSD.org>

Pull-request: #9206 part 1/2
Paul Dagnelie
Fix install error introduced by #9089

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9205 part 1/1
Matthew Macy
Convert dbuf dirty record list to a list_t

Additionally pull in state machine comments about
upcoming async cow work.

Signed-off-by: Matt Macy <mmacy@FreeBSD.org>

Pull-request: #9200 part 1/1
Paul Dagnelie
feedback

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9197 part 4/4
Paul Dagnelie
DLPX-65047 [Backport of Issue DLPX-63811 to 5.3.5.0] when device removal is cancelled with loaded metaslabs, assertion fails

Reviewed at: http://reviews.delphix.com/r/50310/

Pull-request: #9197 part 3/4
Paul Dagnelie
DLPX-65016 [Backport of Issue DLPX-65015 to 5.3.5.0] keep even more metaslabs loaded

Reviewed at: http://reviews.delphix.com/r/50808/

Pull-request: #9197 part 2/4
Paul Dagnelie
keep more metaslabs loaded

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9197 part 1/4
Paul Dagnelie
DLPX-65047 [Backport of Issue DLPX-63811 to 5.3.5.0] when device removal is cancelled with loaded metaslabs, assertion fails

Reviewed at: http://reviews.delphix.com/r/50310/

Pull-request: #9197 part 3/3
Paul Dagnelie
DLPX-65016 [Backport of Issue DLPX-65015 to 5.3.5.0] keep even more metaslabs loaded

Reviewed at: http://reviews.delphix.com/r/50808/

Pull-request: #9197 part 2/3
Paul Dagnelie
keep more metaslabs loaded

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9197 part 1/3
Matthew Macy
add callbacks to taskq create

Signed-off-by: Matt Macy <mmacy@FreeBSD.org>

Pull-request: #9192 part 1/1
Tony Hutter
Squash zfs-0.8.2 commits for testing

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

Pull-request: #9164 part 2/2
Tony Hutter
Tag zfs-0.8.1

META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>

Pull-request: #9164 part 1/2
Tony Hutter
Minimize aggsum_compare(&arc_size, arc_c) calls.

In a busy ARC, having arc_size close to arc_c is the desired state.  But
then it is quite likely that aggsum_compare(&arc_size, arc_c) will need
to flush per-CPU buckets to find the exact comparison result.  Doing that
often in a hot path defeats the whole idea of using aggsum there, since it
replaces a few simple atomic additions with dozens of lock acquisitions.

Replacing aggsum_compare() with aggsum_upper_bound() in the code that
increases arc_p when the ARC is growing (arc_size < arc_c) saves, according
to PMC profiles, ~5% of CPU time in aggsum code during sequential writes
to 12 ZVOLs with a 16KB block size on a large dual-socket system.

I suppose there is some minor arc_p behavior change due to the lower
precision of the new code, but I don't think it is a big deal, since it
should affect only a very small window in time (aggsum buckets are flushed
every second) and in ARC size (buckets are limited to 10 average ARC blocks
per CPU).
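
The trade-off described above can be illustrated with a small single-threaded sketch. This is not the aggsum implementation: the struct, the function names, and the constants NBUCKETS and BUCKET_LIMIT are all hypothetical, and the locking that makes the exact path expensive in practice is only noted in comments. It shows why an upper bound can be read cheaply while an exact sum has to visit every bucket.

```c
#include <assert.h>
#include <stdint.h>

#define	NBUCKETS	4	/* hypothetical stand-in for the CPU count */
#define	BUCKET_LIMIT	10	/* hypothetical cap on an unflushed bucket */

struct approx_sum {
	int64_t flushed;		/* value already folded into the shared total */
	int64_t delta[NBUCKETS];	/* pending per-bucket changes */
};

/* Hot path: touch one bucket; spill into the shared total once it grows too large. */
static void
approx_add(struct approx_sum *s, int bucket, int64_t n)
{
	assert(bucket >= 0 && bucket < NBUCKETS);
	s->delta[bucket] += n;
	if (s->delta[bucket] > BUCKET_LIMIT ||
	    s->delta[bucket] < -BUCKET_LIMIT) {
		s->flushed += s->delta[bucket];
		s->delta[bucket] = 0;
	}
}

/* Cheap: reads only the shared total, never the buckets. */
static int64_t
approx_upper_bound(const struct approx_sum *s)
{
	return (s->flushed + (int64_t)NBUCKETS * BUCKET_LIMIT);
}

/* Expensive: visits every bucket (a concurrent version would lock each one). */
static int64_t
approx_exact(const struct approx_sum *s)
{
	int64_t total = s->flushed;

	for (int i = 0; i < NBUCKETS; i++)
		total += s->delta[i];
	return (total);
}

/* If even the upper bound is below the target, the exact value is too. */
static int
definitely_below(const struct approx_sum *s, int64_t target)
{
	return (approx_upper_bound(s) < target);
}
```

The cost of the cheap check is precision: when the exact value is below the target but within NBUCKETS * BUCKET_LIMIT of it, definitely_below() reports 0, which corresponds to the small window mentioned in the commit message.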

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #8901

Pull-request: #9161 part 30/91
Tony Hutter
Python config cleanup

Don't require Python at configure/build unless building pyzfs.
Move ZFS_AC_PYTHON_MODULE to always-pyzfs.m4 where it is used.
Make test syntax more consistent.

Sponsored by: iXsystems, Inc.
Reviewed-by: Neal Gompa <ngompa@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@ixsystems.com>
Closes #8895

Pull-request: #9161 part 29/91
Tony Hutter
lz4_decompress_abd declared but not defined

`lz4_decompress_abd` is declared in zio_compress.h but it is not defined
anywhere. The declaration should be removed.

Reviewed-by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-47477
Closes #8894

Pull-request: #9161 part 28/91
Tony Hutter
panic in removal_remap test on 4K devices

If the zfs_remove_max_segment tunable is changed to be not a multiple of
the sector size, then the device removal code will malfunction and try
to create mappings that are smaller than one sector, leading to a panic.

On debug bits this assertion will fail in spa_vdev_copy_segment():
    ASSERT3U(DVA_GET_ASIZE(&dst), ==, size);

On nondebug, the system panics with a stack like:
    metaslab_free_concrete()
    metaslab_free_impl()
    metaslab_free_impl_cb()
    vdev_indirect_remap()
    free_from_removing_vdev()
    metaslab_free_impl()
    metaslab_free_dva()
    metaslab_free()

Fortunately, the default for zfs_remove_max_segment is 1MB, so this
can't occur by default.  We hit it during this test because
removal_remap.ksh changes zfs_remove_max_segment to 1KB. When testing on
4KB-sector disks, we hit the bug.

This change makes the zfs_remove_max_segment tunable more robust,
automatically rounding it up to a multiple of the sector size. We also
turn some key assertions into VERIFY's so that similar bugs would be
caught before they are encoded on disk (and thus avoid a
panic-reboot-loop).
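
The rounding described above can be sketched as follows. This is a minimal illustration and not the code from the change: round_up_pow2() and effective_max_segment() are hypothetical names, the sector size is assumed to be a power of two given as its log2 (ashift), and overflow near UINT64_MAX is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round `value` up to the next multiple of `align`.
 * `align` must be a nonzero power of two, as sector sizes
 * such as 512 or 4096 are.
 */
static uint64_t
round_up_pow2(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((value + align - 1) & ~(align - 1));
}

/*
 * Hypothetical helper: turn a user-supplied maximum segment size into
 * a whole number of sectors, never smaller than one sector.
 * `ashift` is log2 of the sector size (9 for 512 B, 12 for 4 KB).
 */
static uint64_t
effective_max_segment(uint64_t tunable, unsigned ashift)
{
	uint64_t sector = UINT64_C(1) << ashift;
	uint64_t rounded = round_up_pow2(tunable, sector);

	return (rounded < sector ? sector : rounded);
}
```

With the values from the failing test, a 1KB setting on a 4KB-sector device becomes one full sector, while the 1MB default is already aligned and passes through unchanged.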

Reviewed-by: Sean Eric Fagan <sef@ixsystems.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-61342
Closes #8893

Pull-request: #9161 part 27/91
Tony Hutter
compress metadata in later sync passes

Starting in sync pass 5 (zfs_sync_pass_dont_compress), we disable
compression (including of metadata).  Ostensibly this helps the sync
passes to converge (i.e. for a sync pass to not need to allocate
anything because it is 100% overwrites).

However, in practice it increases the average number of sync passes,
because when we turn compression off, the sizes of many blocks will change
and thus we have to re-allocate (not overwrite) them.  It also increases
the number of 128KB allocations (e.g. for indirect blocks and spacemaps)
because these will not be compressed.  The 128K allocations are
especially detrimental to performance on highly fragmented systems,
which may have very few free segments of this size, and may need to load
new metaslabs to satisfy 128K allocations.

We should increase zfs_sync_pass_dont_compress.  In practice on a highly
fragmented system we see a few 5-pass txg's, a tiny number of 6-pass
txg's, and no txg's with more than 6 passes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-63431
Closes #8892

Pull-request: #9161 part 26/91
Tony Hutter
systemd encryption key support

Modify zfs-mount-generator to produce a dependency on new
zfs-import-key-*.service units, dynamically created at boot to call
zfs load-key for the encryption root, before attempting to mount any
encrypted datasets.

These units are created by zfs-mount-generator.  They set RequiresMountsFor
on the keyfile, if present, or call systemd-ask-password if a passphrase is
requested.

This patch includes suggestions from @Fabian-Gruenbichler, @ryanjaeb and
@rlaager, as well as an adaptation of @rlaager's script to retry on
incorrect password entry.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <antonio.e.russo@gmail.com>
Closes #8750
Closes #8848

Pull-request: #9161 part 25/91
Tony Hutter
Move write aggregation memory copy out of vq_lock

A memory copy is too heavy an operation to do under the congested lock.
Moving it out reduces congestion many times over, to almost invisible.
Since the original zio has been removed from the queue, and the child zio
has not been executed yet, I don't see why the copy would need protection.
My guess is that it just remained like this from the time when the lock was
not dropped here; the drop was added later to fix a lock ordering issue.

Multi-threaded sequential write tests with both HDD and SSD pools
with ZVOL block sizes of 4KB, 16KB, 64KB and 128KB all show a major
reduction of lock congestion, saving from 15% to 35% of CPU time
and increasing throughput by 10% to 40%.

Reviewed-by: Richard Yao <ryao@gentoo.org>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #8890

Pull-request: #9161 part 24/91
Tony Hutter
Restrict filesystem creation if the name refers to either '.' or '..'

This change restricts filesystem creation if the given name
contains either '.' or '..'
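
A check of this kind might look like the sketch below. It assumes that "contains" means a whole '/'-separated component equal to "." or "..", so that a name such as pool/fs.1 is still accepted; that reading is an assumption, has_dot_component() is a hypothetical name, and the other dataset-name delimiters are not handled.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if any '/'-separated component of `name` is exactly
 * "." or "..".  Components that merely contain a dot do not match.
 */
static bool
has_dot_component(const char *name)
{
	const char *p = name;

	while (*p != '\0') {
		/* Length of the current component, up to '/' or the end. */
		size_t len = strcspn(p, "/");

		if ((len == 1 && p[0] == '.') ||
		    (len == 2 && p[0] == '.' && p[1] == '.'))
			return (true);

		p += len;
		if (*p == '/')
			p++;	/* step over the separator */
	}
	return (false);
}
```

A caller would reject the creation request when this returns true, before any on-disk state is touched.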

Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: TulsiJain <tulsi.jain@delphix.com>
Closes #8842
Closes #8564

Pull-request: #9161 part 23/91
Tony Hutter
ztest: dmu_tx_assign() gets ENOSPC in spa_vdev_remove_thread()

When running zloop, we occasionally see the following crash:

    dmu_tx_assign(tx, TXG_WAIT) == 0 (0x1c == 0)
    ASSERT at ../../module/zfs/vdev_removal.c:1507:spa_vdev_remove_thread()
    /sbin/ztest(+0x89c3)[0x55faf567b9c3]

The error value 0x1c is ENOSPC.

The transaction used by spa_vdev_remove_thread() should not be able to
fail due to being out of space, i.e. we should not call
dmu_tx_hold_space().  This will allow the removal thread to schedule its
work even when the pool is low on space.  The "slop space" will provide
enough free space to sync out the txg.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-37853
Closes #8889

Pull-request: #9161 part 22/91
Tony Hutter
Fix lockdep warning on insmod

sysfs_attr_init() is required to make lockdep happy for dynamically
allocated sysfs attributes. This fixed #8868 on Fedora 29 running
kernel-debug.

This requirement was introduced in 2.6.34.
See include/linux/sysfs.h for what it actually does.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #8868
Closes #8884

Pull-request: #9161 part 21/91
Tony Hutter
fat zap should prefetch when iterating

When iterating over a ZAP object, we're almost always certain to iterate
over the entire object. If there are multiple leaf blocks, we can
realize a performance win by issuing reads for all the leaf blocks in
parallel when the iteration begins.

For example, if we have 10,000 snapshots, "zfs destroy -nv
pool/fs@1%9999" can take 30 minutes when the cache is cold. This change
provides a >3x performance improvement, by issuing the reads for all ~64
blocks of each ZAP object in parallel.

Reviewed-by: Andreas Dilger <andreas.dilger@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-58347
Closes #8862

Pull-request: #9161 part 20/91
Tony Hutter
Target ARC size can get reduced to arc_c_min

Sometimes the target ARC size is reduced to arc_c_min, which impacts
performance.  We've seen this happen as part of the random_reads
performance regression test, where the ARC size is reduced before the
reads test starts, which impacts how long it takes for the system to reach
good IOPS performance.

We call arc_reduce_target_size when arc_reap_cb_check() returns TRUE,
and arc_available_memory() is less than arc_c>>arc_shrink_shift.

However, arc_available_memory() could easily be low, even when arc_c is
low, because we can have tons of unused bufs in the abd kmem cache. This
would be especially true just after the DMU requests a bunch of stuff be
evicted from the ARC (e.g. due to "zpool export").

To fix this, the ARC should reduce arc_c by the requested amount, not
all the way down to arc_size (or arc_c_min), which can be very small.
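
The intended behavior can be sketched in a few lines. This is an illustration under stated assumptions and not the code in arc.c: reduce_target() and its parameters are hypothetical names, with `c` standing for the current target, `c_min` for the floor, and `to_free` for the requested reduction.

```c
#include <stdint.h>

/*
 * Shrink the target `c` by `to_free` bytes, but never below `c_min`.
 * The result depends only on the requested amount, not on how small
 * the current cache contents happen to be.
 */
static uint64_t
reduce_target(uint64_t c, uint64_t c_min, uint64_t to_free)
{
	if (c <= c_min)
		return (c);		/* already at or below the floor */
	if (to_free >= c - c_min)
		return (c_min);		/* clamp at the floor */
	return (c - to_free);
}
```

Comparing against `c - c_min` before subtracting keeps the arithmetic free of unsigned underflow.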

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
External-issue: DLPX-59431
Closes #8864

Pull-request: #9161 part 19/91
Tony Hutter
Fix typo in vdev_raidz_math.c

Fix typo in vdev_raidz_math.c

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brad Forschinger <github@bnjf.id.au>
Closes #8875
Closes #8880

Pull-request: #9161 part 18/91
Tony Hutter
Block_device_wait does not return an error code

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Giuseppe Di Natale <guss80@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Elling <Richard.Elling@RichardElling.com>
Closes #8839

Pull-request: #9161 part 17/91
Tony Hutter
Allow metaslab to be unloaded even when not freed from

On large systems, the memory used by loaded metaslabs can become
a concern. While range trees are a fairly efficient data structure,
on heavily fragmented pools they can still consume a significant
amount of memory. This problem is amplified when we fail to unload
metaslabs that we aren't using. Currently, we only unload a metaslab
during metaslab_sync_done; in order for that function to be called
on a given metaslab in a given txg, we have to have dirtied that
metaslab in that txg. If the dirtying was the result of an allocation,
we wouldn't be unloading it (since it wouldn't be 8 txgs since it
was selected), so in effect we only unload a metaslab during txgs
where it's being freed from.

We move the unload logic from sync_done to a new function, and
call that function on all metaslabs in a given vdev during
vdev_sync_done().

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #8837

Pull-request: #9161 part 16/91
Tony Hutter
Avoid updating zfs_gitrev.h when rev is unchanged

The build process would always recompile spa_history.c because
zfs_gitrev.h was touched on every build.  Avoid this when the gitrev has
not changed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Allan Jude <allanjude@freebsd.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #8860

Pull-request: #9161 part 15/91
Tony Hutter
l2arc_apply_transforms: Fix typo in comment

Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Matt Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Allan Jude <allanjude@freebsd.org>
Closes #8822

Pull-request: #9161 part 14/91
Paul Dagnelie
resolve test issues

Pull-request: #9134 part 3/3
  • Fedora 30 x86_64 (TEST): zfstests failed -  stdio
Paul Dagnelie
Fix #! line

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9134 part 2/3
Paul Dagnelie
add regression test for "zpool list -p"

Signed-off-by: Paul Dagnelie <pcd@delphix.com>

Pull-request: #9134 part 1/3