Today, we changed the default recommended filesystem in the #gentoo handbook from ext4 to xfs.
XFS is robust and has all the niceties of ext4 (modulo shrinking) but with modern features on top.
(It helps that xfsprogs doesn't have weird endianness bugs too..)
btrfs is of course another option for the future but our users are conservative in some respects, and baby steps > none.
The main benefit of this is reflinks and copy_file_range, which automatically takes advantage of that.
Now, as for using it..
Portage uses copy_file_range and friends when merging packages to the live filesystem when it can, by providing wrappers of some Python stdlib functions.
If XFS (or another "good" backing fs) is used, it can take advantage of it for you.
We've found a bunch of bugs over the years through this, in various filesystems, actually!
Take a look at https://wiki.gentoo.org/wiki/User:Sam/Memorable_bugs_I_like_to_reference for a list.
Unfortunately, Python itself doesn't automagically do this yet:
But you can see how we handle it at https://github.com/gentoo/portage/blob/22e027aef2ddb49d1c4e2423b5b1f3c209ac8efe/src/portage_util_file_copy_reflink_linux.c if interested.
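For a rough idea of what such a wrapper does, here's a minimal Python sketch. To be clear, this is not Portage's code (that's the C file linked above); the name copy_file is made up for illustration, and the "fall back on any OSError" policy is an assumption chosen for simplicity. It only relies on os.copy_file_range, which Python exposes on Linux.

```python
import os


def copy_file(src, dst):
    """Copy src to dst, preferring copy_file_range(2). Returns bytes copied.

    Hypothetical helper for illustration only, not Portage's implementation.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        size = os.fstat(fin.fileno()).st_size
        copied = 0

        if hasattr(os, "copy_file_range"):
            try:
                # The kernel may copy fewer bytes than requested, so loop.
                # With no explicit offsets, both fd positions advance.
                while copied < size:
                    n = os.copy_file_range(
                        fin.fileno(), fout.fileno(), size - copied
                    )
                    if n == 0:
                        break
                    copied += n
            except OSError:
                # Assumption: treat any failure (ENOSYS, EXDEV, EINVAL, ...)
                # as "not usable here". A real I/O problem will show up
                # again in the plain copy below.
                pass

        if copied < size:
            # Plain read/write fallback, resuming where the kernel stopped.
            fin.seek(copied)
            fout.seek(copied)
            while True:
                chunk = fin.read(1 << 16)
                if not chunk:
                    break
                fout.write(chunk)
                copied += len(chunk)

        return copied
```

On a filesystem with reflink support, the kernel can satisfy the copy_file_range call by sharing extents rather than duplicating data; elsewhere you still get an in-kernel copy or the plain fallback.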
Fortunately, other libraries are starting to allow transparent use!
Jannik Glückert recently implemented support in libstdc++ [0][1] which will arrive in GCC 14.
[0] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f80a8b42296265bb868a48592a2bd1fdaa2a3d8a
[1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d87caacf8e2df563afda85f3a5b7b852e08b6b2c
glib will do the same in >=2.78 thanks to https://gitlab.gnome.org/GNOME/glib/-/issues/2863.
EDIT: I forgot to mention, KDE's kio handles it too. So, together, the respective KDE and GNOME file managers and such should be fine.
And last but in no way least, GNU coreutils 9.0 defaults to --reflink=auto for cp and install.
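To make the "auto" behaviour concrete, here's a small Python sketch of the same idea: try to clone the file, and quietly fall back to an ordinary copy if the filesystem can't do it. This is not how coreutils is implemented; copy_reflink_auto is a made-up name, and the FICLONE value below is the usual Linux encoding (x86, ARM and friends), which I'm assuming rather than reading from the headers at runtime.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int).
# Assumption: this is the common encoding; a few architectures use
# different ioctl direction bits and would need another value.
FICLONE = 0x40049409


def copy_reflink_auto(src, dst):
    """Copy src to dst. Returns True if cloned, False if copied normally.

    Hypothetical helper for illustration only.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        try:
            # Ask the filesystem to make dst share src's extents.
            fcntl.ioctl(fout.fileno(), FICLONE, fin.fileno())
            return True
        except OSError:
            # Not supported here (wrong fs, cross-device, ...): plain copy.
            shutil.copyfileobj(fin, fout)
            return False
```

Calling copy_reflink_auto("a", "b") on XFS or btrfs should return True near-instantly regardless of file size; on something like ext4 it returns False after a regular copy.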
Let me know if there's other software out there which supports this transparently.
Reflinks aren't particularly new, and nor is copy_file_range or even sendfile (see the GCC commits), but it's interesting how often these improvements are sitting on the table waiting for someone to claim them.
Check the guts of your favourite libraries, you might be surprised!
Bonus: I should say, zfs doesn't yet support this in a release on Linux, surprisingly.
Support was recently added on master in https://github.com/openzfs/zfs/pull/13392 but only for FreeBSD. I believe Linux support is in the works.
Bonus2: Portage >=3.0.48 won't overwrite identical files on the target filesystem as well: https://bugs.gentoo.org/722270. Thanks to Michael Egger for the contribution!
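The general idea of skipping identical files can be sketched in a few lines of Python. This is only an illustration of the concept, not what Portage actually does (see the bug for that); install_if_changed is a hypothetical name, and using a full byte-for-byte comparison is my assumption.

```python
import filecmp
import os
import shutil


def install_if_changed(src, dst):
    """Copy src over dst unless dst already has identical contents.

    Returns True if a copy happened. Hypothetical helper for illustration.
    """
    # shallow=False compares contents rather than trusting stat() alone.
    if os.path.isfile(dst) and filecmp.cmp(src, dst, shallow=False):
        return False
    shutil.copyfile(src, dst)
    return True
```

Leaving an identical file alone avoids a needless write, and on a reflink-capable filesystem it also leaves any extents the target already shares untouched.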
Some more:
Emacs does for copy-file: https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=486a81f387bb59b2fbbc6aff7b41adbe1621394e (thanks to Spawns_Carpeting for finding this)
Ruby's IO.copy_stream does, and e.g. FileUtils::copy_file benefits from it as a result
PHP supports it in >= 8.2: https://github.com/php/php-src/commit/fa6d97db5d941451615e491034918cdbaa5164bd
Seems bizarre that Emacs is further ahead than Python on this..
OpenJDK does not support it (WONTFIX'd), seemingly based on a misunderstanding of what it's for: https://bugs.openjdk.org/browse/JDK-8282039. They tested it out and saw they got the same performance without (seemingly) trying a filesystem that would benefit from it.
rsync does not support it, but they're open to it: https://github.com/WayneD/rsync/issues/153.
I've proposed it for Perl: https://www.nntp.perl.org/group/perl.perl5.porters/2023/07/msg266636.html.
@thesamesam Reflinks live!!!
@thesamesam chimerautils uses copy_file_range for cp(1), due to its freebsd origin (where copy_file_range was first introduced)
@thesamesam Sadly I think this means fallocate doesn't work and you can ENOSPC even after attempting to reserve space. Basically, overcommit for disk.