Skip to content

The Optimistic File System (OptFS) is a Linux ext4 variant that implements Optimistic Crash Consistency, a new approach to crash consistency in journaling file systems. OptFS improves performance for many workloads, sometimes by an order of magnitude. OptFS provides strong consistency, equivalent to data journaling mode of ext4.

License

ayoosh/optfs

 
 

Repository files navigation

The Optimistic File System

The Optimistic File System (OptFS) is Linux ext4 variant that implements Optimistic Crash Consistency, a new approach to crash consistency in journaling file systems. OptFS improves performance for many workloads, sometimes by an order of magnitude. OptFS provides strong consistency, equivalent to data journaling mode of ext4.

For technical details, please read the SOSP 2013 paper Optimistic Crash Consistency. Please cite this publication if you use this work.

Please feel free to contact me with any questions.

Description

The code is comprised of two parts:

  • The modified Linux 3.2 kernel.

  • The OptFS file-system module.

OptFS is referenced in the code as ext4bf (barrier-free version of ext4). The module can be found at fs/ext4bf/.

Compiling and installing the kernel

  • From the root kernel folder (where this README is found), do as root:

    mv kernel.config .config
    make -j4 && make modules && make modules_install && make install
  • Fix up grub (/etc/default/grub) as required.

  • Reboot into installed kernel.

Compiling and loading the OptFS module

Note: you must have already compiled and rebooted into the OptFS kernel before the following steps. The following scripts run OptFS on a new disk partition (/dev/sdb1). If required, you need to attach a second disk (/dev/sdb) to the machine, and create a partition (/dev/sdb1) on the disk.

  • mkdir /mnt/mydisk
  • Navigate to fs/ext4bf folder.

  • Make the file system (on /dev/sdb1) using:

    sh mk-big.sh

    Warning: this creates a file system on /dev/sdb1, erasing all previous content. The drive has to be big enough to contain a 16 GB journal.

  • Compile and load the module using

    sh load.sh

    This will result in an OptFS file system on /dev/sdb1, mounted at /mnt/mydisk.

Patches

We have included two patches: a full patch and an educational patch. The full patch is the diff of OptFS kernel with a vanilla linux 3.2 kernel. Applying this patch directly to a 3.2 kernel will result in the OptFS kernel.

The OptFS file-system module is modified on top of ext4 and jbd2. It contains the code of both ext4 and jbd2. Diffing this against just ext4 will result in a lot of redundant lines in the patch. Hence we have roughly merged ext4 and jbd2 into a module, and diffed OptFS against this. This patch is presented as optfs_patch_educational.

Note that optfs_patch_educational should never be applied to an actual system: this will not work. It is provided so that users can easily view the differences in code between OptFS and ext4, and is not meant to be applied directly.

Notes

OptFS is built upon asynchronous durability notifications from the disk. Since disks do not provide such functionality yet, we emulate such notifications using durability timeouts in OptFS. This is basically a guarantee from the disk that it will not cache a write request for longer than a certain amount of time. This is set to be 30 seconds in OptFS.

This needs to be tuned for the disk that OptFS runs on. If the disk does cache writes for more time than the OptFS setting, OptFS will not be able to guarantee consistency in the case of a crash.

Caveats

This version of the code provides only eventual durability by default: every OptFS (ext4bf) fsync() call behaves likes an osync(). Therefore, any unmodified application running on ext4bf behaves as if every fsync() was replaced by an osync(). This allows you to take any application and run it on OptFS to determine the maximum performance you could potentially get.

For applications requiring durability, please use the dsync() call. The code can be modified fairly easily so that the default behavior is dsync() instead of osync().

Note that the code is provided "as is": compiling and running the code will require some tweaking based on the operating system environment. The file system is only meant as a prototype and not meant for production use in any way.

About

The Optimistic File System (OptFS) is a Linux ext4 variant that implements Optimistic Crash Consistency, a new approach to crash consistency in journaling file systems. OptFS improves performance for many workloads, sometimes by an order of magnitude. OptFS provides strong consistency, equivalent to data journaling mode of ext4.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 96.3%
  • Assembly 2.2%
  • C++ 1.0%
  • Objective-C 0.2%
  • Shell 0.2%
  • Perl 0.1%