Skip to content

AltSysrq/libchistka

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

93 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

libchistka

INTRODUCTION

libchistka is a layer that can be inserted between the OS's system calls and
another program. It attempts to remedy power drain caused by programs which
perform IO irresponsibly.

SYSTEM REQUIREMENTS
A relatively modern UNIX system which supports LD_PRELOAD, and a C99
compiler. You also need GNU Autotools to set up the build system if cloning
directly from Github. Chistka is known to work on:
  GNU/Linux
  FreeBSD

The system-wide install has only been tested in Debian Squeeze and Wheezy on
x86 and AMD64, but it works correctly most of the time (about 1 in 50 boot-ups
hang).

THE READ PROBLEM

A number of programs will open a file for reading, then slowly read it in, or
jump around reading small pieces. If enough time elapses between these reads,
the harddisk may have spun down and will then need to spin back up. Power is
lost both by the new spinup of the disk, as well as the time it spent idly
spinning waiting for new reads.

An example of this problem is Firefox. Whenever a website is accessed, it will
check for a cookie or other such data, often spinning the HDD back up just for
a miniscule data access.

libchistka uses five methods to mitigate the read problem:

1. Programs usually use mostly the same files on each run. The first 16 MB (by
default) of files read by the program more than 5 (by default) seconds after
the program started are automatically read in when it starts.

2. Traversing one part of a directory usually results in traversing more of
it. Whenever a file is opened, its containing directory is iterated if it is
the first time that a file in that directory was opened.

3. Opening a file for reading usually indicates that the whole file will be
read at some point. When a file is opened for reading, data from the file (up
to 64 MB by default) is read before returning the file handle, the first time a
given file is opened.

4. Reading one file in a directory is usually followed by reading others (eg,
cookies in Firefox). When a file is openned for reading whose name matches a
configured pattern(s), other files in that directory will be read in the
background.

5. libchistka may be configured to deny the existence of files matching certain
patterns.

THE FSYNC PROBLEM

Normally, a program uses fsync() or such because it is important. For example,
you saved a file in your text editor, so you want to make sure it actually
makes it to disk, in case the system crashes or you lose power. On a system
like a netbook though, neither of these are a risk if the OS is stable, and the
programs which abuse fsync() become a problem.

Firefox again is a good example of an abuser. It normally fsync()s a few times
for every cookie modification and every history addition, waking the hard disk
up for almost every page load.

libchistka replaces fsync() and friends (msync(), sync(), syncfs(), and
fdatasync()) with no-ops, and removes O_SYNC, O_DSYNC, and O_RSYNC flags from
calls to open(). The end effect is better performance and less disk awakening,
in exchange for losing all guarantees of data integrity.

(This functionality is a rough clone of libeatmydata.)

A related issue is Btrfs's transaction system; in recent kernels, flushoncommit
is the default, and there is no way to turn it off. If you have the Btrfs
headers when you compile, libchistka will also prevent userspace programs from
performing transactions. (Note that this could be very dangerous!)

THE IRRELEVANT POLL PROBLEM

Some programs can be observed (with strace) to poll() a set of several files,
and then only checking a couple when poll() returns. This causes the program to
wake up often and accomplish nothing, thus needlessly keeping the CPU in higher
energy states and resulting in a few more context switches.

Some X programs are also guilty of using a needlessly small timeout with
poll(), exacerbating the problem.

libchistka works around this by rewriting calls to poll() to account for two
things. First, file descriptors which are repeatedly poll()ed then ignored will
be deleted from poll() calls. Secondly, the timout, when not negative
(indicating infinity), is clamped to a minimum, defaut 1 second. Both are
disabled by default since it could break programs which actually use poll()
entirely correctly.

(Firefox can be observed to poll() with a timeout of 0 several times per
second, even when idle. Forcing the delay on it kills responsiveness, which
would suggest it doesn't poll() X... for the time being, the above worarounds
don't help Firefox.)

INSTALLING

Dependencies:
  Just make and a C compiler. GNU Autotools as well if working directly from
  the git distribution.

Building:
  If configure does not exist in the project directory, first run
    mkdir m4
    autoreconf -i

  In either case, then run
    ./configure
    make

Installing:
  As root, run
    make install
  (on Ubuntu and such just run "sudo make install" as yourself)

Uninstalling:
  make uninstall
or
  sudo make uninstall


CONFIGURATION

The environment variable CHISTKA_DENY may be set to a colon-separated list of
glob patterns. If any file to be opened for reading matches any of the patterns
(see fnmatch(3)), libchistka will deny the file's existence. Files which are
opened for writing at any time are whitelisted and thereafter not affected by
this option. For example, specifying
  */Cache/?/*:*/OfflineCache/?/*
would prevent Firefox from reading old cache files.

The environment variable CHISTKA_READAHEAD indicates how many megabytes should
be read from each file opened for reading. If unset, it defaults to 64. Setting
it to zero effectively disables readahead.

The amount of time, in seconds, between the signaling of an event and its
processing can be altered with the environment variable CHISTKA_DELAY. The
default value is 5 seconds.

If the environment variable CHISTKA_PROFILE exists, a list of filenames is read
from the file specified in that variable, one file per line.  CHISTKA_DELAY
seconds after the program starts, the contents of these files are read in, up
to CHISTKA_PREREAD megabytes. CHISTKA_PREREAD defaults to 16 megabytes.

The environment variable CHISTKA_SIBLINGS can be set to indicate the number of
megabytes to read from sibling files an a directory. By default it is 16.

CHISTKA_POLL_TIMEOUT indicates the minimum duration of the timeout in calls to
poll(), in milliseconds. The default is 0. Specifying a negative value forces
all timeouts to infinity; this is probably a bad idea for most programs and
should be used with care.

The number of times a file descriptor can be poll()ed and ignored can be
changed with CHISTKA_POLL_IGNORE. It defaults to -1. Setting it to -1
effectively disables this behaviour.

Setting the environment variable CHISTKA_DISABLE to anything disables ALL
special behaviour provided by the shim, and causes all calls to be passed to
the standard C library (or whatever follows chistka in the chain) verbatim.

SYSTEM-WIDE INSTALL

WARNING: The directions in this section could completely break your system!

On GNU/Linux (and possibly others), you can apply libchistka to all programs
ever run on your operating system. After installing libchistka, add
  /path/to/libchistka.so
to
  /etc/ld.so.preload
where /path/to is your prefix (/usr/local/lib by default). Make sure the prefix
is somewhere that is accessable at early boot time; if in doubt, reconfigure
and recompile with
  ./configure --prefix=/
  make clean all install

After writing /etc/ld.so.preload, it is advisable to
  unchistka sync

Note that there is currently no easy way to apply non-default settings or
profiles to the whole system.

Be careful with a system-wide install, as it could break everything. It is
known to work correctly on Debian Squeeze and Wheezy.

ISSUES

In the rare case of a program spawning multiple threads before any call to
open(), where both threads will call open() at about the same time, and when
that program was created by a call to exec() in a program libchistka was
already loaded into, there is an unavoidable race condition that may cause
problems.

If a program accesses a file in a directory and seconds later attempts to get
an exclusive lock in a file within, it may collide with the daemon trying to
preread that file.

Concurrent modification of files may give unexpected results in corner
cases. For example, if the host process opens a file in read-append mode,
libchistka will restore the file pointer to its original location after
prereading, which might no longer be the end of the file (or might not exist
anymore).

On Debian Sid, concurrent startup (the default) sometimes deadlocks at the end
of runlevel S.

On Debian Sid, libchistka deadlocks the update-mandb trigger in the APT
installation process. Debian (and Ubuntu) users are recommended to run
aptitude/dpkg/apt-get, etc, within unchistka for the time being to avoid this
problem.

About

For when power is more important than integrity. (Save energy at the risk of data loss and minor instability.)

Resources

License

Stars

Watchers

Forks

Packages

No packages published