Garbage Collection in LogFS

A Presentation by Jörn Engel

LogFS [1] is the codename for a new flash filesystem, aimed
to replace JFFS2 as the standard flash filesystem for Linux.
Motivation as well as a brief overview of LogFS have been presented
[2] at Linux Kongress 2005 in Hamburg.

As flash technology prohibits in-place updates (see [2]), LogFS has a
log-structured design, similar to Sprite LFS [3]. However, while the
log-structured approach solves one problem, it creates another, namely
the need for Garbage Collection. Usually performed by a so-called
"cleaner" thread, Garbage collection can causes problems calculating
the amount of free space needed to complete an update [4]. Also, the
very act of moving old data to a new location creates a significant
overhead.

In principle, Garbage Collection is implemented by collecting data
that is still valid and moving it to new segments (or eraseblocks in
flash terminology). The long-term goal of Garbage Collection is to
reorganize data such that it occupies fewer segments than before.

A naïve implementation may cause a short-term increase of used
segments, leading to a deadlock where any remaining free segments have
been filled with garbage collected data. JFFS2 [5] handles this
problem by keeping some spare segments for Garbage Collection use - a
strategy that has worked well in practice, but cannot be proven to
suffice under all possible circumstances.

In this paper, we will demonstrate an alternative approach, tailored
to the needs of LogFS, that can be proven to work deadlock-free.
Furthermore, we will show how intelligent organisation of data can
reduce the overall need for Garbage Collection, thereby improving both
lifetime of the flash medium and filesystem performance.

[1] http://wiki.laptop.org/go/Logfs
[2] http://wh.fh-wedel.de/~joern/logfs.pdf
[3] http://citeseer.ist.psu.edu/rosenblum91design.html
[4] http://lwn.net/Articles/190223/
[5] http://sources.redhat.com/jffs2/jffs2.pdf

Direct link to video