A review of GeckoFTL: Scalable Flash Translation Techniques For Very Large Flash Devices

Posted in reviews by Christopher R. Wirz on Wed Apr 04 2018



SSD Page Validity Bitmap (PVB) in the Flash Translation Layer (FTL) under-performs a LSM tree.

Note: This review is in the format of the CS265 Discussion format and may reflect my personal opinion.

1. What is the problem?

Flash memory is organized into pages, which cannot be updated without being erased. Instead of being deleted when a new version is written, the new data is written to free page and the old page is marked invalid by a Page Validity Bitmap (PVB) in the Flash Translation Layer (FTL). The FTL is metadata that tracks the relationship of logical to physical addresses and invalid pages. On flash devices, a small amount of RAM is integrated to store the FTL, but it is expensive.

2. Why is it important?

Static RAM at a cost of around $5/megabyte, but Dynamic RAM is much much (2 orders of magnitude) cheaper. DRAM also uses more power - so the lifetime cost can be higher. The end result is that RAM drives cost.

As SSD metadata increases proportional to capacity, the RAM requirement increases with capacity. Power-failure recovery time also increases proportionally - however, this time increases with the size of the metadata.

3. Why is it hard?

Simply trying to free up RAM by putting large portions of the FTL on the disk prevents metadata on power failure. The issue with this approach is that FTLs require many updates and the number of writes to SSD degrades the device. Also, these additional writes lower the throughput of the drive. Since SSD/Flash devices are becoming more popular, small improvements can now see widespread benefits, but adverse decisions have the same scale.

4. Why existing solutions do not work?

Previous solutions either make too many updates to the storing device, or require too much RAM. In practice we see that only a small portion of the data is frequently updated, so keeping only a portion of the FTL in RAM is congruent with this reality. Also, the transition of the PVB into Flash does not necessarily organize the updates, whereas Gecko does. Gecko claims to reduce the RAM requirement by 95%, which is a much better outcome than previous solutions. In doing this, Gecko exhibits 98% less write amplification than solutions which apply FTLs to flash. In all instances, recovery time is reduced 51% using Gecko.

5. What is the core intuition for the solution?

This paper replaces the Page Validity Bitmap (PVB) of the Flash Translation Layer (FTL) with a Log-structured merge-tree (LSM-tree) called Logarithmic Gecko. This enables cheaper update costs at the expense of longer garbage collection queries.

6. Does the paper prove its claims?

The most important support of the claim is that updates are more frequent than garbage collection - and that writes are more expensive than reads. Based on these two unquestionable statements, they show a reduction in space requirements and recovery time through the use of GeckoFTL.

The garbage collector, while taking more time than previous solutions, accounts for different update frequencies in order to run more sparingly - thus reducing write amplification. A garbage collection run query always traverses recently created entries first (in Gecko, many are handled in the merge operation). Since there is a portion on flash, minimizing garbage collection operations (even though they take longer), will increase device life while temporarily reducing throughput. Gecko accomplishes this with check-points that ensures dirty flags are appropriately paired with writes.

7. What is the setup of analysis/experiments? is it sufficient?

The authors set the number of levels based on (Log(Blocks/Entries) / Log(Size Ratio)). This yields predictable entry merge cost that is less than the cost of a flash read or a flash write. The Least Recently Used (LRU) cache is show to hold 2^19 entries.

Competing FTL designs have been taken from recent research. These implementations include DFTL, LazyFTL, μ-FTL, and IB-FTL. Write Amplification, Recovery Time, and Integrated RAM were the main metrics for comparison.

8. Are there any gaps in the logic/proof?

The paper shows that more metadata is required as flash devices grow in size. In the abstract, the authors state that metadata is stored in integrated RAM, but also that updating it degrades the performance and lifetime of Flash. However, in the paper body, they discuss reducing write amplification to the flash through the merge policies of recently created entries.

Current manufacturers do not disclose the FTL implementation. While this is discussed in the paper, the end result is that comparisons must take place within the EagleTree framework, and not on actual hardware. Additionally, Gecko is compared against a standard PVB simulation.

9. Describe at least one possible next step.

The paper does not discuss the use of bloom filters. These could be used to identify blocks that are updated more quickly - and thus improve garbage collection rates. Broadly, many database systems have seen performance increases through skipping - a next step would be to explore analogous solutions for SSD devices.

BibTex Citation

@inproceedings{dayan2016geckoftl,
  title={GeckoFTL: Scalable Flash Translation Techniques For Very Large Flash Devices},
  author={Dayan, Niv and Bonnet, Philippe and Idreos, Stratos},
  booktitle={Proceedings of the 2016 International Conference on Management of Data},
  pages={327--342},
  year={2016},
  organization={ACM}
}