# BlobFS (Blobstore Filesystem) {#blobfs}

## BlobFS Getting Started Guide {#blobfs_getting_started}

## RocksDB Integration {#blobfs_rocksdb}

Clone and build the SPDK repository as per https://github.com/spdk/spdk

~~~{.sh}
git clone https://github.com/spdk/spdk.git
cd spdk
./configure
make
~~~

Clone the RocksDB repository from the SPDK GitHub fork into a separate directory.
Make sure you check out the `6.15.fb` branch.

~~~{.sh}
cd ..
git clone -b 6.15.fb https://github.com/spdk/rocksdb.git
~~~

Build RocksDB.  Only the `db_bench` benchmarking tool is integrated with BlobFS.

~~~{.sh}
cd rocksdb
make db_bench SPDK_DIR=relative_path/to/spdk
~~~

Alternatively, add `DEBUG_LEVEL=0` for a release build.  A release build also requires `USE_RTTI` to be turned on.

~~~{.sh}
export USE_RTTI=1 && make db_bench DEBUG_LEVEL=0 SPDK_DIR=relative_path/to/spdk
~~~

Create an NVMe section in the configuration file using SPDK's `gen_nvme.sh` script.
The `scripts/` and `test/` paths in the remaining steps are relative to the SPDK repository directory.

~~~{.sh}
scripts/gen_nvme.sh --json-with-subsystems > /usr/local/etc/spdk/rocksdb.json
~~~

Verify that the configuration file specifies the correct NVMe SSD.
If there are any NVMe SSDs you do not wish to use for RocksDB/SPDK testing, remove them from the configuration file.

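One way to carry out this check is to list the controller names and PCIe addresses that the generated file attaches.
The sketch below assumes the file contains `bdev_nvme_attach_controller` entries with `name` and `traddr` parameters.
The helper name `list_nvme_controllers` is hypothetical, and the sample file (with a made-up address) exists only so the
sketch runs without NVMe hardware.  Point the helper at `/usr/local/etc/spdk/rocksdb.json` to inspect the real configuration.

```sh
#!/bin/sh
# Print every "name" and "traddr" key/value pair found in an SPDK JSON
# configuration file.  Hypothetical helper, not part of SPDK.
list_nvme_controllers() {
    grep -oE '"(name|traddr)"[[:space:]]*:[[:space:]]*"[^"]*"' "$1"
}

# Sample configuration for illustration only (the PCIe address is made up).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "subsystems": [
    {
      "subsystem": "bdev",
      "config": [
        {
          "method": "bdev_nvme_attach_controller",
          "params": {
            "trtype": "PCIe",
            "name": "Nvme0",
            "traddr": "0000:00:04.0"
          }
        }
      ]
    }
  ]
}
EOF

list_nvme_controllers "$sample"
rm -f "$sample"
```
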
Make sure you have at least 5GB of memory allocated for huge pages.
By default, the SPDK `setup.sh` script only allocates 2GB.
The following will allocate 5GB of huge page memory (in addition to binding the NVMe devices to uio/vfio).

~~~{.sh}
HUGEMEM=5120 scripts/setup.sh
~~~

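To confirm that the allocation took effect, the reserved huge page memory can be read back from `/proc/meminfo`.
This is a minimal sketch that assumes a Linux host and counts only pages of the default huge page size.
The helper name `hugepage_total_mb` and the 5120 MB threshold variable are illustrative, not part of SPDK.

```sh
#!/bin/sh
# Print the memory reserved in default-size huge pages, in MB.
# Reads /proc/meminfo unless another file is given as the first argument.
hugepage_total_mb() {
    awk '/^HugePages_Total:/ { pages = $2 }
         /^Hugepagesize:/    { kb = $2 }
         END { print int(pages * kb / 1024) }' "${1:-/proc/meminfo}"
}

required_mb=5120
if [ -r /proc/meminfo ]; then
    total_mb=$(hugepage_total_mb)
    if [ "$total_mb" -ge "$required_mb" ]; then
        echo "huge pages: ${total_mb} MB reserved, OK"
    else
        echo "huge pages: ${total_mb} MB reserved, need at least ${required_mb} MB"
    fi
fi
```
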
Create an empty SPDK blobfs for testing.

~~~{.sh}
test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.json Nvme0n1
~~~

At this point, RocksDB is ready for testing with SPDK.  Three `db_bench` parameters are used to configure SPDK:

1. `spdk` - Defines the name of the SPDK configuration file.  If omitted, RocksDB will use the default PosixEnv implementation
   instead of SpdkEnv. (Required to use SPDK)
2. `spdk_bdev` - Defines the name of the SPDK block device which contains the BlobFS to be used for testing. (Required)
3. `spdk_cache_size` - Defines the amount of userspace cache memory used by SPDK, in megabytes (MB).
   Default is 4096 (4GB).  (Optional)

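Put together, a `db_bench` command line using these three parameters might look like the sketch below.
The helper `spdk_db_bench_cmd` is hypothetical and only prints the command instead of running it.
The `--db` path and the `fillseq` benchmark are illustrative choices, not values prescribed by this guide.

```sh
#!/bin/sh
# Build and print a db_bench command line that enables SPDK.
# Hypothetical helper: nothing is executed, the command is only echoed.
spdk_db_bench_cmd() {
    conf=$1        # SPDK configuration file     (--spdk)
    bdev=$2        # bdev containing the BlobFS  (--spdk_bdev)
    cache_mb=$3    # userspace cache size in MB  (--spdk_cache_size)
    echo "./db_bench --spdk=$conf --spdk_bdev=$bdev --spdk_cache_size=$cache_mb" \
         "--db=/mnt/rocksdb --benchmarks=fillseq"
}

spdk_db_bench_cmd /usr/local/etc/spdk/rocksdb.json Nvme0n1 4096
```
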
SPDK has a set of scripts which will run `db_bench` against a variety of workloads and capture performance and profiling
data.  The primary script is `test/blobfs/rocksdb/rocksdb.sh`.

## FUSE

BlobFS provides a FUSE plug-in to mount an SPDK BlobFS as a kernel filesystem for inspection or debug purposes.
The FUSE plug-in requires fuse3 and will be built automatically when fuse3 is detected on the system.

~~~{.sh}
test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.json Nvme0n1 /mnt/fuse
~~~

Note that the FUSE plug-in has some limitations; see the list below.

## Limitations

* BlobFS has primarily been tested with RocksDB so far, so use cases that differ from how RocksDB uses a filesystem
  may run into issues.  BlobFS will be tested in a broader range of use cases after this initial release.
* Only a synchronous API is currently supported.  An asynchronous API has been developed but has not yet been thoroughly
  tested, so it is not part of the public interface.  It will be added in a future release.
* File renames are not atomic.  This will be fixed in a future release.
* BlobFS currently supports only a flat namespace for files with no directory support.  Filenames are currently stored
  as xattrs in each blob.  This means that filename lookup is an O(n) operation.  An SPDK btree implementation is
  underway which will be the underpinning for BlobFS directory support in a future release.
* Writes to a file must always append to the end of the file.  Support for writes to any location within the file
  will be added in a future release.
