shelf: use files capped to certain size #19
Conversation
cc @karalabe
Codecov Report

```
@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   87.08%   87.37%   +0.28%
==========================================
  Files           5        6       +1
  Lines         395      483      +88
==========================================
+ Hits          344      422      +78
- Misses         36       42       +6
- Partials       15       19       +4
```
Force-pushed from 55f08b2 to 1207366
capped_file.go (outdated)
```go
	f   *os.File
	err error
)
if i == 0 {
```
Should this perhaps be `if nFiles == 1`? That way for single files I won't have the strange suffix, but for capped files, I will have a proper `.cap.0` suffix for the first file too.
I guess the catch is if someone sets blobpool sizes below 2 GB and then above, then they will flip-flop between the 2 naming schemes.
Any particular reason for supporting the "old" scheme? Can't we just always have "basename.%d"?
My reasoning was that in most cases, there's only ever going to be one file used. And in that case, the one and only file (the zeroth file) could be just the basename.
Only if it does go above would we start adding suffixes. So in the 'normal case', there would be zero difference between a system using a capped file and one not using capped files.
capped_file.go (outdated)
```go
	(offset+uint64(len(data)))/cf.cap >= uint64(len(cf.files)) {
	return 0, ErrBadIndex
}
for ; written < totalLength; fileNum++ {
```
This is strange because from one aspect, you are implementing the File interface, so this is necessary to support. From another perspective, if the cap is a multiple of the shelf size, then it should be impossible to overflow file boundaries, right?
The capped_file, from a library perspective, handles overflowing file boundaries fine.
Billy uses it in a way which doesn't overflow file boundaries, though, which also means we don't have to worry about concurrency.
But those are two separate things, IMO
I could bleed some of the shelf-size logic into capped_file, then the looping wouldn't be needed, I guess
shelf.go (outdated)
```go
// Max 5 files @ 2GB each.
// We also want to ensure that the cap is a multiple of the slot size, so no
// slot crosses a file boundary.
cap := uint64(2*1024*1024 - 2*1024*1024%slotSize)
```
This counter does not take into account the billy magic and shelf header metadata. That would push all these "calculated" offsets to be wrong since the first file would have some extra junk that would shift everything (and would need a different cap for the first and subsequent data files).
Ouch! Good catch!
I guess one way to handle it differently, without tying capped file too deeply to shelves, is to have the following rule:

- On write: if `offset + len(b) > cap`, then write to the 'next' file.
- On read: similarly, if `offset + len(b) > cap`, then read from the 'next' file.

In other words: let the max byte determine the position of the read/write. This has the following characteristics:

- Reading `1` byte may read from a different location than reading `n` bytes and picking out the first.
- The `cap` will never be exceeded.
- As long as reads and writes are always done chunkwise (and always with the same `offset` + lengths), we would never spread a write over several files (assuming the size of the object is smaller than the cap), and everything works fine.

Non-chunked reads/writes will behave as undefined.
Ah, scrap that, won't work. The other, even simpler, alternative is to allow chunks to overflow the file cap, and not spread the read/write out across several files. Then the limit is more of a recommendation, which is fine in our case (we use 2GB but the 'hard' limit is 4GB, so we've got space to spare).
Changed the behaviour now, but there's something not right still
fixed now
@karalabe I think the current implementation is the "most sane". Want to have a review chat about this at some point?
This PR adds 'cappedFile' support. A cappedFile behaves like a regular `os.File`, but it actually maps to a set of files, each capped to a max size. By swapping out the regular files for cappedFiles as backing for the shelves, billy can be made to respect filesystem max file sizes (e.g. 4GB on FAT32).

The cappedFile is not concurrency-safe for spread-out reads/writes. That is, if the data to be read crosses file boundaries, then a simultaneous read and write may corrupt the data.

However, this can easily be avoided in the upper layer: the shelf can just ensure that the cappedFile limit is a multiple of the shelf size. So instead of using `2 * 1024 * 1024` = 2097152 for shelf-size 10, it could use 2097150. Then the write offsets (..., 2097140, 2097150, 2097160, ...) all fall on shelf-size boundaries, so no write crosses a file boundary.