Opened 12 years ago

Last modified 11 years ago

#8 new task

Improve handling of directories with many files

Reported by: Martin Ebourne Owned by: chris
Priority: normal Milestone: 0.12
Component: box libraries Version: trunk
Keywords: Cc:


Both the backup client and the server housekeeping code use inefficient directory indexing algorithms. As a result, excessive CPU usage is seen when handling directories containing many thousands of files.

Change History (2)

comment:1 Changed 12 years ago by ben

bbackupd: When a directory contains many files, rather too many compares are performed when searching for its entry in the directory listing retrieved from the server. This scales logarithmically.

bbstored: During housekeeping, bbstored reads every directory. Within each directory the contents are only scanned linearly a couple of times, so overall it should scale linearly with the number of directories (the number of files in them being far less important).

We do need to move to a reference-counted store to avoid all this scanning. But on reviewing the code, I don't think the excessive CPU usage on the server is due to inefficient handling of large directories.

comment:2 Changed 11 years ago by chris

Milestone: 0.20 → 0.12
Owner: set to chris
Version: 0.10 → trunk

bbackupd gets slower and slower when backing up a directory with many files. The problem appears to be that the directory is rewritten after each file is added, which is O(n²) in the number of files in the directory.
