Fixing Problems With Corruption on the Server

This page gives help on what to do if your server has suffered corruption, for example, after an unclean shutdown or other OS or hardware problem.

In general, as updates to the store are made in an atomic manner, the most likely result is wasted disc space. However, if really bad things happen, or you believe that there is a lot of wasted space, then these instructions will help to restore your data.

You know you will need to do something if you get strange errors and bbackupd attempts to contact the server every 100 seconds or so, or if one of the discs in your RAID disc set has failed.

After following these instructions, bbackupquery will be able to see and retrieve all the files which were stored on your server. Some of them may be in lost+found directories in the root of the store (or in their original position if they have been moved), but all of them will be retrievable.

After you have retrieved the files you want, bbackupd will upload new versions where necessary, and after about two days, mark any lost+found directories as deleted. Finally, those directories will be removed by the housekeeping process on the server.

These instructions assume you're working on account 1234 - substitute the number of the account you're actually working on. The steps will need to be repeated for each affected account.

Stop bbackupd

First, make sure that bbackupd is not running on the client machine for the account you are going to recover. Use kill to terminate it.

(This step is not strictly necessary, but is recommended. During any checks on the account, bbackupd will be unable to log in, and after they are complete, the account is marked as changed on the server so bbackupd will perform a complete scan.)
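Stopping the daemon as described above could be sketched as the following shell function. It is only an illustration: the function name stop_daemon is hypothetical, and it assumes that pgrep is installed on the client and that the process is named bbackupd.

```shell
# Sketch: stop a running daemon by name using pgrep and kill.
# stop_daemon is a hypothetical helper, not part of the backup software.
stop_daemon() {
    name="$1"
    # pgrep -x matches the process name exactly and prints one PID per line;
    # "|| true" keeps the function going when nothing matches.
    pids=$(pgrep -x "$name" || true)
    if [ -z "$pids" ]; then
        echo "$name is not running"
        return 0
    fi
    # kill sends the TERM signal by default; $pids is left unquoted on
    # purpose so that several PIDs are passed as separate arguments.
    kill $pids
    echo "sent TERM to $name"
}
```

Calling stop_daemon bbackupd on the client either reports that the daemon is not running or sends it a TERM signal.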

Are You Using RAID on the Server?

At the moment, the raidfile recovery tools have not been written. However, when two out of three files are available, the server will run successfully, even if it complains a lot in the logs. So, your best bet here is to fix the accounts, if necessary, and retrieve any files you need. Then move the old store directories aside (in case you need them) and start afresh with new accounts, and let the clients upload all their data again.
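Moving an old store directory aside might look like the sketch below. The function name move_store_aside and the dated ".old-" suffix are assumptions made for illustration, and the directory to pass in depends on your own server configuration, which this page does not specify. With a RAID disc set, the function would be called once for each store directory.

```shell
# Sketch: rename a store directory so a fresh one can be created in its place.
# move_store_aside and the ".old-YYYYMMDD" suffix are hypothetical choices.
move_store_aside() {
    dir="${1%/}"                          # strip a trailing slash, if any
    backup="$dir.old-$(date +%Y%m%d)"     # e.g. <dir>.old-20061122
    if [ -e "$backup" ]; then
        echo "refusing to overwrite $backup" >&2
        return 1
    fi
    # Print the new name only if the rename succeeded
    mv "$dir" "$backup" && echo "$backup"
}
```

The old data stays on the same disc under the new name, so it can still be inspected or moved back if it turns out to be needed.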

These utilities will be written shortly!

Check and Fix the Account

First, run the check utility, and see what errors it reports, using the following command:

/usr/local/bin/bbstoreaccounts check 1234

This will take some time, and use a fair bit of memory (about 16 bytes per file and directory). If the output looks plausible and reports errors which need fixing, run it again but with the fix flag:

/usr/local/bin/bbstoreaccounts check 1234 fix

This will fix any errors, and remove unrecoverable files. Directories will be recreated if necessary.
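The two commands above can be combined into a small wrapper that always runs the read-only check first and only runs the fix pass after you have looked at the output and confirmed. This is a sketch: the function name check_then_fix and the BBSTOREACCOUNTS variable are hypothetical, and only the two bbstoreaccounts invocations shown above are used.

```shell
# Path to the tool as used in the commands above; can be overridden.
BBSTOREACCOUNTS="${BBSTOREACCOUNTS:-/usr/local/bin/bbstoreaccounts}"

# Sketch: run the check, then ask before running it again with the fix flag.
# check_then_fix is a hypothetical helper name.
check_then_fix() {
    account="$1"
    # First pass: report errors only
    "$BBSTOREACCOUNTS" check "$account"
    # The prompt goes to stderr so it does not mix with the tool's output
    printf 'Run fix on account %s? [y/N] ' "$account" >&2
    read -r answer
    case "$answer" in
        y|Y) "$BBSTOREACCOUNTS" check "$account" fix ;;
        *)   echo "Skipping fix for account $account" ;;
    esac
}
```

Running check_then_fix 1234 for each affected account in turn keeps the fix pass a deliberate step rather than an automatic one.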

NOTE: The utility may adjust the soft and hard limits on the account to make sure that housekeeping will not remove anything - check these afterwards.
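To get a feel for the memory the check needs, the figure of about 16 bytes per file and directory can be turned into a rough estimate. The function below is a sketch with a hypothetical name, and the result is only as accurate as that approximate figure.

```shell
# Sketch: rough memory estimate for a check run, in KiB.
# estimate_check_memory_kb is a hypothetical helper name.
estimate_check_memory_kb() {
    objects="$1"                        # number of files plus directories
    bytes=$((objects * 16))             # about 16 bytes per file and directory
    echo $(( (bytes + 1023) / 1024 ))   # round up to whole KiB
}
```

For an account holding one million files and directories, estimate_check_memory_kb 1000000 gives 15625 KiB, or roughly 15 MiB.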

Grab Any Files You Need with bbackupquery

At this point, you will have a working store. Every file which was on the server, and wasn't corrupt, will be available.

On the client, use bbackupquery to log in and examine the store (type help at the prompt for instructions). Retrieve any files you need, paying attention to any lost+found directories in the root directory of the store.

You can skip this step if you are sure that the client machine is fine - in this case, bbackupd will bring the store up to date.

Restart bbackupd

Restart bbackupd on the client machine. The store account will be brought up to date, and files in the wrong place will be marked for eventual deletion.

Last modified on Nov 22, 2006, 10:11:42 PM