Version 9 (modified by chris, 7 years ago)
Box Backup Troubleshooting
Note: If you are trying to fix a store after your disc has been corrupted, please see the FixingCorruption page.
If some files are not being backed up when you expect them to be, and you want to know why, or you are having problems with including and excluding files, please see the LogAllFileAccess page instead.
Search through the tickets on this site and see if the problem is already known about. Also, take a look at the KnownBugs page and see if the problem is covered there (new problems should be in the ticket system, however).
Unfortunately, the error messages are not particularly helpful at the moment. We are trying to improve them, and suggestions are welcome. This page lists some of the common errors, and the most likely causes of them.
When an error occurs, you will see a message like 'Exception: RaidFile/OSFileError (2/8)' either on the screen or in your log files. (It is a good idea to set up a separate log file, as described in ConfiguringAServer.)
The message on its own may not be particularly helpful, although some messages do include extra information about probable causes. To get further information, check the ExceptionCodes.txt file in the root of the distribution. This file is generated by the ./configure script, so you will need to have run that first.
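As a rough sketch of that lookup, the shell function below pulls the numeric code out of an error message and searches ExceptionCodes.txt for it. The function name lookup_exception is made up for illustration, and the sketch assumes the file mentions each code in the same type/subtype form that appears in the message (for example 2/8). Adjust the pattern if your copy of the file is laid out differently.

```shell
# Hypothetical helper: extract the "type/subtype" code from a Box Backup
# error message and print the matching lines from an exception code listing.
# Usage: lookup_exception "<message>" [path/to/ExceptionCodes.txt]
lookup_exception() {
    message=$1
    codes_file=${2:-ExceptionCodes.txt}
    # Take the last "digits/digits" pair in the message, e.g. 2/8 or 7/33.
    code=$(printf '%s\n' "$message" | grep -oE '[0-9]+/[0-9]+' | tail -n 1) || true
    if [ -z "$code" ]; then
        echo "no exception code found in message" >&2
        return 1
    fi
    # Require a non-digit on either side so 2/8 does not match 12/8 or 2/80.
    grep -E "(^|[^0-9])${code}([^0-9]|\$)" "$codes_file"
}
```

For example, lookup_exception "Exception: RaidFile/OSFileError (2/8)" run from the root of the distribution would print whatever lines of ExceptionCodes.txt mention 2/8.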
Some common causes of exceptions are listed below.
Please email me with any other codes you get, and I will let you know what they mean, and add notes here.
This is found either when running bbstoreaccounts or in the bbstored logs.
Problem: The directories you specified in the raidfile.conf are not writable by the _bbstored user.
Resolution: Change permissions appropriately.
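A minimal sketch for checking this, assuming you already know which directories are listed in your raidfile.conf: the function below (a hypothetical name, not part of Box Backup) reports which of the given directories the current user cannot write to. Run it as the _bbstored user, for instance from a shell started with sudo -u _bbstored sh, so that the result reflects that user's permissions. The paths in the usage comment are placeholders.

```shell
# Hypothetical helper: report directories the current user cannot write to.
# Returns 0 if every directory exists and is writable, 1 otherwise.
check_writable_dirs() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "NOT writable (or missing): $dir"
            status=1
        fi
    done
    return "$status"
}

# Example with placeholder paths; substitute the ones from your raidfile.conf:
# check_writable_dirs /path/to/disc0 /path/to/disc1 /path/to/disc2
```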
This usually occurs when the configuration files can't be opened.
Problem: You created your configurations in non-standard locations, and the programs cannot find them.
Resolution: Explicitly specify configuration file locations to daemons and programs. For example:
    /usr/local/bin/bbstored /some/other/dir/bbstored.config
    /usr/local/bin/bbackupquery -c /some/other/dir/bbackupd.config
(Daemons take the configuration file name as the first argument; utility programs take it with the -c option.)
Problem: bbstored can't find the raidfile.conf file specified in bbstored.conf.
Resolution: Edit bbstored.conf to point to the correct location of this additional configuration file.
The server can't listen for connections on the IP address specified when you configured it.
Problem: This probably means you've specified the wrong hostname to bbstored-config - maybe your server is behind a NAT firewall?
Resolution: Edit bbstored.conf and correct the ListenAddresses line. You should replace the server address with the IP address of your machine.
This probably means that there is a problem with the Box Backup server (store). If you are the administrator of the server, please run bbstoreaccounts check <account> fix to correct the problem. If you are a Box Backup client user (bbackupd), please contact the server administrator and ask them to check the account.
These errors all relate to connections failing - you may see them during operation if there are network failures or other problems between the client and server. The backup system will recover from them automatically.
Connection Timed Out (TLSReadFailed, TLSWriteFailed)
These problems are often, but not always, caused by a firewall closing the connection between the client and server when it appears to be idle for some time. This can happen if the client spends a long time diffing a large file, for example, or reading a huge directory.
It often looks like this, on the client side:
    WARNING: Exception thrown: ConnectionException(Conn_TLSWriteFailed) at SocketStreamTLS.cpp(442)
    ERROR: Failed to upload file: ...: caught exception: Connection TLSWriteFailed (Probably a network issue between client and server.) (7/33)
    ERROR: SSL error during Write: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
    WARNING: Exception thrown: ConnectionException(Conn_TLSWriteFailed) at SocketStreamTLS.cpp(442)
    ERROR: Exception caught (Connection TLSWriteFailed (Probably a network issue between client and server.) 7/33), reset state and waiting to retry...
After this the client should retry automatically after 100 seconds, but it may get stuck on the same file if diffing that file always takes longer than the firewall timeout.
Normally the solution to this problem is to enable SSL Keepalives.
Certificate Verify Failed (TLSHandshakeFailed)
I once got this on the client side:
    bbackupd: Opening connection to server xxxx.xxx...
    bbackupd: SSL err during Connect: error:xxxxxxxx:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01
    bbackupd: SSL err during Connect: error:xxxxxxxx:rsa routines:RSA_EAY_PUBLIC_DECRYPT:padding check failed
    bbackupd: SSL err during Connect: error:xxxxxxxx:asn1 encoding routines:ASN1_verify:EVP lib
    bbackupd: SSL err during Connect: error:xxxxxxxx:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
    bbackupd: TRACE: Exception thrown: ConnectionException(Conn_TLSHandshakeFailed) at SocketStreamTLS.cpp(237)
    bbackupd: Exception caught (7/30), reset state and waiting to retry...
and this on the server side:
    bbstored: Incoming connection from xx.xxx.xx.xxx port xxxxx (handling in child xxxxx)
    bbstored: SSL err during Accept: error:xxxxxxxx:SSL routines:SSL3_READ_BYTES:tlsv1 alert decrypt error
    bbstored: in server child, exception Connection TLSHandshakeFailed (7/30) -- terminating child
The solution was to create a new CA on the server side and regenerate the client certificate. Re-creating the client certificate request was not necessary.
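When chasing this kind of failure, one way to check whether a certificate really was signed by the CA you expect is the openssl verify command. The wrapper below is only a sketch: the function name cert_signed_by is hypothetical, and you have to supply the paths to the CA certificate and to the certificate under test yourself, since they depend on how your installation was set up. It inspects the command's output rather than its exit status, on the assumption that some older OpenSSL releases may report a verification failure without a non-zero exit status.

```shell
# Hypothetical helper: succeed only if openssl reports that the certificate
# verifies against the given CA certificate.
# Usage: cert_signed_by <ca-cert.pem> <cert-to-check.pem>
cert_signed_by() {
    ca_file=$1
    cert_file=$2
    # On success, openssl verify prints "<cert file>: OK".
    openssl verify -CAfile "$ca_file" "$cert_file" 2>&1 | grep -q ': OK$'
}

# Example (file names are placeholders):
# cert_signed_by ca-cert.pem client-cert.pem && echo "signed by this CA"
```

If the check fails for a certificate that the other side is presenting, that is consistent with the "certificate verify failed" message above and suggests the certificate needs to be regenerated against the current CA.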
There is also some information on problems that can be caused by SSL configuration on the OpenSSLNotes page.
If this really doesn't help, then using the DEBUG builds of the system will give you much more information - a more descriptive exception message and the file and line number where the error occurred.
For example, if you are having problems with bbstoreaccounts, build the debug version with the following commands:
    cd boxbackup-0.0
    cd bin/bbstoreaccounts
    make
(Within the module directories, make defaults to building the debug version. At the top level, it defaults to the release version.)
This will build an executable in debug/bin/bbstoreaccounts which you can then use instead of the release version. It will give far more useful error messages.
When you get an error message, use the file and line number to locate where the error occurs in the code. There will be comments around that line to explain why the exception happened.
If you are using a debug version of a daemon, these extended messages are found in the log files.
If All Else Fails
If nothing in the above information helps and you are sure the problem lies with Box Backup itself, consider raising a new ticket. If you have a problem getting Box Backup to work on your system, however, ask a question on the MailingLists instead; tickets should only be used where a change to Box Backup is expected to resolve the problem.