====== rsnapshot-backup ======

rsnapshot-backup is a frontend to rsnapshot for managing larger backup sites (several dozen systems) of un*x-like operating system instances. It is easy to use; in particular, additional backup clients are configured in a jiffy. If you have to back up several dozen un*x-like systems, this is your tool.

===== components =====

  * rsnapshot-backup
    * schedules and parallelizes the backup jobs
    * creates logfiles
    * logfile retention follows backup retention (or $keeplogs, whichever is less)
  * rsnapshot-backup -C - sanity checking for
    * connectivity on all enabled backup jobs
    * accessibility, i.e. unattended ssh login
    * remote hostname
    * rsync executable
  * rsnapshot-backup-conf
    * creates a new backup job description from a template
    * creates the corresponding directories
    * if mkdir fails it prompts for permission to use 'mkdir -p'
  * rsnapshot-backup-enable
    * enables a defined backup job after creation
  * rsnapshot-backup-stats
    * writes du and df stats to /var/log/rsnapshot-backup/00-stats/
    * cronjob @07:07
  * /etc/rsnapshot-backup.d/ - config directory
  * /etc/rsnapshot-backup.d/conf.template - backup job description template
  * /etc/rsnapshot-backup.d/exclude.template - backup job exclude template
  * /etc/rsnapshot-backup.d/conf. - per-job backup job description
  * /etc/rsnapshot-backup.d/script. - optional per-job supplementary prerequisite script (e.g. for database dumps)
  * /etc/rsnapshot-backup.d/exclude. - per-job exclude file
  * /etc/rsnapshot-backup.d/enabled/ - config directory, contains symlinks to enabled backup jobs
  * /etc/rsnapshot-backup.d/enabled/conf. - symlink to the description of an enabled backup job
  * /etc/cron.d/rsnapshot-backup - cronjob for running rsnapshot-backup
  * /etc/cron.d/rsnapshot-backup-stats - cronjob for running rsnapshot-backup-stats

===== installation =====

  # apt only reads files in sources.list.d whose names end in .list
  cat << EOF >> /etc/apt/sources.list.d/lihas.list
  deb http://ftp.lihas.de/debian stable main
  EOF
  wget -O - http://ftp.lihas.de/debian/apt-key-lihas.gpg | apt-key add -
  apt-get update
  apt-get install rsnapshot-backup

  * Take a look at /etc/rsnapshot-backup.d/conf
    * does the ''BACKUPDIR'' setting fit your setup?
    * is there enough storage available for your backups?
    * is the ''parallel'' setting suitable for your environment? This depends on LAN/WAN speed and on the CPU and I/O speed of server and clients.
  * Take a look at /etc/rsnapshot-backup.d/conf.template
    * does the ''snapshot_root'' directory fit your setup?
    * does the ''snapshot_root'' directory MATCH the ''BACKUPDIR'' setting from ''/etc/rsnapshot-backup.d/conf''? It has to!
    * is there enough storage available for your backups?
    * does the ''interval'' setting meet your requirements?
    * '''please''' do not remove the '''' tags from the template as they are needed by ''rsnapshot-backup-conf''.
  * Provide an ssh key pair for the ''root'' user. If you don't have one, ''ssh-keygen -t rsa -b 4096 -N "" -C "rsnapshot-backup@$HOSTNAME" -f ~/.ssh/id_rsa'' will do just fine.

===== operation =====

Normally, rsnapshot-backup runs unattended. You only have to keep an eye on disk space.

  usage: /usr/sbin/rsnapshot-backup [ -h ] [ -i { hourly | daily | weekly | monthly } ]
     -h    help
     -i    interval. One of hourly, daily, weekly, monthly. Default: daily.
     -c    config file. Use alternate config file. Default: /etc/rsnapshot-backup.d/conf
     -C    check connectivity for all enabled backup jobs.

==== create a backup job ====

  * Create a backup job description: ''rsnapshot-backup-conf ''. This creates ''/etc/rsnapshot-backup.d/conf.'' and ''/etc/rsnapshot-backup.d/exclude.''.
  * Provide the ssh public key to the backup client: ''ssh-copy-id root@''
  * Enable the backup job: ''rsnapshot-backup-enable ''. This creates the symlink ''/etc/rsnapshot-backup.d/enabled/conf.''.
  * Check for sanity: ''rsnapshot-backup -C''
  * To disable a backup job: ''rm /etc/rsnapshot-backup.d/enabled/conf.''.

===== resources =====

''/usr/share/doc/rsnapshot-backup'' holds some template and example files:

  * /usr/share/doc/rsnapshot-backup/examples/conf.template.gz
  * /usr/share/doc/rsnapshot-backup/examples/exclude.template
  * /usr/share/doc/rsnapshot-backup/examples/script.mysql5-zrmbackup
  * /usr/share/doc/rsnapshot-backup/examples/script.remote-backup-preparation

===== see also =====

  * ssh
  * rsync
  * rsnapshot
  * cron

===== tips & tricks =====

==== make rsnapshot-backup aware of ACLs and Extended Attributes ====

Debian 8 (Jessie) uses capabilities (try: ''getcap /bin/ping''), which are stored as extended attributes. rsync needs ''%%--%%acls %%--%%xattrs'' to handle ACLs and xattrs. These parameters have to be added per backup job.

The copy & paste code below adds them to the files ''/etc/rsnapshot-backup.d/conf.*''. It either extends existing lines beginning with ''rsync_long_args ..'' or, if there is no such line, inserts ''rsync_long_args\t\t%%--%%delete %%--%%numeric-ids %%--%%relative %%--%%delete-excluded %%--%%acls %%--%%xattrs'' after the last line containing ''rsync_long_args'', which is usually found in a comment block. If these comments don't exist, the code prints an error and leaves it to the admin to insert the line manually, because in this case it has no way of finding the right spot.
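Before running the rewrite loop that follows, it can help to preview which of the three cases described above applies to each job file. The following is only a read-only sketch and not part of rsnapshot-backup: the function name ''classify_conf'' and the labels ''extend'', ''insert'' and ''manual'' are made up for illustration, and the ''grep'' patterns only approximate the Perl regular expressions of the rewrite.

```shell
#!/bin/sh
# Read-only preview: report which branch of the rewrite would apply to a file.
# classify_conf and its output labels are hypothetical helpers for this sketch.
classify_conf() {
    tab=$(printf '\t')
    if grep -q "^rsync_long_args${tab}" "$1"; then
        echo extend    # an active rsync_long_args line exists and would be extended
    elif grep -qw 'rsync_long_args' "$1"; then
        echo insert    # only mentioned elsewhere (e.g. in a comment): a new line would be inserted
    else
        echo manual    # no anchor found: the line has to be added by hand
    fi
}

# Print one "label<TAB>file" line per job description; nothing is modified.
for C in /etc/rsnapshot-backup.d/conf.* ; do
    [ -f "$C" ] || continue
    printf '%s\t%s\n' "$(classify_conf "$C")" "$C"
done
```

Files reported as ''manual'' are the ones for which the rewrite would print its error message, so they can be fixed by hand beforehand.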
  for C in /etc/rsnapshot-backup.d/conf.* ; do
    vo -o $C
    perl -e '
      $sq = "\x27"; # single quote
      $nl = "rsync_long_args\t\t--delete --numeric-ids --relative --delete-excluded --acls --xattrs\n";
      open F,$ARGV[0] or die "can${sq}t read ${sq}$ARGV[0]${sq}";
      my @L = <F>;
      close F;
      open F,">",$ARGV[0] or die "can${sq}t write ${sq}$ARGV[0]${sq}";
      @L2 = grep { /^rsync_long_args\t/ } @L;   # active option lines
      @L3 = grep { /\brsync_long_args\b/ } @L;  # any mention, e.g. in comments
      if ( @L2 ) {
        # case 1: extend the existing option line(s)
        print STDERR "adding to rsync_long_args\n";
        for my $l ( @L ) {
          if ( $l =~ /^rsync_long_args\t/ ) {
            $ll = $l;
            $ll =~ s/^rsync_long_args\t\t*//;
            $ll =~ s/\s*$//;
            %A = map { $_ => 1 } split /\s+/,$ll;
            unless ( $A{"--acls"} ) { $l =~ s/\n/ --acls\n/; }
            unless ( $A{"--xattrs"} ) { $l =~ s/\n/ --xattrs\n/; }
          }
          print F $l;
        }
      } elsif ( @L3 ) {
        # case 2: insert a new line after the last mention
        print STDERR "adding rsync_long_args line\n";
        for (my $i=$#L; $i>=0; $i--) {
          if ( $L[$i] =~ /[^a-zA-Z_+]rsync_long_args\b/ ) {
            splice(@L,$i+1,0,$nl);
            last;
          }
        }
        print F @L;
      } else { # unchanged
        print STDERR "ERROR: doing nothing, add to ${sq}$ARGV[0]${sq} manually:\n";
        print STDERR $nl;
        print STDERR "Insert TABS after ${sq}rsync_long_args${sq}!\n";
        print F @L;
      }
      close F;
    ' $C
    rcsdiff -u $C
    vo -i $C
  done

==== restore from backups created using "--fake-super" ====

The situation:

  * ''%%--%%fake-super'' stores information requiring privileges (e.g. capabilities, ownership) in extended attributes on the backup volume
  * notably, **symlinks** are stored as **files** whose content is the original link destination; additional metadata is stored in an extended attribute, for example:
  user.rsync.%stat="120777 0,0 0:0"

To restore contents from this kind of backup you again need ''%%--%%fake-super'':

  * use **''%%--%%fake-super''** for a **push** restore
  * use **''-M %%--%%fake-super''** for a **pull** restore

==== special considerations using a secondary backup server ====

If you want a secondary backup server replicating the daily.0 backups from the primary backup server (the one fetching the data from the backup clients), there are some things to consider on the secondary (tertiary, etc.) system(s):

  * the default option **''%%--%%relative''** (short form **''-R''**) has to be switched off. As the default options are ''%%--%%delete %%--%%numeric-ids %%--%%relative %%--%%delete-excluded'', you have to set ''rsync_long_args'' to ''%%--%%delete %%--%%numeric-ids %%--%%delete-excluded''.
  * the conversion of critical metadata (xattrs, etc.) should already have been done by the primary backup server. So ''%%--%%fake-super'' does nothing helpful here and is considered harmful (see below).
  * the converted metadata has to be replicated, which rsync/rsnapshot does not do by default. Therefore ''%%--%%xattrs'' has to be given **twice** so rsync will replicate this information.
  * ''%%--%%fake-super'' and ''%%--%%xattrs %%--%%xattrs'' are mutually exclusive.
  * long story short: on a subsequent backup server use ''rsync_long_args'' [TABULATOR] ''%%--%%delete %%--%%numeric-ids %%--%%delete-excluded %%--%%hard-links %%--%%acls %%--%%xattrs %%--%%xattrs''

==== draw a plot showing disk usage of a job ====

  JOB=mysystem
  JOBDIR=$( awk '$1 == "snapshot_root" { print $2 }' /etc/rsnapshot-backup.d/conf.$JOB )
  echo -n > /tmp/abs
  echo -n > /tmp/inc
  for i in /var/log/rsnapshot-backup/00-stats/*.xml ; do
    X=$( xmlstarlet sel -t -v 'rsnapshot-stats/du[@sys="'$JOBDIR'"]' $i )
    awk '$2 == "daily.0" { print $1 }' <<< "$X" >> /tmp/abs
    awk '$2 == "daily.1" { print $1 }' <<< "$X" >> /tmp/inc
  done
  # start gnuplot and enter the remaining two lines at its prompt
  gnuplot
  set style data linespoints
  plot "/tmp/abs","/tmp/inc"

==== draw plots showing disk usage of all jobs ====

This script fetches the status files from the backup server and draws disk usage plots.

  #!/bin/bash
  # read cfg. file
  . rsnapshot-backup-stats-eval.cfg
  if [ -z "$BACKUPSERVER" ]; then
    echo "BACKUPSERVER unset. Use BACKUPSERVER=... $0"
    echo "valid values for BACKUPSERVER would be some hostname or ip address"
    exit 1;
  fi
  if [ "$MODE" = "FLATFILE" ]; then
    rsync -vaSHAX root@${BACKUPSERVER}:/var/log/rsnapshot-backup/00-stats/status-????????-?????? .
    rm status.sqlite
    sqlite3 status.sqlite 'create table status (d int, s int, v varchar(255), bc varchar(255), bset varchar(10));'
    for i in status-* ; do
      d=${i#status-}
      d=${d%-*}
      echo -n $d ""
      awk -v d=$d 'BEGIN{ print "begin transaction;"; } { print "insert into status(d,s,v) values ("d","$1",\""$2"\");" } END { print "commit;"; } ' $i | sqlite3 status.sqlite
    done
    echo
  elif [ "$MODE" = "XML" ]; then
    rsync -vaSHAX root@${BACKUPSERVER}:/var/log/rsnapshot-backup/00-stats/20??-??-??T??:??:??.xml .
    rm status.sqlite
    sqlite3 status.sqlite 'create table status (d int, s int, v varchar(255), bc varchar(255), bset varchar(10));'
    for i in 20??-??-??T??:??:??.xml ; do
      d=${i%.xml}
      echo -n $d ""
      cnt=$( xmlstarlet sel -t -v "count(/rsnapshot-stats/du)" < $i )
      echo -n $cnt ""
      ( # generate sql
        echo "begin transaction;"
        for j in $( seq 1 $cnt ); do
          sys=$( xmlstarlet sel -t -v "/rsnapshot-stats/du[$j]/@sys" < $i )
          #dev# echo -n $j $sys "" 1>&2
          du_data=$( xmlstarlet sel -t -v "/rsnapshot-stats/du[$j]" < $i )
          while read s bset ; do
            if [ -z "$s" ]; then continue ; fi ## skip empty lines
            bc=${sys##*/}
            #dev# echo "BC: $sys -> $bc" 1>&2
            echo "insert into status(d,s,v,bc,bset) values (strftime('%Y%m%d','$d'),'$s','$sys/$bset','$bc','$bset');"
          done <<< "$du_data"
        done
        echo "commit;"
      ) | tee -a sql | sqlite3 status.sqlite
      #dev# echo -n ": "
    done
    echo
  else
    echo "MODE unset. Use MODE=... $0"
    echo "valid values for MODE are: FLATFILE, XML"
    exit 1;
  fi
  echo "*** DB is loaded ***"
  # normalize:
  sqlite3 status.sqlite "update status set v = '/u/vhost15/daily.0' where v like '/u/213.178.162.74_vhost15/daily.0';"
  sqlite3 status.sqlite "update status set v = '/u/vhost14/daily.0' where v like '/u/213.178.162.72_vhost14/daily.0';"
  sqlite3 status.sqlite "update status set v = '/u/vhost13/daily.0' where v like '/u/213.178.162.70_vhost13/daily.0';"
  sqlite3 status.sqlite "update status set v = '/u/vhost12/daily.0' where v like '/u/213.178.162.82_vhost12/daily.0';"
  sqlite3 status.sqlite "update status set v = '/u/vhost15/daily.1' where v like '/u/213.178.162.74_vhost15/daily.1';"
  sqlite3 status.sqlite "update status set v = '/u/vhost14/daily.1' where v like '/u/213.178.162.72_vhost14/daily.1';"
  sqlite3 status.sqlite "update status set v = '/u/vhost13/daily.1' where v like '/u/213.178.162.70_vhost13/daily.1';"
  sqlite3 status.sqlite "update status set v = '/u/vhost12/daily.1' where v like '/u/213.178.162.82_vhost12/daily.1';"
  DAILY0=$( sqlite3 status.sqlite "select distinct(v) from status where v like
  '%daily.0';" )
  echo "DAILY0: $DAILY0"
  rm tab.*_D0 img.*_D0.png
  for i in $DAILY0 ; do
    h=${i%/*}
    h=${h##*/}_D0
    echo "### $i -> $h ###"
    sqlite3 status.sqlite -separator " " "select d,s from status where v like '$i';" > tab.$h
    gnuplot << ..EOF
  set key below
  set xdata time
  set timefmt "%Y%m%d"
  set format x "%d.%m.\n%Y"
  set terminal png medium size 1200,400
  set output "img.$h.png"
  plot "tab.$h" using 1:2 title "$h"
  ..EOF
  done
  DAILY1=$( sqlite3 status.sqlite "select distinct(v) from status where v like '%daily.1';" )
  echo "DAILY1: $DAILY1"
  rm tab.*_D1 img.*_D1.png
  for i in $DAILY1 ; do
    h=${i%/*}
    h=${h##*/}_D1
    echo "### $i -> $h ###"
    sqlite3 status.sqlite -separator " " "select d,s from status where v like '$i';" > tab.$h
    gnuplot << ..EOF
  set key below
  set xdata time
  set timefmt "%Y%m%d"
  set format x "%d.%m.\n%Y"
  set terminal png medium size 1200,400
  set output "img.$h.png"
  plot "tab.$h" using 1:2 title "$h"
  ..EOF
  done
  # print hit lists:
  echo "### absolute sizes ###"
  sqlite3 status.sqlite -separator " " "select max(s)/1E6,'GB',v from status where v like '%daily.0' group by v order by s;" | tail -10
  echo "### incremental sizes ###"
  sqlite3 status.sqlite -separator " " "select max(s)/1E6,'GB',v from status where v like '%daily.1' group by v order by s;" | tail -10

Example ''rsnapshot-backup-stats-eval.cfg'', the configuration file read at the top of the script:

  BACKUPSERVER="backuphost.example"
  MODE="XML"

==== how much time statistics take per backup client ====

  STATSFILE=2017-02-25T07:07:01.xml
  ( cd /var/log/rsnapshot-backup/00-stats
    CNT=$( xmlstarlet sel -t -v "count(rsnapshot-stats/du)" $STATSFILE )
    echo $CNT
    for i in $( seq 1 $CNT ) ; do
      SYS=$( xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/@sys" $STATSFILE )
      B=$( xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/@t" $STATSFILE )
      E=$( xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/end/@t" $STATSFILE )
      D=$(( $( date +%s -d $E ) - $( date +%s -d $B ) ))
      echo $SYS $D
    done | sort -n -k2 | awk '{s+=$2; print $0,s}END{print s}' | nl -ba
  )

==== show all error messages from last (daily) run ====

  # extract timestamp from crontab file
  # start time of the daily run minus one minute, printed as HH:MM
  HHMM=$( awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ &&
    $3 ~ /^\*$/ && $4 ~ /^\*$/ && $5 ~ /^\*$/ &&
    $6 ~ /^[a-zA-Z0-9-]+$/ && $7 == "/usr/sbin/rsnapshot-backup" &&
    $8 == "-i" && $9 == "daily" { printf "%02d:%02d\n", $2, ($1 > 0 ? $1 - 1 : 0) }' /etc/cron.d/rsnapshot-backup )
  # assume the backup run started yesterday
  CDATE=$( date -d "yesterday $HHMM" -Is )
  # create a file with mtime for comparison
  CFILE=$( mktemp /tmp/rsnapshot-backup-check-XXXXXX )
  touch -d "$CDATE" "$CFILE"
  FILES=$( find /var/log/rsnapshot-backup -newer "$CFILE" -name "log-error*" )
  for f in $FILES; do
    echo "### $f ####"
    cat $f
  done | less
  # clean up
  rm "$CFILE"

==== show how much time individual parts of the backup took ====

**rsnapshot-backup-timings**

  #!/usr/bin/perl
  use strict;
  use POSIX;
  # states of the log parser, in the order the commands appear in the log
  sub WAIT_FOR_RM () { 1 };
  sub WAIT_FOR_MV0 () { 2 };
  sub WAIT_FOR_MVL () { 3 };
  sub WAIT_FOR_CP () { 4 };
  sub WAIT_FOR_RSYNC () { 5 };
  sub WAIT_FOR_TOUCH () { 6 };
  sub WAIT_FOR_END () { 7 };
  # convert a log timestamp (dd/Mon/yyyy:hh:mm:ss) to epoch seconds
  sub decode_date ($) {
    my ( $d ) = @_;
    my ( $dd,$mmm,$yyyy,$hh,$mm,$ss ) = $d =~ m#(\d{2})/([A-Z][a-z][a-z])/(\d{4}):(\d{2}):(\d{2}):(\d{2})#;
    my $m = {qw{Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6 Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12}}->{$mmm} - 1;
    # print "$d = ( $yyyy,$mmm,$dd,$hh,$mm,$ss ) $m\n";
    my $e = POSIX::mktime($ss, $mm, $hh, $dd, $m, $yyyy-1900);
    # print scalar localtime $e,"\n";
    return $e;
  }
  FILE: for my $filename ( @ARGV ) {
    open F,$filename or do { warn "Can't open '$filename': $!"; next FILE; };
    my $state = WAIT_FOR_RM;
    my $d_rm;
    my $d_mv0;
    my $d_mvl;
    my $d_cp;
    my $d_rsync;
    my $d_touch;
    LINE: while (my $l = <F>) {
      $state eq WAIT_FOR_RM and do {
        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /bin/rm -rf # ) {
          my $d = $1;
          $d_rm = decode_date( $d ) ;
          $state = WAIT_FOR_MV0;
        }
        next LINE;
      };
      $state eq WAIT_FOR_MV0 and do {
        #dev# print "L: $l";
        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] mv # ) {
          my $d = $1;
          $d_mv0 = decode_date( $d ) ;
          $state = WAIT_FOR_CP;
        }
        next LINE;
      };
      $state eq WAIT_FOR_CP and do {
        #dev# print
"L CP: $l"; if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] mv # ) { my $d = $1; $d_mvl = decode_date( $d ) ; $state = WAIT_FOR_CP; } elsif ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /bin/cp # ) { my $d = $1; $d_cp = decode_date( $d ) ; $state = WAIT_FOR_RSYNC; } next LINE; }; #dev# print "L: $l"; $state eq WAIT_FOR_RSYNC and do { if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /usr/bin/rsync # ) { my $d = $1; $d_rsync = decode_date( $d ) ; $state = WAIT_FOR_TOUCH; } next LINE; }; $state eq WAIT_FOR_TOUCH and do { #dev# print "L: $l"; if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] touch # ) { my $d = $1; $d_touch = decode_date( $d ) ; $state = WAIT_FOR_END; } last LINE; }; } # /LINE: close F; if ( ! defined $d_touch or ! defined $d_rsync or ! defined $d_cp or ! defined $d_mv0 or ! defined $d_rm ) { warn "incomplete log '$filename' (touch='$d_touch' rsync='$d_rsync' cp='$d_cp' mv0='$d_mv0' rm='$d_rm')"; next FILE; } #dev# print "RM ",scalar localtime $d_rm,"\n"; #dev# print "MV0 ",scalar localtime $d_mv0,"\n"; #dev# print "CP ",scalar localtime $d_cp,"\n"; #dev# print "RSYNC ",scalar localtime $d_rsync,"\n"; #dev# print "TOUCH ",scalar localtime $d_touch,"\n"; printf "%s %d T %d RM %d %0.2f MV %d %0.2f CP %d %0.2f RSYNC %d %0.2f\n", $filename, $d_rm, $d_touch - $d_rm, $d_mv0 - $d_rm, ( $d_mv0 - $d_rm ) * 100 / ( $d_touch - $d_rm ), $d_cp - $d_mv0, ( $d_cp - $d_mv0 ) * 100 / ( $d_touch - $d_rm ), $d_rsync - $d_cp, ( $d_rsync - $d_cp ) * 100 / ( $d_touch - $d_rm ), $d_touch - $d_rsync, ( $d_touch - $d_rsync ) * 100 / ( $d_touch - $d_rm ), ; } === Usage per backup run === rsnapshot-backup-timings /var/log/rsnapshot-backup//log-2* > /tmp/backup-times- gnuplot set xdata time set timefmt "%s" plot "backup-times-" using 2:4 with linespoints plot "backup-times-" using 2:6 with linespoints plot "backup-times-" using 2:13 with linespoints # percentage: plot "backup-times-" using 2:5 with linespoints 
  # cp %
  plot "/tmp/backup-times-" using 2:13 with linespoints
  # rsync %
  plot "/tmp/backup-times-" using 2:16 with linespoints

=== Usage per most recent backup runs ===

  cd /var/log/rsnapshot-backup
  LATESTLOGS=$( for i in $( find . -maxdepth 1 -type d -mtime -2 ) ; do
      for j in $( find $i -maxdepth 1 -type f -mtime -1 -name "log-2*" ) ; do
        echo $j
      done
    done )
  /usr/local/bin/rsnapshot-backup-timings $LATESTLOGS > /tmp/backup-times-latest
  # start gnuplot and enter the plot commands at its prompt
  gnuplot
  set xdata time
  set timefmt "%s"
  # total
  plot "/tmp/backup-times-latest" using 2:4 with impulses
  # rm
  plot "/tmp/backup-times-latest" using 2:6 with impulses
  # cp
  plot "/tmp/backup-times-latest" using 2:12 with impulses
  # rsync
  plot "/tmp/backup-times-latest" using 2:15 with impulses
  # rm %
  plot "/tmp/backup-times-latest" using 2:7 with impulses
  # cp %
  plot "/tmp/backup-times-latest" using 2:13 with impulses
  # rsync %
  plot "/tmp/backup-times-latest" using 2:16 with impulses
  # sums (run these in the shell, not in gnuplot):
  # T
  awk '{ s += $4 ; c++ } END { print s }' /tmp/backup-times-latest
  # rm
  awk '{ s += $6 ; c++ } END { print s }' /tmp/backup-times-latest
  # cp
  awk '{ s += $12 ; c++ } END { print s }' /tmp/backup-times-latest
  # rsync
  awk '{ s += $15 ; c++ } END { print s }' /tmp/backup-times-latest

===== development information =====

[[project:rsnapshot-backup:development]]

===== changelog =====

[[project:rsnapshot-backup:changelog]]