rsnapshot-backup

rsnapshot-backup is a frontend to rsnapshot for managing larger backup sites (several dozen systems) of Unix-like operating system instances. It is easy to use; in particular, additional backup clients can be configured in a jiffy.

components

installation

cat << EOF >> /etc/apt/sources.list.d/lihas.list
deb http://ftp.lihas.de/debian stable main
EOF
wget -O - http://ftp.lihas.de/debian/apt-key-lihas.gpg | apt-key add -
apt-get update
apt-get install rsnapshot-backup

operation

Normally, rsnapshot-backup runs unattended. You only need to keep an eye on disk space.
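
Keeping an eye on disk space can be scripted. The following is a minimal sketch, not part of rsnapshot-backup: the function names fs_usage_percent and check_backup_space and the default limit of 90 percent are assumptions made for illustration.

```shell
# Print the usage (in percent, without the % sign) of the filesystem holding a directory.
fs_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding DIR is at or above LIMIT percent (default 90).
# Returns 0 if below the limit, 1 if at or above it, 2 if usage could not be determined.
check_backup_space() {
    dir=$1
    limit=${2:-90}
    used=$(fs_usage_percent "$dir")
    case $used in
        ''|*[!0-9]*) echo "cannot determine usage of $dir" >&2; return 2 ;;
    esac
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $dir is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $dir is ${used}% full"
}
```

Called as check_backup_space DIR 85, where DIR is the snapshot_root of a job, it could be run from cron so that a warning is mailed before the backup disk fills up.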

usage: /usr/sbin/rsnapshot-backup [ -h ] [ -i { hourly | daily | weekly | monthly } ]
-h  help
-i interval. One of hourly, daily, weekly, monthly. Default: daily.
-c config file. Use alternate config file. Default: /etc/rsnapshot-backup.d/conf
-C check connectivity for all enabled backup jobs.
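
For manual runs, the options above combine as in the following sketch. The wrapper run_backup and the DRY_RUN variable are hypothetical helpers, not part of rsnapshot-backup; only the -i and -c options and the binary path are taken from the usage text.

```shell
# Hypothetical wrapper: validate the interval, then run the backup,
# or only print the command line when DRY_RUN=1 is set.
run_backup() {
    interval=${1:-daily}
    conf=${2:-}
    case "$interval" in
        hourly|daily|weekly|monthly) ;;
        *) echo "invalid interval: $interval" >&2; return 1 ;;
    esac
    set -- /usr/sbin/rsnapshot-backup -i "$interval"
    # -c is only passed when an alternate config file was given
    [ -n "$conf" ] && set -- "$@" -c "$conf"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With DRY_RUN=1 the call run_backup weekly only prints the command, which is handy for checking a cron entry before enabling it.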

create a backup job

resources

/usr/share/doc/rsnapshot-backup holds some template and example files:

see also

tips & tricks

make rsnapshot-backup aware of ACLs and Extended Attributes

Debian 8 (Jessie) uses Capabilities (try: getcap /bin/ping) which are stored as Extended Attributes.

Rsync needs --acls --xattrs to handle ACLs and XATTRs.

These parameters have to be added per backup job.

The copy & paste code below adds them to every file matching /etc/rsnapshot-backup.d/conf.*. If a file already has lines beginning with rsync_long_args, the missing options are appended to those lines. Otherwise the line rsync_long_args\t\t--delete --numeric-ids --relative --delete-excluded --acls --xattrs is inserted after the last line containing rsync_long_args, which is usually found in a comment block. If no such comment exists either, the code has no way of finding the right spot: it prints an error and leaves it to the admin to insert the line manually.

for C in /etc/rsnapshot-backup.d/conf.* ; do
vo -o $C
perl -e '
    $sq = "\x27"; # single quote
    $nl = "rsync_long_args\t\t--delete --numeric-ids --relative --delete-excluded --acls --xattrs\n";
    open F,$ARGV[0] or die "can${sq}t read ${sq}$ARGV[0]${sq}";
    my @L = <F>;
    close F;
    open F,">",$ARGV[0] or die "can${sq}t write ${sq}$ARGV[0]${sq}";
    @L2 = grep { /^rsync_long_args\t/ } @L;
    @L3 = grep { /\brsync_long_args\b/ } @L;
    if ( @L2 ) {
        print STDERR "adding to rsync_long_args\n";
        for my $l ( @L ) {
            if ( $l =~ /^rsync_long_args\t/ ) {
                $ll = $l;
                $ll =~ s/^rsync_long_args\t\t*//;
                $ll =~ s/\s*$//;
                %A = map { $_ => 1 } split /\s+/,$ll;
                unless ( $A{"--acls"} ) {
                    $l =~ s/\n/ --acls\n/;
                }
                unless ( $A{"--xattrs"} ) {
                    $l =~ s/\n/ --xattrs\n/;
                }
            }
            print F $l;
        }
    } elsif ( @L3 ) {
        print STDERR "adding rsync_long_args line\n";
        for (my $i=$#L; $i>=0; $i--) {
            if ( $L[$i] =~ /[^a-zA-Z_+]rsync_long_args\b/ ) {
                splice(@L,$i+1,0,$nl);
                last;
            }
        }
        print F @L;
    } else {
        # unchanged
        print STDERR "ERROR: doing nothing, add to ${sq}$ARGV[0]${sq} manually:\n";
        print STDERR $nl;
        print STDERR "Insert TABS after ${sq}rsync_long_args${sq}!\n";
        print F @L;
    }
    close F;
' $C
rcsdiff -u $C
vo -i $C
done
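
To verify the result, a small check can report which job files carry both options on an active rsync_long_args line. This is a sketch under the assumption that active lines start with rsync_long_args followed by a tab, as the Perl code above expects; the function name check_xattr_flags is made up for this example.

```shell
# Report, per config file, whether the (last) active rsync_long_args line
# contains both --acls and --xattrs.
check_xattr_flags() {
    for f in "$@"; do
        awk -v f="$f" '
            /^rsync_long_args\t/ {
                seen = 1
                acls = 0; xattrs = 0
                for (i = 2; i <= NF; i++) {
                    if ($i == "--acls")   acls = 1
                    if ($i == "--xattrs") xattrs = 1
                }
            }
            END {
                if (!seen)               print f ": no active rsync_long_args line"
                else if (acls && xattrs) print f ": ok"
                else                     print f ": missing --acls and/or --xattrs"
            }' "$f"
    done
}
```

Run it as check_xattr_flags /etc/rsnapshot-backup.d/conf.* and fix any file that is not reported as ok.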

restore from backups created using "--fake-super"

The situation:

To restore contents from this kind of backup you again need --fake-super:
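
A minimal sketch of such a restore, assuming it is started on the backup server and pushes the data to the client: rsync's --fake-super only affects the side on which it is given, so here it applies to the local (sending) side that holds the backup. The helper restore_cmdline and both paths in the usage note are hypothetical; the helper only prints the command so it can be reviewed before it is run.

```shell
# Print an rsync command line that restores SRC (a path inside a snapshot on
# the backup server) to DEST (e.g. root@client:/path/).
# --fake-super makes the local rsync read ownership and modes from the special
# extended attributes that were written when the backup was taken.
restore_cmdline() {
    src=$1
    dest=$2
    printf 'rsync -aHAX --numeric-ids --fake-super %s %s\n' "$src" "$dest"
}
```

For example, restore_cmdline /srv/backup/mysystem/daily.0/etc/ root@mysystem.example:/etc/ prints the command for restoring /etc of one client. When pulling from the client side instead, the option has to reach the backup server, for which rsync offers -M--fake-super.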

special considerations using a secondary backup server

If you want a secondary backup server that replicates the daily.0 backups from the primary backup server (the one fetching the data from the backup clients), there are some things to consider on the secondary (tertiary, etc.) system(s):

draw a plot showing disk usage of a job

JOB=mysystem
JOBDIR=$( awk '$1 == "snapshot_root" { print $2 }' /etc/rsnapshot-backup.d/conf.$JOB )

echo -n > /tmp/abs 
echo -n > /tmp/inc
for i in /var/log/rsnapshot-backup/00-stats/*.xml ; do
	X=$( xmlstarlet sel -t -v 'rsnapshot-stats/du[@sys="'$JOBDIR'"]' $i ) 
	awk '$2 == "daily.0" { print $1 }' <<< "$X" >> /tmp/abs
	awk '$2 == "daily.1" { print $1 }' <<< "$X" >> /tmp/inc
done 

gnuplot
  set style data linespoints
  plot "/tmp/abs","/tmp/inc"

draw plots showing disk usage of all jobs

This script fetches the status files from the backup server and draws disk usage plots.

rsnapshot-backup-stats-eval
#!/bin/bash
 
# read cfg. file
. rsnapshot-backup-stats-eval.cfg
 
if [ -z "$BACKUPSERVER" ]; then
    echo "BACKUPSERVER unset. Use BACKUPSERVER=... $0"
    echo "valid values for BACKUPSERVER would be some hostname or ip address"
    exit 1;
fi
 
if [ "$MODE" = "FLATFILE" ]; then
 
    rsync -vaSHAX root@${BACKUPSERVER}:/var/log/rsnapshot-backup/00-stats/status-????????-?????? .
 
    rm -f   status.sqlite
    sqlite3 status.sqlite 'create table status (d int, s int, v varchar(255), bc varchar(255), bset varchar(10));'
 
    for i in status-* ; do
        d=${i#status-}
        d=${d%-*}
        echo -n $d ""
        awk -v d=$d 'BEGIN{
                print "begin transaction;";
            }
            { print "insert into status(d,s,v) values ("d","$1",\""$2"\");" }
            END {
                print "commit;";
            }
            ' $i |
            sqlite3 status.sqlite
    done
    echo
 
elif [ "$MODE" = "XML" ]; then
    rsync -vaSHAX root@${BACKUPSERVER}:/var/log/rsnapshot-backup/00-stats/20??-??-??T??:??:??.xml .
 
    rm -f   status.sqlite
    sqlite3 status.sqlite 'create table status (d int, s int, v varchar(255), bc varchar(255), bset varchar(10));'
 
    for i in 20??-??-??T??:??:??.xml ; do
        d=${i%.xml}
        echo -n $d ""
        cnt=$( xmlstarlet sel -t -v "count(/rsnapshot-stats/du)" < $i )
        echo -n $cnt ""
        (  # generate sql
            echo "begin transaction;"
            for j in $( seq 1 $cnt ); do
                sys=$(
                    xmlstarlet sel -t -v "/rsnapshot-stats/du[$j]/@sys" < $i
                )
                #dev# echo -n $j $sys "" 1>&2
                du_data=$(
                    xmlstarlet sel -t -v "/rsnapshot-stats/du[$j]" < $i
                )
                while read s bset ; do
                    if [ -z "$s" ]; then continue ; fi  ## skip empty lines
                    bc=${sys##*/}
                    #dev# echo "BC: $sys -> $bc" 1>&2
                    echo "insert into status(d,s,v,bc,bset) values (strftime('%Y%m%d','$d'),'$s','$sys/$bset','$bc','$bset');"
                done <<< "$du_data"
            done
            echo "commit;"
        ) | tee -a sql | sqlite3 status.sqlite
        #dev# echo -n ": "
    done
    echo
 
else
    echo "MODE unset. Use MODE=... $0"
    echo "valid values for MODE are: FLATFILE, XML"
    exit 1;
fi
 
echo "*** DB is loaded ***"
 
# normalize:
sqlite3 status.sqlite "update status set v = '/u/vhost15/daily.0' where v like '/u/213.178.162.74_vhost15/daily.0';"
sqlite3 status.sqlite "update status set v = '/u/vhost14/daily.0' where v like '/u/213.178.162.72_vhost14/daily.0';"
sqlite3 status.sqlite "update status set v = '/u/vhost13/daily.0' where v like '/u/213.178.162.70_vhost13/daily.0';"
sqlite3 status.sqlite "update status set v = '/u/vhost12/daily.0' where v like '/u/213.178.162.82_vhost12/daily.0';"
 
sqlite3 status.sqlite "update status set v = '/u/vhost15/daily.1' where v like '/u/213.178.162.74_vhost15/daily.1';"
sqlite3 status.sqlite "update status set v = '/u/vhost14/daily.1' where v like '/u/213.178.162.72_vhost14/daily.1';"
sqlite3 status.sqlite "update status set v = '/u/vhost13/daily.1' where v like '/u/213.178.162.70_vhost13/daily.1';"
sqlite3 status.sqlite "update status set v = '/u/vhost12/daily.1' where v like '/u/213.178.162.82_vhost12/daily.1';"
 
DAILY0=$(
    sqlite3 status.sqlite "select distinct(v) from status where v like '%daily.0';"
)
echo "DAILY0: $DAILY0"
 
rm -f tab.*_D0 img.*_D0.png
 
for i in $DAILY0 ; do
  h=${i%/*}
  h=${h##*/}_D0
  echo "### $i -> $h ###"
  sqlite3 status.sqlite -separator " " "select d,s from status where v like '$i';" > tab.$h
  gnuplot << ..EOF
  set key below
  set xdata time
  set timefmt "%Y%m%d"
  set format x "%d.%m.\n%Y"
  set terminal png medium size 1200,400
  set output "img.$h.png"
  plot "tab.$h" using 1:2 title "$h"
..EOF
 
done
 
 
DAILY1=$(
    sqlite3 status.sqlite "select distinct(v) from status where v like '%daily.1';"
)
echo "DAILY1: $DAILY1"
 
rm -f tab.*_D1 img.*_D1.png
 
for i in $DAILY1 ; do
  h=${i%/*}
  h=${h##*/}_D1
  echo "### $i -> $h ###"
  sqlite3 status.sqlite -separator " " "select d,s from status where v like '$i';" > tab.$h
  gnuplot << ..EOF
  set key below
  set xdata time
  set timefmt "%Y%m%d"
  set format x "%d.%m.\n%Y"
  set terminal png medium size 1200,400
  set output "img.$h.png"
  plot "tab.$h" using 1:2 title "$h"
..EOF
 
done
 
# print hit lists:
echo "### absolute sizes ###"
sqlite3 status.sqlite -separator " " "select max(s)/1E6,'GB',v from status where v like '%daily.0' group by v order by s;" | tail -10
echo "### incremental sizes ###"
sqlite3 status.sqlite -separator " " "select max(s)/1E6,'GB',v from status where v like '%daily.1' group by v order by s;" | tail -10

rsnapshot-backup-stats-eval.cfg
BACKUPSERVER="backuphost.example"
MODE="XML"

how much time statistics take per backup client

STATSFILE=2017-02-25T07:07:01.xml

(
cd /var/log/rsnapshot-backup/00-stats
CNT=$(     xmlstarlet sel -t -v "count(rsnapshot-stats/du)"     $STATSFILE )
echo $CNT
for i in $( seq 1 $CNT ) ; do
    SYS=$( xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/@sys"   $STATSFILE )
    B=$(   xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/@t"     $STATSFILE )
    E=$(   xmlstarlet sel -t -v "rsnapshot-stats/du[$i]/end/@t" $STATSFILE )
    D=$(( $( date +%s -d $E ) - $( date +%s -d $B ) ))
    echo $SYS $D
done | sort -n -k2 | awk '{s+=$2; print $0,s}END{print s}' | nl -ba
)

show all error messages from last (daily) run

# extract the start time (HH:MM, one minute early) from the crontab file
HHMM=$(
awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ &&
     $3 ~ /^\*$/     && $4 ~ /^\*$/     &&
     $5 ~ /^\*$/     && $6 ~ /^[a-zA-Z0-9-]+$/ &&
     $7 == "/usr/sbin/rsnapshot-backup" &&
     $8 == "-i" &&
     $9 == "daily" { m = ($1 > 0) ? $1 - 1 : 0; printf "%02d:%02d\n", $2, m }' /etc/cron.d/rsnapshot-backup
)
# assume the backup run started yesterday
CDATE=$( date -d "yesterday $HHMM" -Is ) 
# create a file with mtime for comparison
CFILE=$( mktemp /tmp/rsnapshot-backup-check-XXXXXX )
touch -d "$CDATE" "$CFILE"
FILES=$( 
	find /var/log/rsnapshot-backup -newer "$CFILE" -name "log-error*"
) 
for f in $FILES; do 
	echo "### $f ####"
	cat  $f
done | less
# clean up 
rm "$CFILE"

show how much time individual parts of the backup took

rsnapshot-backup-timings

#!/usr/bin/perl


use strict;
use POSIX;

sub WAIT_FOR_RM    () { 1 };
sub WAIT_FOR_MV0   () { 2 };
sub WAIT_FOR_MVL   () { 3 };
sub WAIT_FOR_CP    () { 4 };
sub WAIT_FOR_RSYNC () { 5 };
sub WAIT_FOR_TOUCH () { 6 };
sub WAIT_FOR_END   () { 7 };

sub decode_date ($) {
        my ( $d ) = @_;
        my ( $dd,$mmm,$yyyy,$hh,$mm,$ss ) = $d =~ m#(\d{2})/([A-Z][a-z][a-z])/(\d{4}):(\d{2}):(\d{2}):(\d{2})#;
        my $m = {qw{Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6 Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12}}->{$mmm} - 1;
        # print "$d = ( $yyyy,$mmm,$dd,$hh,$mm,$ss ) $m\n";
        my $e = POSIX::mktime($ss, $mm, $hh, $dd, $m, $yyyy-1900);
        # print scalar localtime $e,"\n";
        return $e;
}

FILE: for my $filename ( @ARGV ) {
        open F,$filename or do {
                warn "Can't open '$filename': $!";
                next FILE;
        };

        my $state = WAIT_FOR_RM;
        my $d_rm;
        my $d_mv0;
        my $d_mvl;
        my $d_cp;
        my $d_rsync;
        my $d_touch;

        LINE: while (my $l = <F>) {
                $state eq WAIT_FOR_RM and do {
                        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /bin/rm -rf # ) {
                                my $d = $1;
                                $d_rm = decode_date( $d ) ;
                                $state = WAIT_FOR_MV0;
                        }
                        next LINE;
                }; 
                $state eq WAIT_FOR_MV0 and do {
                #dev# print "L: $l";
                        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] mv # ) {
                                my $d = $1;
                                $d_mv0 = decode_date( $d ) ;
                                $state = WAIT_FOR_CP;
                        }
                        next LINE;
                }; 
                $state eq WAIT_FOR_CP and do {
                #dev# print "L CP: $l";
                        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] mv # ) {
                                my $d = $1;
                                $d_mvl = decode_date( $d ) ;
                                $state = WAIT_FOR_CP;
                        } elsif (  $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /bin/cp # ) {
                                my $d = $1;
                                $d_cp = decode_date( $d ) ;
                                $state = WAIT_FOR_RSYNC;
                        }
                        next LINE;
                }; 
                #dev# print "L: $l";
                $state eq WAIT_FOR_RSYNC and do {
                        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] /usr/bin/rsync # ) {
                                my $d = $1;
                                $d_rsync = decode_date( $d ) ;
                                $state = WAIT_FOR_TOUCH;
                        }
                        next LINE;
                }; 
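                $state eq WAIT_FOR_TOUCH and do {
                #dev# print "L: $l";
                        if ( $l =~ m#^\[(\d{2}/[A-Z][a-z][a-z]/\d{4}:\d{2}:\d{2}:\d{2})\] touch # ) {
                                my $d = $1;
                                $d_touch = decode_date( $d ) ;
                                $state = WAIT_FOR_END;
                                # all timestamps collected, stop reading this log
                                last LINE;
                        }
                        # keep scanning until the touch line shows up
                        next LINE;
                }; 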
        } # /LINE: 
        close F;
        if ( ! defined $d_touch or ! defined $d_rsync or ! defined $d_cp or ! defined $d_mv0 or ! defined $d_rm ) {
                warn "incomplete log '$filename' (touch='$d_touch' rsync='$d_rsync' cp='$d_cp' mv0='$d_mv0' rm='$d_rm')";
                next FILE;
        }

        #dev# print "RM    ",scalar localtime $d_rm,"\n";
        #dev# print "MV0   ",scalar localtime $d_mv0,"\n";
        #dev# print "CP    ",scalar localtime $d_cp,"\n";
        #dev# print "RSYNC ",scalar localtime $d_rsync,"\n";
        #dev# print "TOUCH ",scalar localtime $d_touch,"\n";

        printf "%s %d T %d  RM %d %0.2f   MV %d %0.2f   CP %d %0.2f   RSYNC %d %0.2f\n",
                $filename,
                $d_rm,     
                $d_touch - $d_rm,  
                $d_mv0   - $d_rm,    ( $d_mv0   - $d_rm    ) * 100 / ( $d_touch - $d_rm ),
                $d_cp    - $d_mv0,   ( $d_cp    - $d_mv0   ) * 100 / ( $d_touch - $d_rm ),
                $d_rsync - $d_cp,    ( $d_rsync - $d_cp    ) * 100 / ( $d_touch - $d_rm ),
                $d_touch - $d_rsync, ( $d_touch - $d_rsync ) * 100 / ( $d_touch - $d_rm ),
        ;
}
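
The script prints one line per log file. Going by its printf call, the whitespace-separated columns are: 1 file name, 2 start time (epoch seconds), 4 total seconds, 6 and 7 rm seconds and percent, 9 and 10 mv, 12 and 13 cp, 15 and 16 rsync. The following sketch relabels such lines for reading; the function name timings_label is made up for this example.

```shell
# Relabel the columns of rsnapshot-backup-timings output (reads files or stdin).
# Column numbers follow the printf format of the Perl script above.
timings_label() {
    awk '{
        printf "start=%s total=%ds rm=%ds(%s%%) mv=%ds(%s%%) cp=%ds(%s%%) rsync=%ds(%s%%)\n",
               $2, $4, $6, $7, $9, $10, $12, $13, $15, $16
    }' "$@"
}
```

The same column numbers apply to the gnuplot using clauses and the awk sums in the usage sections.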

Usage per backup run

  
rsnapshot-backup-timings /var/log/rsnapshot-backup/<HOSTNAME>/log-2* > /tmp/backup-times-<HOSTNAME>
gnuplot
set xdata time
set timefmt "%s"
# total
plot "/tmp/backup-times-<HOSTNAME>" using 2:4 with linespoints
# rm
plot "/tmp/backup-times-<HOSTNAME>" using 2:6 with linespoints
# cp
plot "/tmp/backup-times-<HOSTNAME>" using 2:12 with linespoints
# rsync
plot "/tmp/backup-times-<HOSTNAME>" using 2:15 with linespoints
# percentage (rm, cp, rsync):
plot "/tmp/backup-times-<HOSTNAME>" using 2:7 with linespoints
plot "/tmp/backup-times-<HOSTNAME>" using 2:13 with linespoints
plot "/tmp/backup-times-<HOSTNAME>" using 2:16 with linespoints

Usage per most recent backup runs

cd /var/log/rsnapshot-backup

LATESTLOGS=$(
for i in $( find . -maxdepth 1 -type d -mtime -2 ) ; do
    for j in $( find $i -maxdepth 1 -type f -mtime -1 -name "log-2*" ) ; do
        echo $j
    done
done
)
/usr/local/bin/rsnapshot-backup-timings $LATESTLOGS > /tmp/backup-times-latest
gnuplot
set xdata time
set timefmt "%s"
# total
plot "/tmp/backup-times-latest" using 2:4 with impulses
# rm
plot "/tmp/backup-times-latest" using 2:6 with impulses
# cp
plot "/tmp/backup-times-latest" using 2:12 with impulses
# rsync
plot "/tmp/backup-times-latest" using 2:15 with impulses

# rm %
plot "/tmp/backup-times-latest" using 2:7 with impulses
# cp %
plot "/tmp/backup-times-latest" using 2:13 with impulses
# rsync %
plot "/tmp/backup-times-latest" using 2:16 with impulses

# sums (run in the shell, not in gnuplot):
# T
awk '{ s += $4 }END{print s}' /tmp/backup-times-latest
# rm
awk '{ s += $6 }END{print s}' /tmp/backup-times-latest
# cp
awk '{ s += $12 }END{print s}' /tmp/backup-times-latest
# rsync
awk '{ s += $15 }END{print s}' /tmp/backup-times-latest

development information

development

changelog

changelog