Archive Backup - Ashkal Alwan - English

No complete, agreed-upon solution exists yet for backing up the Ashkal Alwan archive, but a few guidelines were drawn up to ensure that a simple yet comprehensive backup process can take place, and that it is meaningful for this particular kind of data: organized audiovisual content. They are described below:

Logistics

Ashkal Alwan's current storage, which holds all of their data, is a LaCie cube RAID device that can hold 16 TB of data. At the time of writing, 5.3 TB of it is in use.

To streamline a simple and regular backup process, it was agreed that Ashkal Alwan will use 4 TB chunks of storage as the storage/backup unit.

Those chunks are easier to track for changes than a messy, ever-growing collection of files.

Those chunks, presented as partitions, represent volumes within the pan.do/ra backend.

Every chunk is to be named. The proposed naming scheme (which is already in use) is described below. Every backup chunk is also identifiable with its original:


Naming Scheme

Every chunk is named with a source identifier (in order to accommodate future bulk bequests or similar) and has a unique sequence number within the group of volumes coming from that source. Those chunks are considered living volumes, meaning that they are expected to change.

Every backup MUST retain the name of its originating living volume. It must also carry a sequence number identifying which backup copy it is at the time of backup, and a timestamp representing the date when the backup was performed, in the following format:

XYZNN[BN[YYYYMMDD[-N]]]

Where XYZ is an abbreviation of the source name, N is a sequential numerical value, and YYYYMMDD represents a date. Square brackets mark optional parts.

Example

AAA01 : Ashkal Alwan Archive chunk/disk/partition/volume number one (01)

AAA01B120151027 : backup number 1 of the above volume, dated 27 October 2015

Those names SHOULD reflect the naming in the pan.do/ra backend for easier tracking.
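
As an illustration only, the naming scheme could be captured in a few small shell helpers. This is a minimal sketch and not part of the agreed procedure: the function names living_name, backup_name and is_valid_name are hypothetical, and the pattern assumes a three-letter uppercase source abbreviation and a single-digit backup number, as the template above literally reads.

```shell
#!/bin/sh
# Hypothetical helpers for the XYZNN[BN[YYYYMMDD[-N]]] naming scheme.

# living_name SOURCE SEQ
# Prints the name of a living volume. Pass SEQ as a plain integer
# without leading zeros (printf would read "08" as invalid octal).
living_name() {
    printf '%s%02d\n' "$1" "$2"
}

# backup_name LIVING COPY DATE [SUFFIX]
# Prints the name of a backup of LIVING. DATE is YYYYMMDD; the optional
# SUFFIX fills the trailing "-N" part of the template.
backup_name() {
    if [ -n "${4:-}" ]; then
        printf '%sB%s%s-%s\n' "$1" "$2" "$3" "$4"
    else
        printf '%sB%s%s\n' "$1" "$2" "$3"
    fi
}

# is_valid_name NAME
# Succeeds when NAME matches the template (assumed: 3 uppercase letters,
# 2-digit volume number, 1-digit backup number).
is_valid_name() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Z]{3}[0-9]{2}(B[0-9]([0-9]{8}(-[0-9]+)?)?)?$'
}
```

For example, backup_name AAA01 1 20151027 prints AAA01B120151027, which is_valid_name accepts.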

Current backup situation

There are now 2 external disks, each holding around 4 TB of data. They are to be used in rotation to back up data from the LaCie disks. The LaCie disks should by now be partitioned into 4 “living” partitions, of which one is almost full and another is just beginning to be populated.

Suggestions and Advice

The directory structures that hold the files are a poorly described yet explicit source of information about the existing data, so keeping those structures for future reference might be a good idea. This can be accomplished by using hard links (and by retaining them, backing them up, etc.).

Hard links can also be used to keep snapshots of structures across the life cycle of a living volume, and to retain old structures when newer structures for storing the data are agreed upon.

A simple proposition for a solution

As Interlink did not get back to us with a specific product name or its features and capabilities, I describe below my outline of a very simple and free yet powerful backup procedure. I also mention caveats where possible.


Keep Old Structures

After every backup cycle, clone the structure tree into a parallel tree of hard links using a simple Unix command. The parallel tree should be hidden and should carry a timestamp:

cp -lr volpath/filesRoot volpath/.filesRootBYYYYMMDD

Example:

cp -lr /Volumes/AAA01/NEO_DIGITAL_ARCHIVE /Volumes/AAA01/.NEO_DIGITAL_ARCHIVE_B20151031
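
The snapshot step could be wrapped in a small helper, sketched here under a few assumptions: the names snapshot_tree and same_inode are hypothetical, the hidden tree is named with an underscore as in the example above, and the cp in use supports -l (GNU cp does). The guard is there because cp would otherwise place the copy inside a destination directory that already exists.

```shell
#!/bin/sh
# Hypothetical wrapper around the hard-link snapshot command.

# snapshot_tree VOLPATH ROOTNAME [YYYYMMDD]
# Clones VOLPATH/ROOTNAME into the hidden tree VOLPATH/.ROOTNAME_BYYYYMMDD.
# Files are hard-linked, not copied, so the snapshot uses almost no space.
snapshot_tree() {
    vol=$1
    root=$2
    stamp=${3:-$(date +%Y%m%d)}
    dest="$vol/.${root}_B$stamp"
    if [ -e "$dest" ]; then
        echo "snapshot already exists: $dest" >&2
        return 1
    fi
    cp -lr "$vol/$root" "$dest"
}

# same_inode FILE1 FILE2
# Succeeds when both paths on the same volume share one inode number,
# i.e. they are hard links to the same data.
same_inode() {
    a=$(ls -di "$1" | awk '{ print $1 }')
    b=$(ls -di "$2" | awk '{ print $1 }')
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A call such as snapshot_tree /Volumes/AAA01 NEO_DIGITAL_ARCHIVE 20151031 would correspond to the example above, and same_inode can then be used to spot-check a file against its counterpart in the snapshot.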

Create carbon copies, keep hard links

Create a carbon copy of the volumes/partitions that also retains hard-link relations, using the rsync tool. The following syntax is suggested:

rsync -raiHv --info=progress2 SRCDISK/ DSTDISK/

(You might remove --info=progress2, or replace it with --progress, on Mac.)

Example:

rsync -raiHv /Volumes/AAA01/ /Volumes/

Cycle carefully between disks

When creating a new backup, pick a new disk. If none is at hand, pick a backup disk for which you have access to a later backup of the same living disk (i.e., find AAA01B2YYYYMMDD and then use AAA01B1YYYYMMDD as your backup disk). Either way, take clear note that you will need to reset your backup counter; renaming backup volumes outside the backup process is not a good idea.
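
The choice of which disk to reuse could be sketched as follows. This is only an illustration of the rule above: the function name oldest_backup is hypothetical, and it assumes backup names with a single-digit backup number followed by the date, as in AAA01B120151027.

```shell
#!/bin/sh
# Hypothetical helper for picking a backup disk to reuse.

# oldest_backup
# Reads backup volume names (one per line) on stdin and prints the one
# with the earliest date, but only when a later-dated backup also exists.
# Fails (non-zero status) when there is no later backup to fall back on.
oldest_backup() {
    # Prefix every well-formed name with its date so sort can order them.
    sed -n 's/^\([A-Z]\{3\}[0-9]\{2\}B[0-9]\)\([0-9]\{8\}\).*$/\2 &/p' |
        sort |
        awk 'NR == 1 { first = $2; d = $1 }
             { last = $1 }
             END { if (NR >= 2 && last > d) print first; else exit 1 }'
}
```

Given the names AAA01B220151115 and AAA01B120151027, it prints AAA01B120151027; given a single name, it prints nothing and fails, since overwriting the only backup would leave no copy.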

Compare files on disk to identify changes

To compare disks (a backup against the current files), also use rsync, in "dry run" mode, by adding the option -n to the above syntax:

Example:

rsync -raiHvn /Volumes/AAA01/ /Volumes/AAA01B20151031/
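
The dry-run comparison could be wrapped so that it yields a single number to note down after each check. A minimal sketch, assuming rsync is installed; the names list_changes and count_changes are hypothetical. The -v option is left out here because its extra header and summary lines would distort the count.

```shell
#!/bin/sh
# Hypothetical wrappers around the rsync dry-run comparison.

# list_changes SRC DST
# Prints one itemized line per entry that rsync would create or update
# on DST. Nothing is copied, because of -n (dry run).
list_changes() {
    rsync -raiHn "${1%/}/" "${2%/}/"
}

# count_changes SRC DST
# Prints the number of entries that differ; 0 means the backup is current.
count_changes() {
    list_changes "$1" "$2" | wc -l | tr -d ' '
}
```

Since -n prevents any transfer, running count_changes before a backup gives a rough idea of how much has changed on the living volume since the last one.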

Preventing data loss

Keep as many copies as you can, and keep as many copies offsite as you can. Preserving digital data depends entirely on an abundance of copies.

Database Backup

The database is backed up every day, and a dump is stored in /srv/pgdumps. This is controlled by the scripts /usr/local/bin/pg_dump_db.sh and /usr/local/bin/pg_dump_housekeeping.sh, and scheduled with two crontab entries in /etc/crontab.
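
Since the dumps are expected daily, a quick check that a fresh one exists might look like the sketch below. It says nothing about what the two scripts themselves do; the function name has_recent_dump is hypothetical, and it assumes the dumps are regular files somewhere under the dump directory.

```shell
#!/bin/sh
# Hypothetical freshness check for the daily database dumps.

# has_recent_dump [DIR]
# Succeeds when DIR (default /srv/pgdumps) contains at least one file
# modified within the last 24 hours.
has_recent_dump() {
    dir=${1:-/srv/pgdumps}
    [ -n "$(find "$dir" -type f -mtime -1 2>/dev/null | head -n 1)" ]
}
```

It could be run by hand or from a monitoring job, for example: has_recent_dump || echo "no database dump in the last day" >&2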


Further Ideas

Metadata to files