2008-08-14

Syncing thousands of files across multiple destinations in UNIX

Product: UNIX, Linux

Most UNIX file systems, regardless of type, have no self-defragmenting feature, or defragmentation requires an extra license plus configuration (re-format, re-mount) to use. This post describes a free and relatively quick way to defragment and improve file system response time.

Consider an application that has created roughly 100,000 files, about 4 GB in total, which need to be moved out to a dedicated mount point. The "ls" command alone takes 3 minutes to return, and a tar-untar copy takes 45 minutes to complete.
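
For reference, the slowness is easy to measure before and after the migration; a rough check might look like this (using the example path above):

    # count the files in the directory (slow when the directory inode is large)
    ls /dir1 | wc -l

    # time how long the listing itself takes
    time ls /dir1 > /dev/null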

The challenge is to complete the migration within a 30-minute downtime window.

Assumption: most of the files are old, so the modification time stamp can be used to tell which files have changed recently.
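
A quick way to sanity-check this assumption before starting (a sketch; /dir1 is the example path from above):

    # newest files first; if only a handful are recent, the assumption holds
    ls -lt /dir1 | head -20

    # count files modified within the last 24 hours
    find /dir1 -type f -mtime -1 | wc -l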

Solution
  1. Copy the current files to the new destination while the application is still running: (cd /dir1 && tar cf - .) | (cd /dir2 && tar xf -)
  2. Find the files changed in the last half day (or the last full day if fractional values are not supported) and save the list; this reduces the time required to copy new files: find /dir1 -mtime -0.5 > /tmp/newfile.txt (fractional -mtime works only with GNU find on Linux; -mmin -720 is an equivalent)
  3. There are two approaches to sync up the new files; both are sketched after this list.
  4. Option 1: a shell loop that copies only the files not already present in the destination: for F in $(cat /tmp/newfile.txt); do G=$(basename "$F"); if [ ! -f "/dir2/$G" ]; then cp "$F" /dir2; fi; done
  5. Option 2: use rsync and supply /tmp/newfile.txt as the file list to sync between /dir1 and /dir2.
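
Putting the steps together, a minimal sketch of the whole migration using option 1 (the directory names, list file, and 12-hour window are assumptions carried over from the example above):

    #!/bin/sh
    SRC=/dir1
    DST=/dir2
    LIST=/tmp/newfile.txt

    # Step 1: bulk copy while the application is still running
    (cd "$SRC" && tar cf - .) | (cd "$DST" && tar xf -)

    # Step 2: after stopping the application, list recently changed files
    # (-mmin -720 = last 12 hours; GNU find on Linux)
    find "$SRC" -type f -mmin -720 > "$LIST"

    # Step 3, option 1: copy only the files not yet in the destination
    while read F; do
        G=$(basename "$F")
        if [ ! -f "$DST/$G" ]; then
            cp "$F" "$DST"
        fi
    done < "$LIST"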
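
For option 2, rsync's --files-from option (available since rsync 2.6.0) takes such a list directly; note that the paths in the list must be relative to the source directory, so the find is run from inside it (a sketch under the same assumptions):

    # build a list of recently changed files, relative to /dir1
    cd /dir1 && find . -type f -mmin -720 > /tmp/newfile.txt

    # copy exactly those files, preserving permissions and time stamps
    rsync -a --files-from=/tmp/newfile.txt /dir1/ /dir2/

rsync also skips files that are already identical in the destination, so re-running it is harmless.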
This approach can optimize a file system that responds slowly to the "ls" command. It is near impossible, or very time consuming, to confirm that the cause is a fragmented inode. The technique applies to database files, CRM systems, Genesys logs, Crystal Reports output directories, any HTML location, web servers with many scripts or logs, etc.
