Monday, November 10, 2014

Files as data sources - Part 3 - Moving files around

So now that file has been processed, lets look at what we should do next.

I like to have an archive folder on the file system to keep any files that have been processed, this is so that I can go back for auditing and/or dealing with any queries about the data that may come up.

I also update the STATUS field in the SYS_FILE table that references the file to COMPLETE.

Lets start with the update function:

If (db_type('Datawarehouse') = 'Oracle')
BEGIN
    $L_FileName = replace_substr( $P_FileName,'*','%');
END

sql('Datawarehouse','UPDATE SYS_FILE SET STATUS = \'COMPLETE\' WHERE FILE_LOCATION = {$P_Directory} AND FILE_NAME like {$L_FileName} AND AUDIT_ID = [$G_AuditID]');

For this function I pass in the directory and file name, look up the relevant row\s in the SYS_FILE table and then update them to COMPLETE.

First thing I do is check the database type. If I am using Oracle I will need to convert any *s to % as that is the wildcard character in oracle. You can add additional database types in here as required.

I look for the rows in the SYS_FILE table that match the file name. The reason I update more than one row at a time is because when you have multiple files processed through the file object such as File1, File2 and File3, and you refer to the file name in your file object as File*; then these files are collated into one batch for processing. So if the dataflow completed successfully then all of the files have been loaded.

Once I've update the rows in the SYS_FILE table I move the files to the archive directory with a little FileMove function:

$L_FileName = word_ext( $P_FileName ,1,'.');

exec('sh',' -c "mv ' || $P_SourceDirectory || $L_FileName  || ' ' || $P_TargetDirectory || '"',8);

In this example my job server is running on a UNIX OS, but you could re-write this a Microsoft OS by replace the 'sh' with cmd, and using 'move' instead of 'mv'.

I use the word_ext function at the top to avoid any issues with file extensions and wildcards. You may not have this issue so will not necessarily have to include the first line of the function.




No comments:

Post a Comment