In this series of posts I am going to take you through how to setup a job control table, and the many benefits of having a job control table.
At the most basic level, the point of the job control table is to keep a record of every job run, when it started, when it finished, and whether it was successful.
Here is a sample of the type of data you might have in a control table:
So looking at the above data we can see a few things.
- The job normally takes around 40 mins
- It took a lot longer on the 3rd
- It failed on the 4th
- It was started again on the 4th and is still running
We are now able to build up some statistics about our job using this simple table.
Here is a simple SQL Server script that will create the above table:
CREATE TABLE [dbo].[SYS_AUDIT_JOB](
[AUDIT_ID] [int] NOT NULL,
[JOB_NAME] [varchar](100) NULL,
[JOB_START_DATETIME] [datetime] NULL,
[JOB_END_DATETIME] [datetime] NULL,
[STATUS] [varchar](15) NULL,
CONSTRAINT [PK_SYS_AUDIT_JOB] PRIMARY KEY CLUSTERED
(
[AUDIT_ID] ASC
)
) ON [PRIMARY]
The reason I put the prefix SYS on this table is to differentiate from the rest of the warehouse tables that would normally have DIM or FACT as their prefix. SYS means its a table used by the ETL, and is not intended for reporting. At least not by end users.
In order to load this table with the correct information I have created some Data Services functions.
The first function is called NewAuditRecord. As the name implies, its purpose is to create a new record in the audit table when you start the job. Here is the code for the function:
#################################################################
#This function will log the start of the job and its details into
#the SYS_AUDIT_JOB table and return an unique Audit_ID for the job
#################################################################
#Check if the input job name is not null. If not then get the next value from the sequance
#number and return as Audit_ID. And insert a new record into the SYS_AUDIT_JOB table marking
#the start of the job with current date and time
if($P_JobName is not null)
begin
#Get the unique Audit_id
$L_AuditID = SQL('DATAMART_SQL_TARGET_DS','SELECT max(AUDIT_ID) from SYS_AUDIT_JOB') + 1;
#Insert a new record into the SYS_AUDIT_JOB table
$L_Ret = SQL('DATAMART_SQL_TARGET_DS','INSERT INTO SYS_AUDIT_JOB (AUDIT_ID,JOB_NAME,JOB_START_DATETIME, JOB_END_DATETIME,STATUS, ETL_VERSION) VALUES ([$L_AuditID],{$P_JobName},{$P_EffectiveDateAndTime},NULL,\'STARTED\',{$G_ETLVersion})');
print('****************************************************');
print('** ');
print('** Inserted new record into SYS_AUDIT_JOB.');
print('** ');
print('** Return value = {$L_Ret}');
print('** Audit ID = {$L_AuditID}');
print('** ');
print('****************************************************');
#return the unique Audit_Id for this Job
Return $L_AuditID ;
end
Return 0;
As you can see its a pretty straight forward function in that you just pass it you job name, and the current datetime, and it inserts a row in the SYS_AUDIT_JOB table. Finally it prints out some information to the trace table. It also returns the audit_id to a variable called $G_AuditID - this will come in handy in part 2.
You may also notice I set the status to STARTED.
Here is how I call the function in the initialise script of my job:
$G_AuditID = NewAuditRecord($G_JobName, $G_EffectiveDateAndTime);
$G_JobName is a global variable and I populate like this:
$G_JobName = job_name();
$G_EffectiveDatAndTime is also a global variable that I populate like this:
$G_EffectiveDateAndTime = sysdate();
Once the job is complete I have another function called EndJob which bascially has this piece of code in it:
SQL('DATAMART_SQL_TARGET_DS','UPDATE SYS_AUDIT_JOB SET JOB_END_DATETIME = getdate(), STATUS = \'COMPLETE\' WHERE AUDIT_ID = {$P_AuditID} AND JOB_NAME = {$P_JobName}');
You'll notice that not only did I update the end time, but also set the status to COMPLETE.
One more thing...........
What if the job fails?
Well, in my try\catch block I have a function called HandleException. This sends out an email on failure, but it also closes off the line in the order table with the fail time and a status of FAILED.
SQL('DATAMART_SQL_TARGET_DS','UPDATE SYS_AUDIT_JOB SET JOB_END_DATETIME = getdate(), STATUS = \'
FAILED\' WHERE AUDIT_ID = {$G_AuditID} AND JOB_NAME = {$G_JobName}');
So now that we have setup the job control table, my next few posts will take you through the benefits of having it beyond just simple job statistics.