Got caught up with some batch performance testing this week. In one respect batch testing can be seen as relatively easy: you create the data (the hard part) and then run the batch schedule (the easy part). We had focused on creating the data, while our system admin colleagues migrated the batch schedule to the test environment ready for testing.
So, we created the data, and then came the day for batch testing. Of course this happened to be 7PM on a Friday night, but that is the life of a performance tester sometimes. It was also because the sys admin migrated the start times for the jobs straight from production and did not want to change them to something more convenient.
This is where the problems occurred. Many of the batch jobs are user scheduled, primarily for creating data for reports. Coincidentally these have no SLA requirements and are simply killed if they exceed a preset execution time. These jobs couldn't run in the new test database because their user credentials were not configured. The next problem was pending jobs: several of the main jobs were not configured correctly, which meant they went into a pending state. Luckily the sys admins could correct and restart them; however, this meant that many jobs were restarted in parallel, and the timing of each job's execution was disturbed by the others running alongside it.
So a few things I want to remember for next time:
1) Do you need to test the batch schedule or just the execution of the key long jobs?
2) Can the scheduling mechanism be deployed into the test environment?
3) Make sure all user accounts, privileges and settings are migrated correctly!
4) Have a policy for handling pending jobs: restart them, kill them, or hold them until the cause is fixed?
5) The inconvenience of changing the schedule start time is far outweighed by the pain of dealing with batch issues at 3 AM.
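Lesson 3 above is the kind of thing that can be automated as a pre-flight check. The sketch below is just an illustration of the idea, not part of any real scheduler: the job list and the set of migrated accounts are hypothetical placeholders, and in practice you would pull them from the scheduler's configuration and the test database's user catalogue.

```python
# Pre-flight check (a sketch): before kicking off the batch schedule in the
# test environment, cross-check the account each job runs as against the
# accounts that were actually migrated. Catching a missing credential here
# beats discovering it at 7PM on a Friday.

def missing_job_credentials(jobs, configured_accounts):
    """Return the names of jobs whose run-as account is not configured."""
    return [job for job, account in jobs if account not in configured_accounts]

# Hypothetical schedule: (job name, account it runs under)
jobs = [
    ("nightly_report_extract", "rpt_user"),
    ("billing_rollup", "batch_user"),
    ("user_scheduled_report", "jsmith"),  # a user-scheduled report job
]

# Accounts that actually exist in the test database (hypothetical)
configured_accounts = {"rpt_user", "batch_user"}

for job in missing_job_credentials(jobs, configured_accounts):
    print(f"BLOCKER: {job} has no configured account in the test environment")
```

Run against the placeholder data above, this flags `user_scheduled_report`, which mirrors exactly the failure we hit: the user-scheduled report jobs fell over because their credentials never made it across.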