From time to time, we’re asked to fix broken sites built by other agencies. This can be extremely tricky, but if the site is built on technologies we know and love (PHP, MySQL, Apache, Memcached, WordPress, CodeIgniter, Laravel, Slim, etc.), we usually say yes.
If a site keeps falling over every n hours or days, I start by checking whether there are any cron jobs around. In this example we’ll pretend we have a cron job running a PHP script every minute, building an index of all articles in the database. For the first few months this wasn’t a problem, as there wasn’t much content to index; running the job took 10-15 seconds.
Six months later, there are many more articles in the database, and the index takes longer to build. All of a sudden it takes more than a minute to complete, and now things get hairy. After one minute we have two jobs running, and after a few hours we have hundreds of cron jobs running. Eventually the server runs out of memory and crashes.
The solution is simple, and has been around in the UNIX world since forever: implement a lock file.
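A minimal sketch of the idea, assuming the cron job is a standalone PHP script (the lock path and `buildArticleIndex()` are placeholders of mine, not the original code):

```php
<?php
// Lock-file sketch: skip this run if a previous one is still going.
// buildArticleIndex() is a hypothetical stand-in for the real indexing work.
function buildArticleIndex(): void
{
    sleep(1); // pretend this takes a while
}

$lockFile = sys_get_temp_dir() . '/build-index.lock';

// If the lock file exists, assume a previous run is still in progress.
if (file_exists($lockFile)) {
    exit(0);
}

touch($lockFile);

try {
    buildArticleIndex();
} finally {
    // Always remove the lock so next minute's run can start.
    unlink($lockFile);
}
```

An alternative worth knowing about is `flock()` with `LOCK_EX | LOCK_NB`: the kernel releases that kind of lock automatically when the process dies, which sidesteps the stale-lock caveat below.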
I’ve left out one little bit in the gist above: if the server goes down for a reboot while the script is running, the leftover lock file will prevent new cron runs from starting. How you handle this is up to you; for me it differs from project to project. Instead of creating an empty lock file, you could write the process ID (PID) to it and use that to check whether the script really is still running.
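The PID check could look something like this (again a sketch, not the original gist; `buildArticleIndex()` is a placeholder, and `posix_kill()` requires the posix extension):

```php
<?php
// PID-based variant: write our PID into the lock file and treat the lock
// as stale if that process no longer exists (e.g. after a crash or reboot).
function buildArticleIndex(): void
{
    sleep(1); // hypothetical stand-in for the indexing work
}

$lockFile = sys_get_temp_dir() . '/build-index.lock';

if (file_exists($lockFile)) {
    $pid = (int) file_get_contents($lockFile);
    // Signal 0 checks whether the process exists without sending anything.
    if ($pid > 0 && posix_kill($pid, 0)) {
        exit(0); // the previous run really is still alive
    }
    // Otherwise the lock is stale: fall through and take over.
}

file_put_contents($lockFile, (string) getmypid());

try {
    buildArticleIndex();
} finally {
    unlink($lockFile);
}
```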