Multi-Process Persistent Applications: Part 2 - Multiple Children
Introduction
In the first part of this tutorial, we covered the basics of forking and detaching from the terminal. That tutorial was very procedural and straightforward. In this tutorial we're going to be dealing with multiple child forks. This part is going to be significantly more complicated than the previous one, so don't feel bad if you begin to feel overwhelmed at some point. It will all come to you in time!
The Base Class: Forking, Signal Handling
The first step in handling multiple forks is going to be isolating the forking mechanism in a class, so that we can utilize it many times over. The way I'm going to do this is by creating a base class that will handle the forking itself, and create a loop where child classes can implement functionality. That is, my first class ForkedProcess will be abstract and never instantiated. It will expect a child class to implement a method which will be executed N times per second. It is at this point of execution that the child class can choose to do work, or it can do nothing. This will be the persisting loop of that particular process, and upon being broken, that process will exit.
So, first off, let's create a class. I have created a fairly basic implementation which we will later expand to handle more complicated functionality. For now, our goal is merely to create a class which handles forking. It is not yet ready to be used.
ForkedProcess.php
<?php
declare(ticks = 1);

abstract class ForkedProcess
{
    protected $continue_execution;
    protected $detached = FALSE;
    protected $sleep_time;
    protected $PID;
    protected $signal_cache = array();

    /**
     * @param int $sleep_time Microseconds to sleep between each poll
     */
    public function __construct($sleep_time = 100000)
    {
        $this->sleep_time = $sleep_time;
        // Slots 0 through 64, so that signal number 64 has an entry as well.
        $this->signal_cache = array_fill(0, 65, FALSE);
    }

    /**
     * Function that stores signals very quickly. Acts as the signal handler for php.
     *
     * @param int $signal
     */
    protected function handleSignal($signal)
    {
        $this->signal_cache[$signal] = TRUE;
    }

    /**
     * @param int $signal
     */
    protected function enableSignal($signal)
    {
        pcntl_signal($signal, array($this, "handleSignal"));
    }

    /**
     * Returns TRUE if the signal has been received. If it has been received,
     * the signal is reset in the signal cache.
     *
     * @param int $signal
     * @return bool
     */
    protected function hasSignal($signal)
    {
        if ($this->signal_cache[$signal])
        {
            $this->signal_cache[$signal] = FALSE;
            return TRUE;
        }
        return FALSE;
    }

    /**
     * @param string $message
     */
    protected function debug($message)
    {
        echo "{$this->PID}\\" . get_class($this) . "> " . $message . "\n";
    }

    /**
     * Create background fork.
     *
     * @param bool $detach Whether the child should detach from the controlling terminal
     * @return int The PID of the child process (returned in the calling process only;
     *             the child runs its loop and exits without returning)
     */
    public function fork($detach = TRUE)
    {
        $PID = pcntl_fork();
        if ($PID == -1)
        {
            throw new Exception("Unable to fork");
        }
        else if ($PID > 0)
        {
            // Calling process: hand the child's PID back to the caller.
            return $PID;
        }

        // Everything below runs only in the newly forked child.
        $this->enableSignal(SIGTERM);
        $this->enableSignal(SIGINT);
        if ($detach == TRUE)
        {
            if (posix_setsid() == -1)
            {
                throw new Exception("Unable to detach from controlling terminal!");
            }
            $this->detached = TRUE;
        }
        $this->PID = posix_getpid();
        $this->continue_execution = TRUE;
        $this->onStartup();

        // The persisting loop: check for a shutdown request, do work, sleep.
        while ($this->continue_execution)
        {
            if ($this->hasSignal(SIGTERM) || $this->hasSignal(SIGINT))
            {
                $this->quit();
            }
            $this->tick();
            usleep($this->sleep_time);
        }
        exit(0);
    }

    protected function quit()
    {
        $this->continue_execution = FALSE;
        $this->onExit();
    }

    protected abstract function onStartup();
    protected abstract function onExit();
    protected abstract function tick();
}
?>
You'll notice a few very important additions to the functionality, as well as some abstract event-handler methods which have no body yet. The big addition here is the handling of signals. Signals are sent to a process by the operating system, or by another process through it, and act as instructions to do something. Usually, this is a kill order. PHP defines constants for the integer values of the commonly used signals. To see a full list of the signals on your system, you can run the following command:
root@localhost ~ # kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL
5) SIGTRAP 6) SIGABRT 7) SIGBUS 8) SIGFPE
9) SIGKILL 10) SIGUSR1 11) SIGSEGV 12) SIGUSR2
13) SIGPIPE 14) SIGALRM 15) SIGTERM 16) SIGSTKFLT
17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP
21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU
25) SIGXFSZ 26) SIGVTALRM 27) SIGPROF 28) SIGWINCH
29) SIGIO 30) SIGPWR 31) SIGSYS 34) SIGRTMIN
35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3 38) SIGRTMIN+4
39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8
43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7 58) SIGRTMAX-6
59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2
63) SIGRTMAX-1 64) SIGRTMAX
The signals we care most about are SIGTERM (the OS or another process asking us to exit) and SIGINT (similar to SIGTERM, but usually caused by Ctrl+C in the terminal). By default, both simply terminate the PHP process. We take over their handling to create a standard way of running shutdown tasks within an object, without using register_shutdown_function, destructors, or anything like that. We specifically want fork-specific shutdown tasks, which should not run in certain situations. The base class itself has no shutdown tasks, but it expects each child class to implement a couple of methods, onStartup() and onExit(), so that the class can do its own setup and cleanup. The ForkedParentProcess, for example, will want to clean up its child processes before it is allowed to exit. PHP provides access to signal handling through the pcntl_signal() function, which we wrap for ease of use. An important thing to remember is that a signal handler should always return as quickly as possible. That is why the handler only sets a flag in a protected property ($signal_cache); the main loop checks those flags later and does the real work.
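The flag-caching idea can be tried in isolation, without any forking. The following is a minimal sketch rather than part of the tutorial's classes: the variable and closure names are made up for illustration, and it calls pcntl_signal_dispatch() (PHP 5.3 and later) to run pending handlers explicitly instead of relying on ticks.

```php
<?php
// Sketch only: $signal_cache and the two closures are illustrative names.
$signal_cache = array();

// The handler does the bare minimum: it records that the signal arrived.
$remember_signal = function ($signal) use (&$signal_cache) {
    $signal_cache[$signal] = true;
};

// Returns true once per received signal, then clears the flag again.
$take_signal = function ($signal) use (&$signal_cache) {
    if (!empty($signal_cache[$signal])) {
        $signal_cache[$signal] = false;
        return true;
    }
    return false;
};

pcntl_signal(SIGUSR1, $remember_signal);

// Send ourselves SIGUSR1, then let PHP run any pending handlers.
posix_kill(posix_getpid(), SIGUSR1);
pcntl_signal_dispatch();

$first_check  = $take_signal(SIGUSR1); // the handler set the flag
$second_check = $take_signal(SIGUSR1); // the flag was already consumed

echo $first_check ? "SIGUSR1 seen\n" : "SIGUSR1 not seen\n";
```

Reading the flag clears it, which mirrors what hasSignal() does: each received signal is acted on once.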
Another thing you should notice is this line:
declare(ticks = 1);
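PHP has a mechanism known as ticks: with declare(ticks = 1) in effect, PHP raises a tick event after every tickable statement, and the pcntl extension uses those ticks to run any pending signal handlers. Without this directive (or another dispatch mechanism such as an explicit call to pcntl_signal_dispatch()), handleSignal() would never be invoked. I won't go into great detail here, but you can read about ticks in the PHP manual's documentation for declare.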
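To get a feel for what a tick is, here is a small sketch that is separate from the tutorial's classes. It registers an ordinary tick function and counts how often PHP calls it. The names $tick_count and $count_tick are invented for this example, and the exact number of ticks depends on how PHP compiles the statements, so the sketch only reports the count.

```php
<?php
declare(ticks=1);

// Sketch only: count how many tick events PHP raises while a short loop runs.
$tick_count = 0;
$count_tick = function () use (&$tick_count) {
    $tick_count++;
};
register_tick_function($count_tick);

// Each statement executed below gives PHP an opportunity to raise a tick.
for ($i = 0; $i < 3; $i++) {
    $value = $i * 2;
}

echo "ticks observed: {$tick_count}\n";
```

The pcntl extension hooks into the same mechanism, which is why the signal handler in ForkedProcess gets a chance to run between statements of the main loop.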
Now that we have our base functionality, we need to define what types of forked processes we need. The way we handle multiple forked child processes is by having a common parent which manages all of these children. This way, we can issue a kill to the parent process, and it will happily tell all of its children to exit as well. In addition to that necessary functionality, it provides a common point of communication. If we have work to delegate, the parent process can handle that. So, we need a type of forked process that can, itself, create forks. We can implement this by extending the base forked class to create a ForkedParentProcess. We will also implement a ForkedWorkerProcess. Below is a diagram to help illustrate the relationships of these processes.

In this diagram, the Starter Process is the process which merely starts, forks and backgrounds the Controlling Process, which is a ForkedParentProcess. The Starter Process then terminates, leaving the Controlling Process to create its own pool of children and to more or less be our persistent process. The Worker Processes in the diagram are represented in our code by the ForkedWorkerProcess. If the Controlling process exits, the children must be instructed to exit as well, since they are useless without a parent and will become rogue processes if we do not deal with them properly. Below is another diagram which shows the flow of the individual processes, from inception to termination.
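The "parent tells the child to exit" relationship can be sketched with plain function calls before we wrap it in classes. This is an illustration under stated assumptions, not the tutorial's implementation: the exit code 7 and the two-second safety cap are arbitrary, and the handler is installed before forking so the child inherits it and there is no window in which SIGTERM would simply kill it.

```php
<?php
// Sketch only: exit code 7 and the 200-iteration cap are arbitrary choices.
$should_exit = false;

// Installed before the fork so that the child inherits the handler.
pcntl_signal(SIGTERM, function ($signal) use (&$should_exit) {
    $should_exit = true;
});

$pid = pcntl_fork();
if ($pid === -1) {
    throw new RuntimeException("Unable to fork");
}

if ($pid === 0) {
    // Worker: poll for the flag, giving up after roughly two seconds.
    for ($i = 0; $i < 200 && !$should_exit; $i++) {
        pcntl_signal_dispatch();
        usleep(10000);
    }
    exit($should_exit ? 7 : 1);
}

// Controller: restore default SIGTERM handling for itself, then stop the worker.
pcntl_signal(SIGTERM, SIG_DFL);
posix_kill($pid, SIGTERM);
pcntl_waitpid($pid, $status); // blocks until the worker has exited

$exited_cleanly = pcntl_wifexited($status);
$exit_code      = pcntl_wexitstatus($status);
echo "worker exited cleanly: " . ($exited_cleanly ? "yes" : "no") . "\n";
```

Because the worker leaves through exit() rather than being killed outright, it has a chance to run its own cleanup first, which is the role onExit() plays in the classes.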

Parent Process, Child Reaping
Now that we have a firm understanding of how our processes should work, we can move on to actually writing the code for them. In this next section of code, I will introduce the concept of reaping, which is the process of properly cleaning up exited child processes. There's a lot of code below, so take some time to soak it up. I will explain it all.
ForkedParentProcess.php
<?php
require_once 'ForkedProcess.php';
require_once 'ForkedWorkerProcess.php';

/**
 * Represents a process which spawns other (worker) processes
 */
class ForkedParentProcess extends ForkedProcess
{
    /**
     * Number of workers to spawn
     * @var int
     */
    private $worker_count;

    /**
     * PIDs of our child workers
     * @var array
     */
    private $children = array();

    /**
     * @param int $worker_count Number of workers
     * @param int $sleep_time Number of microseconds to sleep between each tick
     */
    public function __construct($worker_count = 5, $sleep_time = 100000)
    {
        parent::__construct($sleep_time);
        $this->worker_count = $worker_count;
    }

    /**
     * Called when the process is exiting
     */
    public function onExit()
    {
        $this->killChildren();
    }

    /**
     * Called when the process has just started
     */
    public function onStartup()
    {
        $this->debug("Alive");
        while (count($this->children) < $this->worker_count)
        {
            $worker = new ForkedWorkerProcess();
            // Workers stay in our session (no detach); fork() returns here only in the parent.
            $this->children[] = $worker->fork(FALSE);
        }
        $this->enableSignal(SIGCHLD);
    }

    /**
     * Issues SIGTERM to all workers
     */
    protected function killChildren()
    {
        foreach ($this->children as $child_pid)
        {
            posix_kill($child_pid, SIGTERM);
        }
    }

    /**
     * Find any exited workers and clean them up
     */
    protected function reapChildren()
    {
        // pcntl_wait() returns a PID (> 0) for each reaped child, 0 when no
        // child has exited yet, and -1 on error (for example, no children left).
        // Testing for > 0 keeps the -1 case from looping forever.
        while (($child_pid = pcntl_wait($status, WNOHANG)) > 0)
        {
            $this->debug("reaping child " . $child_pid);
            // Forget the PID so killChildren() never signals a dead process.
            $this->children = array_diff($this->children, array($child_pid));
        }
    }

    /**
     * Called each cycle
     */
    protected function tick()
    {
        if ($this->hasSignal(SIGCHLD))
        {
            $this->reapChildren();
        }
        /**
         * TODO: Distribute work.
         */
    }
}
?>
Most of this should be fairly obvious. We have an object which represents the leader of a pool of workers. When the process first starts up, it spawns all of its workers. The interesting piece of code here is "reaping", which is the act of cleaning up a process which you own after it has exited. When a child process exits, the operating system sends a signal to the parent process: SIGCHLD. We look for this signal, and when it occurs, we clean up the child. This is done using the process control function pcntl_wait() (http://php.net/pcntl_wait). We're not expecting a specific child to exit, and we don't want to block if no process has exited, which is what the WNOHANG flag is for. Because several children may exit before we get around to checking, the call sits in a loop: pcntl_wait() returns the PID of each child it reaps, 0 when no further child has exited, and -1 on error (for instance, when there are no children left). If you fail to clean up child processes, you will be left with defunct processes (also called zombie processes).
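Non-blocking reaping is easier to see in a standalone script. The sketch below is an illustration only: it forks three short-lived children with made-up exit codes (10, 11 and 12) and polls for them with WNOHANG until all have been collected. It polls on a timer rather than waiting for SIGCHLD, purely to keep the example short.

```php
<?php
// Sketch only: three short-lived children with made-up exit codes.
$running = array();
foreach (array(10, 11, 12) as $exit_code) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException("Unable to fork");
    }
    if ($pid === 0) {
        usleep(100000); // pretend to work for 0.1 seconds
        exit($exit_code);
    }
    $running[$pid] = true;
}

$reaped = array();
while (count($running) > 0) {
    // > 0: a child was reaped; 0: none has exited yet; -1: error or no children.
    while (($pid = pcntl_wait($status, WNOHANG)) > 0) {
        $reaped[$pid] = pcntl_wexitstatus($status);
        unset($running[$pid]);
    }
    if ($pid === -1) {
        break;
    }
    usleep(20000); // the parent is free to do other work here
}

$codes = array_values($reaped);
sort($codes);
echo "reaped exit codes: " . implode(", ", $codes) . "\n";
```

The inner loop drains every child that has already exited, so a single wake-up is enough even when several children finish at nearly the same time.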
The Worker Process
The following code is for the worker process. It's more or less empty, but it prepares us for the next steps of this system.
ForkedWorkerProcess.php
<?php
require_once 'ForkedProcess.php';

/**
 * Class representing a single worker process
 */
class ForkedWorkerProcess extends ForkedProcess
{
    /**
     * Called when the process starts
     */
    public function onStartup()
    {
        $this->debug("Alive");
    }

    /**
     * Called when the process is exiting.
     */
    public function onExit()
    {
    }

    /**
     * Called each cycle
     */
    public function tick()
    {
        /**
         * TODO: Check for work
         */
    }
}
?>
Pretty self-explanatory. You'll notice I left behind a couple of "TODO" points in the code. Those mark the places where we're going to implement the interprocess communication (IPC) functions. For now, we simply have a parent and a bunch of workers. We do not have a way to tell our workers to do things. I will cover this in the next section.
Checkpoint: Testing Our Classes
We now have three classes. Let's try a little test. Create a new PHP file.
example3.php
<?php
require_once 'ForkedParentProcess.php';
$parent = new ForkedParentProcess(3);
$parent->fork();
?>
If you run this script you should see output similar to this:
root@localhost ~/process_tutorial # php example3.php
20634\ForkedParentProcess> Alive
20635\ForkedWorkerProcess> Alive
20636\ForkedWorkerProcess> Alive
20637\ForkedWorkerProcess> Alive
In short, you have spawned a parent process (20634), with 3 child processes (20635 through 20637). Your process IDs will naturally be different, but the idea is the same. If you use the ps command, you can view your PHP processes. See below:
root@localhost ~/process_tutorial # ps -ef | grep php
root 20634 1 0 10:31 ? 00:00:00 php example3.php
root 20635 20634 0 10:31 ? 00:00:00 php example3.php
root 20636 20634 0 10:31 ? 00:00:00 php example3.php
root 20637 20634 0 10:31 ? 00:00:00 php example3.php
If you issue a kill to the parent process, it should exit and take its children with it:
root@localhost ~/process_tutorial # ps -ef | grep php
root 20634 1 0 10:31 ? 00:00:00 php example3.php
root 20635 20634 0 10:31 ? 00:00:00 php example3.php
root 20636 20634 0 10:31 ? 00:00:00 php example3.php
root 20637 20634 0 10:31 ? 00:00:00 php example3.php
root 20649 13958 0 10:35 pts/0 00:00:00 grep --color php
root@localhost ~/process_tutorial # kill 20634
root@localhost ~/process_tutorial # ps -ef | grep php
root@localhost ~/process_tutorial #
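The same check can be scripted instead of reading ps output by hand. The helper below is hypothetical (isProcessAlive() is not part of the tutorial's classes). It relies on the convention that sending signal 0 with posix_kill() performs the existence and permission checks without delivering anything. Two caveats apply: an exited child that has not been reaped still counts as existing, and a process owned by another user may be reported as missing because of the permission check.

```php
<?php
// Hypothetical helper: signal 0 checks for the process without signalling it.
function isProcessAlive($pid)
{
    return posix_kill($pid, 0);
}

$pid = pcntl_fork();
if ($pid === -1) {
    throw new RuntimeException("Unable to fork");
}
if ($pid === 0) {
    exit(0); // the child exits immediately
}

// Reap the child so that its PID leaves the process table.
pcntl_waitpid($pid, $status);

$self_alive  = isProcessAlive(posix_getpid());
$child_alive = isProcessAlive($pid);

echo "self alive: " . ($self_alive ? "yes" : "no") . "\n";
echo "child alive: " . ($child_alive ? "yes" : "no") . "\n";
```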
Conclusion
In summation, this article has covered a variety of features. We learned how to manage multiple children, handle signals and properly clean up after our child processes. We have laid the framework for a basic workload distribution system. This is a very useful starting point, but it is missing a key element: the ability to communicate. Once the child processes have spawned, they become autonomous, taking direction from no one. In the next part of this tutorial, I will continue developing these classes (and adding a couple of new classes) to provide this functionality. I will introduce the concept of message queues.
