PHP jobs with Gearman and Supervisor

The problem

PHP projects often include operations that need to run asynchronously. Some examples: processing mail queues, indexing data, and long-running computations.

A common approach is to handle these operations with cron, running the processes in the background. However, cron requires workarounds to prevent overlapping executions, and it forces us to implement our own procedures and mechanisms for storing the data to be processed.

Solution: Gearman + Supervisor

The solution based on Gearman and Supervisor, by contrast, doesn’t require any kind of data storage mechanism and offers a very simple way to develop background processes in PHP.

From the Gearman homepage:

Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. It allows you to do work in parallel, to load balance processing, and to call functions between languages. It can be used in a variety of applications, from high-availability web sites to the transport of database replication events. In other words, it is the nervous system for how distributed processing communicates.

Moreover, developing Gearman clients and workers in PHP is very simple, as the example in the Gearman section of the PHP documentation shows.

On the other hand, as stated on the Supervisor homepage:

Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.

Together, Gearman and Supervisor create a stable, flexible and scalable infrastructure for handling all our jobs. As shown in the following diagram, the combination gives us a client/server structure and takes care of communication and (if we want) distribution. We only need to write the code that completes our tasks.

Gearman stack

Installation

I’m going to show how to install and set up Gearman and Supervisor in a Debian wheezy environment.

Normally, the following packages are already present on a PHP-ready system, but we want to be sure:

$ apt-get install apache2 php5 libapache2-mod-php5 php5-dev

Now we are ready to install Gearman. The required command is:

$ apt-get install gearman-job-server libgearman-dev

At this point, we need to install the Gearman PECL extension in order to use the required PHP classes. In a perfect world, we would simply install the latest version, but (at the moment) the latest stable Gearman PECL extension requires a libgearman version newer than the one included in the Debian repositories. If we try to install it, we receive an error like this:

configure: error: libgearman version 1.1.0 or later required
ERROR: `/tmp/pear/temp/gearman/configure' failed

To resolve this issue, we could compile and install libgearman ourselves or, as I prefer, install an older version of the PECL extension by running the following commands:

$ apt-get install php-pear
$ pecl install gearman-1.0.3

Note: If you prefer to compile libgearman yourself, you will easily find instructions on Google.

To complete the installation, the following commands enable the Gearman extension for Apache2:

$ echo "extension=gearman.so" > /etc/php5/conf.d/gearman.ini
$ service apache2 restart

At this point, we are ready to create our Gearman workers and clients as described in the Gearman section of the PHP documentation.

But we are not satisfied yet: we also want to manage our PHP processes in a smart way. We want to be able to control them, monitor them, and specify things like the number of instances, priority, etc. In short, we want to run them through Supervisor.

So, let’s install it:
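
As a rough illustration of what such a worker and client can look like, here is a minimal sketch. It assumes gearmand is listening on 127.0.0.1 on its default port 4730; the function name reverse and the file name demo.php are made up for this example and do not come from the PHP documentation.

```php
<?php
// demo.php: minimal Gearman worker and client in one file (illustrative sketch).
// Assumptions: gearmand listens on 127.0.0.1:4730 and the job name "reverse"
// is hypothetical.

const JOB_NAME = 'reverse';

// Pure job logic, kept apart from Gearman so it can be checked on its own.
function reverse_workload($workload)
{
    return strrev($workload);
}

function run_worker()
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    // The callback receives a GearmanJob. For background jobs no client is
    // waiting for a result, so the outcome is simply printed here.
    $worker->addFunction(JOB_NAME, function (GearmanJob $job) {
        echo 'Result: ' . reverse_workload($job->workload()) . "\n";
    });
    // work() blocks until a job arrives and has been processed.
    while ($worker->work()) {
        if ($worker->returnCode() != GEARMAN_SUCCESS) {
            fwrite(STDERR, 'Worker return code: ' . $worker->returnCode() . "\n");
            break;
        }
    }
}

function run_client($text)
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    // doBackground() queues the job and returns a job handle right away.
    $handle = $client->doBackground(JOB_NAME, $text);
    if ($client->returnCode() != GEARMAN_SUCCESS) {
        fwrite(STDERR, "Could not queue the job\n");
        return;
    }
    echo "Queued job: $handle\n";
}

$mode = isset($argv[1]) ? $argv[1] : '';
if (!extension_loaded('gearman')) {
    fwrite(STDERR, "The gearman extension is not loaded.\n");
} elseif ($mode === 'worker') {
    run_worker();
} elseif ($mode === 'client') {
    run_client(isset($argv[2]) ? $argv[2] : 'Hello');
} else {
    echo "Usage: php demo.php worker | client [text]\n";
}
```

Running "php demo.php worker" in one terminal and "php demo.php client Hello" in another should be enough to see a job travel from the client to the worker.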

$ apt-get install supervisor

In Debian, Supervisor runs as a daemon and can be started and stopped with the usual commands:

$ service supervisor start
$ service supervisor stop

The last step is to configure our Gearman workers in Supervisor. To do that, we create files in /etc/supervisor/conf.d/ containing something like this:

/etc/supervisor/conf.d/myprocess.conf:

[program:myprocess]
command=php /path/to/project/myprocess.php
numprocs=12
; required when numprocs is greater than 1
process_name=%(program_name)s_%(process_num)02d
directory=/path/to/project/spool/myprocess/
autostart=true
autorestart=true
stdout_logfile=/path/to/project/log/myprocess.log
stdout_logfile_maxbytes=1MB
stderr_logfile=/path/to/project/log/myprocess.log
stderr_logfile_maxbytes=1MB

For a complete list of options, please see the Supervisor documentation.
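
One pattern that pairs naturally with autorestart=true is a worker that exits after a fixed number of jobs and lets Supervisor start a fresh process. The sketch below is only one possible shape for myprocess.php and is not taken from the Gearman or Supervisor documentation: the job name myprocess, the limit of 100 jobs and the server address are all assumptions.

```php
<?php
// myprocess.php: worker that stops after a fixed number of jobs (illustrative sketch).
// Assumptions: the job name "myprocess", the limit of 100 jobs and the
// gearmand address 127.0.0.1:4730 are hypothetical.

const MAX_JOBS = 100;

// Decide whether the worker should ask for another job.
function should_continue($handled, $max)
{
    return $handled < $max;
}

function run_worker()
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('myprocess', function (GearmanJob $job) {
        // Real work goes here. Output on stdout ends up in stdout_logfile.
        echo 'Handled job with ' . strlen($job->workload()) . " bytes\n";
    });

    $handled = 0;
    while (should_continue($handled, MAX_JOBS) && $worker->work()) {
        if ($worker->returnCode() != GEARMAN_SUCCESS) {
            // Output on stderr ends up in stderr_logfile.
            fwrite(STDERR, 'Worker return code: ' . $worker->returnCode() . "\n");
            exit(1);
        }
        $handled++;
    }
    // Leave cleanly; with autorestart=true Supervisor starts a new process.
    exit(0);
}

if (extension_loaded('gearman')) {
    run_worker();
} else {
    fwrite(STDERR, "The gearman extension is not loaded.\n");
}
```

Supervisor only considers a process successfully started once it has stayed up for startsecs (1 second by default), so a limit that is reached faster than that would look like a failed start.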


  • Excellent article!

    • Valerio Galano

      Thanks!

  • This is my worker in console

    class WorkerAttendanceUpdateOptimizationCommand extends CConsoleCommand {

        public function run() {
            $gmworker = new GearmanWorker();
            $gmworker->addServer('127.0.0.1', '4730');
            $gmworker->addFunction("updateAttendanceLog", array($this, "updateAttendanceLog"));
            print "Waiting for job...\n";
            while ($gmworker->work()) {
                if ($gmworker->returnCode() != GEARMAN_SUCCESS) {
                    echo "return_code: " . $gmworker->returnCode() . "\n";
                    break;
                }
            }
        }

        public function updateAttendanceLog($job) {
            $user = unserialize($job->workload());
            $user_id = ($user['user_id'] != null) ? $user['user_id'] : null;
            $tenant_id = ($user['tenant_id'] != null) ? $user['tenant_id'] : null;
            $date = strtotime($user['date']);

            if (!empty($user_id)) {
                $user_shifts = array(UserShift::model()->user($user_id)->active()->find());
            } elseif (!empty($tenant_id)) {
                $user_shifts = UserShift::model()->tenant($tenant_id)->active()->findAll();
            } else {
                $user_shifts = UserShift::model()->active()->findAll();
            }

            if (!empty($user_shifts)) {
                foreach ($user_shifts as $i => $shift) {
                    $source = UserAttendance::SOURCE_WEB;
                    $userShift = UserShift::getUserShift($shift->user_id, $shift->tenant_id);
                    if (!empty($userShift)) {
                        if ($userShift->clocking_priority->clocking_priority == UserShift::SOURCE_BIOMETRIC) {
                            $source = UserAttendance::SOURCE_HARDWARE;
                        }
                    } else {
                        echo "User shift is not found for user " . $shift->user_id . "\n";
                    }
                    $from = $date;
                    $to = $from + (24 * 3600);
                    $userAttendance = UserAttendance::getAttendanceLog($shift->user_id, $from, $to, $shift->tenant_id);
                    if (!empty($userAttendance)) {
                        foreach ($userAttendance as $log) {
                            $log->source = $source;
                            $log->save();
                        }
                    } else {
                        echo "User attendance log not found for user " . $shift->user_id . "\n";
                    }
                    if (UserAttendanceLog::createAttendanceLog($shift->user_id, $date, $shift->tenant_id)) {
                        echo "Status 1 User attendance updated successfully. " . $shift->user_id . "\n";
                    } else {
                        echo "Status 0 User attendance Not updated . " . $shift->user_id . "\n";
                    }
                }
                echo "All job done successfully.\n";
            } else {
                echo "User shifts not found.\n";
            }
        }
    }

    This is my client

    public function run($args) {
        $gmclient = new GearmanClient();
        $gmclient->addServer('127.0.0.1', '4730');
        echo "Sending job\n";
        $dates = (isset($args[0])) ? $args[0] : date('Y-m-d', strtotime(date('Y-m-d')) - (24 * 3600));
        $tenant_id = (isset($args[1])) ? $args[1] : null;
        $user_id = (isset($args[2])) ? $args[2] : null;

        // To check for current and future date
        if (self::checkDate($dates)) {

            $user_array = array(
                "date" => $dates,
                "tenant_id" => $tenant_id,
                "user_id" => $user_id
            );

            $user = serialize($user_array);
            $result = $gmclient->doBackground("updateAttendanceLog", $user);

            # Check for various return packets and errors.
            switch ($gmclient->returnCode()) {
                case GEARMAN_WORK_STATUS:
                    list($numerator, $denominator) = $gmclient->doStatus();
                    echo "Status: $numerator/$denominator complete\n";
                    break;
                case GEARMAN_WORK_FAIL:
                    echo "Failed\n";
                    exit;
                case GEARMAN_SUCCESS:
                    echo "Job process successfully\n";
                    break;
                default:
                    echo "RET: " . $gmclient->returnCode() . "\n";
                    exit;
            }
            echo $result . PHP_EOL;
        } else {
            echo "Can't create logs for current and future dates. \n";
        }
    }

    When I run this process I get a segmentation fault and Gearman stops.
    Please help with a good solution.

    • Valerio Galano

      I think the best approach would be to use a debugger (xdebug, for example) to find the line that triggers the segmentation fault. Then you can try to work out how to fix it.

      • Thanks for your reply.
        After tracing with xdebug and gdb, I got this error:

        Program received signal SIGSEGV, Segmentation fault.
        0x00000000006b9d6f in zend_mm_remove_from_free_list (heap=0xdd6ac0, mm_block=0x1ede848) at /build/buildd/php5-5.3.10/Zend/zend_alloc_canary.c:889
        889 /build/buildd/php5-5.3.10/Zend/zend_alloc_canary.c: No such file or directory.

        Any thoughts on this error?

        • Valerio Galano

          Honestly, I don’t know.
          Are you able to find which line triggers the segmentation fault?

          • No, but my loop runs over a number of users and I get this error partway through, so I could not find where the error is. Can you give me the proper flow for Gearman? I have multiple users and I want to save their shifts in my database through Gearman.
            How can I do that?

          • Valerio Galano

            Proper flow for Gearman is described at http://www.php.net/manual/en/gearman.examples.php.

            Anyway, I’m going to write a tutorial about a ZF2 + Gearman client/worker implementation. Maybe that will help.

          • Thanks dear.