Torsten Förtsch
IT System Development & Security
No sooner do you do it right than it works ;-)



How to hand long-running requests over to another process

mod_perl worker processes are usually an expensive resource, so tying one up for a long time is undesirable. Is it possible to free the Apache worker before the request is actually done by handing the client connection over to another, more lightweight process? That is the topic of this page.

One note first: at the time of writing, the released mod_perl version cannot do this because it has no interface to fetch the client socket. You will have to use the newest SVN version. This is quite experimental stuff, and the interface may still change.

A second note: this technique definitely won't work on Windows. I have tried it only on Linux.

The idea

Okay, let's start. On UNIX-like operating systems the lowest-level representation of a file handle is the file descriptor, a small integer. Further, UNIX implements a special type of socket, the UNIX domain socket, which can be used to pass open file descriptors from one process to another. The idea is to have a lightweight, event-driven server process waiting for requests on a UNIX domain socket. The Apache worker connects to that socket and passes the client connection, along with a request, to the server. The server then does the long-running work, while the Apache worker finishes the request and goes back to accepting connections from other clients.
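To make the "small integer" point concrete, here is a minimal sketch using only core Perl. It opens a temporary file, looks up its descriptor number with fileno, and then builds a second Perl handle from nothing but that number. The helper name write_via_fd is made up for this illustration and is not part of any module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from any library): write $text through a second
# handle that is built from the bare descriptor number of an open file.
# Returns the descriptor number and the resulting file content.
sub write_via_fd {
    my ($text) = @_;

    my ($fh, $path) = tempfile(UNLINK => 1);
    my $fd = fileno($fh);    # the small integer behind the Perl handle

    # '>&=' wraps the existing descriptor instead of duplicating it. A
    # process that was handed a descriptor number does the same thing.
    open(my $alias, '>&=', $fd) or die "cannot fdopen $fd: $!";
    defined syswrite($alias, $text) or die "write failed: $!";

    # Read the file back by name to show the write went to the same file.
    open(my $in, '<', $path) or die "cannot reopen $path: $!";
    local $/;
    my $content = <$in>;
    return ($fd, $content);
}

my ($fd, $content) = write_via_fd("tick\n");
print "descriptor $fd received: $content";
```

Within one process this is only a curiosity. It becomes useful once the number can be handed to a different process, which is what UNIX domain sockets add.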

The example will solve the same problem as my potentially-infinite-output article, namely a clock.

The tools

One of the best frameworks for event-driven programming in Perl is Coro. The tool that does the descriptor passing is IO::Handle::Record.
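Before involving a real server, descriptor passing with IO::Handle::Record can be tried in a single script. The following is a sketch, not the timeserver: it assumes IO::Handle::Record is installed and that received_fds yields an array reference of file handles, as the module's synopsis suggests. The helper name pass_fd_demo and the reply text are invented for the example. Because the child is created with fork, it would inherit the descriptor anyway; the point is only that it writes through the handle it received over the socket.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use File::Temp qw(tempfile);
use IO::Socket::UNIX;
use IO::Handle::Record;

# Hypothetical helper: pass an open temp file to a child process over a
# UNIX domain socketpair and let the child write into it.
sub pass_fd_demo {
    my ($out, $path) = tempfile(UNLINK => 1);

    my ($parent, $child) =
        IO::Socket::UNIX->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";

    defined(my $pid = fork) or die "fork failed: $!";

    if ($pid == 0) {
        # Child: plays the role of the server.
        close $parent;
        my ($cmd) = $child->read_record;        # request arrives with the fd
        my $fh = $child->received_fds->[0];     # assumption: a file handle
        syswrite $fh, "child handled '$cmd'\n";
        $child->write_record('OK');
        POSIX::_exit(0);                        # skip END blocks in the child
    }

    # Parent: plays the role of the client or Apache worker.
    close $child;
    $parent->fds_to_send = [fileno $out];       # descriptor number, as above
    $parent->write_record('clock');
    my ($reply) = $parent->read_record;
    waitpid $pid, 0;

    open(my $in, '<', $path) or die "cannot reopen $path: $!";
    my $line = <$in>;
    return ($reply, $line);
}

my ($reply, $line) = pass_fd_demo();
print "reply: $reply\n", "file:  $line";
```

With the timeserver the two processes are unrelated, so passing the descriptor over the socket is the only way for the server to get hold of the connection.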

The server

Install the server on the machine where the web server is running. The script is (hopefully) quite easy to understand, so I won't comment on it here.

To start the server you need write permission for /tmp. Start it as:

/path/to/timeserver -d 2

A message will appear stating the server is listening on /tmp/timeserver.

Now, let's try out the server. Put the following in a script:

use strict;
use warnings;

use IO::Socket::UNIX;
use IO::Handle::Record;

my $s=IO::Socket::UNIX->new('/tmp/timeserver')
  or die "ERROR: cannot connect to timeserver: $!";
$s->fds_to_send=[1];                  # tell $s to pass STDOUT (fd=1) to the server with the next request
$s->write_record(clock=>'boundary');  # send the request and pass the fd
my @reply=$s->read_record;            # read the reply
print "reply: @reply\n";              # done

The script connects to the server and sends it the clock command. Along with the request, the script's STDOUT is sent; IO::Handle::Record provides that functionality. Call the script and pipe the output to a cat process. The cat makes sure the infinite output can be stopped by killing it. Otherwise you would have to kill the timeserver or close the terminal window to stop it.

perl script.pl | cat

You'll see infinite output like this:

62
--boundary
Content-Type: text/html

<html><body><h1>Mon Apr 19 16:42:03 2010</h1></body></html>
reply: OK
62
--boundary
Content-Type: text/html

<html><body><h1>Mon Apr 19 16:42:04 2010</h1></body></html>
62
--boundary
Content-Type: text/html

...

The "reply: OK" line is the only thing written by the script; all other output comes directly from the timeserver. Check with ps to see that the Perl script has disappeared from the process list. It is no longer needed, because the timeserver talks directly to the cat process. The messages on the timeserver console say that a new connection has been accepted and, shortly afterwards, that the connection has been closed. Now kill the cat process and you'll see a third line stating that the clock thread has finished.

Integrate it with mod_perl

Now that the server runs, let's do the mod_perl part. Install the following handler as the PerlResponseHandler for a location.

use strict;
use warnings;

use Apache2::RequestRec ();
use Apache2::RequestIO ();               # provides $r->rflush
use Apache2::Connection ();
use Apache2::Const -compile => qw(OK);
use APR::Socket ();
use POSIX ();
use IO::Socket::UNIX;
use IO::Handle::Record;

sub handler {
  my ($r)=@_;

  my $boundary='The Final Frontier';
  $r->content_type('multipart/x-mixed-replace;boundary="'.$boundary.'"');
  $r->rflush;                            # send the header fields now

  my $ts=IO::Socket::UNIX->new('/tmp/timeserver')
    or die "ERROR: cannot connect to timeserver: $!";

  my $fd=$r->connection->client_socket->fileno;   # fd of the client connection
  $ts->fds_to_send=[$fd];                # pass it along with the next request
  $ts->write_record('clock', $boundary);
  my @reply=$ts->read_record;

  $r->connection->client_socket->close;  # the timeserver holds the connection now

  return Apache2::Const::OK;
}

The lines to understand are the $r->rflush call, the ->fileno and fds_to_send lines, and the final ->close. After setting the content type, $r->rflush is issued. It ensures that the header fields are sent to the browser, but no other output. The ->fileno call then fetches the file descriptor of the client connection, which is passed to the timeserver. Now the Apache worker can close the client connection because the timeserver holds its own copy. Remember not to set a Content-Length header, or the response would not use chunked Transfer-Encoding.
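Once Apache has closed its side, nothing adds the chunk framing any more, so whoever holds the socket has to write it. The lines reading 62 in the sample output further up look like hexadecimal chunk sizes (0x62 is 98). The sketch below shows how one clock tick could be framed under that assumption. The function names multipart_part and http_chunk are invented here, and the exact line endings used by the real timeserver are not known from this page.

```perl
use strict;
use warnings;

my $CRLF = "\r\n";

# Hypothetical helper: one part of a multipart/x-mixed-replace response.
sub multipart_part {
    my ($boundary, $html) = @_;
    return "--$boundary$CRLF"
         . "Content-Type: text/html$CRLF"
         . $CRLF
         . $html . $CRLF;
}

# Hypothetical helper: wrap a byte string as one HTTP/1.1 chunk, that is
# the size in hexadecimal, CRLF, the data, CRLF.
sub http_chunk {
    my ($data) = @_;
    return sprintf('%x', length $data) . $CRLF . $data . $CRLF;
}

# One tick of the clock, framed the way a browser expects it on the wire.
my $html = '<html><body><h1>' . scalar(localtime) . '</h1></body></html>';
print http_chunk(multipart_part('boundary', $html));
```

A server loop would write one such chunk per second to the passed descriptor. The length is counted in bytes, so the data must already be encoded when it reaches http_chunk.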

Watch the access_log file and you'll see something like this:

127.0.0.1 - - [19/Apr/2010:17:28:36 +0200] "GET /test2 HTTP/1.1" 200 - "-" ...

That means the request cycle is finished and the apache worker is ready to serve again.

But the clock in the browser still ticks. It is handled by the timeserver without apache interaction.

Drawbacks

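An SSL connection cannot be handled this way, of course: the encryption state lives inside the Apache worker, so the timeserver would write unencrypted data to the raw socket.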

Last updated: 22.04.2010