How to hand long-running requests over to another process
mod_perl worker processes are usually an expensive resource, so tying one up for a long time is undesirable. Is it possible to free the Apache worker before the request is actually done by handing the client connection over to another, more lightweight process? That is the topic of this page.
A first note: at the time of this writing, the released mod_perl version cannot do this because it offers no interface to fetch the client socket. You will have to use the newest SVN version. This is quite experimental, and the interface may change.
A second note: this technique definitely won't work on Windows. I have tried it only on Linux.
The idea
Okay, let's start. On UNIX-like operating systems the lowest-level representation of a file handle is the file descriptor, a small integer. Further, UNIX implements a special type of socket, the UNIX-domain socket, which can be used to pass open file descriptors from one process to another. The idea is to have a lightweight, event-driven server process waiting for requests on a UNIX-domain socket. The Apache worker connects to that socket and passes the client connection along with a request to the server. The server then does the long-running work, while the Apache worker finishes the request and goes back to accepting connections from other clients.
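To make the notion of a file descriptor a bit more concrete, here is a minimal sketch using only core Perl. It creates a connected pair of UNIX-domain sockets, looks at the integer behind one of them, and rebuilds a Perl handle from that bare number. It does not pass a descriptor between processes (that part is left to IO::Handle::Record later on); the variable names are made up for this illustration.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Handle;

# A connected pair of UNIX-domain stream sockets inside one process.
socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";
$left->autoflush(1);

# The file descriptor is just a small integer.
my $fd = fileno($right);

# Build a second Perl handle on top of the same descriptor (fdopen-style).
open(my $alias, '<&=', $fd) or die "cannot open fd $fd: $!";

# Whatever is written to one end can be read through the new handle.
print {$left} "hello\n";
my $line = <$alias>;

print "socket fd: $fd, received: $line";
```

The point is that the number alone is enough to get at the open connection, as long as the process actually owns that descriptor. Passing it to another process is what the UNIX-domain socket adds.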
The example will solve the same problem as my potentially-infinite-output article, namely a clock.
The tools
One of the best frameworks for event-driven programming in Perl is Coro. The tool that does the descriptor passing is IO::Handle::Record.
The server
Install the server on the machine where the web server is running. The script is (hopefully) easy to understand, so I won't comment on it here.
To start the server you need write permission for /tmp. Start it as:
/path/to/timeserver -d 2
A message will appear stating the server is listening on /tmp/timeserver.
Now, let's try out the server. Put the following in a script:
use strict;
use warnings;
use IO::Socket::UNIX;
use IO::Handle::Record;
my $s=IO::Socket::UNIX->new('/tmp/timeserver')
  or die "ERROR: cannot connect to timeserver: $!";
$s->fds_to_send=[1];                 # tell $s to pass STDOUT (fd=1) to the server with the next request
$s->write_record(clock=>'boundary'); # send the request and pass the fd
my @reply=$s->read_record;           # read the reply
print "reply: @reply\n";             # done
The script connects to the server and sends it the clock command. Along with the request, the script's STDOUT is sent. IO::Handle::Record provides that functionality.
Call the script and send the output to a cat process. The latter makes sure the infinite output can be stopped by killing the cat. Otherwise you'd have to kill the timeserver or close the terminal window to stop it.
perl script.pl | cat
You'll see infinite output like this
62
--boundary
Content-Type: text/html
<html><body><h1>Mon Apr 19 16:42:03 2010</h1></body></html>
reply: OK
62
--boundary
Content-Type: text/html
<html><body><h1>Mon Apr 19 16:42:04 2010</h1></body></html>
62
--boundary
Content-Type: text/html
...
The "reply: OK" line is the only thing that is written by the script. All the other output comes directly from the timeserver.
Check with ps to see that the perl script has disappeared from the process list. It is not needed. The timeserver talks directly to the cat process. The messages on the timeserver console say that a new connection has been accepted and, shortly after, that the connection has been closed. Now kill the cat process and you'll see a third line stating that the clock thread has finished.
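The "62" lines in the sample output above look like hexadecimal chunk sizes as used by HTTP chunked transfer encoding (0x62 is 98). The timeserver source is not reproduced on this page, so the following is only a guess at how one clock tick might be framed. The helper name clock_chunk is hypothetical, and the CRLF line endings are an assumption. Under that assumption the part for the first tick is exactly 98 bytes long.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the timeserver: wrap one HTML body as a
# multipart/x-mixed-replace part and frame it as a single HTTP chunk.
sub clock_chunk {
    my ($boundary, $html) = @_;
    my $part = "--$boundary\r\n"
             . "Content-Type: text/html\r\n"
             . "\r\n"
             . $html;
    # Chunk framing: size in hex, CRLF, payload, CRLF.
    return sprintf("%x\r\n%s\r\n", length($part), $part);
}

my $html  = '<html><body><h1>Mon Apr 19 16:42:03 2010</h1></body></html>';
my $chunk = clock_chunk('boundary', $html);

# Extract the size line to compare it with the sample output.
my ($size_line) = $chunk =~ /\A([0-9a-f]+)\r\n/;
print "chunk size line: $size_line\n";
```

If the guess is right, the server has to do this framing itself because Apache's output filters are no longer involved once the descriptor has been handed over.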
Integrate it with mod_perl
Now that the server runs, let's do the mod_perl part. Install the following handler as PerlResponseHandler for a location.
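use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();               # provides $r->rflush
use Apache2::Connection ();
use APR::Socket ();
use Apache2::Const -compile => qw(OK);   # makes Apache2::Const::OK available
use POSIX ();
use IO::Socket::UNIX;
use IO::Handle::Record;
sub handler {
  my ($r)=@_;
  my $boundary='The Final Frontier';
  $r->content_type('multipart/x-mixed-replace;boundary="'.$boundary.'"');
  $r->rflush;                            # send the response headers now
  my $ts=IO::Socket::UNIX->new('/tmp/timeserver')
    or die "ERROR: cannot connect to timeserver: $!";
  my $fd=$r->connection->client_socket->fileno;  # descriptor of the client connection
  $ts->fds_to_send=[$fd];                # pass it along with the next request
  $ts->write_record('clock', $boundary);
  my @reply=$ts->read_record;            # wait for the timeserver's reply
  $r->connection->client_socket->close;  # the timeserver holds the connection now
  return Apache2::Const::OK;
}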
The lines to understand are the rflush, fileno and close calls. After the content type is set, an $r->rflush is issued. It ensures the header fields are sent to the browser, but no other output. The ->fileno call then fetches the file descriptor of the client connection, which is passed to the timeserver. Now the Apache worker can close the client connection because it is held by the timeserver. Remember not to set a Content-Length header, or the Transfer-Encoding won't be chunked.
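Why closing the connection in the worker does not cut off the browser can be illustrated with core Perl alone. The sketch below uses fork and a pipe as a stand-in for descriptor passing: both processes end up with their own descriptor for the same open channel, and the channel stays usable until the last copy is closed. This is an analogy under that assumption, not the mechanism IO::Handle::Record uses.

```perl
use strict;
use warnings;
use POSIX ();

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: plays the role of the timeserver and keeps its copy of the write end.
    close $reader;
    sleep 1;                          # let the parent close its copy first
    print {$writer} "tick from child\n";
    close $writer;                    # flushes the buffered line
    POSIX::_exit(0);                  # leave without running parent cleanup code
}

# Parent: plays the role of the Apache worker and closes its copy right away.
close $writer;

# The channel is still open because the child holds a descriptor for it.
my $line = <$reader>;
waitpid($pid, 0);

print "parent read: $line";
```

The same reasoning applies to the client socket: once the timeserver has received its own descriptor, the close in the handler only drops the worker's reference.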
Watch the access_log file and you'll see something like this.
127.0.0.1 - - [19/Apr/2010:17:28:36 +0200] "GET /test2 HTTP/1.1" 200 - "-" ...
That means the request cycle is finished and the apache worker is ready to serve again.
But the clock in the browser still ticks. It is handled by the timeserver without apache interaction.
Drawbacks
An SSL connection cannot be handled this way, of course.
Last updated: 22.04.2010

