Sunday, July 11, 2010

sendfile 0.7.1

I have just uploaded sendfile 0.7.1 to hackage.

The sendfile library exposes zero-copy sendfile functionality in a portable way. If a platform does not support sendfile, a fallback implementation in Haskell is provided. It currently has zero-copy support for Linux, Darwin, FreeBSD, and Windows.

Using sendfile typically reduces CPU load and may increase I/O throughput.

The new release of sendfile adds the ability to hook into the send loop. This is useful if you want to tickle timeouts or update a progress bar while the file is being sent.

This turned out to be rather tricky because each platform implements sendfile a little differently. The point of the sendfile library, however, is to provide a unified interface so that other developers do not have to know any of the platform-specific details.

The solution in 0.7.1 is to use a simple, specialized iteratee. Each pass of the sendfile loop can end in one of three states:

(1) The requested number of bytes for that iteration was sent successfully, and there are more bytes left to send.

(2) Some (possibly 0) bytes were sent, but the file descriptor would now block if more bytes were written. There are more bytes left to send.

(3) All the bytes were sent, and there is nothing left to send.

We handle these three cases by using a type with three
constructors:

    data Iter
        = Sent Int64 (IO Iter)
        | WouldBlock Int64 Fd (IO Iter)
        | Done Int64

All three constructors provide an Int64 which represents the
number of bytes sent for that particular iteration. (Not the total
byte count).

The Sent and WouldBlock constructors provide IO Iter as their
final argument. Running this IO action will send the next block of
data.

The WouldBlock constructor also provides the Fd for the output
socket. You should not send any more data until the Fd can be
written to without blocking. The easiest way to do that is to use
threadWaitWrite to suspend the thread until the Fd is available.

A very simple function to drive the Iter might look like:

    runIter :: IO Iter -> IO ()
    runIter iter =
        do r <- iter
           case r of
             (Done _n) -> return ()
             (Sent _n cont) -> runIter cont
             (WouldBlock _n fd cont) ->
                 do threadWaitWrite fd
                    runIter cont

You would use it as the first argument to a *IterWith function, e.g.

    sendFileIterWith runIter outputSocket "/path/to/file" (2^16)

If we want to do something fancier, such as update timeouts or a progress bar, we can do it in a custom runIter function. If we are using a non-standard I/O manager, we might be able to suspend the thread via a call other than threadWaitWrite.
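As a rough sketch of the "something fancier" case, here is a driver that keeps a running total and passes it to a callback after every pass of the send loop. The names runIterWithProgress and report are made up for this illustration and are not part of the sendfile API, and the Iter type is repeated locally only so that the snippet compiles on its own. The callback is where you might tickle a timeout or redraw a progress bar.

```haskell
import Control.Concurrent (threadWaitWrite)
import Data.Int (Int64)
import System.Posix.Types (Fd)

-- Local copy of the Iter type shown above, so this sketch is self-contained.
data Iter
    = Sent Int64 (IO Iter)
    | WouldBlock Int64 Fd (IO Iter)
    | Done Int64

-- | Drive an Iter to completion. After every pass, call the (hypothetical)
-- 'report' callback with the total number of bytes sent so far.
-- Returns the final total.
runIterWithProgress :: (Int64 -> IO ()) -> IO Iter -> IO Int64
runIterWithProgress report = go 0
  where
    go total step = do
        r <- step
        case r of
            Done n -> bump total n
            Sent n cont -> do
                t <- bump total n
                go t cont
            WouldBlock n fd cont -> do
                t <- bump total n
                -- Suspend this thread until the socket is writable again.
                threadWaitWrite fd
                go t cont

    -- Each constructor carries the byte count for one pass only, so the
    -- running total has to be accumulated here.
    bump total n = do
        let total' = total + n
        report $! total'
        return total'
```

It would be passed to a *IterWith function in the same position as runIter, for example with (runIterWithProgress print) as the first argument.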

What Next?


The new version of sendfile will be used to improve the timeout handling in the Haskell web framework, Happstack.

It would be nice if the sendfile library could export a low-level function like:

sendfile :: Fd -> Fd -> Int64 -> Int64 -> IO (Bool, Int64)

It would take the output socket, the input file descriptor, an offset, and a length, and return whether the output socket blocked along with the number of bytes written.
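To make the shape of that hypothetical function more concrete, here is a sketch of how a caller might loop over it. Nothing here exists in the sendfile library: LowLevelSendFile and sendAll are invented names, the low-level function is passed in as an argument so that the snippet compiles without it, and the Bool is assumed to mean "the output socket would block".

```haskell
import Control.Concurrent (threadWaitWrite)
import Control.Monad (when)
import Data.Int (Int64)
import System.Posix.Types (Fd)

-- Hypothetical low-level primitive: output Fd, input Fd, offset, length.
-- Assumed result: (output would block, bytes written by this call).
type LowLevelSendFile = Fd -> Fd -> Int64 -> Int64 -> IO (Bool, Int64)

-- | Call the primitive repeatedly until 'count' bytes have been sent,
-- waiting for the output Fd whenever it reports that it would block.
-- Returns the total number of bytes actually sent.
sendAll :: LowLevelSendFile -> Fd -> Fd -> Int64 -> Int64 -> IO Int64
sendAll sendfile out inp offset0 count0 = go offset0 count0 0
  where
    go offset remaining total
        | remaining <= 0 = return total
        | otherwise = do
            (blocked, sent) <- sendfile out inp offset remaining
            when blocked (threadWaitWrite out)
            if sent == 0 && not blocked
                -- No progress and no blocking: treat it as end of input
                -- rather than spinning forever.
                then return total
                else go (offset + sent) (remaining - sent) (total + sent)
```

A caller who wanted timeouts or progress reporting would add them inside go, which is essentially what the Iter-based API already allows.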

Unfortunately, it is not possible to provide a portable implementation of this sendfile function. That would require functions which can operate directly on the Fds. But those functions live in the unix package, which is not portable.

Another non-solution is to have a module like Network.Socket.SendFile.LowLevel, which is only exported on the platforms that provide a low-level sendfile implementation. However, it is my understanding that this is not really allowed by the cabal policy, because there would be no way to specify that you require a version of the sendfile library that exports .LowLevel.

So, I believe a more correct solution is to create a *new* package, sendfile-lowlevel, which exports Network.Socket.SendFile.LowLevel. This assumes that there is some way to mark that a package is only available on certain platforms. However, I am not sure if that can be done.

Hopefully the new API provides enough flexibility that there is no need for an even lower-level API to be exposed. If you think you need something lower-level, let me know, and let's see if we can work something out.
