Monday, November 15, 2010
ANN: happstack-heist now available
Detailed documentation on using Heist with Happstack is available in the Happstack Crash Course.
Happstack is a flexible Haskell Web Framework with many supporting optional components.
Heist is an XML templating engine. The static portions of your templates are written in XML files which are loaded by the server at runtime. This makes it easy to modify the templates without having to recompile and restart the server. It also makes it easy to work with template designers who do not know Haskell.
The dynamic portions of the templates are generated in Haskell and are spliced into the templates. This means you have the full expressive power of Haskell at your disposal for the generated portions of the templates. That is a lot nicer than trying to use something like XSLT (though Happstack does support that as well).
Happstack offers a wide variety of templating solutions including BlazeHtml, HSP, Hamlet, and HStringTemplate. But Heist fills a nice hole in the spectrum, and we are pleased to be able to offer it now.
It is available in darcs and on hackage. It has been tested against Happstack from darcs, but should work against Happstack stable as well.
If there are any bugs or improvements you would like to see, let us know!
- jeremy
Wednesday, October 20, 2010
Recompile your Haskell-based templates faster than you can hit F5.
1. DSLs/libraries such as BlazeHtml, HSP, Hamlet, etc, where your templates are written in Haskell and compiled at compile time.
2. Libraries like heist, HStringTemplate, etc, where your templates are written in some external template file and read at runtime by the server.
Each method has strengths and weaknesses -- and so each project needs to pick the solution that works best for them.
For my projects I love using HSP. I like having the full expressive power of Haskell in my templates, and the added safety that the type checker provides. But I hate having to recompile, relink, and restart my app server dozens and dozens of times when I am developing my templates. And, so it is with great pleasure that I present the triumphant return of happstack-plugins!
happstack-plugins
happstack-plugins leverages the recently revived plugins package so that individual page templates can be automatically recompiled and reloaded into a running happstack application. happstack-plugins uses hinotify to watch the haskell source files containing your page templates. Whenever you save changes, the page is automatically recompiled and reloaded into the running server. Typically this happens fast enough that by the time you switch to the browser and hit reload, the updated page is already available.
You can see a demo of happstack-plugins in action here:
How to use happstack-plugins
Using happstack-plugins is very straightforward. First you need to install the happstack-plugins library, which is currently only available in the happstack darcs repository:
darcs get http://patch-tag.com/r/mae/happstack
For best performance you should put each page template in its own module so that it can be recompiled and reloaded faster.
The templates themselves require no special modifications. Here is a simple helloPage template:
> module HelloPage where
>
> import Happstack.Server
>
> helloPage :: String -> ServerPart Response
> helloPage noun = ok (toResponse $ "hello, " ++ noun)
This template takes a single String argument and returns a text/plain page which says, "hello, <string>". We could just as well use BlazeHtml, HSP, etc., but using String keeps this example short and simple.
As I mentioned, there is nothing new going on here; it is just a normal happstack ServerPart.
The interesting changes are in the Main module. There are only 3 simple changes required to support templates. But first, some boring stuff at the top of the module:
> {-# LANGUAGE CPP, TemplateHaskell #-}
> module Main where
>
> import Control.Monad (msum)
> import Happstack.Server
1. Here we #ifdef some module imports. These two modules provide the same interface. The Dynamic version actually does page recompilation and reloading. The Static version just links things in the normal way. This makes it easy to use dynamic loading during development but static linking for the live server by simply defining or undefining PLUGINS.
> #ifdef PLUGINS
> import Happstack.Server.Plugins.Dynamic
> #else
> import Happstack.Server.Plugins.Static
> #endif
> import HelloPage
2. In main we call initPlugins, which starts the recompiler/reloader and hinotify. If you import Happstack.Server.Plugins.Static, initPlugins is a no-op, so we do not have to add any extra #ifdefs.
> main :: IO ()
> main =
>   do ph <- initPlugins
>      simpleHTTP nullConf $ pages ph
3. Here is where we actually specify a template to load dynamically:
> pages :: PluginHandle -> ServerPart Response
> pages ph =
>   msum [ $(withServerPart 'helloPage) ph $ \helloPage ->
>            (helloPage "hello")
>        ]
Normally we would just have:
> pages :: PluginHandle -> ServerPart Response
> pages ph =
>   msum [ helloPage "world"
>        ]
So the new part is the Template Haskell function withServerPart, which effectively takes three arguments:
1. the name of the symbol to dynamically load
2. the PluginHandle which initPlugins returned
3. a function which will use the loaded symbol
So, withServerPart effectively has the type:
> withServerPart :: (MonadIO m, ServerMonad m) => Name -> PluginHandle -> (a -> m b) -> m b
Even though we are dynamically reloading the page at runtime, the compiler will still check that the types are correct when it compiles the main application.
If we change helloPage "hello" to helloPage 1 and try to build Main.lhs, we will get the error:
Main.lhs:50:28:
No instance for (Num String)
arising from the literal `1' at Main.lhs:50:28
Possible fix: add an instance declaration for (Num String)
In the first argument of `helloPage', namely `1'
In the expression: (helloPage 1)
In the second argument of `($)', namely
`\ helloPage -> (helloPage 1)'
Failed, modules loaded: HelloPage.
What's left to do?
There are two big features on the TODO list. If you think happstack-plugins is cool, I encourage you to work on them!
1. The underlying plugins library is broken when it comes to hierarchical modules. Ideally I would put all the pages in Pages.*, for example Pages.HelloPage. But that does not work. As a hack, you can modify System.Plugins.Make.build and comment out output in the let flags = ... declaration. This fixes hierarchical modules, but requires you to run your app with its working directory set to the root directory of your project. That is fine for happstack app development, but not an ideal solution for all users of the plugins library. If someone could fix hierarchical module support in plugins, that would be great for everyone.
2. hinotify is only supported under Linux. However, it should not be that hard to make hinotify support optional (via a compile time flag). Without hinotify, we would just do a quick stat() every time the template is invoked and see if a recompilation is needed. When a compilation is needed, you will have to wait for that page to recompile and reload -- but it will still be much faster than rebuilding and restarting the whole server.
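The stat()-based check in item 2 boils down to comparing two timestamps. Here is a minimal sketch of that decision; needsRecompile is a hypothetical name, not part of happstack-plugins or plugins, and the timestamps could come from something like System.Directory.getModificationTime:

```haskell
-- | Hypothetical helper (not part of happstack-plugins): decide whether a
-- page template must be rebuilt, given the modification time of its source
-- file and, if a compiled object already exists, that object's time.
needsRecompile :: Ord t => t -> Maybe t -> Bool
needsRecompile _       Nothing        = True              -- never compiled yet
needsRecompile srcTime (Just objTime) = srcTime > objTime -- source is newer
```

The check is cheap, so running it on every request should cost very little compared to the recompilation it occasionally triggers.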
Wednesday, October 13, 2010
Is the RqData monad still needed?
cdsmitch recently asked if RqData is really needed in Happstack. The answer is, "no, but it is still useful sometimes."
I can say "no" with certainty because in the darcs version of Happstack, it is already optional.
The new and improved RqData
Functions like look now work in any monad which is an instance of HasRqData:
> look :: (Functor m, Monad m, HasRqData m) => String -> m String
Since there is a HasRqData instance for ServerPart, we effectively have the function:
> look :: String -> ServerPart String
Here is an example of using look without having to jump through any hoops:
> module Main where
>
> import Happstack.Server (ServerPart, look, nullConf, simpleHTTP, ok)
>
> helloPart :: ServerPart String
> helloPart =
>     do greeting <- look "greeting"
>        noun     <- look "noun"
>        ok $ greeting ++ ", " ++ noun
>
> main :: IO ()
> main = simpleHTTP nullConf $ helloPart
Now if we visit http://localhost:8000/?greeting=hello&noun=rqdata, we will get the message hello, rqdata.
Sweet!
But why keep RqData around?
Using look in the ServerPart monad is simple. But when it fails, it just calls mzero. That can be very frustrating if you are debugging your forms or debugging calls to your web service API. Instead of an error telling you what parameter was missing, you simply get a generic 404 error.
Using the RqData monad/applicative functor gives you the option to provide detailed error messages when something goes wrong:
> module Main where
>
> import Control.Applicative ((<$>), (<*>))
> import Happstack.Server (ServerPart, badRequest, nullConf, ok, simpleHTTP)
> import Happstack.Server.RqData (RqData, look, getDataFn)
>
> helloRq :: RqData (String, String)
> helloRq =
>     (,) <$> look "greeting" <*> look "noun"
>
> helloPart :: ServerPart String
> helloPart =
>     do r <- getDataFn helloRq
>        case r of
>          (Left e) ->
>              badRequest $ unlines e
>          (Right (greet, noun)) ->
>              ok $ greet ++ ", " ++ noun
>
> main :: IO ()
> main = simpleHTTP nullConf $ helloPart
If you visit http://localhost:8000/?greeting=hello&noun=world, you will get the familiar greeting hello, world.
But if you leave off the query parameters http://localhost:8000/, you will get a list of errors:
Parameter not found: greeting
Parameter not found: noun
This is really nice when you are debugging your code.
Now with more composability!
Since RqData and ServerPart are instances of Applicative and Alternative, you can now reuse many functions from those libraries. For example, if a query parameter is optional, you can simply write:
> do greet <- optional $ look "greeting"
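To see what optional does without running a server, here is a small sketch that uses Maybe as a stand-in for the request monad (an assumption made purely for illustration; Params and greetingLine are hypothetical names). A Nothing models a failed look:

```haskell
import Control.Applicative (optional)
import Data.Maybe (fromMaybe)

-- | Hypothetical stand-in for the decoded query parameters.
type Params = [(String, String)]

-- | Maybe plays the role of the request monad here: Nothing models a
-- lookup that failed because the parameter was missing.
greetingLine :: Params -> Maybe String
greetingLine params = do
  greet <- optional (lookup "greeting" params) -- never fails; yields Maybe String
  noun  <- lookup "noun" params                -- fails if "noun" is missing
  return (fromMaybe "hello" greet ++ ", " ++ noun)
```

A missing "greeting" falls back to a default, while a missing "noun" still makes the whole computation fail.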
There is also a new combinator checkRq which can be used to validate query parameters, or to convert a query parameter to another type:
> checkRq :: (Monad m, HasRqData m) => m a -> (a -> Either String b) -> m b
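The second argument of checkRq is an ordinary pure function, so it can be written and tested on its own. Here is a sketch of one such checker; parseAge is a hypothetical name chosen for this example, and it would presumably be used as something like look "age" `checkRq` parseAge:

```haskell
-- | Hypothetical checker with the shape checkRq expects for its second
-- argument: it converts the raw parameter to an Int and validates it.
parseAge :: String -> Either String Int
parseAge str =
  case reads str of
    [(n, "")]
      | n >= 0    -> Right n
      | otherwise -> Left ("age must not be negative: " ++ str)
    _ -> Left ("not a whole number: " ++ str)
```

Keeping the validation pure means the error messages can be checked without issuing any HTTP requests.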
If you are curious be sure to check out the Happstack Crash Course where the new RqData module is documented in detail with many working examples.
I would love to hear feedback on the new and improved RqData module, and any suggestions for improvement!
Also, be on the lookout for a future blog post about the RqData Arrow. :)
Monday, July 19, 2010
Changes to request body and RqData in head
I have just pushed some patches which affect the way the Request body and RqData are handled in happstack 0.6. This contains user visible changes which will affect you if you:
- Use RqData
- Directly use the rqBody field in Request
- Directly use the rqInput field in Request
- Directly work with the Input type
- Allow file uploads
Some of the changes fix bugs (design flaws), and others are for new features and functionality. The non-compatible API changes are pretty small, so it should be easy to port your code. It basically comes down to:
- getDataFn, withDataFn, etc take an extra argument of the type BodyPolicy
- getDataFn, withDataFn, etc return Either [String] a instead of Maybe a
- the inputValue field of the Input type is now Either FilePath L.ByteString instead of L.ByteString
- you have to explicitly import the module Happstack.Server.RqData
In this post I will describe what motivated these changes. I am hoping to also get feedback on these changes before we release 0.6, since it will be less painful to make further changes now.
the Request body and space usage
In the old code the Request type stores the request body as a simple lazy ByteString:
> newtype RqBody = Body { unBody :: L.ByteString } deriving (Read,Show,Typeable)
>
> data Request = Request { ...
>                        , rqBody :: RqBody
>                        }
This feels nice, because it is a simple, pure value. Unfortunately, it is really not a great idea in practice. The request body does not initially require any space, because it is an unevaluated lazy ByteString. But the ServerPart holds the Request in its environment, and that means the garbage collector cannot free the RqBody as you evaluate it. If the request body contained gigabytes of data, that could be disastrous.
The solution in Happstack 0.6 is to use an MVar to hold the request body:
>
> data Request = Request { ...
>                        , rqBody :: MVar RqBody
>                        }
Instead of using rqBody directly, it is better to use takeRequestBody, so that your code will not break if we switch to IORef or something else.
> takeRequestBody :: Request -> IO (Maybe RqBody)
> takeRequestBody rq = tryTakeMVar (rqBody rq)
Now, when you process the RqBody, the Request will not be holding onto it, so the garbage collector can free it (assuming your code does not hold onto it and introduce a new space leak).
This does have a drawback, however. A ServerPart can call mzero at any time, and processing will move on to the next ServerPart. However, if you have already taken the RqBody then the next ServerPart may be missing critical data it needs. But, if we left the RqBody intact, that would result in the space leak. I think that in practice, if a ServerPart made enough progress that it started consuming the RqBody and then failed, it is unlikely that another ServerPart would succeed and need the RqBody. If another ServerPart succeeds, it is probably just a 404 Not Found handler or something similar, which does not need the request body. So it seems like it is better to have the default behavior be the more space friendly solution.
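The take-once behavior follows directly from tryTakeMVar. Here is a small sketch that uses a plain String in place of RqBody (an assumption to keep the example self-contained; takeTwice is a hypothetical name):

```haskell
import Control.Concurrent.MVar (MVar, newMVar, tryTakeMVar)

-- | Try to take the body twice and report what each attempt saw.
-- A String stands in for RqBody in this sketch.
takeTwice :: MVar String -> IO (Maybe String, Maybe String)
takeTwice bodyVar = do
  first  <- tryTakeMVar bodyVar -- the first consumer gets the body
  second <- tryTakeMVar bodyVar -- the MVar is now empty, so this is Nothing
  return (first, second)
```

The second attempt returning Nothing is exactly the situation a later ServerPart would face after an earlier one consumed the body and then called mzero.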
We will also provide peekRequestBody and/or putRequestBody functions so that you can opt to leave the request body intact. It is up to you to be sensible about using them.
BodyInput and space usage
In RqData, the cookies, QUERY_STRING, and request body (when appropriate) are parsed into a [(String, Input)], where String is the name of the key, and Input is the value.
In Happstack 0.6, Input will be the type:
> data Input = Input
>     { inputValue       :: Either FilePath L.ByteString
>     , inputFilename    :: Maybe FilePath
>     , inputContentType :: ContentType
>     } deriving (Show,Read,Typeable)
In Happstack 0.5 the inputValue is simply a L.ByteString. Once again, this seems fine at first. After all, the inputValues are lazy ByteStrings, so we can process them lazily, right? Well, not quite. In the unprocessed request body, the key/value pairs are laid out like this:
key1
value1
key2
value2
key3
value3
key4
value4
...
If we were to consume the key/value pairs in a sequential manner, then we would be ok. But, generally we want to use functions which can look up a specific key. Imagine we want to look up key4. In order to do that we have to first read in all the preceding key/value pairs. If we knew we only cared about key4 then we could just toss the rest. But with the monadic RqData code we don't know that. (A future post will talk about an arrow based alternative where we do know that). So, we have to store all the key/value pairs in case we want to look up key1 after key4.
In Happstack 0.5, we store all those values in RAM. But, some of those values might be (huge) files. That clearly isn't going to work. So we once again trade off a bit of simplicity/elegance for the practical matter of not having unlimited amounts of RAM. Instead we store some values in RAM and some values on the disk. How do we decide what goes where? That brings us to BodyPolicy.
BodyPolicy
When parsing the request body, we need some way to decide what values should be stored in RAM and what values should be saved to disk. Additionally, we want to impose limits on how much data can be stored in either location. If a user decides to post the contents of /dev/random you are likely to want to cut them off at some point. However, the specific values for the quotas are application specific. In fact, they may be specific to the particular form that is being processed. For example, an admin user might have higher quotas than a regular user.
The answers to these questions are provided by the BodyPolicy, which looks like:
> data BodyPolicy
>     = BodyPolicy { inputWorker :: Int64 -> Int64 -> Int64 -> InputWorker
>                  , maxDisk     :: Int64 -- ^ maximum bytes to save to disk (files)
>                  , maxRAM      :: Int64 -- ^ maximum bytes to hold in RAM
>                  , maxHeader   :: Int64 -- ^ maximum header size (this only affects headers in the multipart/form-data)
>                  }
The inputWorker is the function that actually decides where values should be saved, and implements the quotas. Its Int64 arguments are the quotas for the disk, RAM, and other headers which don't really get saved, but which can temporarily take up space. The next three fields are the values to pass to the inputWorker.
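To make the idea of quotas concrete, here is a deliberately simplified sketch of the kind of decision an inputWorker has to make. It is not the real InputWorker type: Placement, Quota, and placeValue are hypothetical names, the size of each value is assumed to be known up front, and file uploads are assumed to go to disk while everything else stays in RAM.

```haskell
import Data.Int (Int64)

-- | Where a parsed value ends up (hypothetical type for this sketch).
data Placement = InRAM | OnDisk
  deriving (Eq, Show)

-- | Bytes still available in each location (hypothetical type).
data Quota = Quota
  { diskLeft :: Int64
  , ramLeft  :: Int64
  } deriving (Eq, Show)

-- | Decide where a value of the given size goes and charge the matching
-- quota. Assumption: file uploads go to disk, everything else stays in RAM.
placeValue :: Bool -> Int64 -> Quota -> Either String (Placement, Quota)
placeValue isFile size quota
  | isFile && size <= diskLeft quota =
      Right (OnDisk, quota { diskLeft = diskLeft quota - size })
  | isFile = Left "disk quota exceeded"
  | size <= ramLeft quota =
      Right (InRAM, quota { ramLeft = ramLeft quota - size })
  | otherwise = Left "RAM quota exceeded"
```

A real worker has to enforce the limits while the data is still streaming in, which is why the actual interface is more involved than this.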
In most cases, you do not need to write your own inputWorker. It is sufficient to use the defaultBodyPolicy:
> defaultBodyPolicy :: FilePath -> Int64 -> Int64 -> Int64 -> BodyPolicy
The first argument is the directory to store temporary files in, and the next three arguments are the quota values. I am not going to cover defaultBodyPolicy in detail in this post. But it is well documented in the Happstack Crash Course.
Improvements to RqData
The new RqData module also includes a number of new features.
There is now an Applicative functor instance for RqData. The applicative functor instance accumulates errors. This means if you try to look up multiple invalid keys, the error message will report all the missing values, not just the first one. This is nice when you are debugging your code, and is also nice if you provide a web service (REST API, etc) and want to provide your API users with detailed error messages instead of "Invalid Request".
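Here is a toy model of how an applicative instance can accumulate errors. It is only a sketch of the technique, not the actual RqData implementation: Rq, lookParam, and greetingAndNoun are hypothetical names, and the parameters are modeled as a plain association list.

```haskell
-- | A toy stand-in for RqData: a reader over the parameters that either
-- fails with a list of error messages or produces a value.
newtype Rq a = Rq { runRq :: [(String, String)] -> Either [String] a }

instance Functor Rq where
  fmap f (Rq g) = Rq (fmap f . g)

instance Applicative Rq where
  pure x = Rq (const (Right x))
  Rq f <*> Rq x = Rq $ \env ->
    case (f env, x env) of
      (Left e1, Left e2) -> Left (e1 ++ e2) -- keep the errors from both sides
      (Left e1, Right _) -> Left e1
      (Right _, Left e2) -> Left e2
      (Right g, Right a) -> Right (g a)

-- | Look up one parameter, reporting its name when it is missing.
lookParam :: String -> Rq String
lookParam key = Rq $ \env ->
  maybe (Left ["Parameter not found: " ++ key]) Right (lookup key env)

greetingAndNoun :: Rq (String, String)
greetingAndNoun = (,) <$> lookParam "greeting" <*> lookParam "noun"
```

Because <*> runs both sides before combining them, every missing parameter is reported. A monadic bind could not do this, since the second lookup may depend on the result of the first.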
We now provide two filters (body and queryString) which limit the scope of the look* functions to either the request body or the QUERY_STRING.
A new function lookFile is provided to assist with handling file uploads.
A new function checkRq is provided to help you convert request parameters to Haskell types, or to check that a value meets some conditions.
Summary
This post gives some of the background on the changes to how we handle the request body and form data. To actually see what the changes look like in practice, you should check out the RqData section in the Happstack Crash Course. It gives detailed examples of all the features and changes I talked about in this post. I have also updated the haddock documentation in darcs.
I would love to hear your opinions. Do you love the changes? Hate the changes? Have better ideas about how to solve the problems? In terms of handling the raw request body, I believe both Yesod and Snap use the same basic approach -- the first handler to try to use the request body gets the whole thing, and everyone else gets nothing. (And they provide ways to put the request body back if you want to.)
Sunday, July 11, 2010
sendfile 0.7.1
The sendfile library exposes zero-copy sendfile functionality in a portable way. If a platform does not support sendfile, a fallback implementation in Haskell is provided. It currently has zero-copy support for Linux, Darwin, FreeBSD, and Windows.
The sendfile functionality typically reduces CPU-load and (possibly) increases IO throughput.
The new release of sendfile adds the ability to hook into the send loop. This is useful if you want to tickle timeouts or update a progress bar while the file is being sent.
This turned out to be rather tricky because each platform implements sendfile a little differently. But, the point of the sendfile library is to provide a unified interface so that other developers do not have to know any of the platform specific details.
The solution in 0.7.1 is to use a simple, specialized iteratee. Each pass of the sendfile loop can end in one of three states:
(1) The requested number of bytes for that iteration was sent successfully, and there are more bytes left to send.
(2) Some (possibly 0) bytes were sent, but the file descriptor would now block if more bytes were written. There are more bytes left to send.
(3) All the bytes were sent, and there is nothing left to send.
We handle these three cases by using a type with three constructors:
data Iter
    = Sent Int64 (IO Iter)
    | WouldBlock Int64 Fd (IO Iter)
    | Done Int64
All three constructors provide an Int64 which represents the number of bytes sent for that particular iteration (not the total byte count).
The Sent and WouldBlock constructors provide IO Iter as their final argument. Running this IO action will send the next block of data.
The WouldBlock constructor also provides the Fd for the output socket. You should not send any more data until the Fd would not block. The easiest way to do that is to use threadWaitWrite to suspend the thread until the Fd is available.
A very simple function to drive the Iter might look like:
runIter :: IO Iter -> IO ()
runIter iter =
    do r <- iter
       case r of
         (Done _n)      -> return ()
         (Sent _n cont) -> runIter cont
         (WouldBlock _n fd cont) ->
             do threadWaitWrite fd
                runIter cont
You would use it as the first argument to a *IterWith function, e.g.
sendFileIterWith runIter outputSocket "/path/to/file" (2^16)
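A driver can also keep a running total, which is the basis for a progress bar. The sketch below redefines Iter locally so it stands alone; runIterCounting and its report callback are hypothetical names, not part of the sendfile API:

```haskell
import Control.Concurrent (threadWaitWrite)
import Data.Int (Int64)
import System.Posix.Types (Fd)

-- | Local copy of the Iter type described above, so the sketch stands alone.
data Iter
  = Sent Int64 (IO Iter)
  | WouldBlock Int64 Fd (IO Iter)
  | Done Int64

-- | Hypothetical driver: runs the send loop, calls the callback with the
-- running total after every iteration, and returns the total bytes sent.
runIterCounting :: (Int64 -> IO ()) -> IO Iter -> IO Int64
runIterCounting report = go 0
  where
    go total step = do
      r <- step
      case r of
        Done n -> do
          let total' = total + n
          report total'
          return total'
        Sent n cont -> do
          let total' = total + n
          report total'
          go total' cont
        WouldBlock n fd cont -> do
          let total' = total + n
          report total'
          threadWaitWrite fd -- suspend until the socket is writable again
          go total' cont
```

Since each constructor carries only the bytes sent in that iteration, the driver has to do the summing itself.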
If we want to do something fancier, such as update timeouts or a progress bar, we can do it in a custom runIter function. If we are using a non-standard I/O manager, we might be able to suspend the thread via a call other than threadWaitWrite.
What Next?
The new version of sendfile will be used to improve the timeout handling in the Haskell web framework, Happstack.
It would be nice if the sendfile library could export a low-level function like:
sendfile :: Fd -> Fd -> Int64 -> Int64 -> IO (Bool, Int64)
It would take the output socket, an input file descriptor, an offset, and a length, and return whether the output socket blocked and the number of bytes written.
Unfortunately, it is not possible to provide a portable implementation of this sendfile function. That would require functions which can operate directly on the Fds. But those functions live in the unix package, which is not portable.
Another non-solution is to have a module like Network.Socket.SendFile.LowLevel which is only exported on the platforms which provide a low-level sendfile implementation. However, it is my understanding that this is not really allowed by the cabal policy because there would be no way to specify that you require a version of the sendfile library that exports .LowLevel.
So, I believe a more correct solution is to create a *new* package, sendfile-lowlevel, which exports Network.Socket.SendFile.LowLevel. This assumes that there is some way to mark that a package is only available on certain platforms. However, I am not sure if that can be done.
Hopefully the new API provides enough flexibility that there is no need for an even lower-level API to be exposed. If you think you need something lower-level, let me know, and let's see if we can work something out.