(A cheesy homepage for Justin Collins)
Sqwee and Unicode

As someone quite correctly pointed out, Sqwee wasn’t supporting Unicode very well. In fact, not really at all. You could put it in, but then things got weird. You might not think this would affect me much (I am generally a pretty ASCII kind of guy), but what happens when I want to do a little mathematical notation? Perhaps I need to write down {x, y} ∪ {z, y} for some reason. That would make Sqwee mildly blow up.

So I tentatively started poking into the Unicode world, expecting at any moment to have a dragon fry me to a crisp or to fall into some large crevasse. Interestingly, neither happened, but it was a little bit of a journey.

First, I found out all I needed to do was add a nice little meta-tag:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >

Simple enough, right? I tried it. Didn’t help a bit. Manually switching the encoding (in Firefox) did work, though. After examining the headers using Live Headers, it was clear the content type was still being sent as iso-8859-1.
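To see why the declared charset matters so much, here is a small Ruby sketch that reinterprets the UTF-8 bytes of the ∪ example as ISO-8859-1, which is roughly what the browser ends up doing when the header says iso-8859-1. It assumes Ruby 1.9 or later for the String encoding methods, and `misread_as_latin1` is a name made up for illustration, not anything from Sqwee:

```ruby
# encoding: utf-8

# Reinterpret the UTF-8 bytes of +text+ as ISO-8859-1, then convert the
# result back to UTF-8 so it can be printed. Hypothetical helper.
def misread_as_latin1(text)
  text.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

union = "{x, y} ∪ {z, y}"

# U+222A takes three bytes in UTF-8.
p "∪".bytes.map { |b| b.to_s(16) }   # ["e2", "88", "aa"]

garbled = misread_as_latin1(union)
p union.length                        # 15
p garbled.length                      # 17: the one symbol became three characters
p garbled
```

Each of the three bytes gets shown as its own character (one of them an invisible control character), so the single symbol turns into junk even though the bytes on the wire are perfectly fine.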

I figured this was due to the Lighttpd webserver not setting the content type correctly, so I went about trying to change that. I thought it would be possible to change it in the mimetypes:

mimetype.assign             = (
  ".html" => "text/html; content-type: utf-8" 
)

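I saw this suggested somewhere, but it didn’t help. (In hindsight, the parameter ought to be spelled charset=utf-8 rather than content-type: utf-8, and an entry for .html would not cover the .rhtml pages anyway.)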

Then I read that Prototype only handles Unicode properly with “post” requests, not with “get”, so I tried that:

new Ajax.Updater("main", "index.rhtml", {
  method: "post",
  //...
});

But it didn’t help either!

At this point I was getting somewhat frustrated, but then I remembered that eruby generates its own headers. Perhaps that was the problem?

Indeed it was! Fortunately, it is possible to set the charset of the content type via a command-line switch (-C):

usage: ./eruby [switches] [inputfile]

  -d, --debug           set debugging flags (set $DEBUG to true)
  -K[kcode]             specifies KANJI (Japanese) code-set
  -M[mode]              specifies runtime mode
                          f: filter mode
                          c: CGI mode
                          n: NPH-CGI mode
  -C [charset]          specifies charset parameter for Content-Type
  -n, --noheader        disables CGI header output
  -v, --verbose         enables verbose mode
  --version             print version information and exit

Now I was in business! I went ahead and changed the Lighttpd config file to add in the appropriate switches:

cgi.assign = (".rhtml" => "/home/justin/sqwee/bin/eruby -Ku -C utf-8")

Unfortunately, this doesn’t work at all: Lighttpd uses stat to check that the specified program exists, and the extra options make that check fail. To work around this, I needed to create a little script to launch eruby with the appropriate options:

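#!/usr/bin/env ruby
# Launch the eruby binary sitting next to this script with the UTF-8 switches.
# Arguments are passed as a list so that no shell re-splits paths with spaces.
eruby = File.join(File.dirname(File.expand_path(__FILE__)), 'eruby')
exec(eruby, '-Ku', '-C', 'utf-8', *ARGV)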

That worked perfectly, and suddenly Sqwee had Unicode! That means time for a new release.
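
For anyone without Live Headers handy, a check along these lines should show which charset the server is actually declaring. This is only a sketch: `charset_of` and `declared_charset` are names invented here, and the URL to pass in is whatever your own Sqwee install answers on.

```ruby
require "net/http"
require "uri"

# Pull the charset parameter out of a Content-Type header value,
# or return nil when none is declared. Hypothetical helper.
def charset_of(content_type)
  return nil if content_type.nil?
  match = content_type.match(/;\s*charset\s*=\s*"?([^";\s]+)"?/i)
  match && match[1].downcase
end

# Request only the headers of +url+ and report the declared charset.
def declared_charset(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.head(uri.request_uri) }
  charset_of(response["Content-Type"])
end

# Offline examples of the parsing step:
p charset_of("text/html; charset=iso-8859-1")   # "iso-8859-1"
p charset_of("text/html")                        # nil
```

Calling `declared_charset` with the page's URL before and after switching to the wrapper script should, if everything is wired up as described, go from "iso-8859-1" to "utf-8".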