(A cheesy homepage for Justin Collins)
Much nicer closure arguments in Brat

I am just so excited right now. I was watching Dave Thomas’ (not the Wendy’s guy) keynote talk from RubyConf 2008, in which he proposes several “forks” of Ruby. At about 40 minutes in, he discusses having a fork of Ruby that has real closures instead of blocks. I thought to myself, “Self, that sounds a lot like Brat.” Five seconds later, he mentioned the problem with passing in multiple closure literals to a function: the comma is ugly.

some_method { do_stuff }, { 1 + 2 }

I agree! I think this is an ugly issue in Brat right now.

But then he proposed an awesome solution: if two closures are next to each other in the argument list, you don’t need a comma! Now you can do this:

some_method { do_stuff } { 1 + 2 }

This took approximately 2 minutes to implement for Brat.

Now instead of

while { x < 1 },
    { 
        p x
        x = x + 1 
    }

You can do

while { x < 1 } { 
        p x
        x = x + 1 
    }

I think this is awesome and makes Brat a lot more attractive.

In fact, I went totally crazy so now you can do

x = 101
true? x > 100
      { p "> 100!" }
      { p "<= 100!" }

Just be careful when using bare variables:

true? a { p "truth!" } { p "lies!" }

This is parsed into:

true?(a({ p "truth!" }, { p "lies!" }))

This is so cool…now to update all of Brat’s docs with this syntax. :)


Operator Precedence and the Shunting Yard Algorithm

Previously, I had pretty much avoided implementing any kind of operator precedence for Brat and covered up my reluctance by saying that was just how it was. So binary operations were strictly left-to-right affairs. But that makes things like 1 + 3 * 4 very disappointing. Since I keep feeling like Brat should at least pretend to be a real language, I decided I really needed to do something about this.

What is so hard about it? It doesn’t seem difficult at first. It seems like there ought to be a nice recursive solution to the problem. And there probably is, but if you are using a typical recursive descent parser, this recursion will not be possible.

There are a few ways to go at this issue, but I wanted the simplest and easiest to deal with. A while ago, I had been thinking about this and ran across a reference to the shunting yard algorithm. At the time, I couldn’t really see how it was related. This time around, I looked at it a bit more closely.

The purpose of the algorithm is basically to convert equations which use infix notation (typical math notation with operators in between their arguments) into Reverse Polish Notation. Why would you want to do that? Well, because RPN is explicit about which order operations are evaluated in. It is also pretty straight-forward to implement (which is the important part, really.)

Let’s take a look at the two main parsing rules, written in Treetop, which deal with binary operations. I have elided the related code:

  rule binary_operation
        lhs:binary_operation_chain rhs:expression { ... }
  end

  rule binary_operation_chain
        (lhs:binary_lhs_expression ~space op:operator ~space)+  { ... }
   end

The first rule will match any binary operations. The second rule is a helper, which matches a list of binary operations, all except for the far right expression. There may have been some explicit reason why I separated these before, but it remains convenient in the code to do so.

The second rule will essentially return a list of alternating values and operators, just what the shunting yard algorithm needs.(Well, actually it’s simpler than the algorithm needs.) This list will be fed into the algorithm, which will spit out a new list of values and operators, now in the proper RPN form.

The parser then “evaluates” the RPN expression. It essentially runs through the algorithm for evaluating RPN, outputting code to evaluate expressions and using temporary values. In other words, the RPN algorithm is not explicitly emitted in the code, but the intent is in the order that things are evaluated.

The whole thing ends up being about 50 lines of code, which is far less than I thought I would be looking at for this.

Anyhow, just wanted to share. I’m pretty excited that Brat has this feature now, especially since it can be implemented in such a flexible way. Operator precedence is stored in a hash table which just associates operators with an integer representing its precedence level, so it is quite simple to add more operators (operators not in the table are just given the lowest precedence, though, so arbitrary operators are still completely supported).


More about Brat

Now, I won’t claim that Brat is a well-planned, coherent language, but I have tried to keep a few things in mind.

1. Common things are shorter, simpler

For example, in Python you pass a function as a parameter by removing parentheses (meth), but invoke it using parentheses (meth()). In Ruby, this is a very weird process and makes named callbacks a bit of a pain (easier to pass a block which calls the method instead). But I invoke methods more often than I pass them as parameters, so a bare variable name means to invoke the method, while an arrow (->meth) means to use it as a value.

Another example. In Ruby, you use puts and gets to print a line (plus newline) and get a line (with the newline attached). In Java, this is System.out.println and…well, input is more of a pain. In Brat, you use p and g, and g strips the newline by default. The less common case (wanting to print without a newline) is just print.

Yet another example. If you want to do string replacement in Ruby, you use gsub to replace all matches, and sub to replace just the first. But I have pretty much never used sub, so in Brat sub replaces all matches and sub_first just does the first (and is clearer about its semantics, too.)

2. As few special cases as possible

In Brat, there are only objects and functions (and I was seriously considering making functions be objects, too, but it was too much of a pain and ruined some things). That is it. There are no keywords, and only a few “special” symbols, like ->. Instead of having separate literals for arrays and hash tables, they share the [] syntax. Operators work like in Ruby, where they are actually converted to method calls, but in Brat all operators are treated equally. Unfortunately, this means math is a little bit of a pain. I’m still thinking about how to make that better.

3. Provide as much flexibility as possible

This is the reasoning for how binary (and unary) operators work: you can define pretty much any operator you want, as long as it only contains symbols. And there are no keywords you might run into conflicts with (although you might overwrite default built-in functions/methods.)

Objects in Brat are not really constrained in any way. You can even share or swap methods between objects. It uses prototyping instead of static (or fairly static) class hierarchies, which I am still getting used to.

4. Worry about performance later

I mean, seriously. I am not going to worry about this until there is more than one person using the language.


Actual Brat Release

I’ve been working on my little language a bit more lately, instead of doing work I really should be doing.

Now that Brat is working pretty well, I thought I would make a release. It’s basically just a snapshot of the current Subversion repository, but I know people would rather download a tar file than use SVN to check something out. It is only for Linux at the moment, but maybe I will get it working on Windows sometime. It should work with Ruby 1.8.6, 1.8.7, and 1.9.

Alright, on to language stuff. I recently changed the scoping rules to be more what a person might expect: any scope can access its outer scopes (rather than before, which used the Neko default of copying the outer scope). This makes the toplevel variables something like globals, except not quite.

Example:
a = { x }
x = 1
p a   #Error - x was not defined when a was
b = { x }
p b   #Prints '1'
c = { x = x + 1 }
c
p b   #Prints '2'

And recursive functions no longer need to be attached to an object:

rec = { x | false? x < 1, { p x; rec(x - 1) } }
rec 10

Bad news is that my current approach does not do a good job of releasing memory. But that is a problem for another day.

Of course, many bugs large and small have been fixed along the way. I hope to at least get this language to the point where I use it regularly. I think it is an interesting and probably odd mix of object-oriented and functional programming.

Also, I set up a discussion group in case Brat generates any attention.


Does this mean..?

Does this mean Brat will never be taken seriously?

a_!?-*+^&@1~\\><$ = new
a_!?-*+^&@1~\\><$.-!_+~%~+_!- = {d0~!@><?&&<>\/+-*^&% | d0~!@><?&&<>\/+-*^&%}
a_!?-*+^&@1~\\><$.@==||------> = { a_!?-*+^&@1~\\><$ }
a_!?-*+^&@1~\\><$.<------||==@ = { ->@==||------> }
@==||------>a_!?-*+^&@1~\\><$ -!_+~%~+_!- a_!?-*+^&@1~\\><$.<------||==@

Brat syntax change

I realize that Brat owes a lot to Smalltalk, so I was looking over the Smalltalk wikipedia page.

I noticed that the syntax for blocks (functions) looked like this:
[ :param | ... ]
That made me think about Brat’s function syntax, which looked like this:
{ | param | ... }

Why is the Brat syntax like that? Well, because I am used to Ruby, where that is what blocks look like. But I noticed something. In Ruby, there are two vertical bars around the parameters. Why? I assume because of parsing issues, especially when you consider that curly braces are also used for hash literals. But, in Brat, square braces are used for arrays and hashes, curly braces are only used for functions. That means the first vertical bar is completely superfluous!

So I changed it to look more like Smalltalk, thereby saving a character and reducing syntactic clutter:
{ param | ... }

It does look a little unbalanced to me, with only that one vertical bar there, but I think that is just because I am used to reading Ruby. I am sure it will grow on me.

Note: This does introduce some ambiguity when you consider that ’|’ can also be a binary operator. In the situation where this could be ambiguous, the formal parameter interpretation has higher priority.

number.| = {rhs| my + rhs }
x = 1
f1 = { 1 | x }  #this is not ambiguous, numbers cannot be formal parameters
p f1  #=> 2
f2 = { x | x }
p f2 5  #=> 5
f3 = {| x | x }
p f3  #=> 2

How are you?

I was messing around, seeing what silly things could be done using Brat, and I came up with several variations of this “How are you?” program.

Here’s one:
  How = {|x| p "How", x[0], x[1] }
  are = {|x| [" are "] + x }
  you? = ["you?"]
  I = {|x| p "I" , x[0], x[1], x[2] }
  am = {|x, y| x + y }
  fine = [" fine, "]
  thank = {|x| ["thank"] + x }
  you! = [" you!"]

  How are you?
  I am fine, thank you!

Pretty nifty, I thought.

Then I thought, “Well, I’m only using a question mark and an exclamation mark, I can probably replicate this in Ruby, too.”

This is the 2-minute direct translation I came up with:

  def How x
        print "How", x[0], x[1], "\n" 
  end

  def are x
        [" are "] + x
  end

  def you?
        ["you?"]
  end

  def I x
        print "I", x[0], x[1], x[2], "\n" 
  end

  def am x, y
        x + y
  end

  def fine
        [" fine, "]
  end

  def thank x
        ["thank"] + x
  end

  def you!
        [" you!"]
  end

  How are you?
  I am fine, (thank you!)

I was a little disappointed that Ruby has to be furtive about how it says thank you, but it doesn’t parse for me otherwise.