Brainfuck on Rubinius
Brainfuck is a very powerful and useful language — surely, millions of people write their shell scripts in Brainfuck. The simplicity of its syntax makes it even more powerful — only 8 instructions. this is if you don’t know Brainfuck.
So, the most common use-case of Brainfuck is implementing it. We could easily write an interpreter for it, but we can make it even funnier. We can generate bytecode to make it run even faster! And Rubinius provides what we need to generate such bytecode. :)
The parser
As I said, Brainfuck is very complex. We must definitely use a parser to get AST
nodes out of Brainfuck code. But I’ll still do it by hand, regardless of
fear. Not only would I dare to do it with String#[]
, I will require one of
those rusted classes from stdlib — one of the underused ones. StringScanner
.
So, let’s write one of those parsers. The first thing we need is a bunch of AST
nodes. We will only have a few instructions: printing the current character
(Out
), asking the user for input (Inp
), changing the current value using
+ and - (ValVar
), changing the current cell using > and < (PosVar
), and
loops (Loop
). So we just need empty classes for now.
module Brainfuck
module AST
class Out
end
class Inp
end
ValVar = Struct.new :size do
end
PosVar = Struct.new :size do
end
Loop = Struct.new :seq do
end
# top-level node
Script = Struct.new :seq do
end
end
end
Now, the parser! It will simply go through the Brainfuck code, and generate
nodes for each part of the code. We’ll keep track of the amount of position and
value variations. When we reach an instruction that doesn’t change the position,
we reset the position counter and generate the PosVar
; same with ValVar
.
require 'strscan'
module Brainfuck
module Parser
class Error < StandardError
def initialize
super "unmatched brackets" # only possible error :)
end
end
module_function
def parse(code)
AST::Script.new scan(StringScanner.new(code), 0)
end
def scan(scanner, depth)
ret = []
pos_var = val_var = 0
until scanner.eos?
case scanner.getch
when '.'
generate_variations(ret, pos_var, val_var)
pos_var = val_var = 0
ret << AST::Out.new
when ','
generate_variations(ret, pos_var, val_var)
pos_var = val_var = 0
ret << AST::Inp.new
when '+'
generate_pos_variation(ret, pos_var)
pos_var = 0
val_var += 1
when '-'
generate_pos_variation(ret, pos_var)
pos_var = 0
val_var -= 1
when '>'
generate_val_variation(ret, val_var)
val_var = 0
pos_var += 1
when '<'
generate_val_variation(ret, val_var)
val_var = 0
pos_var -= 1
when '['
generate_variations(ret, pos_var, val_var)
val_var = pos_var = 0
ret << AST::Loop.new(scan(scanner, depth + 1))
when ']'
generate_variations(ret, pos_var, val_var)
val_var = pos_var = 0
if depth.zero?
raise Error
else
return ret
end
end
end
if depth.zero?
ret
else
raise Error
end
end
def generate_variations(output, pos_var, val_var)
generate_pos_variation(output, pos_var)
generate_val_variation(output, val_var)
end
def generate_pos_variation(output, var)
unless var.zero?
output << AST::PosVar.new(var)
end
end
def generate_val_variation(output, var)
unless var.zero?
output << AST::ValVar.new(var)
end
end
end
end
Does it work? Let’s try it!
pry(main)> load "bf_rbx.rb"
=> true
pry(main)> cd Brainfuck::Parser
pry(Brainfuck::Parser):1> parse "test"
=> (Brainfuck::AST::Script < Struct) {
:seq => []
}
pry(Brainfuck::Parser):1> parse "[,.]"
=> (Brainfuck::AST::Script < Struct) {
:seq => [
[0] (Brainfuck::AST::Loop < Struct) {
:seq => [
[0] #<Brainfuck::AST::Inp:0x3798>,
[1] #<Brainfuck::AST::Out:0x379c>
]
}
]
}
pry(Brainfuck::Parser):1> parse ">>>< . >>+--< . -+ ."
=> (Brainfuck::AST::Script < Struct) {
:seq => [
[0] (Brainfuck::AST::PosVar < Struct) {
:size => 2,
},
[1] #<Brainfuck::AST::Out:0x3ea0>,
[2] (Brainfuck::AST::PosVar < Struct) {
:size => 2,
},
[3] (Brainfuck::AST::ValVar < Struct) {
:size => -1,
},
[4] (Brainfuck::AST::PosVar < Struct) {
:size => -1,
},
[5] #<Brainfuck::AST::Out:0x3eb4>,
[6] #<Brainfuck::AST::Out:0x3eb8>
]
}
pry(Brainfuck::Parser):1> parse "test +[-]+"
=> (Brainfuck::AST::Script < Struct) {
:seq => [
[0] (Brainfuck::AST::ValVar < Struct) {
:size => 1
},
[1] (Brainfuck::AST::Loop < Struct) {
:seq => [
[0] (Brainfuck::AST::ValVar < Struct) {
:size => -1,
}
]
},
[2] (Brainfuck::AST::ValVar < Struct) {
:size => 1
}
]
}
pry(Brainfuck::Parser):1> parse "]"
Brainfuck::Parser::Error: unmatched brackets
from ./bf_rbx.rb:179:in `scan'
pry(Brainfuck::Parser):1> parse "[]]"
Brainfuck::Parser::Error: unmatched brackets
from ./bf_rbx.rb:179:in `scan'
pry(Brainfuck::Parser):1> parse "["
Brainfuck::Parser::Error: unmatched brackets
from ./bf_rbx.rb:191:in `scan'
This looks like what I expect! :)
Bytecode generation
This part is the core of the work: it tells Rubinius how to run our code. Each
AST node must know how to generate its own bytecode. To teach this to our
objects, we’ll give them a bytecode
method.
AST::Script
This one is where the code starts, so it will setup our environment: set the current line (let’s forget about that and just set it to one), the cursor position, and the buffer. Then we just generate the bytecode for all the nodes in the script. It will also be responsibly of setting the return value of the brainfuck code (we’ll set that to the used buffer).
Since Rubinius expects us to use the index of local variables, let’s keep them as constants to make the code clearer.
module Brainfuck
module AST
Buffer = 0
Pointer = 1
BufferSize = 3000
Script = Struct.new :seq do
def bytecode(g) # g is short for generator
# let's say everything is in one line
# (I'm sure setting it to the real value is useful to write a
# brainfuck debugger...)
g.set_line 1
g.meta_push_0 # push 0
g.set_local Pointer # assign it to pointer
g.pop # pop 0
# (we must pop values we won't use)
g.push_literal Array.new(BufferSize, 0) # push array
g.set_local Buffer # assign it to buffer
g.pop # pop array
seq.each do |o| # run code
o.bytecode g
end
# This seems to make Rubinius check the variables we need, etc.
g.use_detected
g.push_local Buffer # push buffer
g.ret # return it
end
end
end
end
AST::Node
We’ll need to change the value of the buffer quite often, so writing a mixin to generate code used for this will help us to avoid repititions.
module Brainfuck
module AST
module Node
def push_current(g)
# This is how you do a method call:
# Buffer[Pointer]
g.push_local Buffer # 1. push the receiver
g.push_local Pointer # 2. push arguments
g.send :[], 1 # 3. send method, and specify argument count
end
def set_current(g)
# Buffer[Pointer] = value
g.push_local Buffer # push array
g.push_local Pointer # push index
yield g # use a block to push value
g.send :[]=, 2 # call this
g.pop # pop result
end
end
end
end
AST::Out
This one simply writes the current value to stdout. For instance if current element is set to 97, it will print “a”.
module Brainfuck
module AST
class Out
include Node
def bytecode(g)
g.push_literal $stdout # push $stdout
push_current g # push g
g.send :chr, 0 # replace it with g.chr
g.send :print, 1 # print it
g.pop # pop result
end
end
end
end
AST::Inp
This is just the opposite: get a byte from stdin. We’ll call to_i on the result so we’ll get 0 at EOF (in which case getbyte returns nil).
module Brainfuck
module AST
class Inp
include Node
def bytecode(g)
set_current g do
g.push_literal $stdin # push $stdin
g.send :getbyte, 0 # get a byte
g.send :to_i, 0 # call to i
end
end
end
end
end
AST::ValVar
This could just be “add size to the current value”, but it would not work. We
want our value to stay a valid byte (we could implement Brainfuck with unicode
codepoints or anything else if we wanted to, though), so we must stay between 0
and 256. We’ll thus set the current value to (current + size) % 256
.
module Brainfuck
module AST
ValVar = Struct.new :size do
include Node
def bytecode(g)
set_current g do
push_current g # push current value
g.push_literal size # push size
g.send :+, 1 # add them
g.push_literal 256 # push 256
g.send :%, 1 # mod
end
end
end
end
end
AST::PosVar
This is pretty much the same as ValVar
, except we do it for the cursor,
which must stay a valid index (between 0 and BufferSize
, 3000 in this
case).
module Brainfuck
module AST
PosVar = Struct.new :size do
def bytecode(g)
g.push_local Pointer # push pointer
g.push_literal size # push size
g.send :+, 1 # add them
g.push_literal BufferSize # push buffer size
g.send :%, 1 # mod
g.set_local Pointer # set pointer
g.pop # don't forget pop
end
end
end
end
AST::Loop
Loops are the funniest part: you must implement this yourself:
while buffer[pointer] != 0
run_instructions
end
To do this, the generator won’t provide something to implement while yourself. Instead, you’ll have to use labels and gotos yourself. :)
module Brainfuck
module AST
Loop = Struct.new :seq do
include Node
def bytecode(g)
start, check = g.new_label, g.new_label # create a few labels
g.goto check # first goto the place where we check if we must loop
start.set! # this is the beginning of the loop
seq.each do |o| # run the body
o.bytecode g
end
check.set! # this is where we check the current value
push_current g # push current value
g.meta_push_0 # push 0
g.send :==, 1 # compare
g.goto_if_false start # continue if it's not 0
end
end
end
end
Compiler
Rubinius’ compiler interface uses stages to go from our code to whatever is the output. We’ll only use two stages, and let Rubinius handle the end of the work: firstly, going from Brainfuck to AST, then, from AST to bytecode.
First step is quite easy, since we already have our parser to do it:
module Brainfuck
module Stages
class Generator < Rubinius::Compiler::Stage
# later!
end
class Code < Rubinius::Compiler::Stage
stage :bf_code
next_stage Generator
def initialize(compiler, last)
super
# tell the compiler we want to get the code
compiler.parser = self
end
attr_accessor :code
def run
@output = Brainfuck::Parser.parse(@code)
run_next
end
end
end
end
…and second step is easy as well, since we have AST::Script#bytecode
:)
module Brainfuck
module Stages
class Generator < Rubinius::Compiler::Stage
next_stage Rubinius::Compiler::Encoder
def initialize(compiler, last)
super
compiler.generator = self
end
def run
@output = Rubinius::Generator.new
@input.bytecode @output
@output.close
run_next
end
end
end
end
To trigger those steps, we’ll just create or compiler class, but a very simple one: we’ll just create an instance to go from Brainfuck to a compiled method.
module Brainfuck
class Compiler < Rubinius::Compiler
def self.compile_code(code)
compiler = new :bf_code, :compiled_method
compiler.parser.code = code
compiler.run
end
end
end
And now comes the method all this has been written for: Brainfuck#run
. We
just compile the code, setup a few things for Rubinius to work, and call it, at
last.
module Brainfuk
module_function
def run(code)
meth = Compiler.compile_code(code)
meth.scope = binding.static_scope.dup
meth.name = :__eval__
script = Rubinius::CompiledMethod::Script.new(meth, "(eval)", true)
script.eval_binding = binding
script.eval_source = code
meth.scope.script = script
be = Rubinius::BlockEnvironment.new
be.under_context(binding.variables, meth)
be.from_eval!
be.call
end
end
Now let’s see if all those efforts were vain, or if this works!
pry(main)> load "bf_rbx.rb"
=> true
pry(main)> Brainfuck.run("++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.
pry(main)* +++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.");
Hello World!
Now you can have fun and run complex scripts like this one on the Rubinius VM. :)