Rust, a new frontier

2021-02-13

The premise

In the past few years I've been slowly coming into contact with more and more Rust code. Whether it's packaging or debugging, it felt like it was about time to start learning it in more detail.

So my first project was porting bitte-cli from Crystal to Rust, hoping I wouldn't get completely bogged down trying to appease the compiler.

Work is going on in the rust branch, but it might be in master by the time you're reading this, once everyone is happy with the new implementation.

Given that the code was already well-typed and working in Crystal, there wasn't a lot of mental work in figuring out /how/ it should work. By simply translating, I could focus on the patterns of Rust instead.

The code is still rather ugly and inefficient, but since I prefer learning by doing over reading books, I simply started hacking using the awesome Emacs LSP and rust-analyzer combo. It took a bit of fiddling to get the analyzer to work, but after I figured out a working nix shell, it was a pleasure to have instant feedback on type and syntax errors, plus a bunch of more advanced refactoring utilities.

Here's a gist of that, but check out the code for the exact versions of nixpkgs, since it seems the requirements change rather often.

{ pkgs ? import <nixpkgs> { } }:

pkgs.mkShell {
  RUST_SRC_PATH = pkgs.rustPlatform.rustLibSrc;
  RUST_BACKTRACE = 1;

  buildInputs = with pkgs; [
    openssl
    pkg-config
    rustc
    cargo
    (rustracer.overrideAttrs (old: { checkPhase = null; }))
    rust-analyzer
    rustfmt
    clippy
  ];
}

Now to the language itself. A lot has been written about it already, so I'll try to keep this short.

The good

Tooling around the language is really good, about the same level as the Go ecosystem. There is rustfmt, so you never have to worry about writing well-formatted code. And rust-analyzer more or less writes the code for you, via automatic imports, auto-completion, type hints, jumping between definitions, and lots more.

The biggest tooling difference from Go is cargo. It's a shame every language ecosystem reinvents an ad hoc, informally-specified, bug-ridden, slow implementation of half of Nix. But at least this one has a rather nice lock file with checksums.

It also helps that the Nix community has done a tremendous amount of work around packaging Rust applications, although the best option in nixpkgs right now seems to be using a fixed-output derivation (FOD) for the crate dependencies.
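
A sketch of what that looks like (placeholder pname, version, and hash here, not the actual bitte-cli derivation):

pkgs.rustPlatform.buildRustPackage {
  pname = "bitte";
  version = "0.1.0";
  src = ./.;

  # The FOD part: all crate dependencies are fetched into a single
  # fixed-output derivation that is verified against this hash.
  cargoSha256 = pkgs.lib.fakeSha256; # swap in the real hash after the first failed build

  nativeBuildInputs = [ pkgs.pkg-config ];
  buildInputs = [ pkgs.openssl ];
}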

An alternative for many projects could be Naersk, which is able to do incremental compilation based on the hashes in the Cargo.lock itself. This gives a major boost in compilation speed, but doesn't quite work with all Rust projects out there yet.
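
I haven't ported bitte-cli to it, but usage looks roughly like this (an untested sketch, assuming Naersk's buildPackage entry point and a pinned tarball of the repo):

let
  naersk = pkgs.callPackage (fetchTarball
    "https://github.com/nmattia/naersk/archive/master.tar.gz") { };
in naersk.buildPackage {
  # Dependencies become individual derivations, keyed by the hashes
  # already present in Cargo.lock, so unchanged crates stay cached.
  src = ./.;
}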

Community adoption of Rust is quite good. There are crates available for most things I've needed so far.

Option/Result types all the way. This is a rather big annoyance with Crystal, where there is often no indication of which things will explode on you unless you read the docs and use ?-methods all over the place. Having the compiler bark at you when you forget to deal with a Result (while still compiling) is IMHO the best compromise between the way Crystal and Go handle those issues.

The bad

Finding good crates is a bit hard. Squatting on nice crate names seems quite common, so the namespace is polluted and many projects settle for more obscure names.

It's not immediately obvious which crates use unsafe code; there might be some way to find out, but I haven't found it yet. For a language that touts its safety, this seems like something you should display prominently next to each project.

Build speed, as with a lot of compiled languages, is still rather subpar. This is most likely related to the large number of dependencies you end up with for even seemingly simple projects, caused by the almost microscopically small standard library.

Speaking of the standard library: if you're coming from well-endowed languages like Ruby, Crystal, Go, or even JavaScript, Erlang, and Python... you'll be in for a bit of a shock. There aren't even regular expressions in the stdlib, never mind JSON or HTTP handling. This makes it pretty hard to even start a project without immediately adding a ton of dependencies. bitte-cli is currently at nearly 300 crates, and that feels like a huge liability given that you're usually supposed to take ownership of everything that goes into your result.

The ugly

Semicolons... semicolons everywhere! I know this is just my personal preference, and Nix got me a bit more accustomed to making sure they are where they need to be, but this is just not something I can understand for a language released well after many other languages had already proven that newlines are perfectly acceptable semicolons. See automatic semicolon insertion in Go, for example. Anyway, much has been said about this issue already, and I'm not bringing anything new to the table, but it's just a sad state of affairs.

The other side-effect of having a tiny stdlib is that there is no consensus around a common set of crates for basic functionality, and you end up with duplication of many parts of your stack.

For example, I'm using restson for talking to the Terraform API, but for the AWS API I'm using a crate called rusoto, which uses hyper under the hood... and suddenly my whole app also has to use tokio, with async/await sprinkled all over the code, even though async is nearly pointless in this context.

The strange

No post about Rust would be complete without mentioning the borrow checker. It's something that will bend your mind in novel ways, especially if you've so far mostly used garbage-collected languages.

I should probably read up on this some more; so far I've just been shifting things around until the compiler stopped yelling at me. But I recently learned that most of the things I'm doing are probably going to tank performance. That isn't really an issue for a tiny CLI application, but I'll definitely have to think about it when doing something more intensive.

Euphenix

2019-08-15

I ended up having to move some sites from Weebly to something more affordable and reliable, since most of them sit around for months without changes. I also think their stance of charging for TLS is silly, and the lack of an export function for your content is just evil.

That said, their CMS is good and allows relatively fast prototyping of sites. Eventually I'll look for a replacement for that part as well, but for now it's not urgent.

After having a look around, I decided on moving them to Netlify. They have a pretty good free tier with proper TLS and form handling. Deploys are also quite easy to automate in different ways.

I evaluated a bunch of different static site generators (most of them listed on https://www.staticgen.com/), the ones I liked the most were Hugo and Hakyll.

What also caught my eye was https://styx-static.github.io/styx-site/, which is implemented entirely in Nix and looked very good.

I started with porting the site of Finesco, which my wife originally made on Weebly. It's not too fancy and apart from the two blogs it looked like it would be easy to reimplement in Hugo.

Turns out that Hugo wasn't flexible enough after all: given its compiled nature, I struggled to make some kind of archive page for each year, and ended up with a lot of hacks around taxonomies.

That's when I started looking at Hakyll. I've been dabbling with Haskell for a long time, and quickly got the site ported from Hugo, with a few extra features to boot.

After that, I tried building a Docker image with Nix for the site, and it turned out that the image would be in the gigabyte range instead of the few megabytes I was hoping for.

Apparently the pandoc dependency was responsible for this; it prevented me from compiling statically, and I wasn't versed enough in Haskell to figure out how to inject a precompiled pandoc instead. (Now I might be able to do it, but good luck even finding that option in their docs.)

Since I was on a learning journey anyway, I then took a look at Styx, but noped out of there pretty soon when I realized that the templates are also written in Nix, which comes with a huge performance penalty and very weak editor support.

I still was intrigued by the idea of having something as simple as Hugo but with the power of Nix for generating my sites, and so I started to work on a little project in my spare time to flesh out that idea.

After a couple of iterations, I finally arrived at this:

let
  euphenix = (import (fetchTarball {
    url =
      "https://github.com/manveru/euphenix/archive/eaaee37df12e8fccced3a4ac93402d6b3e5fcf54.tar.gz";
  }) { }).extend (self: super: {
    parseMarkdown =
      super.parseMarkdown.override { flags = { prismjs = true; }; };
  });
  inherit (euphenix.lib) take;
  inherit (euphenix) build sortByRecent loadPosts;
in euphenix.build {
  rootDir = ./.;
  layout = ./templates/layout.liquid;
  favicon = ./static/img/favicon.svg;

  variables = { liveJS = true; };

  expensiveVariables = rec {
    posts = sortByRecent (loadPosts "/blog/" ./blog);
    latestPosts = take 5 posts;
  };
}

That is the whole configuration needed to build the site you're reading right now. The rest of it is pretty standard HTML, CSS, and a few sprinkles of Javascript for eye-candy and syntax-highlighting (until I generate that statically as well).

I'm still pondering using a different templating engine or implementing the templating myself instead of relying on Infuse. Infuse was great for getting started, but it's originally meant for configuration files and simpler use-cases.

Just automatically getting a list of required variables from a template would be great, without having to resort to unreliable regular expressions.

Another relatively easy option would be to use Liquid, which already has a massive user base and which I'm pretty familiar with. It's also closer in syntax to Go templates than many alternatives, so porting the Infuse templates wouldn't require a lot of effort.

For now though, I'll keep the expensiveVariables hack around until I get time again to build something better. I hope the name is warning enough that people won't rely on it too much.

The trick with expensiveVariables is that you annotate in the front-matter of templates which variables are required for rendering. While that's not a DRY solution, it's not needed often and should be relatively straightforward. I just need a better name for it.

So right now you can write something like this:

<!--
requires: latestPosts
-->
{{ range .latestPosts }}{{ .meta.title  }}{{ end }}

And through the magic of laziness in Nix, the page will only be rebuilt if latestPosts has changed.
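
To illustrate with a tiny hypothetical example (builtins.trace prints its message only when the value is actually forced):

let
  variables = {
    posts = builtins.trace "computing posts" [ "a" "b" "c" ];
    latestPosts = builtins.trace "computing latestPosts" [ "a" "b" ];
  };
in
# nix-instantiate --eval on this file prints only "computing latestPosts";
# posts is never forced, so its trace never shows up.
variables.latestPosts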

Anyway, that's it for today. There are many more topics that could be covered by proper documentation, but writing like this makes me think things through a bit better.

If you've read this far, you may as well check out the project: Euphenix

Ruby & Nix

2019-04-13

Why do you write about this?

As some of you know, I've been using Ruby for well over a decade now, and it's still a language I use almost every day for work or pleasure in some fashion, even if it's a quick calculation in irb.

I regularly see questions in #nixos and the NixOS Discourse from folks that want to run some simple Ruby application that hasn't been packaged yet or work on their Ruby stuff with all the gems they need.

While there is documentation about this in the nixpkgs manual, I thought I'd finally share my usual approach to this to clear things up and save me writing the same answers to every question.

Why do I need Nix?

This is usually the first thing that goes through your mind if you're already used to using Ruby and Bundler in an imperative fashion. Of course you care about what version of Ruby you're using, so you use another package manager for that: something like rvm, chruby, asdf, etc.

After that, you just go into some project, run bundle install, and with a bit of luck everything works more or less.

That is, of course, until the project starts using gems that aren't pure Ruby or that use some 3rd-party executable. The most popular gems with this issue at the time of writing are tzinfo, nokogiri, ffi, pg, and sqlite3.

Each of these gems in turn has some prejudice about your environment and may or may not find the required dependencies.

So you start writing installation instructions for each system you want to support, with a few lines for each respective package manager. You cover apt, pacman, and brew, and hope someone will contribute instructions for other systems, because that's all you have experience with.

So, the next logical step is to not only have all these instructions, but also provide a Dockerfile that pulls random stuff from the internets and finally provides the one true environment that your application is supposed to run in.

I mean, who'd ever object to such elegance?

How do you use Nix?

What if we could express all of the above in a declarative fashion that can be used by everyone else coming to the project with a single command, doesn't require containers, and doesn't cause conflicts because your old project uses PostgreSQL 9.4 while your new one uses 9.10?

It turns out that there is another way that is:

  • Reproducible
  • Cacheable
  • Portable
  • Reliable

And that's where I'd like to present the Nix approach to project management. Meet our shell.nix:

let
  pkgs = import (
    fetchTarball {
      url = https://github.com/nixos/nixpkgs-channels/archive/0c0954781e257b8b0dc49341795a2fe7d96945a3.tar.gz;
      sha256 = "05fq11wg8mik4zvfjy2gap59r8n0gfbklsw61r45wlqi7a2zsl0y";
    }
  ) {};
  gems = pkgs.bundlerEnv { name = "my-gems"; gemdir = ./.; };
in pkgs.mkShell { buildInputs = [ gems gems.wrappedRuby ]; }

This may seem intimidating at first glance for people unfamiliar with Nix, but the principle is simple: we import a snapshot of nixpkgs. Based on this alone, the version of Ruby, tzinfo, libxml, libzmq, postgresql, sqlite3, and everything else you need is already specified and used correctly.

The tedious part is deciding which version of nixpkgs you want, but I usually use the latest unstable with this script:

#!/usr/bin/env bash

set -e

url="https://github.com/nixos/nixpkgs-channels"
channel="${@:-nixpkgs-unstable}"
rev="$(git ls-remote "$url" "$channel" | cut -f1)"
archive="$url/archive/$rev.tar.gz"
sha=$(nix-prefetch-url --unpack "$archive")
cat <<EOF
fetchTarball {
  url = $archive;
  sha256 = "$sha";
}
EOF

This is useful for development because even if you come back in a few years and want to run your tests, it will still build the same and should work without any problems.

If you want to change the Ruby version you're using, you can do so by passing ruby = pkgs.ruby_2_6; to bundlerEnv. Otherwise it'll always use the Ruby marked as most stable in nixpkgs, which might lag one version behind.
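
Applied to the shell.nix above, that looks like this (same derivation, just with the Ruby version pinned explicitly):

let
  pkgs = import <nixpkgs> { }; # pin this via fetchTarball as shown earlier
  gems = pkgs.bundlerEnv {
    name = "my-gems";
    ruby = pkgs.ruby_2_6; # pick the exact Ruby instead of the default
    gemdir = ./.;
  };
in pkgs.mkShell { buildInputs = [ gems gems.wrappedRuby ]; }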

What about applications?

We covered developing any kind of Ruby library or application above. A common question is "how do I run application X on NixOS and/or via Nix?"

Answering this may not be hard, but you have to know where to look, and that's the main reason I'm writing this post, so it might pop up in your searches and save you some time.

To run any kind of Ruby application, the first thing you want to do is write a Gemfile for it (if it's on rubygems, we'll cover other cases later):

source 'https://rubygems.org' do
  gem 't'
end

After this, run the following command:

$ nix-shell -p bundler bundix --run 'bundle lock && bundix'

This will generate two files: Gemfile.lock and gemset.nix.

Finally we need the real core of this, the default.nix:

{ bundlerApp }:
bundlerApp {
  pname = "t";
  gemdir = ./.;
  exes = [ "t" ];
}

That is all.

Now we can try to build and run it:

$ nix-build -E '(import <nixpkgs> {}).callPackage ./. {}'
these derivations will be built:
  /nix/store/smipql38q4vyhg0ba1bfn5fgp5wav9ry-t-3.1.0.drv
  ...
building '/nix/store/smipql38q4vyhg0ba1bfn5fgp5wav9ry-t-3.1.0.drv'...
...
/nix/store/1x3m7rmfzcg2c6x3mwax7qppq157c6mn-t-3.1.0

The last line is the location where our new t command is installed. We also get a new symlink in the current directory called result, which points there.

So we try to run this (fairly old) application:

$ ./result/bin/t authorize
Welcome! Before you can use t, you'll first need to register an
application with Twitter. Just follow the steps below:
  1. Sign in to the Twitter Application Management site and click
     "Create New App".
  2. Complete the required fields and submit the form.
     Note: Your application must have a unique name.
  3. Go to the Permissions tab of your application, and change the
     Access setting to "Read, Write and Access direct messages".
  4. Go to the Keys and Access Tokens tab to view the consumer key
     and secret which you'll need to copy and paste below when
     prompted.

Press [Enter] to open the Twitter Developer site.

Enter your API key: 66616b6520617069206b6579
Enter your API secret: 6f626e6f78696f75732066616b652061706920736563726574
Traceback (most recent call last):
    7: from ./result/bin/t:18:in `<main>'
    6: from ./result/bin/t:18:in `load'
    5: from /nix/store/xy3g0pv6f7j8j217dnhxprs0hx7gfqjk-t-3.1.0/lib/ruby/gems/2.5.0/gems/t-3.1.0/bin/t:20:in `<top (required)>'
    4: from /nix/store/bp0i1lys4sypc3x7jjnnn774sqpy82aa-ruby2.5.5-thor-0.20.3/lib/ruby/gems/2.5.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
    3: from /nix/store/bp0i1lys4sypc3x7jjnnn774sqpy82aa-ruby2.5.5-thor-0.20.3/lib/ruby/gems/2.5.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    2: from /nix/store/bp0i1lys4sypc3x7jjnnn774sqpy82aa-ruby2.5.5-thor-0.20.3/lib/ruby/gems/2.5.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    1: from /nix/store/bp0i1lys4sypc3x7jjnnn774sqpy82aa-ruby2.5.5-thor-0.20.3/lib/ruby/gems/2.5.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
/nix/store/4x9z03ahnmi85jsh1zgayg679pagcrhx-ruby2.5.5-t-3.1.0/lib/ruby/gems/2.5.0/gems/t-3.1.0/lib/t/cli.rb:82:in `authorize': uninitialized constant Twitter::REST::Client::BASE_URL (NameError)

(don't mind my carefully chosen fake credentials ;)

So we check the issues for this gem and find out that it's indeed a bit ancient and lots of people have this problem, and the solution is to pin the twitter dependency to an older version. So let's do that quickly by editing the Gemfile:

source 'https://rubygems.org' do
  gem 't'
  gem 'twitter', '~> 6.1.0'
end

Now run:

$ rm gemset.nix Gemfile.lock
$ nix-shell -p bundler bundix --run 'bundle lock && bundix'

and finally build again:

$ nix-build -E '(import <nixpkgs> {}).callPackage ./. {}'

Et voilà! t finally works again!

A week of Haskell

2019-04-05

The premise

So we had a HackWeek at XING, and I chose learning Haskell as my project this time. Around the same time last year, I built a small savegame editor for Starbound in Haskell. Back then I didn't really grasp what the language was all about, and ran into trouble constantly.

I still wanted to know the language better, but in between doing $WORK, Nix stuff, keeping up with the usual JS and Ruby insanity and relearning Elixir & Phoenix, there wasn't much time for it.

For this time, I chose the Learn You a Haskell for Great Good! book. Mostly because it's free and in HTML, so easy to download and search.

Some nice quotes about it from #learn-haskell are:

qu1j0t3: also LYAH is just kind of ick
Peter_Storm_: Yeah, haskellbook is about a million times better than LYAH
lunabo: I liked some parts of LYAH but it doesn't prepare you very well for writing more realistic Haskell code

At the point when I read those remarks, my week was almost over, so I didn't end up buying the recommended Haskell Programming from first principles. Although in retrospect I wish I had.

The good

Well, with that out of the way, let's get to Haskell itself. It's a good language for problems that you can model in your mind beforehand; at least that's my experience. Once you have a mental model of how the data should flow, you can basically just write that down and reasonably expect it to work.

I really enjoy this aspect also in other languages, and with more practice, maybe I'll try using it for some of my otherwise hacky scripts that I usually write in Ruby or Go. Using runhaskell isn't bad, but you have to take care of dependencies a lot. The stdlib may look large, but a lot of it is simply related to what is needed to build GHC.

It doesn't have an HTTP client/server, JSON or YAML parsers, or various other things I'd like to have handy when writing a simple plumbing script. In this regard it's similar to Erlang/Elixir, which also focus on the core language platform that everything is built on, and less on the developer experience out of the box.

I can get behind this, but once you get to dependency handling it's not a simple script anymore. That's where nix-shell can help a lot.

The ugly

LYAH tries to be funny and technical at the same time, and seems to fail at both. I got distracted by typos, and some sections are simply a list of functions with a short description of what each does. There are very few exercises until you finally work on a todo list around chapter 9. I made it to printing Hello, World! on Thursday.

The first two days were mostly spent trying to understand the relationship between Stack and Nix, and how to properly set up my editor for Haskell; there are various half-completed ways to do that.

I never found the ideal setup before giving up on that, and just went with a more-or-less working setup of the Haskell IDE Engine.

For some reason code formatting is still broken sometimes, auto-completion is hit-or-miss, automatically inserting type signatures is broken, and it tries to add dependencies via cabal instead of stack. I didn't spend more time trying to get it to work with direnv or nix-sandbox, so I just used a user-global Haskell and HIE installation for Emacs.

Phoenix LiveView

2019-03-31

I'm playing with Phoenix LiveView this weekend to port a little Goban I wrote in Elm last year to Elixir.

One feature that's missing from LiveView currently is handling hovering, which I use to show where you can place your stones. I found the delightfully small source for the Javascript side of LiveView.

It's really easy to read and understand (given you're a bit familiar with event handling in JS), and it's also pretty straightforward to add new events. The only drawback is that I have to duplicate closestPhxBinding, PHX_VIEW, and PHX_VIEW_SELECTOR, so hopefully they won't change too often.

function closestPhxBinding(el, binding) {
  do {
    if(el.matches(`[${binding}]`)){ return el }
    el = el.parentElement || el.parentNode
  } while(el !== null && el.nodeType === 1 && !el.matches("[data-phx-view]"))
  return null
}

window.addEventListener("mouseover", e => {
  const outBinding = liveSocket.binding("mouseout")
  const overBinding = liveSocket.binding("mouseover")

  const target = closestPhxBinding(e.target, overBinding)
  const overPhxEvent = target && target.getAttribute(overBinding)
  const outPhxEvent = target && target.getAttribute(outBinding)

  if(!overPhxEvent || !outBinding){ return }

  function outHandler(e) {
    target.removeEventListener("mouseout", outHandler, false)
    e.preventDefault()
    liveSocket.owner(target, view => view.pushEvent("mouseout", target, outPhxEvent))
  }

  target.addEventListener("mouseout", outHandler, false)

  e.preventDefault()
  liveSocket.owner(target, view => view.pushEvent("mouseover", target, overPhxEvent))
}, false)

I'll just share this here, because the project doesn't accept new features yet and it might come in handy for someone.

Back to the roots

2019-03-23

Well, it's been a while since I've updated this blog of mine, and this is certainly not going to be read by anybody, but I just wanted to say that I'll try to write at least occasionally about random stuff again.

A lot has happened in the past few years, and I never made a proper overview of the languages and projects I've worked with.

This blog still runs on the same software I wrote back in 2012, and I'll probably get around to modifying it eventually and fixing a few minor bugs.

But it's a testament to the stability of Go that I can run the same code nearly unmodified 6 years later (just two dependencies changed location, and I ported it to use modules).

Who would've thought back then that they would actually achieve it? I'm still using Go occasionally for things I just want to write a very stable version of once, and then forget about.

It's relatively easy to package, cross-platform, reasonably fast, and requires very little of my working language memory to write stuff in, compared to something like Ruby where I constantly forget method names or arguments and proper auto-completion is still an issue.

Anyway, I've also been quite interested in ActivityPub, and will try to get some of that functionality going here, similar to what WriteFreely is doing (which is also in Go, so I can probably borrow some ideas :)

After having my various blogs hosted on so many different 3rd party sites, I finally got tired of moving around when they inevitably change/close. So instead I'll finally host it myself.

There's a lot of hype around IPFS and similar projects for blogs, but I value simplicity in this case more.

So that's all I can write about for today, hope you didn't miss me too much ;)

Using sysconf in Ruby with FFI (Update)

2012-10-16

Turns out I was wrong yesterday and what I got was garbage.

So let's do it right today, but unfortunately this is going to be a lot less pretty than I'd hoped for. Also, this is going to use a C compiler in the background every time you run it. Might be worth caching.

require 'ffi'
require 'ffi/tools/const_generator'

module Sysconf
  extend FFI::Library
  ffi_lib ["c"]

  fcg = FFI::ConstGenerator.new do |gen|
    gen.include 'unistd.h'
    %w[
    _SC_PAGE_SIZE
    _SC_VERSION
    ].each do |const|
      ruby_name = const.sub(/^_SC_/, '').downcase.to_sym
      gen.const(const, "%d", nil, ruby_name, &:to_i)
    end
  end

  CONF = enum(*fcg.constants.map{|_, const|
    [const.ruby_name, const.converted_value]
  }.flatten)

  attach_function :sysconf, [CONF], :long
end


p page_size: Sysconf.sysconf(:page_size)
p version: Sysconf.sysconf(:version)

Using sysconf in Ruby with FFI

2012-10-15

Just a quick snippet that makes it easier to use getconf by using the sysconf function.

require 'ffi'

module Sysconf
  extend FFI::Library
  ffi_lib ["c"]

  SYSCONF_ARGS = [
    :_SC_ARG_MAX,
    :_SC_CHILD_MAX,
    :_SC_CPU_TIME,
    :_SC_THREAD_CPU_TIME,
  ]

  enum SYSCONF_ARGS
  attach_function :sysconf, [:int], :long

  class << self
    SYSCONF_ARGS.each do |e|
      method_name = e.to_s.sub(/^_SC_/, '').downcase
      define_method(method_name){ sysconf(e) }
    end
  end
end

p arg_max: Sysconf.arg_max
p cpu_time: Sysconf.cpu_time
p thread_cpu_time: Sysconf.thread_cpu_time

Keep testing your iron.io Ruby Workers

2012-07-23

One of the best practices in software development is making sure your software actually keeps working.

If you are using IronWorker to parallelize processing, it's good to know that your code actually works before scheduling it. The most common way to do so is Test::Unit, which conveniently comes in the Ruby standard library. Of course you may use any other framework, be it RSpec, bacon, or even kintama.

One of the things our users sometimes ask is how they can avoid using a separate file just to use the executable as a library, with their worker layout looking something like this:

.
├── foo_lib.rb
├── foo.rb
├── foo.worker
└── test_foo.rb

Instead they prefer something more like:

.
├── foo.rb
├── foo.worker
└── test_foo.rb

So here is just a quick example of how to use $0 and __FILE__ in Ruby to conditionally use a file as a library or as an executable. It's a pattern at least as old as Ruby, but still quite useful.

First we start with our actual worker, which does the heavy lifting: trying to increase entropy in the universe. Little does it know that rand is only pseudo-random. At the end of the file, after the whole EntropyGatherer class has been executed, we put our if $0 == __FILE__ check, and place there whatever code is needed to initialize and run the class. Of course, you could just have a single method instead of a class, or use a module with module_function, but this will have to do for our purposes.

I call this file entropy_gatherer.rb

class EntropyGatherer
  attr_reader :results

  def initialize
    @results = []
  end

  def run
    1.upto 100 do
      @results << rand
    end
  end
end

if $0 == __FILE__
  gatherer = EntropyGatherer.new
  gatherer.run
  puts gatherer.results
end

And here we have a file called test_entropy_gatherer.rb, which uses the familiar Test::Unit and can be executed with testrb test_entropy_gatherer.rb.

require "test/unit"
require_relative 'entropy_gatherer'

class TestEntropyGatherer < Test::Unit::TestCase
  # this runs before all tests
  def setup
    @gatherer = EntropyGatherer.new
    @gatherer.run
  end

  def test_result_size
    assert(@gatherer.results.size == 100, "wrong number of results")
  end

  def test_result_range
    @gatherer.results.each do |result|
      assert(result <= 1.0, "result greater than 1.0")
      assert(result >= 0.0, "result smaller than 0.0")
    end
  end
end

And here's how that all plays together.

iota ~/tmp/iworker_rand % testrb -v test_entropy_gatherer.rb
Run options: -v

# Running tests:

TestEntropyGatherer#test_result_range = 0.00 s = .
TestEntropyGatherer#test_result_size = 0.00 s = .


Finished tests in 0.000814s, 2455.7896 tests/s, 246806.8595 assertions/s.

2 tests, 201 assertions, 0 failures, 0 errors, 0 skips

Simplified Travis CI & RVM

2011-12-08

Last time I showed you one possibility of replacing bundler with rvm.

Today I want to improve on the hacky .load_gemset file, by simply putting it into the .travis.yml.

It was irritating me to have such a roundabout way of loading rvm, and after studying the way the configuration is handled a bit more, this seemed so obvious that I can't believe I missed it before.

This is being used in ffi-magic already, and will shortly land in Innate as well.

---
script: RUBYOPT=-rubygems rake bacon
before_script:
- test -s "$HOME/.rvm/scripts/rvm" && source "$HOME/.rvm/scripts/rvm"
- test -s .gems && rvm gemset import .gems
rvm:
- 1.8.7
- 1.9.2
- 1.9.3
- ruby-head
- rbx-18mode
- rbx-19mode
- ree
- jruby
notifications:
  email:
  - mf@rubyists.com
branches:
  only:
  - master

Yeah, that's all, happy Continuous Integration.

Calendar with CouchDB

2011-10-25

Seems like I'll be heading to the OpenRheinRuhr this year, I just won a ticket for it!

Countdown to OpenRheinRuhr

I'd like to write a little bit about CouchDB, and how you can use it with Makura, in order to find better ways to model the next version of this library.

This is a little preview of what we're going to be dealing with tomorrow:

require 'makura'

Makura::Model.database = 'coulendar'

class Event
  include Makura::Model

  properties :from, :to, :desc

  def from=(time)
    self['from'] = time.to_i
  end

  def from
    Time.at(self['from'].to_i)
  end

  def to=(time)
    self['to'] = time.to_i
  end

  def to
    Time.at(self['to'].to_i)
  end
end

require 'bacon'
Bacon.summary_on_exit

describe Event do
  before do
    Event.database = 'coulendar-test'
    Event.database.destroy! # clean up
    Event.database = 'coulendar-test'
  end

  it 'has a description' do
    desc = 'Writing blog post'
    from = Time.local(2011, 10, 25, 23)
    to = Time.local(2011, 10, 26, 0) # midnight; hour 24 is out of range for Time.local

    event = Event.new(from: from, to: to, desc: desc)
    event.save

    event.desc.should == 'Writing blog post'
    event.from.hour.should == 23
  end
end

Quick introduction to Memrise

2011-10-24

For all of you who are learning languages, Memrise is a website worth trying.

It uses the good old approach of spreading the learning of each word over an appropriate time span, as you might already know from the Anki spaced repetition algorithm.

Getting started by planting a few seeds

The approach of Memrise works for more languages and provides an excellent interface; comparing words with flowers that need watering is one of the best analogies they could've come up with.

Go and check it out

It's Sunday already!

2011-10-23

Since I need a day of pure laziness, here's a dump of some open tabs:

Soundtrack for today: Soundtracks for Everyday Adventures by Lullatone

Hackety Hack is back!

Since _why left us, Hackety Hack and Shoes have been lovingly picked up by steveklabnik and a few supporters. Now the site for Hackety Hack is finally live again and already hosts a wide variety of fun little Shoes apps.

Buy Domains and a logo idea in one

Originally posted at HN, this is still something I'd like to expand on... more on that in a future post.

The new Dive Into HTML5

Mark Pilgrim also cut all his online ties and left behind this gem of a book; it's been my reference for everything HTML for the past year. A team of volunteers continues this project under a new domain name.

Live preview of markdown in js

This one is quite a bit older, created by John Fraser in 2007, but I recently remembered it and thought it would make a good addition to my blog. Not sure I can ever quit writing markdown in vim, but preview is really something I need.

Be your own cloud storage provider

Found yesterday: they make the case that you really could host your own files in the cloud, which I happen to agree with. There is no support for CouchDB yet, but jan____ is working on it. I plan to write another article with my thoughts on this project, as I'm not entirely happy with their approach.

Stanford OpenClassroom

Stanford is providing a lot of courses, with videos for each topic and even "homework". It seems like the videos were created by students, and I'd love to have transcriptions, but overall it's a good first effort.

printf("goodbye, Dennis");

Dennis Ritchie, a father of modern computing, died on October 8th, aged 70.

I owe him so much.

Travis CI and RVM

2011-10-22

After trying to hook up ooc and Tcl, waiting for the Self guys to respond, and exploring Travis CI, I figured I'd share my short story with you.

The documentation of Travis recommends using Bundler for managing dependencies of your project.

Since I happen to like the way gemsets work, and Travis is using RVM to manage the different Ruby versions, I figured I could just call rvm gemset import .gems and everything would be peachy.

For some reason, the rvm environment becomes broken once Travis starts your tests, so you have to load it manually again, and then import your gemset.

I used this approach with Innate, and here's the .travis.yml

---
script: 'RUBYOPT=-rubygems rake bacon'

before_script:
  - "./.load_gemset"

rvm:
  - 1.8.7
  - 1.9.2
  - 1.9.3
  - rbx
  - rbx-2.0
  - ree
  - jruby

notifications:
  email:
    - mf@rubyists.com

branches:
  only:
    - master

And also the .load_gemset executable referenced above:

#!/bin/bash

[[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm" # Load RVM into a shell session *as a function*
[[ -s .gems ]] && rvm gemset import .gems

Please note that I do not use bash -e as they recommend; it would terminate the script before it gets to importing the gemset, due to a failure loading .rvmrc, since RVM prompts the user and we have no tty access. That's no big issue, as we'll notice something is wrong when our specs stop passing anyway.

That's all, happy Continuous Integration.

Self: the power of simplicity

2011-10-21

A long, long time ago, in a galaxy far away, a language was forged, so powerful and shrouded in mystery that, although it has permeated and influenced a whole host of other languages, it remained obscure and generally unknown.

Self

Unfortunately, this will stay this way for a little bit longer, since I cannot seem to get it running on my machine; it keeps complaining:

chi ~ % Self -s /usr/share/self/0/Demo-4.4.snap
for I386:  LogVMMessages = true
for I386:  PrintScriptName  = true
for I386:  Inline = true
for I386:  SICDeferUncommonBranches = false (not implemented)
for I386:  SICReplaceOnStack = false (not implemented)
for I386:  SaveOutgoingArgumentsOfPatchedFrames = true

  Welcome to the Self system!  (Version 4.4)


Copyright 1992-2009 AUTHORS, Sun Microsystems, Inc. and Stanford University.
See the LICENSE file for license information.

Type _Credits for full credits.

VM version: 4.1.13

Adjusting VM for better UI2 performance:
  _MaxPICSize: 25
  _Flush
"Self 1" unknown font: arialBold
^C
  ----------------Interrupt-----------------
  Waiting:
    <0> waiting process: perProcessGlobals prompt mainInputLoop             
  ------------------------------------------
  Select a process (or q to quit scheduler): q
  Scheduler shut down.
  ------------------------------------------
VM# 
chi ~ % locate arial
/usr/share/fonts/TTF/arial.ttf
/usr/share/fonts/TTF/arialbd.ttf
/usr/share/fonts/TTF/arialbi.ttf
/usr/share/fonts/TTF/ariali.ttf

It should be a smooth ride if you're on OSX, since that's what the developers seem to use, but for now I've filed an issue and await further instructions.

The ooc Language

2011-10-20

Today I've decided to have a quick look at ooc and see if I can accomplish the eBeats challenge in a day.

First of all, it seems like a nice little language, a pretty frontend for C, with some of the ideals of Nimrod, but with a lot more pragmatism.

I've joined #ooc-lang on freenode to get some help along the way.

To start calculating eBeats, we need the current time in UTC, so let's look at what the documentation says: os/Time. Looking at this, I suddenly realize that I have little to no idea what the docs are saying, so I'd better read through the less terse material first. I notice, however, that there doesn't seem to be anything related to UTC or GMT on this page; that doesn't look good.

Getting anything recognizable out of gmtime_r(3) doesn't turn out to be easy, but after 2 hours I've finally done it and can see some actual times:

include time
import os/Time
import lang/Format

// struct tm *gmtime_r(const time_t *timep, struct tm *result);
gmtime_r: extern func(TimeT*, TMStruct*) -> TMStruct*

main: func () {
  clock : TimeT
  time(clock&)

  time : TMStruct
  gmtime_r(clock&, time&)

  "%02i:%02i:%02i" cformat(time tm_hour, time tm_min, time tm_sec) println()
}

OK, so far everything seems fine, it's nothing to be proud of, but I won't stop here.

After some more fiddling, I come up with a timeToBeats function that takes the time struct and spits out the integer value of eBeats, but I seemingly cannot get anything more fine-grained. Let's see if you can find the bug.

timeToBeats: func (time: TMStruct) -> Float {
  beats := ((time tm_hour) * (1000.0 / 24.0)) +
           ((time tm_min)  * (1000.0 / (24.0 * 60.0))) +
           ((time tm_sec)  * (1000.0 / (24.0 * 60.0 * 60.0)))

  return beats
}

Done looking?

Well, any time a developer uses Floats, everything has to be checked more than twice. In my case, after a little further investigation and sprinkling cformat/println all over the place, I found that beats wasn't a Float, it was an Int, and ooc was so kind as to just let me return an Int from a function that has Float as its explicit return type. What it did, and what I cannot understand in modern language design, is implicitly convert the Int into a Float, no warnings given, no questions asked.

Let's have a closer look at what's going on there, with a little experiment.

To keep it simple, I just divide a Float by an Int.

10.0 / 5 // 2.000000

The compiler says, when asked for really verbose output:

Resolving variable decl uniqueVar : <unknown type> = 10.0 / 5

For uniqueVar : Float = 10.0 / 5, resolving type Float, of type BaseType

So we really got a Float as result, wonderful!

And now we divide an Int by a Float

10 / 5.0 // 0.000000

The compiler says, when asked for really verbose output:

Resolving variable decl uniqueVar : <unknown type> = 10 / 5.0

For uniqueVar : SSizeT = 10 / 5.0, resolving type SSizeT, of type BaseType

Well, whatever SSizeT is, it doesn't talk or walk like a Float, and the result is way worse than I had expected.

I'm running out of time to finish this post today, so I'll continue with the eBeats instead of diving further into the typing peculiarities of ooc.

After telling it that my return value really, really, is a Float, it complied and gave me the desired result. Here is the finished source:

include time
import os/Time

// struct tm *gmtime_r(const time_t *timep, struct tm *result);
gmtime_r: extern func(TimeT*, TMStruct*) -> TMStruct*

main: func () {
  clock : TimeT
  time(clock&)

  time : TMStruct
  gmtime_r(clock&, time&)

  "@%f" cformat(timeToBeats(time)) println()
}

timeToBeats: func (time: TMStruct) -> Float {
  beats : Float
  beats = ((time tm_hour) * (1000.0 / 24.0)) +
          ((time tm_min)  * (1000.0 / (24.0 * 60.0))) +
          ((time tm_sec)  * (1000.0 / (24.0 * 60.0 * 60.0)))

  return beats
}

Future updates will go directly into the git repo. I of course welcome any contributions.

Ruby for the Web

2011-10-19

Ruby is the best language in the world for interacting with the web, and I'm going to show you why.

This is a response to Python for the Web from gun.io, just because I cannot stand people using the term "best" without qualifying what aspect they refer to. And yes, "interacting with the web" is spongy enough to include all kinds of things.

In order to honor their "Most rights reserved." footer, I won't actually rewrite their post, but give a succinct counter to each point.

Interacting with Websites and APIs Using Ruby

First we'll handle a few simple HTTP requests from the client side. For this we use the excellent REST Client gem, which can be installed via rubygems:

gem install rest-client

A simple GET request:

require 'rest-client'

puts RestClient.get('http://gnu.io')

A GET request using basic authentication:

require 'rest-client'

puts RestClient.get('https://YOURUSERNAME:PASSWORD@api.github.com/user')

And a POST request with form data:

require 'rest-client'

url = 'https://example.com/form'
data = {title: 'RoboCop', description: 'The best movie ever.'}
RestClient.post(url, data)

As before, you can use the basic auth syntax if you require basic or digest authentication.

Processing JSON in Ruby

Since JSON is in the stdlib, there is no need to install anything.

require 'json'
require 'rest-client'

c = RestClient.get('https://github.com/timeline.json')
j = JSON.parse(c)
j.each do |item|
  if repository = item['repository']
    puts repository['name']
  end
end

This also fixes a bug in the original code, as not every item in the timeline has a repository key.

Scraping the Web Using Ruby

Here I'll introduce you to Nokogiri, the binding for libxml2 and libxslt. The usage is heavily influenced by Hpricot and improves upon it in terms of speed, memory usage, accuracy, HTML correction, etc.

There is also an XML library in the stdlib called REXML, but there is not a single time I've used it without regrets.

So first of all, install nokogiri. This also requires installation of libxml2 and libxslt on Linux; I have no idea about other systems, but the authors seem to have quite good documentation, so I'll leave the gritty details to them.

Here's how to do it on Arch Linux:

sudo pacman -S libxml2 libxslt
gem install nokogiri

And here's how to use it for HTML in combination with RestClient, although I'd personally use open-uri in this case for simplicity.

require 'nokogiri'
require 'rest-client'

tree = Nokogiri::HTML(RestClient.get('http://gun.io'))
tree.css('#frontsubtext').each do |element|
  puts element.text
end

Something that wasn't shown is how to use XPath; since that's quite essential for most HTML and XML juggling, here we go:

require 'nokogiri'
require 'rest-client'

tree = Nokogiri::HTML(RestClient.get('http://gun.io'))
tree.xpath('//a').each do |element|
  puts "#{element.text} : #{element[:href]}"
end

Ruby Web Sites

And of course it's about time to plug my own project: Ramaze.

Let's make a little page equivalent of the gun.io example.

I won't go into much detail here, please check out the documentation, as it will answer any questions you have much better than I will be able to do here.

require 'ramaze'

class Home < Ramaze::Controller
  map '/'

  def index(*input)
    @output = input.join('/').upcase
    <<-HTML
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>#{@output}</title>
  </head>
  <body>
    Your output is: #{@output}
  </body>
</html>
    HTML
  end
end

Ramaze.start

Setting up Ramaze on Lighttpd2 with FCGI and Runit

2011-10-10

A few days ago, when I was searching for a way to make a Ramaze application available via proxy forwarding while also letting lighttpd handle X-Sendfile, I was told that I'd have to use 1.5 or 2.0 to do that.

After a little digging it became clear that 1.5 would be a dead end, so I took the plunge and looked into the new and shiny version 2.

So let's have a look at each of the parts. First we start with lighttpd and the new configuration. Please note that this is only for my own site; your mileage may vary.

I'm using Arch Linux for this, with runit as PID 1, as explained by bougyman.

The most important things are forwarding to fastcgi and setting the PATH_INFO header, as the default lighttpd one is not suitable for Ramaze.

setup {
  module_load ( "mod_balance", "mod_expire", "mod_fastcgi", "mod_vhost", "mod_lua", "mod_accesslog" );
  lua.plugin "core.lua";

  listen "0.0.0.0:80";
  listen "[::]:80";

  log ["debug" => "", "*" => "/var/log/lighttpd2/error.log"];
  accesslog "/var/log/lighttpd2/access.log";
  accesslog.format "%h %V %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"";

  static.exclude_extensions ( ".php", ".pl", ".fcgi", "~", ".inc" );
}

include "/etc/lighttpd2/mimetypes.conf";

vhost.map [
  "manveru.name" => {
    docroot "/home/manveru/github/manveru/manveru.name/public/";
    index ("index.html");

    # deliver static files directly, forward everything else to fcgi
    if physical.is_file {
      header.add ("X-cleanurl", "hit");
    } else {
      header.add ("X-cleanurl", "miss");
      env.set "PATH_INFO" => "%{enc:req.path}";

      balance.rr (
        {fastcgi "unix:/var/run/lighttpd/sockets/manveru.name/0.sock";},
        {fastcgi "unix:/var/run/lighttpd/sockets/manveru.name/1.sock";},
        {fastcgi "unix:/var/run/lighttpd/sockets/manveru.name/2.sock";}
      );
    }

    # if req.path =~ "\.(png|css|gif)$" { expire "access 1 week"; }
  }
];

static;

Next up is the Ramaze configuration, in this case we make one from scratch, just to make things a little bit more transparent and easier to control.

Eventually I want to make Ramaze easier to use with FCGI, but handling unix sockets for you is probably fraught with too many issues to provide a one-size-fits-all solution.

#!/usr/bin/env ruby

require 'etc'       # for Etc.getpwnam / Etc.getgrnam below
require 'fileutils' # for FileUtils.rm_f below
require 'ramaze'

## FCGI doesn't like you writing to stdout
Ramaze::Log.loggers = [Ramaze::Logger::Informer.new(__DIR__("../ramaze.fcgi.log"))]

require_relative '../app'

## Initialize Ramaze, but don't start any server just yet.
Ramaze.options.trap = :SIGTERM # will need that for runit later.
Ramaze.start(root: __DIR__('../'), started: true)

socket = File.join(*ENV.values_at('RAMAZE_SOCKET', 'RAMAZE_SOCKET_NUMBER'))
socket << '.sock'
puts "Connecting to #{socket}"

## make sure the socket is closed.
FileUtils.rm_f(socket)

Thread.new do
  begin
    sleep 0.5 until File.socket?(socket)
    File.chmod(0660, socket)
    File.chown(Etc.getpwnam(Etc.getlogin).uid, Etc.getgrnam('www-data').gid, socket)
  rescue Exception => ex
    Ramaze::Log.error(ex)
    raise(ex)
  end
end

Rack::Handler.get(:fastcgi).run(Ramaze, File: socket)

Go Nuts!

2010-05-15

First impressions of Go

Over the past couple of weeks, I've been picking up Go, a language created by some hackers at Google that was released last year.

I started learning it because I really wanted a language in my toolbox that can take advantage of multiple processors and has C-like performance. Of course I could just learn C or C++, but both of these languages really annoy me: the lack of garbage collection, unsafe pointer operations all over the place, long compilation cycles, brain-damaging type systems, etc. Of course there are still places where you absolutely have to use these languages, and they have proven their worth in a lot of instances, but we really should be creating a better future.

Naturally, I'm learning this language with a view biased by my previous experiences, and so I thought it might be fun to write down what exactly goes on during this process.

Let's start with the things I found annoying, unintuitive, and weird.

No eval. There is next to no support for metaprogramming; you have to write everything yourself and cannot generate code at runtime. There is a reflect library that helps a bit, but overall the philosophy of Go is explicit over implicit. Methods, which you can define on your own types, don't have an implicit receiver; there isn't even an equivalent of the self or this you might expect. Some people love it, others hate it, and I'm still torn between these two extremes.

No generics. The closest you can come is having collections of type interface{} and converting to/from it when accessing the collection. This has performance implications and makes your code harder to reason about. Currently there are some attempts, most notably gotgo, to provide an equivalent of C++ templates for Go. Still a messy business, and not something newcomers would be aware of.

No methods on interfaces. Given that you have something that satisfies an interface, it would be a logical step to be able to define methods on the interface itself that rely only on the methods described in it. There are more or less ugly workarounds, like embedding your interface in a struct, defining toplevel functions that take the interface as an argument, or making the method part of the interface and duplicating the code in all types implementing it.

Struct fields cannot be accessed through an interface either; you have to define methods on the interface to access them, which results in lots of GetFoo and SetFoo and code duplication, but is necessary anyway if you want your struct fields to be read-only or write-only.

All in all, Go certainly doesn't seem to be a language for DRY enthusiasts. It's certainly possible to achieve some amount of code reuse, but it's as if the language pulls you in one direction, so you constantly have to steer consciously in the other.

Quite obviously, the language developers are trying to win mindshare and users from the C and C++ camps. The mailing list has many mails from people asking why things are almost like what they're used to, but subtly different. I've even heard complaints about which side types are specified on, and the FAQ has a whole section for people coming from C or from C++. Neither of which is particularly useful for me, as I've never written anything in those languages.