Back to Basics: Regular Expressions

Posted 4 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Regular expressions have been around since the early days of computer science. They gained widespread adoption with the introduction of Unix. A regular expression is a notation for describing sets of character strings. They are used to identify patterns within strings. There are many useful applications of this functionality, most notably string validations, find and replace, and pulling information out of strings.

Regular expressions are just strings themselves. Each character in a regular expression can either be part of a code that makes up a pattern to search for, or it can represent a letter, character or word itself. Let’s take a look at some examples.

Basics

First let’s look at an example of a regular expression that is made up of only actual characters and none of the special characters or patterns that generally make up regular expressions.

To get started let’s fire up irb and create our regular expression:

> regex = /back to basics/
 => /back to basics/

Notice we create a regular expression by entering a pattern between two forward slashes. The pattern we’ve used here will only match strings that contain the string ‘back to basics’. Let’s use the match method, which gives us information about the first match it finds, to look at some examples of what matches and what doesn’t:

> regex.match('basics to back')
 => nil

We’re getting close, but nothing in this string matches our regular expression, so we get nil.

> regex.match('i enjoyback to basics')
 => <MatchData "back to basics">

After an unsuccessful attempt we have a match. Notice that our regular expression matched even though there are no spaces between the pattern and the words before it.

MatchData

The object returned from the Regexp object’s match method is of type MatchData. This object can tell us all sorts of things about a particular match. Let’s take a look at some of the information we can get about our match.

We can use the begin method to find out the offset of the beginning of our match in the original string:

> match = regex.match('i enjoyback to basics')
 => <MatchData "back to basics">

> match.begin(0)
 => 7

> 'i enjoyback to basics'[7]
 => "b"

The argument we pass to the method can be used to specify a particular capture within our match (captures are covered below). In our example above, begin tells us that the beginning of our match can be found at index 7 in our string. As we can see from the code above, the 8th character in the string (at index 7) is ‘b’, the first letter of our match.

Similarly we can get the index of the character following the end of our match using the end method:

> match.end(0)
 => 21

> 'i enjoyback to basics'[21]
 => nil

In this case we get nil since the end of our match is also the end of our string.

We can also use the to_s method to print our match:

> match.to_s
 => "back to basics"

Patterns

The regular expression’s real power becomes obvious when we introduce patterns. Let’s take a look at some examples.

Metacharacters

A metacharacter is any character that has a special meaning within a regular expression. Let’s start with something simple: say we want to find out if our string contains a number. This will require our first pattern, \d, a metacharacter that matches any digit:

> string_to_match = 'back 2 basics'

> regex = /\d/
 => /\d/

> regex.match(string_to_match)
 => <MatchData "2">

Our regular expression matches the number 2 in our string.

Character Classes

Let’s say we wanted to find out if any of the letters from ‘k’ to ‘s’ were in our string. This will require that we use a character class. A character class lets us specify a list of characters or patterns that we’re looking for:

> string_to_match = 'i enjoy making stuff'

> regex = /[klmnopqrs]/
 => /[klmnopqrs]/

> regex.match(string_to_match)
 => <MatchData "n">

In this example we can see we entered all the letters of the alphabet we were interested in between the brackets, and the first instance of any of those characters results in a match. We can simplify the above regular expression by using a range. This is done by entering two characters separated by a -:

> string_to_match = 'i enjoy making stuff'

> regex = /[k-s]/
 => /[k-s]/

> regex.match(string_to_match)
 => <MatchData "n">

As expected, we get the same results with our simplified regular expression.

It’s also possible to invert a character class. This is done by adding a ^ to the beginning of the pattern. If we wanted to look for the first letter not between ‘k’ and ‘s’ we would use the pattern /[^k-s]/:

> string_to_match = 'i enjoy making stuff'

> regex = /[^k-s]/
 => /[^k-s]/

> regex.match(string_to_match)
 => <MatchData "i">

Since ‘i’ isn’t in our range the first letter in our string meets the criteria our regular expression specified.

Another thing worth noting is the \d character we used above is an alias for the character class [0-9].
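
For instance, we would expect the explicit character class form to behave the same way (a quick, illustrative check, not from the original examples):

> regex = /[0-9]/
 => /[0-9]/

> regex.match('back 2 basics')
 => <MatchData "2">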

Modifiers

We have the ability to set a regular expression’s matching mode via modifiers. In Ruby this is done by appending characters after the closing slash of the pattern. A particularly useful modifier is the case-insensitive modifier i. Let’s take a look:

> string_to_match = 'BACK to BASICS'

> regex = /back to basics/i
 => /back to basics/i

> regex.match(string_to_match)
 => <MatchData "BACK to BASICS">

The regular expression matches our string in spite of the fact that the cases are clearly not the same. We’ll look at another common modifier later on in the blog.
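
As a quick aside (an illustrative addition, not part of the original walkthrough), another modifier you’ll run into is m, which lets the . wildcard (covered later) match newlines:

> string_to_match = "back to\nbasics"

> /to.basics/.match(string_to_match)
 => nil

> /to.basics/m.match(string_to_match)
 => <MatchData "to\nbasics">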

Repetitions

Repetitions give us the ability to look for repeated patterns. We can search broadly for a pattern that repeats an indiscriminate number of times, or we can get as granular as specifying the exact number of repetitions we’re looking for.

Let’s try to identify all the numbers in a string again:

> string_to_match = 'The Mavericks beat the Spurs by 21 in game two.'

> regex = /\d/
 => /\d/

> regex.match(string_to_match)
 => <MatchData "2">

Because we used only a single \d we only got the first digit, in this case ‘2’. What we’re actually looking for is the entire number, not just the first digit. We can fix this by modifying our pattern to find any group of contiguous digits. For this we can use the + metacharacter, which tells the regular expression engine to find one or more characters matching the previous pattern. Let’s take a look:

> string_to_match = 'The Mavericks beat the Spurs by 21 in game two.'

> regex = /\d+/
 => /\d+/

> regex.match(string_to_match)
 => <MatchData "21">

We could also look for an exact number of repetitions. Let’s say we only wanted to look for numbers between 100 and 999. One way we could do that would be using the {n} pattern, where n indicates the number of repetitions we’re looking for:

> string_to_match = 'In 30 years the San Francisco Giants have had two 100 win seasons.'

> regex = /\d{3}/
 => /\d{3}/

> regex.match(string_to_match)
 => <MatchData "100">

Our pattern doesn’t match 30, but does match 100 because we told it only three repeating digit characters constituted a match.

Let’s look for words that are at least five characters long. This will require a new metacharacter, \w, which matches any word character. Then we’ll use the {n,} pattern, which says look for n or more of the previous pattern:

> string_to_match = 'we are only looking for long words'

> regex = /\w{5,}/
 => /\w{5,}/

> regex.match(string_to_match)
 => <MatchData "looking">

You can also specify at most m repetitions with {,m}, and a range of repetitions with {n,m}.
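
For example (an illustrative check, not from the original post), {2,3} matches a run of two or three digits:

> string_to_match = 'route 66 is 2448 miles long'

> regex = /\d{2,3}/
 => /\d{2,3}/

> regex.match(string_to_match)
 => <MatchData "66">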

Grouping

Grouping gives us the ability to combine several patterns into one single cohesive unit. This can be very useful when combined with repetitions. Earlier we looked at using repetitions with a single metacharacter \d, but rarely will that be enough to satisfy our needs. Let’s look at how we could define a more complex pattern we expect to see repeated.

Let’s look at how we might create a more complicated regular expression that matches phone numbers in several different formats. We’ll use groups and repetitions to do this:

> phone_format_one = '5125551234'
 => "5125551234"

> phone_format_two = '512.555.1234'
 => "512.555.1234"

> phone_format_three = '512-555-1234'
 => "512-555-1234"

> regex = /(\d{3,4}[.-]{0,1}){3}/
 => /(\d{3,4}[.-]{0,1}){3}/

> regex.match(phone_format_one)
 => <MatchData "5125551234" 1:"234">

> regex.match(phone_format_two)
 => <MatchData "512.555.1234" 1:"1234">

> regex.match(phone_format_three)
 => <MatchData "512-555-1234" 1:"1234">

We have successfully created our regular expression, but there is a lot going on there. Let’s break it down. First we say that our pattern is made up of groups of three or four digits with \d{3,4}. Next we indicate that we want to allow, but not require, a ‘.’ or ‘-’ separator with [.-]{0,1} (inside a character class the ‘.’ loses its wildcard meaning, so it doesn’t need to be escaped). Finally we say we need three of these groups by grouping the previous two patterns together and applying a repetition of three: (\d{3,4}[.-]{0,1}){3}.

Lazy and Greedy

Regular expressions are by default greedy, which means they’ll find the largest possible match. Often that isn’t the behavior we’re looking for. When creating our patterns it’s possible to tell Ruby we’re looking for a lazy match, or the first possible match that satisfies our pattern.

Let’s look at an example. Let’s say we wanted to parse out the timestamp of a log entry. We’ll start out just trying to grab everything in between the square brackets that we know our log is configured to output the date in. In this pattern we’ll use a new metacharacter. The . is a wildcard in a regular expression:

> string_to_match = '[2014-05-09 10:10:14] An error occurred in your application. Invalid input [foo] received.'

> regex = /\[.+\]/
 => /\[.+\]/

> regex.match(string_to_match)
 => <MatchData "[2014-05-09 10:10:14] An error occurred in your application. Invalid input [foo]">

Instead of matching just the text in between the first two square brackets it grabbed everything between the first instance of an opening square bracket and the last instance of a closing square bracket. We can fix this by telling the regular expression to be lazy using the ? metacharacter. Let’s take another shot:

> string_to_match = '[2014-05-09 10:10:14] An error occurred in your application. Invalid input [foo] received.'

> regex = /\[.+?\]/
 => /\[.+?\]/

> regex.match(string_to_match)
 => <MatchData "[2014-05-09 10:10:14]">

Notice that we added our ? after our repetition metacharacter. This tells the regular expression engine to keep looking for the next part of the pattern only until it finds a match; not until it finds the last match.

Assertions

Assertions are parts of a regular expression that do not add any characters to a match. They just assert that certain patterns are present, or that a match occurs at a certain place within a string. There are two types of assertions; let’s take a closer look at each.

Anchors

The simplest type of assertion is an anchor. Anchors are metacharacters that let us specify positions in our patterns. What makes these metacharacters different is that they don’t match characters, only positions.

Let’s look at how we can determine if a line starts with Back to Basics using the ^ anchor, which denotes the beginning of a line:

> multi_line_string_to_match = <<-STRING
"> I hope Back to Basics is fun to read.
"> Back to Basics is fun to write.
"> STRING
 => "I hope Back to Basics is fun to read.\nBack to Basics is fun to write.\n"

> regex = /^Back to Basics/
 => /^Back to Basics/

> match = regex.match(multi_line_string_to_match)
 => <MatchData "Back to Basics">

> match.begin(0)
 => 38

Looking at where our match begins we can see it’s the second instance of the string “Back to Basics” that we’ve matched. Another thing to take note of is that the ^ anchor doesn’t only match the beginning of a string, but the beginning of any line within the string.

There are many anchors available. I encourage you to review the Regexp documentation and check out some of the others.
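
For example (an illustrative addition, not from the original post), the $ anchor matches the end of a line, so we can check whether a line ends with a particular phrase:

> regex = /fun to write\.$/
 => /fun to write\.$/

> regex.match(multi_line_string_to_match)
 => <MatchData "fun to write.">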

Lookarounds

The second type of assertion is called a lookaround. Lookarounds allow us to provide a pattern that must be matched in order for a regular expression to be satisfied, but that will not be included in a successful match. These are called lookahead and lookbehind patterns.

Let’s say we had a comma delimited list of companies and the year they were founded, and we want to match the year that thoughtbot was founded. In this case we only want the year; we’re not interested in including the company in the match, and we’re only interested in thoughtbot, not the other two companies. To do this we’ll use a positive lookbehind. This means we’ll provide a pattern we expect to appear before the pattern we want to match.

> string_to_match = 'Dell: 1984, Apple: 1976, thoughtbot: 2003'

> regex = /(?<=thoughtbot: )\d{4}/
 => /(?<=thoughtbot: )\d{4}/

> regex.match(string_to_match)
 => <MatchData "2003">

Even though the pattern we use to assert that the word thoughtbot precedes our match appears in our regular expression, it isn’t included in our match data. This is exactly the behavior we were looking for.

To specify a positive lookbehind we use ?<=. If we wanted to use a negative lookbehind, meaning the match we want isn’t preceded by some particular text, we would use ?<!.

To do a positive lookahead we use ?=. A negative lookahead is achieved using ?!.
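
As an illustrative example (not from the original post), a positive lookahead lets us match the company founded in 2003, using the same string as above, without pulling the year into the match:

> regex = /\w+(?=: 2003)/
 => /\w+(?=: 2003)/

> regex.match(string_to_match)
 => <MatchData "thoughtbot">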

Captures

Another useful tool is called a capture. This gives us the ability to match on a pattern, but capture only the parts of the pattern that are of interest to us. We accomplish this by surrounding the pattern data we intend to capture with parentheses, which is also how we specify a group. Let’s look at how we might pull the quantity and price for an item off of an invoice:

> string_to_match = 'Mac Book Pro - Quantity: 1 Price: 2000.00'

> regex = /[\w\s]+ - Quantity: (\d+) Price: ([\d\.]+)/
 => /[\w\s]+ - Quantity: (\d+) Price: ([\d\.]+)/ 

> match = regex.match(string_to_match)
 => <MatchData "Mac Book Pro - Quantity: 1 Price: 2000.00" 1:"1" 2:"2000.00"> 

> match[0]
 => "Mac Book Pro - Quantity: 1 Price: 2000.00" 

> match[1]
 => "1"

> match[2]
 => "2000.00"

Notice we have all the match data in an array. The first element is the actual match and the next two are our captures. We indicate we want something to be captured by surrounding it in parentheses.

We can make working with captures simpler by using what is called a named capture. Instead of using the match data array we can provide a name for each capture and access the values out of the match data as a hash of those names after the match has occurred. Let’s take a look:

> string_to_match = 'Mac Book Pro - Quantity: 1 Price: 2000.00'

> regex = /[\w\s]+ - Quantity: (?<quantity>\d+) Price: (?<price>[\d\.]+)/
 => /[\w\s]+ - Quantity: (?<quantity>\d+) Price: (?<price>[\d\.]+)/

> match = regex.match(string_to_match)
 => <MatchData "Mac Book Pro - Quantity: 1 Price: 2000.00" quantity:"1" price:"2000.00">

> match[:quantity]
 => "1"

> match[:price]
 => "2000.00"

Strings

There are also some useful functions that take advantage of regular expressions in the String class. Let’s take a look at some of the things we can do.

sub and gsub

The sub and gsub methods both allow us to provide a pattern and a string to replace instances of that pattern with. The difference between the two methods is that gsub will replace all instances of the pattern, while sub will only replace the first instance.

The gsub method gets its name from the fact that the matching mode (discussed above) is set to global; in many regular expression implementations that mode is enabled with the g modifier, hence the name.

Let’s take a look at some examples.

> string_to_match = "My home number is 5125551234, so please call me at 5125551234."
 => "My home number is 5125551234, so please call me at 5125551234."

> string_to_match.sub(/5125551234/, '(512) 555-1234')
 => "My home number is (512) 555-1234, so please call me at 5125551234."

When we use sub we can see we still have one instance of our phone number that isn’t formatted. Let’s use gsub to fix it.

> string_to_match.gsub(/5125551234/, '(512) 555-1234')
 => "My home number is (512) 555-1234, so please call me at (512) 555-1234."

As expected gsub replaces both instances of our phone number.

While our previous example demonstrates the way the functions work it isn’t a particularly useful regular expression. If we were trying to format all the phone numbers in a large document we obviously couldn’t make our pattern the number in each case, so let’s revisit our example and see if we can make it more useful.

> string_to_match = "My home number is 5125554321. My office number is 5125559876."
 => "My home number is 5125554321. My office number is 5125559876." 

> string_to_match.gsub(/(?<area_code>\d{3})(?<exchange>\d{3})(?<subscriber>\d{4})/, '(\k<area_code>) \k<exchange>-\k<subscriber>')
 => "My home number is (512) 555-4321. My office number is (512) 555-9876."

Now our regular expression will format any phone number in our string. Notice that we take advantage of named captures in our regular expression and use them in our replacement by using \k.

scan

The scan method lets us pull all regular expression matches out of a string. Let’s look at an example.

> string_to_scan = "I've worked in TX and CA so far in my career."
 => "I've worked in TX and CA so far in my career." 

> string_to_scan.scan(/[A-Z]{2}/)
 => ["TX", "CA"]

Using a regular expression we pull out all the state codes in our string. One thing to keep in mind as you continue to learn is to pay close attention to the assorted metacharacters available and how their meanings change depending on context. Just in this introductory post we saw multiple meanings for both the ^ and ? characters, and we didn’t even cover all of the possible meanings of those two. Sorting out which meaning applies where is one of the more difficult parts of mastering regular expressions.

Regular expressions are one of the most powerful tools we have at our disposal with Ruby. Keep them in mind as you code and you’ll be surprised how often they can provide a nice clean solution to an otherwise daunting task!

What’s next?

If you found this useful, you might also enjoy:

Episode #463 - May 9th, 2014

Posted 4 months back at Ruby5

Untangling spaghetti code, opening the Atom source, better presenters with DumbDelegator, managing your OS X setup with osxc, an intro to Rails view caching, and the Eldritch async DSL all in this episode of the Ruby5!

Listen to this episode on Ruby5

Sponsored by New Relic

New Relic is _the_ all-in-one web performance analytics product. It lets you manage and monitor web application performance, from the browser down to the line of code. With Real User Monitoring, New Relic users can see browser response times by geographical location of the user, or by browser type.
This episode is sponsored by New Relic

Untangle Spaghetti Code

This article from Justin Weiss walks you through a refactoring that cleans up the code by reversing the caller and callee.
Untangle Spaghetti Code

Atom is OSS

The Atom editor is now open source!
Atom is OSS

DumbDelegator

The dumb_delegator gem is a delegator implementation specifically designed to work with Rails url helpers.
DumbDelegator

osxc

Tired of reinstalling things manually when you get a new machine? The osxc project will make this process a breeze for OS X machines!
osxc

Rails Caching

Want to start caching in Rails but aren't sure how to go about it? This two part blog post from Greg Molnar will tee you up!
Rails Caching

Eldritch

Eldritch is a DSL to make parallel programming easier in Ruby. Asynchronousness has never been easier!
Eldritch

Tips for Clojure Beginners

Posted 4 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

1. Learn the essentials with Clojure Koans.

Clojure Koans teaches you the basics of the language by providing a series of tests for you to turn green.

The topics and tests are chosen well, and the project’s vibe is pleasant (“calling a function is like giving it a hug with parentheses”).

Open a koan. Make it pass. Meditate. Enjoy enlightenment.

2. Move on to 4Clojure problems.

4Clojure is a great way to become familiar with Clojure’s many built-in functions.

Make sure to register an account and follow a handful of the Top Users. This will let you compare your solutions to theirs and be suitably mindblown.

A word of warning: 4Clojure tends to encourage code golf. Shorter is not always better.

For the longer problems, you may prefer to work in your editor. Check out offline-4clojure to get a local copy of the problems and tests.

3. Read a book or two.

Clojure Programming and The Joy of Clojure are both great places to start.

Clojure Programming is approachable, well-written, and no-nonsense. In particular, the examples are well-chosen and understandable.

The Joy of Clojure is also excellent, but takes more mental horsepower to get through. Its examples are perhaps more realistic, but thus more complicated and harder to follow.

4. Learn to develop interactively from your editor.

As a Rubyist, I’m used to running tests from my editor, and would never adopt a workflow that forced me to switch to the shell to run my tests. Additionally, I use the Spring pre-loader because rebooting the application every time I make a change and want to test it is painful. The ability to get test feedback quickly, and in the same place I’m writing them, contributes greatly to my flow and sense of happiness.

Despite this, when I want to interact with a running version of my application, it’s off to the Rails console I go. I write code “over here,” but interact with my running application “over there.”

Clojurians eschew this separation.

When writing Clojure, I can connect my editor to an instance of my running application that sticks around. When I change a function, I instruct that persistent application session to simply use the new function without restarting or reloading.

Further, when I want to see how the system behaves there’s no need to head off to some “over there” place. Instead, I can evaluate Clojure code in the context of my running application right from Vim, with results displayed wherever I might want them.

I had read descriptions of this development style and felt somewhat underwhelmed, but getting this set up and working really changed how much I enjoyed writing Clojure. Make sure you at least give this an honest shot before moving on.

  • Vim users: to get this experience, install Tim Pope’s fireplace.vim and read enough of the docs to learn how to eval code and open an in-editor REPL. Outdated resources might point you to VimClojure, but it is abandonware and should be avoided.

  • Emacs users: cider is what you’re looking for.

  • LightTable users: your editor does this out of the box! How enlightened of it. Check your docs for details, or just start on David Nolen’s interactive ClojureScript tutorial.

  • Users of other editors: you probably want to google something like [your-editor-name nrepl server].

5. Absorb Clojure’s philosophies and motivations with conference talks.

One of my favorite parts of the Clojure ecosystem is how many excellent conference talks are available. Three great examples, all from Rich Hickey, creator of Clojure:

  • Are We There Yet? - Rich asks whether OO as we practice it today is the best we can do. Spoiler: he thinks not. A great starting place to understand Clojure design motivations.

  • The Value of Values - Immutable data structures are a key element in Clojure’s design. This talk gives a great overview of their rationale and characteristics.

  • Simple Made Easy - Required viewing, if only because “complect” and “simple, in the Rich Hickey sense” are terms you’ll hear community members use often.

The above are some of my favorites, but I’ve been pleasantly surprised at the high quality of most Clojure talks I’ve watched, so don’t hesitate to dive into whatever looks interesting. For lots of options, check out the ClojureTV YouTube channel.

Bonus tip: I find I can watch most talks at 1.5x without a loss of comprehension. Enjoy that 40-minute talk in just 26!

6. Ask for help when stuck.

I’ve had good luck getting unstuck by asking for help in #clojure on freenode (IRC), reaching out to library authors directly on Twitter (thanks @swannodette, @weavejester, and @cemerick!), and the usual swearing/staring/source-diving that is software development.

7. Don’t panic.

Chances are, you’re coming to Clojure from an object-oriented language with mutable data structures. Moving to a functional language with immutable data is a significant change of paradigm, and will take some getting used to. This is normal. Don’t feel bad if you struggle early on. I certainly did (and often still do)!

Episode #462 - May 6th, 2014

Posted 4 months back at Ruby5

Good news from the Ruby Core meeting, TDD debates galore, AdequateRecord, and some surprisingly uneventful code ping pong this week.

Listen to this episode on Ruby5

Sponsored by Pull Review

You want to ship it right instead of doing it again. But what could you clean while still meeting the deadline? PullReview reviews the Ruby code you just wrote and tells you what's wrong, why, and how to fix it - from style to security. Code, Spot, Fix, and Ship!
This episode is sponsored by Pull Review

Ruby Core Meeting

The Ruby Core Developer meeting took place in Japan and Terence Lee was kind enough to let us know what happened. Team Matz is considering accelerating the release schedule for Ruby patch releases. So that would be things like 2.1.1 and 2.1.2, as opposed to waiting for a 2.2 release. It would allow them to tackle bugs that cause dreaded segmentation faults much faster. To achieve this, Shibata Hiroshi & Koichi Sasada will work on a CI environment in order to make the release process easier. They also discussed removing commits that are not directly relevant to a patch release (or hotfix) before the release goes out, probably in order to avoid potential regressions in those releases. By the way, since Ruby 2.1.0 was released, the core team announced that they would not release versions with patch levels anymore. So no more obscure versions like 1.9.3-p545. Finally dear listeners, remember that feature suggestions for Ruby 2.2 are still being accepted, so if you’ve ever wanted to give back to your favorite language, you should head over to bugs.ruby-lang.org right now.
Ruby Core Meeting

RailsConf Keynote & Videos

RailsConf was less than two weeks ago and the good people at Confreaks already have David Heinemeier Hansson’s keynote along with a handful of other talks available on their site. There’s another part of that talk that was a little less talked about: the part about David’s confession of his inability to ever become a computer scientist. Although it was overshadowed by his knack for controversy, I think it was really interesting to hear such a prominent programmer — whatever you think of him — describe himself more as a software writer with affinities towards humanities instead of sciences.
RailsConf Keynote & Videos

TDD Counterpoints

So speaking of the TDD controversy, a number of prominent voices in the community have by now responded to his allegations that TDD for instance hinders design, is not necessary, etc. Robert Martin a.k.a Uncle Bob, was one of the first to respond with several blog posts actually. While deploring DHH’s rhetoric at first, he reiterated the benefits of TDD in a first post. As DHH continued aboard his TDD hate train, it was more like a blog ping pong than a code one. That’s about when it got interesting for me, right when Gary Bernhardt chimed in. On top of that yesterday, both Martin Fowler and Corey Haines joined in the merry battle of giants.
TDD Counterpoints

Duplication in Tests is Sometimes Good

Continuing down the testing path, this weekend Jakub Arnold published a short-yet-useful post entitled “Duplication in Tests is Sometimes Good.” I presume that you’re writing tests for your application or library. And I further presume you’re using a framework with before and after test hooks, like RSpec. People often use those hooks to clean up their test code, pulling out common setup and configuration. Is it possible that you can extract too much of that setup and configuration? Perhaps to the point of making your tests hard to understand? As Jakub shows, some duplication can be good. It’s not always bad to repeat yourself, especially if the repetition is easily updated with a find and replace.
Duplication in Tests is Sometimes Good

AdequateRecord

I think a much more important thing than dislike of TDD happened at RailsConf. I’m sad that not enough people have talked about Aaron Patterson’s Keynote. He spent quite a bit of time explaining the hard work that was put into speeding up ActiveRecord for the upcoming 4.2 release of Rails. He showed multiple benchmarks of the AdequateRecord branch and how in many ways it’s even faster than back in the Rails 2.3 days. So while the video for Aaron’s talk wasn’t available at the time of this recording, we highly recommend you watch it on Confreaks when it comes out. By the way, AdequateRecord has already been merged into rails/master on GitHub so you can already test it out if you like your edge really really bleedy.
AdequateRecord

Code Ping Pong with DHH - The Aftermath

And finally, to wrap up a TDDHH-filled episode: yesterday, Marcin Stecki published an article on the netguru blog describing some of the background for and aftermath of creating the Code Ping Pong with DHH site. In summation, the Rails 4 app was built on a whim, the source is available on GitHub, DHH didn’t know about it beforehand yet agreed to respond to a handful of entries, DHH’s tweets brought several thousand eyes to the site, and ultimately Marcin was confused. As he put it, people poop on Rails all the time, but when presented with an avenue for presenting “bad code” to DHH for his thoughts, almost no one had any concrete examples. I think they got about 11 submissions... and most of them weren’t even serious. Ultimately, only three entries were presented to David, and his refactorings won the popular opinion vote for all three. Anyway, it is actually quite impressive to me that David volunteered his time for this, especially since he had nothing to do with its setup or concept. I appreciate that he gave his time and attention to what could’ve easily been seen as being silly and just ignored. So, thanks, David and Marcin!
Code Ping Pong with DHH - The Aftermath

Sponsored by Ruby5

Ruby5 is released Tuesday and Friday mornings. To stay informed about and active with this podcast, we encourage you to do one of the following:

Thank You for Listening to Ruby5

More new recruits

Posted 4 months back at RicRoberts :

Software engineer and web developer, Alex Lynham, is a Ruby enthusiast with particular expertise in front-end development but with ability throughout the stack. A speaker at NWRUG, he is also a fan of Python and JavaScript and divides his time between his love for programming and freelance music journalism.

Nicolas Terpolilli joins us from l’école Centrale de Lille for the summer. A software engineer, and devotee of Open Data, he’s a founder of rawdatahunter where he writes news and musings on Open Data.

And, on a contracting basis, we’ve been delighted to have Robin Gower with us since March. Robin has considerable software development experience and is skilled at both front- and back-end development. And if that wasn’t enough, he’s also an economist and information analyst. Clever clogs.

You can see the rest of our team here.

Document Explicit Dependencies Through Tests

Posted 4 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

One of the purposes of writing tests is to provide living documentation of an application’s code. Tests provide real examples of how a certain class or function is supposed to be used. Tests can also document the exact dependencies of the tested code.

The Problem

When Rails boots it loads most, if not all, of the application’s code, along with all of the dependencies (gems). Because of this, there is no need to require dependencies in individual files that contain application logic. When looking at the source of a specific class, it is hard to tell what external code it depends on. The test doesn’t help either.

A typical RSpec test usually looks something like this:

require 'spec_helper'

describe StyleGuide do
  # actual tests omitted
end

In the Rails community, it has become a de facto standard to require the default spec_helper (or an equivalent) in each test file. A typical spec/spec_helper.rb file ends up loading the whole Rails environment, requiring numerous helper files, and setting up various configuration options. All of that, en masse, is more than what any particular test file needs.

Certainly, integration and feature tests depend on the entire Rails environment. ActiveRecord model tests depend on the presence and configuration of a database. These are all good use cases for spec_helper. But what about unit tests that don’t require the database? When testing a plain old Ruby class, there might only be a few dependencies, if any.

The Solution

Being conscious about what must be required for a particular source file is a good thing. Instead of loading everything but the kitchen sink through the spec_helper, let’s specify the minimum dependencies inside of the test.

Here’s a revision of our code from above:

require 'basic_spec_helper'
require 'rubocop'
require 'app/models/style_guide'

describe StyleGuide do
  # actual tests omitted
end

This code states exactly what the tested class (app/models/style_guide.rb) depends on: the rubocop gem and the basic_spec_helper.rb file.

The idea behind basic_spec_helper is that it’s very minimal, and that its sole purpose is to do a few convenience things. It should avoid becoming a junk drawer like spec_helper.

Here’s an example of a basic spec helper, extracted from Hound:

$LOAD_PATH << File.expand_path('../..', __FILE__)

require 'webmock/rspec'

RSpec.configure do |config|
  config.expect_with(:rspec) { |c| c.syntax = :expect }
  config.order = 'random'
end

This lightweight spec helper does just enough to enforce good practices and set up some configuration that should apply to all of the tests. Here’s a quick breakdown of what this spec helper is doing:

  1. Adds the project root to the load path to allow requiring files starting at the project root.

    $LOAD_PATH << File.expand_path('../..', __FILE__)
    
  2. Requires webmock/rspec to ensure that external requests are not allowed.

    require 'webmock/rspec'
    
  3. Provides preferred RSpec configuration that all tests should adhere to.

    RSpec.configure do |config|
      config.expect_with(:rspec) { |c| c.syntax = :expect }
      config.order = 'random'
    end
    

Tests which might be part of a Rails application test suite, but don’t actually depend on Rails or ActiveRecord can now require this basic spec helper along with the essential gems and files. This causes each test to explicitly document dependencies of the tested code. Loading minimal dependencies during tests removes any magical coupling, helps with refactoring, saves time during debugging, and makes tests run faster.

What’s Next?

Want to learn more techniques about decoupling code away from Rails?

Minimum Viable Block Chain

Posted 4 months back at igvita.com

Cryptocurrencies, and Bitcoin in particular, have been getting a lot of attention from just about every angle: regulation, governance, taxation, technology, product innovation, and the list goes on. The very concept of a "peer-to-peer (decentralized) electronic cash system" turns many of our previously held assumptions about money and finance on their head.

That said, putting the digital currency aspects aside, an arguably even more interesting and far-reaching innovation is the underlying block chain technology. Regardless of what you think of Bitcoin, or its altcoin derivatives, as a currency and a store of value, behind the scenes they are all operating on the same basic block chain principles outlined by Satoshi Nakamoto:

We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power... The network itself requires minimal structure.

The block chain is agnostic to any "currency". In fact, it can (and will) be adapted to power many other use cases. As a result, it pays to understand the how and the why behind the "minimum viable block chain":

  • What follows is not an analysis of the Bitcoin block chain. In fact, I intentionally omit mentioning both the currency aspects, and the many additional features that the Bitcoin block chain is using in production today.
  • What follows is an attempt to explain, from the ground up, why the particular pieces (digital signatures, proof-of-work, transaction blocks) are needed, and how they all come together to form the "minimum viable block chain" with all of its remarkable properties.

I learned long ago that writing helps me refine my own sloppy thinking, hence this document. Primarily written for my own benefit, but hopefully helpful to someone else as well. Feedback is always welcome, leave a comment below!


Securing transactions with triple-entry bookkeeping #

Alice and Bob are stamp collectors. It's nothing serious, and they're mostly in it for the social aspects of meeting others, exchanging stories, and doing an occasional trade. If both parties see something they like, they negotiate right there and then and complete the swap. In other words, it's a simple barter system.

Then, one day Bob shows up with a stamp that Alice feels she absolutely must have in her collection. Except there is a problem, because Bob is not particularly interested in anything that Alice has to offer. Distraught, Alice continues negotiating with Bob and they arrive at a solution: they'll do a one-sided transaction where Bob will give Alice his stamp and Alice will promise to repay him in the future.

Both Bob and Alice have known each other for a while, but to ensure that both live up to their promise (well, mostly Alice), they agree to get their transaction "notarized" by their friend Chuck.

They make three copies (one for each party) of the above transaction receipt indicating that Bob gave Alice a "Red stamp". Both Bob and Alice can use their receipts to keep account of their trade(s), and Chuck stores his copy as evidence of the transaction. Simple setup but also one with a number of great properties:

  1. Chuck can authenticate both Alice and Bob to ensure that a malicious party is not attempting to fake a transaction without their knowledge.
  2. The presence of the receipt in Chuck's books is proof of the transaction. If Alice claims the transaction never took place then Bob can go to Chuck and ask for his receipt to disprove Alice's claim.
  3. The absence of the receipt in Chuck's books is proof that the transaction never took place. Neither Alice nor Bob can fake a transaction. They may be able to fake their copy of the receipt and claim that the other party is lying, but once again, they can go to Chuck and check his books.
  4. Neither Alice nor Bob can tamper with an existing transaction. If either of them does, they can go to Chuck and verify their copies against the one stored in his books.

What we have above is an implementation of "triple-entry bookkeeping", which is simple to implement and offers good protection for both participants. Except, of course, you've already spotted the weakness, right? We've placed a lot of trust in an intermediary. If Chuck decides to collude with either party, then the entire system falls apart.

Moral of the story? Be (very) careful about your choice of the intermediary!

Securing transactions with PKI #

Dissatisfied with the dangers of using a "reliable intermediary", Bob decides to do some research and discovers that public key cryptography can eliminate the need for an intermediary! This warrants some explanation…

Public-key cryptography, also known as asymmetric cryptography, refers to a cryptographic algorithm which requires two separate keys, one of which is secret (or private) and one of which is public. Although different, the two parts of this key pair are mathematically linked. The public key is used to encrypt plaintext or to verify a digital signature; whereas the private key is used to decrypt ciphertext or to create a digital signature.

The original intent behind using a third party (Chuck) was to ensure three properties:

  • Authentication: a malicious party can't masquerade as someone else.
  • Non-repudiation: participants can't claim that the transaction did not happen after the fact.
  • Integrity: the transaction receipt can't be modified after the fact.

Turns out, public key cryptography can satisfy all of the above requirements. Briefly, the workflow is as follows:

  1. Both Alice and Bob generate a set of public-private keypairs.
  2. Both Alice and Bob publish their public keys to the world.
  3. Alice writes a transaction receipt in plaintext.
  4. Alice encrypts the plaintext of the transaction using her private key.
  5. Alice prepends a plaintext "signed by" note to the ciphertext.
  6. Both Alice and Bob store the resulting output.
Note that step #5 is only required when many parties are involved: if you don't know who signed the message then you don't know whose public key you should be using to decrypt it. This will become relevant very soon...

This may seem like a lot of work for no particular reason, but let's examine the properties of our new receipt:

  1. Bob doesn't know Alice's private key, but that doesn't matter because he can look up her public key (which is shared with the world) and use it to decrypt the ciphertext of the transaction.
  2. Alice is not really "encrypting" the contents of the transaction. Instead, by using her private key to encode the transaction she is "signing it": anyone can decrypt the ciphertext by using her public key, and because she is the only one in possession of the private key this mechanism guarantees that only she could have generated the ciphertext of the transaction.
How does Bob, or anyone else for that matter, get Alice's public key? There are many ways to handle distribution of public keys - e.g. Alice publishes it on her website. We'll assume that some such mechanism is in place.

As a result, the use of public key infrastructure (PKI) fulfills all of our earlier requirements:

  • Bob can use Alice's public key to authenticate signed transaction by decrypting the ciphertext.
  • Only Alice knows her private key, hence Alice can't deny that the transaction took place - she signed it.
  • Neither Bob nor anyone else can fake or modify a transaction without access to Alice's private key.

Both Alice and Bob simply store a copy of the signed transaction and the need for an intermediary is eliminated. The "magic" of public key cryptography is a perfect match for their two-party barter system.
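
As a rough illustration of this workflow in Ruby (a minimal sketch using the OpenSSL standard library; in practice you use a signing API rather than literally "encrypting" the receipt with the private key, and the receipt text here is just made up):

require 'openssl'

# Alice generates a keypair (step 1); 2048 bits is an arbitrary but common choice.
alice_key    = OpenSSL::PKey::RSA.new(2048)
alice_public = alice_key.public_key

# The plaintext transaction receipt (step 3).
receipt = 'Bob gave Alice one red stamp'

# Alice signs the receipt with her private key (step 4).
signature = alice_key.sign(OpenSSL::Digest.new('SHA256'), receipt)

# Bob, or anyone else, verifies the signature with Alice's public key.
alice_public.verify(OpenSSL::Digest.new('SHA256'), signature, receipt)              # => true
alice_public.verify(OpenSSL::Digest.new('SHA256'), signature, 'a tampered receipt') # => false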

Balance = Σ(receipts) #

With the PKI infrastructure in place, Bob and Alice complete a few additional trades: Alice acquires another stamp from Bob and Bob picks up a stamp from Alice. They each follow the same steps as before to generate signed transactions and append them to their respective ledgers.

The records are secure, but there is a small problem: it's not clear if either party has an outstanding balance. Previously, with just one transaction, it was clear who owed whom (Alice owed Bob) and how much (one red stamp), but with multiple transactions the picture gets really murky. Are all stamps of equal value? If so, then Alice has a negative balance. If not, then it's anyone's guess! To resolve this, Alice and Bob agree on the following:

  • Yellow stamp is worth twice the value of a red stamp.
  • Blue stamp is equal in value to a red stamp.

Finally, to ensure that their new agreement is secure they regenerate their ledgers by updating each transaction with its relative value. Their new ledgers now look as follows:

With that, computing the final balance is now a simple matter of iterating through all of the transactions and applying the appropriate debits and credits to each side. The net result is that Alice owes Bob 2... units of value. What's a "unit of value"? It's an arbitrary medium of exchange that Alice and Bob have agreed on. Further, since "unit of value" doesn't roll off the tongue, Alice and Bob agree to call 1 unit of value 1 chroma (plural: chroms).

All of the above seems trivial, but the fact that the balance of each party is a function of all of the receipts in the ledger has an important consequence: anyone can compute everyone's balance. There is no need for any trusted intermediaries and the system is trivial to audit. Anyone can traverse the full ledger, verify the trades, and figure out the outstanding balances of each party.
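
Here is a minimal Ruby sketch of that balance computation (the ledger structure is hypothetical, just enough to show the iteration over receipts):

# Hypothetical ledger entries: who owes whom, and how many chroms.
ledger = [
  { from: 'alice', to: 'bob',   amount: 1 }, # red stamp
  { from: 'alice', to: 'bob',   amount: 2 }, # yellow stamp
  { from: 'bob',   to: 'alice', amount: 1 }  # blue stamp
]

# Anyone holding a copy of the ledger can compute every balance.
balances = Hash.new(0)
ledger.each do |receipt|
  balances[receipt[:from]] -= receipt[:amount]
  balances[receipt[:to]]   += receipt[:amount]
end

balances # => {"alice"=>-2, "bob"=>2}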

Multi-party transfers & verification #

Next, Bob stumbles across a stamp owned by John that he really likes. He tells John about the secure ledger he is using with Alice and asks him if he would be willing to do a trade where Bob transfers his balance with Alice as a method of payment - i.e. Bob gets the stamp from John, and Alice would owe John the amount she previously owed Bob. John agrees, but now they have a dilemma. How exactly does Bob "transfer" his balance to John in a secure and verifiable manner? After some deliberation, they arrive at an ingenious plan:

Bob creates a new transaction by following the same procedure as previously, except that he first computes the SHA-256 checksum (a unique fingerprint) of the encrypted transaction he wants to transfer and then inserts the checksum in the "What" field of the new receipt. In effect, he is linking the new transfer receipt to his previous transaction with Alice, and by doing so, transfers its value to John.

To keep things simple, we'll assume that all transfers "spend" the full value of the transaction being transferred. It's not too hard to extend this system to allow fractional transfers, but that's unnecessary complexity at this point.
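
In Ruby, computing that fingerprint is a one-liner with the standard library (a sketch; the field names and placeholder ciphertext are made up for illustration):

require 'digest'

# The (encrypted) transaction Bob wants to transfer - hypothetical placeholder content.
previous_transaction = 'ciphertext of: Bob gave Alice one red stamp'

# Bob fingerprints the prior transaction...
checksum = Digest::SHA256.hexdigest(previous_transaction)

# ...and references it in the "What" field of the new receipt to John.
transfer_receipt = { from: 'bob', to: 'john', what: checksum }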

With the new transaction in place, John makes a copy of the encrypted ledger for his safekeeping (now there are three copies) and runs some checks to verify its integrity:

  1. John fetches Alice's and Bob's public keys and verifies the first three transactions.
  2. John verifies that Bob is transferring a "valid" transaction:
    • The transaction that is being transferred is addressed to Bob.
    • Bob has not previously transferred the same transaction to anyone else.

If all the checks pass, they complete the exchange and we can compute the new balances by traversing the ledger: Bob has a net zero balance, Alice has a debit of 2 chroms, and John has a credit of 2 chroms (courtesy of Alice). Further, John can now take his new ledger to Alice and ask her for payment, and even though Alice wasn't present for their transaction, that's not a problem:

  • Alice can verify the signature of the new transfer transaction using Bob's public key.
  • Alice can verify that the transfer transaction is referencing one of her own valid transactions with Bob.
The above transfer and verification process is a pretty remarkable property of the system! Note that to make it all work, we need two enabling technologies: (a) use of PKI, which enables digital signature verification, and (b) the receipt ledger, which enables us to look at the full transaction history to verify balances and to link previous transactions to enable the transfer.

Satisfied with their ingenuity John and Bob part ways: Bob goes home with a new stamp and John with a new ledger. On the surface, everything looks great, but they've just exposed themselves to a challenging security problem... Can you spot it?

Double-spending and distributed consensus #

Shortly after completing the transaction with John, Bob realizes that they have just introduced a critical flaw into their system and one that he could exploit to his advantage if he acts quickly: both Bob and John have updated their ledgers to include the new transaction, but neither Alice nor anyone else is aware that it has taken place. As a result, there is nothing stopping Bob from approaching other individuals in his network and presenting them with an old copy of the ledger that omits his transaction with John! If he convinces them to do a transaction, just as he did with John, then he can "double-spend" the same transaction as many times as he wants!

Of course, once multiple people show up with their new ledgers and ask Alice for payment, the fraud will be detected, but that is of little consolation - Bob has already run away with his loot!

The double-spend attack was not possible when we only had two participants since in order to complete the transaction you'd verify and update both sets of books simultaneously. As a result, all ledgers were always in sync. However, the moment we added an extra participant we introduced the possibility of incomplete and inconsistent ledgers between all the participants, which is why the double-spend is now possible.

In CS speak, a two-party ledger provides "strong consistency", and growing the ledger beyond two parties requires some form of distributed consensus to mitigate double-spend.

The simplest possible solution to this problem is to require that all parties listed in the ledger must be present at the time when each new transaction is made, such that everyone can update their books simultaneously. An effective strategy for a small-sized group, but also not a scalable one for a large number of participants.

Requirements for a distributed consensus network #

Let's imagine that we want to scale our ledger to all stamp collectors around the globe such that anyone can trade their favorite stamps in a secure manner. Obviously, requiring that every participant must be present to register each transaction would never work due to geography, timezones, and other limitations. Can we build a system where we don't need everyone's presence and approval?

  1. Geography is not really an issue: we can move communication online.
  2. Timezone problems can be solved with software: we don't need each individual to manually update their ledgers. Instead, we can build software that can run on each participant's computer and automatically receive, approve, and add transactions to the ledger on their behalf.

In effect, we could build a peer-to-peer (P2P) network that would be responsible for distributing new transactions and getting everyone's approval! Except, unfortunately that's easier said than done in practice. For example, while a P2P network can resolve our geography and timezone problems, what happens when even just one of the participants goes offline? Do we block all transactions until they're back online?

Note that the "how" of building a P2P network is a massive subject in its own right: protocols and signaling, traversing firewalls and NATs, bootstrapping, optimizing how updates are propagated, security, and so on. That said, the low-level mechanics of building such a network are out of scope of our discussion... we'll leave that as an exercise for the reader.

Turns out, distributed consensus is a well studied problem in computer science, and one that offers some promising solutions. For example, two-phase commit (2PC) and Paxos both enable a mechanism where we only need the majority quorum (50%+) of participants to be present to safely commit a new transaction: as long as the majority has accepted the transaction the remainder of the group is guaranteed to eventually converge on the same transaction history.

That said, neither 2PC nor Paxos are sufficient on their own. For example, how would either 2PC or Paxos know the total number of participants in our P2P stamp-collector network when new participants are joining on a daily basis and others are disappearing without notice? If one of the prior participants is offline, are they offline temporarily, or have they permanently left the network? Similarly, there is another, even more challenging problem that we must account for, the "Sybil attack": there is nothing stopping a malicious participant from creating many profiles to gain an unfair share of voting power within our P2P network.

If the number of participants in the system was fixed and their identities were authenticated and verified (i.e. a trusted network), then both 2PC and Paxos would work really well. Alas, that is simply not the case in our ever changing stamp collector P2P network. Have we arrived at a dead end? Well, not quite…

One obvious solution to solve our dilemma is to eliminate the "distributed" part from the problem statement. Instead of building a P2P distributed system we could, instead, build a global registry of all stamp collectors that would record their account information, authenticate them and (try to) ensure that nobody is cheating by creating multiple identities, and most importantly, keep one shared copy of the ledger! Concretely, we could build a website where these trades can take place, and the website would then take care of ensuring the integrity and correct ordering of all transactions by recording them in its centralized database.

The above is a practical solution but, let's admit it, an unsatisfying one since it forces us to forfeit the peer-to-peer nature of our ledger system. It places all of the trust in a single centralized system and opens up an entirely new set of questions: what is the uptime, security and redundancy of the system; who maintains the system and what are their incentives; who has administrative access, and so on. Centralization brings its own set of challenges.

Let's rewind and revisit some of the problems we've encountered with our P2P design:

  • Ensuring that every participant is always up to date (strongly consistent system) imposes high coordination costs and affects availability: if a single peer is unreachable the system cannot commit new transactions.
  • In practice we don't know the global status of the P2P network: number of participants, whether individuals are temporarily offline or decided to leave the network, etc.
  • Assuming we can resolve the above constraints, the system is still open to a Sybil attack where a malicious user can create many fake identities and exercise unfair voting power.

Unfortunately, resolving all of the above constraints is impossible unless we relax some of the requirements: the CAP theorem tells us that our distributed system can't have strong consistency, availability, and partition tolerance. As a result, in practice our P2P system must operate under the assumption of weak(er) consistency and deal with its implications:

  • We must accept that some ledgers will be out of sync (at least temporarily).
  • The system must eventually converge on a global ordering (linearizability) of all transactions.
  • The system must resolve ledger conflicts in a predictable manner.
  • The system must enforce global invariants - e.g. no double-spends.
  • The system should be secure against Sybil and similar attacks.

Protecting the network from Sybil attacks #

Achieving consensus in a distributed system, say by counting votes of each participant, opens up many questions about the "voting power" of each peer: who is allowed to participate, do certain peers have more voting power, is everyone equal, and how do we enforce these rules?

To keep things simple, let's say everyone's vote is equal. As a first step, we could require that each participant sign their vote with their private key, just as they would a transaction receipt, and circulate it to their peers - signing a vote ensures that someone else can't submit a vote on their behalf. Further, we could make a rule that only one vote is allowed to be submitted. If multiple votes are signed by the same key then all of them are discarded - make up your mind already! So far so good, and now the hard part…

How do we know if any particular peer is allowed to participate in the first place? If all that's needed is just a unique private key to sign a vote, then a malicious user could simply generate an unlimited number of new keys and flood the network. The root problem is that when forged identities are cheap to generate and use, any voting system is easily subverted.

To solve this problem we need to make the process of submitting the vote "expensive". Either the cost of generating a new identity must be raised, or the very process of submitting a vote must incur sufficiently high costs. To make this concrete, consider some real-world examples:

  • When you show up to vote in your local government election, you are asked to present an ID (e.g. a passport) that is (hopefully) expensive to fake. In theory, nothing stops you from generating multiple fake IDs, but if the costs are high enough (monetary costs of making a fake, risk of being caught, etc), then the cost of running a Sybil attack will outweigh its benefits.
  • Alternatively, imagine that you had to incur some other cost (e.g. pay a fee) to submit a vote. If the cost is high enough, then once again, the barrier to running a large-scale Sybil attack is increased.

Note that neither of the above examples "solves" the Sybil attack completely, but they also don't need to: as long as we raise the cost of the attack to be larger than the value gained by successfully subverting the system, then the system is secure and behaves as intended.

Note that we're using a loose definition of "secure". The system is still open to manipulation, and the exact vote count is affected, but the point is that a malicious participant can't sway the final outcome.

Proof-of-work as a participation requirement #

Any user can easily (and cheaply) generate a new "identity" in our P2P network by generating a new private-public keypair. Similarly, any user can sign a vote with their private key and send it into the P2P network - that's also cheap, as the abundance of spam email in our inboxes clearly illustrates. Hence, submitting new votes is cheap and a malicious user can easily flood the network with as many votes as they wish.

However, what if we made one of the steps above expensive such that you had to expend significantly more effort, time, or money? That's the core idea behind requiring a proof-of-work:

  1. The proof-of-work step should be "expensive" for the sender.
  2. The proof-of-work step should be "cheap" to verify by everyone else.

There are many possible implementations of such a method, but for our purposes we can re-use the properties provided by the cryptographic hash functions we encountered earlier:

  1. It is easy to compute the hash value for any given message.
  2. It is expensive to generate a message that has a given hash value.

We can impose a new rule in our system requiring that every signed vote must have a hash value that begins with a particular substring - i.e. require a partial hash collision of, say, a two-zero prefix. If this seems completely arbitrary, that's because it is - stay with me. Let's walk through the steps to see how this works:

  1. Let's say a valid vote statement is a simple string: "I vote for Bob".
  2. We can use the same SHA-256 algorithm to generate a hash value for our vote.
    • sha256("I vote for Bob") → b28bfa35bcd071a321589fb3a95cac...
  3. The resulting hash value is invalid because it does not start with our required substring of two zeros.
  4. We modify the vote statement by appending an arbitrary string and try again:
    • sha256("I vote for Bob - hash attempt #2") → 7305f4c1b1e7...
  5. The resulting hash value does not satisfy our condition either. We update the value and try again, and again, and… 155 attempts later we finally get:
    • sha256("I vote for Bob - hash attempt #155") → 008d08b8fe...

The critical property of the above workflow is that the output of the cryptographic hash function (SHA-256 in this case) is completely different every time we modify the input: the hash value of the previous attempt tells us nothing about the hash value of the next attempt when we increment our counter - i.e. each attempt is effectively an independent random draw. As a result, generating a valid vote is not just a "hard problem", but one better described as a lottery where each attempt gives you a random output. Also, we can adjust the odds of the lottery by changing the length of the required prefix:
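Here's a minimal Ruby sketch of this lottery, assuming the same two-zero prefix rule; the exact attempt count and hash values will of course differ from the numbers above:

require 'digest'

# Proof-of-work: append a counter to the statement until the SHA-256
# checksum starts with the required prefix.
def find_valid_vote(statement, prefix = "00")
  attempt = 0
  attempt += 1 until Digest::SHA256.hexdigest("#{statement} - hash attempt ##{attempt}").start_with?(prefix)
  ["#{statement} - hash attempt ##{attempt}", attempt]
end

vote, attempts = find_valid_vote("I vote for Bob")
puts "#{vote} (found after #{attempts} attempts)"       # expensive for the sender...

# ...but trivially cheap for everyone else to verify: a single hash.
puts Digest::SHA256.hexdigest(vote).start_with?("00")   # => true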

  1. Each character of the SHA-256 checksum has 16 possible values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f.
  2. In order to generate a hash with a valid two zero prefix the sender will need 256 (16^2) attempts on average.
  3. Bumping the requirement to 5 zeros will require more than 1,000,000 (16^5) attempts on average… Point being, we can easily increase the cost and make the sender spend more CPU cycles to find a valid hash.
How many SHA256 checksums can we compute on a modern CPU? The cost depends on the size of the message, CPU architecture, and other variables. If you're curious, open up your console and run a benchmark: $> openssl speed sha.
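If you'd rather stay in Ruby, here is a rough and unscientific equivalent; the message size and iteration count are arbitrary:

require 'digest'
require 'benchmark'

# Rough estimate of SHA-256 throughput for short messages.
message = "a" * 64
n = 200_000
seconds = Benchmark.realtime { n.times { |i| Digest::SHA256.hexdigest("#{message}#{i}") } }
puts "~#{(n / seconds).round} SHA-256 hashes per second"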

The net result is that generating a valid vote is "expensive" for the sender, but still trivial to verify for the receiver: the receiver hashes the transaction (one operation) and verifies that the checksum contains the required hash collision prefix... Great, so how is this useful for our P2P system? The above proof-of-work mechanism allows us to adjust the cost of submitting a vote such that the total cost of subverting the system (i.e. spoofing enough valid votes to guarantee a certain outcome) is higher than the value gained by attacking the system.

Note that the "high cost to generate a message" is a useful property in many other contexts. For example, email spam works precisely because it is incredibly cheap to generate a message. If we could raise the cost of sending an email message - say, by requiring a proof-of-work signature - then we could break the spam business model by raising costs to be higher than profits.

Building the minimum viable block chain #

We've covered a lot of ground. Before we discuss how the block chain can help us build a secure distributed ledger, let's quickly recap the setup, the properties, and the unsolved challenges within our network:

  1. Alice and Bob complete a transaction and record it in their respective ledgers.
    1. Once done, Bob has a PKI-protected IOU from Alice.
  2. Bob completes a transaction with John where he transfers Alice's IOU to John. Both Bob and John update their ledgers, but Alice doesn't know about the transaction… yet.
    1. Happy scenario: John asks Alice to redeem his new IOU; Alice verifies his transaction by fetching Bob's public key; if the transaction is valid she pays John the required amount.
    2. Not so happy scenario: Bob uses his old ledger that omits his transaction with John to create a double-spend transaction with Katy. Next, both Katy and John show up at Alice's doorstep and realize that only one of them will get paid.

The double-spend is possible due to the "weak consistency" of the distributed ledger: neither Alice nor Katy know about John and Bob's transaction, which allows Bob to exploit this inconsistency to his advantage. Solution? If the network is small and all participants are known, we can require that each transaction must be "accepted" by the network before it is deemed valid:

  • Unanimous consensus: whenever a transaction takes place the two parties contact all other participants, tell them about the transaction, and then wait for their "OK" before they commit the transaction. As a result, all of the ledgers are updated simultaneously and double-spend is no longer possible.
  • Quorum consensus: to improve processing speed and availability of the network (i.e. if someone is offline, we can still process the transaction) we can relax the above condition of unanimous consensus to quorum consensus (e.g. more than 50% of the network) - a small sketch of this check follows below.
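Here's a minimal Ruby sketch of that quorum check, assuming a hypothetical peers list whose members respond to approve?(transaction). Both the peer list and the API are illustrative and, crucially, presume we know every participant up front:

# Ask every known peer to confirm the transaction and commit only if a
# majority approves. Only works when the full peer list is known.
def quorum_reached?(transaction, peers)
  approvals = peers.count { |peer| peer.approve?(transaction) }
  approvals > peers.size / 2
end

# commit(transaction) if quorum_reached?(transaction, peers)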

Either of the above strategies would solve our immediate problem for a small network of known and verified participants. However, neither strategy scales to a larger, dynamic network where neither the total number of participants nor their identities are known at any point in time:

  1. We don't know how many people to contact to get their approval.
  2. We don't know whom to contact to get their approval.
  3. We don't know whom we are calling.
Note that we can use any means of communication to satisfy the above workflow: in person, internet, avian carriers, etc!

Lacking identity and global knowledge of all the participants in the network we have to relax our constraints. While we can't guarantee that any particular transaction is valid, that doesn't stop us from making a statement about the probability of a transaction being accepted as valid:

  • Zero-confirmation transaction: we can accept a transaction without contacting any other participants. This places full trust in the integrity of the payer of the transaction - i.e. that they won't double-spend.
  • N-confirmation transaction: we can contact some subset of the (known) participants in the network and get them to verify our transaction. The more peers we contact, the higher the probability that we will catch malicious parties attempting to defraud us.

What is a good value for "N"? The answer depends on the amount being transferred and on your trust in and relationship with the other party. If the amount is small, you may be willing to accept a higher level of risk, or you may adjust your risk tolerance based on what you know about the other party. If the amount is large, you will have to do some extra work and contact other participants to validate your transaction. In either case, there is a tradeoff between the speed with which the transaction is processed (zero-confirmation is instant), the extra work, and the risk of the transaction being invalid.

So far, so good. Except, there is an additional complication that we must consider: our system relies on transaction confirmations from other peers, but nothing stops a malicious user from generating as many fake identities as needed (recall that an "identity" in our system is simply a public-private keypair, which is trivial to generate) to satisfy Katy's acceptance criteria.

Whether Bob decides to execute the attack is a matter of simple economics: if the gain is higher than the cost then he should consider running the attack. Conversely, if Katy can make the cost of running the attack higher than the value of the transaction, then she should be safe (unless Bob has a personal vendetta and/or is willing to lose money on the transaction... but that's out of scope). To make it concrete, let's assume the following:

  • Bob is transferring 10 chroms to Katy.
  • The cost of generating a fake identity and transaction response is 0.001 chroms: energy costs to keep the computer running, paying for internet connectivity, etc.

If Katy asks for 10,001 confirmations, then it no longer makes (economic) sense for Bob to run the attack. Alternatively, we could add a proof-of-work requirement for each confirmation and raise the cost of each valid response from 0.001 chroms to 1 chrom: finding a valid hash will take CPU time, which translates to a higher energy bill. As a result, Katy would only need to ask for 11 confirmations to get the same guarantee.
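A quick back-of-the-envelope check of those figures in Ruby, working in milli-chroms to keep the arithmetic exact; the costs are the hypothetical ones from above, not real measurements:

value            = 10_000   # transaction value: 10 chroms, in milli-chroms
cost_without_pow = 1        # 0.001 chroms per faked confirmation
cost_with_pow    = 1_000    # 1 chrom per faked confirmation once proof-of-work is required

# Smallest number of confirmations N such that faking all of them costs
# more than the transaction is worth.
break_even = ->(cost) { value / cost + 1 }
puts break_even.call(cost_without_pow)  # => 10001
puts break_even.call(cost_with_pow)     # => 11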

Note that Katy also incurs some costs while requesting each confirmation: she has to expend effort to send out the requests and then validate the responses. Further, if the cost of generating a confirmation and verifying it is one-to-one, then Katy will incur the same total cost to verify the transaction as its value… which, of course, makes no economic sense.

This is why the asymmetry of proof-of-work is critical. Katy incurs low cost to send out requests and validate the responses, but the party generating the confirmation needs to expend significantly more effort to generate a valid response.

Great, problem solved, right? Sort of... in the process we've created another economic dilemma. Our network now incurs a cost to validate each transaction that is of equal or higher value than the transaction itself. While this acts as an economic deterrent against malicious participants, why would any legitimate participant be willing to incur any costs for someone else? A rational participant simply wouldn't; it doesn't make sense. Doh.

Adding "blocks" & transaction fee incentives #

If participants in the network must incur a cost to validate each other's transactions, then we must provide an economic incentive for them to do so. In fact, at a minimum we need to offset their costs, because otherwise an "idle" participant (anyone who is not submitting their own transactions) would continue accruing costs on behalf of the network — that wouldn't work. There are also a couple of other problems we need to address:

  1. If the cost of verifying the transaction is equal to or higher than the value of the transaction itself (to deter malicious participants), then the total transaction value is net-zero, or negative! E.g. Bob transfers 10 chroms to Katy; Katy spends 10 chroms to compensate other peers to validate the transaction; Katy is sad.

  2. How does Katy pay for confirmations? If that's its own transaction then we have a recursive problem.

Let's start with the obvious: the transaction fee can't be as high as the value of the transaction itself. Of course, Katy doesn't have to spend the exact value to confirm the transaction (e.g. she can allocate half the value for confirmations), but then it becomes a question of margins: if the remaining margin (value of transaction - verification fees) is high enough, then there is still an incentive for fraud. Instead, ideally we would like to incur the lowest possible transaction fees and still provide a strong deterrent against malicious participants. Solution?

We can incentivize participants in the network to confirm transactions by allowing them to pool and confirm multiple transactions at once - i.e. confirm a "block" of transactions. Doing so would also allow them to aggregate transaction fees, thereby lowering the validation costs for each individual transaction.

A block is simply a collection (one or more) of valid transactions - think of it as the equivalent of a page in a physical ledger. In turn, each block contains a reference to a previous block (previous page) of transactions, and the full ledger is a linked sequence of blocks. Hence, block chain. Consider the example above:

  1. Alice and Bob generate new transactions and announce them to the network.
  2. Chris is listening for new transaction notifications, each of which contains a transaction fee that the sender is willing to pay to get it validated and confirmed by the network:
    1. Chris aggregates unconfirmed transactions until he has a direct financial incentive (sum of transaction fees > his cost) to perform the necessary work to validate the pending transactions.
    2. Once over the threshold, Chris first validates each pending transaction by checking that none of the inputs are double-spent.
    3. Once all transactions are validated Chris adds an extra transaction to the pending list (indicated in green in the diagram above) that transfers the sum of advertised transaction fees to himself.
    4. Chris generates a block that contains the list of pending transactions, a reference to the previous block (such that we can traverse the blocks and see the full ledger), and performs the proof-of-work challenge to generate a block hash value that conforms to accepted rules of the network - e.g. partial hash collision of N leading zeros.
    5. Finally, once Chris finds a valid block, he distributes it to all other participants.
  3. Both Alice and Bob are listening for new block announcements and look for their transaction in the list:
    1. Alice and Bob verify integrity of the block - i.e. verify proof-of-work and contained transactions.
    2. If the block is valid and their transaction is in the list, then the transaction has been confirmed!
We made a big leap here. Previously we had only one type of record in our network - the signed transaction. Now we have signed transactions and blocks. The former is generated by the individuals engaging in trade, and the latter is generated by parties interested in collecting fees by validating and confirming transactions.
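To make the two record types concrete, here's a minimal Ruby sketch of a block: a bundle of transactions (represented here as plain strings), a reference to the previous block, and a nonce that is incremented until the block's checksum satisfies the proof-of-work rule. The structure and the two-zero prefix are illustrative, not a real protocol:

require 'digest'

Block = Struct.new(:prev_hash, :transactions, :nonce) do
  def checksum
    Digest::SHA256.hexdigest([prev_hash, transactions.join("|"), nonce].join("|"))
  end
end

# "Mining": bump the nonce until the checksum has the required prefix.
def mine(prev_hash, transactions, prefix = "00")
  block = Block.new(prev_hash, transactions, 0)
  block.nonce += 1 until block.checksum.start_with?(prefix)
  block
end

genesis    = mine("0" * 64, ["Alice pays Bob 5 chroms (fee: 0.1)"])
next_block = mine(genesis.checksum, ["Bob pays John 5 chroms (fee: 0.1)",
                                     "0.2 chroms in fees go to the block creator"])

puts next_block.prev_hash == genesis.checksum  # => true: the blocks form a chain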

Also, note that the above scheme requires some minimum volume of transactions in the system to sustain the incentives for the individuals creating the blocks: the more transactions there are, the lower the fee for any single transaction can be.

Phew, ok, Alice has announced a new transaction and received a valid block from Chris confirming it. That's one confirmation, what about the rest? Also, Chris is (hopefully) not the only participant who is incentivized to work on generating the blocks. What if someone else generates a different block at the same time, and which of those blocks is "valid"? This is where it gets interesting...

Racing to claim the transaction fees #

The remarkable part about introducing the ability to aggregate fees by verifying a block of transactions is that it creates a role for a new participant in the network who now has a direct financial incentive to secure it. You can now make a profit by validating transactions, and where there is profit to be made, competition follows, which only strengthens the network - a virtuous cycle and a clever piece of social engineering!

That said, the incentive to compete to validate transactions creates another interesting dilemma: how do we coordinate this block generation work in our distributed network? The short answer is, as you may have already guessed, we don't. Let's add some additional rules into our system and examine how they resolve this problem:

  1. Any number of participants is allowed to participate ("race") to create a valid block. There is no coordination. Instead, each interested participant listens for new transactions and decides whether and when they want to try to generate a valid block and claim the transaction fees.
  2. When a valid block is generated, it is immediately broadcast into the network.
    1. Other peers check the validity of the block (check each transaction and validity of the block itself), and if valid, add it to their ledgers and then finally rebroadcast it to other peers in the network.
    2. Once added, the new block becomes the "topmost block" of their ledger. As a result, if that same peer was also working on generating a block, then they need to abort their previous work and start over: they now need to update their reference to the latest block and also remove any transactions from their unconfirmed list that are contained in the latest block.
    3. Once above steps are complete, they start working on a new block, with the hope that they'll be the first ones to discover the next valid block, which would allow them to claim the transaction fees.
  3. … repeat above process until the heat death of the universe.

The lack of coordination between all the participants working on generating the blocks means there will be duplicate work in the network, and that's OK! While no single participant is guaranteed to claim any particular block, as long as the expected value (probability of claiming the block times the expected payout, minus the costs) of participating in the network is positive, then the system is self-sustaining.

Note that there is also no consensus amongst the peers on which transactions should be validated next. Each participant aggregates their own list and can use different strategies to optimize their expected payoff. Also, due to the random nature of our proof-of-work function (finding a partial hash collision for a SHA-256 checksum of the block), the only way to increase the probability of claiming a block is to expend more CPU cycles.

There is one more caveat that we need to deal with: it's possible that two peers will find a valid block at about the same time and begin propagating it through the network - e.g. Kent and Chris in the diagram above. As a result, some fraction of the network may end up accepting Kent's block as the topmost block, while the rest will take Chris's block. Now what?

Resolving chain conflicts #

Once again, we're going to take a hands-off approach and let the random nature of the block generation process resolve the conflict, albeit with one additional rule: if multiple chains are detected, the participants should immediately switch to and build on top of the longest chain. Let's see how this works in practice:

  1. Some peers will start building new blocks on top of Kent's block, others on top of Chris's block.
  2. At some point, someone will find a new block and begin propagating it through the network.
    1. When other peers receive the new block, the part of the network that was working with a different topmost block will detect that there is now a longer alternative chain, which means that they need to switch to it - e.g. in the above example, the peers who were working with Chris's block stop their work, drop Chris's block, and switch to the longer (Amy + Kent's) chain.
    2. Any transactions that are part of the discarded block but that are not yet confirmed are placed in the pending list and the process starts over.
It's possible that the race condition can persist for multiple blocks, but eventually, due to the random nature of the proof-of-work process, one branch will race ahead of the other and the rest of the network will converge on the same longest chain.
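A minimal Ruby sketch of that rule, assuming a chain is just an array of blocks and each block responds to transactions (both assumptions are illustrative):

# Keep whichever chain is longer; transactions confirmed only on the
# abandoned branch go back into the pending (unconfirmed) pool.
def resolve_fork(current_chain, incoming_chain, pending)
  return current_chain unless incoming_chain.length > current_chain.length

  still_confirmed = incoming_chain.flat_map(&:transactions)
  dropped         = (current_chain - incoming_chain).flat_map(&:transactions)
  pending.concat(dropped - still_confirmed)

  incoming_chain
end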

Great, we now have a strategy to resolve conflicts between different chains in the network. Specifically, the network promises linearizability of transactions by recording them in a linked list of blocks. But, crucially, it makes no promises about an individual block "guaranteeing" the state of any transaction. Consider the example above:

  • Alice sends out her transaction into the network.
  • Chris generates a valid block that confirms her transaction.

Except, there is a fork in the chain and Chris's block is later "removed" as the network converges on Kent's branch of the chain. As a result, even when Alice receives a block with her transaction, she can't be sure that this block won't be undone in the future!

Blocks are never "final" #

No block is "final", ever. Any block can be "undone" if a longer chain is detected. In practice, forks should be detected fairly quickly, but there is still always the possibility of an alternative chain. Instead, the only claim we can make is that the "deeper" any particular block is in the chain, the less likely it is that it will be undone. Consequently, no transaction can ever be treated as "final" either; we can only make statements about the probability of it being undone.

  1. 0-confirmation transaction: exchange is performed without waiting for any block to include the transaction.
  2. 1-confirmation transaction: latest valid block includes the transaction.
  3. N-confirmation transaction: there is a valid block that includes the transaction, and there are N-1 blocks that have since been built on top of that block - a small helper capturing these definitions follows below.
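These definitions map onto a tiny Ruby helper, assuming once again that a chain is an array of blocks, each responding to transactions (illustrative only):

def confirmations(chain, transaction)
  index = chain.index { |block| block.transactions.include?(transaction) }
  index ? chain.length - index : 0  # 0 means not yet included in any block
end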

If you are willing to accept the risk, you always have the option to go with a 0-confirmation transaction: no transaction fees, no need to wait for confirmations. However, you also place a lot of trust in the opposite party.

Alternatively, if you want to lower your risk, then you should wait for one or more blocks to be built on top of the block that includes your transaction. The longer you wait, the more blocks will be built on top of the block that contains your transaction, and the lower the probability of an alternative chain that may undo your transaction.

By "undo" we mean any scenario where one of the participants can make the network accept an alternative transaction transferring funds to any account other than yours - e.g. you complete the transaction, hand over the widget and get a receipt, but the attacker then injects a transaction that "double-spends" those same funds to an alternative account.

Why does the length of the block chain act as a good proxy for "safety" of a transaction? If an attacker wanted to undo a particular transaction, they would need to start from a block prior to the one where that transaction is recorded and then build an alternative chain of blocks that is longer than the one currently used by the network. As a result, the deeper the block, the more computational effort is required to replace it with an alternative chain. The longer the chain, the more expensive it is to execute an attack.

How many blocks should you wait for before accepting a transaction? There is no one number; the answer depends on the properties of the network (time to generate each block, propagation latency of the transactions and blocks, size of the network, etc), and on the transaction itself: its value, what you know about the other party, your risk profile, and so on.

Properties of the (minimum viable) block chain #

  1. Individual transactions are secured by PKI.
    • Transactions are authenticated: a malicious party can't masquerade as someone else and sign a transaction on their behalf.
      • Authentication is only with respect to the public-private keypair. There is no requirement for "strong authentication" that links the keypair to any other data about the participants. In fact, a single participant can generate and use multiple keypairs! In this sense, the network allows anonymous transactions.
    • Non-repudiation: participants can't claim that the transaction did not happen after the fact.
    • Integrity: transactions can't be modified after the fact.
  2. Once created, transactions are broadcast into the P2P network.
    • Participants form a network where transactions and blocks are relayed amongst all the participating peers. There is no central authority.
  3. One or more transactions are aggregated into a "block".

    • A block validates one or more transactions and claims the transaction fees.
      • This allows the transaction fees to remain small relative to the value of each transaction.
    • A valid block must have a valid proof-of-work solution.
      • Valid proof-of-work output is hard to generate and cheap to verify.
      • Proof-of-work is used to raise the cost of generating a valid block to impose a higher cost on running an attack against the network.
    • Any peer is allowed to work on generating a valid block, and once a valid block is generated, it is broadcast into the network.
      • Any number of peers can compete to generate a valid block. There is no coordination. When a fork is detected, it is resolved by automatically switching to the longest chain.
    • Each block contains a link to a previous valid block, allowing us to traverse the full history of all recorded transactions in the network.
  4. Peers listen for new block announcements and merge them into their ledgers.

    • Inclusion of the transaction in a block acts as a "confirmation" of that transaction, but that fact alone does not "finalize" any transaction. Instead, we rely on the length of the chain as a proxy for "safety" of the transaction. Each participant can choose their own level of risk tolerance, ranging from 0-confirmation transactions to waiting for any arbitrary number of blocks.

The combination of all of the above rules and infrastructure provides a decentralized, peer-to-peer block chain for achieving distributed consensus of ordering of signed transactions. That's a mouthful, I know, but it's also an ingenious solution to a very hard problem. The individual pieces of the block-chain (accounting, cryptography, networking, proof-of-work), are not new, but the emergent properties of the system when all of them are combined are pretty remarkable.

Episode #461 - May 2nd, 2014

Posted 4 months back at Ruby5

Test-induced design damage, Frocking the console, Advanced Rake and Mutation Testing

Listen to this episode on Ruby5

Sponsored by New Relic

New Relic

Test-induced design damage

We are still feeling the shockwaves from DHH's keynote at RailsConf, where he denounced TDD and suggested we focus more on writing clear code.
Test-induced design damage

priscilla

priscilla frocks up your console by decorating your text in Colored Strings or unicode emojis
priscilla

Mutation Testing with Mutant

Arne Brasseur posted an article about using Mutant to do mutation testing
Mutation Testing with Mutant

worldcup2014.db

Gerald Bauer has updated and released the sportdb gem with a SQLite database of the World Cup qualifiers.
worldcup2014.db

Thinking in Types

Posted 4 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Here at thoughtbot, there has been a lot of interest in Functional Programming in general and Haskell in particular. As someone with a bit of Haskell knowledge, I’ve been happy to field a lot of questions and take part in many discussions.

One such question came about when a colleague of mine was stuck attempting to do something in Haskell that seemed conceptually simple but resulted in a type error. After talking through the issue and coming up with a solution, I realized that this little exercise was actually quite interesting.

While a particular form of polymorphism, common in object-oriented languages, translated very well to Haskell, a related technique was not permitted by the language’s type system. In this post, I’d like to outline a simplified version of the task and how this limitation was overcome. Hopefully you’ll see that by working with the type system rather than against it, we find unexpected benefits in the resulting design.

The Task

The system my colleague was building models a Pong-like game. There are two players and a ball, all of which need to be rendered on screen. We’re going to avoid any discussion of the graphics library used to accomplish this and focus only on the task of flexibly sharing this behavior across the two kinds of objects we have in our domain. This simplification should allow the Haskell examples to make sense to someone not familiar with the language.

This task is well-suited to polymorphism with an object-oriented slant. We could define some Renderable interface and have our two object types (presumably Ball and Player) conform to that interface. This way, we can simply tell an object, whatever it is, to render itself.

In Ruby, we can do this easily with duck typing; there is no safety but it is concise and flexible. In Java, we could define an actual Interface and get some form of type checking. Things would be a bit safer but also very verbose.
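For reference, the duck-typed Ruby version might look something like this sketch: anything responding to render can be drawn, and nothing checks that it actually does until runtime:

# Duck typing: no shared interface declaration, just a shared method name.
class Ball
  def render
    puts "drawing the ball"
  end
end

class Player
  def render
    puts "drawing a player"
  end
end

[Ball.new, Player.new, Player.new].each(&:render)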

This approach translates well to Haskell thanks to a feature called type classes. They provide a concise, flexible, and completely type safe way to define interfaces which multiple types can implement. In order to understand the problem I want to discuss, we’ll need to digress just a little and explain what these are and how they work.

Type Classes

A type class defines a set of functions which must be implemented for a type to be considered in that type class. Other functions can then be written which operate not on one specific type, but on any type which is in its given class constraint.

An example should help.

For a type to be Showable or in the Show type class, we have to define how to represent it as a String. Here is how the desired interface is described:

class Show a where
    show :: a -> String

It’s common to represent “any type” with the variable a. The above declaration states that for some type a to be in the Show class, you must define the function show which takes that type as argument and returns a String.

Imagine we declare a type representing people by their age and name:

data Person = Person Int String

A value of type Person can be built by giving an Int (their age) and a String (their name) to the constructor also called Person. It’s common to use the same word for the type and the constructor since there is no ambiguity when either is used. It would’ve also been fine to say:

data Person = BuildPerson Int String

But this is uncommon in cases where there is only one constructor.

Now that we have all of these people in our domain, we can specify how to represent them as Strings by implementing an instance of Show for Person:

instance Show Person where
    show (Person age name) = name ++ ", " ++ show age ++ " years old"

The left hand side uses pattern matching to take a value of type Person and bind the individual components to the variables age and name which we can then use in the function body.

It’s now possible for our type to be used by functions which care only that they’re given a type with an instance for Show. One such function is print. Its type signature specifies a class constraint to the left of the => allowing the use of show in the function body.

print :: Show a => a -> IO ()
print x = putStrLn (show x)

We could use all of this in a Haskell program like so:

main = print (Person 29 "Pat")

As you might have guessed, this program would print Pat, 29 years old to your terminal.

Renderable

With all that fresh Haskell knowledge, we can now speak to the actual problem at hand.

Using Show as an example, we can define our own type class:

class Render a where
    render :: a -> IO ()

This states that any type a can be made renderable by defining the function render which takes that type as argument and does whatever is required to render it on screen. Much like print, this would be an IO action. The details of IO in Haskell and the actual meaning of IO () are very interesting topics, but well outside the scope of this post.

Presuming we have the types Ball and Player in our system, we can make an instance of Render for each.

data Ball = Ball

instance Render Ball where
    render ball = do
        -- whatever

data Player = Player

instance Render Player where
    render player = do
        -- whatever

By implementing an instance for both types currently in our domain, we can now write functions which don’t care if they’re given a Ball or Player, they simply need something that can be rendered.

renderAll :: Render a => [a] -> IO ()
renderAll rs = mapM_ render rs

We need to use mapM_ here in place of the map you might expect because of the IO involved. This detail can be safely ignored, but I direct any curious readers to the docs.

Let’s test this thing out.

main = do
    -- ...

    renderAll [ball, playerOne, playerTwo]
/home/patrick/test.hs:4:22:
    Couldn't match expected type `Ball' with actual type `Player'
    In the expression: playerOne
    In the first argument of `renderAll', namely `[ball, playerOne, playerTwo]'
    In the expression: renderAll [ball, playerOne, playerTwo]

Uh-Oh.

The Problem

The reason this is an error is because Haskell does not allow heterogeneous lists. In other words, all elements in a list must be of the same type. We might consider this list of game elements as homogeneous because they are all renderable. In a language like Ruby or Java, this code would work fine. In Haskell, we have a much stricter type system and even though these values share an interface, they are not the same type.

This is where my colleague got stuck and asked me the question which sparked this blog post. It is a very common question for those learning a new language to ask those that already know it:

How do I do X in language Y?

In this case, X was “heterogeneous lists” and Y was “Haskell”.

I gave him what is probably the most common response to such questions:

Why on earth do you want to do X?

In actuality, there are a few approaches for doing heterogeneous lists in Haskell. However, I don’t think this is a good use case. We can get around this problem in a cleaner and safer way by using the type system rather than subverting it.

Thinking in Types

A list of a specific type is suitably descriptive, you have many Foos. However, if you find yourself attempting to put objects of different types into a single list you lose any descriptiveness. Even if the language allowed it, all you’d have is a blob of generic things with no sense of what those things represent when taken together.

As one solution to this problem, I would define a type to play the role of what is currently that nondescript list. This adds shape and safety to our program:

data Game = Game Ball Player Player

A Game consists of a Ball and two Players. As a trade-off, we could’ve also specified a Game as a Ball and a list of Players. This would be more flexible, but less safe since it would make it possible to create a game with the wrong number of players. Haskell’s type system shines best when you encode as many invariants as possible within the types themselves. It’s impossible to run a Haskell program with a type error, so it follows that any logic encoded in those types is guaranteed correct. This is why we hear that Haskell reprise: if it compiles, it works.

While any technique that reduces the possibility of certain bugs is a very real and tangible benefit, moving from a generic list to a more structured type has many other advantages. The code is easier to read and understand when you group data in a structured type rather than an opaque list. We can also extend our system by adding behavior around this data type specifically. Those behaviors will also be easier to understand because their arguments or return values will be of a more structured and well-understood type.

Composite

The next change motivated by this type is turning renderAll into renderGame –but wait– that could just be render:

instance Render Game where
    render (Game b p1 p2) = do
        render b
        render p1
        render p2

main = do
    -- ...

    render game

What we have now is a form of the Composite pattern: a compound value indistinguishable from its individual parts (at least when it comes to rendering). By using type class polymorphism and this composite data type, our system more closely follows Open/Closed and does its best to minimize the impact of the Expression Problem. Extending our game can occur in a number of flexible ways.

For example, we could extend the Game type and its instance with another record of some renderable type:

data Game = Game Ball Player Player Score

instance Render Game where
    render (Game b p1 p2 s) = do
        render b
        render p1
        render p2
        render s

If we’re careful in our module exports, a change like this can be done in a backward-compatible way. Such an approach is outlined here.

If for whatever reason we cannot change Game, we may choose to extend our Composite another layer:

data ExtendedGame = ExtendedGame Game Score

instance Render ExtendedGame where
    render (ExtendedGame g s) = do
        render g
        render s

Hopefully this post has shown that a strong static type system is not some evil thing to be avoided. It can help guide you in designing abstractions and ensure you don’t make mistakes. Languages like Haskell embrace this idea and make it easy to utilize to great effect. I hope you found it as interesting as I did to see how a small shift in thinking can not only work around a “limitation” but also improve the code you write.

What’s next?

  • Interested in learning some Haskell? Start with Learn you a Haskell.
  • Want to learn more about type classes in particular? Check out this talk. Type classes come in around minute 40.

The problem with using fixtures in Rails

Posted 4 months back at interblah.net - Home

I’ve been trying to largely ignore the recent TDD discussion prompted by DHH’s RailsConf 2014 keynote. I think I can understand where he’s coming from, and my only real concern with not sharing his point of view is that it makes it less likely that Rails will be steered in a direction which makes TDD easier. But that’s OK, and where my concern grows, I have opportunities to propose improvements.

I don’t even really mind what he’s said in his latest post about unit testing the Basecamp codebase. There are a lot of Rails applications – including ones I’ve written – where a four-minute test suite would’ve been a huge triumph.

I could make some pithy argument like:

Sorry, I couldn't resist

… but let’s be honest, four minutes for a substantial and mature codebase is pretty excellent in the Rails world.

So that is actually pretty cool.

Using fixtures

A lot of that speed is no doubt because Basecamp is using fixtures: test data that is loaded once at the start of the test run, and then reset by wrapping each test in a transaction and rolling back before starting the next test.

This can be a benefit because the alternative – assuming that you want to get some data into the database before your test runs – is to insert all the data required for each test, which can potentially involve a large tree of related models. Doing this hundreds or thousands of times will definitely slow your test suite down.

(Note that for the purposes of my point below, I’m deliberately not considering the option of not hitting the database at all. In reality, that’s what I’d do, but let’s just imagine that it wasn’t an option for a second, yeah? OK, great.)

So, fixtures will probably make the whole test suite faster. Sounds good, right?

The problem with fixtures

I feel like this is glossing over the real problem with fixtures: unless you are using independent fixtures for each test, your shared fixtures have coupled your tests together. Since I’m pretty sure that nobody is actually using independent fixtures for every test, I am going to go out on a limb and just state:

Fixtures have coupled your tests together.

This isn’t a new insight. This is pain that I’ve felt acutely in the past, and was my primary motivation for leaving fixtures behind.

Say you use the same ‘user’ fixture between two tests in different parts of your test suite. Modifying that fixture to respond to a change in one test can now potentially cause the other test to fail, if the assumptions either test was making about its test data are no longer true (e.g. the user should not be admin, the user should only have a short bio, or so on).
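To make that concrete, here is a hypothetical fixture and two tests that both lean on it; the fixture fields and model methods are made up purely for illustration:

# test/fixtures/users.yml (hypothetical)
#   bob:
#     name: Bob
#     admin: false
#     bio: "Short bio"

class AdminAccessTest < ActiveSupport::TestCase
  test "regular users cannot access the admin panel" do
    refute users(:bob).can_access_admin?                  # assumes bob is NOT an admin
  end
end

class ProfileTest < ActiveSupport::TestCase
  test "short bios are rendered without truncation" do
    assert_equal "Short bio", users(:bob).truncated_bio   # assumes bob's bio stays short
  end
end

# Flip `admin: true` or lengthen the bio to satisfy some other test, and one
# of these can silently start failing.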

If you use fixtures and share them between tests, you’re putting the burden of managing this coupling on yourself or the rest of your development team.

Going back to DHH’s post:

Why on earth would you run your entire test harness for every single line change in a particular model? If you have so little confidence in the locality of your changes, the tests are indeed telling you that the system has overly high coupling.

What fixtures do is introduce overly high coupling in the test suite itself. If you make any change to your fixtures, I do not think it’s possible to be confident that you haven’t broken a single test unless you run the whole suite again.

Fixtures separate the reason test data is like it is from the tests themselves, rather than keeping them close together.

I might be wrong

Now perhaps I have only been exposed to problematic fixtures, and there are techniques for reliably avoiding this coupling or at least managing it better. If that’s the case, then I’d really love to hear more about them.

Or, perhaps the pain of managing fixture coupling is objectively less than the pain of changing the way you write software to both avoid fixtures AND avoid slowing down the test suite by inserting data into the database thousands of times?

That’s certainly possible. I am skeptical though.

Docker-friendly Vagrant boxes 2014-04-30 released

Posted 4 months back at Phusion Corporate Blog

Vagrant

We provide Vagrant base boxes that are based on Ubuntu 14.04 and 12.04, 64-bit. These boxes are specifically customized to work well with Docker. Please learn more at the website

The changes in version 2014-04-30 are:

  • The Ubuntu 12.04 VirtualBox box in release 2014-02-22 was broken: the VirtualBox guest additions weren’t correctly installed because the kernel was incorrectly installed. This has now been fixed.
  • The Ubuntu 12.04 VMWare Fusion box now loads the VMWare Tools kernel modules during startup, so that Vagrant doesn’t have to wait so long at the “Waiting for HGFS kernel module” phase.
  • No changes in the Ubuntu 14.04 boxes.

Related resources: Github | Prebuilt boxes | Vagrant Cloud | Discussion forum | Twitter

Upgrade instructions for Vagrant >= 1.5 with Vagrant Cloud

Run:

vagrant box outdated

Upgrade instructions for Vagrant <= 1.4, or Vagrant >= 1.5 without Vagrant Cloud

Destroy your Vagrant VM:

vagrant destroy

Remove your existing base box:

# Vagrant >= 1.5
vagrant box remove phusion-open-ubuntu-12.04-amd64 --provider virtualbox
vagrant box remove phusion-open-ubuntu-12.04-amd64 --provider vmware_fusion

# Vagrant <= 1.4
vagrant box remove phusion-open-ubuntu-12.04-amd64 virtualbox
vagrant box remove phusion-open-ubuntu-12.04-amd64 vmware_fusion

Start your VM again. Vagrant will automatically download the latest version of the box.

vagrant up

Episode #460 - April 29th, 2014

Posted 4 months back at Ruby5

Configuring variants with Various, what to learn with What's Next, fixing Celluloid with RailsReloader, improving communication with Capistrano Team Notifications, integrating AngularJS with Rails, and getting more from Postgres with PG Power.

Listen to this episode on Ruby5

Sponsored by Pull Review

You want to ship it right instead of doing it again. But what could you clean while still meeting the deadline? PullReview reviews the Ruby code you just wrote and tells you what's wrong, why, and how to fix it - from style to security. Code, Spot, Fix, and Ship!
This episode is sponsored by Pull Review

Various

Various, by Zachary Friedman, is a gem that allows you to easily configure ActionPack::Variants for your Rails apps. The gem allows you to easily configure a mapping of user agent regular expressions which will set the request.variant automatically, based on whether or not there was a match from the User Agent.
Various

What's Next

Matthieu Tanguay-Carel wrote to us about What’s Next, a new micro-site designed to help programmers figure out what to learn at various stages in the learning process. It's got links for beginners and experts, as well as interview questions and a ranking of GitHub repositories.
What's Next

RailsReloader

Brandon Hilkert wrote a blog post on how he used the Rails Reloader to fix an issue on his library, SuckerPunch. He describes diving into Rails' source code, starting at the Reloader middleware. By following the comments found at the top of the Reloader class, he was able to find the solution.
RailsReloader

Capistrano Team Notifications

Alexander Balashov has created a gem which helps track Capistrano deploys via Space notifications and OSX Notification Center. The gem is called capistrano-team_notifications, and it allows everyone on your team to be notified any time a capistrano deploy occurs.
Capistrano Team Notifications

Rails & AngularJS

Sébastien Saunier wrote a blog post on how to integrate AngularJS services into an existing Rails app. He covers integrating Angular with Rails, setting up Karma and Jasmine for testing, and making sure they all play nice with the Rails asset pipeline.
Rails & AngularJS

PG Power

Stan Carver wrote to us about PG Power, an ActiveRecord extension that helps to get more from PostgreSQL, like creating and dropping schemas, managing comments, and the ability to add foreign keys.
PG Power

TopRubyJobs

Red Digital Cinema is looking for a Sr. Ruby Software Engineer in Irvine, CA and Houston, TX. EquityMetrix is looking for an Application Developer in Dallas, TX and Science Exchange is looking for a Senior Ruby on Rails Engineer in Palo Alto, CA. If you’re looking for a Top Ruby gig or top ruby talent, head over to TopRubyJobs.com
TopRubyJobs

Thank You for Listening to Ruby5

Ruby5 is released Tuesday and Friday mornings. To stay informed about and active with this podcast, we encourage you to do one of the following:

Thank You for Listening to Ruby5

Announcing the Software Apprenticeship Podcast

Posted 4 months back at Jake Scruggs

In the summer of 2004 I did an apprenticeship of sorts at a place called Object Mentor.  At the time “Uncle” Bob Martin and his son Micah Martin were in leadership positions at the company and I somehow convinced them to let me work for free over the summer in exchange for teaching me software development. It wasn’t a very structured program, nothing like what Micah would later put together for 8th Light, but I was a pretty motivated learner.  I also had the advantage of coming from a teaching background so I knew how to learn.

All this has been covered in daily detail, if you'd like to read more.

After ten years of software experience I’m becoming a mentor to an apprentice and documenting the experience via podcast.  Backstop Solutions has graciously allowed me to pay our apprentice (the same rate we pay interns) as he is doing real work on a daily basis in addition to outside learning experiences.  From 9-5 he will be working on production code with 100% supervision as he will always be pairing with an experienced developer.  It’s a six-month program with two-month check-ins.

The apprentice, Jonathan Howden, knows that should he fail to meet expectations we may end our relationship at 2, 4, or 6 months.  This is a bit scary, but if Backstop is going to be taking a chance on an un-credentialed employee we, as a company, need to be able to mitigate the risk of such a person polluting our codebase.  It is therefore our responsibility to provide constant feedback to the apprentice so that he will know exactly what is needed to succeed.  So far he’s been doing great: He just finished his 3rd week of apprenticeship so we just recorded our third podcast and will be on a weekly schedule. Assuming he survives the six-month apprenticeship, Jonathan will be offered a full-time job at a damn good starting salary.  Interested in such a job right now? Check out https://careers.backstopsolutions.com/

In the first episode, Jonathan talks quite a bit about Dev Bootcamp (DBC).  I’ve known, worked with, and read the book of one of the founders so it seemed natural to reach out to Dave Hoover and DBC to help Backstop find its first apprentice.  We asked their “fairy job mother” to spread the word that we were looking for apprentices and ten applied.  They were all given coding homework challenges which were evaluated code review style with the whole InvestorBridge team allowed to attend.  We judged three submissions good enough to warrant an in-person interview.  Jonathan made it through this gauntlet and was rewarded with a brand new, much longer, gauntlet.  Of learning.  Look, there's no way not to make this sound hokey as we're trying to do a good thing here.

On a weekly basis I hope to capture what an apprenticeship is from the inside and perhaps provide some value to proto-developers and prospective mentor companies who may be wondering what this “apprenticeship” business is all about.  Changing careers is, like it or not, the future.  I did it in 2004 and I hope Jonathan will too in 2014.

Software Apprenticeship Podcast:
iTunes Page: https://itunes.apple.com/us/podcast/software-apprenticeship-podcast/id868371146?mt=2
RSS feed: http://softwareapprenticeship.libsyn.com/rss


Styling a Middleman Blog with Bourbon, Neat, and Bitters

Posted 4 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

This is a walk-through for styling your own static blog from scratch. We will be using Middleman for our framework and Bourbon, Neat, and Bitters for our Sass libraries.

Middleman is a lightweight static site framework built using Ruby. It compiles Markdown into HTML and is easily deployed to S3, Heroku, or GitHub Pages.

Middleman can also host a blog nicely like this blog.

The following steps will include instructions for installation, setup, deployment, and adding some very basic styles.

Installation

First, we install the Middleman gem and the blog extension gem. We can then initialize the project:

$ gem install middleman-blog
$ middleman init my_blog --template=blog --rack

Let’s start the server and take a look at what we’ve got:

$ cd my_blog
$ bundle exec middleman
$ open http://localhost:4567

You should see something like this in your browser:

Craigslist style

Middleman gives us a default post and just the default styles, which is a great place to start.

Configuration

We’ll be doing a lot of browser refreshing to see our progress, so let’s automate the process. We can use the built-in Livereload service to auto-refresh the page whenever we save. Let’s add the gem:

Gemfile

...
group :development do
  gem 'middleman-livereload'
end

And enable the service by uncommenting the configuration line:

config.rb

...
configure :development do
  activate :livereload
end

Livereload will begin working as soon as you run bundle and restart your server.

$ ctrl + c //shuts down Middleman server
$ bundle
$ bundle exec middleman

Prepare to Deploy

Our blog won’t run on Heroku as-is, but we only need to do a few things to change that.

First, we need to add some code to config.rb that will tell Middleman to put the files Heroku needs in a folder named tmp:

...
set :build_dir, 'tmp'
...

Next, we will create a file that tells Heroku how to build our source files. We will create a file named Rakefile in the root directory of our project and add the code below:

Rakefile

namespace :assets do
  task :precompile do
    sh "middleman build"
  end
end

We also need a config.ru file so that Rack can serve the static site that Middleman builds into the tmp folder:

config.ru

require 'rack/contrib/try_static'

use Rack::Deflater
use Rack::TryStatic,
  root: 'tmp',
  urls: %w[/],
  try: %w[.html index.html /index.html]

FIVE_MINUTES=300

run lambda { |env|
  [
    404,
    {
      'Content-Type'  => 'text/html',
      'Cache-Control' => "public, max-age=#{FIVE_MINUTES}"
    },
    ['File not found']
  ]
}

We’ll also need to include the rack-contrib gem in our Gemfile. Be sure to bundle and restart your server after this step.

Gemfile

...
gem 'rack-contrib'
...

The final step is initializing Git, which we will do next.

Initialize Git

To be able to track our changes and push our blog to Heroku, we need to initialize a Git repo.

$ git init
$ git add .
$ git commit -m 'initial commit'

To commit changes as we continue to work, we’ll run git add . to track new files and stage changes to files already being tracked. Then we can run git commit -m 'your commit message' and we’ll be able to push our latest changes to the remote.

It’s a good idea to commit changes to git at the end of each section of this post.

We’ll be using Heroku as a remote repository.

It would be beneficial to also set up a remote like GitHub for remote version tracking, but it’s not necessary.

We’ll just focus on Heroku for now. If you don’t already have a Heroku account, you’ll need to sign up for one before running heroku create.

$ heroku create
$ git push heroku master

And now, we can run heroku open in the terminal to open the page in our browser. We’ve just created a blog and pushed it live. Next is to add our own styles to customize the design.

Install Libraries

Our goal is to add a custom design to this blog, so let’s install our Sass toolkit and bundle:

Gemfile

gem 'bitters'
gem 'bourbon'
gem 'neat'

Bourbon is a library of vanilla Sass mixins, Neat gives us a responsive grid system, and Bitters sets default styles for Bourbon projects. These gems will make the assets available to our site through Middleman’s asset pipeline.

Since we’ve updated our Gemfile, we’ll need to bundle and restart our server again.

Bourbon and Neat are included by the gem, but Bitters requires an additional install in your stylesheets folder:

$ cd source/stylesheets
$ bitters install
Bitters files installed to /bitters

Next, we need to create a stylesheet manifest with the correct character encoding and import our design libraries:

source/stylesheets/all.css.scss

@charset "utf-8";

@import "bourbon";
@import "bitters/bitters";   /* Bitters needs to be imported before Neat */
@import "neat";

Include all of the stylesheets to be used in this site by adding the link to the manifest in the <head> section of layout.erb:

source/layout.erb

...
<head>
  <%= stylesheet_link_tag "all" %>
...

Let’s see how it looks now:

Bitters default style

Already, we see improvement. What did Bitters just set up for us?

  • Typography - uses Helvetica as the default font family and sets sizes for the various header elements on a modular scale (e.g. <h1> and friends)
  • Color - uses variables to systematize colors (reused in the sketch after this list)
  • Lists - strips all styles from lists, including bullets
  • Flash notices - styles that are very useful in Rails
  • Grid settings to accompany Neat
  • Some basic form styles
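
For example, Bitters’ color variables can be dropped straight into our own partials. Here’s a minimal sketch: $base-border-color is a Bitters variable we’ll also use later in this post, while the blockquote rule itself is purely hypothetical.

// Hypothetical example: reusing a Bitters variable in one of our own partials
blockquote {
  border-left: 2px solid $base-border-color;
  padding-left: 1em;
}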

Get Stylish

We’ll need to customize a few things in Bitters to optimize our styles for a blog. First, let’s add our own Google font to layout.erb:

source/layout.erb

<head>
...
<link href='http://fonts.googleapis.com/css?family=Oxygen:400,300,700'
  rel='stylesheet' type='text/css'>
...
</head>

To make use of this font, all we need to do is change the $sans-serif variable in _variables.scss:

source/stylesheets/bitters/_variables.scss

...
$sans-serif: 'Oxygen', $helvetica;
...

By changing the $sans-serif variable, we’ve quickly and easily changed the font family globally.

Comfortable Reading

Let’s create a partial that will contain all of the layout-related styles, and import it in our manifest:

source/stylesheets/all.css.scss

...
@import "partials/layout";

Add the outer-container() mixin to the layout to center it in the viewport.

source/stylesheets/partials/_layout.scss

#main {
  @include outer-container;
}

For a good reading experience, we want to keep the lines of text a comfortable length. If the lines are too long, the reader will have a hard time following the flow of the text.

Neat makes this easy to accomplish in just two steps: we’ll adjust the $max-width value that the outer-container() mixin uses.

The first step is to import _grid-settings.scss into Bitters. We can just uncomment that line in _bitters.scss:

source/stylesheets/bitters/_bitters.scss

...
@import "grid-settings";

The second step is to edit _grid-settings.scss. Uncomment the $max-width variable and change its value to em(700). This should give us a comfortable line-length for reading.

source/stylesheets/bitters/_grid-settings.scss

...
$max-width: em(700);
...
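
To see why this works, here’s roughly what @include outer-container compiles to once $max-width is set. This is a sketch of the generated CSS, not Neat’s exact output (which also includes a clearfix):

/* Approximate output of @include outer-container with $max-width: em(700) */
#main {
  max-width: 43.75em;    /* em(700) assumes the default 16px base font size */
  margin-left: auto;
  margin-right: auto;
}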

Let’s see what our blog looks like now that we’ve added a few styles of our own:

[Screenshot: Custom grid-settings]

We see that Bitters has successfully applied our chosen font and centered our content. Don’t worry about some of the content being misaligned. We’re about to fix that.

Modular Structure

Our readers need to be able to easily move around the site, so we’ll add some helpful links in a navigation and footer.

To keep our code modular, we will break up the navigation and footer into separate partials. It’s good practice to make a folder to keep your partials in.

We’ll create a group of new files:

  source/partials/_nav.erb
  source/partials/_footer.erb

  source/stylesheets/partials/_nav.scss
  source/stylesheets/partials/_footer.scss
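
One way to create these folders and empty files from the project root (any equivalent approach works; source/stylesheets/partials may already exist from the layout partial above):

$ mkdir -p source/partials source/stylesheets/partials
$ touch source/partials/_nav.erb source/partials/_footer.erb
$ touch source/stylesheets/partials/_nav.scss source/stylesheets/partials/_footer.scss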

Now that we’ve added some structure to our code, we can import these partials into layout.erb and our Sass partials into all.css.scss.

source/layout.erb

...
<div id="main" role="main">
  <%= partial "partials/nav" %>
  <%= yield %>
  <%= partial "partials/footer" %>
</div>
...

Improved Markup

We are ready to improve the overall layout of our index page. A few small adjustments to the markup will make it more semantic and easier to style.

Paste the code below into _nav.erb:

source/partials/_nav.erb

<nav>
  <ul>
    <li>
      <%= link_to "Blog Title", "index.html", :class => 'blog-title' %>
    </li>
  </ul>
</nav>

We will move some of the content from layout.erb into the footer. Paste the code below into _footer.erb and remove it from layout.erb:

source/partials/_footer.erb

<footer>
  <ul class="large-column">
    <li><h5 class="heading">Recent Articles</h5></li>
    <li>
      <ol>
        <% blog.articles[0...10].each do |article| %>
          <li>
            <%= link_to article.title, article %>
            <span><%= article.date.strftime('%b %e') %></span>
          </li>
        <% end %>
      </ol>
    </li>
  </ul>

  <ul class="small-column">
    <li><h5 class="heading">Tags</h5></li>
    <li>
      <ol>
        <% blog.tags.each do |tag, articles| %>
          <li><%= link_to "#{tag} (#{articles.size})", tag_path(tag) %></li>
        <% end %>
      </ol>
    </li>
  </ul>
</footer>

Additionally, we’ll improve the markup in index.html.erb:

source/index.html.erb

...
<ul>
  <% page_articles.each_with_index do |article, i| %>
    <li>
      <h3><%= link_to article.title, article %></h3>
      <h5><%= article.date.strftime('%b %e') %></h5>
      <p><%= article.body %></p>
    </li>
  <% end %>
</ul>
...

Adding Custom Styles

As a finishing touch, we’ll add some custom styles to our navigation, footer, and layout. For consistency, we will also create Sass partials for the nav and the footer.

source/stylesheets/all.css.scss

...
@import "partials/nav";
@import "partials/footer";

source/stylesheets/partials/_nav.scss

nav {
  border-bottom: 1px solid $base-border-color;
  margin: em(30) 0;

  ul {
    display: inline-block;
    margin-bottom: em(10);
  }
}

.blog-title {
  font-size: 1.4em;
  font-weight: 700;
  letter-spacing: 4px;
  text-transform: uppercase;
}

source/stylesheets/partials/_footer.scss

footer {
  border-top: 1px solid $base-border-color;
  margin: 2em 0;
  padding-top: 2em;
}

Some of these styles form a pattern we’ll use throughout the blog, so we’ll place them in the _layout.scss stylesheet as a reusable layout pattern.

source/stylesheets/partials/_layout.scss

...
ol {
  font-size: 1em;
  font-weight: 500;

  li {
    margin: .5em 0;

    span {
      display: inline-block;

      &:before {
        content: '/';
        margin: 0 .3em;
      }
    }
  }
}
...

Using the span-columns() mixin, Neat calculates the width of each element based on the number of columns you specify in the argument.

source/stylesheets/partials/_layout.scss

...
.large-column {
  @include span-columns(8 of 12);
}

.small-column {
  @include span-columns(4 of 12);
}

Now we have a basic template set up for our very own blog.

Final Results

All that’s left to do is make sure all of our changes have been committed to Git and then deploy these updates to Heroku.
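
If anything is still uncommitted, the same workflow as before applies (the commit message here is just an example):

$ git add .
$ git commit -m 'add custom styles'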

$ git push heroku master

Completion

Now we have a good foundation upon which to publish our own blog.

We Are All Wrong About Software Design

Posted 4 months back at Luca Guidi - Home

We are all wrong. When it comes to opinions, this is the way things work. Everyone has his or her own beliefs, shaped by years of experience in the field, frustrating code, books, successes, etc. How can all these backgrounds fall into one unified theory? They just can’t.

You’ve always been told to pick the right tool for the job. But what’s the right tool? You decide it, according to your practical knowledge.

I love Ruby because it feels natural to me, but other developers hate this language. I prefer clean code; other people don’t care. I’m for RSpec and Capybara, others for Test::Unit. CoffeeScript vs plain JavaScript, ERb vs HAML, Postgres vs MySQL. Vim or Emacs? Mac or Linux? TDD or not, anyone?

With all these partitions, we’re not freeing people from dogmas, but just creating fans of opposing opinions.

Relativity can be applied to software design as well. How many levels of indirection do I need to get a certain job done? Well, it depends. It depends on a myriad of good reasons, but mainly on your judgement, which can be sound for you and fallacious for somebody else.

We can discuss tradeoffs, but please stop using your successful product as certification that you’re right about code.

I work at Litmus, a profitable company. If I put the following code in a template, would you find it reasonable just because of my employer?

<%
  require 'mysql2'

  client = Mysql2::Client.new({
    host: 'host',
    username: 'username',
    database: 'database'})

  rows = client.query(%{SELECT * FROM previews
    ORDER BY created_at DESC
    LIMIT 5})
%>

<ul>
<% rows.each do |row| %>
  <li><%= row.fetch('title') %></li>
<% end %>
</ul>

Hey, it works! Who needs all those fancy abstractions like controllers and ORMs? Who needs frameworks at all! Those constructs are for architecture astronauts. Get off my lawn! Look at me, I’m a pragmatist. I proved this by ruining the multi-million dollar software I work on.

This isn’t an argument, just nonsense.