SSL for Rails with Heroku and DNSimple

Posted 2 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

SSL certificates can be intimidating but Heroku and DNSimple make the process easy. The following steps should take us less than 15 minutes.

Buy the SSL certification from DNSimple

Buy a wildcard certificate from DNSimple. The wildcard (*) lets us use the same certificate on staging, production, and any other future subdomains (api, etc.).

Prepare the SSL certificate

Follow the wildcard certificate instructions to get .pem, .crt, and .key files prepared.

Follow these instructions to complete .key preparation, provision the SSL addon from Heroku, and add the certificate to Heroku:

heroku certs:add server.crt server.key

Replace it with:

heroku certs:add *.{pem,crt,key}

Otherwise, we might see an error like:

Updating SSL Endpoint myapp.herokussl.com for [heroku-app]... failed
 !    Internal server error.

Get SSL endpoint from Heroku

Run:

heroku certs

This provides us the correct end point for the SSL enabled domain. This is a domain that looks like tokyo-2121.herokussl.com.

Add Heroku SSL endpoint to DNSimple

Next, go to our DNSimple dashboard and update/add the CNAME record for the SSL enabled domain to point to (e.g.) tokyo-2121.herokussl.com.

Prepare Rails app

Make a one-line configuration change in our staging and production environment config files within our Rails app:

# config/environments/{staging,production}.rb
config.force_ssl = true

Deploy that change.

Now, if users type “ourdomain.com”, they should be redirected to “https://www.ourdomain.com” and our browser’s URL bar should display its appropriate indicator (perhaps a green lock) declaring the SSL certificate is valid.

What’s next?

Read our production checklist to see a full list of things, including SSL, that should be done before an application goes live.

Episode #466 - May 20th, 2014

Posted 2 months back at Ruby5

Unexpected new Ruby, plenty of SMTP, Discovery Tests and their mocks, heroes building ROM, Windows longing for ExecJS, Napybaras, and dangerous :nodocs: this week on Ruby5.

Listen to this episode on Ruby5

Sponsored by RailsLTS

With the release of Rails 4, support for Rails 2.3 and Rails 3.0 has ended. Security vulnerabilities are no longer patched for these versions. Rails LTS offers a solution for mission critical applications by providing security patches for old versions of Rails.
This episode is sponsored by RailsLTS

Ruby 1.9.3-p547 released

Last week, Ruby 1.9.3­p547 was released. As the release announcement explains, there is one exception to the security ­only maintenance phase that 1.9.3 is currently in, and that’s critical regressions. Apparently some Ruby users on Ubuntu 10.04 LTS noticed that when their system had an old version of OpenSSL, a segfault was possible when using the get method on Net::HTTP, and this release fixes that prevents the segfault. Only affected users are invited to upgrade. ���
Ruby 1.9.3-p547 released

Email failover with MultiSMTP

Harlow Ward wrote in to let us know about a small library from Michael Choi and his blog post on the HotelTonight engineering blog covering Rails email failover with multi­SMTP support. Using the Mail gem, MultiSMTP gives your app multiple SMTP server configuration support with failover. With multiple SMTP servers configured ­like SendGrid, Mandrill, or Google ­you get multiple host redundancy and very likely geographical and service redundancy.
Email failover with MultiSMTP

Mock Objects in Discovery Tests

The virtue and value of mocks has been under siege for the past few weeks as DHH and others attacked the construct for enabling bad behavior when testing. Justin Searls, who I discovered with his great Breaking Up With Your Test Suite talk at Ancient City Ruby this year, created a great screencast in which he shows how useful mocks can be in what he calls Discovery Tests. He describes only using mocks to the extent he needs them in order to specify the contract between a subject's dependencies. The screencast is quite thorough and provides a lot of concrete examples. I think whether you agree with him or not, it’s rare enough to be able to witness another developer’s design habits at play, that you should take the time to watch this. He also mentions Gary Bernhardt’s Test Isolation Is About Avoiding Mocks post which is long, but quite worth the read, as well.
Mock Objects in Discovery Tests

Help Us Build ROM

Piotr Solnica, maintainer of ROM — the Ruby Object Mapper library — put together a post where he describes the current state of ROM, the purpose, the foundation, where it’s been, the current ecosystem, and most importantly, instructions on how to get started contributing to the project. Because the project has been going on for so long, there is a lot of history and experimentation. There’s also a large number of libraries and concepts floating around. Things like mutant for mutation testing, memoizable for method return value memoization and adamantium to create immutable objects. So, rather than have to re­-explain all of this knowledge for each developer who wants to get started, he distilled it into one page.
Help Us Build ROM

ExecJS is looking for a new Windows-friendly maintainer

Josh Peek, the current maintainer of the ExecJS gem (which the Asset Pipeline depends on) announced on Twitter yesterday that he’s looking for a more active maintainer for the project, especially someone who cares about Windows support.
ExecJS is looking for a new Windows-friendly maintainer

Napybara

George Mendoza wrote in to let us know about a Capybara extension library that he’s created and released, called Napybara. It's a Capybara extension which adds a lightweight DSL for defining Capybara helpers that are easy to abstract, compose, and introspect. An example of this are for nested reference support for selecting your page inputs and elements. Basically, rather than just working with element names or whatever, with Napybara, you can "dot-­reference" page objects by tag name or class to get all the way down to what you’re after. Rather than maybe doing within scopes or something, you can now just reference elements more directly with something like "testpage.form.row­1.text_field". You can find by Ruby objects or selectors, you can easily determine element existence, it works with RSpec and it is extensible. Check it out on GitHub.
Napybara

:nodoc: means tread lightly

Zachary Scott wrote an interesting piece of documentation for Rails last week, covering method visibility. While the Ruby core team marks methods that people shouldn’t use directly as private, on the Rails core team some internal methods are indeed public but they’re marked with a :nodoc: comment. A lot of people do ingenious stuff with internal methods like arel_table and end up getting bit whenever they are changed, renamed or removed without notice from the Rails core team because that’s exactly what people should expect.
:nodoc: means tread lightly

Project Retrospective: Reserve A Game

Posted 2 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Reserve A Game

Earlier this year, Boston startup Reserve a Game came to thoughtbot to design their product: a way for people to reserve time at local public tennis courts. Many in our office are familiar with the disappointment of showing up to the Boston Common courts after work, only to find a long line to play. For frequent tennis players, this becomes less of a slight annoyance and more of daily problem. At a time when most everything can be reserved (tables at a restaurant, seats for a movie), it seemed like a great idea, but first we needed to test our assumptions.

Here’s a look at our process, from design sprint to launch:

The Problem

The experience of playing tennis on public courts is frustrating. The challenge of finding an open court, uncertainty about court availability, and the potential of having to wait, get in the way of playing the game.

The Product

What does it need to be? What does it need to provide? Who is it for? We had to eliminate a lot of feature ideas, even good ones, to settle on searching and reserving as the main jobs to be done in the application. Our biggest challenge was to meet the needs of both tennis players who would reserve courts, and cities that would adopt this service. For an early-stage startup, it can be a challenge to balance user needs with sales. In this case, it was clear that the founders valued user experience above all, so we charged ahead, hoping that if tennis players would use the product, cities would want to adopt it.

The design sprint wasn’t easy, but it was worth it. We had a done a lot of work before we came to thoughtbot, but this process helped us to put a framework to our concepts and whittle our options down to the best user flow possible for players. We also discovered what we all felt was the best design to ensure a great experience on both mobile and desktop.

Brent Colson, Founder, Reserve a Game

The Critical Path

What tasks does a person need to be able to do with this product? After diverging for a day to sketch the user flow individually, we came together to draw the critical path as a group. This drawing served as the basis for our prototype.

Testing and Learning

We interviewed tennis players and watched them click through the prototype. These conversations validated our assumption that avid tennis players would, in fact, pay for the convenience of a reservation. However, using this app needed to be easier than just showing up and waiting in line.

Some of our assumptions turned out to be false. For example, we assumed people wouldn’t care about the specific court they reserved in a facility. We were wrong. While some people didn’t care, the most avid players (our target users) noted that they often had a preference, and would appreciate the option to choose. Our prototype had focused on selecting any available court in a facility, rather than choosing a specific court. We now knew we needed to add court selection back to the top of our feature list.

User testing did two main things for us. First, our assumptions went through the ringer and some proved to be false, allowing us to pivot in minor ways to save us big money down the road. Second, it proved to be another validation of just how much need there is for our product. Most testers were surprised by how easy the process was and told us they would definitely use the service. It was music to our ears…

Brent Colson, Founder, Reserve a Game

The Development Process

Over the course of 200+ git commits, we built the MVP in time for the spring season in Boston. The design sprint helped us to understand our assumptions and risks up front and gave us a clear direction. With many potential problems having already been addressed, we moved quickly through development. Daily standups, weekly retrospectives and constant communication kept our team focused on delivering the right product.

I thought I knew agile development before we started working with thoughtbot, but their expertise and process helped me grow tremendously in my own ability to manage these types of projects. They helped me to constantly challenge assumptions and find better, or simpler ways to build our product. In the end, we built the right product, which wasn’t necessarily how we envisioned it before we started with thoughtbot.

Brent Colson, Founder, Reserve a Game

If you want to learn more about our process, take a look at our design sprint methodology.

Reserve a Game has now launched their service in the Boston area. They are looking for creative people, including designers and developers, to join their team to help create the future of community sporting services. If you are interested, please get in touch with Brent Colson.

Weighing in on Long Live Testing

Posted 2 months back at Jay Fields Thoughts

DHH recently wrote a provocative piece that gave some views into how he does and doesn't test these days. While I don't think I agree with him completely, I applaud his willingness to speak out against TDD dogma. I've written publicly about not buying the pair-programming dogma, but I hadn't previously been brave enough to admit that I no longer TDD the vast majority of the time.

The truth is, I haven't been dogmatic about TDD in quite some time. Over 6 years ago I was on a ThoughtWorks project where I couldn't think of a single good reason to TDD the code I was working on. To be honest, there weren't really any reasons that motivated me to write tests at all. We were working on a fairly simple, internal application. They wanted software as fast as they could possibly get it, and didn't care if it crashed fairly often. We kept everything simple, manually tested new features through the UI, and kept our customer's very happy.

There were plenty of reasons that we could have written tests. Reasons that I expect people will want to yell at me right now. To me, that's actually the interesting, and missing part, of the latest debate on TDD. I don't see people asking: Why are we writing this test? Is TDD good or bad? That depends; TDD is just a tool, and often the individual is the determining factor when it comes to how effective a tool is. If we start asking "Why?", it's possible to see how TDD could be good for some people, and bad for DHH.

I've been quietly writing a book on Working Effectively with Unit Tests, and I'll have to admit that it was really, really hard not to jump into the conversation with some of the content I've recently written. Specifically, I think this paragraph from the Preface could go a long way to helping people understand an opposing argument.

Why Test?

The answer was easy for me: Refactoring told me to. Unfortunately, doing something strictly because someone or something told you to is possibly the worst approach you could take. The more time I invested in testing, the more I found myself returning to the question: Why am I writing this test?

There are many motivators for creating a test or several tests:
  • validating the system
    • immediate feedback that things work as expected
    • prevent future regressions
  • increase code-coverage
  • enable refactoring of legacy codebase
  • document the behavior of the system
  • your manager told you to
  • Test Driven Development
    • improved design
    • breaking a problem up into smaller pieces
    • defining the "simplest thing that could possibly work"
  • customer approval
  • ping pong pair-programming
Some of the above motivators are healthy in the right context, others are indicators of larger problems. Before writing any test, I would recommend deciding which of the above are motivating you to write a test. If you first understand why you're writing a test, you'll have a much better chance of writing a test that is maintainable and will make you more productive in the long run.

Once you start looking at tests while considering the motivator, you may find you have tests that aren't actually making you more productive. For example, you may have a test that increases code-coverage, but provides no other value. If your team requires 100% code-coverage, then the test provides value. However, if you team has abandoned the (in my opinion harmful) goal of 100% code-coverage, then you're in a position to perform my favorite refactoring: delete.
I don't actually know what motivates DHH to test, but if we assumed he cares about validating the system, preventing future regressions, and enabling refactoring (exclusively) then there truly is no reason to TDD. That doesn't mean you shouldn't; it just means, given what he values and how he works, TDD isn't valuable to him. Of course, conversely, if you value immediate feedback, problems in small pieces, and tests as clients that shape design, TDD is probably invaluable to you.

I find myself doing both. Different development activities often require different tools; i.e. Depending on what I'm doing, different motivators apply, and what tests I write change (hopefully) appropriately.

To be honest, if you look at your tests in the context of the motivators above, that's probably all you need to help you determine whether or not your tests are making you more or less effective. However, if you want more info on what I'm describing, you can pick up the earliest version of my upcoming book. (cheaply, with a full refund guarantee)

Rails Is Not Dead

Posted 2 months back at Luca Guidi - Home

A few years ago, my lead develop of the time, told me: “Beware of the coupling that you’re introducing in your models”. My answer was: “Hey, you don’t know what you’re talking about, this is ActiveRecord”.

I was an unexperienced developer fascinated by the innovation that Rails introduced. After all, my technology was able to get rid of all that configurations, and all that boring stuff from the past.

- Patterns?
- I don't need them. I have the Holy Grail: MVC.
- Interfaces?
- Are you kidding me? This isn't Java, it's Ruby!

We all have been there. We were recovering from the consequences of the New Economy economy bubble burst. A new wave of technologies was raising: AJAX, Rails, and the Web 2.0. All was exciting, we felt like we’d learned from the past mistakes. We were creating a new way to sell software, we needed to be agile, fast and pragmatic.

All that thinking about the architecture looked like an unnecessary aftermath of the enterprise legacy.

Fast forwarding to today, after years spent on several projects, countless hours staring at a monitor, helping startups, banks, broadcasting and highway companies, non-profit organizations, large and small Open Source projects, I can confess one thing: I was wrong.

Rails has the undeniable credit to have changed the way we do web development. It has lowered the entry level so much, that even people without a background as programmers have been able to create a successful business. But this had a cost: technical debt.

I don’t blame Rails for this, as we shouldn’t blame TDD for writing damaged code. We write it, we generate legacy while writing it. We take merits when it’s good, we’re to blame when it isn’t.

But honestly, I don’t get one thing: why half of the Ruby community is so scared to talk about better code?

I have my own opinions about software, that now are diverging from Rails. I’m creating a web framework to introduce new ideas and rethink architectures. I hope to remove the pain points that have been a problem for my direct experience as a programmer.

And you know what? I might be wrong, but don’t be afraid to talk with me. Don’t you recognize this is diversity too? I don’t claim to be right like the leaders of this community think they are. I don’t think that the ideas that I had ten years ago are still valid, like they do.

The current leaders are behaving like those old politicians who say that global warming is a scam: they get defensive because their world is criticized. They keep saying to not worry about, that the problem doesn’t exist, but it does.

Rails is not dead. Debating about things such as the hexagonal architecture isn’t an assault to the framework, but a way to evolve as a community. If they feel under attack, we have a problem.

Please speak up.

Refactoring with an apprentice

Posted 2 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

When I was part of thoughtbot’s apprentice.io program, I spent my days learning to code in a functional, maintainable and scalable way. In my first two weeks, refactoring with my mentor Anthony became one of my favorite exercises.

In my third week as an apprentice, we sat down to refactor a test I wrote for my breakable toy application. This is the process we followed:

# spec/features/teacher/add_an_assignment_spec.rb

require 'spec_helper'

feature 'teacher adding an assignment' do
  scenario 'can create an assignment with valid attributes' do
    teacher = create(:teacher)
    course1 = create(:course, name: 'Math', teacher: teacher)
    course2 = create(:course, name: 'Science', teacher: teacher)
    course3 = create(:course, name: 'History', teacher: teacher)
    course4 = create(:course, name: 'Quantum Physics', teacher: teacher)

    visit new_teacher_assignment_path(as: teacher)

    select 'Science', from: :assignment_course_id
    fill_in :assignment_name, with: 'Pop Quiz'
    fill_in :assignment_description, with: 'I hope you studied!'
    select '2014', from: :assignment_date_assigned_1i
    select 'January', from: :assignment_date_assigned_2i
    select '15', from: :assignment_date_assigned_3i
    select '2014', from: :assignment_date_due_1i
    select 'January', from: :assignment_date_due_2i
    select '17', from: :assignment_date_due_3i
    fill_in :assignment_points_possible, with: 100
    click_button I18n.t('helpers.submit.create', model: 'Assignment')

    expect(current_path).to eq(teacher_assignments_path)
    expect(page).to have_content('Course: Science')
    expect(page).to have_content('Name: Pop Quiz')
    expect(page).to have_content('Description: I hope you studied!')
    expect(page).to have_content('Date assigned: January 15, 2014')
    expect(page).to have_content('Date due: January 17, 2014')
    expect(page).to have_content('Points possible: 100')
  end
end

Check out this repo to follow along.

Key areas to improve on:

  • The section of code for filling out the form is not very legible especially the date fields.
  • The date_assigned and date_due fields are not named using Rails idioms.
  • There is a lot of repetition.
  • The code is not reusable.
  • It is important to abstract as many strings from your code as possible to make future changes less painful.

Where to begin?

Idiomatic date attributes

First we looked at some low hanging fruit. Anthony asked me, “when we use t.timestamps in a migration, what are the attribute names?” I knew immediately where he was going with this. Timestamps are named created_at and updated_at. The reason they are named with _at is because it gives developers context that says, “this is a DateTime.” Similarly, when naming Date attributes, the proper Rails idiom is _on.

This may seem like a small thing, but following conventions makes it much easier for other developers (including my future self) to read and derive context about my attributes without having to dig.

So, we changed date_assigned and date_due to assigned_on and due_on respectively:

select '2014', from: :assignment_assigned_on_1i
select 'January', from: :assignment_assigned_on_2i
select '15', from: :assignment_assigned_on_3i
select '2014', from: :assignment_due_on_1i
select 'January', from: :assignment_due_on_2i
select '17', from: :assignment_due_on_3i

Updated assertions to follow the same convention:

expect(page).to have_content('Assigned on: January 15, 2014')
expect(page).to have_content('Due on: January 17, 2014')

We check to make sure our test passes and make a commit.

Create what you need, and nothing more

This test creates 4 courses, so that a teacher must select which course the assignment belongs to. This could be done much more efficiently by using some simple Ruby. However, upon further consideration, having more than 1 course is not only unnecessary, but detrimental to our test because it writes unnecessary data to the database and increases run time.

teacher = create(:teacher)
course1 = create(:course, name: 'Math', teacher: teacher)
course2 = create(:course, name: 'Science', teacher: teacher)
course3 = create(:course, name: 'History', teacher: teacher)
course4 = create(:course, name: 'Quantum Physics', teacher: teacher)

becomes

teacher = create(:teacher)
create(:course, name: 'Science', teacher: teacher)

Again, we check to make sure the test still passes and commit.

Abstract clunky code to methods to increase readability

Anthony is always pushing me to write code so that it performs like it reads. Doing so ensures readability for myself and other developers, and forces us to make sure our objects and methods execute the intentions indicated by their name.

When Anthony isn’t around for conversations, I find pseudocoding helps organize my intentions:

# I want a method that:
#  fills out all attributes within my assignment form
#  selects a course from a dropdown list
#  fills in text fields with the appropriate attributes
#  selects dates for the appropriate attributes
#  submits my form
#  is readable and intention-revealing

The ugliest part of this test (in my opinion) is the way the date fields are selected. It takes 3 lines of code for each date because the form has a separate select for each Month, Day and Year. We can DRY (Don’t Repeat Yourself) this up by creating a method that will make these selections for us.

First, we write the method call:

select_date(:assignment, :assigned_on, 'January 15, 2014')
select_date(:assignment, :due_on, 'January 17, 2014')

When using form_for, form fields are all prefixed with the object they belong to (in this case, :assignment), so we pass that first. Second, we pass the form field we want to select. Third, we pass the date in a string.

Now we write the method:

def select_date(prefix, field, date)
  parsed_date = Date.parse(date)
  select parsed_date.year, from: :"#{ prefix }_#{ field }_1i"
  select parsed_date.strftime('%B'), from: :"#{ prefix }_#{ field }_2i"
  select parsed_date.day, from: :"#{ prefix }_#{ field }_3i"
end

To make extracting the Month, Day and Year from our string much easier, we convert them to a Date object and interpolate the form attributes using the prefix and field arguments.

The test is green, so we commit.

Extract parsing of Dates

We don’t love the local variable for parsing our date and there is an argument order dependency as well. So we extract the parsing of dates to local variables at the top of the test. We aren’t exactly sure what to do about the argument order dependency at this point, and assume this may be an issue with the next form field as well, so we put off that refactor for now.

assigned_on = Date.parse('January 15, 2014')
due_on = Date.parse('January 17, 2014')
...

select_date(:assignment, :assigned_on, assigned_on)
select_date(:assignment, :due_on, due_on)
...

def select_date(prefix, field, date)
  select date.year, from: :"#{ prefix }_#{ field }_1i"
  select date.strftime('%B'), from: :"#{ prefix }_#{ field }_2i"
  select date.day, from: :"#{ prefix }_#{ field }_3i"
end

We feel good about this refactor. Our code is still readable and introduction of the Date object will allow us to put the date in whatever format we like and still achieve the same result.

The test is green, so we commit.

Find similarities in your code

We have 3 text_field attributes being filled the exact same way. Following what we did with select_date, we can write a method that will fill in the text_field’s and will DRY up this duplication.

Again, we write the method call so that it reads as the code performs:

fill_in_text_field(:assignment, :name, 'Pop Quiz')
fill_in_text_field(:assignment, :description, 'I hope you studied!')
fill_in_text_field(:assignment, :points_possible, 100)

This isn’t much of an improvement. While we may have decreased a little bit of ‘noise’ and increased the legibility, we are accomplishing our task in the same number of lines and it is still repetitive. Stay tuned, we’re going somewhere with this.

def fill_in_text_field(prefix, field, value)
  fill_in "#{ prefix }_#{ field }", with: value
end

We are getting less out of this refactor than the last. We still have argument order dependencies and the code’s readability is much the same. However, notice the pattern forming between all the different form methods…

Our test is green, so we commit.

Be patient, go step-by-step

We have a lot of duplication. We are calling the methods select_date and fill_in_text_field multiple times each. We need to find a way (perhaps an object or method) to pass each form field type multiple arguments and iterate over them.

However, we need to make sure that all the form fields are essentially using the same format before we try to wrap them. We need to create a method for selecting our course. Write the method call:

select_from_dropdown(:assignment, :course_id, 'Science')

This may look a bit funny, selecting a course from a drop-down list with a field identifier of :course_id, but passing it the string 'Science' instead of an id. This is due to a little bit of form_for magic. When courses are created, each receives an id when persisted to the database. In our form, we want to be able to select courses that belong to a specific teacher. form_for allows us to display the names of the courses in the drop-down list, but once selected and submitted, the form actually sends the course_id that is required for our Assignment belongs_to :course association.

The method:

def select_from_dropdown(prefix, field, value)
  select value, from: :"#{ prefix }_#{ field }"
end

Now we can really see a pattern forming. Test is green, commit.

Abstract identical arguments using yield and wrapping method calls

The next refactor is a very small but integral step in making these other refactors worth while. What good are the select_from_dropdown, select_date and fill_in_text_field if we can’t use them every time we need to fill in another form?

In each of these method calls, we are passing a ‘prefix’ (in this case it is :assignment) which will always be present no matter what form you are filling out for a particular object. We can abstract this away by wrapping all of these method calls in another method that simply yields the prefix. We’ll call this method within_form(prefix, &block):

def within_form(prefix, &block)
  yield prefix
end

Update the method calls to fill in our form:

within_form(:assignment) do |prefix|
  select_from_dropdown(prefix, :course_id, 'Science')
  fill_in_text_field(prefix, :name, 'Pop Quiz')
  fill_in_text_field(prefix, :description, 'I hope you studied!')
  select_date(prefix, :assigned_on, assigned_on )
  select_date(prefix, :due_on, due_on )
  fill_in_text_field(prefix, :points_possible, 100)
end

All we are doing here is abstracting the object for the form into an argument, and supplying that argument to each method call with yeild.

But what if instead of yeilding our prefix, we were to yield an object back that understood the methods select_from_dropdown, select_date and fill_in_text_field? If that object understood our prefix, and these methods, we could use within_form for any object we need to fill in a form for.

Our test is green, so we commit.

When applicable, create a new object to handle a single responsibility

This next refactor has a lot of moving pieces. We want to create an object responsible for filling out all form fields. As always, write the method call and describe what you want it to perform:

within_form(:assignment) do |f|
  f.select_from_dropdown(course_id: 'Science')
  f.fill_in_text_field(name: 'Pop Quiz')
  f.fill_in_text_field(description: 'I hope you studied!')
  f.fill_in_text_field(points_possible: 100)
  f.select_date(assigned_on: assigned_on)
  f.select_date(due_on: due_on)
end

In order for within_form to perform properly we must update it to yield an object that understands each method responsible for filling out form fields.

def within_form(form_prefix, &block)
  completion_helper = FormCompletionHelper.new(form_prefix, self)
  yield completion_helper
end

By creating the FormCompletionHelper object, we have a place to put the select_from_dropdown, select_date and fill_in_text_field methods.

Each time within_form is called, a new instance of FormCompletionHelper is yielded. We call our form methods, and pass along the field and value arguments that each method requires:

class FormCompletionHelper
  delegate :select, :fill_in, to: :context

  def initialize(prefix, context)
    @prefix = prefix
    @context = context
  end

  def fill_in_text_field(field, value)
    fill_in :"#{ prefix }_#{ field }", with: value
  end

  def select_date(field, date)
    select date.year, from: :"#{ prefix }_#{ field }_1i"
    select date.strftime('%B'), from: :"#{ prefix }_#{ field }_2i"
    select date.day, from: :"#{ prefix }_#{ field }_3i"
  end

  def select_from_dropdown(field, value)
    select value, from: :"#{ prefix }_#{ field }"
  end

  private

  attr_reader :prefix, :context
end

This was a big refactor. By wrapping the methods select_from_dropdown, select_date and fill_in_text_field within the FormCompletionHelper class, their scope changes and they no longer have access to Capybara’s select and fill_in methods. In order to call these methods from FormCompletionHelper, we delegate to the test context (hence, choosing context for the name of this object). We do this by passing self as an argument to the within_form method and we delegate with the code:

delegate :select, :fill_in, to: :context

We also are able to remove prefix as an argument from each of our methods by defining a getter method for prefix:

attr_reader :prefix, :context

Our test is green, so we commit.

Remove duplication, through iteration

We are calling fill_in_text_field 3 times, select_date twice, and the click_button action that submits our form is reliant on ‘Assignment’ being passed as a string.

Let’s address the duplication:

within_form(:assignment) do |f|
  f.select_from_dropdown(course_id: 'Science')
  f.fill_in_text_fields(
    name: 'Pop Quiz',
    description: 'I hope you studied!',
    points_possible: 100
  )
  f.select_dates(
    assigned_on: assigned_on
    due_on: due_on,
  )
end
click_button I18n.t('helpers.submit.create', model: 'Assignment')

Notice that instead of passing multiple arguments, we now pass key: value pairs.

Now, we must adjust our methods to allow one or more fields:

def fill_in_text_field(options)
  options.each do |field, value|
    fill_in :"#{ prefix }_#{ field }", with: value
  end
end
alias :fill_in_text_fields :fill_in_text_field

def select_date(options)
  options.each do |field, date|
    select date.year, from: :"#{ prefix }_#{ field }_1i"
    select date.strftime('%B'), from: :"#{ prefix}_#{ field }_2i"
    select date.day, from: :"#{ prefix }_#{ field }_3i"
  end
end
alias :select_dates :select_date

def select_from_dropdown(options)
  options.each do |field, value|
    select value, from: :"#{ prefix }_#{ field }"
  end
end
alias :select_from_dropdowns :select_from_dropdown

In addition to adding iteration to each of these methods, we also create an alias for each method to allow singular or plural method calls.

alias :select_dates :select_date

The test is green, commit.

Complete abstractions

We know that in the context of a database backed object, form submit actions are only used to create or update the object. To make this submission dynamic we need to assign the model and stipulate whether the action is a create or update.

The method call as we want to see it perform:

def within_form(:assignments) do |f|
...
  f.submit(:create)
end

Implement #submit in FormCompletionHelper:

delegate :select, :fill_in, :click_button, to: :context
...
def submit(create_or_update)
  raise InvalidArgumentException unless [:create, :update].include?(create_or_update.to_sym)
  click_button I18n.t("helpers.submit.#{ create_or_update }", model: model_name)
end

private

def model_name
  prefix.to_s.capitalize
end

This refactor completes our ability to dynamically fill out forms. By abstracting the submit action to the FormCompletionHelper we are able to assign the model_name using our form_prefix. We also include an InvalidArgumentException to allow :create or :update arguments only.

Our test is green, so we commit.

Encapsulate behavior and keep code clean and organized through namespacing

Now that we’ve fully abstracted form completion, it doesn’t make much sense to leave #within_form or FormCompletionHelper in our spec/features/teacher/adding_an_assignment_spec.rb. We can move them to separate files, wrap them in a module, and include them in our RSpec configuration so that our teacher/adding_an_assignment_spec.rb will have access to them. A nice by-product of doing this is that any other feature specs that require a form to be filled out can now use the within_form method.

We move our FormCompletionHelper class to a new file called form_completion_helper.rb. The functionality that this class offers is only going to be used in feature spec files, so we place the file in the spec/support/features/ directory. We also namespace the class by wrapping it in a Features module.

# spec/support/features/form_completion_helper.rb

module Features
  class FormCompletionHelper
    delegate :select, :fill_in, :click_button, to: :context

    def initialize(prefix, context)
      @prefix = prefix
      @context = context
    end

    def fill_in_text_field(options)
      options.each do |field, value|
        fill_in :"#{ prefix }_#{ field }", with: value
      end
    end
    alias :fill_in_text_fields :fill_in_text_field

    def select_date(options)
      options.each do |field, date|
        select date.year, from: :"#{ prefix }_#{ field }_1i"
        select date.strftime('%B'), from: :"#{ prefix }_#{ field }_2i"
        select date.day, from: :"#{ prefix }_#{ field }_3i"
      end
    end
    alias :select_dates :select_date

    def select_from_dropdown(options)
      options.each do |field, value|
        select value, from: :"#{ prefix }_#{ field }"
      end
    end
    alias :select_from_dropdowns :select_from_dropdown

    def submit(create_or_update)
      raise InvalidArgumentException unless [:create, :update].include?(create_or_update.to_sym)
      click_button I18n.t("helpers.submit.#{create_or_update}", model: model_name)
    end

    private

    def model_name
      prefix.to_s.capitalize
    end

    attr_reader :prefix, :context
  end
end

We also created a form_helpers.rb file for our #within_form helper method. We also namespace this method in the Features and FormHelpers modules.

# spec/support/features/form_helpers.rb

module Features
  module FormHelpers
    def within_form(form_prefix, &block)
      completion_helper = FormCompletionHelper.new(form_prefix, self)
      yield completion_helper
    end
  end
end

The last step is to require these modules in /spec/spec_helper.rb.

RSpec.configure do |config|
  config.include Features
  config.include Features::FormHelpers
end

Our test is green, so we commit.

We’re done.

Next time, I might use the Formulaic gem. However, I find that exercises like this really help me understand the process of writing quality code.

What’s next?

If you found this post helpful, check out our refactoring screencast on Learn.

Also, check out this blog to take your test refactoring to the next level using I18n.

Our Problem with Boxen

Posted 2 months back at blah blah woof woof articles

Over a year ago, we started using Boxen at Icelab to manage our team’s development environments. It’s worked as advertised, but not well enough. I’m looking for something else.

Our problem with Boxen is that it contains hundreds of different components, all of which are changing, all the time. It needs constant tending. By the time a team is big enough to need something like Boxen, it’s paradoxically too small to look after it properly: to have someone whose job it is to be “Boxenmaster,” someone who knows how these intricate parts all interact and can run the regular clean installs needed to make sure that the experience is smooth for new people (we’ve found it typical for a functional Boxen setup to stay working over repeated updates, while clean installs end up failing). We need a tool that we can set up once for a project and then have it still work reliably 6 months later when we revisit the project for further work.

Boxen can be a trojan horse. If you don’t look after it properly, you risk creating two classes of developers within your team: those who were around to get Boxen working when you first rolled it out, now enjoying their nice functional development environments, and those who get started later, whose first experience in the team is one of frustration, non-productivity and a disheartening, unapproachable web of interconnected packages. And no one is safe: as soon as you want to upgrade that version of PHP, you could slip down there too.

So while Boxen can indeed work as manager of personal development environments, and while Puppet is a serious and capable configuration manager, my recommendation is not to commit to Boxen half-heartedly. In fact, you may not even need something so rigorous and complex if your team is still small. I’m looking into such alternatives right now. I’m not settled on anything yet, but it’ll likely involve virtual machines, and perhaps use Docker as a way to keep it lightweight while working on multiple projects at a time. Fig seems like it could be a good fit. I’ll keep you posted.

Episode #465 - May 16th, 2014

Posted 2 months back at Ruby5

A rich text editor that isn't awful, over one thousand free programming books, handle email subscriptions and track emails sent, super easy email unsubscribe, and a cheatsheet for OAuth exploits.

Listen to this episode on Ruby5

Sponsored by New Relic

New Relic is _the_ all-in-one web performance analytics product. It lets you manage and monitor web application performance, from the browser down to the line of code. With Real User Monitoring, New Relic users can see browser response times by geographical location of the user, or by browser type.
This episode is sponsored by New Relic

Quill

Quill is a modern rich text editor built for compatibility and extensibility. It was created by Jason Chen and Byron Milligan and open sourced by Salesforce.com.
Quill

Free Programming Books

A compilation of roughly 1,500 links to free programming-related resources.
Free Programming Books

OAuth Security Cheatsheet

This site describes common OAuth/Single Sign On/OpenID-related vulnerabilities.
OAuth Security Cheatsheet

Ahoy Email

Simple, powerful email tracking for Rails
Ahoy Email

Mailkick

Email subscriptions made easy
Mailkick

What to do when Stack Overflow is down?

Posted 2 months back at mir.aculo.us - Home

The Internet is awesome. Information at your fingertips and always-available help from people all over the world is one of humanity’s dreams come true. I’m not sure if the ancient philosophers included copy & pasting of code snippets in this dream, but it’s a fact of daily developer life.

It’s awesome, and it’s very convenient.

But this is not how you learn and become great at thinking for yourself, finding solutions for solving programming problems and most importantly how to be creative. Over-using sites like Stack Overflow will not make you a better developer, it will (probably) make you very good at clicking up-vote buttons and copy & pasting.

You owe it to your brain and future self that you try to find a solution first, and not give up just because something doesn’t work the first time you try it. Especially when you don’t feel comfortable or knowledgeable, because you’re working on a new project, with a new programming language or different development environment. Humans are built for exploration and understanding by doing. Your brain will reward you for discovering things. The rewards are higher as the problem is harder (for you) to solve.

That doesn’t mean to ban forums and answer sites from your bookmarks. You can and should share your discoveries and see how others solved similar problems. You’ll probably find that there’s (hope my cats won’t hear this) more than one way to skin a cat. Sharing will likely lead to new, better ways to tackle a specific problem, and this will benefit lots of people.

All because you spent 10 minutes thinking about a problem and not just copy & pasting the first answer that somewhat works.

Software Apprenticeship Podcast Episode 3 is Out

Posted 2 months back at Jake Scruggs

This week we start off by throwing Jonathan into the deep end of pool where he pairs with an experienced developer on a 10 year old Java project that is the core of our signature product: Backstop.  Of course the company is called Backstop Solutions and so, in order to avoid confusion, we gave the project a different name for internal use:  Fund Butter.  The mystery of how such a terrible thing came to pass is revealed in this very episode.

There’s no way we couldn’t discuss DHH’s Rails Conf declaration and blog post: “TDD is dead. Long live testing.” This, of course, leads to a discussion of our team’s test philosophy.

You can find it on iTunes (search for Software Apprenticeship Podcast), any podcast app’s search function, Google, this temp page: https://itunes.apple.com/us/podcast/software-apprenticeship-podcast/id868371146 or use the RSS feed directly if your into that sort of thing: http://softwareapprenticeship.libsyn.com/rss



Our panel (composed of Toby Tripp, Matt Pyra, Eric Johnson, Jonathan Howden, and I) meander through quite a few topics.  Here’s a breakdown by time of the various topics:

01:27 Backstop’s 2 day bug bash
02:00 The apprentice has to tackle a 10 year old Java program
06:22 “Everything needs to have an ID if you’re going to make it Hibernate-y” - Jake “super technical” Scruggs
08:21 Why we call Backstop “Fund Butter” within the company
10:50 The apprentice encounters Selenium
12:42 Troubles with regression/integration testing through the web browser.
13:43 “Unit Testing is Dead” - DHH 
20:00 Pairing all the time vs code review
21:51 Toby talks about the Hill Air Force Base pair programming study mentioned here: http://www.informit.com/articles/article.aspx?p=29410
26:40 The Wall of Awesome - Backstop’s way for employee’s to recognize and thank other employees
47:21 The anti-college movement
49:36 The “Expose your ignorance” apprenticeship pattern with examples/confessions from Jonathan, Jake, and Toby
51:14 The C3 project comes up with near 100% frequency when >= 2 die hard XP'ers are in the same room.

Episode #464 - May 13th, 2014

Posted 2 months back at Ruby5

In this episode we talk about new Ruby releases for versions 2.1.2 and 2.0.0-p48, Jekyll 2.0, Ruby core team disputing a CVE report, How Batched Queries work in Rails and FiniteMachine.

Listen to this episode on Ruby5

Sponsored by RailsLTS

With the release of Rails 4, support for Rails 2.3 and Rails 3.0 has ended. Security vulnerabilities are no longer patched for these versions. Rails LTS offers a solution for mission critical applications by providing security patches for old versions of Rails.
This episode is sponsored by RailsLTS

Ruby 2.1.2 and 2.0.0-p481

Late last week Ruby 2.1.2 was released alongside Ruby 2.0.0­-p481. Version 2.1.2 notably fixes a regression on the Hash#reject method.
Ruby 2.1.2 and 2.0.0-p481

Jekyll 2.0

Jekyll, the popular static site generator, released its 2.0 version last week. This new version is jam­-packed with highly­-requested features and bug fixes, thanks to the contribution of over 180 different developers.
Jekyll 2.0

Ruby CVE dispute

The Ruby core team is disputing an earlier CVE report. The CVE in question claimed that it was possible for an attacker to forge arbitrary root certificates by modifying the certificate’s signature. Apparently it’s simply a misunderstanding over the return value of the X509Certificate#verify method.
Ruby CVE dispute

How Batched Queries Work

Adam Sanderson wrote another article in his "Reading Rails" series titled 'How do Batched Queries Work', where he investigates Rails' find_each. The 'find_each' method is used to run queries when you don't want all the records instantiated at once.
How Batched Queries Work

FiniteMachine

FiniteMachine is a gem by Piotr Murach that provides a minimal finite state machine with a straightforward and intuitive syntax. The machine is event driven, with a focus on passing synchronous and asynchronous messages to trigger state transitions. It supports a natural DSL for declaring events, exceptions and callbacks.
FiniteMachine

Baseimage-docker 0.9.11 released

Posted 2 months back at Phusion Corporate Blog

Baseimage-docker is a special Docker image that is configured for correct use within Docker containers. It is Ubuntu, plus modifications for Docker-friendliness. You can use it as a base for your own Docker images. Learn more at the Github repository and the website, which explain in detail what the problems are with the stock Ubuntu base image, and why you should use baseimage-docker.

Changes in this release

  • Upgraded to Ubuntu 14.04 (Trusty). We will no longer release images based on 12.04.
    Thanks to contributions by mpeterson, Paul Jimenez, Santiago M. Mola and Kingdon Barrett.
  • Fixed a problem with my_init not correctly passing child processes’ exit status. Fixes GH-45.
  • When reading environment variables from /etc/container_environment, the trailing newline (if any) is ignored. This makes commands like this work, without unintentially adding a newline to the environment variable value:
    echo my_value > /etc/container_environment/FOO
    

    If you intended on adding a newline to the value, ensure you have two trailing newlines:

    echo -e "my_value\n" > /etc/container_environment/FOO
    
  • It was not possible to use docker run -e to override environment variables defined in /etc/container_environment. This has been fixed (GH-52). Thanks to Stuart Campbell for reporting this bug.

Using baseimage-docker

Please learn more at the README.

Baseimage-docker 0.9.10 released

Posted 2 months back at Phusion Corporate Blog

Baseimage-docker is a special Docker image that is configured for correct use within Docker containers. It is Ubuntu, plus modifications for Docker-friendliness. You can use it as a base for your own Docker images. Learn more at the Github repository and the website, which explain in detail what the problems are with the stock Ubuntu base image, and why you should use baseimage-docker.

Changes in this release

  • Upgraded to Ubuntu 14.04 (Trusty). We will no longer release images based on 12.04.
    Thanks to contributions by mpeterson, Paul Jimenez, Santiago M. Mola and Kingdon Barrett.
  • Fixed a problem with my_init not correctly passing child processes’ exit status. Fixes GH-45.
  • When reading environment variables from /etc/container_environment, the trailing newline (if any) is ignored. This makes commands like this work, without unintentially adding a newline to the environment variable value:
    echo my_value > /etc/container_environment/FOO
    

    If you intended on adding a newline to the value, ensure you have two trailing newlines:

    echo -e "my_value\n" > /etc/container_environment/FOO
    
  • It was not possible to use docker run -e to override environment variables defined in /etc/container_environment. This has been fixed (GH-52). Thanks to Stuart Campbell for reporting this bug.

Using baseimage-docker

Please learn more at the README.

The post Baseimage-docker 0.9.10 released appeared first on Phusion Corporate Blog.

Docker-friendly Vagrant boxes 2014-05-11 released

Posted 2 months back at Phusion Corporate Blog

Vagrant

We provide Vagrant base boxes that are based on Ubuntu 14.04 and 12.04, 64-bit. These boxes are specifically customized to work well with Docker. Please learn more at the website

The changes in version 2014-05-11 are:

  • The Ubuntu 12.04 boxes have been upgraded to kernel 3.13 (Trusty kernel). This is because even the updated VMWare Tools still occasionally caused kernel panics on kernel 3.8. In our tests, we’ve observed that VMWare Tools does not cause any kernel panics on kernel 3.13.
  • No changes in the Ubuntu 14.04 boxes.

Related resources: Github | Prebuilt boxes | Vagrant Cloud | Discussion forum | Twitter

Upgrade instructions for Vagrant >= 1.5 with Vagrant Cloud

Run:

vagrant box outdated

Upgrade instructions for Vagrant <= 1.4, or Vagrant >= 1.5 without Vagrant Cloud

Destroy your Vagrant VM:

vagrant destroy

Remove your existing base box:

# Vagrant >= 1.5
vagrant box remove phusion-open-ubuntu-12.04-amd64 --provider virtualbox
vagrant box remove phusion-open-ubuntu-12.04-amd64 --provider vmware_fusion

# Vagrant <= 1.4
vagrant box remove phusion-open-ubuntu-12.04-amd64 virtualbox
vagrant box remove phusion-open-ubuntu-12.04-amd64 vmware_fusion

Start your VM again. Vagrant will automatically download the latest version of the box.

vagrant up

Back to Basics: Regular Expressions

Posted 2 months back at GIANT ROBOTS SMASHING INTO OTHER GIANT ROBOTS - Home

Regular expressions have been around since the early days of computer science. They gained widespread adoption with the introduction of Unix. A regular expression is a notation for describing sets of character strings. They are used to identify patterns within strings. There are many useful applications of this functionality, most notably string validations, find and replace, and pulling information out of strings.

Regular expressions are just strings themselves. Each character in a regular expression can either be part of a code that makes up a pattern to search for, or it can represent a letter, character or word itself. Let’s take a look at some examples.

Basics

First let’s look at an example of a regular expression that is made up of only actual characters and none of the special characters or patterns that generally make up regular expressions.

To get started let’s fire up irb and create our regular expression:

> regex = /back to basics/
 => /back to basics/

Notice we create a regular expression by entering a pattern between two front slashes. The pattern we’ve used here will only match strings that contain the stringi ‘back to basics’. Let’s use the match method, which gives us information about the first match it finds, to look at some examples of what matches and what doesn’t:

> regex.match('basics to back')
 => nil

We’re getting close, but nothing in this string matches our regular expression, so we get nil.

> regex.match('i enjoyback to basics')
 => <MatchData "back to basics">

After an unsuccessful attempt we have a match. Notice that our regular expression matched even though there are no spaces between the pattern and the words before it.

MatchData

The object returned from the RegularExpression object’s match method is of type MatchData. This object can tell us all sorts of things about a particular match. Let’s take a look at some of the information we can get about our match.

We can use the begin method to find out the offset of the beginning of our match in the original string:

> match = regex.match('i enjoyback to basics')
 => <MatchData "back to basics">

> match.begin(0)
 => 7

> 'i enjoyback to basics'[7]
 => "b"

The argument we send the method can be used to specify a capture, a concept which is covered below, within our match. In our above example begin tells us that the beginning of our match can be found at index 7 in our string. As we can see from the code above the 8th character in the string (at the 7th index in our string) is ‘b’ the first letter of our match.

Similarly we can get the index of the character following the end of our match using the end method:

> match.end(0)
 => 21

> 'i enjoyback to basics'[21]
 => nil

In this case we get nil since the end of our match is also the end of our string.

We can also use the to_s method to print our match:

> match.to_s
 => "back to basics"

Patterns

The regular expression’s real power becomes obvious when we introduce patterns. Let’s take a look at some examples.

Metacharacters

A metacharacter is any character that has a meaning within a regular expression. Let’s start with something simple, let’s say we want to find out if our string contains a number. This will require we use our first pattern the \d, which is a metacharacter that says we’re looking for any digit:

> string_to_match = 'back 2 basics'

> regex = /\d/
 => /\d/

> regex.match(string_to_match)
 => <MatchData "2">

Our regular expression matches the number 2 in our string.

Character Classes

Let’s say we wanted to find out if any of the letters from ‘k’ to ’s' were in our string. This will require we use a character class. A character class let’s us specify a list of characters or patterns that we’re looking for:

> string_to_match = 'i enjoy making stuff'

> regex = /[klmnopqrs]/
 => /[klmnopqrs]/

> regex.match(string_to_match)
 => <MatchData "n">

In this example we can see we entered all the letters of the alphabet we were interested in between the brackets and the first instance of any of those characters results in a match. We can simplify the above regular expression by using a range. This is done by entering two character or numbers separated by a -:

> string_to_match = 'i enjoy making stuff'

> regex = /[k-s]/
 => /[k-s]/

> regex.match(string_to_match)
 => <MatchData "n">

As expected, we get the same results with our simplified regular expression.

It’s also possible to invert a character class. This is done by adding a ^ to the beginning of the pattern. If we wanted to look for the first letter not in between ‘k’ and ’s' we would use the pattern /[^k-s]/:

> string_to_match = 'i enjoy making stuff'

> regex = /[^k-s]/
 => /[^k-s]/

> regex.match(string_to_match)
 => <MatchData "i">

Since ‘i’ isn’t in our range the first letter in our string meets the criteria our regular expression specified.

Another thing worth noting is the \d character we used above is an alias for the character class [0-9].

Modifiers

We have the ability to set a regular expression’s matching mode via modifiers. In Ruby this is done by appending characters after the regular expression pattern is defined. A particularly useful matching modifier is the case insensitive modifier i. Let’s take a look:

> string_to_match = 'BACK to BASICS'

> regex = /back to basics/i
 => /back to basics/i

> regex.match(string_to_match)
 => <MatchData "BACK to BASICS">

The regular expression matches our string in spite of the fact that the cases are clearly not the same. We’ll look at another common modifier later on in the blog.

Repetitions

Repetitions give us the ability to look for repeated patterns. We are given the ability to broadly search for that are repeating an indiscriminate number of time, or we can get as granular as the exact number of repetitions we’re looking for.

Let’s try to identify all the numbers in a string again:

> string_to_match = 'The Mavericks beat the Spurs by 21 in game two.'

> regex = /\d/
 => /\d/

> regex.match(string_to_match)
 => <MatchData "2">

Because we used only a single \d we only got the first digit, in this case ‘2’. What we’re actually looking for is the entire number, not just the first digit. We can fix this by modifying our pattern. We need to specify a pattern that will say find any group of contiguous digits. For this we can use the + metacharacter. This tells the regular expression engine to find one or more of the character or characters that match the previous pattern. Let’s take a look:

> string_to_match = 'The Mavericks beat the Spurs by 21 in game two.'

> regex = /\d+/
 => /\d+/

> regex.match(string_to_match)
 => <MatchData "21">

We could also look for an exact number of repetitions. Let’s say we only wanted to look for numbers between 100 and 999. One way we could do that would be using the {n} patern, where n indicates the number of repetitions we’re looking for:

> string_to_match = 'In 30 years the San Francisco Giants have had two 100 win seasons.'

> regex = /\d{3}/
 => /\d{3}/

> regex.match(string_to_match)
 => <MatchData "100">

Our pattern doesn’t match 30, but does match 100 because we told it only three repeating digit characters constituted a match.

Let’s look for words that are only longer than five characters. This will require a new metacharacter, the \w that matches any word character. Then we’ll use the {n,} pattern, which says look for n or more of the previous pattern:

> string_to_match = 'we are only looking for long words'

> regex = /\w{5,}/
 => /\w{5,}/

> regex.match(string_to_match)
 => <MatchData "looking">

You can also specify less than using this pattern {,m} and in between with this {n,m}.

Grouping

Grouping gives us the ability to combine several patterns into one single cohesive unit. This can be very useful when combined with repetitions. Earlier we looked at using repetitions with a single metacharacter \d, but rarely will that be enough to satisfy our needs. Let’s look at how we could define a more complex pattern we expect to see repeated.

Let’s look at how we might create a more complicated regular expression that matches phone numbers in several different formats. We’ll use groups and repetitions to do this:

> phone_format_one = '5125551234'
 => "5125551234"

> phone_format_two = '512.555.1234'
 => "512.555.1234"

> phone_format_three = '512-555-1234'
 => "512-555-1234"

regex = /(\d{3,4}[.-]{0,1}){3}/
 => /(\d{3,4}[\.-]{0,1}){3}/

> regex.match(phone_format_one)
 => <MatchData "5125551234" 1:"234">

> regex.match(phone_format_two)
 => <MatchData "512.555.1234" 1:"1234">

> regex.match(phone_format_three)
 => <MatchData "512-555-1234" 1:"1234">

We have successfully created our regular expression, but there is a lot going on there. Let’s break it down. First we define that our pattern will be made up of groups of three or four digits with this \d{3,4}. Next we indicate that we want to allow for ‘-’ or ‘.’ patterns (we have to escape the ‘.’ because this character is also a metacharacter that acts as a wild card), but that we don’t want to require these characters with this pattern [\.-]{0,1}. Finally we say we need three of this group of patterns by grouping the previous two patterns together and apply a repetition of three (\d{3,4}[.-]{0,1}){3}.

Lazy and Greedy

Regular expressions are by default greedy, which means they’ll find the largest possible match. Often that isn’t the behavior we’re looking for. When creating our patterns it’s possible to tell Ruby we’re looking for a lazy match, or the first possible match that satisfies our pattern.

Let’s look at an example. Let’s say we wanted to parse out the timestamp of a log entry. We’ll start out just trying to grab everything in between the square brackets that we know our log is configured to output the date in. In this pattern we’ll use a new metacharacter. The . is a wildcard in a regular expression:

> string_to_match = '[2014-05-09 10:10:14] An error occured in your application. Invalid input [foo] received.'

> regex = /\[.+\]/
 => /\[.+\]/

> regex.match(string_to_match)
 => <MatchData "[2014-05-09 10:10:14] An error occured in your application. Invalid input [foo]">

Instead of matching just the text in between the first two square brackets it grabbed everything between the first instance of an opening square bracket and the last instance of a closing square bracket. We can fix this by telling the regular expression to be lazy using the ? metacharacter. Let’s take another shot:

> string_to_match = '[2014-05-09 10:10:14] An error occured in your application. Invalid input [foo] received.'

> regex = /\[.+?\]/
 => /\[.+?\]/

> regex.match(string_to_match)
 => <MatchData "[2014-05-09 10:10:14]">

Notice that we added our ? after our repetition metacharacter. This tells the regular expression engine to keep looking for the next part of the pattern only until it finds a match; not until it finds the last match.

Assertions

Assertions are part of regular expressions that do not add any characters to a match. They just assert that certain patterns are present, or that a match occurs at a certain place within a string. There are two types of assertions, let’s take a closer look.

Anchors

The simplest type of assertion is an anchor. Anchors are metacharacters that let us specify positions in our patterns. The thing that makes these metacharacters different is they don’t match characters only positions.

Let’s look at how we can determine if a line starts with Back to Basics using the ^ anchor, which denotes the beginning of a line:

> multi_line_string_to_match = <<-STRING
"> I hope Back to Basics is fun to read.
"> Back to Basics is fun to write.
"> STRING
 => "I hope Back to Basics is fun to read.\nBack to Basics is fun to write.\n"

> regex = /^Back to Basics/
 => /^Back to Basics/

> regex.match(multi_line_string_to_match)
 => <MatchData "Back to Basics">

> match.begin(0)
 => 38

Looking at where our match begins we can see it’s the second instance of the string “Back to Basics” we’ve matched. Another thing to take note of is the ^ anchor doesn’t only match the beginning of a string, but the beginning of a line within a string.

There are many anchors available. I encourage you to review the Regex documentation and check out some of the others.

Lookarounds

The second type of assertion is called a lookaround. Lookarounds allow us to provide a pattern that must be matched in order for a regular expression to be satisified, but that will not be included in a successful match. These are called lookahead and lookbehind patterns.

Let’s say we had a comma delimited list of companies and the year they were founded. Let’s match the year that thoughtbot was founded. In this case we only want the year, we’re not interested in including the company in the match, but we’re only interestedin thougtbot, not the other two companies. To do this we’ll use a positive lookbehind. This means we’ll provide a pattern we expect to appear before the pattern we want to match.

> string_to_match = 'Dell: 1984, Apple: 1976, thoughtbot: 2003'

> regex = /(?<=.thoughtbot: )\d{4}/
 => /(?<=.thoughtbot: )\d{4}/

> regex.match(string_to_match)
 => <MatchData "2003">

Even though the pattern we use to assert the word thoughtbot preceeds our match appears in our regular expression it isn’t included in our match data. This is exactly the behavior we were looking for.

To specify a positive lookbehind we use the ?<=. If we wanted to use a negative lookbehind, meaning the match we want isn’t preceed by some particular text we would use ?<!=.

To do a positive lookahead we use ?=. A negative look ahead is achieved using ?!=.

Captures

Another useful tool is called a capture. This gives us the ability to match on a pattern, but only captures parts of the pattern that are of interest to us. We accomplish this by surrounding the pattern data we intend to capture with parenthesis, which is also how we specify a group. Let’s look at how we might pull the quantity and price for an item off of an invoice:

> string_to_match = 'Mac Book Pro - Quantity: 1 Price: 2000.00'

> regex = /[\w\s]+ - Quantity: (\d+) Price: ([\d\.]+)/
 => /[\w\s]+ - Quantity: (\d+) Price: ([\d\.]+)/ 

> match = regex.match(string_to_match)
 => <MatchData "Mac Book Pro - Quantity: 1 Price: 2000.00" 1:"1" 2:"2000.00"> 

> match[0]
 => "Mac Book Pro - Quantity: 1 Price: 2000.00" 

> match[1]
 => "1"

> match[2]
 => "2000.00"

Notice we have all the match data in an array. The first element is the actual match and the second two are our captures. We indicate we want something to be captured by surrounding it in parentheses.

We can make working with captures simpler by using what is called a named capture. Instead of using the match data array we can provide a name for each capture and access the values out of the match data as a hash of those names after the match has occurred. Let’s take a look:

> string_to_match = 'Mac Book Pro - Quantity: 1 Price: 2000.00'

> regex = /[\w\s]+ - Quantity: (?<quantity>\d+) Price: (?<price>[\d\.]+)/
 => /[\w\s]+ - Quantity: (?<quantity>\d+) Price: (?<price>[\d\.]+)/

> match = regex.match(string_to_match)
 => <MatchData "Mac Book Pro - Quantity: 1 Price: 2000.00" quantity:"1" price:"2000.00">

> match[:quantity]
 => "1"

> match[:price]
 => "2000.00"

Strings

There are also some useful functions that take advantage of regular expressions in the String class. Let’s take a look at some of the things we can do.

sub and gsub

The sub and gsub methods both allow us to provide a pattern and a string to replace instances of that pattern with. The difference between the two methods is that gsub will replace all instances of the pattern, while sub will only replace the first instance.

The gsub method gets its name from the fact that matching mode (discussed above) is set to global, which is accomplished using the modifier code g hence the name.

Let’s take a look at some examples.

> string_to_match = "My home number is 5125551234, so please call me at 5125551234."
 => "My home number is 5125551234, so please call me at 5125551234."

> string_to_match.sub(/5125551234/, '(512) 555-1234')
 => "My home number is (512) 555-1234, so please call me at 5125551234."

When we use sub we can see we still have one instance of our phone number that isn’t formatted. Let’s use gsub to fix it.

> string_to_match.gsub(/5125551234/, '(512) 555-1234')
 => "My home number is (512) 555-1234, so please call me at (512) 555-1234."

As expected gsub replaces both instances of our phone number.

While our previous example demonstrates the way the functions work it isn’t a particularly useful regular expression. If we were trying to format all the phone numbers in a large document we obviously couldn’t make our pattern the number in each case, so let’s revisit our example and see if we can make it more useful.

> string_to_match = "My home number is 5125554321. My office number is 5125559876."
 => "My home number is 5125554321. My office number is 5125559876." 

> string_to_match.gsub(/(?<area_code>\d{3})(?<exchange>\d{3})(?<subscriber>\d{4})/, '(\k<area_code>) \k<exchange>-\k<subscriber>')
 => "My home number is (512) 555-4321. My office number is (512) 555-9876."

Now our regular expression will format any phone number in our string. Notice that we take advantage of named captures in our regular expression and use them in our replacement by using \k.

scan

The scan method lets pull all reular expression matches out of a string. Let’s look at some examples.

> string_to_scan = "I've worked in TX and CA so far in my career."
 => "I've worked in TX and CA so far in my career." 

> string_to_scan.scan(/[A-Z]{2}/)
 => ["TX", "CA"]

Using a regular expression we pull out all the state codes in our string. One thing to keep in mind as you continue to learn is pay close attention to the assorted metacharacters available and how their meanings change depending context. Just in this introductory blog we saw multiple meaings for both the ^ and ? character and we didn’t even cover all of the possible meanings of even those two characters. Sorting out when each metacharacter means what is one of the more difficult parts of mastering regular expressions.

Regular expressions are one of the most powerful tools we have at our disposal with Ruby. Keep them in mind as you code and you’ll be surprised how often they can provide a nice clean solution to an otherwise daunting task!

What’s next?

If you found this useful, you might also enjoy: