Quote of the day - DHH on designers

Posted about 1 year back at Beyond The Type - Home

"Now there is dynamic real pieces of data. They don't have to use Lorem Ipsum to fill it up with crap."

DHH on designers using Rails view templates from the Scott Hanselman podcast

Nice podcast. Some great material in there.

Ruby-esque JMX -- Part 2

Posted about 1 year back at Revolution On Rails

Below is the code from the JMX tinkering. I snake-ized the keys to be more Ruby-esque, per Ed's suggestion.

require 'java'
include Java

# Obviously stolen from Rails ActiveSupport for underscoring strings
class String
  def underscore
    self.to_s.gsub(/::/, '/').
      gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
      gsub(/([a-z\d])([A-Z])/,'\1_\2').
      tr("-", "_").
      downcase
  end
end

module JMX
  class MBean
    include_class 'javax.management.ObjectName'
    include_package "java.lang.management"
    include_package "javax.management"
    include_package "javax.management.remote"
    include_class "java.util.HashMap"
    attr_accessor :name

    def initialize(object_name)
      @object_name = object_name
      @name = @object_name.to_s
    end

    def attributes
      attrs = MBean.connection.getMBeanInfo(@object_name).attributes rescue []
      attrs.inject({}) do |list, a|
        list[a.name.underscore] = MBean.connection.getAttribute(@object_name, "#{a.name}") rescue "Unknown"
        list
      end
    end

    def self.find_all_by_name(name)
      object_name = ObjectName.new(name)
      beans = MBean.connection.queryMBeans(object_name, nil)
      beans.collect {|bean| MBean.new(bean.get_object_name)}
    end

    def self.find_by_name(name)
      #obviously inefficient
      find_all_by_name(name).first
    end

    def self.connection
      @@mbsc ||= begin
        #load from some config file later maybe
        url = "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi"
        connector = JMXConnectorFactory::connect(JMXServiceURL.new(url), HashMap.new)
        connector.getMBeanServerConnection
      end
    end

    protected

    def method_missing(method, *args, &block)
      attributes.keys.include?(method.to_s) ? attributes[method.to_s] : super
    end
  end
end

#Find all the MBeans matching some object name
mbeans = JMX::MBean.find_all_by_name("java.lang:*")
puts "Found #{mbeans.size} beans"

#Find a bean, and get its attribute
bean = JMX::MBean.find_by_name("java.lang:type=ClassLoading")
puts "There are #{bean.loaded_class_count} classes loaded in that VM"
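
To see why a call like bean.loaded_class_count resolves, here is a small standalone sketch of the snake-izing step. The snake_case helper name and the sample attribute names are only illustrative; the regexes are the ones from the underscore method above, and the sketch runs in plain Ruby without JRuby or a JMX connection.

```ruby
# Standalone version of the underscore logic, written as a plain
# function so it can be tried without JRuby or a live MBean server.
def snake_case(name)
  name.to_s.gsub(/::/, '/').
    gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2').
    gsub(/([a-z\d])([A-Z])/, '\1_\2').
    tr('-', '_').
    downcase
end

# Sample JMX-style attribute names (illustrative only).
%w[LoadedClassCount HeapMemoryUsage Verbose].each do |attr|
  puts "#{attr} -> #{snake_case(attr)}"
end
```

Because the attributes hash is keyed by the snake-ized name, method_missing can match a Ruby-style method name against it directly.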

Digesting RailsConf 2007

Posted about 1 year back at Loud Thinking

Going from five to sixteen hundred people is a big risk for a conference. There's so much to lose: the atmosphere, the coherence of content, and the interestingness of the people. But in my mind we didn't lose any of it. RailsConf 2007 was a roaring success.

There were so many great debates going on, so much fascinating work happening, and so many extraordinary tales of adoption. It was wonderful to meet up with people like Martin Fowler, Ward Cunningham, Tim Bray, Dave Thomas, Robert Martin, and other industry leaders.

But in many ways, even more wonderful was the level of involvement from everyone else. I remember RubyConf '03 when we just had a couple of people doing professional Ruby work. This year at RailsConf more than half the room raised their hand when I asked how many were working professionally with Rails. What a leap.

So many people doing applications in all niches and of all shades. Plenty of startups, naturally, but also plenty of so-called enterprise operations. From banks to insurance companies. ThoughtWorks announcing that 40% of all new business in the US is Ruby on Rails projects. Wow.

I loved the fact that it wasn't all about the nitty gritty stuff either. We had an Extra Action marching band that pushed the comfort level of many on the fun side of things.

And on the more serious side, Alan Francis explored the similarities between the Rails and XP movements on a higher plane of approach, angry teenager-tendencies, and peaks.

I also much enjoyed the fact that it was broader than just the Ruby and Rails circle. That we had Avi Bryant talk to us about this magical parallel universe of Smalltalk. And that we attracted people like Scott Hanselman from the .NET world (and that he posed plenty of opposing opinions that we sorta captured in a podcast with Martin Fowler and me).

All in all, a spectacular extended weekend. It made me all the more excited for turning another chapter in the conference book in Berlin come late September with RailsConf Europe.

All photos by the always awesome James Duncan Davidson

Cranking up the machinery

Posted about 1 year back at Loud Thinking

So I finally had a few spare moments to work on the Loud Thinking machine again. Instead of going with one of the million packages out there, I decided to eat some dog food and just roll my own.

Yes, yes, terribly inefficient from a productivity perspective, but I indulged myself with a learning experience on how it feels to set up a small Rails application from scratch using Ubuntu Feisty, nginx, Mongrel, and SQLite3.

As a side-effect, I haven't bothered implementing comments for my little machine just yet. And I'm thinking that's actually a blessing in part disguise. I think I'll be happy with the tranquillity for a while.

Using MySQL reserved words as model names

Posted about 1 year back at Spejman On Rails

Generating a model with a migration in a Ruby on Rails application leads to the creation of a database table named with the pluralized form of the model name. In MySQL, a migration will generate a SQL statement like:


CREATE TABLE model_name_pluralized (`id` int(11) DEFAULT NULL auto_increment PRIMARY KEY,
`created_on` date DEFAULT NULL, `name` varchar(255) DEFAULT NULL) ENGINE=InnoDB

As you can see, column names are quoted but the table name is not. If you use the singularized form of a MySQL reserved word as a model name, the migration that creates this model will generate a statement that leads to an error like:

Mysql::Error: You have an error in your SQL syntax; check the manual that corresponds to your
MySQL server version for the right syntax to use near 'databases (`id` int(11) DEFAULT NULL
auto_increment PRIMARY KEY, `created_on` da' at line 1: CREATE TABLE databases (`id` int(11)
DEFAULT NULL auto_increment PRIMARY KEY, `created_on` date DEFAULT NULL, `name` varchar(255)
DEFAULT NULL) ENGINE=InnoDB
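
For comparison, here is a minimal sketch of what quoting the table name as well would produce. The quote_identifier and create_table_sql helpers are hypothetical names made up for this illustration; they are not the Rails adapter's API and not the code from any of the patches mentioned below.

```ruby
# Hypothetical helper: wrap a MySQL identifier in backticks,
# doubling any backtick that appears inside the name.
def quote_identifier(name)
  "`#{name.to_s.gsub('`', '``')}`"
end

# Build a CREATE TABLE statement with the table name quoted
# the same way the column names already are.
def create_table_sql(table, columns)
  cols = columns.map { |col, type| "#{quote_identifier(col)} #{type}" }
  "CREATE TABLE #{quote_identifier(table)} (#{cols.join(', ')}) ENGINE=InnoDB"
end

puts create_table_sql('databases', [['id', 'int(11)'], ['name', 'varchar(255)']])
```

MySQL accepts a reserved word as an identifier when it is quoted with backticks, so a statement built this way should not trip over a table called databases.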

Last week I read Josh Susser's Laying Tracks slides, which encouraged me to write a patch for this issue.

Before writing anything, I checked whether someone had already done something related, and I found some related tickets in the Rails Trac.

The most interesting of these tickets is #4905, which fixes all MySQL statements to prevent reserved-word crashes, but I don't know why it isn't included in the code, because its last history entry is from 05/25/2006. #7850 was closed as a duplicate of #4905. And the history of #3631 finishes with "don't use reserved words", which in my opinion isn't the best solution.

In brief, the problem exists (I can't name my models with names like "database", "exist", ...) and so does the patch (#4905). So, what should we do to fix this problem?

Episode 37: Simple Search Form

Posted about 1 year back at Railscasts

A search form is quite different from other forms because it does not deal with a model's attributes. See a good way to add a simple search form in this episode.

Clever Caching

Posted about 1 year back at Koz Speaks - Home

Besides our talk, my most valuable experience at RailsConf was talking with Tobi about the new caching strategy (‘Tobi caching’) he’s using at Shopify. There are a few parts which all work together nicely.

Etags Matter

As Joe explained, using good etags can substantially reduce your bandwidth bill. In his case it was a 70% reduction. The takeaway from this is that you need to think about how you’re going to generate opaque cache coherency values for your actions. For a good intro to HTTP conditional GETs, go read this tutorial by charles.

Expiry is a Pain

Anyone who’s had to write sweepers for an application with heavy caching knows how frustrating it can be. After all, cache invalidation is one of the two hard things in computer science. If you could somehow avoid expiring all the ‘stuff’ you’re caching, your life would be much, much easier.

Memcache is Smart

Memcache and the Memcache client libraries have plenty of smarts built into them, despite being ‘dumb by design’. The client libraries use clever hashing to know which server to talk to. This lets you run a cluster of caches without worrying too much about which keys live on which server.
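
As a rough illustration of the client-side idea, here is a naive sketch that maps a key to a server by hashing it. The host names are made up, and real client libraries may use a different scheme (consistent hashing, for example), so treat this as the simplest possible version rather than what any particular client does.

```ruby
require 'zlib'

# Hypothetical cache hosts, for illustration only.
SERVERS = ['cache1:11211', 'cache2:11211', 'cache3:11211']

# Pick a server for a key: hash the key, then take it modulo
# the number of servers. The same key always lands on the same server.
def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.size]
end

%w[post:1 post:2 post:3].each do |key|
  puts "#{key} -> #{server_for(key)}"
end
```

Every application server running the same function with the same server list agrees on where a key lives, without any coordination between the caches themselves.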

The server also has its own smarts about what keys are important. When it needs the memory, memcached will drop the least recently used values, thereby ensuring that your unused keys won’t be ‘wasting space’.
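
The least-recently-used behaviour can be sketched in a few lines of plain Ruby. This LruCache class is a toy written for this illustration; real memcached eviction is more involved, and the capacity here counts entries rather than bytes.

```ruby
# Toy LRU cache. It relies on Ruby hashes keeping insertion order
# (Ruby 1.9 and later): the first key is the least recently used.
class LruCache
  def initialize(capacity)
    @capacity = capacity
    @store = {}
  end

  # Reading a key moves it to the "most recently used" end.
  def get(key)
    return nil unless @store.key?(key)
    value = @store.delete(key)
    @store[key] = value
  end

  # Writing a key may push out the least recently used entry.
  def put(key, value)
    @store.delete(key)
    @store[key] = value
    @store.delete(@store.keys.first) if @store.size > @capacity
    value
  end

  def keys
    @store.keys
  end
end

cache = LruCache.new(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')     # 'a' is now the most recently used entry
cache.put('c', 3)  # over capacity, so 'b' is dropped
puts cache.keys.inspect
```

The point for what follows is that stale entries nobody asks for simply fall out of the cache on their own.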

Mix it all together

So with that in mind, what can we do to improve our application’s performance, and simplify our application?

Forget about expiry

As mentioned before, expiry is a complete pain in the ass. So let’s not do it. The key to getting away with this is to pick a key which completely encapsulates the resource you’re caching, and also ensures that if anything relevant changes, the key changes. Take the case of this blog post: a simple key would be the permalink. However, if we used that, we’d need to expire the cache every time someone commented, or I corrected a typo.

The no-expiry alternative would be for Mephisto to keep a ‘version number’ associated with each post and increment it every time someone commented, or the post body changed. Once it was doing that, we could construct a key that looked like www.koziarski.net:clever-caching:#{version_number}. Every time the version number changed, we’d get a cache miss and regenerate the content, but subsequent requests would be served out of memcache. No more expiry!
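
Here is a minimal sketch of that versioned-key idea, assuming a plain in-memory hash as a stand-in for memcache and a made-up Post struct. The names CACHE, Post, cache_key and fetch_post are all hypothetical, and the "rendering" is a placeholder string.

```ruby
CACHE = {} # in-memory stand-in for memcache

Post = Struct.new(:permalink, :body, :version)

# The key includes the version, so any change to the post
# produces a brand new key instead of requiring an expiry.
def cache_key(host, post)
  [host, post.permalink, post.version].join(':')
end

# Returns [:hit or :miss, content].
def fetch_post(host, post)
  key = cache_key(host, post)
  if CACHE.key?(key)
    [:hit, CACHE[key]]
  else
    CACHE[key] = "<p>#{post.body}</p>" # placeholder for real rendering
    [:miss, CACHE[key]]
  end
end

post = Post.new('clever-caching', 'First draft', 1)
puts fetch_post('www.koziarski.net', post).first # miss
puts fetch_post('www.koziarski.net', post).first # hit

post.body = 'Typo fixed'
post.version += 1
puts fetch_post('www.koziarski.net', post).first # miss again, under the new key
```

The old entry under version 1 is never deleted by the application. It just stops being requested and is left for the cache to evict.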

Now that we’ve saved all that CPU time, we should see if there’s a way we can save some bandwidth too.

Embrace Etags

Thankfully, our cache key has all the properties of an ETag: whenever something important changes, our cache key does. So let’s use that as the basis of our ETag by taking its MD5 hash. The only reason I don’t advocate using the cache key itself is that you may want to include sensitive data in the key. Now we can just chuck d444415a8228fbed44cfa7ef39f15d8b into the ETag header, and compare our hash with the value of ‘If-None-Match’ from the request headers.
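
The ETag step can be sketched on its own, outside of Rails. The etag_for and not_modified? helpers are hypothetical names for this illustration, and the sample key is made up. Strictly speaking HTTP wants ETag values wrapped in double quotes; the sketch skips that to stay close to the snippet at the end of this post.

```ruby
require 'digest/md5'

# Hash the cache key so that nothing sensitive in it leaks
# into a response header.
def etag_for(cache_key)
  Digest::MD5.hexdigest(cache_key)
end

# True when the client already holds the current version,
# in which case a 304 Not Modified with no body is enough.
def not_modified?(if_none_match, cache_key)
  if_none_match == etag_for(cache_key)
end

key = 'www.koziarski.net:clever-caching:7' # illustrative key
etag = etag_for(key)
puts not_modified?(etag, key)                                    # client is up to date
puts not_modified?(etag, 'www.koziarski.net:clever-caching:8')   # version changed
```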

Conclusion

By doing this you get the bandwidth savings of HTTP caching and the performance boost of action caching, but without the difficult expiry code. You can avoid all the NFS-related headaches of page caching, but still get most of the performance boost.

While the approach won’t suit every project, it could well suit yours. Finally, a snippet of sorts for those of you who think in code:

around_filter :cache_sensibly, :only => :show

def cache_sensibly
  # compose the key using something we know matches our business
  cache_response(request.host, request.request_uri, @blog.version, @post.version) { yield }
end

private

def cache_response(*keys)
  key = keys * ':'
  # use the hash as an etag so we can cache on
  # private data
  etag = MD5.hexdigest(key)

  # first handle HTTP, lets us avoid a memcache hit
  # and saves a huge amount of bandwidth to the client
  if request.env["HTTP_IF_NONE_MATCH"] == etag
    headers["X-Cache"] = "HTTP"
    head :not_modified
    return
  end
  response.headers["ETag"] = etag

  # Next check memcache
  if data = Cache.get(key)
    # render from the cached values
    headers["Content-Type"] = data[:content_type]
    headers["X-Cache"] = "HIT"
    render :text => data[:content], :status => data[:status]
  else
    # Finally, yield, indicate we've missed then cache the response
    headers["X-Cache"] = "MISS"
    yield
    Cache.put(key, {
      :content      => response.body,
      :status       => headers["Status"].to_i,
      :content_type => (response.content_type || "text/html")
    })
  end
end

RejectConf2007 Videos

Posted about 1 year back at Synthesis

Found a couple of videos from RejectConf, which I missed out on while at RailsConf. And since there were 4 parallel tracks at RailsConf, the most anyone could see is 25% of it. If anyone knows links to more videos from RailsConf07 or RejectConf07, I’d love to know.

Loading all Rails test fixtures with fixtures :all

Posted about 1 year back at Cody Fauser

Are you as tired as we were of loading 20+ different fixtures in each of your Rails test classes? We were, and we even added a method all_fixtures() to test_helper.rb to do the loading of all our fixtures for us.

Thankfully though, we don't need our own helper method anymore, as the Rails fixtures() method will now accept a symbol :all, which will instruct the test helper to load all of your fixtures automatically.

require File.dirname(__FILE__) + '/../test_helper'

class ShopTest < Test::Unit::TestCase
  fixtures :all

  # Your tests here
end

As of Rails 1.2.3 this feature has not yet been merged from the trunk. This means that you'll either need to run Edge Rails from Subversion, or install the beta Rails gems as follows:


sudo gem install -s http://gems.rubyonrails.org rails -y

Happy testing!

Ruby-esque JMX

Posted about 1 year back at Revolution On Rails

The topic of JMX on JRuby came up recently and I decided to play around. I found a great starter on Jeff Mesnil's blog, but I decided I hated the syntax.

Ruby has spoiled me. ActiveRecord has spoiled me.

So I cooked up this little (fully working) example:

#Find all the MBeans matching some object name
mbeans = JMX::MBean.find_all_by_name("cacheStatistics:*")

mbeans.each do |bean|
  puts "#{bean.name} "

  #Either use methods on the bean object
  puts " - CacheHits: #{bean.CacheHits}"

  #Or access the attributes hash.
  puts " - CacheMisses: #{bean.attributes["CacheMisses"]}"
end


The code is ~50 lines, which I'll post at some point.

I never thought working with Java objects could be made to "feel" nice.

I was also chatting with "headius" on #jruby, and he mentioned that Rob Harrop, of Spring fame, had a talk at JavaOne about something similar called MScript. I'd love to get my hands on those slides.

Acts As Fast But Very Inaccurate Counter

Posted about 1 year back at Revolution On Rails

Introduction

If you have chosen the InnoDB MySQL engine over MyISAM for its support of transactions, foreign keys, and other niceties, you might be aware of its limitations, such as a much slower count(*). Our DBAs are constantly on the lookout for slow queries in production and for ways to keep the DBs happy, so they recommended that we try to fix count(). They suggested checking SHOW TABLE STATUS for an approximate count of rows in a table. This morning I wrote acts_as_fast_counter, which proved that the speed is indeed improved but that the accuracy might not be acceptable. The rest of the post just records the details of the exercise.

The approach

I created a model per engine and seeded each with 100K records. Then I ran count on each model a thousand times and measured the results.

The code:

module ActiveRecord; module Acts; end; end

module ActiveRecord::Acts::ActsAsFastCounter

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods

    def acts_as_fast_counter
      self.extend(FastCounterOverrides)
    end

    module FastCounterOverrides

      def count(*args)
        if args.empty?
          connection.select_one("SHOW TABLE STATUS LIKE '#{ table_name }'")['Rows'].to_i
        else
          super(*args)
        end
      end

    end

  end

end

ActiveRecord::Base.send(:include, ActiveRecord::Acts::ActsAsFastCounter)

# create_table :myisams, :options => 'engine=MyISAM' do |t|
#   t.column :name, :string
# end
# 100_000.times { Myisam.create(:name => Time.now.to_s) }
#
# create_table :innodbs, :options => 'engine=InnoDB' do |t|
#   t.column :name, :string
# end
# 100_000.times { Innodb.create(:name => Time.now.to_s) }

class Bench

  require 'benchmark'
  require 'acts_as_fast_counter'

  def self.run
    measure
    show_count
    convert_to_fast_counter
    show_count
    add_records
    show_count
    destroy_records
    show_count
    measure
  end

  def self.measure
    puts "* Benchhmarks:"
    n = 1_000
    Benchmark.bm(12) do |x|
      x.report('MyISAM') { n.times { Myisam.count } }
      x.report('InnoDB') { n.times { Innodb.count } }
    end
  end

  def self.convert_to_fast_counter
    Innodb.send(:acts_as_fast_counter)
    puts "* Converted Innodb to fast counter"
  end

  def self.add_records
    @myisam = Myisam.create(:name => 'One more')
    @innodb = Innodb.create(:name => 'One more')
    puts "* Added records"
  end

  def self.destroy_records
    @myisam.destroy
    @innodb.destroy
    puts "* Destroyed records"
  end

  def self.show_count
    puts "* Record count:"
    puts " MyISAM: #{ Myisam.count }"
    puts " InnoDB: #{ Innodb.count }"
  end

end


The results:

* Benchhmarks:
                  user     system      total        real
MyISAM        0.180000   0.040000   0.220000 (  0.289983)
InnoDB        0.430000   0.070000   0.500000 ( 35.102496)
* Record count:
 MyISAM: 100000
 InnoDB: 100000
* Converted Innodb to fast counter
* Record count:
 MyISAM: 100000
 InnoDB: 100345
* Added records
* Record count:
 MyISAM: 100001
 InnoDB: 100345
* Destroyed records
* Record count:
 MyISAM: 100000
 InnoDB: 100345
* Benchhmarks:
                  user     system      total        real
MyISAM        0.250000   0.030000   0.280000 (  0.350673)
InnoDB        0.250000   0.040000   0.290000 (  0.977711)


Final thoughts

The MySQL manual has a clear warning about the inaccuracy of the row count in the SHOW TABLE STATUS results:

Rows - The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40 to 50%. In such cases, use SELECT COUNT(*) to obtain an accurate count.


The test confirms it by showing 345 more records than expected, which makes it not very useful except for some edge cases. If you know a way to improve the speed of count() on InnoDB with some other approach beyond using a counter table, please share.

