

Ruby

RubyConf*MI - the word is “Testing”

Published in Conference, Ruby


I went to my fourth conference this summer. Conferences, especially small ones, are a great chance to get quick insights into new ideas or technologies, but most importantly to share ideas with a lot of other people.

The presentations were good. Many revolved around testing, with some covering performance and deployment. There were very few presentations of novel ideas or techniques. That’s where discussions come in.

I’m a new Ruby programmer who came to the party by way of the popular gateway drug, Rails. So I’m still behind on good Ruby tools and techniques. Pat Eyler gave an overview of various Ruby libraries for testing and performance, such as test/unit, autotest, unit_diff, rspec, rcov, ruby-prof, and benchmark.

He segued very well into SouthEast Michigan’s own Patrick Hurley’s talk on C optimization. Coming from a C/C++ background, I appreciate how easy it is to add small chunks of fast C code to dramatically speed up code (Patrick’s example showed a 20-times speed increase).

Back in Rails (esque) land, Zach Dennis talked about his ActiveRecord::Extensions (AR::E). ActiveRecord isn’t Rails-specific, but it makes up a big part of the functionality behind Rails. AR::E adds a lot of very convenient functionality as well as speed improvements to database operations.

And of course, we made another pilgrimage to the Grand Rapids Brewing Company.


Ruby Conference Michigan - Saturday August 26

Published in Conference, Ruby, Travel


I’ll be at RubyConf*MI, which is the premier Ruby event in Michigan.

hCalendar info

RubyConf*MI: Saturday, August 26, 9AM-5PM, at Science Building at Calvin College, Grand Rapids, Michigan

It’s just a 1-day conference, but should be a great time. Get to see David Black, author of Ruby for Rails, speak, as well as SouthEast Michigan’s own Patrick Hurley.

If you’re in the Grand Rapids, Michigan area, stop by.


How does a Framework Scale and not splinter?

Published in Rails, Ruby


One of the discussions that came up at BarCamp Grand Rapids was how a framework (and language) can scale and grow without splintering too much. There were a disproportionate number of Java developers present, and one of the few complaints they had about Java was the large number of frameworks that were available. None had market dominance or a clear set of features. Every week new frameworks pop up.

Compare that to the few of us who were Rails proponents. Currently, Rails is the only Ruby framework with any market share (are there even any others?). The question was whether, in the future, more Ruby frameworks will show up, steal market share and mindshare, and splinter the community. Diversity is good, and new ideas help spur innovation. However, large fracturing confuses new developers and makes support and interoperability difficult.

One definite way to address this possible problem is by keeping the core framework simple and effective and having good support for extensions, plugins, and additions by other developers. Rails currently does this very well, and it looks like this will become even more solid in the future.

A Ruby on Rails Plugin Repository is being realized and should be fully supported soon. Luke Redpath has a good discussion on the current progress and future path. To date, the plugin repository has been wiki based, or required knowing the appropriate subversion repository for a plugin. This new effort will centralize plugins, promote proper documentation, annotation, and testing.

Perl has CPAN, an absolutely incredible repository of modules that has probably kept Perl alive and made it a very powerful language. RubyGems provides a very similar infrastructure for distributing great enhancements to the language. Hopefully, the Ruby on Rails Plugin Repository will keep the community united and working to support and build on a single framework while still allowing developers to bring in their application-specific features.


Microformat Ruby Parser

Published in Programming, Ruby


With help from Assaf, I’ve generalized the Microformat parsers even more. Now, there is a base-class Microformat that provides a structure for any sub-class to be created that specifies a Microformat that you want to parse.

It’s not robust, in that it assumes that the important information is actually in the text of the tag and that nothing is in any of the tag’s attributes. Attribute support should probably be added, with each property taking an array that specifies whether the value comes from the text or from a named attribute.


class Microformat < Scraper::Base
  def self.properties(*symbols)
    symbols.each do |symbol|
      html_class = symbol.to_s.gsub(/_/, "-")
      process ".#{html_class}", symbol=>["abbr@title", "a@href", :text]
    end
  end
end

class Geo < Microformat
  properties :latitude, :longitude
end
class Adr < Microformat
  properties :post_office_box,
    :extended_address,
    :street_address,
    :locality,
    :region,
    :postal_code,
    :country_name
end
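
The only real logic inside properties is the name mapping: each property symbol becomes a CSS class selector by swapping underscores for hyphens. Here is a minimal sketch of that one step in plain Ruby, with no scrAPI required. The selector_for helper is a hypothetical name used only for illustration; it is not part of scrAPI.

```ruby
# Hypothetical helper (not part of scrAPI): mirrors the name mapping
# done inside Microformat.properties.
def selector_for(symbol)
  html_class = symbol.to_s.gsub(/_/, "-")  # :street_address -> "street-address"
  ".#{html_class}"                         # CSS class selector
end

[:latitude, :street_address, :post_office_box].each do |property|
  puts "#{property.inspect} -> #{selector_for(property)}"
end
```

So :street_address is looked up as .street-address, which matches the hyphenated class names the adr microformat uses.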

To use it, you just define a generalized scraper:


class Location < Scraper::Base
  array :geos
  array :adrs
  process ".adr", :adrs => Adr
  process ".geo", :geos => Geo
  result :geos, :adrs
end

which will now return an object with .geos and .adrs holding the geo and adr results, respectively.

Updated: Assaf pointed out that he added the ability to pull out the correct data if the microformat was stored in an abbr vs. a span.
process ".#{html_class}", symbol=>["abbr@title", "a@href", :text]
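
One way to read the array form is that the sources are tried in order and the first one that yields a value wins. Here is a rough plain-Ruby sketch of that fallback idea, assuming that reading is right. The first_value method and the element hashes are made up for illustration and are not scrAPI's API or its implementation.

```ruby
# Stand-in for a parsed element: a hash with the tag name, its attributes,
# and its inner text. Sources are strings like "abbr@title" (tag@attribute)
# or the symbol :text.
def first_value(element, sources)
  sources.each do |source|
    value =
      if source == :text
        element[:text]
      else
        tag, attribute = source.split("@")
        # Only applies when the element is the named tag.
        element[:attrs][attribute] if element[:tag] == tag
      end
    return value unless value.nil? || value.empty?
  end
  nil
end

sources = ["abbr@title", "a@href", :text]

abbr = { tag: "abbr", attrs: { "title" => "35.126" }, text: "about 35 N" }
span = { tag: "span", attrs: {}, text: "-80.764" }

puts first_value(abbr, sources)  # value comes from the title attribute
puts first_value(span, sources)  # falls back to the element text
```

With this ordering, an abbr that carries the machine-readable value in its title is preferred over its human-readable text, while a plain span still works as before.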


scrAPI - Microformat Parsing in Ruby

Published in Programming, Project, Ruby


I was looking for some nice Ruby utility to help in parsing out Microformats from webpages. There are 3 projects currently on RubyForge:

Talking on #microformats, I was pointed to LabNotes’ newer incarnation of a parser: scrAPI. It’s a much more generic HTML parser/scraper that can get data from HTML by structure, class, or id. Here is Assaf’s presentation at Mashup Camp II, where he gives some good tutorials and discussion about the API.

Down and dirty with the code

To illustrate scrAPI, I’ll show you the code needed to parse geo location data from a webpage.

First we just bring in the necessary libraries and get an example HTML page:


require 'scrapi/lib/scrapi'
require "net/http"

h = Net::HTTP.new("code.highearthorbit.com", 80)
resp = h.get("/greaseroute/index.php")
data = resp.body  # the page HTML as a string

Then we define our scrapers. The geo microformat looks like:
<div class="geo">
<span class="latitude">35.126</span>,
<span class="longitude">-80.764</span>
</div>

The process method of the Geo class can take an HTML structure path, CSS class, or id, and then the attribute to store. Our general Location scraper will then look for all geo class tags in the HTML and fill out the geos array using the Geo class scraper.


class Geo < Scraper::Base
  process ".latitude", :latitude => :text
  process ".longitude", :longitude => :text
end

class Location < Scraper::Base
  array :geos
  process ".geo", :geos => Geo
  result :geos
end

Finally, now that we’ve built up our “tools”, we can scrape the data, and output all the found locations.


locations = Location.scrape(data)

locations.each {|loc| puts "[#{loc.latitude} x #{loc.longitude}]" }

That was really easy and effective. Additionally, thanks to the Microformats standards, we can feel pretty confident that changes to the original site’s markup won’t mess up our parsing.