I have been looking over the FOSS4G schedule of sessions. It's all table-based, which makes it somewhat difficult to find specific tracks, rooms, etc. So I took the table-based, non-semantic calendar and converted it into much more useful hCalendar output, which can easily be translated to iCal for your subscription fun using Brian Suda's X2V.
Here is the current HTML of the schedule. As you can see, the DOM is an absolute mess. This table is in fact already the fourth nested table (tables-within-tables-within-tables, oh my!)
celspacing="0" cellpadding="0" bgcolor="#E6E6E6">
Tuesday, 12 September 2006
cellspacing="1" width="100%" style="padding:3px;
border-top:1px solid #E6E6E6;border-bottom:1px solid #E6E6E6;">
(Amphipôle (niv. 3): 07:00 - 09:00)
 Getting Started with MapServer
by Mr. Jeff MCKENNA (DM Solutions Group)
Buried in the middle are some actually interesting bits, such as the presentation title, author, times, etc. So what we need to do is walk through all of this and build up a conference.
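As a rough illustration of pulling those bits out once a session's text is in hand, here is a minimal sketch using only Ruby's standard library. It is not the scrAPI-based parser itself: the method names `parse_slot` and `parse_speaker` are hypothetical, and the patterns assume every entry follows the same shape as the sample lines above.

```ruby
# Hypothetical helpers; the patterns are assumptions based only on the
# sample schedule lines, not on the full FOSS4G markup.

# Matches "(Room name: HH:MM - HH:MM)". The room itself may contain
# parentheses, so the greedy match backtracks to the colon before the times.
SLOT_PATTERN = /\A\((?<room>.+):\s*(?<start>\d{2}:\d{2})\s*-\s*(?<finish>\d{2}:\d{2})\)\z/

# Matches "by Speaker Name (Organization)".
SPEAKER_PATTERN = /\Aby\s+(?<name>.+?)\s+\((?<org>[^()]+)\)\z/

# Returns a hash with :room, :start and :finish, or nil if the line does not match.
def parse_slot(line)
  m = SLOT_PATTERN.match(line.strip)
  return nil unless m
  { room: m[:room], start: m[:start], finish: m[:finish] }
end

# Returns a hash with :name and :organization, or nil if the line does not match.
def parse_speaker(line)
  m = SPEAKER_PATTERN.match(line.strip)
  return nil unless m
  { name: m[:name], organization: m[:org] }
end

p parse_slot("(Amphipôle (niv. 3): 07:00 - 09:00)")
p parse_speaker("by Mr. Jeff MCKENNA (DM Solutions Group)")
```

Returning nil for lines that do not match keeps the caller free to skip headers and spacer rows instead of raising on them.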
With some slick Ruby scripting and the very useful scrAPI from Assaf, we can define scrapers that walk over the multiple days and then, within each day, grab each of the sessions. These are then output in proper hCalendar format, like:
Enabling Users to Produce personalized Geodata
Mr. Andrew TURNER, HighEarthOrbit
Friday, 15 September 2006 from 10:30-
at the Amphimax MAX 350
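To make the output step concrete, here is a minimal sketch of rendering one session as an hCalendar `vevent`. It is an illustration under assumptions rather than the actual script: the `Session` struct and `to_hcalendar` method are hypothetical names, the speaker is wrapped in an embedded hCard (`vcard`, `fn`, `org`) as one reasonable choice, and the date is assumed to be already converted to ISO 8601.

```ruby
require "erb"

# Hypothetical container for one scraped session; field names are assumptions.
Session = Struct.new(:title, :speaker, :organization, :date, :start, :finish, :room,
                     keyword_init: true)

# Render one session as an hCalendar vevent.
# date is an ISO 8601 date ("YYYY-MM-DD"); start and finish are "HH:MM".
def to_hcalendar(session)
  # Escape every scraped value so stray markup cannot break the output.
  esc = ->(value) { ERB::Util.html_escape(value) }
  dtstart = "#{session.date}T#{session.start}"
  dtend   = "#{session.date}T#{session.finish}"
  <<~HTML
    <div class="vevent">
      <span class="summary">#{esc.call(session.title)}</span>
      <span class="vcard">
        <span class="fn">#{esc.call(session.speaker)}</span>
        (<span class="org">#{esc.call(session.organization)}</span>)
      </span>
      <abbr class="dtstart" title="#{esc.call(dtstart)}">#{esc.call(session.start)}</abbr> -
      <abbr class="dtend" title="#{esc.call(dtend)}">#{esc.call(session.finish)}</abbr>
      at <span class="location">#{esc.call(session.room)}</span>
    </div>
  HTML
end

# Values taken from the sample schedule entry shown earlier.
session = Session.new(
  title: "Getting Started with MapServer",
  speaker: "Mr. Jeff MCKENNA",
  organization: "DM Solutions Group",
  date: "2006-09-12",
  start: "07:00",
  finish: "09:00",
  room: "Amphipôle (niv. 3)"
)
puts to_hcalendar(session)
```

The machine-readable timestamps go in the `title` attribute of each `abbr`, while the visible text stays human-friendly, which is the pattern a converter such as X2V reads when producing iCal.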
The scraper code makes parsing the nightmare above fairly simple, but because the markup lacks any proper classes or ids (each presentation is id="entry", eep!), we have to find the bits we want by their current markup attributes. That's not a recommended approach, but at least it is nicer than trying to figure out the 10 levels of DOM starting at the root.
You can see the parser here.
About this article: written on September 7, 2006
Posted in: Programming, Ruby, FOSS4G