EPA KML a great step, but both forward and backward

Published in KML  |  5 Comments


The Google LatLon blog points to the EPA’s release of chemical emission facilities in KML.

This is a really great step. I’ve looked at the EPA data before and started writing various scripts to massage it into a nicer format than their CSV. It was a lot of work, a lot of data, and a pain to maintain. So it’s great that institutions are exposing their data in formats that are easy to consume and view.

However, it also illustrates the typical ‘munging’ of styling and data that users, even professional users like the EPA, will do with KML. In the Air Emission KML they are using altitude to display some measure of chemical emission. It isn’t clear what the number really means, and it seems like a really bad idea to put a data measurement in the geometry.

This matters especially for other clients that will consume this data. How are they to know whether the height is really the altitude of a facility or some other representative number? In fact, you can export the facilities KML with altitude representing NOx, Lead, Particulate Matter, Sulfur Dioxide, or Volatile Organic Compounds, and there isn’t a way to tell which one once you’ve exported. Unfortunately, KML doesn’t currently support styling based on ExtendedData, other than for the Balloon text. But it would still be useful to put this data there.
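To make the ambiguity concrete, here is a minimal Python sketch of what a generic client sees when it reads the coordinate string from the EPA placemark. The function name `parse_coordinates` is my own, not part of any KML library, and the sketch assumes a single comma-separated coordinate tuple.

```python
def parse_coordinates(text):
    """Split one KML coordinate tuple into (lon, lat, alt).

    KML orders the values longitude, latitude, altitude. The altitude
    is optional, so it defaults to 0.0 when it is missing.
    """
    parts = [float(p) for p in text.strip().split(",")]
    lon, lat = parts[0], parts[1]
    alt = parts[2] if len(parts) > 2 else 0.0
    return lon, lat, alt


# The coordinate string from the EPA placemark.
lon, lat, alt = parse_coordinates("-104.8257,39.9754,13.102642872")

# A generic client has no choice but to treat the third value as a height
# in meters. Nothing in the geometry says it is an emission figure, or
# which pollutant it describes.
print(f"lon={lon}, lat={lat}, altitude={alt} m (or is it?)")
```

The parser does exactly what the format asks of it, and the emission measurement still ends up being drawn as a height.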

For example, in Mapufacture we pull in and store the arbitrary data that comes with a KML or RSS feed. A user could then add, for example, all emission facilities with a high NOx emission rate within 5 miles of their house to the map of their community.
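As a rough illustration of that kind of query (this is not Mapufacture’s actual implementation), the Python sketch below filters facilities by an attribute value and by great-circle distance from a home location. The attribute name `NOx (ppm)`, the threshold, the home coordinates, and the two "Hypothetical" facilities are all assumptions made up for the example; only the Sam Hill Oil Co location and value come from the EPA file.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearby_high_emitters(facilities, home, attribute, threshold, radius_miles=5.0):
    """Return facilities whose attribute is >= threshold within radius_miles of home."""
    home_lat, home_lon = home
    matches = []
    for facility in facilities:
        raw = facility["data"].get(attribute)
        if raw is None:
            continue  # this facility does not publish the attribute
        if float(raw) < threshold:
            continue
        distance = haversine_miles(home_lat, home_lon, facility["lat"], facility["lon"])
        if distance <= radius_miles:
            matches.append(facility)
    return matches


facilities = [
    # Location and value taken from the EPA placemark.
    {"name": "Sam Hill Oil Co", "lat": 39.9754, "lon": -104.8257,
     "data": {"NOx (ppm)": "13.102642872"}},
    # Made-up facilities, for illustration only.
    {"name": "Hypothetical Refinery North", "lat": 40.5, "lon": -104.8257,
     "data": {"NOx (ppm)": "20.0"}},
    {"name": "Hypothetical Depot", "lat": 39.97, "lon": -104.83,
     "data": {"NOx (ppm)": "0.4"}},
]

home = (39.98, -104.82)  # made-up home location near Brighton, CO
for facility in nearby_high_emitters(facilities, home, "NOx (ppm)", threshold=10.0):
    print(facility["name"])
```

A query like this only works when the measurement is stored as a named attribute. With the value hidden in the altitude, the filter would have nothing to key on.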

An answer could be a link for each element to the data in a different markup such as GML, but that seems slightly convoluted and difficult for tool developers. That would still be a good solution for advanced data and metadata specifying how and when the data was collected. But the point here is to publish a base level of information in a light-weight, broadly used data format.

A couple of other small niggles with their KML: the title is just “Temporary Places”, which says nothing meaningful about what the data really is or where it came from. The title should be representative of the data, so that when I leave it in my KML viewer or ingest it into another tool I can remember what the data is about. KML 2.2 also supports attribution and Atom links, which should point back to the EPA site. Also, the facility names are in all caps, which is how the data was stored in their old CSV files (when I looked before).

By Example

By way of specific discussion, let’s work through a simple example of turning the EPA’s current KML into a more useful KML.

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
  <Folder>
    <name>Temporary Places</name>
    <open>1</open>
    <Document>
      <name>Petroleum_facilities.kml</name>
      <Placemark>
        <name>SAM HILL OIL CO:&lt;br&gt;&lt;/br&gt; 725 S MAIN ST BRIGHTON CO 80601-3047</name>
        <description><![CDATA[<p>SIC: 2911<br />
          Petroleum Refining And Related Industries Petroleum Refining Petroleum Refining </p>
          <p>NAICS:       <br />
          </p>
          <img src="http://www.epa.gov/cgi-bin/broker?_service=data&_program=dataprog.dw_emisplot_epad8_facility_caps.sas&year1=2002&year2=2002&debug=0&site=15121"width="460" height="360">
          <hr><p><b>Petroleum Facilities Emissions (258  facilities)</b>]]></description>
        <styleUrl>#A</styleUrl>
        <Point>
          <extrude>1</extrude>
          <altitudeMode>relativeToGround</altitudeMode>
          <coordinates>-104.8257,39.9754,13.102642872</coordinates>
        </Point>
      </Placemark>
    </Document>
  </Folder>
</kml>

So let’s clean this up a little by adding some better titles and links to more information, and by moving the measurement out of the geometry and into ExtendedData.

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
  <Folder>
    <name>EPA Air Emission Sources</name>
    <description>Under the Clean Air Act, EPA establishes air quality standards to protect public health, including the health of "sensitive" populations such as people with asthma, children, and older adults. EPA also sets limits to protect public welfare. This includes protecting ecosystems, including plants and animals, from harm, as well as protecting against decreased visibility and damage to crops, vegetation, and buildings.</description>
    <atom:link href="http://www.epa.gov/air/emissions/" />
    <atom:author>
      <atom:name>US Environmental Protection Agency (EPA)</atom:name>
    </atom:author>
    <open>1</open>
    <Document>
      <name>Petroleum Facilities</name>
      <atom:link rel="self" type="application/kml+xml" href="http://www.epa.gov/mxplorer/Petroleum_facilities_US.kmz"/>
      <Placemark id="sic:2911">
        <name>Sam Hill Oil Co</name>
        <description><![CDATA[
          <img src="http://www.epa.gov/cgi-bin/broker?_service=data&_program=dataprog.dw_emisplot_epad8_facility_caps.sas&year1=2002&year2=2002&debug=0&site=15121" width="460" height="360">]]></description>
        <styleUrl>#petroleum_facility</styleUrl>
        <address>725 S Main St, Brighton, CO 80601-3047</address>
        <ExtendedData>
          <Data name="SIC">
            <value>2911</value>
          </Data>
          <Data name="industryDescription">
            <value>Petroleum Refining And Related Industries Petroleum Refining Petroleum Refining</value>
          </Data>
          <Data name="facilityType">
            <value>Petroleum</value>
          </Data>
          <Data name="NOx (ppm)">
            <value>13.102642872</value>
          </Data>
        </ExtendedData>
        <Point>
          <coordinates>-104.8257,39.9754</coordinates>
        </Point>
      </Placemark>
    </Document>
  </Folder>
</kml>
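
To show how little a consumer needs once the values live in ExtendedData, here is a short Python sketch that uses the standard library’s `xml.etree.ElementTree`. The `SAMPLE` document is a trimmed-down stand-in for the cleaned-up file, and `read_placemarks` is a hypothetical helper of my own, not part of any KML toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace used by the KML in this post.
KML_NS = {"kml": "http://earth.google.com/kml/2.2"}

# Trimmed-down stand-in for the cleaned-up KML.
SAMPLE = """<kml xmlns="http://earth.google.com/kml/2.2">
  <Document>
    <name>Petroleum Facilities</name>
    <Placemark>
      <name>Sam Hill Oil Co</name>
      <ExtendedData>
        <Data name="SIC"><value>2911</value></Data>
        <Data name="NOx (ppm)"><value>13.102642872</value></Data>
      </ExtendedData>
      <Point><coordinates>-104.8257,39.9754</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def read_placemarks(kml_text):
    """Return a list of dicts with name, lon, lat and ExtendedData values."""
    root = ET.fromstring(kml_text)
    placemarks = []
    for pm in root.iter("{http://earth.google.com/kml/2.2}Placemark"):
        coords = pm.findtext("kml:Point/kml:coordinates", default="", namespaces=KML_NS)
        if not coords.strip():
            continue  # skip placemarks without point geometry
        lon, lat = (float(v) for v in coords.strip().split(",")[:2])
        # Collect every <Data name="..."><value>...</value></Data> pair.
        data = {}
        for item in pm.findall("kml:ExtendedData/kml:Data", KML_NS):
            data[item.get("name")] = item.findtext("kml:value", default="", namespaces=KML_NS)
        placemarks.append({
            "name": pm.findtext("kml:name", default="", namespaces=KML_NS),
            "lon": lon,
            "lat": lat,
            "data": data,
        })
    return placemarks


for placemark in read_placemarks(SAMPLE):
    print(placemark["name"], placemark["data"])
```

The geometry now carries only the location, and every measurement arrives with a name attached, so a client no longer has to guess what a number means.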
		

To summarize, the EPA releasing their data in KML is a really great step toward leading the way on information transparency and public awareness. However, there should be just a tiny bit more effort to demonstrate better behavior in their shared data.

Responses

  1. Jason Birch says:

    December 2nd, 2007 at 6:33 pm (#)

    Interesting take Andrew. I see KML as filling two roles: data presentation and data exchange (if you’re not willing to take the time to develop a GML schema).

    I sometimes have a hard time maintaining semantic goodness while achieving the level of creative and attractive results in Google Earth that I desire. For instance, the work that I’ve been doing recently with panoramic photos and earlier using a multigeometry for polygonal mouse-overs really makes the data not suitable for import into a GIS. …and in these cases I care less about that than I do about it looking good and working well in Google Earth.

    Those are relatively extreme cases though. I agree that it would be great if styling was suitably divorced from attributes so that we are less tempted to make the data ineffective outside of its intended purpose. This is one of the reasons I am glad to see KML get pushed onto a standards track; in many ways this is similar to the browser wars around HTML that we are just starting to recover from now. I see great value in KML as an information dissemination format, and the more “standard” it can be, the wider its reach.

    Hmm. What was my point? Good thing this is just a comment.

  2. Brian Hamlin says:

    December 3rd, 2007 at 2:30 am (#)

    thanks for the heads up.. I am guessing the EPA is using GDAL to generate the KML, because not too many generators mark the KML as v2.0 anymore. Also, yes, it is kind of wacky that they use altitude this way. But hey, KML is a display format. You are going to see all kinds of things. I’m glad they have a CSV/HTML output listed right near by, too..

    This data isn’t for hard-core analysis, this is public outreach for techies. And there is a place for that. Overall, thumbs up

  3. Andrew says:

    December 4th, 2007 at 9:57 am (#)

    That’s a very good point Brian: the responsibility typically lies with the tool developers rather than the user.

    The common argument I’ve heard in favor of this type of behavior (mixing in attribute data as Altitude), as Jason points out, is that it “looks good in Google Earth”. However, 1) Google Earth isn’t the only KML client, so if we permit “mixing” of arbitrary bits and representations then we’re going to get ‘GeoBrowser’ incompatibilities like we have in browsers (how should I render an extruded marker?), and 2) is KML really just a ‘drawing format’ like SVG in a different canvas space? That’s what the “it’s good enough for viz” argument seems to be saying.

  4. Jeremy Cothran says:

    December 11th, 2007 at 4:19 pm (#)

    Like Jason, I also see KML as facilitating data exchange as well as data styling. My frustration with the ExtendedData element is that it seems to handle only a very simple case of listing the same attribute at multiple locations. A common situation I see in observation systems is that several redundant or different observation types are measured at (approximately) the same location, differing mainly in the elevation at which they are measured. The simplest metadata schema that I developed to handle this situation is detailed at http://carocoops.org/twiki_dmcc/bin/view/Main/ObsKML

    An alternative (but file-intensive) approach that fits the single-attribute-per-placemark model, using the EPA example, would be for the EPA to list separate KML files for each observationType (say NOx) and unitOfMeasure (say ppm), leaving only the measurementValue, etc. in the ExtendedData schema. Doing this still leaves establishing the observationType and unitOfMeasure per file as a best practice rather than as part of an XML schema.

  5. Pawan Kumar says:

    September 7th, 2009 at 8:18 am (#)

    Hi,
    I want to know whether I can import the KML file into Bing epa. If it is possible, please tell me the process for importing a KML file into Bing epa.

    thanks
    Pawan Kumar