The PolarLava Weather Project

This entry serves two purposes:  1) to split an older weather-related entry away from a more general site update message, and 2) to document progress and/or changes to my weather project, as this thread seems to have grown over time.

-klp (3/17/2004)

10/17/2003
Previously I had been using Geo::Weather, which pulls weather from the weather.com site.  I've been a frequent contributor to this project, as I also used it to pull weather on an hourly basis for my CycleLog project.  It's a fairly constant battle: weather.com makes changes, and then the Geo::Weather code has to be fixed to properly parse their page.  Recently they added comments in their code that encourage you to use their free XML feed and not scrape their HTML.  This is good, and I welcomed the idea of an XML feed until I read the fine print.  Basically you have to provide a link back to them, which isn't the end of the world, but they also reserve the right to shove advertising down to you that you must accept.  I don't accept, and as such have decided it's time to move to free, publicly available weather from NOAA.  Hell, I'm paying for it (via my taxes), so I might as well use it, right?  So I've written a Perl module I call "Geo::WeatherNOAA2".  Why the "2"?  There's already a Geo::WeatherNOAA, but it only grabs current text weather data from NOAA and that wasn't what I wanted.  My goal was to duplicate the functionality of Geo::Weather, but using NOAA's freely and publicly available interface.  Although I know there are still some problems with it for certain locations, and the forecast sometimes displays some quirky behavior, it's working for me.  When I have time I will work on some of these bugs and make it publicly available to all.  If you'd like to play with it now, you can e-mail me at kevinp ~ AT ~ polarlava.com and join the fun!

  

12/11/2003
Okay, I just got the following email from weather.com.  As I suspected, they will be killing Geo::Weather's ability to scrape their site in the near future.  The one thing I am very glad to see is that they are willing to cooperate in making it work with their new XML feed.  Mr. Pearson's tone is quite polite and seems accommodating considering the litigious society we now live in.  Now I feel somewhat bad about slamming weather.com for possibly trying to protect their intellectual property by blocking the scraping.  It would appear that it's a technical decision, not an IP-based one.  In any case, I wanted to do this on my terms and not have to deal with weather.com's EULA, so I've again informed Mike Machado that I'm done with weather.com and have successfully moved forward with pulling weather from the NWS.  I wish him well in his pursuit of implementing Geo::Weather to access weather.com's XML feed.

Date: 12/11/2003 14:18:38 -0500
From: "Joe Pearson"
To: mike ~ AT ~ innercite.com, mike ~ AT ~ cheapnet.net
CC: kevinp ~ AT ~ polarlava.com
Subject: re: Geo::Weather, better data source

Mr. Machado and Mr. Papendick,

Thanks for the effort you've put into creating and maintaining the Geo::Weather Perl module.  As a Perl Hacker myself, I appreciate the work of programmers who create the modules that make Perl so easy to use.

I've noticed that your code currently requests information for search results and data from our groups of machines that serve Web pages optimized for human consumption (www.weather.com and www.w3.weather.com).  In the very near future, those pages will be modified to request data from our XML servers, and the presentation and the data will be assembled within the browser.  At that point, you will no longer be able to scrape the pages on those servers to get weather data and search results.

In order to ensure that your module continues to function, and to prevent your module from ceasing to deliver meaningful data whenever I change an image path or HTML markup, please consider switching Geo::Weather to using our XML OAP product as your data source.  You can sign up for it at http://www.weather.com/services/xmloap.html .  While I understand that the process may be a little cumbersome, it will improve the reliability of your product by giving you a fixed, relatively permanent source of weather data. After signing up, you should receive an automated email message giving you the URL of our API and DTD in PDF format.

If you require assistance in implementing our XML data feed, I'd be happy to answer any questions.

Regards,
Joe Pearson
jpearson ~ AT ~ weather.com
770-226-2635

The one who says it cannot be done should never interrupt the one who is doing it. - The Roman Rule

3/17/2004
Just some things you should know if you are interested enough in what I'm doing here that you're still reading this message!

First off, the module name is no longer "Geo::WeatherNOAA2", but rather "Geo::WeatherNOAA_NWS".  Why?  I don't recall at this point exactly why I changed it, but I never really liked the "NOAA2" moniker.

I recently had an inquiry about this stuff from Rajarajan Rajamani (rrajarajan ~ AT ~ lucent.com), so I thought I would record some excerpts of what I told him in case someone else might also be interested in using this module.

The current weather retrieval works very well and is stable.  The one exception is the current conditions image ($current_wx{wx_image} in the code).  The problem here is that they (NOAA/NWS) don't seem to have a standard set of condition messages, so it's hard to accurately map all of the possibilities.  Basically, when I find a new one (you can tell because you get the "na.png" image), I add it to the lookup table function get_wx_image().  I've thought about reworking that to do some more generic parsing, but haven't bothered yet.  (Time/Priority issues!)
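To give a rough idea of the lookup-table approach and the "more generic parsing" idea, here is a minimal sketch.  It is written in Python purely for illustration and is not the module's actual Perl code; the condition strings, the image file names (other than "na.png"), and the keyword list are all made-up assumptions.

```python
# Illustrative sketch only: these entries are hypothetical, not the
# module's real table of NOAA/NWS condition messages.
WX_IMAGES = {
    "fair": "fair.png",
    "mostly cloudy": "mostly_cloudy.png",
    "light rain": "rain.png",
    "light snow": "snow.png",
}

# Hypothetical keyword fallback for conditions missing from the table.
# The first match wins, so more specific keywords are listed first.
WX_KEYWORDS = [
    ("thunder", "tstorm.png"),
    ("snow", "snow.png"),
    ("rain", "rain.png"),
    ("cloud", "cloudy.png"),
]


def get_wx_image(condition):
    """Map a condition message to an image name, or "na.png" if unknown."""
    # Normalize case and whitespace so "Light  Rain" matches "light rain".
    key = " ".join(condition.lower().split())
    if key in WX_IMAGES:
        return WX_IMAGES[key]
    # Generic pass: look for a known keyword anywhere in the message.
    for keyword, image in WX_KEYWORDS:
        if keyword in key:
            return image
    # Nothing matched, so fall back to the "not available" image.
    return "na.png"
```

With something like this, an unseen message such as "Heavy Snow Showers" would still get a sensible image through the keyword pass, and only truly unknown messages would fall through to "na.png".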

The forecast section is much less stable.  NOAA/NWS seems to have a couple of ways of laying out the HTML depending on the location.  I'm guessing that this again is a regional sort of issue.  Thus, if you are in the northeast US, it will probably work fairly well, as it does for me.  But sometimes you will see some quirky things, and for some locations it can be pretty messed up.  Again, this is largely due to the inconsistent HTML that is returned for one location versus another.
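One way to cope with more than one page layout is to keep a parser per layout and try each in turn until one of them finds something.  The sketch below shows the idea in Python; it is not how the module is actually written, and both HTML layouts and all function names in it are invented for illustration and will not match the real NWS pages.

```python
import re


def parse_table_layout(html):
    """Hypothetical layout A:
    <td class="period">Tonight</td><td class="text">Clear.</td>
    """
    pattern = r'<td class="period">(.*?)</td>\s*<td class="text">(.*?)</td>'
    return re.findall(pattern, html, re.S)


def parse_bold_layout(html):
    """Hypothetical layout B: <b>Tonight: </b>Clear.<br>"""
    pairs = re.findall(r"<b>(.*?):\s*</b>(.*?)<br>", html, re.S)
    return [(period.strip(), text.strip()) for period, text in pairs]


# Parsers are tried in order; the first one that returns any periods wins.
FORECAST_PARSERS = [parse_table_layout, parse_bold_layout]


def parse_forecast(html):
    """Return a list of (period, text) pairs, or [] if no layout matched."""
    for parser in FORECAST_PARSERS:
        periods = parser(html)
        if periods:
            return periods
    return []
```

Supporting another region would then mean writing one more parser and adding it to the list, rather than touching the ones that already work.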

6/8/2004
I've added extraction of the current radar image to Geo::WeatherNOAA_NWS.  It's not part of the standard current weather display, but is pulled separately.  This allows the current weather page to remain as it is, except that the radar image now appears at the bottom of the page, below the "Get Weather" form.  I felt this was a cleaner approach that didn't require reducing the size of the radar image.  I might also add a satellite image soon, as extracting and displaying it would be very little work since it is quite similar to the radar image.

10/15/2005
Well, I've made another round of changes to get my weather page working again.  The forecast has been broken for a while, but recently the current weather also died.  The NWS changed their page recently, thus breaking my scraping routine.  The good news, for me at least, is that their changes have made extracting the forecast data much easier.  The bad news is that it seems to only work for data that comes from the NWS Eastern Region.  This is because each region has its own web site and they all appear to use different versions of the location weather data page, so the format of each is different.  I've disabled the ability to look up other locations via my weather page, because it now seems to only work for a very limited number of locations.  I'll still give away what I have if you want to try it or modify it to work for you.  I only ask that you please send your changes back to me if you do get it working for a larger area.  You can test your location quickly by going to my weather page and adding "?zip=12345" to the URL, where "12345" is your zip code.  I'd like to be able to keep this up, but it's such a pain in the ass to fix this every time they change something.  And I certainly don't have the time to figure out all of the various location-specific formats.  Thus at this point I'm really only in this for myself…well, actually I've always pretty much been in it only for myself! ;)

I've done some searching around and there are a few other options available to you (and me), so you may prefer to look at those.  I've chosen to continue with this because the other solutions don't meet my needs.  Also, the NWS doesn't currently offer the data I can scrape from the location weather page in XML format.  It is beyond me why they don't offer the equivalent of the data on this page as an RSS feed.  That would certainly solve the problem for all of us.  Maybe someday.

March 17, 2004 @ 12:46 pm | Category: