Todd Huss has submitted the file import code for Wind and Tides. Periodically the site needs to fetch data from the NOAA and update the local database. Here’s his import code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
class Wind < ActiveRecord::Base
validates_presence_of :created, :wind_location_id
belongs_to :wind_location
# Get all of the wind locations, then fetch, parse, and save the data
def self.batch_update!(wind_locations = WindLocation.all)
wind_locations.each do | wind_location |
raw_noaa_data = Net::HTTP.get_response(URI.parse(wind_location.url)).body
raise "Received unexpected wind response from NOAA: " + raw_noaa_data unless raw_noaa_data=~/TIME/
save_wind!(raw_noaa_data, wind_location)
end
end
# Take a text based NOAA forecast, parse it, populate the model objects, and save it
def self.save_wind!(noaa_text, wind_location)
# ...
end
# NOAA publishes 181809Z for Wind which we need to turn into YEAR/MONTH/18 18:09 UTC
def self.convert_noaa_to_time(noaa_time, now = Time.now.getutc)
# ...
end
end
There’s one major structural change I’d make, and a few minor enhancements.
The structural change I’d make is to move the downloading and parsing code out of the model and into a utility class. This lets you keep your models focused on your application, and have the file import code tucked away in the lib directory. At the same time, if we tie the importer to a single WindLocation and remove the loop in the batch_update! method, we make the focus tighter:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
class WindImporter
class << self
def batch_update!(wind_location)
raw_noaa_data = Net::HTTP.get_response(URI.parse(wind_location.url)).body
raise "Received unexpected wind response from NOAA: " + raw_noaa_data unless raw_noaa_data=~/TIME/
save_wind!(raw_noaa_data, wind_location)
end
# Take a text based NOAA forecast, parse it, populate the model objects, and save it
def save_wind!(noaa_text, wind_location)
# ...
end
# NOAA publishes 181809Z for Wind which we need to turn into YEAR/MONTH/18 18:09 UTC
def convert_noaa_to_time(noaa_time, now = Time.now.getutc)
# ...
end
end
end
class WindLocation < ActiveRecord::Base
has_many :winds
def batch_update!
WindImporter.batch_update!(self)
end
end
By making WindLocation#batch_update! delegate to the importer, we can work with it in a more object oriented manner. (Also, using delegation makes it possible to use a “mock� object for the importer during testing.)
1 2 3 4 5
# Import the files for one location @wind_location.batch_update! # or, import them all WindLocation.find(:all).each(&:batch_update!)
With that change out of the way, we can make a few enhancements to the implementation of some of the methods. batch_update! has a big chunk of net/http boilerplate which can be replaced by a tiny piece of open-uri.
1 2 3 4 5 6
require 'open-uri'
def batch_update!(wind_location)
noaa_text = open(wind_location.url).read
# ...
end
There are a few snippets which can be simplified in save_wind! too. Here’s the original:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
def self.save_wind!(noaa_text, wind_location)
# Filter and append lines with data
lines = Array.new
noaa_text.each_line do | line |
next unless line=~/^\d\d\d\d\d\dZ/
lines << line
end
raise "Received unexpected wind report from NOAA: " + noaa_text unless lines.length > 0
# Process each line in reverse order (oldest first)
lines.reverse_each do | line |
created = convert_noaa_to_time(line[0,6])
wind = Wind.find_or_create_by_created_and_wind_location_id(created, wind_location.id)
class << data = line.split
# Convert field to int unless it's "--" then nil
def to_i(index)
(self[index]=~/^-+/)?nil:self[index].to_i
end
end
wind.direction = data.to_i(wind_location.position)
wind.sustained = data.to_i(wind_location.position + 1)
wind.peak = data.to_i(wind_location.position + 2)
wind.save!
end
end
The initialization of lines can be done in one line by using Array#grep and a slightly simplified regular expression. (Array#grep simply returns all elements that match the given regex.)
lines = noaa_text.grep(/^\d{6}Z/)
Next, instead of calling the unwieldy find_or_create_by_created_and_wind_location_id we can use class methods on associations to initialize wind. (For more about class methods on associations, you can read the article I wrote when we were dissecting the Tracks application.)
wind = wind_location.winds.find_or_create_by_created(created)
Finally we can simplify the code which converts integers. In the NOAA format, �—� represents nil, and other integers are as expected. While the original code converted the data by defining a method on the data array, we can instead create a new array containing just the integers with Enumerable#map. After creating the new array, we can safely retrieve the figures.
1 2 3 4 5 6
data = line.split.map{|d| d.starts_with?("--") ? nil : d.to_i }
pos = wind_location.position
wind.direction = data[pos]
wind.sustained = data[pos + 1]
wind.peak = data[pos + 2]
Jamis says:
Another option for parsing integers is to use the Integer method. It will raise an exception if the argument cannot be converted, so you could just do something like:
data = line.split.map{ |d| Integer(d) rescue nil }
That won’t work, of course, if you want any non-integer value besides �—� to map to 0, but it does reduce the line-noise a bit.
This makes the code much easier to follow and a little more explicit. The fully refactored save_wind! is presented below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
def save_wind!(noaa_text, wind_location)
lines = noaa_text.grep(/^\d{6}Z/)
raise "Received unexpected wind report from NOAA: " + noaa_text unless lines.length > 0
# Process each line in reverse order (oldest first)
lines.reverse_each do | line |
created = convert_noaa_to_time(line[0,6])
wind = wind_location.winds.find_or_create_by_created(created)
data = line.split.map{|d| d.starts_with?("--") ? nil : d.to_i }
pos = wind_location.position
wind.direction = data[pos]
wind.sustained = data[pos + 1]
wind.peak = data[pos + 2]
wind.save!
end
end
Thanks again to Todd for making the submission.
No comments yet.
You must be logged in to add your own comment.