require 'rubygems'
require 'scrubyt'

Scrubyt.logger = Scrubyt::Logger.new

scraped_data = Scrubyt::Extractor.define(:mode => :verbose) do
  fetch 'http://167.153.150.32/RI/web/index.do?method=alphaSearch&state=prompt&alphaValue=A&requestedSortOrder=2'
  next_page ">", :limit => 2, :resolve => 'http://167.153.150.32/RI/web/'

  item 'ARTUSO PASTRY SHOP'
end

puts scraped_data.to_xml

### result:
=begin
[MODE] Learning
[ACTION] fetching document: http://167.153.150.32/RI/web/index.do?method=alphaSearch&state=prompt&alphaValue=A&requestedSortOrder=2
[ACTION] Evaluating item with /html/body/table/tr/td/table/tr/td/table/tr/td/table/tr/td/table/tr/td/div/div/div/div/div/div/div/div/div/center/table/tr/td/table/tr/td/b/a
[ACTION] fetching document: http://167.153.150.32/index.do?method=alphaSearch&selection= &searchValue= &requestedSortOrder=2&boroughSelect= &state=prompt&alphaValue=A&pageNum=2
/usr/local/lib/ruby/gems/1.8/gems/mechanize-0.8.5/lib/www/mechanize.rb:228:in `get': 404 => Net::HTTPNotFound (WWW::Mechanize::ResponseCodeError)
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/mechanize.rb:52:in `fetch'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:158:in `evaluate_extractor'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:133:in `loop'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:133:in `evaluate_extractor'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:132:in `catch'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:132:in `evaluate_extractor'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:85:in `initialize'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:32:in `new'
	from /usr/local/lib/ruby/gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:32:in `define'
	from scraper.rb:7
=end
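In the log above, the second fetch goes to http://167.153.150.32/index.do... rather than to a URL under /RI/web/, which suggests the next-page link was resolved against the host root instead of the :resolve base. As a point of comparison only, here is a minimal sketch of how a relative link resolves against each base using Ruby's standard URI library. This is not scRUBYt!'s internal code; the helper name `resolve_next_page` and the shortened href are assumptions made for illustration (the real link carries more query parameters).

```ruby
require 'uri'

# Hypothetical helper (not part of scRUBYt!): resolve a relative
# next-page href against a base URL using the standard library.
def resolve_next_page(base, href)
  URI.join(base, href).to_s
end

# Simplified, assumed href; the actual link on the page is longer.
href = 'index.do?method=alphaSearch&alphaValue=A&pageNum=2'

# Resolving against the directory of the first page keeps /RI/web/.
puts resolve_next_page('http://167.153.150.32/RI/web/', href)

# Resolving against the host root drops /RI/web/, matching the
# shape of the URL that returned 404 in the log.
puts resolve_next_page('http://167.153.150.32/', href)
```

If the first form is what the site expects, the 404 would be consistent with the :resolve option not being applied to the next-page link, though the log alone does not confirm the cause.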