Success! My cron-automated Pi scrape-to-newsgroup-post script has gone online

Woo-hoo! Although the trials and tribulations of bringing the spark of an idea to fruition were at times overwhelming, they just sweetened the success when it worked, on its own even.

Short story long: I was reading Monk's Python about scraping Amazon for keywords, and I expanded the exercise to grab the prices as well and sort the list by cheapest item first, just to play with it. I was initially drawn to his article by the www protocol navigation ability and was amazed at the revelation that nabbing a webpage was essentially a method of opening a file. Later I saw a c.s.raspberry-pi post about posting e-mail and put that together with a contest sweepstakes I'd just learned about. "Gee, this flash page sure loads fast, but takes ten minutes to render!" Getting the keyword in a browser blew wet mangy goats. Couldn't I just scrape the site and nab only the keyword? So I set about it on my msWin laptop, using PuTTY to command and TextPad to author Python 3 code over to my otherwise idle Raspberry Pi B+. I use my getwebpage() function to snag the page into a string and my locate_value() function to pick out the value that follows the page code which consistently precedes the contest keyword. In the bargain I have now written two nifty subroutines which I'll likely re-use in the future: [code]

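import os
import sys
import urllib.request


def getwebpage(URL):
    # Fetch http://URL and return the page source as one string.
    rawsource = urllib.request.urlopen('http://' + URL)
    # str(bytes) would keep the b'...' wrapper and escape sequences, so decode.
    webpage = rawsource.read().decode('utf-8', errors='replace')
    rawsource.close()
    return webpage


def locate_value(fieldname):
    # Relies on a module-level `webpage` string, e.g. webpage = getwebpage(...).
    # Returns the text between `fieldname` and the next double quote.
    valuestart = webpage.find(fieldname, 0) + len(fieldname)
    valueend = webpage.find('"', valuestart + 1)
    return webpage[valuestart:valueend]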

[/code]
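
One thing the locate_value idea does not cover is a page where the marker text is missing, in which case find() returns -1 and the slice comes back as garbage. A minimal sketch of a more defensive variant follows. The name locate_value_in, the terminator parameter and the sample page fragment are all made up for illustration; they are not from the real script or the real contest page.

```python
def locate_value_in(page, fieldname, terminator='"'):
    """Return the text between `fieldname` and the next `terminator`, or None."""
    marker_at = page.find(fieldname)
    if marker_at == -1:
        return None  # marker text is not on the page at all
    valuestart = marker_at + len(fieldname)
    valueend = page.find(terminator, valuestart)
    if valueend == -1:
        return None  # value never closes; the page is probably truncated
    return page[valuestart:valueend]


# Made-up page fragment standing in for the real contest page.
sample_page = '<input type="hidden" name="wotd" value="pumpkin">'
found = locate_value_in(sample_page, 'name="wotd" value="')
missing = locate_value_in(sample_page, 'name="nothere" value="')
print(found, missing)
```

Passing the page in as an argument, rather than reading a global, also makes the function easy to try out on a pasted string without touching the network.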

For a while, I'd wake up every day before 7:00 a.m. to greet the new keyword as the site switched it, and that was satisfactory. I'd run WotD.py (Word of the Day) and post the result manually to the relevant newsgroup before utilizing the keyword selfishly in my own online entry (for the same reason I always feed the cat before feeding myself: that way I don't forget to help others after I'm satisfied.)

Then I came across the email posting question, realized there must be a library for handling newsgroups, and found nntplib. For a week I messed about through my stupid punctuation, syntax and conceptual errors before I could post to a newsgroup, news:alt.test.only (primarily delayed by the server error 'duplicate post'.) I skipped the 'mailmerge template' style for now and just went with a hardcoded-into-the-script string variable concatenation, but I'd like to change that to a template file, a values file (as I've done with the loginID and password) and str.format() to insert the values into the template form (realizing what immense power the Pi would then have to spew spam all day.)

When it's done, the wotd.py script 'pushes' today's keyword, the Word of the Day, on top of a WotDlog.txt file. That makes it very simple on the next run to open the log and 'pop' yesterday's word with yesterdays_wotd=WotDlog.readline() and then, in a while loop, wait until the website publishes the change to today's new WotD.

If I could get the Pi to launch the script without waiting for me, which is well within the capabilities of Debian, then I could sleep in, or even go on vacation. Going live meant understanding Debian's cron to schedule a WotD.sh script which simply contains "python3 wotd.py" - the biggest trouble in that was remembering to end the launch command with '&' so it runs in the background and doesn't tie up the command line. Now I wonder if I should make the program silent (no printing) or redirect its output on the command line with ' > stdout' so I can verify it worked. I guess just peeking at the newsgroup with my laptop Thunderbird client would be sufficient, and return me to the ranks of data users. Although I might learn a great deal from tacking onto the end a script extension which automatically fills in the Sweepstakes entry form with my WotD keyword and contact info, the thought experiment would be moot as the contest ends soon.
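
For anyone curious how the log 'push', the 'pop' of yesterday's word and the wait loop might fit together, here is a rough sketch. It is not the attached script: the function names, the polling interval, the poll limit and the fake fetcher are all assumptions made for illustration, and the demo writes to a temporary directory rather than the real WotDlog.txt.

```python
import os
import tempfile
import time


def read_yesterdays_wotd(log_path):
    """Return the first line of the log (the newest entry), or '' if no log yet."""
    if not os.path.exists(log_path):
        return ''
    with open(log_path, 'r', encoding='utf-8') as log:
        return log.readline().strip()


def push_wotd(log_path, word):
    """Put `word` on top of the log by rewriting the file with it first."""
    older = ''
    if os.path.exists(log_path):
        with open(log_path, 'r', encoding='utf-8') as log:
            older = log.read()
    with open(log_path, 'w', encoding='utf-8') as log:
        log.write(word + '\n' + older)


def wait_for_new_wotd(fetch, yesterdays_wotd, poll_seconds=60, max_polls=120,
                      sleep=time.sleep):
    """Call fetch() until it returns something other than yesterday's word.

    Returns the new word, or None if nothing changed within max_polls tries.
    """
    for _ in range(max_polls):
        todays_wotd = fetch()
        if todays_wotd and todays_wotd != yesterdays_wotd:
            return todays_wotd
        sleep(poll_seconds)
    return None


# Demonstration with a fake fetcher instead of the real site.
demo_dir = tempfile.mkdtemp()
demo_log = os.path.join(demo_dir, 'WotDlog.txt')
push_wotd(demo_log, 'pumpkin')                      # pretend this was yesterday
answers = iter(['pumpkin', 'pumpkin', 'lantern'])   # site changes on 3rd poll
new_word = wait_for_new_wotd(lambda: next(answers),
                             read_yesterdays_wotd(demo_log),
                             sleep=lambda seconds: None)  # no real waiting
if new_word:
    push_wotd(demo_log, new_word)
print(new_word, read_yesterdays_wotd(demo_log))
```

The poll limit is there so a job started by cron eventually gives up if the site never updates, instead of looping forever. In the real script, fetch would be something like a call to getwebpage() followed by locate_value().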

I skipped the proper practice of separating data from code just to 'get-'er-done' quick and dirty, primarily because I was struggling with Python 3's strong typing of data (str versus bytes). I couldn't post because my call to "servers.post(open('WotDtoday.txt', 'rb'))" came back with an "expected string, not bytes" error when I was reading the template from a file, and I was confused by the fact that it accepts and publishes when it is handed a file opened in binary. WTF? Much time was lost in stringvar.encode('utf_8') and stringvar.decode('utf_8'). So I hardcoded the result into the program, then told the program to write a temporary 'today' file which it later reads with a direct open in servers.post(open('WotDtoday.txt', 'rb')), go figure. Again, poor practice, quick and dirty, but it works. The code really should be MUCH simpler, with a single line such as (but certainly miscoded) servers.post(open('WotDtemplate.txt', 'rb').replace('WotDvalues.txt')), using print formatting instead, such that the template file keeps %placeholders and the values file supplies the values to format in: groups=alt.test.only, placeholder=WotD, newsserver=mynews.provider.com et cetera.
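
The template-plus-values idea could look roughly like the sketch below. It keeps everything as str while the %(name)s placeholders are filled in and encodes to bytes exactly once at the end, which is one way to sidestep the str/bytes back-and-forth. The helper names, the key=value file layout, the header fields and the example address are all assumptions for illustration. The sketch stops short of contacting a server: nntplib's post() is documented as taking a file object opened for binary reading (or an iterable of bytes), so the io.BytesIO object at the end is presumably what would be handed to it in place of the temporary file.

```python
import io


def parse_values(text):
    """Turn 'key=value' lines into a dict, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        values[key.strip()] = value.strip()
    return values


def build_article(template_text, values):
    """Fill %(name)s placeholders as str, then encode once at the very end."""
    article = template_text % values   # str in, str out
    return article.encode('utf-8')     # bytes for a binary consumer


# Inline stand-ins for a WotDtemplate.txt and a WotDvalues.txt (made-up content).
template = ("From: %(sender)s\n"
            "Newsgroups: %(groups)s\n"
            "Subject: WotD for today\n"
            "\n"
            "Today's word is %(wotd)s.\n")
values_text = ("# hypothetical values file\n"
               "groups=alt.test.only\n"
               "sender=wiz@example.invalid\n"
               "wotd=pumpkin\n")

article_bytes = build_article(template, parse_values(values_text))
article_file = io.BytesIO(article_bytes)  # behaves like open(..., 'rb')
print(article_bytes.decode('utf-8'))
```

A literal percent sign in such a template has to be written as %% or the formatting step will choke on it.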

Attached is the way too verbose code.

Next up - another futile endeavor: Scrabble rack anagrams for quick, high-value results for an online radio contest points WordHoard, which is going away at the end of this month, January 2016. It has a keen sort and a secret - a really good, concise dictionary to draw candidates from.

And a prosperous new year to you all ?(o=8> wiz.

--
   All ladders in the Temple of the Forbidden Eye have thirteen steps. 
There are thirteen steps to the gallows, firing squad or any execution. 
  The first step is denial...                           Don't be bamboozled: 
        Secrets of the Temple of the Forbidden Eye revealed! 
           Indiana Jones? Discovers The Jewel of Power! 
          visit ?(o=8> http://disneywizard.com/
DisneyWizard the Fantasmic!

Usenet does not normally support attachments: It is a text-only medium.

--
Canada is all right really, though not for the whole weekend. 

"Saki"
The Natural Philosopher

Shh, walk away slowly and don't look at his face.

Rob Morley
