Overview reference
Other Implementations
Problem
Goal
Obtain a valid URL from the command line and determine whether the site has an RSS feed at <url>/feed. If it does, display a list of all the items and let the user open a post by clicking on an item.
Pseudo-code
- Get url from user
- Make request to url
- Find rss feed link
- Check if rss feed exists, i.e. there is content at url/feed
- Else return not found
- Make request to feed link
- Create UI to display news feed
Solution
Get url from user
def geturlfromuser() -> str:
    userinput: str = input("Type in a valid url ")
    return userinput
This function simply reads the URL typed in by the user.
Validate Url
Before making any requests, we have to know that the input at least looks like a web address. With the urlparse function from the urllib.parse module, I can extract the protocol (scheme) from the user's input and check that it is either http or https.
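from urllib.parse import urlparse

def validateurl(url: str) -> bool:
    parsed = urlparse(url)
    allowedscheme = ["http", "https"]
    # The scheme is the part before "://", empty if the user left it out
    scheme: str = parsed.scheme
    return scheme in allowedscheme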
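To see what the scheme check accepts and rejects, here is a minimal sketch that runs urlparse over a few sample inputs. The sample URLs and the helper name has_web_scheme are made up for illustration. Note that the check only looks at the scheme, so it says nothing about whether the host actually exists.

```python
from urllib.parse import urlparse


def has_web_scheme(url: str) -> bool:
    # Hypothetical helper mirroring the scheme check described above
    return urlparse(url).scheme in ("http", "https")


samples = [
    "https://example.com/blog",  # scheme is "https"
    "http://example.com",        # scheme is "http"
    "ftp://example.com",         # scheme is "ftp", not a web scheme
    "example.com",               # no scheme at all, parsed as a path
]

for sample in samples:
    print(sample, has_web_scheme(sample))
```

Only the first two samples pass, which is why the prompt has to be answered with the full address including http:// or https://.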
Remove trailing slashes
This function ensures there are no trailing / characters at the end of the user’s url.
def cleanurl(url: str) -> str:
    # rstrip removes every trailing "/" and is safe on an empty string
    return url.rstrip("/")
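The reason the trailing slash matters shows up when the feed address is built: appending /feed to a URL that already ends in / would produce a double slash. A minimal sketch, where build_feed_url is a hypothetical helper and str.rstrip is used as one possible way to strip any number of trailing slashes:

```python
def build_feed_url(url: str) -> str:
    # Hypothetical helper: strip every trailing "/" and append the feed path
    return "{}/feed".format(url.rstrip("/"))


print(build_feed_url("https://example.com/"))
print(build_feed_url("https://example.com"))
```

Both calls produce the same feed address, with or without the trailing slash in the input.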
Fetch the xml file at the user’s url
Once the URL has been validated and cleaned up, we can make requests. Since we are only retrieving data, the HTTP GET method is sufficient. Some JavaScript HTTP clients (axios, for example) reject any bad response (4xx and 5xx status codes) as an error by default. The requests module does not: a bad response is returned like any other, and you have to call response.raise_for_status() to turn it into an exception. When the status code is 2xx, the response.raise_for_status() call raises nothing.
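import requests
from requests.exceptions import HTTPError

def fetchxmlasstring(url: str) -> str:
    try:
        response = requests.get(url)
        # Raises HTTPError for 4xx and 5xx responses
        response.raise_for_status()
    except HTTPError as httperror:
        print("Error:fetchxmlasstring {}".format(httperror))
    except Exception as e:
        print("General exception: {}".format(e))
    else:
        # Runs only when the try block raised nothing
        print("done")
        return response.text
    # Reached only after one of the except blocks handled a failure
    return ""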
The response content can be read from the text property of the response object. As seen above, there are two except blocks, each for a different exception type. This lets the program react differently depending on the type of exception thrown.
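Python checks except blocks from top to bottom and runs the first one whose type matches, which is why the more specific block has to come first. The sketch below shows this without touching the network. FakeHTTPError is a made-up stand-in for the HTTPError class from requests, and classify is a hypothetical helper.

```python
class FakeHTTPError(Exception):
    """Made-up stand-in for requests' HTTPError, so no network is needed."""


def classify(action) -> str:
    try:
        action()
    except FakeHTTPError:
        # Specific block: runs only for HTTP-style failures
        return "http error"
    except Exception:
        # General block: catches everything the block above did not
        return "general error"
    else:
        # Runs only when the try block raised nothing
        return "ok"


def bad_status():
    raise FakeHTTPError("404 Not Found")


def bad_input():
    raise ValueError("not a url")


def fine():
    return None


for action in (bad_status, bad_input, fine):
    print(action.__name__, "->", classify(action))
```

If the two except blocks were swapped, the general block would catch FakeHTTPError as well and the specific one would never run.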
Get Item elements from XML
After that, it is time to extract the needed elements of the XML tree. According to this article, the most recent blog posts or articles are represented as item elements in the tree. The child nodes of each item element provide basic information about the specific post, for instance the article title, description and link. For this exercise, only these three elements are needed. Using the xml.etree.ElementTree module (imported as ET), the XML string can be transformed into an object with methods for selecting specific elements.
import xml.etree.ElementTree as ET
from typing import List

def getitems(content: str) -> List[Item]:
    root = ET.fromstring(content)
    items: List[Item] = list()
    print("finding items")
    for element in root.findall(".//item"):
        # findtext returns the default when a child element is missing
        title = element.findtext("title", default="")
        description = element.findtext("description", default="")
        link = element.findtext("link", default="")
        items.append(Item(title, description, link))
    return items
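One thing to watch for: find returns None when a child element is missing, so calling .text on the result raises an AttributeError. A minimal sketch of a more forgiving lookup using findtext with a default value; the sample XML is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up item that has a title and a link but no description
sample = "<item><title>Hello</title><link>https://example.com/hello</link></item>"
item = ET.fromstring(sample)

# find returns None for the missing child, so .text would fail here
print(item.find("description"))

# findtext returns the default instead of raising
title = item.findtext("title", default="")
description = item.findtext("description", default="")
print(repr(title), repr(description))
```

Whether an empty string is the right fallback depends on how the GUI should display incomplete posts; it is only one reasonable choice.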
You may wonder why I used ".//item" and not "item". I did use "item" at first, until I read the documentation in more detail. According to the documented path syntax, searching by tag alone selects only the DIRECT children of the current node that have the given tag, e.g. item. If you open an RSS XML file, you'll notice that the item elements are not direct children of the root, so root.findall("item") returns an empty list. In the path, "." stands for the current node and "//" selects all subelements on every level beneath it, so ".//item" finds item elements no matter how deep they sit. Try viewing the tree with an online XML to JSON tree tool; you should see that each item falls under the channel node, which is a direct child of the root node.
Note: .// does not mean exactly two levels below the current node; it matches descendants at any depth. To select only that level, the explicit path "channel/item" also works.
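To check how these path expressions behave, here is a minimal sketch on a made-up feed with two items. The feed content is invented for illustration; only the element layout (rss, then channel, then item) follows the usual RSS structure.

```python
import xml.etree.ElementTree as ET

feed = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title></item>
    <item><title>Second post</title></item>
  </channel>
</rss>"""

root = ET.fromstring(feed)  # root is the <rss> element

direct = root.findall("item")            # direct children of <rss> only
explicit = root.findall("channel/item")  # spell out the path via <channel>
anywhere = root.findall(".//item")       # every <item> at any depth

print(len(direct), len(explicit), len(anywhere))
```

The first search finds nothing because the only direct child of the root is channel, while the other two both find the two items.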
Setup the Graphical User Interface
With all the effort we've put in, we have to see something, right? Tkinter is a well-known module for quickly drawing up an interface in Python. The expected GUI for this exercise is a list of items, each showing two non-editable text fields (title and description) and a button (for opening the post in a browser).
import tkinter

def setupui(items: List[Item]) -> None:
    top = tkinter.Tk()
    for item in items:
        label = tkinter.Label(top, text=item.title)
        # Bound method: each button opens its own item's link
        button = tkinter.Button(top, text="Open Post",
                                command=item.openbrowser)
        description = tkinter.Message(top, text=item.description)
        label.pack()
        description.pack()
        button.pack()
    # Start the event loop once every widget has been created
    top.mainloop()
Since each button has to open a browser to a different URL, I decided to create an Item class with title, description and link instance variables, one instance per feed item. The link value is used by the instance method openbrowser, which is bound to the corresponding button. Once all the elements are created, the GUI is displayed.
import webbrowser

class Item:
    title: str
    description: str
    link: str

    def __init__(self, title: str, description: str, link: str):
        self.title = title
        self.description = description
        self.link = link

    def openbrowser(self) -> None:
        # Open this item's post in a new browser tab
        webbrowser.open_new_tab(self.link)
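A bound method such as item.openbrowser remembers the instance it belongs to, so every button gets its own link. For comparison, the sketch below shows what can go wrong with a plain lambda created in a loop, and how functools.partial avoids it. The links and the open_link function are made-up stand-ins that record calls instead of opening a browser.

```python
from functools import partial

links = ["https://example.com/a", "https://example.com/b"]
opened = []


def open_link(link: str) -> None:
    # Stand-in for webbrowser.open_new_tab: just record the link
    opened.append(link)


# Each lambda looks up `link` when it is called, after the loop has finished
late_bound = [lambda: open_link(link) for link in links]

# partial stores the current value of `link` when the callback is created
early_bound = [partial(open_link, link) for link in links]

late_bound[0]()   # records the last link, not the first
early_bound[0]()  # records the first link
print(opened)
```

With the lambda version every button would open the last post in the feed, which is the kind of bug the class with a bound method sidesteps.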
Start the Method
def start():
    url: str = geturlfromuser()
    if validateurl(url):
        cleanedurl: str = cleanurl(url)
        content: str = fetchxmlasstring("{}/feed".format(cleanedurl))
        # content is empty when the request failed
        if content:
            items: List[Item] = getitems(content)
            setupui(items)
    else:
        print("invalid url {}".format(url))
Run Script

Running python script for RSS Feed GUI
Full Code
Author Notes
This is most likely not the cleanest implementation, just a heads up.