Monday, September 17, 2012

Finding Unpublished RSS Feeds In Sites

I do most of my content consumption via RSS (at the moment using Google Reader): technical blogs, comic strips, news and podcasts, all coming to me in one centralized location. There is simply no beating it. If I had to visit the sites of all my 70+ subscriptions to see what was new, it would take a considerable amount of time each day.

Yesterday a journalist I enjoy reading updated her site and moved to a new domain. (It's an Israeli blog about moving to a small town in the desert, and it moved from: http://inmydeserthome.blogspot.co.il/ to http://www.mydesert.co.il/).

The new design looks great, but I couldn't find an RSS/Atom button to subscribe to the new site. I was very disappointed, but I knew from past experience that many sites are built on Content Management Systems (CMS) that automatically create RSS feeds whether the owner wants them or not, so a feed may exist even when the site doesn't show an RSS button.

So to find this unpublished RSS feed we can follow these steps:
1. We open the site in our browser and look at its source (in most desktop browsers this can be done by right-clicking anywhere on the page and selecting "View Page Source" or "View Source").
We will get a page filled with the site's code.


2. In the source code we search for RSS (in Firefox this can be done by pressing Ctrl+F and entering RSS in the search box).






3. In the tag that contains the match, look for the link to the RSS feed. A tag is everything between a pair of angle brackets < >, and the link will most likely appear as the value of an "href=" attribute.




4. Now, if this link is an absolute url (that is, a full url starting with http), we can just copy it, and this is what we are looking for: http://mydesert.co.il/blogrss.aspx.
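The manual search in steps 1 to 4 can also be sketched in a few lines of Python. This is only a rough illustration, not a complete feed finder: it assumes the site advertises its feed in a <link> tag whose type is "application/rss+xml" or "application/atom+xml" (a common convention, but not one every site follows), and the sample HTML below is made up for the example. The names FeedLinkParser and find_feed_links are my own.

```python
from html.parser import HTMLParser

# Assumption: feeds are advertised with one of these MIME types.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects the href of every <link> tag that advertises a feed."""

    def __init__(self):
        super().__init__()
        self.feed_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None (e.g. <link type>), so guard for that.
        link_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if link_type in FEED_TYPES and href:
            self.feed_links.append(href)


def find_feed_links(html_source):
    """Return the feed hrefs found in a page's source, in document order."""
    parser = FeedLinkParser()
    parser.feed(html_source)
    return parser.feed_links


# Made-up page source, used only to demonstrate the function.
sample = """
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS"
      href="http://mydesert.co.il/blogrss.aspx">
</head><body></body></html>
"""

print(find_feed_links(sample))
```

The function returns each href exactly as it is written in the page, so the result is only directly usable when the site wrote a full url.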






However, some sites (like Ctrl+Alt+Delete [Cad Comic]) post a relative url that can't be used as is. You can recognize this type of url because it doesn't start with http.



With relative urls there are two cases: the url either starts with "/" or it doesn't.
If it doesn't start with "/", you append it to the url of the folder the current page is in (the page's url up to and including its last "/").
For example, if the url of our page was "http://www.somefakeurl.com/somefakeblog/" and the RSS link we found was "rssfeeds.com", the correct url for the RSS feed would be "http://www.somefakeurl.com/somefakeblog/rssfeeds.com".

If it does start with "/", we append it to the domain's url.
For example, if the url of our page was "http://www.somefakeurl.com/somefakeblog/" and the RSS link we found was "/rssfeeds.com", the correct url for the RSS feed would be "http://www.somefakeurl.com/rssfeeds.com".

Or in the case of Ctrl+Alt+Delete - http://cdn.cad-comic.com/rss.xml
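If you would rather not work out relative urls by hand, Python's standard library applies the same rules through urllib.parse.urljoin. The small sketch below reuses the fake urls from the examples above; the helper name resolve_feed_url is my own.

```python
from urllib.parse import urljoin


def resolve_feed_url(page_url, feed_href):
    """Turn the href found in the page source into a full feed url."""
    # urljoin leaves absolute urls untouched and resolves relative ones
    # against the url of the page they were found on.
    return urljoin(page_url, feed_href)


page_url = "http://www.somefakeurl.com/somefakeblog/"

# Relative url without a leading "/": resolved against the page's folder.
print(resolve_feed_url(page_url, "rssfeeds.com"))
# http://www.somefakeurl.com/somefakeblog/rssfeeds.com

# Relative url with a leading "/": resolved against the domain.
print(resolve_feed_url(page_url, "/rssfeeds.com"))
# http://www.somefakeurl.com/rssfeeds.com

# Absolute url: returned as is.
print(resolve_feed_url(page_url, "http://mydesert.co.il/blogrss.aspx"))
# http://mydesert.co.il/blogrss.aspx
```

When the page url ends with a file name rather than "/", urljoin drops that last segment before appending, which is the same thing a browser does.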




This method worked for me on several sites, and can also be used to easily find hard-to-spot RSS buttons.

2 comments:

  1. Great thinking! I'm disappointed RSS never caught on and that Google seems to be abandoning it. Nice to know some platforms (e.g. Wordpress) still respect it. -b

  2. Thanks Benjamin! I couldn't agree more. Twitter/G+/Facebook and all those streams are great if you don't care whether you miss items or not. On them things come and go. But one of the benefits of RSS is that things get piled up until you see them. You don't lose a great post just because you weren't online and watching when it was posted.
