In this post I’ll discuss an issue I tackled a short while ago – open redirects. But first, the story of how I got to it. Feel free to skip ahead to the technical discussion.
Our analytics for plnnr.com, the trip-planning website, weren’t working as well as we wanted. We use Google Analytics, and it is hard to generate the specific reports we want; when we did get one, the numbers seemed inaccurate. To partially alleviate the issue, I was asked to add tracking pixels for Facebook and AdWords, so that we could better track conversions.
For us, an “internal” conversion is when a user clicks on a link to a booking url (for a hotel, or any other “bookable” attraction).
After reviewing the requirement, I decided that the best course of action would be to create an intermediate page on which the tracking pixels would appear. Such a page would receive the url to redirect to as a parameter, and would contain the appropriate tracking pixels.
Description of problem
Let’s say we build the url for the page like so:
This page will load the appropriate tracking pixels, and then redirect to the given url (the X).
The problems are:
1. We are potentially exposing ourselves to Cross Site Scripting (XSS), if we don’t filter the redirect url correctly. A malicious website could create links to our page that will run scripts in our context.
2. A malicious webmaster could steal search engine authority. Let’s say he has two domains: a.com and b.com, of which he cares about b.com. He creates links on a.com to:
A search engine crawls his page, sees the links that go through our domain, and passes ourdomain.com’s authority on to b.com. Not nice.
3. A malicious website could create links to ourdomain.com that redirect to some malware site. This harms the reputation of ourdomain.com, and also gives phishers more convincing links, since they appear to point to ourdomain.com.
Before we handle the open-redirect issues it’s important to block cross site scripting attacks. Since the attack might be possible by injecting code into the url string, this is doable by correctly filtering the urls, and by using existing solutions for XSS.
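To make the filtering step a bit more concrete, here is a minimal sketch in Python. It is not the code that runs on the site; the function names are made up for illustration, and it assumes that only absolute http(s) urls should ever be accepted as redirect targets. It rejects anything with another scheme (such as javascript: or data:), and escapes the url before it is written into the HTML of the intermediate page.

```python
import html
from urllib.parse import urlsplit

# Assumption: only ordinary web links are valid redirect targets.
ALLOWED_SCHEMES = {"http", "https"}


def is_plain_web_url(url: str) -> bool:
    """Return True only for absolute http(s) urls that include a host."""
    # Reject whitespace and control characters up front, since browsers
    # and url parsers do not always agree on how to treat them.
    if any(ch.isspace() or ord(ch) < 0x20 for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError for some malformed urls.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)


def escape_for_html(url: str) -> str:
    """Escape a url before placing it inside an HTML attribute."""
    return html.escape(url, quote=True)
```

A check like this only covers the url itself. The rest of the page still needs the usual XSS protections, for example a templating engine that escapes output by default.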
As for the open redirect:
1. Non-solution: cookies. We can pass the url we want in a cookie. Since cookies may only be set by our domain, other websites would not be able to set the redirect url. This doesn’t work well if you want more than one redirect link, or with multiple pages open, etc.
2. Checking the referrer (“referer”), and allowing redirects to come only from our domain. This will break for all users whose browser or security software hides referrer information, for example, those using ZoneAlarm. Google suggests a more permissive variant: block the redirect only if referrer information is available and it is external. That way clients that hide the referrer still get through.
3. Whitelisting redirect urls. This solution actually comes in two flavors: one is keeping a list of all possible urls, and then checking urls against it. The other is keeping a list of allowed url parts, for example, domains. While keeping track of all allowed urls may be impractical, keeping track of allowed domains is quite doable. The downside is that you have to update that list each time you want to allow another domain.
5. Robots.txt. Use the robots.txt file to prevent search engines from indexing the redirect page, thereby mitigating at least risk number 2.
6. Generating a token for the entire session, much like CSRF protection. The session token is added to all links, and is later checked by the redirect page (on the server side). This is especially easy to implement if you already have an existing anti-csrf mechanism in place.
7. A combination of the above.
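The permissive referrer check and the domain whitelist from the list above can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the host names and partner domains are placeholders, and the function names are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

# Placeholder values; a real site would list its own hosts and partners.
OUR_HOSTS = {"ourdomain.com", "www.ourdomain.com"}
ALLOWED_TARGET_DOMAINS = {"booking.example", "hotels.example"}


def _split(url: str):
    """Return (scheme, host) for a url, or ("", "") if it cannot be parsed."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return "", ""
    return parts.scheme, (parts.hostname or "")


def referrer_allows(referrer: Optional[str]) -> bool:
    """Permissive check: block only when a referrer exists and is external."""
    if not referrer:
        # Clients that hide the referrer are let through.
        return True
    _, host = _split(referrer)
    return host in OUR_HOSTS


def target_is_whitelisted(url: str) -> bool:
    """Allow http(s) urls whose host is a whitelisted domain or a subdomain of one."""
    scheme, host = _split(url)
    if scheme not in ("http", "https"):
        return False
    return any(
        host == domain or host.endswith("." + domain)
        for domain in ALLOWED_TARGET_DOMAINS
    )
```

Note that the whitelist compares whole host names, matching either the domain itself or a dot followed by the domain. A plain substring or suffix test would also accept hosts such as evilbooking.example.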
Discussion and my thoughts
It seems to me that blocking real users is unacceptable. Therefore, filtering only according to referrer information is unacceptable if it blocks users who send no referrer information.
At first I started to implement the url signing mechanism, but then I saw the cost associated with it, and reassessed the risks. Given that cross-site-scripting is solved, the biggest risk is stealing search-engine-authority. Right now I don’t consider the last risk (harming our reputation) as important enough, but this will become more acute in the future.
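For readers wondering what url signing might look like: one common approach, which is not necessarily what I started to build, is to attach an HMAC of the target url to every redirect link and verify it on the redirect page. The sketch below assumes a server-side secret and a hypothetical /redirect path with url and sig parameters.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Assumption: a secret known only to the server. Load it from configuration
# in practice instead of hard-coding it.
SECRET_KEY = b"replace-with-a-real-secret"


def sign_url(target: str) -> str:
    """Return a hex HMAC-SHA256 signature for the target url."""
    return hmac.new(SECRET_KEY, target.encode("utf-8"), hashlib.sha256).hexdigest()


def build_redirect_link(target: str) -> str:
    """Build a link to the (hypothetical) redirect page, including the signature."""
    return "/redirect?" + urlencode({"url": target, "sig": sign_url(target)})


def signature_is_valid(target: str, sig: str) -> bool:
    """Check the signature in constant time before redirecting."""
    expected = sign_url(target).encode("ascii")
    return hmac.compare_digest(expected, sig.encode("utf-8"))
```

Only links generated by our own pages carry a valid signature, so a link crafted on another site fails the check. The cost is that every booking link on the site has to be generated through something like build_redirect_link.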
Handling this in a robots.txt file is very easy, and that was the solution I chose. I will probably add more defense mechanisms in the future. When I do add another defense mechanism, it seems that using permissive referrer filtering, and the existing anti-csrf code will be the easiest to implement. A whitelist of domains might also be acceptable in the future.
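As an illustration, the rule itself is only two lines. The sketch below embeds such a rule in a Python string and uses the standard library’s robots.txt parser to confirm that it blocks the redirect page while leaving other pages crawlable. The /redirect path is an assumption for the example; substitute the real path of the intermediate page.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; /redirect stands in for the real page path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /redirect
"""


def crawler_may_fetch(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt lets user_agent fetch the url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, crawler_may_fetch(ROBOTS_TXT, "https://ourdomain.com/redirect?url=https%3A%2F%2Fb.com") is False, while the same check for an ordinary page on the site is True. Keep in mind that robots.txt only affects well-behaved crawlers, so it addresses the search-engine-authority risk and nothing else.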
If you think I missed a possible risk, or a possible solution, or you have differing opinions regarding my assessments, I’ll be happy to hear about it.
My thanks go to Rafel, who discussed this issue with me.