Distance from London to Cambridge where exactly?

Sometimes I get a little confused. I take on too much and drop the ball or lose my focus. Yesterday I messed up big time and put something live that was broken. Sorry to the 1000 or so people I must have annoyed during the time it took me to fix it.

Lesson learnt: being lean and agile has its downsides too.

I find that putting my thoughts to paper helps clarify things and sets me back on track. So, to help with this aspect of my life, and perhaps to keep up with where I’m going, I thought I’d share some of my thought processes with whoever wants to read them.

I won’t do it every day or every time I make a change, but besides helping me it might help someone else somewhere, and if that’s achieved too then it’s a double win.

When it comes to web pages and forms, people often do odd things or take actions we didn’t anticipate. These cause headaches at first, but eventually they help us become better coders and, as a consequence, find ways of improving things for users on their next visit (if they come back).

For some reason, lots of people tend not to think outside their own locales. So a person who lives in London, England, looking to go to Cambridge, England, might just search for “distance from London to Cambridge” in a search engine, not really caring about or appreciating that there are other places on the planet that happen to share the same names.

The pages returned by search engines are often pretty hit and miss, with some doing OK and some not so much.

Some search engines do the geo lookup thing and try to give people results from their own country’s TLDs (top-level domains).

So a user on Google UK might see results that are predominantly from the .uk domain space, a user in India results from the .in domain space, and so on.

The better websites (IMO) do a little bit of disambiguating. They appreciate that their sites need to be well structured and well thought out, and they do their best to reduce user confusion so that the majority of users, if not all, are satisfied.

Today, I noticed that some users had arrived at distantias via old redirects that were suboptimal, in that most of them lacked country or state data.

Before my latest update, a user who had managed to arrive at a page like “http://www.distantias.com/how-far-from-london-to-cambridge.html” would have received a message, delivered via a ‘lightbox’, informing them that the locations they’d searched for lacked any country information and that, as a result, the data returned might not be as accurate as they’d hoped. This worked OK-ish, but I grumbled every time I saw it and was quietly thinking about how I could improve it.
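
A minimal sketch of the first step, pulling the two bare place names out of an old-style URL so they can be disambiguated later. It is written in Python purely for illustration; the slug pattern and the function name `parse_legacy_url` are assumptions based on the example URL above, not what distantias actually runs.

```python
import re
from urllib.parse import urlparse

# Assumed shape of the old slugs: how-far-from-<origin>-to-<destination>.html
LEGACY_SLUG = re.compile(
    r"^how-far-from-(?P<origin>.+?)-to-(?P<destination>.+)\.html?$"
)


def parse_legacy_url(url):
    """Return (origin, destination) place names from a legacy URL, or None."""
    # Keep only the last path segment, e.g. "how-far-from-london-to-cambridge.html".
    slug = urlparse(url).path.rsplit("/", 1)[-1].lower()
    match = LEGACY_SLUG.match(slug)
    if match is None:
        return None
    # Hyphens inside a name are treated as spaces ("new-york" becomes "new york").
    origin = match.group("origin").replace("-", " ")
    destination = match.group("destination").replace("-", " ")
    return origin, destination
```

Note that the names come back with no country or state attached, which is exactly the problem. A place name that itself contains "-to-" would also split in the wrong spot, so a real version would need to check the result against a list of known places.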

So, to improve it, I had to think about how I could programmatically determine their locational intent: what precisely they were looking for, how I was going to determine that, and what the options were for doing so.

I thought about the user first and what I can determine or know about them.

Every request reaches the server with an IP address, which PHP exposes as $_SERVER['REMOTE_ADDR']. We can then run that address through various IP lookup tools to determine geolocation specifics like city, region, country and postcode.

They’re not perfectly accurate, but they’re an indicator nonetheless. Based on these, I can begin to make more educated guesses: a user from the UK isn’t likely to be that interested in the distance from London, UK to Cambridge, MA, USA, or from London, Ohio to Cambridge, UK, just as a user from Cambridge, Massachusetts wouldn’t be that interested in the distance from London, England to Cambridge, England.
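
Here is a rough sketch of that educated guess, in Python for illustration. Everything in it is a stand-in: `IP_COUNTRY_TABLE` takes the place of a real IP lookup tool (the ranges are reserved documentation networks, not real visitors), `PLACES` is a tiny hypothetical gazetteer, and the function names are made up for the example.

```python
import ipaddress

# Hypothetical stand-in for an IP geolocation service.
# Both ranges are reserved for documentation (RFC 5737).
IP_COUNTRY_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "GB"),
    (ipaddress.ip_network("198.51.100.0/24"), "US"),
]

# Hypothetical gazetteer rows: (name, region, country code).
PLACES = [
    ("London", "England", "GB"),
    ("London", "Ohio", "US"),
    ("London", "Kentucky", "US"),
    ("Cambridge", "England", "GB"),
    ("Cambridge", "Massachusetts", "US"),
]


def country_for_ip(remote_addr):
    """Return a country code for the address, or None if it is unknown."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        return None  # malformed address: no hint available
    for network, country in IP_COUNTRY_TABLE:
        if address in network:
            return country
    return None


def candidates(name, country=None):
    """Places called `name`, narrowed to `country` when that leaves any."""
    matches = [p for p in PLACES if p[0].lower() == name.lower()]
    if country is None:
        return matches
    in_country = [p for p in matches if p[2] == country]
    # The IP country is only a hint, so fall back to every match
    # rather than returning nothing.
    return in_country or matches
```

With this data a visitor whose address resolves to "US" is left with two Londons (Ohio and Kentucky) and one Cambridge, so the country hint narrows things but does not settle them.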

With these in mind, I can begin to hone any query to say

“Ok, so there’s this guy from the USA and he’s landed on this page, try to give him what he needs”.

So, using the country from his IP address, I can redirect him to http://www.distantias.com/distance-from-london_united_states-to-cambridge-united_states.htm and give him what he needs, right?


Alas, whilst the page in question is certainly better than the one he first arrived at, it happens to default to London, KY and not London, Ohio (or London, Arkansas or London, West Virginia for that matter).

On this occasion he happens to live in London, Ohio, so he is a little confused when he sees the maps for London, Kentucky *.

Unless the IP address provided to us is accurate enough to include state data too (not all will be), I have to begin to think about what other factors I can take into account that will cater to the biggest audience.

Showing other disambiguation options is definitely one option (coming soon), and you might already have seen them elsewhere, but they aren’t always necessary and they can clog limited space with things that aren’t needed.

“Hello fool, you landed on X, but there are also other towns named X in states A, B, C and D. Here are a few other options if you need them.”

That works generally. In an ideal world we’d be able to force people to enter city, state and country inputs, but the reality is we won’t be able to, and if we try too hard all we’ll do is piss people off. Catering for the needs of the many is the best approach, which means that in situations like those described, if we don’t have the user’s state or city data, we’ll look to use something else.

In this regard we have population data.

London, OH with its 9,978 or so residents beats London, Kentucky with its 8,078 residents so, in the spirit of democracy, I’m going to base any lookups on population size first. Any data I ping upstream will be based on population size.
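
The rule can be sketched in a few lines of Python (used here for illustration only). The `Place` class and `pick_place` function are hypothetical names; the two population figures are the ones quoted above, and the other Londons are left out because no figures are given for them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Place:
    name: str
    state: str
    population: int


# Population figures as quoted in the post.
LONDONS_US = [
    Place("London", "KY", 8078),
    Place("London", "OH", 9978),
]


def pick_place(candidates: List[Place], user_state: Optional[str] = None) -> Optional[Place]:
    """Prefer a candidate in the user's state; otherwise the most populous one."""
    if not candidates:
        return None
    if user_state:
        same_state = [p for p in candidates if p.state == user_state]
        if same_state:
            candidates = same_state
    # No usable state hint: the biggest population wins.
    return max(candidates, key=lambda p: p.population)
```

Called with no state, `pick_place(LONDONS_US)` returns London, OH. If the IP lookup did supply "KY", the state match takes priority over population and London, KY is returned instead.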

Of course, these are edge cases, but there are a hell of a lot of them, with virtually millions of combinations of conflicting city names, so it’s not something to ignore. Over time, I might also find that other places are more popular in terms of use. The data might suggest that more people are interested in page location Y than in Z, and I could fold that into the outcome. An example here might well be a popular tourist location: small population, but lots of transient visitors. Seasonality might be another factor worth considering.

Ultimately, if we can get people to ask the right specific questions then we can better give them what they need. The Google Places API is a fantastic step in the geolocation space and, when used intelligently, is a great tool for funnelling people into entering the right data, as it pre-populates additional choices based on user keystrokes in real time, but it’s no final solution… yet.

Anyways, thanks for reading; hope you gained a perspective or two.

(* Curiously, in the example shown here http://www.distantias.com/distance-from-london_united_states-to-cambridge-united_states.htm the Google Maps API defaults to the second most populated of the three Londons, which also happens to be the second farthest away.)


