Phil Factor's Phrenetic Phoughts

Simple-Talk columnist
The wilder shores of Transact SQL

Spoofing Popularity: A Warning to Webmasters

Published Monday, March 13, 2006 6:56 PM

In which Phil tries to warn you of the dangers of over-valuing Website-traffic Stats.

A friend who runs a local history website in a rural area of England surprised me by saying that he values a single letter of interest or appreciation more highly than any amount of increased web traffic on his site. He gave me a telling example. He once did a transcription of a five-hundred-year-old bill for repairing a bridge in the locality. He admitted that it was probably of no interest or importance to anyone but a small group of historians. Yet it became consistently one of the most popular pages on his site. Many webmasters would then have filled their site with lots of other transcriptions of five-hundred-year-old repair bills, in the hope of increasing their traffic. Not he. He was curious, and so investigated. After 'googling' around for a while, he discovered that there was, near the bridge, a car park that the misguided local authority had provided for people who came out to enjoy the countryside. It was being used as a meeting place for people who wished to engage in unusual or bizarre sexual practices in cars or hedges. His page was being picked up by various automated crawlers and by various hopeful 'googlers' looking for a partner.

I know of many other examples where 'hits' and 'visits' to websites have been misinterpreted. My friend the historian couldn't care less about traffic, as he does the work for pleasure, but on a commercial site this could be hard on the wallet.

As part of my job, I test websites with simulated traffic to see how they stand up, and to iron out the problems before they are made public. I use tools that would horrify the average webmaster. One can simulate the user agent and the source IP address. It is easy to simulate normal traffic, send GET and POST HTTP requests, and do logins and XML-based transactions, FTP, POP3 and SMTP. The part I enjoy most is to spoof names and addresses, card details and so on, or to provide messages that seem to have come from a real person. I use only SQL Server functions and procedures, with the bare minimum of command-line utilities. I mention all this not to boast, but to point out that it is part of the toolkit of a lot of IT people.

It is easy to be malicious with this sort of tool. Have you noticed the sponsored links on Google? These are 'pay per click'. Every time an automaton clicks on one of these links, someone is charged for it. I shudder when commercial concerns sign up to this sort of deal, just as I do when marketing firms offer to charge by the amount of increased traffic to a site. In the next world, perhaps, where there is no malice or competitive drive, this would make perfect sense. In this world, the only deal that makes commercial sense is to pay by the number of people who first used the sponsored link and then went on to purchase something. In the meantime, it is a chilling idea that someone might set an automaton to click on the links of their rivals in business. How could one possibly tell?
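To put rough numbers on the difference between the two kinds of deal, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the `Campaign` fields, the fees (in pence), the conversion rate and the click counts come from no real advertising contract.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical figures for one sponsored-link campaign (money in pence)."""
    genuine_clicks: int       # clicks from real prospective customers
    bogus_clicks: int         # clicks from an automaton, which never buys
    conversion_rate: float    # fraction of genuine clicks that end in a sale
    fee_per_click: int        # charged for every click, genuine or not
    fee_per_sale: int         # charged only when a click leads to a sale


def sales(c: Campaign) -> int:
    # Only genuine visitors can buy; bogus clicks contribute nothing.
    return round(c.genuine_clicks * c.conversion_rate)


def pay_per_click_bill(c: Campaign) -> int:
    # The advertiser cannot tell the clicks apart, so every one is billed.
    return (c.genuine_clicks + c.bogus_clicks) * c.fee_per_click


def pay_per_sale_bill(c: Campaign) -> int:
    # Billing follows sales, so bogus clicks cost the advertiser nothing.
    return sales(c) * c.fee_per_sale


# Same campaign, without and with a rival's automaton clicking the links.
honest = Campaign(1000, 0, 0.02, 50, 2500)
attacked = Campaign(1000, 4000, 0.02, 50, 2500)

for name, c in (("honest", honest), ("attacked", attacked)):
    print(name, sales(c), pay_per_click_bill(c), pay_per_sale_bill(c))
```

With these made-up figures the number of sales and the pay-per-sale bill are identical in both cases, while the pay-per-click bill is five times larger once the automaton joins in.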

I recently had to advise a client who were completely transfixed by the idea of hits and visits as measures of the performance and quality of their website. I implored them to take a realistic and cynical approach. They were about to sign a contract with a ‘Web Marketing’ firm that involved paying the firm fees in proportion to the upturn in traffic to their site. This smelt bad to me. It smelt so bad that I promised them I’d do it for free. I rushed home, got out my toolkit and let ‘em have it through both barrels. When I returned, they were walking on air and were delighted with whatever it was I’d done, even though I was purple in the face and shouting “It’s spoofed! It’s spoofed!”. It was only after considerable sober talking that the truth sank in: that their rock-solid measure of the site’s performance was a quagmire. This was culture shock.

Unfortunately, it is not only the angels who have the ability to spoof web traffic. Those on the dark side share the technology. One ingenious fraud that has taken in several IT websites in the States starts with plagiarism and then gets worse. Initially, someone, usually a lecturer in an IT department of (for some reason) an Asian university, copies out something written by an expert, making only slight cosmetic changes. In the case of SQL databases, it tends to be taken from Joe Celko or Ken Henderson. Joe Celko has written so much he wouldn’t notice and would just think it was someone agreeing with him (I just plagiarised that from Ken Henderson). Recently, they have become confident enough to lift stuff straight out of MSDN. They then offer it to one of the commercial fee-paying websites. The plagiarised articles are quite easy to spot. The surprise is how popular they seem to be. The number of visits they get is quite amazing. The webmasters therefore love them and buy more from the same source. Number of visits? Hmmmm… My thoughts go back to my trusty traffic-spoofing toolkit. Setting this up to produce visits in any website statistics, and fool all but the cleverest stats packages, would be the work of an idle moment. I could even generate the various appreciative comments that they get in the forums. This is not because of the artificial intelligence of my programs, but the natural stupidity of some of the real contributors to the forums. If it is done right it is very difficult to prove, but I’d just warn all webmasters to treat website statistics with a lot of caution and not to draw too many conclusions from them.
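One modest way to apply that caution is to set visit counts beside something harder to fake, such as completed orders. The Python sketch below is only an illustration under stated assumptions: the function names, the `(visitor_id, source)` record layout, the thresholds and the sample data are all hypothetical, and a low conversion rate is merely a prompt to investigate, as my historian friend did, not proof of spoofing.

```python
from collections import Counter


def conversion_by_source(visits, buyers):
    """Return {source: (visitor_count, conversion_rate)}.

    visits: iterable of (visitor_id, source) pairs (hypothetical layout)
    buyers: iterable of visitor_ids that went on to place an order
    """
    visits = set(visits)    # count each (visitor, source) pair only once
    buyers = set(buyers)
    totals = Counter(source for _, source in visits)
    converted = Counter(source for visitor, source in visits
                        if visitor in buyers)
    return {source: (totals[source], converted[source] / totals[source])
            for source in totals}


def suspicious_sources(report, min_visits=100, min_rate=0.001):
    # Plenty of visitors but almost no buyers: worth a closer look.
    return sorted(source for source, (count, rate) in report.items()
                  if count >= min_visits and rate < min_rate)


# Made-up sample: 200 search visitors (6 buy), 500 'partner-site' visitors (none buy).
visits = ([(f"s{i}", "search") for i in range(200)]
          + [(f"p{i}", "partner-site") for i in range(500)])
buyers = [f"s{i}" for i in range(6)]

report = conversion_by_source(visits, buyers)
flagged = suspicious_sources(report)
print(report)
print(flagged)
```

Here the larger source is the one that gets flagged, which is exactly the case that a raw visit count would have presented as the success story.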

Comments

 

AjarnMark said:

A little bit late to be adding on, I know, but I just ran across your posting.

You have made the perfect argument in favor of Affiliate Marketing programs, like the one Amazon.com uses. These programs are set up such that the traffic referrer is paid not based on the volume of traffic they generate, but the volume of SALES that their traffic generates. A true pay-for-performance scenario that the commercial site is happy to support. Expect this trend to continue to grow dramatically as more and more companies realize and accept that high traffic does not necessarily equal high sales.
May 24, 2006 1:18 AM
 

Phil Factor said:

Commercial concerns are certainly under a lot of pressure to agree to Pay-Per-Click. Pay-Per-Sale is infinitely better, as all parties then win, unless, of course, the traffic is bogus. So, if your Internet Marketing Company refuses a deal based on resulting sales, it is likely that they know that their traffic isn't going to result in sales.
I recently worked for a company that got cable channels to agree to put their adverts on the Telly for a percentage of sales rather than a fixed fee. Obviously, everyone must trust the monitoring software, but that is where we come in, isn't it?
May 24, 2006 1:44 PM