According to Google Webmaster Tools, an affiliate site of mine has over 26,000 links from several hundred different domains. That sounds great, doesn’t it?
One problem though – the majority of the links don’t seem to actually exist, and I have no idea where they came from!
I swear, I’ve been up to nothing fishy with this domain (really!), but the vast majority of these links are coming from spammy-sounding subdomains, and I can’t figure out what’s going on. Is someone trying to “linkspambomb” me?
The other day I was doing some routine checks in WMT, and I noticed that my inbound link count was 100 times higher than I had expected.
Upon further examination, I noticed that most of the backlinks came from subdomains, but when I tried to view the pages, they didn’t exist! In fact, some of the subdomains are even on a domain CALLED dontexist.com!
Here’s a small sample of the domains that are linking to me:
7zoom.com/
al367.homelinux.org/
alons819.dontexist.com/
ammon310.dontexist.org/
aries.fluther.com/
arnett486.servegame.org/
august166.homeftp.net/
belv218.ath.cx/
blayloc855.dyndns.ws/
bloun966.webhop.net/
bodna258.webhop.net/
bohrman422.podzone.net/
WTF?
Again, I’ve not been participating in any automated link schemes, or auto-generating content, or doing anything that could explain what’s going on, so it must be some form of attempted sabotage, right?
All I can figure is that someone is trying to knock me out of the game, but it makes no sense that most of the links can’t be viewed. I’ve done a site: search of many of the root domains trying to figure out what they are, but they’re mostly crap, and many are in other languages too.
I’d love some feedback on what or how someone’s jerking me around, as well as some thoughts on what to do about it. There’s no “remove backlinks” option in Webmaster Tools like there is for URLs, but I hate to just leave ’em there… Anyone got any ideas?
Scott Hendison is the CEO of Search Commander, Inc. and a recovering affiliate marketer. He is also one of the founding board members of SEMpdx. Find out more about him at his website, SearchCommander.com.
Hi Scott,
Google revamped the data behind the backlinks feature in Webmaster Tools and has started using more data from “Caffeine”. Their goal is to have fresher, more up-to-date data.
Search Engine Land post: https://searchengineland.com/google-shows-webmasters-more-links-in-webmaster-tools-45579
We have also seen the newly revealed link data for our own sites and our clients’. While we too have steered away from dubious linking schemes, it seems that most sites will attract such links regardless.
The GWT data also seems to be more in line with Yahoo’s Open Site Explorer with regard to the questionable scraper-site links that a website will gather without trying to.
Thanks James, and I know that’s true, but I’m not sure it applies in this case because of two things…
1. Out of nearly 50 sites in that WMT account, this is the only one that’s had this happen to it.
2. Why are the links all bad? Pages not found? Is there some shady competitor-damaging tactic of building spammy links and then removing them?
My guess is someone’s trying to Google-bowl you. I’d check whether other sites in the niche are getting hit too…
FWIW, there’s a YouMoz post with a similar story, I think by Rishi Lakhani (@rishil), where thousands of links showed up in GWT and then were filtered out by Google.
Also, as to you not seeing them, the links may be cloaked to just show up to Googlebot, on a Google IP. A human reviewer would think there was nothing there, or just some technical bug…
Maybe they know they don’t need to build permanent spam links to you anymore. They just keep them up long enough for Google to come through once, and then they can take them down.
You may have to just ride it out.
Thanks Stephen, yep, riding it out seems to be my only option – I SUPPOSE I could submit a question in Webmaster Tools, but that may just draw unnecessary scrutiny. Pretty strange stuff…
There are valid cases where a referral URL cannot be reached — such as when someone is clicking on links and viewing search results from behind an extranet or intranet. They can access the links, but you cannot.
What’s more curious is when Google sees the links but you cannot reach them. You might set your browser up to spoof as Googlebot and then revisit the links to see if Googlebot is allowed to crawl them while general web visitors are not able to do so.
Also try performing an “info:” search with the URLs and see if Google has the pages cached. They could have appeared temporarily on a spammer or content-scraper site that Google subsequently delisted, with the data still sticking around for a brief while in the Webmaster Tools database.
“dontexist.com” appears to have been hosted via https://www.dyndns.com/
It appears to allow people to cheaply host sites from their home PCs, which could also explain why these addresses seem very unstable and temporary…
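The Googlebot-spoofing check suggested above can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions: the helper names (`build_request`, `fetch_status`, `looks_cloaked`) and the browser user-agent string are made up for illustration, and it only changes the User-Agent header. As noted earlier in the thread, links cloaked by Google IP address would still look dead to this test.

```python
import urllib.error
import urllib.request

# Googlebot's classic user-agent string; the browser string is an arbitrary stand-in.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, as_googlebot):
    """Build a GET request that announces itself as Googlebot or as a browser."""
    agent = GOOGLEBOT_UA if as_googlebot else BROWSER_UA
    return urllib.request.Request(url, headers={"User-Agent": agent})


def fetch_status(url, as_googlebot, timeout=10):
    """Return the HTTP status code, or None if the host cannot be reached."""
    try:
        request = build_request(url, as_googlebot)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 404, 403 and friends still carry a status code
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout


def looks_cloaked(bot_status, browser_status):
    """Hypothetical rule of thumb: the page loads for Googlebot's UA but not for a browser's."""
    return bot_status == 200 and browser_status != 200
```

To try it on one of the linking pages, call `fetch_status(url, True)` and `fetch_status(url, False)` and pass both results to `looks_cloaked`. If both come back as 404 or `None`, user-agent cloaking is probably not the explanation.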
Thanks Chris – Good ideas – I spoofed as Googlebot and then revisited a few links, but they’re still not found – info: gets me very little – site: does show that most of the pages are indexed, but none seem to be cached. Huh – oh well…
Wow, those are some really strange links, but I’ve also seen that for some sites. It may be a combination of someone messing with you and scraper bots…
I think it’s fairly clear, since as you stated it’s an affiliate site. If you are really worried, maybe keep an eye on your impressions and clicks in your affiliate account to look for spikes.
If you are using AdSense, use the allowed sites setting to reduce the chance you might be banned if someone is doing something sketchy…
Yeah, it’s affiliate, but all good content, with legitimate reviews and it’s 100% on the up and up – there’s nothing “thin” about it – so maybe, yeah, it’s mostly scraper bots.
The really weird part though is the disappearance of nearly all of ’em! Oh well, I’m not too worried about it, since neither rankings nor traffic have dropped – still, it IS a bit unnerving – glad it’s not happening to a client!
@Scott could it just be that the other sites had been hacked and it’s purely random?
@Scott that’s easy enough to do: a meta tag can prevent Google from caching the sites
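The meta tag in question is presumably the `noarchive` robots directive, which asks search engines not to offer a cached copy. A quick way to check a page for it, sketched in Python with the standard library only (the class and function names are made up for illustration, and `X-Robots-Tag` response headers are ignored):

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def blocks_caching(html):
    """Return True if the page carries a noarchive directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives
```

Feed it the HTML of one of the indexed-but-uncached pages. A `True` result would explain the missing cache without any cloaking being involved.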
I have a feeling that, as these are all related to DynDNS, they are all user-owned sites, but I imagine there must be an automated process to set up the multiple accounts and link them all to you. The interesting thing is that for these links to get indexed, the sites would have needed to be linked to from somewhere.
If you do a query such as site:dontexist.org, you can see that these subdomains are used extensively for some kind of spamming behaviour. I guess someone has figured out a system to make money doing this on a large scale. I’m still none the wiser as to how your site would have been involved or targeted through this, though.
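One way to see how the linking hosts cluster is to group them by parent domain. Here is a minimal Python sketch using the sample hosts from the post. It naively treats the last two labels as the parent domain, which happens to work for these hosts but is an assumption that breaks on suffixes like .co.uk, and the function names are made up for illustration.

```python
from collections import Counter

# Sample of linking hosts copied from the post.
SAMPLE_HOSTS = [
    "7zoom.com/",
    "al367.homelinux.org/",
    "alons819.dontexist.com/",
    "ammon310.dontexist.org/",
    "aries.fluther.com/",
    "arnett486.servegame.org/",
    "august166.homeftp.net/",
    "belv218.ath.cx/",
    "blayloc855.dyndns.ws/",
    "bloun966.webhop.net/",
    "bodna258.webhop.net/",
    "bohrman422.podzone.net/",
]


def parent_domain(host):
    """Naive parent domain: the last two dot-separated labels of the host."""
    labels = host.strip().rstrip("/").lower().split(".")
    return ".".join(labels[-2:])


def count_parents(hosts):
    """Count how many linking hosts share each parent domain."""
    return Counter(parent_domain(host) for host in hosts)


if __name__ == "__main__":
    for domain, total in count_parents(SAMPLE_HOSTS).most_common():
        print(total, domain)
```

Run against the full Webmaster Tools export rather than this small sample, the same count would show how many of the several hundred linking domains are really just a handful of dynamic DNS parents.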
**** Update ****
Over Sunday night, the hosting account for the domain was suspended due to excessive CPU usage (actually, it was suspended because nobody heeded the warning email from the server admins, but I digress…).
Examination of the log files showed these backlinking domains to be the culprits. WTF? Is this some sort of weird spamlinking / denial-of-service combo attack?
I don’t have time for this crap, and I am SO GLAD IT’S NOT A CLIENT SITE, but that could happen tomorrow for all we know, right? Then what?
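For anyone wanting to run the same kind of log check, here is a rough Python sketch that counts hits per referring host. It assumes the server writes the common "combined" access log format and that the spammy domains show up in the Referer field, neither of which is confirmed above. The function name and the sample log lines are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format: ... "REQUEST" STATUS BYTES "REFERER" "USER-AGENT"
LOG_LINE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"')


def referrer_hosts(lines):
    """Count requests per referring host, skipping lines with no usable referrer."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        host = urlsplit(match.group("referer")).hostname
        if host:  # a "-" referrer has no hostname and is skipped
            counts[host] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines; point this at a real access log instead.
    sample = [
        '203.0.113.5 - - [01/Jan/2010:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "http://alons819.dontexist.com/a.html" "Mozilla/5.0"',
        '203.0.113.6 - - [01/Jan/2010:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "http://alons819.dontexist.com/b.html" "Mozilla/5.0"',
        '203.0.113.7 - - [01/Jan/2010:00:00:03 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"',
    ]
    for host, hits in referrer_hosts(sample).most_common(10):
        print(hits, host)
```

Sorting by hit count makes it easy to see whether a few hosts account for most of the load before deciding what to block.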
It would be great if you could list the actual dontexist.com URL. A check on site:alons819.dontexist.com at Google shows nothing is being indexed from that subdomain.
That is actually the accurate subdomain!
It seems like this domain has had some interesting subdomains over the past 1-2 years:
https://www.majesticseo.com/search.php?q=dontexist.com&folder=&x=0&y=0
The case in point is that dontexist.com is a free domain provider, so it’s fairly easy to use their platform for spam.