Building a disavow file and identifying spam links in SEO involves a lot of considerations, and it is especially complex for sites with lots of backlinks. Tools like Moz and/or SEMRush can make the process easier, but their scores shouldn’t be relied on without the following checklist.
Thinking about disavowing sites purely based on a high spam and/or toxic score? Consider these things first:
Common Sense & Context
First and foremost, see if the link makes sense. For example, a link to beauty products on an engineering blog would likely be out of place.
If a backlink like this also has a high spam score, ditch it. It could have been a result of a poor quality link building campaign, and should likely be added to a disavow to avoid future harm.
Once the above is verified, check for blacklists. To do so, visit any one of the following three tools and punch in the respective domain/IP:
The information returned will indicate whether or not the site is listed.
Sites listed on blacklists are likely suffering from their own SEO problems, and it’s possible those problems can be passed on to the sites they link to.
Link Farms & PBN
The next step in identifying spam is looking for links from matching networks. Use the Ahrefs referring IPs tool by visiting Ahrefs.com and navigating to “Referring IPs.” Once a site is entered, a list of referring IPs is returned. By default, these IPs/domains will be grouped by subnet.
There’s no real rule of thumb, and it depends on the size of the site in question, but if one or two subnets make up 15% or more of the entire backlink profile, there might be cause for worry.
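As a rough way to apply the 15% guideline to an exported list of referring IPs, here is a minimal Python sketch. It assumes IPv4 addresses, one entry per referring domain, grouped into /24 subnets. The function names, the /24 grouping, and the defaults are illustrative assumptions, not something Ahrefs prescribes.

```python
import ipaddress
from collections import Counter

def subnet_shares(referring_ips, prefix=24):
    """Map each subnet to its share of the referring IPs (assumes IPv4, /24 by default)."""
    subnets = Counter(
        str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
        for ip in referring_ips
    )
    total = sum(subnets.values())
    return {net: count / total for net, count in subnets.items()}

def concentrated_subnets(referring_ips, threshold=0.15, top_n=2, prefix=24):
    """Return the top_n largest subnets if together they reach the threshold, else []."""
    shares = subnet_shares(referring_ips, prefix)
    top = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    combined = sum(share for _, share in top)
    return top if combined >= threshold else []
```

On small backlink profiles almost any subnet will cross the threshold, so the result is only a prompt for a closer manual look.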
To identify PageRank sculpting, look at other pages on the site. If there’s a consistent pattern of mixing dofollow and nofollow link attributes, it’s possible the owner’s purposely trying to pass SEO equity. As a result, the site could be part of a PBN or farm.
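To get a quick read on how a page mixes followed and nofollow links, a small sketch using Python’s standard-library HTML parser can count them. The class and function names are hypothetical, and the page HTML is assumed to have been fetched already. It only checks for the `nofollow` token in `rel`, so other values such as `sponsored` are counted as followed.

```python
from html.parser import HTMLParser

class LinkRelCounter(HTMLParser):
    """Count <a href> links by whether their rel attribute contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = 0
        self.nofollow = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        if not attr_map.get("href"):
            return  # skip anchors that are not links
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow += 1
        else:
            self.followed += 1

def nofollow_ratio(html):
    """Return the share of links on the page marked nofollow (0.0 if no links)."""
    counter = LinkRelCounter()
    counter.feed(html)
    total = counter.followed + counter.nofollow
    return counter.nofollow / total if total else 0.0
```

Running this across several pages of the same site and comparing the ratios is one way to spot the consistent pattern described above. A single page’s number says little on its own.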
Experienced SEOs can pick up on paid or advertorial links more easily, but it’s fairly straightforward. If the site feels overly ‘advertorial’ and/or contains a lot of pages where it’s likely someone paid for links, there again might be cause for concern.
This could be anything from an ‘advertise’ section with clear intent to sell links to frequently distributed product-focused content with highly targeted transactional-specific anchor text (e.g. buy used cars online). It could also include an overwhelmingly large number of ads placed above the page fold.
Note – This should be considered, but never without ticking off all other boxes first. Many blogs qualify under these guidelines but still pass SEO equity with no issue from Google.
Guest posts are among Google’s favorite types of offending links. When they’re not overdone or handled irresponsibly (irrelevant or blatantly advertorial), they’re fine. But if their sole intent is to manipulate Google rankings, they become hazardous. Check the offending site for guest posts. If there are a lot, it’s possible it’s just another sign of spam.
To check a site for guest posts, simply visit Google and type in ‘site:example.com guest post’, replacing ‘example.com’ with the domain of the site in question. The results will be what Google has indexed in the way of guest posts. Note – wrap the phrase in quotes (“guest post”) to match it exactly. Also try the search again with ‘author’ in place of ‘post’.
If more than 20% of the site/blog is ‘guest post’ content, it’s possible that could correlate with spam.
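The search and the 20% check can be sketched in a few lines of Python. The helper names are hypothetical, and the page counts are assumed to be entered by hand from the search results, which are themselves only estimates.

```python
from urllib.parse import quote_plus

def guest_post_queries(domain):
    """Build Google search URLs for the exact phrases "guest post" and "guest author"."""
    phrases = ['"guest post"', '"guest author"']
    return [
        "https://www.google.com/search?q=" + quote_plus(f"site:{domain} {phrase}")
        for phrase in phrases
    ]

def guest_post_share(guest_post_pages, total_indexed_pages):
    """Return guest post pages as a fraction of all indexed pages."""
    if total_indexed_pages <= 0:
        raise ValueError("total_indexed_pages must be positive")
    return guest_post_pages / total_indexed_pages

# Example: 30 guest post results out of roughly 120 indexed pages
share = guest_post_share(30, 120)
over_guideline = share > 0.20  # the 20% guideline from the text above
```

Treat the result as one more signal to weigh with the others, since Google’s result counts are approximate.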
With E-A-T being a big (re-emerging) topic, it’s important that websites put faces to their content. Checking a site’s upkeep in this area is a good way of understanding whether they’re building valuable content or just creating something made solely for SEO.
If it’s a blog with lots of content on varying topics but little to no authorship, it’s possible it’s more the latter.
Sites with good SEO scores but low or minimally relevant traffic are likely up to no good, and it’s not uncommon for any of the above to also correlate with low and/or irrelevant traffic. Checking site traffic with a tool like Alexa, Similarweb and/or SEMRush can be a quick way to vet a page or site.
Duplicate Content & Scraping
The last thing to check is duplicate content and scraping. A site with lots of duplicate content or lots of ‘mission-similar’ pages could be spammy. The same goes for scraper sites. If a site is pulling in content from somewhere else at scale, especially without proper use of things like cross-domain canonicals, it’s possible it could be considered spam.
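One simple way to put a number on how similar two pages are is to compare overlapping word sequences (shingles) with Jaccard similarity. This is an illustrative technique, not how Google or any SEO tool measures duplication. The function names and the five-word shingle size are assumptions, and the page text is assumed to be extracted already.

```python
import re

def shingles(text, size=5):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard_similarity(text_a, text_b, size=5):
    """Return shared shingles divided by total distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A score near 1.0 between a page and content published elsewhere suggests copying or scraping. What counts as too similar is a judgment call.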
While it has previously been described as an automated process, a disavow still has major implications for PageRank and link equity. If any one or a combination of the above holds true, give the link in question careful consideration before submitting it. Any link that is included should also be carefully monitored after submission. For more information or a quote on spam reconciliation, please contact us directly via the link below and/or connect with us on Twitter and LinkedIn.