Spam blog
A spam blog, sometimes referred to by the neologism splog, is a blog which the author uses to promote affiliated websites, to increase the search engine rankings of associated sites, or simply to sell links and ads.

The purpose of a splog can be to increase the PageRank or backlink portfolio of affiliate websites, to artificially inflate paid ad impressions from visitors (see MFA-blogs), or to serve as a link outlet for selling links or getting new sites indexed. Spam blogs are usually a type of scraper site, where the content is often either inauthentic text or merely stolen from other websites (see blog scraping). These blogs usually contain a high number of links to sites associated with the splog creator, which are often disreputable or otherwise useless websites.
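
To make the link-inflation effect concrete, the following is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production algorithm, and the page names, damping factor and toy link graph are all assumptions chosen for illustration. It compares the score of a hypothetical affiliate site before and after five splogs link to it.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy web: two ordinary sites and one affiliate site
# that nobody links to.
base = {
    "news": ["shop"],
    "shop": ["news"],
    "affiliate": ["news"],
}

# The same web plus five hypothetical splogs that all link to the affiliate.
spammed = dict(base)
for i in range(5):
    spammed[f"splog{i}"] = ["affiliate"]

before = pagerank(base)["affiliate"]
after = pagerank(spammed)["affiliate"]
print(f"affiliate score without splogs: {before:.4f}")
print(f"affiliate score with splogs:    {after:.4f}")
```

In this toy graph the affiliate's score rises once the splogs point at it, even though the splogs themselves have no incoming links. Real search engines apply many additional signals, so the sketch only illustrates the incentive, not actual ranking behaviour.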

There is frequent confusion between the terms "splog" and "spam in blogs". Splogs are blogs where the articles are fake and are created only for search engine spamming. Spam in blogs, conversely, means posting random comments on the blogs of innocent bystanders: spammers take advantage of a site's ability to let visitors post comments that may include links. In fact, one of the earliest uses of the term "splog" referred to the latter.

Splogging is often used in conjunction with other spamming techniques, including spings (spam pings).

History

The term splog was popularized around mid-August 2005 when it was used publicly by Mark Cuban, but it appears to have been used a few times before to describe spam blogs, going back to at least 2003. It developed from multiple linkblogs that were trying to influence search indexes and others trying to Google bomb every word in the dictionary.

The term may also be applied to more recent infections, most notably those reported by Webtrends in April 2008. Leveraging botnets, spammers infected several thousand pages, which display prominent keywords from the Google Trends site, by bypassing the CAPTCHA authentication method that had previously subdued all spam bloggers. One sighting put the top ten Google hottest terms of the day as all being owned by spambots on the Blog Results page. Having gone mostly unchecked, the spambots have also infected real page-one web results (SERPs) and corrupt any hot search terms more than a month old.

Hackers use a number of methods, including link farming, spamdexing and keyword stuffing, each in a simple, moderated form, to achieve top PageRank results. Most of the sites contain an animated graphic that looks like a YouTube streaming video. Users who click it become infected with one of several variants of spyware, which generates revenue for the spambot's owner.
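
One rough way to spot pages built around trending keywords is to measure what fraction of a page's words come from a list of hot search terms. The sketch below is an illustrative heuristic only, not a method described in the reports above. The function name, the sample text and the term list are assumptions, and it handles single-word terms only.

```python
import re


def trending_term_density(text, trending_terms):
    """Return the fraction of words in `text` that are trending terms."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    terms = {t.lower() for t in trending_terms}
    hits = sum(1 for w in words if w in terms)
    return hits / len(words)


# Hypothetical page text and hypothetical list of hot search terms.
page = "cheap tickets cheap flights cheap"
hot_terms = ["cheap"]
print(trending_term_density(page, hot_terms))
```

A high density is only a hint that a page was assembled around hot keywords. Legitimate news posts about a trending topic would also score highly, so a real filter would need to combine this with other signals.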

Controversy

Splogs have become a major problem on free blog hosts such as Google's Blogger service. By one estimate, about one in five blogs is a spam blog. These fake blogs waste valuable disk space and bandwidth and pollute search engine results, ruining blog search engines and damaging bloggers' community networking (e.g. Blogger's "next blog" link).

Web search engines are commonly susceptible to link flooding, especially from highly weighted bloggers.

Splogs sometimes choose a name similar to that of a popular blog in order to benefit from the occasional incoming link from careless bloggers who think they are linking to the popular site. Splog activity can also cause problems for legitimate bloggers if search engines respond to splogs by blocking, or treating as "suspicious", all web addresses in a particular domain.
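
The lookalike-name trick can be illustrated with a simple string-similarity check. The sketch below uses Python's standard difflib module. The function name, the 0.85 threshold and the blog names are assumptions for illustration, not a documented detection method.

```python
from difflib import SequenceMatcher


def lookalikes(candidate, popular_names, threshold=0.85):
    """Return popular names that `candidate` closely resembles but does not equal."""
    hits = []
    for name in popular_names:
        if candidate == name:
            continue  # the genuine blog itself is not a lookalike
        ratio = SequenceMatcher(None, candidate, name).ratio()
        if ratio >= threshold:
            hits.append(name)
    return hits


# Hypothetical blog names.
popular = ["exampleblog", "cooking"]
print(lookalikes("exampleblogg", popular))
```

A blogger's linking tool could run a check like this before publishing a link, flagging names that differ from a well-known blog by only a character or two.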

Stealing an expired domain's PageRank

Some spammers register domains that have only recently expired, so that the previous owner's PageRank persists. Building a splog on an expired domain thus enhances the PageRank that its links pass on.

See also

  • Adversarial information retrieval
  • Spam in blogs
  • Blog scraping
  • CAPTCHA


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 