January 14, 2005
Last Updated: January 20, 2005
Weblogs, most often called "blogs," have become one of the fastest-growing segments of the Internet. The idea is simple: blogs allow everyone to have an online journal and to share their experiences while requiring a minimum of technical knowledge. Everyone can become a pundit, critic or proponent of any issue. There are an estimated 300,000 (1) bloggers in China, and thousands of new blogs are created daily. In addition to bloggers who use the service as an online diary, there are those who use blogs as a form of "participatory journalism," discussing news, politics and government policy. The Chinese government actively regulates and restricts the Internet, including blogs, by employing a combination of legal and technological mechanisms to maintain information control. (2)
Over the last year, the Chinese government has focused increasing attention on the control of blogs. Three popular domestic blog providers in China were temporarily shut down in March 2004. The websites of the three blog providers, blogcn.com, blogbus.com and blogdriver.com, were not filtered or blocked through technical means; rather, they were closed down. The home pages of each of the sites carried a message indicating that their services were temporarily suspended. The blog providers were reportedly closed down because they contained discussions concerning "a letter" written by Dr. Jiang Yanyong pertaining to the Tiananmen Square massacre. (3) Dr. Jiang's letter calls upon the Communist Party to admit it made a mistake in the way the protestors in Tiananmen Square in 1989 were treated and openly contradicts official government figures and statements concerning the SARS outbreak in China. (4) The blog providers were later re-opened after purging the sensitive content. One of the blog providers, Blogbus, asked its users "not to post any news about current events, sensitive content or political comments." (5) Although the blog providers have been re-opened, all three have implemented a filtering mechanism to control the content of blog posts.
Methodology & Results
In November 2004, the ONI created blog hosting accounts on three domestic blog providers in China: www.blogbus.com, www.blogcn.com and www.blogdriver.com. We created blog entries that contained keywords extracted from a file found in QQ, a popular Chinese instant messaging program. Chinese hackers found that a dynamic link library (DLL) bundled with the QQ software, COMToolKit.dll, contains a list of keywords in both Chinese and English that are filtered by the software. (DLLs are special files called by applications to execute specific functions.) The hackers posted the file to a Chinese bulletin board system in August 2004. The list contains keywords roughly categorized into obscenities, Falun Gong, Chinese officials, political terms, national minorities, and banned media. Although this list was found in an instant messaging application, we used these keywords to test the filtering capacity of blog providers in China. Since China does not publicize its list of banned keywords, we used the QQ list as a representative sample of the type of topic areas the government may be targeting for filtering on blogs.
We posted blog entries containing each of the 987 keywords to each blog service and identified the individual keywords that triggered the filtering mechanisms built into the blog software. Two of the Chinese blog providers prevented the creation of entries that contained these keywords, while one censored the entry by replacing the offending words with "*" characters.
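The two behaviors we observed, refusing an entry outright and masking the offending words, can be illustrated with a minimal sketch. This is not the providers' actual code, which we did not have access to. The function names, the one-item keyword list, and the choice of one "*" per filtered character are all assumptions made for illustration.

```python
# Hypothetical keyword list; the real lists are not public.
# "六四" is used because all three providers were observed to filter it.
FILTERED_KEYWORDS = ["六四"]


def should_reject(entry, keywords=FILTERED_KEYWORDS):
    """Block-style filtering: refuse the entry if any keyword appears in it."""
    return any(keyword in entry for keyword in keywords)


def mask_entry(entry, keywords=FILTERED_KEYWORDS):
    """Mask-style filtering: replace each keyword with '*' characters.

    Assumption: one '*' per character of the keyword; the real
    replacement length may differ.
    """
    for keyword in keywords:
        entry = entry.replace(keyword, "*" * len(keyword))
    return entry


if __name__ == "__main__":
    post = "a post mentioning 六四 in passing"
    print(should_reject(post))  # the block-style provider would refuse this entry
    print(mask_entry(post))     # the mask-style provider would publish a censored copy
```

In either style the check is a plain substring match against a fixed list, which is consistent with the coarse behavior we observed in testing.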
The full list of keywords and filtered keywords is available here.
Blogbus and Blogcn filter 18 and 19 of the 987 keywords tested, respectively, while Blogdriver filters 350. The filtered keywords generally fall into five categories:
The filtering system is not designed to be foolproof. The filtering mechanism on all three blog providers we tested can be circumvented by adding characters, such as dashes, to split up the filtered keywords. For example, all three blog providers filter the Chinese characters for six/four ("六四"), which is short for June 4, 1989 -- the date of the Tiananmen Square massacre. However, if users include a dash, "六-四", the blog post will not be filtered.
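The dash circumvention is what one would expect from exact substring matching. The sketch below is hypothetical: it assumes the providers compare the post text against the keyword list character for character, which matches the behavior we observed but which we could not confirm from their code. The second function, which strips a few separator characters before matching, is included only to show why the first is easy to evade; we have no evidence that any provider does this.

```python
import re

# Hypothetical keyword list containing the one term named in the text.
KEYWORDS = ["六四"]


def naive_matches(text, keywords=KEYWORDS):
    """Exact substring matching, as the observed filters appear to behave."""
    return [keyword for keyword in keywords if keyword in text]


def normalized_matches(text, keywords=KEYWORDS):
    """Illustrative stricter check (an assumption, not observed behavior):
    remove dashes, underscores, dots, asterisks and whitespace first."""
    stripped = re.sub(r"[-_\s.*]", "", text)
    return [keyword for keyword in keywords if keyword in stripped]


if __name__ == "__main__":
    for sample in ["六四", "六-四"]:
        print(sample, naive_matches(sample), normalized_matches(sample))
```

With exact matching, the dashed form yields no match and the post goes through, while the unmodified keyword is caught.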
Weblogs are a fast-growing and important component of Internet communication in China. The Chinese government has recognized this development by devoting increasing resources to controlling blogs: first by shutting down the three main blog providers, and then by forcing those providers to implement filtering software as a condition of offering their services. While relatively coarse, the filtering mechanism targets keywords related to content known to be sensitive to the government, such as the names of political leaders, terms related to the Tiananmen Square incident, and words used to discuss the banned Falun Gong movement. The blog filtering mechanism also demonstrates the multi-level Internet controls in China, which extend from regulation of Internet access to limits on what content may be posted online.
About the OpenNet Initiative
The OpenNet Initiative is a partnership of the Citizen Lab at the Munk Centre for International Studies, University of Toronto, the Berkman Center for Internet & Society at Harvard Law School, and the Advanced Network Research Group at the Cambridge Security Programme, University of Cambridge.
The OpenNet Initiative releases occasional bulletins based on our ongoing research. These bulletins are meant to be limited responses to current events, policy debates, and/or issues raised by our ongoing research that we feel justify immediate wider circulation. Our more detailed analyses can be found in our major reports.
* The OpenNet Initiative would like to thank Stian Haklev for his contributions to this bulletin.
1. Estimates of the total number of blogs in China vary; www.cnblog.org estimates that there are more than 600,000. We thank Isaac Mao for noting the varying estimates of Chinese blogs and Fons Tuinstra for providing a link to the CNBlog data.