Google.cn Filtering: How It Works

By: nart on 25 January 2006
Posted in Asia, China

Google has opened a new Chinese-language search engine at www.google.cn that filters out results from sites that the Chinese government considers “sensitive”. In addition to news.bbc.co.uk, search results are also filtered for the human rights groups hrw.org and hrichina.org and for the entire geocities.com free hosting community. This filtering is quite similar to the filtering conducted by domestic Chinese search engines.

The filtering takes place in two ways:

1. de-listed domains: specific websites are removed entirely from search results; it is as if the website never existed.
2. de-listed urls: specific urls are removed from search results if they contain a de-listed domain.

For example, the domain news.bbc.co.uk has been removed from www.google.cn. Using Google’s “site:” modifier, a search for “site:news.bbc.co.uk” in google.cn returns no results, as if there were no such website at all. In addition to Google’s usual text that appears when searching for a non-existent website, additional text appears informing the user that results have been removed to comply with local law.

However, using Google’s “inurl:” modifier, a search for “inurl:news.bbc.co.uk” does appear to return results, although they are not listed and are instead replaced with text informing the user that results have been removed to comply with local law. Furthermore, a search for “site:bbc.co.uk inurl:news” shows that although bbc.co.uk is indexed and searchable, the specific domain news.bbc.co.uk is not listed in the search results.
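The two probes described above can be generated mechanically for any domain. The following is a minimal sketch, not the author's tooling: the `/search?q=` endpoint and the helper names `probe_queries` and `probe_urls` are assumptions introduced for illustration, and the script only builds URLs to open by hand rather than fetching anything.

```python
from urllib.parse import urlencode

# Assumption: both engines accept a query at /search?q=...
# The article itself does not state the endpoint.
ENGINES = {
    "google.cn": "http://www.google.cn/search",
    "google.com": "http://www.google.com/search",
}


def probe_queries(domain):
    """Return the two operator queries the article uses for one domain."""
    return [f"site:{domain}", f"inurl:{domain}"]


def probe_urls(domain):
    """Map (engine, query) to a search URL for manual side-by-side comparison."""
    return {
        (engine, query): f"{base}?{urlencode({'q': query})}"
        for engine, base in ENGINES.items()
        for query in probe_queries(domain)
    }


if __name__ == "__main__":
    for (engine, query), url in probe_urls("news.bbc.co.uk").items():
        print(f"{engine:10s} {query:25s} {url}")
```

Comparing the “site:” and “inurl:” pages for the same domain on each engine is what distinguishes a de-listed domain from a de-listed URL.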

Another illustrative example is a search in google.cn for “site:www9.beijing999.com inurl:dmirror” versus the same search in google.com. In google.com, three results are returned and all three are listed, whereas google.cn reports three results but lists only two of them. The missing URL is “https://www9.beijing999.com/dmirror/http/mirror.epochtimes.com/gb/nf3154.htm”, which contains the text “epochtimes.com/” in the URL path.
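A comparison like this one amounts to a set difference between the two result lists. Here is a small sketch, assuming the result URLs have already been copied out of each results page by hand. The function name `unlisted` is hypothetical, and the two `placeholder` URLs stand in for the listed results, which the article does not name; only the missing URL comes from the text.

```python
def unlisted(reference_urls, filtered_urls):
    """Return URLs the reference engine lists that the filtered engine omits.

    Order follows the reference list so the output is easy to read.
    """
    shown = set(filtered_urls)
    return [url for url in reference_urls if url not in shown]


# The first two entries are hypothetical placeholders, not real results.
COM_RESULTS = [
    "https://www9.beijing999.com/dmirror/placeholder-1",
    "https://www9.beijing999.com/dmirror/placeholder-2",
    "https://www9.beijing999.com/dmirror/http/mirror.epochtimes.com/gb/nf3154.htm",
]
CN_RESULTS = COM_RESULTS[:2]  # google.cn lists only two of the three

if __name__ == "__main__":
    for url in unlisted(COM_RESULTS, CN_RESULTS):
        print("missing from google.cn:", url)
```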

The website epochtimes.com is treated as a de-listed domain (site:epochtimes.com). However, a search with the modifier “inurl:” (inurl:epochtimes.com) does return results, although none of them are actually from the requested website. A search for “inurl:epochtimes.com/” (with a trailing slash) also returns results but does not list them for the user.

This fine-grained control allows google.cn to keep websites such as “epochtimes.com.ua” in its index while eliminating epochtimes.com. There is similar fine-grained control targeting Chinese-language content: while there are results for "site:faluninfo.net", there are no results for "site:chinese.faluninfo.net".
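The observed behaviour is consistent with a simple two-rule model, sketched below. This is an inference from the search results described above, not Google's actual implementation. The function names are hypothetical, and treating subdomains of a de-listed domain as de-listed too is an assumption.

```python
from urllib.parse import urlsplit

# A subset of the de-listed domains named in this article.
DELISTED = {"epochtimes.com", "news.bbc.co.uk", "chinese.faluninfo.net"}


def host_is_delisted(host, delisted=DELISTED):
    """Rule 1: the host is a de-listed domain or (assumed) one of its subdomains."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in delisted)


def url_mentions_delisted(url, delisted=DELISTED):
    """Rule 2: a de-listed domain followed by "/" appears after the host part."""
    parts = urlsplit(url)
    rest = (parts.path + ("?" + parts.query if parts.query else "")).lower()
    return any(d + "/" in rest for d in delisted)


def is_listed(url, delisted=DELISTED):
    """Return True if this model would still show the URL in results."""
    host = urlsplit(url).hostname or ""
    return not (host_is_delisted(host, delisted) or url_mentions_delisted(url, delisted))


if __name__ == "__main__":
    for url in (
        "http://www.epochtimes.com/",
        "http://epochtimes.com.ua/",
        "https://www9.beijing999.com/dmirror/http/mirror.epochtimes.com/gb/nf3154.htm",
        "http://chinese.faluninfo.net/",
        "http://www.faluninfo.net/",
    ):
        print("listed " if is_listed(url) else "removed", url)
```

Matching on the domain plus a trailing slash is what lets epochtimes.com.ua survive while the mirrored epochtimes.com page is dropped.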

To be clear, this filtering only affects www.google.cn; users who choose to access Google’s Chinese language search engine at http://www.google.com/ig?hl=zh-CN are not subjected to this filtering.

While this filtering can be easily circumvented, most users will simply use google.cn, since users from China are redirected there by default.

Here are just some of the sites that have been de-listed by google.cn:

site:hrw.org
site:hrichina.org
site:boxun.com
site:tsquare.tv
site:freechina.net
site:rfa.org
site:news.bbc.co.uk
site:geocities.com
site:peacehall.com
site:64memo.com
site:voa.gov
site:falundafa.org
site:epochtimes.com
site:xinsheng.net
site:savetibet.org
site:bignews.org
site:topforum.com
site:omnitalk.com
site:laogai.org

(Crossposted on ICE)

Lesser evil?...

Joining the fray in the Google China censorship (soon-to-be?) debacle.
I’d like to put things in perspective, so a chronology is in order:

24 January 2005. Google agrees to censor sites that are objectionable to the Chinese government, puts up a...

How Google.cn filtering works (and how to evade it)...

As Antonio tells us, Google announced this week that it was launching Google.cn, a version of the search engine that complies with the Chinese government's control requirements and therefore cuts off searches on sites the Chinese government considers sensitive.
If ...

I strongly think that it is against the "Right to Information Act", which is unlawful! At least in India... Grow up, guys.

Why would this be restricted? Is it because it links directly to the GOP website and they wanted to limit traffic during the election? There is no underhanded reason to limit access, is there?

How to evade censorship on the web...

Try some Anonymous Surfing
Take a look at Peacefire
You can also visit OpenNet Initiative
There is also an entry on OpenNet Initiative’s blog about the way Google uses censorship in China here.
Or better yet install your own proxy server to help o...

Hey

It's hard to believe how Orwellian the Chinese government is in relation to the Internet. In a recent Harper's magazine article there was a short piece on the tobacco industry. It's controlled by the government (or provides the lion's share of the revenues for the government). One of the results of this is that the internet is used as a means for spreading the idea that smoking is beneficial. Websites set up by the government detail the benefits in very grandiloquent fashion. Anyway, it's terrifying to think that a country of 1.3 billion people can be cut off from the world's dominant source of information, or have it so manipulated that common knowledge (like the fact that smoking increases the risk of cancer) can be distorted for a larger political-economic rationale.

Eric

Then what is the use of Google in that country when a majority of sites are de-listed? We can fight back against this move by Google by putting up as many "Anti China" banners as possible on the majority of websites. Eventually China will have to block them too; this will make surfing the internet a pain in the A** for the Chinese people, and they MAY revolt against the government's policy.
Not much help, but that's my contribution.

I doubt English-language protest will do you any good. For those who do not know, the Chinese people speak Chinese (you'd be surprised how many Americans don't know that), and I doubt many of you can persuade any sites in China to change their theme to anti-China. Nationality aside, business people try to make money, and getting no customers is the worst way to make money.

The problem is that the government doesn't tell Google what is sensitive, so Google blocks more than necessary. Google's competitor Baidu shows far more results, simply because it knows what to block. For example, a website that claims "chinese gov suck ass" would not be blocked; however, a website that claims "chinese gov suck ass so people should over throw it" will be blocked. Google just blocks both to save the trouble. In fact, if you go to some of the more famous Chinese forums, a good number of the themes are anti-China (some even on the front page, and to the extreme sometimes, I might say), and that's what leaves Google baffled: why are they allowed? To Google there is no obvious pattern, so they simply block everything.

The ingenious part is that there is no actual censorship done on the part of the Chinese government. They just tell people "watch what you show" and leave the censoring to them. Occasionally they barge in and purge your database in completely random ways. Sometimes a banned site is allowed again for no particular reason, even though the site itself has not changed. This is why Google is confused, because sometimes the government seems to be asking for bad press. The trick is to know how things work. For example, Google considers the term "communist party" to be sensitive, forgetting that in China it is not a taboo word like it is in America.

others open up comments to everyone (like here on SPLAT). You should be able to toggle this setting somewhere on the admin pages. As you've also discovered, some blog providers are more or less user-friendly. But there are so many out there, it can be worth trying a few out to find the right fit for you. I recommended Blogger because it's so ubiquitous, but it definitely isn't as user-friendly as some.

Let me start off by saying that I’m not going to pretend to be as politically savvy or as much of an activist as some... However, when I caught wind of the Google.cn censorship news back in January, it just didn’t sit well with me... which will probably make my comment/reply more of a “personal struggle” than most you might get.

I eventually decided to stop using Google’s services. However, more and more applications/online services/company websites are building Google services right into their own... it’s been getting harder and harder to avoid them. In addition to this, I was recently pointed to your Google.cn vs Google.com comparison site where I learned that “Google informs its users when their search results have been filtered (to date, Microsoft and Yahoo!'s Chinese search services do not), and provides users with a link to the unfiltered Google.com home page”.

After having read many an article arguing that some censored information is better than no information at all, and just as many slamming the big “G” and calling for a full-out boycott, in an ongoing personal effort to stay as informed as possible, I guess I have come to OPI with the hope of further clarification. This site, out of all the ones I have visited, seems to best research and report on the severity of Google.cn censorship and internet censorship in general.

Does the fact that Google.cn offers the “Sorry we had to censor your results — government policy. Give the Chinese Google.com a spin... the results are still censored but hey, at least we didn’t do it ourselves” approach somehow lessen the severity of Google’s agreement to censor its results at all? Or is Google.cn merely the result of another mega-company caring more about money than about how its actions actually affect people? Is a full-out boycott hasty or appropriate?

Any input/opinion at all would be greatly appreciated.
Cheers.
--
steelie

I cannot access www.google.cn - if I try I just get redirected to the New Zealand Google site. However, this is not the case for every other Google country site I have accessed. If it is only Google.cn, then Google is not only censoring the people of China but also blocking access for me and others, presumably to stop us from seeing how the people of China are being censored.

Wow. I also can't get google.cn. It redirects me to google.com. What's the idea?

Thanks for this document.

The filtering takes place in two ways:

1. de-listed domains: specific websites are removed entirely from search results; it is as if the website never existed.
2. de-listed urls: specific urls are removed from search results if they contain a de-listed domain.

Hello everybody, my name is Damion, and I'm glad to join your community,
and wish to assist as far as possible.
