Last Sunday I did a post on internet censorship in China where I mixed in several different ideas, and I’m afraid the part on Search Engine Censorship didn’t come out as clearly as I would have liked. I think it is an important subject, so here are the complete results:
We will be looking at Google.cn, Google.com and Baidu.com, and in each of them we will try 3 different kinds of search terms.
A- Charter 08: In both of its written forms, which are 08宪章 and 零八宪章
B- Political Terms: Tiananmen incidents (天安门六四事件), FLG.
C- Vulgar words: Sex. I will employ the “blog job” and the “chicken bar”.
It is understood that in all cases the search terms are in Simplified Chinese. The browser is Firefox 3.0.5 and the connection is a normal home DSL line from China Telecom. The possible results are:
- Free Search - Results look consistent and realistic, like the ones obtained in the West.
- Reset Connection (RC) - This can only be seen in Mainland China. The result is an image like the one below, and the search engine cannot be opened again for a while (I estimate 30 seconds). The RC is not done directly by the Search Engine. Wikipedia’s internal search also gives RCs for B terms.
- Forbidden Message (FM) - This is the message shown below, which appears with slight variations. It says something along the lines of: “Some results are not displayed according to the local laws, regulations and policies”.
- Manipulated Results (MR) - This is the case where the results are obviously manipulated, for example in the search for 天安门六四事件 (Tiananmen incident) on Baidu, where all the results come from official newspapers such as People’s Daily, etc. Sometimes the page also carries an FM at the top.
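For readers who prefer to see the four outcomes as a decision rule, here is a minimal sketch in Python. It is only an illustration of the categories above, not the way I ran the tests (I did those by hand in the browser). The names `Outcome`, `classify`, `FM_MARKERS` and `OFFICIAL_DOMAINS` are hypothetical, the marker text is just the English paraphrase of the notice (the real one is in Chinese), and the domain list is a small illustrative sample.

```python
from enum import Enum


class Outcome(Enum):
    FREE = "Free Search"
    RC = "Reset Connection"
    FM = "Forbidden Message"
    MR = "Manipulated Results"


# Assumption: placeholder marker taken from the English paraphrase of the notice.
FM_MARKERS = ("local laws, regulations and policies",)

# Assumption: a tiny illustrative sample of official news domains.
OFFICIAL_DOMAINS = {"people.com.cn", "xinhuanet.com"}


def classify(page_text, result_domains, connection_reset=False):
    """Return the set of outcomes observed for one search.

    page_text        -- text of the results page ("" if nothing loaded)
    result_domains   -- domains of the listed results, in order
    connection_reset -- True if the browser reported a reset connection
    """
    if connection_reset:
        # Nothing else can be observed once the connection is reset.
        return {Outcome.RC}

    outcomes = set()
    if any(marker in page_text for marker in FM_MARKERS):
        outcomes.add(Outcome.FM)
    # Crude heuristic: every listed result comes from an official source.
    if result_domains and all(d in OFFICIAL_DOMAINS for d in result_domains):
        outcomes.add(Outcome.MR)

    # FM and MR can appear together; with neither, treat the search as free.
    return outcomes or {Outcome.FREE}
```

The function returns a set because, as noted above, a Manipulated Results page can also carry a Forbidden Message. The "all results are official" test is a rough stand-in for my own judgment of whether results look realistic.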
Google.com:
A- Free Search (but clicking on some individual results gives RC).
B- Reset Connection.
C- Manipulated Results.
Google.cn:
A- Forbidden Message and (sometimes *) Manipulated Results.
B- Reset Connection.
C- Forbidden Message. When the term is put in quotation marks (“”), it gives Manipulated Results.
Baidu.com:
A- Manipulated Results. When the term is put in quotation marks (“”), it gives Forbidden Message.
B- FM and Manipulated Results.
C- FM and Manipulated Results.
1- The results are somewhat erratic and it is difficult to see a pattern: it all looks like a series of patches on top of each other rather than a systematic implementation. Also, things change over time, as in *, where the Manipulated Results I saw on Sunday can no longer be seen.
2- Baidu has a different system from Google: it has no Reset Connections. This is very advantageous for Baidu, and I consider it unfair competition, as an RC is one of the worst experiences you can have while surfing.
3- This might be due to Google’s server location: the Search Engines have no direct involvement in the RC (even Wikipedia has RCs!), whereas Manipulated Results obviously require their action and can more easily attract attention from advocacy groups. Of course, in the case of sexual terms (C) this is not a problem, as the Manipulated Results can just be called “Safe Search”.
4- Charter 08 is treated differently from other political terms, but that might just be because it was banned urgently and suddenly, so it is only a quick fix added to the existing structure. It does not provoke an RC in any case. It looks like they have decided to leave it alone on Google.com to avoid attention from Western advocacy groups, but in exchange Google has had to give up Google.cn and apply the infamous “porn block” to it, which is active censorship by the SE. Why the FM and not the RC? Who knows; I am guessing that perhaps the RC is more complicated to implement.
5- In any case, and however negative both are, I think it is always better to show an FM than Manipulated Results, because the former openly admits censorship, whereas the latter is a lie and a distortion of reality. The Forbidden Message does increase transparency, yet it does not justify involvement in political censorship. From this perspective, Google is closer to the truth than Baidu. Baidu does seem to be a more active participant in the government’s information control schemes, and Chinese users of Baidu are clearly the most exposed to Search Engine brainwashing.
UPDATE: Following the corrections by international expert Nart Villeneuve below, I have introduced a few changes of my own (in blue). In any case, this post is just a very basic review of the SE Censorship system from the perspective of a normal user. If you really want to understand how the GFW works, you should read proper research papers like this one, or this one.
1- FORBIDDEN MESSAGE (FM)
2- RESET CONNECTION (RC)
NOTE: If anyone is interested in this or has more information to share, please leave it in the comments. Unfortunately my time is very limited, so I only ran 2 or 3 terms for each of the classes A, B and C above. There might be things I overlooked, and I would be grateful if you could point them out.