Why it’s Good that Google.cn Leaves + SEM (2)

Written by Julen Madariaga on January 22nd, 2010

Back on the job. On re-read, I have the feeling that I might have been too optimistic yesterday. Sure, the style of Google’s announcement betrayed personal involvement, and once at the negotiation table it is to be expected that a more businesslike atmosphere will prevail. But even if G shuts up, it is not certain that the CCP will let them get away with it. Depending on who they have at the table, the outcome will be anything between the two extremes we have considered.

But let’s leave our bipolar guesswork aside for a while, so we can concentrate on a more interesting issue. Namely, that it’s great that Google.cn is going to disappear, and that whatever happens to the rest of the Gs, the Chinese internet will be a better place when Google.cn is gone. Let’s start with some crude survey work:

Baidu, Google.cn or Google.com?

I improvised a little survey today in the office, where I asked three of my young Shanghai colleagues which search engines they like to use. Interestingly, the answers were very similar, and all included some form of the following statements:

  • Baidu.com is better for local information and Chinese culture.
  • Google.cn we use sometimes for international information.
  • Google.com? Nah, that’s for foreigners.

These results are surprising, because as we saw yesterday, Google.com and Google.cn are exactly the same engine. It doesn’t make any sense to search on Google.cn, where anything as innocent as 胡锦涛 (Hu Jintao) is obviously SEM manipulated. For the first experiment of the day we can see how, using this slightly contentious term, results start to differ between G.com and G.cn. Try the links: see where there’s a Wikipedia article missing?
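For readers who want to repeat this comparison more systematically, here is a minimal Python sketch. It assumes the result URLs have already been copied by hand from each results page; the function name and the sample URLs are placeholders made up for illustration, not real search output.

```python
from urllib.parse import urlparse


def missing_domains(reference_urls, filtered_urls):
    """Return the domains found in reference_urls but absent from filtered_urls."""
    reference = {urlparse(u).netloc.lower() for u in reference_urls}
    filtered = {urlparse(u).netloc.lower() for u in filtered_urls}
    return sorted(reference - filtered)


# Hypothetical, hand-copied result lists (placeholders, not real output).
com_results = [
    "http://zh.wikipedia.org/wiki/Example",
    "http://www.xinhuanet.com/example",
    "http://www.people.com.cn/example",
]
cn_results = [
    "http://www.xinhuanet.com/example",
    "http://www.people.com.cn/example",
]

# Domains that show up on one engine but not on the other.
print(missing_domains(com_results, cn_results))
```

Comparing by domain rather than by full URL is a deliberate simplification: it shows at a glance which sources disappear, without caring about the exact page.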

But the best of all is the answer given by the sample colleagues when I insist on why they use Google.cn: Oh well, the browsers here direct you to Google.cn by default. That is probably the main reason why G.cn is ranked 3rd on Alexa for China, while G.com is only ranked 6th.

Hey, wait a second. Are you telling me that all it takes to get an identical, non SEM-ed Google Search in China is to type a “.com”, and 300 million netizens haven’t noticed in the last 4 years? Well, yeah. Kind of. Let me introduce you to:

The Chinese censorship and its peculiar victims

This is one of the most misunderstood aspects of Chinese censorship in the West. I realized this with the crazy Wang post, the one that was linked in an article 3 days straight on the Most Read list of the New York Times. I got lots of hits, and also lots of mail from creative Americans proposing ideas to help “free the Chinese” from the claws of the GFW.

But listen, the sad reality is, the CCP’s systems of censorship are so effective not because they are diabolically sophisticated, but because… because the Chinese netizens don’t give a damn whether they are being censored by their government or not.

You don’t believe me? Then perhaps you have a better theory to explain why nobody uses the widely available, free web proxies to surf the internet. Or why the majority of Chinese netizens still use Google.cn when they have an identical search engine that is not manipulated on Google.com.

Shocking, right? But not so much. The truth is that, in spite of popular funny memes and the occasional juvenile rant, the majority of Chinese who are rich enough to use the internet are happy with the status quo. They do find it mildly annoying to be treated like children by the CCP, but as long as the bills are paid, they don’t think so much of it.

And this is also why, if someone wants to create a device against the GFW, user-activated systems like proxies or Tor are not effective, because people simply don’t use them. The idea of a Server Side Proxy, or the Unblockable Host that would unblock a site WITHOUT action by the end user, was discussed here, and I concluded it was not feasible.

This is also the reason why initiatives like Charter 08 never make it in China: it is not about users trying to get access to dissident sites, it is about dissidents unable to market their ideas to a general population that is unreceptive.

Advanced SEM for Dummies (Search Engine Manipulation)

The most amusing thing in the Google crisis is all the commentators crying about the loss of Google.cn and its negative consequences for the freedom of the Chinese. In fact, I maintain that Google.cn is the most evil product to ever have existed in the Chinese internet, and the World will be a better place without it.

That is because, unlike the Chinese official sites that practice censorship, what the search engines do is manipulation. Why? Because Google.cn is not a content site in itself, it is a gateway to the internet. When people type a keyword into the search field, they are actually trusting it to return a fair picture of what is on the net.

When you type a “sensitive” term and G.cn removes all the results except the People’s Daily and Xinhua, Google’s responsibility is double: not only does it support those often objectionable views on the first page, but it also implicitly states that this is the ONLY opinion existing in the World.

And the worst is, the Chinese who believed that would be right to do so, because Google’s well known principles clearly specify their commitment to give all the information available in a democratic way. The little warning message that is displayed on Google.cn SEM searches is meant to avoid this situation, but it is tiny and often placed right at the bottom of the page, so most Chinese users just ignore it.

In the case of Google.cn, SEM is not about “good” or “evil”. It is about breaking the very principles that give meaning to Google as a company, and it is understandable that Google has never been comfortable with it.

TEST               TRANSLATION            GOOGLE.COM      GOOGLE.CN
Neutral Word       Shoe                   Normal Results  Normal Results
Neutral Word       Shanghai Pudong        Normal Results  Normal Results
Sensitive Term     Hu Jing Tao            Normal Results  SEM Results
Sensitive Term     TNM massaccre          Normal Results  SEM Results
RC trigger string  chinayouren-free.comg  RC Block        Normal Results
RC trigger string  Fallunggong            RC Block        SEM Results

All tests were done in Chinese; the English spelling is on purpose. The anomaly in the chinayouren string proves that in some rare cases G.cn does give better results than G.com, as SEM does not apply to petty disharmony.
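The test results above can also be written down as data, which makes the comparison between the two engines easier to follow. This is only a sketch: the outcome labels are the ones used in the table, while the names `TESTS`, `SEVERITY` and `less_filtered`, and above all the ranking of outcomes, are assumptions made for illustration. The ranking treats a visible block as less harmful than silently manipulated results, in line with the argument of this post.

```python
# Rows as read from the table: (test type, term, Google.com, Google.cn).
TESTS = [
    ("Neutral Word", "Shoe", "Normal Results", "Normal Results"),
    ("Neutral Word", "Shanghai Pudong", "Normal Results", "Normal Results"),
    ("Sensitive Term", "Hu Jing Tao", "Normal Results", "SEM Results"),
    ("Sensitive Term", "TNM massaccre", "Normal Results", "SEM Results"),
    ("RC trigger string", "chinayouren-free.comg", "RC Block", "Normal Results"),
    ("RC trigger string", "Fallunggong", "RC Block", "SEM Results"),
]

# Assumed ranking, from least to most harmful to the user (illustrative only).
SEVERITY = {"Normal Results": 0, "RC Block": 1, "SEM Results": 2}


def less_filtered(com_outcome, cn_outcome):
    """Name the engine with the less harmful outcome, or 'same' on a tie."""
    com, cn = SEVERITY[com_outcome], SEVERITY[cn_outcome]
    if com == cn:
        return "same"
    return "Google.com" if com < cn else "Google.cn"


for _kind, term, com_outcome, cn_outcome in TESTS:
    print(f"{term}: {less_filtered(com_outcome, cn_outcome)}")
```

Under that assumed ranking, the only row where Google.cn comes out ahead is the chinayouren string, which is the anomaly mentioned above.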

Google.com and the GFW URL/IP blocks

One clarification is necessary here: when we say Google.com is free of SEM, it doesn’t mean that it is entirely uncensored. Google.com is hosted outside China, so it suffers from GFW blocks, in particular URL string blocks, like all the sites hosted out of the mainland. A funny way to test this block is to introduce the string “chinayouren-free.comg” in any site search like Wikipedia, Yahoo, etc., hosted out of China. The RC block is guaranteed! (fortunately I have changed my URL since and I managed to unblock my blog).
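The URL string block described above can be pictured with a toy model. This is not the actual GFW logic, which is not documented here; it only assumes that RC stands for a reset connection and that the block is a plain substring match on the requested URL. The function names, the example.com search URL and the one-entry blocklist are hypothetical, and the only blocked string used is the one from the experiment above.

```python
from urllib.parse import unquote_plus, urlencode

# Hypothetical blocklist: the single entry comes from the experiment above.
BLOCKED_STRINGS = ["chinayouren-free.comg"]


def build_search_url(base, query):
    """Put the search term into the URL query string, as site searches do."""
    return base + "?" + urlencode({"q": query})


def would_trigger_reset(url, blocked_strings=BLOCKED_STRINGS):
    """Toy check: True if any blocked string appears in the decoded URL."""
    decoded = unquote_plus(url).lower()
    return any(s.lower() in decoded for s in blocked_strings)


# example.com stands in for any site search hosted outside the mainland.
blocked_url = build_search_url("http://www.example.com/search", "chinayouren-free.comg")
neutral_url = build_search_url("http://www.example.com/search", "shoe")

print(blocked_url, would_trigger_reset(blocked_url))
print(neutral_url, would_trigger_reset(neutral_url))
```

The point of the sketch is that the match happens on the URL alone, whatever the site is. That is why the same string produces the block on any site search hosted out of China.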

A less funny example is when you search things like FLG, the famous religion fiercely opposed to the CCP. Google passes the characters of FLG to the URL, and this triggers the RC. And here is where we see the evil of Google.cn compared to Google.com:

  • Google.com gives a very disruptive RC, an unmistakable sign of a GFW block. This is plain censorship. Try here.
  • Google.cn gives results from People’s Daily and Xinhua, the second of which starts: “FLG: the evil nature of the bloody…”. This is SEM manipulation. Try here.

Add to this that, for obvious practical reasons, the Chinese government cannot RC block too many common terms, or it would make the internet unusable. There are far fewer terms RC’ed on Google.com than terms SEM-ed on Google.cn, which is the reason why it is fairly easy, using Google.com in China, to get to very explicit material about Tiananmen 89 and many other delicate subjects. And even when those sites are often blocked themselves, the user who has seen the excerpts and thumbnails in the search results and then obtained an RC when clicking on them should be enlightened.

In conclusion, Google.com is a Search Engine that is:

1- Exactly as good quality as Google.cn (identical index)
2- Without the manipulation of Google.cn
3- AND much less censored than Google.cn

And the only reason why Chinese don’t use it is that Google.cn sounds more Chinese to them, and they just don’t care enough. Most probably the disappearance of G.cn will push the present G.cn users to switch to G.com, and the outcome will be increased freedom in the Chinese internet.

Conclusion

It is obvious that, if Google manages to keep Google.com and eliminate Google.cn, and in the process convince the CCP that it has already suffered the deserved punishment (for Google it DOES matter to lose G.cn, it is a source of revenue) then all will have ended well. That was probably not the initial intention of Google’s move, but it is a possible outcome.

On the other hand, some commenters are already saying that I am too optimistic, and that the CCP will quickly come to the same conclusion I have come to and block Google.com. The good news is that EVEN if they do block Google.com, the situation may still be better than today. The Chinese Google users will start to miss the G, and they will start to use web proxies to access Google.com, expanding their use and making the Chinese net population more conscious of the GFW and of the ways to cross it.

And I think with this I have said all I had to say about Google and China. I am planning to write about other things for a while, at least until important, first-hand news comes out. In the meantime, you can ask questions or raise objections in the comments below, and I will try to help.

*NOTES AND CLARIFICATIONS:

1- If you are outside mainland China probably the experiments will not work, because Google.com is tailored to your own country. There are web proxies hosted in the mainland that you can use to trick Google into thinking you are in China. I don’t remember any now, if anyone knows please comment below.

2- People have asked me about Bing and Yahoo’s SEM policies. A very quick test shows that in principle Yahoo.cn/.com is similar to G.cn/.com, while Bing only has one version here, and it looks SEM-ed, but I’m not sure. In any case those sites are way behind Google in China.

3- Many tech-savvy Chinese obviously DO know that Google.com is not SEM manipulated. And no, I am not implying here that most Chinese are stupid. I am just saying that most of them don’t give a damn if they search for Hu Jintao and the only answer they get is the People’s Daily bio. There is just no notion of the advantages of diversity of information, something that the CCP is not very good at promoting.

Comments so far ↓

  1. Shelly, Jan 22, 9:48 AM

    I remember a joke told by a Chinese guy who works for Google. His big boss said they have two products, one is Google, the other is Google China. Many Chinese people know the difference. But we think Google China is good enough and, what’s more, we are used to it.

    Uln Reply:

    Ah, good joke! 冷不冷?我这里很冷。。。 (ie. Is it cold? It is very cold over here...)
    That is precisely the problem, that even if most Chinese know about the SEM (it is pretty obvious as Google results have the SEM message) they still continue to use Google.cn.

    BTW, I didn’t include a copy of the SEM message. It is important so I will do it here. There are different versions but it is usually like this, at the top or bottom of the Search results:

    据当地法律法规和政策,部分搜索结果未予显示 (ie. According to local legislation, a part of the results are not displayed)

  2. Wukailong, Jan 22, 9:55 AM

    Thanks for saying this, especially the description of how local people view censorship and why they aren’t flocking to use proxies en masse. On the other hand, there are some things I like with google.cn, especially their map service - http://ditu.google.cn/ . Maybe it’s just my personal interest, but it has a remarkably able transliteration engine built in - just go to any non-Chinese location and it will translate the name into characters properly.

    Uln Reply:

    @Wukailong, good point about the transliteration engine. However, I don’t think that is the main reason Chinese use Google.cn, most of them have the IME by default in Chinese mode, so they wouldn’t see the difference.

    I think the main reason is just that the URL google.com redirects to google.cn in the default settings of Internet Explorer. BTW, does anyone know how this happens? When I use Firefox or Chrome there is no redirect; which IE setting does this?

  3. FrenchFrog, Jan 22, 10:08 AM

    Very good blog, I discovered it recently but I added it immediately to my rss reader.

    Also I agree with you that google.cn disappearance is a good thing, I don’t think chinese will try to “break” the GFW just to access google(.cn or .com). They’ll just switch to another search engine (probably baidu) and the situation will become worse. I don’t see why they’ll use proxies to access g.com when they don’t use proxies for other website.
    Wait and see…

    Uln Reply:

    @FrenchFrog, Most Chinese will switch to Baidu. Some (the ones that use G.com now) may learn how to use proxies. It is difficult to know exactly what the trend will be.

    Regardless of the above, G.cn is a shame for Google and it is probably the single most evil page on the Chinese internet (because it manipulates just like Baidu, but lends the brand name of Google to the manipulation). I will not cry when it closes down.

    PS. Thanks for subscribing. Now expect a lot of posts about learning language and old Chinese novels before I start to do tech writing again. This Google series was exhausting :)

    FrenchFrog Reply:

    @Uln, There is a hope :)
    I realized that most graduate students (I’m studying at Tsinghua) use google.cn for scientific research because Baidu is really poor when it comes to English publications.
    So at least, most scientific 研究生 (graduate students) will continue to use Google (but there are few manipulated results about technical issues)

  4. Woods, Jan 22, 11:36 AM

    Uln, I must say this post explains exactly what so many people just do not understand. It is the case for the Internet but the same applies to other subjects. Many people, even if they are not 100% free to speak or else, just could not care less and can live very happily, while in some western countries we think they are oppressed…
    I don’t say everything is great and that nothing should change, but hey it’s not hell out there! :)
    Hope a lot of people will come and read that.
    - Woods

  5. JUH, Jan 22, 2:13 PM

    What about google.tw or google.hk? Yes, they are in traditional Chinese, but at least there’s less of a language hurdle than English.

    yinbin Reply:

    And there is google.sg which does offer simplified Chinese as one of its search language options.

    Uln Reply:

    @yinbin, I haven’t tried those yet, might do some tests if I have the time.

    But I don’t think it will make any difference anyway, because if the Chinese authorities decide to block Google.com, then I would imagine they will just block all the Googles. I mean, I don’t think they would be so dumb as to leave the .tw open in that case. But who knows, the CCP is surprising sometimes.

  6. NFX, Jan 22, 8:35 PM

    Hi there ULN,
    My technical knowledge is far less than yours so I apologize if I have misunderstood matters a bit here. I wanted to say that it’s an interesting post that has made me think, plus all the info. from the various links- I have learnt more than I knew this morning. I actually wrote a short note recently and did conclude that it would be better if google.cn and google.com did stay on in China and that they should- you have certainly made me think more about that.

    A couple of things I would like to question though: One, you do seem to permanently be making the comparison between google.cn and .com as if it is a natural and significant choice for the Chinese. I would say that those that can, would want to, or are comfortable reading whatever information they are searching for in English are a tiny minority, and thus don’t seem to warrant such emphasis. Secondly, a superficial point and one made above, the majority will just switch to Baidu or whatever other alternative that may come along.
    Thirdly, it seems that the government here and Chinese companies are most respected and endowed with veneration, not an American company like Google, no matter its reach elsewhere or its motto, and thus the greatest blight on the internet is not google.cn but the home-grown Baidu, which from my experience carries the people along with it, thus the breaking of that trust is most significant. I had read elsewhere that google.cn (outside the issue of censored topics) generally in its structure of searching gives much deeper and detailed information than Baidu does for example. It is in this sense that I really saw a continued presence for google.cn in China as valuable and necessary. One thing I am not clear on and seems as if it would still be a problem is the searching of sensitive issues on google.com, I certainly in the past didn’t have free rein over what I could read. I must say again my knowledge here is not so deep so I may well have misunderstood the context. ps. I took your advice some time ago to use wordpress.org and also the host you recommended- Cheers.

    Uln Reply:

    @NFX, the whole point of my post is that Google.com and Google.cn are exactly the same index. So if Google.cn leaves, you can do exactly the same on Google.com, and even more free.

    Now, regarding Baidu, I know many Chinese prefer it, great. But this post is not about whether B is better than G. It is about freedom and diversity, in the West many of us believe that the MORE different sources of information you have, the more difficult it is for someone to control your head, because you can always doublecheck. Get it?

  7. NFX, Jan 22, 8:42 PM

    Sorry ULN, an error that I can’t seem to undo- you can delete the first comment.

  8. maxiewawa, Jan 23, 8:51 AM

    Nice post! I’ve added it to my facebook feed, surely the highest praise a blogger can get.

  9. NFX, Jan 23, 10:03 AM

    Uln,

    I have just done a couple of tests and do now see that you mean that you can use Chinese on .cn and .com and receive greater results from the latter when searching for sensitive subjects. When searching on both for everyday subjects in Chinese the results are the same.

    One concern, and one that I just experienced, is that the searches seemed to be quickly blocked completely on .com. If that were to continue to be the case, sensitive issues being blocked on google.com, then there would still be an important place for .cn to cater for Chinese peoples’ wider interests, as it would be a more natural choice of site for them.

    Anyway, I am not quite sure about your second emphasis, this is of course and most certainly all about the diversity and depth of information accessible to people. You may well believe that this diversity is best served by losing .cn and you may be right, I am just trying to work this out a bit as well, your posts and site have helped, thanks.

    ps. With regard to freedom and diversity of information I found it ironic that the day Hillary Clinton made her speech on Internet Freedom, The New York Times announced it was going to start charging for online content.

    Uln Reply:

    @NFX:
    1- Those searches on .com are not “blocked completely”, what you are seeing is an RC, if you wait 1 minute it goes back to normal.
    2- Anyway, you have a point that, RC blocking being more disruptive, Google.cn is more convenient for lazy people to use.
    3- But that is precisely my point: BECAUSE of the existence of a more convenient Google.cn, most Chinese Google users are seeing the internet as an outlet of the People’s Daily. Now THAT is what I call manipulation, and it is a shame for Google.

  10. NFX, Jan 23, 5:47 PM

    Uln,

    That is certainly a fair and interesting point, thanks for helping me see it more clearly- there is much food for thought on this topic that’s for sure. Cheers.

  11. Raj, Jan 25, 9:57 PM

    Uln, can you e-mail me about this post please? Cheers.

  12. Juan Torregrosa, Feb 12, 7:38 AM

    I still think that G.cn is much better for Chinese people than G.com. Among other things it has:
    Writing in pinyin, Chinese instructions, the music tool, the Chinese aesthetic, etc.
    …not to mention that G.cn reminds people that the information they search might be limited by Chinese law.

    …just stumbled upon your great site a day ago. Awesome work 佩服佩服 (I am impressed)

    Uln Reply:

    Hi, thanks for the peifus! :)

    Some comments said similar things before, and I agree only partially. More than all that however, I think one of the reasons why many Chinese don’t use G.com is that they find it extremely annoying. Every time you get an RC it is really disruptive and you have to wait a few minutes to use Google again. That was partly my point: if Google.cn is closed down, Google users will be forced to use G.com, and this will make the censorship more obvious and it will push more people to start using proxies and get better surfing habits…

    The little notice appearing on G.cn is a lie, because it says: “according to local regulations”. This might be true for porn, but I would like someone to explain which law in the People’s Republic forbids writing biographies of the Chinese leaders, for example (such as a Wikipedia article). It is misleading, because users just think: “ah, good, in China we have our own regulations”, when they should be thinking “oh, damn, a bunch of crooks in the propaganda department are manipulating me and covering up information at their will”.
