Instructions to deal with the GFW

Written by Julen Madariaga on July 8th, 2009

I have written a lot recently about the Great Firewall of China (GFW). I had my site blocked for two weeks, and this inspired some frustrated posts until I eventually worked my way through the Wall. The good news is that I learnt a lot in the process, and now I can write some tips to help others with the same problem. Anyone who has a website hosted outside China can use these instructions to try to keep it accessible here. Here is the index; follow the links for details.

Prevention – Try to stay out of trouble

From the beginning, when you set up your website, there is a series of measures you can take to reduce the probability of getting blocked and/or make your life easier if it happens. If you follow these points, hopefully you will never need to get to the next section.

  • Be careful with what you publish.
  • Try to avoid writing GFW keywords.
  • Choose where you want to be hosted.
  • Choose a good, flexible hosting service.
  • Host your blog/site on a subdirectory.

Action – When trouble is at your door

Then one day you realize that your Chinese readership has fallen to zero, and you wonder why you can’t open your website from China. If this happens to you, these are the simple steps to follow:

  • Make sure it is really the GFW.
  • Check if there is an IP block.
  • Find out if the target is really you.
  • Check if there is a URL block.
  • Move to a new IP address.
  • Change your URL and Redirect.
  • Check that you don’t have links.
  • Try to eliminate the keywords.
  • Take it easy, and send feedback.

Notes and Disclaimers

  • Don’t forget to read the party of the first part

PREVENTION

From the beginning, when you set up your website, there is a series of measures you can take to reduce the probability of getting blocked and/or make your life easier if it happens. If you follow these points, hopefully you will never need to get to the next section.


1. Be careful with what you publish. Sites in English have more leeway than sites in Chinese to publish political content, but the more popular your posts get, the tighter the line will be. Any kind of political activism can get you blocked, in particular subjects that are seen as a potential menace to the one-party system. You can easily be misunderstood if you mention those subjects, even if your intention was different. Apart from politics, the other big subject that can get you blocked is porn.

2. Try to avoid writing GFW keywords. OK, if you need to write about these subjects, then try to avoid writing GFW keywords. To test which words are GFW keywords, run a search on Wikipedia from China and see if it resets the connection. There are many ways to avoid keywords, such as periphrasis or the use of similar characters (omicron “ο” for “o”, etc.).

The problem is that nobody will find your words on Google if you use these tricks, so it is up to you to decide how much safety you want to trade for search engine hits. Bear in mind that keywords won’t get you blocked automatically, and if your site is small enough you can relax and write as many as you want. The risk is that some day a big website may link to you; then your keywords will attract the attention of the GFW and your trouble starts.
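
As a rough sketch of the similar-characters trick, assuming Python and a hypothetical keyword list: only the omicron swap comes from the text above; the other two look-alike letters and the names `HOMOGLYPHS` and `disguise` are my own illustration, not a tested recipe against the GFW.

```python
# Hypothetical look-alike table: Latin letter -> visually similar letter.
# Only the omicron entry is taken from the post; the rest are assumptions.
HOMOGLYPHS = {
    "o": "\u03bf",  # Greek small letter omicron
    "a": "\u0430",  # Cyrillic small letter a
    "e": "\u0435",  # Cyrillic small letter ie
}


def disguise(text, keywords, table=HOMOGLYPHS):
    """Swap look-alike letters inside the listed keywords only."""
    translation = str.maketrans(table)
    for word in keywords:
        text = text.replace(word, word.translate(translation))
    return text


# "foo" stands in for whatever keyword you want to disguise.
sample = "a post about foo and other topics"
disguised = disguise(sample, ["foo"])
print(disguised)            # looks the same to a human reader
print("foo" in disguised)   # a plain text search no longer matches
```

The last line shows the trade-off described above: the same substitution that hides the word from a keyword filter also hides it from a search engine.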

3. Choose a good location for your host. If you are a legitimate business with a business license, you can apply for a permit and host your website in China. The permit is easy to get, as long as you don’t do politics or journalism. This solution will make your site faster for users in the mainland, and it will avoid all sorts of trouble with the Great Firewall.

If you are planning to write some controversial content, or if a large part of your readers will be outside the mainland, then it is better to host outside China; otherwise you risk getting your whole site closed down and your data lost. A good location in the middle is Hong Kong. It is well connected internationally and close to one of the nodes entering the mainland, so it should be quick on both sides. In addition, many Hong Kong hosts are reportedly understanding about GFW problems and will help you to get rid of an IP block.

4. Choose a good, flexible host. Regardless of where you are hosted, the most important thing is to choose a hosting company that will give you flexibility and will try to help you if anything happens. In particular, you should check if there is an affordable option to change your IP. In my case it was $30/year; I heard from other webmasters that they paid double. The essential thing is to check that your host will give you this option, because not all hosts offer it.

5. Host your website/blog on a subdirectory. I learnt this trick in my recent experience. When you get a URL string block, it will not necessarily affect the whole domain, but only the particular subdirectory where your site lives.

If you host your site on a subdirectory, such as www.domain.com/blog/, this will allow you to deal with a URL string block very easily, just by moving to, for example, www.domain.com/myblog/. This way you save the money and the trouble of registering a new domain, and you can continue to use the domain that your users already identify with you.

ACTION

Then one day you realize that your Chinese readership has fallen to zero, and you wonder why you can’t open your website from China. If this happens to you, these are the simple steps to follow:


1. Make sure you are actually blocked by the GFW. The first obvious step is to check that you are really blocked by the GFW, and not by some other problem. You should get in contact with a friend who is outside the mainland and a friend who is inside it. If it is a GFW block, the outside friend should see no difference at all, whereas the inside one should get a Reset Connection error, or sometimes a Server Time Out error, when trying to open your site. There are also online tools available to do the test, but I haven’t tested their accuracy.
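
If your two friends are willing to run a script, the comparison above could be sketched like this in Python. This is only an illustration under my own assumptions: the function names and result labels are hypothetical, the request is a plain HTTP/1.0 GET on port 80, and a reset or a timeout can of course have causes other than the GFW.

```python
import socket


def describe_error(exc):
    """Map a socket-level failure to a short label."""
    if isinstance(exc, ConnectionResetError):
        return "reset"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, socket.gaierror):
        return "dns-failure"
    return "other-error"


def classify_attempt(host, port=80, timeout=10.0):
    """Send one plain HTTP request and report how it went."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            request = f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n"
            conn.sendall(request.encode("ascii"))
            data = conn.recv(1024)
        return "ok" if data else "empty-response"
    except OSError as exc:
        return describe_error(exc)


def looks_like_gfw(inside_result, outside_result):
    """Fine from outside, reset or timed out from inside the mainland."""
    return outside_result == "ok" and inside_result in ("reset", "timeout")
```

Each friend would run `classify_attempt("www.domain.com")` from their own connection and send you the label; `looks_like_gfw` then only encodes the rule of thumb from this step.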


2. Check if there is an IP block. The first thing to do in this case is to check whether you have an IP block. It is very simple: from China, type your site’s IP in the address bar of your browser after the “http://”, and if it doesn’t work then it is an IP block. IP blocks are the most common GFW blocks, so this test should normally confirm step 1. NOTE: If you don’t know your site’s IP, there are many free Firefox add-ons and web-based services that will give you this information.
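
If you prefer not to install an add-on, a few lines of Python can look up the IP and build the address to type in the browser. This is a minimal sketch: `ip_test_url` is a name I made up, and the `resolve` parameter exists only so the lookup can be swapped out. Bear in mind that on shared hosting the bare IP may show the host's default page rather than your site, which still tells you the IP is reachable.

```python
import socket


def ip_test_url(hostname, resolve=socket.gethostbyname):
    """Return the http://<ip>/ address for the IP test in this step."""
    ip = resolve(hostname)  # IPv4 lookup through the system resolver
    return f"http://{ip}/"


# Example call (performs a real DNS lookup, so it is left commented out):
# print(ip_test_url("www.domain.com"))
```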

3. Find out if the target is you, or some site living on your server. Very often, when you are down to point 3, what is actually happening is that you are collateral damage of the GFW. They are not targeting you, but somebody else’s site which lives on the same IP or the same subnet (group of IPs). It is not easy to check this, but there are 2 simple techniques that can give you a clue. One is to use a reverse IP service, such as this one, where you can see more or less how many sites are sharing an IP with you, and you can get a list of these sites. If you are hosted at the same address as FLG.com (i.e. an anti-CCP organization) then you know it was not you.
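
To see whether a suspicious neighbour sits in the same subnet as you, Python's standard `ipaddress` module is enough. A sketch under one loud assumption: I treat the subnet as a /24 (256 addresses), which is only a guess, since nobody outside knows what range the GFW actually blocks. The function name `same_subnet` is mine.

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix=24):
    """True if both addresses fall in the same /prefix network.

    The /24 default is an assumption for illustration only.
    """
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Documentation-only addresses, used here as placeholders.
print(same_subnet("203.0.113.7", "203.0.113.200"))
print(same_subnet("203.0.113.7", "198.51.100.7"))
```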

4. Check if there is a URL block. In China, go to the Wikipedia search box (or some other search bar on a site hosted outside China) and type the strings you want to test. First try www.domain.com, and then www.domain.com/blog. If you are lucky, only the second one will reset your connection, which means that your domain root URL is still unblocked and you can easily switch to a new subdirectory as explained above. NOTE: URL blocks typically give a Reset Connection (RC). When you do RC tests you must wait at least a minute after getting an RC, otherwise you will get an RC again no matter what you type in (the connection is still not re-established).
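
The order of the tests in this step (domain root first, then each deeper path) and the one-minute pause can be written down as a small Python helper. This is just a sketch for keeping track of manual tests: the names are hypothetical, the 60 seconds comes from the note above, and the code does not contact any server.

```python
from urllib.parse import urlsplit

RC_COOLDOWN_SECONDS = 60  # "at least a minute", per the note above


def strings_to_test(url):
    """List the strings to type, from the domain root down to the full path."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    candidates = [parts.netloc]
    current = parts.netloc
    for segment in (s for s in parts.path.split("/") if s):
        current = f"{current}/{segment}"
        candidates.append(current)
    return candidates


def seconds_to_wait(last_result):
    """How long to pause before the next test, given the last outcome."""
    return RC_COOLDOWN_SECONDS if last_result == "reset" else 0


print(strings_to_test("www.domain.com/blog/"))
```

If the first string passes and the second one resets, the block is on the subdirectory string rather than on the whole domain.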


5. Change your IP address. OK, so you have done all the checks. It is time for action. Supposing you followed the prevention steps in the first place, you have a cool, understanding host who is not pissed off each time you get his whole subnet blocked in China, and who is willing to swap IPs for a moderate price. Get in contact with him, or select a new IP directly from your admin panel, if that function is available.

6. Change your URL and Redirect. This part can be very easy or very difficult, depending largely on how well organized your website is. In any case, the process is not without pain, and you will lose many incoming links. Why? Because even if you have a Redirect on your old URL, anybody in China clicking external links to your old site will get an error, as the connection resets before your server has the time to redirect to the new URL.

If your blog is powered by a good system like WordPress, and if you have been consistent with the use of permalinks over the years, it should be very easy for you to move to the new directory in your same domain. Just follow the steps given in your blog engine and you will be done in a few minutes. If you need to move to a new domain because the whole domain URL is blocked, then it will take a bit longer, but it shouldn’t be a big deal.
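
To show why consistent permalinks make the move easy: if every address starts with the same subdirectory, the new address is a pure prefix swap. Here is a sketch in Python, using the /blog/ and /myblog/ examples from the prevention section; `move_path` is a hypothetical helper, not part of WordPress or any blog engine.

```python
def move_path(old_path, old_prefix="/blog/", new_prefix="/myblog/"):
    """Map a path under the blocked subdirectory to the new subdirectory."""
    if old_path.startswith(old_prefix):
        return new_prefix + old_path[len(old_prefix):]
    return old_path  # anything outside the old subdirectory stays put


print(move_path("/blog/2009/07/some-post/"))
print(move_path("/about/"))
```

The same rule is what a Redirect on the old URL would apply for visitors outside China; visitors inside will still hit the reset before the redirect arrives, as explained above.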

7. Check you don’t have remaining requests to your old URL. Many of the blog’s elements, such as pictures, are probably still linked to a URL in your old subdirectory, and therefore when the page loads they instruct the user’s computer to connect to that subdirectory in order to display the picture. If you have been paying attention you know what will happen in this case: the dreaded RC. Don’t panic: run a search over the whole blog and replace all instances of the blocked URL with the new URL. For this, use the admin tool in your blog admin panel, not the search box for users on the website; you might need to install it as a plug-in (in WordPress it is called Regex Search).

At this point, you might still see that the blog continues to trip an RC. In this case it might be some JavaScript or other active content in your site that is sending requests to the old URL. Use a traffic viewing tool like the free Firefox add-on Tamper Data, which will easily show you all the requests sent while loading your site. Type the blocked string in the search bar and you should see the old URL coming up at some point, and the software will tell you which application is the culprit.
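
If you can export your posts or save a page's HTML, the search-and-replace and the check for leftovers can be sketched in a few lines of Python. The names below are my own, and the scan only sees `src` and `href` attributes present in the HTML itself; requests built by JavaScript at load time will not show up here, which is why a traffic viewer is still needed.

```python
from html.parser import HTMLParser


def replace_blocked(text, old, new):
    """Replace every occurrence of the blocked URL string."""
    return text.replace(old, new)


class LinkCollector(HTMLParser):
    """Collect the values of src and href attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.urls.append(value)


def leftover_requests(html, blocked):
    """Return the static URLs in the page that still contain the blocked string."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.urls if blocked in url]


page = '<img src="http://www.domain.com/blog/pic.png"><a href="/about/">About</a>'
print(leftover_requests(page, "www.domain.com/blog/"))
fixed = replace_blocked(page, "www.domain.com/blog/", "www.domain.com/myblog/")
print(leftover_requests(fixed, "www.domain.com/blog/"))
```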

8. Try to eliminate all keywords. If you have followed the instructions up to now, your site should be open again in China. I recommend that you run a search for keywords in your blog and try to eliminate or disguise some of them. Look especially at the posts that were linked by big websites and received a lot of hits in China. NOTE: you don’t need to eliminate outgoing links; a link to a forbidden site does not usually get you blocked, otherwise every major website in China would be down.
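
The keyword search itself can be as simple as the following Python sketch, assuming you have your posts available as a title-to-text mapping. The function name, the sample posts and the placeholder keyword are all hypothetical; which words really count as keywords is something you have to test yourself, as described in the prevention section.

```python
def posts_with_keywords(posts, keywords):
    """Return {title: [keywords found]} for posts that contain any keyword."""
    hits = {}
    for title, body in posts.items():
        lowered = body.lower()
        found = [word for word in keywords if word.lower() in lowered]
        if found:
            hits[title] = found
    return hits


# Placeholder data: "foo" stands in for a real keyword.
posts = {
    "First post": "Some thoughts about Foo and the weather.",
    "Second post": "Nothing sensitive here.",
}
print(posts_with_keywords(posts, ["foo"]))
```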

9. Take it easy, and tell me about it. Go back to the “Prevention” section above and try to get smarter, so that next time you don’t have to go through the whole process again. Or else remain blocked and take it as an honour. Make yourself a T-shirt saying “I was blocked by da GFW”; after all, not every site is worth the effort of blocking. You must have had at least some relevance in the eyes of the GFW for it to censor you. In any case, if you get GFW’ed and follow these steps, give me your feedback in the comments below so we can improve the method.

NOTES AND DISCLAIMERS

  • There is no such thing as censorship in China; the GFW is due to a bug in the Chinese internet that defies all scientific explanation.
  • These instructions are empirical knowledge. I have no internal information on the GFW, and the basis for these instructions is my own experience and that of a few other bloggers, shared with me by email. I am not giving credit to anyone here to avoid getting them in trouble again, but guys, let me know if you want to be linked.
  • It cannot be excluded that the GFW reads these instructions and changes its blocking devices. Also, the GFW has a random nature and not all the steps above will work in all situations.
  • These instructions can only be used by websites of a small to moderate size that are blocked more or less automatically by the GFW. If you are big, especially if you write in Chinese, forget about the IP/URL swap. The GFW can find your new address in 5 minutes. Well, let’s say 5 hours, with the help of the Secret Service.
  • If you are hosted within mainland China, these GFW instructions do not apply to you; instead you get what is usually known as the Net Nanny. See this post if you want to understand the difference.
  • As far as I know there is nothing illegal in writing instructions for webmasters to deal with an internet bug. However, if this text is somehow illegal or unharmonious, I would be grateful to the concerned department if they can inform me so I can correct it before they take more drastic measures.



Comments so far ↓

  1. Kai, Jul 9, 12:40 AM

    Hey, looks like your post is cut off mid-way through Action #2…

  2. uln, Jul 9, 6:05 PM

    Hi Kai, thanks for telling me. I have been travelling these days and I opened the blog editor with my mobile phone browser last night. Bad idea. Many of the editing functions are not available in the mobile browser and I ended up messing up the whole post.

    Here it is open again. Anybody out there who was blocked, let me know if this is useful, or add any other suggestion.

  3. Pierre, Jul 10, 10:31 PM

    “If you are hosted in the same address as FLG.com, then you know it was not you.”

    I did not know Forward Logistic Group was that sensitive a topic in China ;)

    BTW, do you know if anyone writing in a language other than English and Chinese has ever been blocked ? Like in Spanish or French ? Or are we free to write home about la Paz Celestial en mille neuf cent quatre-vingt neuf because they don’t care ?

  4. FOARP, Jul 24, 4:16 PM

    I’ve had a couple of ideas on getting around blocks:

    1) Writing the post and then scanning it to create an image file, maybe even hand-writing it to make it harder for a non-native speaker to read. Essentially your web page would look like a photo blog, but with scanned pictures of your writing. The content would not show up on a text keyword filter, and would require the censor to actually load the page and read it to know its content. The only problem would be that it wouldn’t attract search-engine traffic, but blogs can be advertised in some other fashion, and the titles of the pictures would be searchable.

    2) Like Pierre said, writing in some obscure half-forgotten language (like French) and then using an auto-translator to turn it into English. Not exactly ideal, as much of the meaning/nuance would be lost and some parts garbled, but better than nothing.

    I think that the image-file method would be a particularly good method at least for a low-to-medium activity blog. Of course, once the censor actually gets around to reading your website (or just sees high-traffic ‘anti-China’ websites linking to it) the game would be up, but you would go much longer between blocks.

    For myself, I’m happy to have my blog on good old blocked blogspot. There’s no pressure to avoid blockage when you’re already blocked!

  5. Uln, Jul 24, 7:53 PM

    Hi,

    Thanks for your ideas. These methods you propose, as well as some others like writing Chinese vertically, using Russian characters, etc., all have one thing in common: they are ways to disguise the keywords to avoid them being found by a GFW automated search system. And they all have 2 inherent problems:

    1- The search engines (google, etc.) also can’t find the keywords, so you miss potential readers.
    2- If you get big enough, especially if you write in Chinese, then you get human intervention and all these tricks will not work.

    But how automatic is the GFW really? I don’t know. The Chrter 08 post that got blocked here had been written a few weeks before with the same wording, but it only got blocked when the NYT linked to it (with a lapse of a few hours). So, clearly, the block is not just on keywords; the “hits” or “links” variable plays an important role as well.

    The other option is: it was actually a human who followed the NYT link and gave the signal to put me on a “list”. In this case, my writing the keywords in Russian or in scanned pictures would have had no effect.
