Here are some useful tools I created.
I wrote this package to fuzzy-match Chinese words. It is particularly useful for correcting OCR results or matching keywords across different data sources. Because Chinese has no alphabet, packages commonly used for English are not directly applicable to matching short Chinese words.
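To illustrate why English-oriented matching falls short, here is a minimal sketch using Python's standard `difflib`. It is not this package's method, and the example words and the `char_similarity` helper are my own assumptions for illustration. Each Chinese character is a single symbol, so character-level similarity scores an OCR look-alike exactly the same as an unrelated word.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], as computed by difflib."""
    return SequenceMatcher(None, a, b).ratio()


target = "上海市"
# Hypothetical OCR error: U+5DFF (巿) is a visual look-alike of 市 (U+5E02).
ocr_error = "上海\u5dff"
# A genuinely different word that also differs in only the last character.
different = "上海县"

# Both candidates share two of three characters with the target, so they
# receive the same score and the look-alike cannot be told apart.
print(char_similarity(target, ocr_error))
print(char_similarity(target, different))
```

A matcher that compares something finer than whole characters could separate the two candidates; the sketch only shows the limitation of the character-level approach.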
Chinese administrative units are subject to frequent changes in name, boundary, and jurisdiction. This project aims to preserve the historical record of these units by scraping records for all units at the township level and above from the website of the National Bureau of Statistics of the PRC.