Don’t use Python’s httplib (or any library that uses it) to transfer large files, especially to China

As mentioned in previous blog posts, we distribute our app in China through a local Chinese server, so part of our deployment process is uploading the newest version of our APK to that host via an HTTP API. We noticed that the job was taking an extremely long time (more than 25 minutes) to complete, even though our APK is less than 7 MB. The connection would often time out before the upload could finish. We were using the Python requests library to do the actual upload.

At first we thought this might just be due to high network latency caused by the Great Firewall. However, we soon realized that cURL did not have this problem: it took less than a minute to upload the same file to the same host. So we dug into requests a little and found this issue.

It turns out that httplib, which requests relies on under the hood, reads and sends files in 8192-byte chunks, which doesn’t work well when the file is being sent over a high-latency network. So we made a quick change to the script to use cURL for now.
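For anyone wanting to do the same, here is a rough sketch of shelling out to cURL from a Python deploy script. The post doesn't describe our upload API, so the multipart form upload, the field name `file`, the URL, and the helper names `build_curl_command` and `upload_with_curl` are all assumptions made for illustration.

```python
import subprocess


def build_curl_command(apk_path, upload_url, field_name="file", max_seconds=300):
    """Build a curl command line that uploads a file as multipart form data.

    The form field name and the timeout are assumptions; adjust them to
    whatever the receiving API expects.
    """
    return [
        "curl",
        "--fail",                         # exit non-zero on HTTP error responses
        "--max-time", str(max_seconds),   # abort if the transfer takes too long
        "-F", "{}=@{}".format(field_name, apk_path),  # attach the file
        upload_url,
    ]


def upload_with_curl(apk_path, upload_url):
    """Run curl and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_curl_command(apk_path, upload_url), check=True)
```

Calling `upload_with_curl("app.apk", "https://example.invalid/upload")` would hand the whole transfer to cURL, so the Python side never touches the file contents.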

For those interested, the loop in question lives in the send method of httplib’s HTTPConnection class. The full source is here.
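To get a feel for what that chunking means for a file of our size, here is a simplified Python 3 sketch of the read-and-send pattern described above. It is not the httplib source itself: `CountingSocket` and `send_in_blocks` are hypothetical names, and the fake socket only counts calls instead of touching the network.

```python
import io

BLOCKSIZE = 8192  # the chunk size mentioned above


class CountingSocket:
    """Hypothetical stand-in for a socket; records sendall() calls only."""

    def __init__(self):
        self.calls = 0
        self.bytes_sent = 0

    def sendall(self, data):
        self.calls += 1
        self.bytes_sent += len(data)


def send_in_blocks(sock, fileobj, blocksize=BLOCKSIZE):
    """Read a file-like object block by block, sending each block separately."""
    block = fileobj.read(blocksize)
    while block:
        sock.sendall(block)
        block = fileobj.read(blocksize)


if __name__ == "__main__":
    payload = io.BytesIO(b"\0" * (7 * 1024 * 1024))  # roughly our APK size
    sock = CountingSocket()
    send_in_blocks(sock, payload)
    print(sock.calls, sock.bytes_sent)  # prints: 896 7340032
```

So a 7 MiB file turns into 896 separate read-then-send steps with this block size.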
