
Tornado and Concurrency

Reading the Tornado documentation did not give me a basic understanding of asynchronous web service calls. After going back and forth between external examples and blogs, here is a summary of my understanding of Tornado’s concurrency model.

  1. In common multi-threaded web frameworks, each user request gets one dedicated thread, and the framework caps the number of threads running concurrently on the server. A thread blocks as soon as its request hits a blocking call (e.g. database access or an external blocking HTTP call) and waits until the result comes back and control returns to it.
  2. Tornado instead uses an IOLoop, a single thread that handles all requests. (A server may run multiple IOLoops, but that case is not discussed here.) The single loop makes the asynchronous model easier to follow: every asynchronous call is spawned or yielded from this one IOLoop and eventually returns control to it. The loop is normally shared, obtained through IOLoop.current() rather than created anew, while the Application object holds the routes and global settings that the loop serves.
  3. Tornado making asynchronous code easier does not mean Tornado is asynchronous by default. Handlers are synchronous by default, and because the IOLoop has only one thread, other user requests are severely blocked whenever the blocking part is non-negligible: database calls, AWS calls, long polling (possibly WebSockets). To make a handler asynchronous for these blocking monsters, we first need to decorate the handler method with @tornado.web.asynchronous (callback style, removed in Tornado 6.0) or with @tornado.gen.coroutine (yield style). Second, we need to make the calls themselves asynchronous or run them in an executor pool, following recommendations like these:
    1. Try to make the calling component asynchronous by using an asynchronous library (e.g. an asynchronous database driver) or AsyncHTTPClient for external calls.
    2. Make synchronous calls faster, for example by using a faster local database.
    3. If remote synchronous (blocking) calls cannot be avoided, run them in a ThreadPoolExecutor.
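To make the effect of a blocking handler on a single-threaded loop concrete, here is a minimal sketch. It uses only the standard library's asyncio rather than Tornado itself (Tornado 5 and later run the IOLoop on top of asyncio), and the handler names, request count, and the time.sleep stand-in for a synchronous database call are all hypothetical.

```python
import asyncio
import time


async def blocking_handler(delay):
    # time.sleep stands in for a synchronous database or HTTP call.
    # It never yields to the event loop, so every other request waits.
    time.sleep(delay)


async def async_handler(delay):
    # asyncio.sleep stands in for a call made through an asynchronous
    # library; it hands control back to the loop while waiting.
    await asyncio.sleep(delay)


async def serve(handler, n_requests, delay):
    # Simulate n_requests arriving at once on a single event loop and
    # return the total wall-clock time needed to finish them all.
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n_requests)))
    return time.perf_counter() - start


def compare(n_requests=3, delay=0.1):
    blocking = asyncio.run(serve(blocking_handler, n_requests, delay))
    non_blocking = asyncio.run(serve(async_handler, n_requests, delay))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"blocking: {blocking:.2f}s, non-blocking: {non_blocking:.2f}s")
```

With blocking handlers the total time grows roughly as the number of requests times the delay, while the asynchronous handlers overlap and finish in about one delay.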

So what about blocking database drivers, such as those for Redshift? Redshift has no asynchronous query driver support. And since it sits behind AWS, we can’t safely assume that a synchronous connection is fast (DynamoDB may be fine).

Then we need to make use of ThreadPoolExecutor.
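A minimal sketch of that approach follows, again written against the standard library's asyncio and concurrent.futures rather than Tornado directly. The run_query function, the pool size of 4, and the query strings are hypothetical placeholders; time.sleep stands in for a blocking Redshift query.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; in practice it is bounded by how many
# concurrent connections the database allows.
executor = ThreadPoolExecutor(max_workers=4)


def run_query(sql):
    # Stand-in for a blocking driver call (e.g. a Redshift query).
    time.sleep(0.1)
    return f"rows for {sql}"


async def handle_request(sql):
    # Hand the blocking call to a worker thread; the event loop stays
    # free to serve other requests until the result is ready.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, run_query, sql)


async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        *(handle_request(f"q{i}") for i in range(3))
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

In Tornado 5.0 and later the same pattern should be available inside a coroutine handler as IOLoop.current().run_in_executor(executor, run_query, sql). The three queries run in separate worker threads, so they overlap instead of queuing behind each other on the loop.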

 

 
