[Twisted-Python] Storing site-wide information and scheduling tasks

Andrew Bennetts andrew-twisted at puzzling.org
Thu Jun 26 04:26:23 EDT 2003


On Thu, Jun 26, 2003 at 09:08:30AM +0200, Thomas Weholt ( PRIVAT ) wrote:
> I want to start a webserver which have object ( or whatever ) available in
> all Resource-instances running in that server which holds common information
> for the entire site. What's the appropriate module to use for this;
> Application, Factory ??

I'll let someone else answer this, but I think you want a Service.

> In the same server I need to start tasks ( ie. run functions ) at different
> intervals. These function-calls can take some time ( they can fetch files
> from the net, scan the local filesystem etc. ) so I guess I need threads,
> deferred or something similar. These function-calls must return a result and
> update the persistent object mentioned above.

(Btw, you seem to think Deferreds make blocking code magically non-blocking.
They don't -- they're actually very straightforward and non-magical.)

For the "fetch files from the net", you don't need threads.  Just do
something like (warning -- untested code):

    from twisted.web.client import getPage
    from twisted.internet import reactor
    from twisted.python import log

    refreshInterval = 30

    def periodicFileFetch(url):
        d = getPage(url)
        # Process the page, and "update the persistent object"
        d.addCallback(processPage).addCallback(updateObject)

        # Log any errors in downloading or processing
        d.addErrback(log.err)

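        # Reschedule this function.  Callbacks receive the previous result
        # as their first argument, so wrap the call in a lambda to keep that
        # result out of callLater's delay argument.
        d.addBoth(lambda _: reactor.callLater(
            refreshInterval, periodicFileFetch, url))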

    def processPage(x):
        "Your code goes here!"
    def updateObject(x):
        "Your code goes here!"

    # `url` is the address of the page you want to poll
    reactor.callLater(refreshInterval, periodicFileFetch, url)

For scanning the local filesystem, you could treat it as one big blocking
operation and run it in a thread, or you could break it into small chunks
(e.g. one directory at a time) and process each chunk with
callLater(0, processNextChunk).  For the sake of discussion, I'm going to
choose a thread :)
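
For comparison, here is a rough sketch of the chunked alternative.  The names
scanInChunks and onDone are made up for illustration, and the scheduler is
passed in as a parameter (in a real server you would pass reactor.callLater)
so that the sketch depends only on the standard library.  Each call handles
one directory and then hands control back to the event loop:

```python
import os

def scanInChunks(root, callLater, onDone):
    """Scan `root` one directory per event-loop turn (illustrative sketch).

    `callLater` is assumed to have the same (delay, function) shape as
    reactor.callLater; `onDone` is called once with the list of files.
    """
    pending = [root]   # directories still to be listed
    found = []         # files discovered so far

    def processNextChunk():
        if not pending:
            onDone(found)
            return
        directory = pending.pop()
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            names = []   # unreadable directory: skip it
        for name in names:
            full = os.path.join(directory, name)
            if os.path.isdir(full) and not os.path.islink(full):
                pending.append(full)
            else:
                found.append(full)
        # Yield to the event loop before touching the next directory
        callLater(0, processNextChunk)

    callLater(0, processNextChunk)
```

Because every chunk is scheduled with a delay of 0, other events get a chance
to run between directories.  A single huge directory will still block for as
long as os.listdir takes on it, which is one reason a thread can be simpler.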

The trick with threads is to avoid interacting directly with *any* of your
existing objects that your main event loop uses.  Disregarding this advice
will lead to race conditions, and thus horrid, horrid bugs.  So, we scan for
files in a thread, but to make it safe we send the instruction to do work to
the thread via a Queue.Queue, and make sure it returns the results to the
deferred via reactor.callFromThread.

    # WARNING: More completely untested code.

    from twisted.python import log, failure
    from twisted.internet import reactor, defer
    import Queue, threading

    refreshInterval = 30

    def processEvents(queue):
        """A thread that processes events. 
        
        It receives (deferred, function) 2-tuples from a Queue.Queue, runs
        the function, and fires the deferred with the result.
        """
        while True:
            try:
                deferred, func = queue.get()
            except:
                log.err()
                continue
            try:
                result = func()
            except Exception:
                # Failure() with no arguments captures the active exception
                reactor.callFromThread(deferred.errback, failure.Failure())
            else:
                reactor.callFromThread(deferred.callback, result)

    def periodicFileScanner(queue, path):
        # Tell the thread it's time to do some work
        d = defer.Deferred()
        queue.put((d, lambda: scanFiles(path)))

        # Arrange for the result/error to be dealt with
        d.addCallback(updateObject)
        d.addErrback(log.err)

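        # Schedule this fun merry-go-round to happen again.  Callbacks are
        # passed the previous result as their first argument, so wrap the
        # call in a lambda to keep it out of callLater's delay argument.
        d.addBoth(lambda _: reactor.callLater(
            refreshInterval, periodicFileScanner, queue, path))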

    def initFileScanning(path):
        q = Queue.Queue()
        t = threading.Thread(target=processEvents, args=(q,))
        t.daemon = True  # don't keep the process alive once the reactor stops
        t.start()
        reactor.callLater(refreshInterval, periodicFileScanner, q, path)

    def scanFiles(path):
        "Your code goes here!"
    def updateObject(x):
        "Your code goes here!"
        
    initFileScanning('/path')

This is actually more-or-less what twisted.internet.threads.deferToThread
does (once you dig deep enough), so you probably want to use it rather than
my completely untested code.  I've written it out explicitly in the hope
that you'll have a better understanding of how it all works.  

Note also how Deferreds are just messengers -- they don't do any interesting
work beyond calling callbacks when they're told to.
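
To make that concrete, here is a toy stand-in I'm calling ToyDeferred.  It is
emphatically not Twisted's implementation (there is no errback chain, for a
start), just a minimal sketch of the callback half of the idea: callbacks are
stored, nothing runs until somebody calls callback(), and each callback
receives the previous result as its first argument.

```python
class ToyDeferred:
    """Illustrative only: stores callbacks and runs them when told to."""

    def __init__(self):
        self.callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, f, *args, **kw):
        self.callbacks.append((f, args, kw))
        if self.called:
            # The result already arrived, so run the new callback right away
            self._run()
        return self

    def callback(self, result):
        self.called = True
        self.result = result
        self._run()

    def _run(self):
        while self.callbacks:
            f, args, kw = self.callbacks.pop(0)
            # The previous result is always the first argument
            self.result = f(self.result, *args, **kw)

d = ToyDeferred()
seen = []
d.addCallback(lambda x: x * 2).addCallback(seen.append)
pendingBefore = len(d.callbacks)   # both callbacks are still waiting
d.callback(21)                     # now they run, in order
```

Nothing in there starts a thread or does any I/O.  Whoever holds the Deferred
(a protocol, a thread via callFromThread, a timer) has to do the work and
then call callback() with the answer.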

> Can anybody show me a very basic example on how to do this?

I hope my example code is basic enough that it makes sense to you --
let us know if you're still uncertain about anything.

-Andrew.
