Wednesday, August 19, 2009

Django Boundary Iterators

The Django Python web framework provides tools for parsing multipart form data, as any web framework should. In fact, developers building Django web applications need not concern themselves with the underlying parsing machinery. It is still there, however, and very accessible.

Django's boundary iterators help parse multipart form data, but they are general enough to be used in other contexts. The BoundaryIter class collaborates with other iterator classes such as ChunkIter and LazyStream. An example of these classes collaborating is illustrated below.
#Example; Using the Django boundary iterator.

#Imports.
from StringIO import StringIO
from django.http.multipartparser import ChunkIter, BoundaryIter, LazyStream

if __name__=="__main__":
    #Boundary data.
    b_boundary="-B-"
    c_boundary="-C-"

    #Example message with two boundaries.
    _message="bdata%scdata%s"%(b_boundary, c_boundary)

    #Instantiate a chunking iterator for file-like objects.
    _chunk_iter=ChunkIter(StringIO(_message))

    #Instantiate a lazy data stream using the chunking iterator.
    _lazy_stream=LazyStream(_chunk_iter)

    #Instantiate two boundary iterators.
    _bboundary_iter=BoundaryIter(_lazy_stream, b_boundary)
    _cboundary_iter=BoundaryIter(_lazy_stream, c_boundary)

    #Display the parsed boundary data.
    for data in _bboundary_iter:
        print "%s: %s"%(b_boundary, data)

    for data in _cboundary_iter:
        print "%s: %s"%(c_boundary, data)
In this example, we have a sample message containing two boundaries that is to be parsed. To parse it, we create three iterators and a stream. The _chunk_iter iterator is a ChunkIter instance, a very general iterator that reads a single chunk of data at a time. It expects a file-like object to iterate over. The two boundary iterators, _bboundary_iter and _cboundary_iter, are instances of the BoundaryIter class. These iterators expect both a data stream and a boundary. In this case, the same LazyStream instance is passed to both boundary iterators.
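
To make the chunking step concrete, here is a minimal sketch of the idea in plain Python. It is not Django's ChunkIter: the iter_chunks name and the tiny chunk_size are assumptions chosen purely for illustration, and the only requirement on the input is a file-like object with a read() method.

```python
import io


def iter_chunks(fileobj, chunk_size=4):
    """Yield successive chunks read from a file-like object.

    Hypothetical helper for illustration; not part of Django.
    """
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            # An empty read means the file-like object is exhausted.
            return
        yield chunk


if __name__ == "__main__":
    # Same sample message as above, expressed as bytes.
    message = io.BytesIO(b"bdata-B-cdata-C-")
    for chunk in iter_chunks(message):
        print(chunk)
```

Note that a chunk can end in the middle of a boundary, which is why a boundary-aware iterator has to sit on top of something like this rather than inspect each chunk in isolation.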

Finally, we display the results of iterating over the boundary iterators. Both iterators share one stream, so by the time the second iterator runs, the first has already consumed its portion and the remaining data stream is shorter.
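
The shared-stream behaviour can be sketched without Django at all. The following is a simplified stand-in, not Django's BoundaryIter: the iter_until_boundary name, the small chunk_size, and the use of seek() to hand unread bytes back to the stream are all assumptions made for this example.

```python
import io


def iter_until_boundary(stream, boundary, chunk_size=4):
    """Yield data from a seekable stream up to, but not including, boundary.

    Hypothetical helper for illustration; not part of Django. The boundary
    itself is consumed, and anything read past it is handed back by seeking.
    """
    buffer = b""
    # Bytes to hold back in case the boundary straddles two chunks.
    keep = len(boundary) - 1
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        index = buffer.find(boundary)
        if index != -1:
            if index:
                yield buffer[:index]
            # Rewind over whatever was read beyond the boundary.
            leftover = buffer[index + len(boundary):]
            stream.seek(-len(leftover), io.SEEK_CUR)
            return
        if not chunk:
            # End of stream without a boundary: emit what remains.
            if buffer:
                yield buffer
            return
        if len(buffer) > keep:
            # Everything except the held-back tail is safe to emit.
            yield buffer[:len(buffer) - keep]
            buffer = buffer[len(buffer) - keep:]


if __name__ == "__main__":
    stream = io.BytesIO(b"bdata-B-cdata-C-")
    # Both calls read from the same stream, one after the other.
    print(b"".join(iter_until_boundary(stream, b"-B-")))
    print(b"".join(iter_until_boundary(stream, b"-C-")))
```

Because the first call leaves the stream positioned just past its boundary, the second call only ever sees the data that follows, which is the same effect observed with the two BoundaryIter instances above.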