Flask Mailinglist

Any way to stream file uploads?

From: Michael Fogleman
Date: 2011-09-09 @ 17:01

request.files['file'] gives you a FileStorage instance, which has a
file-like object that you can read from. But that file is basically
the entire upload already in memory or in a temporary file. Is there
any way in Flask to process the stream as it is being uploaded?

Re: Any way to stream file uploads?

From: Armin Ronacher
Date: 2011-09-09 @ 17:11

Hi,

On 9/9/11 7:01 PM, Michael Fogleman wrote:
> request.files['file'] gives you a FileStorage instance, which has a 
> file-like object that you can read from. But that file is basically 
> the entire upload already in memory or in a temporary file. Is there 
> any way in Flask to process the stream as it is being uploaded? 
Form parsing in Flask (or rather Werkzeug) works by consuming
wsgi.input via werkzeug.formparser.parse_multipart, which is invoked
by werkzeug.formparser.parse_form_data.  It's nontrivial to hook into
that in general, but I have not come up with a task that needs it,
since you can just wrap wsgi.input.

Werkzeug guarantees you that it will always only use the .read() method
with a given size from the input stream so you can easily wrap it:


class StreamWrapper(object):
    def __init__(self, stream):
        self._stream = stream

    def read(self, size):
        # size is the number of bytes Werkzeug asks for
        rv = self._stream.read(size)
        # do something with rv
        return rv


@app.route('/upload', methods=['GET', 'POST'])
def upload_files():
    request.environ['wsgi.input'] = \
        StreamWrapper(request.environ['wsgi.input'])
    # at that point access request.files and it will read via
    # your StreamWrapper.  Careful not to access request.files at
    # any point earlier.  request.shallow can be set to True to make
    # sure this does not happen by accident.
    ...

Werkzeug calls .read() on your stream in buffer_size steps, which is
currently more or less hardcoded to 10KB.  If you would like this to be
more pluggable, please file a ticket in the Werkzeug issue tracker.


Regards,
Armin
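
As an illustration of what "do something with rv" might look like, here is a minimal sketch that hashes and counts bytes as they pass through the wrapper. The class name HashingStreamWrapper and its attributes are hypothetical, and an in-memory io.BytesIO stands in for wsgi.input so the snippet runs on its own. It only covers .read(), so it inherits whatever assumptions the message above makes about which methods Werkzeug calls.

```python
import hashlib
import io


class HashingStreamWrapper(object):
    """Hypothetical wrapper: hashes every chunk that is read through it."""

    def __init__(self, stream):
        self._stream = stream
        self.sha256 = hashlib.sha256()
        self.bytes_seen = 0

    def read(self, size=-1):
        rv = self._stream.read(size)
        # Process the chunk as it passes through instead of after the upload.
        self.sha256.update(rv)
        self.bytes_seen += len(rv)
        return rv


# Stand-in for request.environ['wsgi.input'] (assumption for this demo).
body = io.BytesIO(b"x" * 25000)
wrapped = HashingStreamWrapper(body)

# Read in 10KB steps, similar to the buffer size mentioned above.
while wrapped.read(10 * 1024):
    pass
```

Note that wsgi.input carries the raw request body, so for a multipart upload the hash would cover the boundaries and part headers as well, not just the file contents.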

Re: Any way to stream file uploads?

From: Michael Fogleman
Date: 2011-09-09 @ 17:29

Cool, but it looks like it's trying to use readline:

  File "C:\Python26\lib\site-packages\werkzeug-0.6.2-py2.6.egg\werkzeug\formparser.py", line 208, in parse_multipart
    file = LimitedStream(file, content_length)
  File "C:\Python26\lib\site-packages\werkzeug-0.6.2-py2.6.egg\werkzeug\wsgi.py", line 662, in __init__
    self._readline = stream.readline
AttributeError: 'StreamWrapper' object has no attribute 'readline'
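
The traceback shows LimitedStream in Werkzeug 0.6.2 looking up stream.readline when it is constructed, so a wrapper used with that version apparently needs that attribute as well. The following is only a sketch under that assumption: it forwards both read() and readline() to the underlying stream and passes each chunk to a callback. The callback parameter is hypothetical, io.BytesIO stands in for wsgi.input, and other Werkzeug versions may expect other methods.

```python
import io


class StreamWrapper(object):
    """Sketch of a wrapper that forwards read() and readline()."""

    def __init__(self, stream, callback):
        self._stream = stream
        self._callback = callback  # hypothetical hook, called with each chunk

    def read(self, size=-1):
        rv = self._stream.read(size)
        self._callback(rv)
        return rv

    def readline(self, size=-1):
        rv = self._stream.readline(size)
        self._callback(rv)
        return rv


# Stand-in for request.environ['wsgi.input'] (assumption for this demo).
chunks = []
wrapped = StreamWrapper(io.BytesIO(b"first line\nsecond line\nrest"), chunks.append)
first = wrapped.readline()
rest = wrapped.read()
```

Because both methods go through the same callback, the chunks it receives add up to the full body regardless of which method the parser uses.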
