
Fix: Old method upload #535

Closed · wants to merge 3 commits

Conversation

@1yam (Collaborator) commented Dec 19, 2023:

Problem:
We require the old upload method to remain functional. In some cases it can crash because the HTTP client does not set the multipart header for the file_field.

Solution:
If no message is sent, we don't need to know the file size up front. We can simply read chunks until we reach the maximum size for unauthenticated uploads or the end of the file.

@MHHukiewitz (Member) left a comment:

Needs a bit of work. Also, some tests checking the failure states (too big file, Content-Length mismatch) should be included.

Review threads on src/aleph/web/controllers/storage.py: 1 resolved, 2 outdated and resolved
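A hedged sketch of the oversize-upload test requested above; the api_client fixture, route, and size cap are assumptions rather than aleph's actual test setup:

# Sketch of one requested failure-state test. The api_client fixture, route,
# and MAX_UNAUTHENTICATED_FILE_SIZE are assumptions, not aleph's real setup.
import pytest
from aiohttp import FormData

MAX_UNAUTHENTICATED_FILE_SIZE = 25 * 1024 * 1024  # assumed cap, in bytes

@pytest.mark.asyncio
async def test_upload_too_big_file_is_rejected(api_client):
    # One byte over the assumed cap should yield 413 Request Entity Too Large.
    data = FormData()
    data.add_field(
        "file",
        b"\0" * (MAX_UNAUTHENTICATED_FILE_SIZE + 1),
        filename="too_big.bin",
    )
    response = await api_client.post("/api/v0/storage/add_file", data=data)
    assert response.status == 413
    # A Content-Length mismatch test would need a lower-level client that can
    # send a deliberately wrong header, so it is only noted here.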
@MHHukiewitz (Member) left a comment:

LGTM

@MHHukiewitz (Member) left a comment:

After doing some research, I discovered that await request.post() reads the WHOLE request body into memory, clearly defeating the purpose of much of the code here.

Comment on lines +139 to +157

# Quoted from the PR diff (a method on the upload handler class).
# Requires: from typing import Union; from aiohttp import web
def read_file_with_max_size(self, max_size: int) -> Union[bytes, None]:
    buffer_size = 64 * 1024  # read the uploaded file 64 KiB at a time
    content = b""
    total_read = 0

    while True:
        chunk = self.file_field.file.read(buffer_size)

        if not chunk:  # end of file reached
            break

        total_read += len(chunk)
        if total_read > max_size:
            # aiohttp 3.x requires both arguments; the diff called the
            # constructor with none, which raises a TypeError.
            raise web.HTTPRequestEntityTooLarge(
                max_size=max_size, actual_size=total_read
            )

        content += chunk  # note: still accumulates the whole file in memory

    return content if content else None
A member commented:

This part is actually insufficient; at the point where we process the request, we need to use

await request.multipart()

https://docs.aiohttp.org/en/stable/web_quickstart.html#file-uploads

You might have noticed a big warning in the example above. The general issue is that aiohttp.web.BaseRequest.post() reads the whole payload in memory, resulting in possible OOM errors. To avoid this, for multipart uploads, you should use aiohttp.web.BaseRequest.multipart() which returns a multipart reader.
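A minimal sketch of that streaming pattern, assuming a handler shape, field name, and size cap that are not taken from this PR:

# Sketch of the multipart streaming approach from the aiohttp docs quoted
# above; handler name, field name, and size cap are illustrative assumptions.
from aiohttp import web

MAX_UNAUTHENTICATED_FILE_SIZE = 25 * 1024 * 1024  # assumed cap, in bytes

async def storage_add_file(request: web.Request) -> web.Response:
    reader = await request.multipart()  # streaming reader, no full-body buffering
    part = await reader.next()
    if part is None or part.name != "file":
        raise web.HTTPUnprocessableEntity(reason="missing 'file' part")

    chunks = []
    total_read = 0
    while True:
        chunk = await part.read_chunk(64 * 1024)  # at most 64 KiB per read
        if not chunk:  # end of this part
            break
        total_read += len(chunk)
        if total_read > MAX_UNAUTHENTICATED_FILE_SIZE:
            raise web.HTTPRequestEntityTooLarge(
                max_size=MAX_UNAUTHENTICATED_FILE_SIZE, actual_size=total_read
            )
        chunks.append(chunk)

    content = b"".join(chunks)  # payload is bounded by the cap above
    return web.Response(text=f"received {len(content)} bytes")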

@nesitor (Member) commented Feb 29, 2024:

Closed in favor of #559.

@nesitor closed this on Feb 29, 2024.