Forum Discussion

deeuu
Explorer | Level 4
7 years ago

list_folder_continue -> 504 errors

Hello,

I'm using `files_list_folder_continue` from the Python SDK to recursively traverse (via pagination) a folder with a very large number of files (hundreds of thousands). After a while, I run into internal server errors. Here's the log:

2018-09-24 20:09:12,831 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:10:42,843 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
2018-09-24 20:10:42,845 - dropbox - INFO - HttpError status_code=504: Retrying in 0.7 seconds
2018-09-24 20:10:43,505 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:12:13,521 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
2018-09-24 20:12:13,522 - dropbox - INFO - HttpError status_code=504: Retrying in 0.8 seconds
2018-09-24 20:12:14,320 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:13:44,336 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
2018-09-24 20:13:44,337 - dropbox - INFO - HttpError status_code=504: Retrying in 6.8 seconds
2018-09-24 20:13:51,148 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:15:21,163 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
2018-09-24 20:15:21,165 - dropbox - INFO - HttpError status_code=504: Retrying in 9.4 seconds
2018-09-24 20:15:30,565 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:17:00,580 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
2018-09-24 20:17:00,582 - dropbox - INFO - HttpError status_code=504: Retrying in 2.0 seconds
2018-09-24 20:17:02,557 - dropbox - INFO - Request to files/list_folder/continue
2018-09-24 20:18:32,572 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35

Is this attributable to the overall activity in the account (it's a business account with 10 users)?

Many thanks

  • deeuu
    Explorer | Level 4

    I should also add that when I re-run my script and call `files_list_folder_continue` with the most recent cursor saved before the InternalServerError was raised, it takes a very long time to get going again, with the same failed request attempts:

    2018-09-25 08:59:14,422 - dropbox - INFO - Request to files/list_folder/continue
    2018-09-25 09:00:44,445 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
    2018-09-25 09:00:44,446 - dropbox - INFO - HttpError status_code=504: Retrying in 2.0 seconds
    2018-09-25 09:00:46,445 - dropbox - INFO - Request to files/list_folder/continue
    2018-09-25 09:02:16,463 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
    2018-09-25 09:02:16,464 - dropbox - INFO - HttpError status_code=504: Retrying in 3.8 seconds
    2018-09-25 09:02:20,277 - dropbox - INFO - Request to files/list_folder/continue
    2018-09-25 09:03:50,319 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
    2018-09-25 09:03:50,320 - dropbox - INFO - HttpError status_code=504: Retrying in 7.6 seconds
    2018-09-25 09:03:57,961 - dropbox - INFO - Request to files/list_folder/continue
    2018-09-25 09:05:27,982 - urllib3.connectionpool - DEBUG - https://api.dropboxapi.com:443 "POST /2/files/list_folder/continue HTTP/1.1" 504 35
    2018-09-25 09:05:27,984 - dropbox - INFO - HttpError status_code=504: Retrying in 0.3 seconds
    2018-09-25 09:05:28,278 - dropbox - INFO - Request to files/list_folder/continue
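A minimal sketch of the "re-run with the most recent cursor" approach described above, assuming `dbx` is a `dropbox.Dropbox` client. The helper names (`load_cursor`, `save_cursor`, `resume_listing`), the JSON state file, and the `handle_entry` callback are hypothetical and only for illustration. It does not prevent the 504s; it only keeps the last good cursor on disk so a later run continues from that page instead of starting over.

```python
import json
import os


def load_cursor(state_path):
    """Return the saved cursor, or None if there is no checkpoint yet."""
    if not os.path.exists(state_path):
        return None
    with open(state_path) as fh:
        return json.load(fh)["cursor"]


def save_cursor(state_path, cursor):
    """Write the cursor via a temp file so a crash cannot leave a partial file."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as fh:
        json.dump({"cursor": cursor}, fh)
    os.replace(tmp_path, state_path)


def resume_listing(dbx, path, state_path, handle_entry, limit=500):
    """List `path` recursively, checkpointing the cursor after each page."""
    cursor = load_cursor(state_path)
    if cursor is None:
        # First run: start a fresh listing.
        result = dbx.files_list_folder(path, recursive=True, limit=limit)
    else:
        # Later runs: continue from the last page that was fully handled.
        result = dbx.files_list_folder_continue(cursor)
    while True:
        for entry in result.entries:
            handle_entry(entry)
        # Only checkpoint once the whole page has been handled.
        save_cursor(state_path, result.cursor)
        if not result.has_more:
            return
        result = dbx.files_list_folder_continue(result.cursor)
```

If the script dies partway through a page, that page is handled again on the next run, so `handle_entry` should tolerate seeing the same entry twice.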

    • Greg-DB
      Dropbox Staff

      Thanks for the report! This can be related to these operations taking a long time due to a large number of files and/or a large amount of file activity in the affected account(s).

      We'll look into it, but there are a few potential workarounds:

      1) Use the 'limit' parameter on files_list_folder:

      https://dropbox-sdk-python.readthedocs.io/en/latest/moduledoc.html#dropbox.dropbox.Dropbox.files_list_folder

      If you specify a smaller value (the default is effectively around 2000), that should help reduce how long each of these calls takes and reduce the likelihood that they will fail. Note that you supply the 'limit' value to files_list_folder itself, and it will apply to all results from files_list_folder_continue using the returned cursor as well.

      2) If you are using recursive=True, switch to recursive=False when calling files_list_folder. This means you would need to make a call for each sub-folder you need though.
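The two workarounds above can be sketched roughly as follows, assuming `dbx` is a `dropbox.Dropbox` client. The function names `iter_entries` and `walk_non_recursive` and the `folder_type` parameter are hypothetical, not part of the SDK; with the real SDK, `folder_type` would be `dropbox.files.FolderMetadata`. The `limit` of 500 is only an example value.

```python
from collections import deque


def iter_entries(dbx, path, recursive=True, limit=500):
    """Workaround 1: yield every entry under `path`, one small page at a time.

    `limit` is passed only to files_list_folder; the returned cursor
    carries it forward to files_list_folder_continue.
    """
    result = dbx.files_list_folder(path, recursive=recursive, limit=limit)
    while True:
        yield from result.entries
        if not result.has_more:
            return
        result = dbx.files_list_folder_continue(result.cursor)


def walk_non_recursive(dbx, root, folder_type, limit=500):
    """Workaround 2: list each folder separately with recursive=False.

    Sub-folders are queued as they are found, so there is one
    files_list_folder call (plus its continuations) per folder.
    """
    pending = deque([root])
    while pending:
        folder = pending.popleft()
        for entry in iter_entries(dbx, folder, recursive=False, limit=limit):
            if isinstance(entry, folder_type):
                # path_lower is assumed to be set for entries of interest.
                pending.append(entry.path_lower)
            yield entry
```

For example, `for entry in iter_entries(dbx, "", limit=500): ...` would walk the whole account root recursively, and `walk_non_recursive(dbx, "", dropbox.files.FolderMetadata)` would visit the same tree folder by folder.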

      • deeuu
        Explorer | Level 4

        Thanks for getting back to me Greg.

        I'd rather not go with #2, because it means I can't reliably check for changes in the folder at a later stage using the cursor obtained from `files_list_folder_get_latest_cursor`, i.e. #2 would require me to traverse the entire folder each time I want to check for changes (or to store multiple cursors for sub-directories, etc.).

        I've already been running `files_list_folder` with `limit` set to 500, and each time the internal server error occurs after approximately 310,000 files/folders. So I can almost predict when the error will occur.

        I'll try it again with `limit` set to 50 and report back tomorrow.

        Is this issue related to the number of files in a folder, i.e. limit -> 1 as number of files -> infinity? If so, I'll stop using cursors altogether.
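For reference, the single-cursor change check mentioned in this reply can be sketched as below, assuming `dbx` is a `dropbox.Dropbox` client. The helper names `snapshot_cursor` and `changes_since` are hypothetical. It illustrates why `recursive=True` is attractive here: one cursor covers the whole tree, so no per-sub-folder cursors need to be stored.

```python
def snapshot_cursor(dbx, path):
    """Get a cursor for the current state of `path` without listing it."""
    result = dbx.files_list_folder_get_latest_cursor(path, recursive=True)
    return result.cursor


def changes_since(dbx, cursor):
    """Return (entries, new_cursor) for everything that changed after `cursor`."""
    entries = []
    while True:
        result = dbx.files_list_folder_continue(cursor)
        entries.extend(result.entries)
        cursor = result.cursor
        if not result.has_more:
            return entries, cursor
```

The returned `new_cursor` would be stored in place of the old one, so the next check only reports changes made after this call.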