`ValueError: I/O operation on closed file` · Issue #8 · deepgram/deepgram-python-sdk · GitHub
ValueError: I/O operation on closed file #8

Description

@mbyzhang

What is the current behavior?

A ValueError: I/O operation on closed file is raised when calling dg_client.transcription.prerecorded and both of the following conditions hold:

  • The passed buffer is a stream object (e.g. file)
  • The request failed for some reason (e.g. bad token)

This error is misleading. It is a side effect of the SDK's retry logic (see Expected behavior), and it hides the real cause of the failure.

What's happening that seems wrong?

The SDK automatically retries a request when it fails. However, a stream object usually cannot be re-read from the beginning once it has been consumed.

Steps to reproduce

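import asyncio
from deepgram import Deepgram

DEEPGRAM_API_KEY = "BAD_TOKEN"  # deliberately invalid key, so the request fails

async def main():
    dg_client = Deepgram(DEEPGRAM_API_KEY)
    with open("some_file_that_exists.wav", "rb") as f:
        await dg_client.transcription.prerecorded({"buffer": f, "mimetype": "audio/wav"})

asyncio.run(main())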

Expected behavior

If the buffer is a stream object, the SDK should not automatically retry the request, because a stream cannot simply be restarted from the beginning. A retry raises ValueError: I/O operation on closed file, because the stream was fully consumed, and hence closed, after the first attempt. Instead, the request should be made at most once. If it fails, the real exception should be raised (e.g. for a bad token, an Unauthorized exception).
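The failure mode can be shown without the SDK. The following is a minimal sketch using only the standard library; it assumes, as described above, that the stream ends up consumed and closed after the first attempt. The variable names are illustrative only.

```python
import io

# Stand-in for the opened audio file; any binary stream behaves the same way.
stream = io.BytesIO(b"fake audio bytes")

first = stream.read()   # the first attempt consumes the whole stream
second = stream.read()  # a naive re-read without seek(0) gets nothing

stream.close()          # assumed: the stream is closed once the body is sent

try:
    stream.read()       # what a retry effectively does
    error = None
except ValueError as exc:
    error = str(exc)    # the misleading error that reaches the caller
```

Even if the stream were left open, the second read returns empty bytes, so a retry would send an empty body unless the stream were rewound first.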

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

  • Operating System/Version: Not relevant
  • Language: Python
  • Browser: Not relevant

Other information

Suggested fix

The _request function, defined in _utils.py of the SDK, contains the retry logic:

async def _request(
    path: str, options: Options,
    method: str = 'GET', payload: Payload = None,
    headers: Optional[Mapping[str, str]] = None
) -> Any:
    # ...
    tries = RETRY_COUNT
    while tries > 0:
        try:
            return await attempt()
        except Exception as exc:
            print(exc)
            tries -= 1
            continue

To fix the problem, check the type of payload. If it is stream-like, make only one attempt by setting tries to 1.
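One way this could look is sketched below. It is not the SDK's code: is_stream_like and request_with_retry are hypothetical names, the value of RETRY_COUNT is assumed, and the attempt callable stands in for the inner function that _request retries. The sketch also re-raises the last exception once the attempts are exhausted, so the caller sees the real error.

```python
import asyncio
import io
from typing import Any, Awaitable, Callable, Optional

RETRY_COUNT = 3  # assumed value; the SDK's actual constant may differ


def is_stream_like(payload: Any) -> bool:
    """Hypothetical helper: treat file objects and anything with read() as a stream."""
    return isinstance(payload, io.IOBase) or hasattr(payload, "read")


async def request_with_retry(
    attempt: Callable[[], Awaitable[Any]], payload: Any
) -> Any:
    """Run attempt(), retrying only when the payload can safely be sent again."""
    # A stream is consumed by the first attempt, so it gets exactly one try.
    tries = 1 if is_stream_like(payload) else RETRY_COUNT
    last_exc: Optional[BaseException] = None
    while tries > 0:
        try:
            return await attempt()
        except Exception as exc:
            last_exc = exc
            tries -= 1
    # At least one attempt always runs, so last_exc is set when we get here.
    assert last_exc is not None
    raise last_exc
```

With this shape, a bad token on a file payload surfaces as the original authorization error after a single attempt, while bytes payloads keep the existing retry behaviour.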
