Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error when saving utterances #140

Merged
merged 10 commits into from
Sep 16, 2024
Merged

Conversation

Erotemic
Copy link
Contributor

@Erotemic Erotemic commented Sep 14, 2024

When the config listener.save_utterances is true the current version 0.2.0a is causing a KeyError because "transcription" is no longer in the stt_meta dictionary. It looks like the key is now "transcriptions" and it is a List[Tuple[str, float]], which I'm going to guess is a list of possible transcriptions and their confidence scores.

In any case, I don't think it matters because if the user wants to save transcriptions they probably want to save them even if they are saying "the same thing", so hashing the transcription is probably not the best way to come up with a file name.

Instead I opted to just use a uuid. It might be better to do something with a ISO timestamp instead, but this at least works for now.

Summary by CodeRabbit

  • New Features

    • Introduced customizable filename templates for saved audio files, incorporating dynamic elements like transcription hashes and UUIDs for better organization.
    • Added a utility for generating dynamic filenames based on user-defined templates, enhancing flexibility in audio data management.
  • Bug Fixes

    • Enhanced error handling in the audio processing workflow to accommodate varying transcription data availability.
  • Documentation

    • Updated documentation to reflect changes in audio file naming conventions and the new filename formatting options.

Copy link
Contributor

coderabbitai bot commented Sep 14, 2024

Walkthrough

The pull request introduces a new _TemplateFilenameFormatter class in the ovos_dinkum_listener/_util.py file, enabling dynamic filename generation based on customizable templates. The _save_stt method in ovos_dinkum_listener/service.py has been updated to utilize this new formatter, allowing filenames to include elements like transcription hashes and UUIDs. The implementation includes error handling for unsupported keys in templates, enhancing the robustness of the filename generation process.

Changes

Files Change Summary
ovos_dinkum_listener/_util.py Added _TemplateFilenameFormatter class for dynamic filename generation with customizable templates.
ovos_dinkum_listener/service.py Enhanced _save_stt method to support customizable filename templates using UUIDs and transcription hashes. Removed import of hash_sentence.

Poem

In the burrow where sounds take flight,
A UUID now shines so bright.
No more hashes, just unique names,
Each audio file plays its own games.
Hopping along, we celebrate,
New filenames, oh how great! 🐇🎶


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    -- I pushed a fix in commit <commit_id>, please review it.
    -- Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    -- @coderabbitai generate unit testing code for this file.
    -- @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    -- @coderabbitai generate interesting stats about this repository and render them as a table.
    -- @coderabbitai read src/utils.ts and generate unit testing code.
    -- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    -- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range and nitpick comments (1)
ovos_dinkum_listener/service.py (1)

684-685: Ensure UUID change doesn't break existing code.

The change to use UUIDs for generating unique filenames is valid and eliminates the potential for naming collisions.

However, be aware that this change breaks the previous convention of correlating filenames with transcription content. Downstream code that relies on this correlation will need to be updated.

Consider adding a comment to explain the rationale behind switching to UUIDs, to aid future maintainers.

Example:

# Using UUID to ensure each audio file has a unique filename.
# This breaks the previous convention of correlating filenames with 
# transcription content, but eliminates potential naming collisions.
filename = str(uuid.uuid4())
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 07b6a22 and 184296c.

Files selected for processing (1)
  • ovos_dinkum_listener/service.py (2 hunks)
Additional comments not posted (1)
ovos_dinkum_listener/service.py (1)

16-16: LGTM!

The uuid module import is a valid addition and is likely being used to generate unique identifiers.

Copy link

@mikejgray mikejgray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UUID is a good idea but a major change in functionality impacting every current user of the feature. Please consider making this a separate config item rather than fully replacing the existing system

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range and nitpick comments (1)
ovos_dinkum_listener/service.py (1)

696-706: Consider refactoring the transcription concatenation logic to avoid duplication.

The hash key builder tries to handle both legacy and new STT result formats, which is great for backwards compatibility. However, the transcription concatenation logic is duplicated. Consider extracting it into a helper function to improve readability and maintainability.

@formatter.register('hash')
def transcription_hash():
-    try:
-        return hash_sentence(stt_meta.get('transcription'))
-    except KeyError:
-        tests = [t[0] for t in stt_meta.get('transcriptions')]
-        concat_text = '_'.join(tests)
-        return hash_sentence(concat_text)
+    transcription = stt_meta.get('transcription')
+    if transcription:
+        return hash_sentence(transcription)
+    else:
+        transcriptions = stt_meta.get('transcriptions', [])
+        concat_text = '_'.join(t[0] for t in transcriptions)
+        return hash_sentence(concat_text)
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 46315e3 and 07aaaed.

Files selected for processing (1)
  • ovos_dinkum_listener/service.py (2 hunks)
Additional comments not posted (2)
ovos_dinkum_listener/service.py (2)

684-708: Flexible and robust filename generation for saved STT utterances.

The new _TemplateFilenameFormatter class provides a flexible way to generate filenames based on a user-defined template. Including a hash of the transcription and a UUID in the default template helps avoid overwriting files and ensures unique filenames. Nicely done!


1140-1263: Excellent implementation of a flexible and extensible filename formatter!

The _TemplateFilenameFormatter class is a well-designed helper for dynamically building filenames based on user-specified templates. It offers the following benefits:

  • Supports a mix of built-in keys ("uuid", "date", "utcdate") and custom keys registered through a decorator.
  • Cleanly separates the parsing logic (_build_fmtkw) from the formatting logic (format).
  • Provides detailed docstring with usage examples, serving as good documentation.
  • Raises a descriptive KeyError if the template contains an unknown key.

The class is flexible, extensible, and easy to use. Great job!

@Erotemic
Copy link
Contributor Author

I've added a new config option: listener.filename_template. This can take a value similar to a Python format string. Keys inside curly braces are replaced with a special value that needs to be pre-defined.

I think this feature to allow users to specify how file paths are stored gives a good deal of flexibility but without the overhead of adding lots of niche config variable. The tradeoff is that this config variable is a good deal more complex, but I've tried to make it somewhat easy to work with. The following text will describe it and perhaps can be used for documentation:

Here are several example values for "filename_template" and file-saving strategy they implement.

  • "last" - a template with no variables. This just will save the last utterance, which has very low storage cost.

  • "{hash}" - This is the legacy behavior, it names the file based on the hash of the transcrypt.

  • "{uuid}" - This saves a file using only a uuid

  • "{date:%Y-%m-%dT%H%M%S%z}-{uuid}" - save the file based on the ISO date and uuid.

  • "{hash}-{date:%Y-%m-%dT%H%M%S%z}-{uuid}" - This is the proposed "best-of-3-worlds" default, but it is rather lengthy.

Note: currently "{date}" can be used by itself, but it may use non-path-safe characters.

The logic to do the "dynamic formatting" is in _TemplateFilenameFormatter. You'll note that it has a lot of doctests which demonstrate its usage (and could be run in a test suite via xdoctest). I think this might be generically useful enough to be in a util module, but I wouldn't mind feedback on the API.

I'm also open to suggestions for what the special keys we allow are, or the form that they take. We could use a nicer and safer function for a generic "date" key if we use or vendor in ubelt.timestamp.

I'll also note that I have written a tool to help "liberate" simple code with explicit dependencies from their module's

import liberator
import ubelt 
lib = liberator.Liberator()
lib.add_dynamic(ubelt.timestamp)
lib.expand(['ubelt'])
print(lib.current_sourcecode())

This will give you the 276 lines of code to port all of the features in ubelt.timestamp() to a standalone utility. But it also might be easier to stick with stdlib modules.

Copy link
Member

@JarbasAl JarbasAl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple nitpicks, but it's looking!

@Erotemic
Copy link
Contributor Author

I've made another change that seemed reasonable after trying to write the docs. The name "filename_template" was too generic and was not clear if it corresponded to the utterance or the wake word, so I think it would be good to rename the config option as listener.utterance_filename.

@JarbasAl
Copy link
Member

I've made another change

seems you didnt push ;)

@Erotemic
Copy link
Contributor Author

Apparently, I didn't even commit 🙃. Fixed now.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range and nitpick comments (1)
ovos_dinkum_listener/service.py (1)

685-707: Looks good! Just a minor suggestion.

The changes to the _save_stt function introduce a flexible way to generate filenames for saved utterances using the _TemplateFilenameFormatter class. The md5 key registration handles different versions of the STT metadata format, ensuring compatibility. The error handling for missing transcriptions is also a good addition.

Consider adding a comment explaining the purpose of the md5 key and its handling of different metadata formats for better code readability. For example:

@formatter.register('md5')
def transcription_md5():
    # Generate an MD5 hash of the transcription text.
    # Handles different versions of the STT metadata format:
    # - Legacy API: 'transcription' key contains the text.
    # - New API: 'transcriptions' key contains a list of tuples (text, confidence).
    # Returns 'null' if the transcription is not available.
    try:
        ...
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 7befbcb and 690c6c4.

Files selected for processing (2)
  • ovos_dinkum_listener/_util.py (1 hunks)
  • ovos_dinkum_listener/service.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • ovos_dinkum_listener/_util.py
Additional comments not posted (3)
ovos_dinkum_listener/service.py (3)

45-45: LGTM!

The _TemplateFilenameFormatter class provides a flexible and extensible way to generate filenames based on user-defined templates. The ability to register custom keys allows for dynamic generation of filenames based on various metadata. The class is well-structured and the methods have clear responsibilities.

Also applies to: 685-707


Line range hint 708-720: Skipped review.

This function is not part of the changed code segments.


Line range hint 721-729: LGTM!

The changes to the _stt_audio function ensure that it uses the updated _save_stt function with the new filename generation logic. The existing functionality of saving and uploading the STT audio and metadata remains intact.

@JarbasAl JarbasAl merged commit c959f2e into OpenVoiceOS:dev Sep 16, 2024
10 checks passed
JarbasAl added a commit to OpenVoiceOS/ovos-config that referenced this pull request Oct 19, 2024
companion to OpenVoiceOS/ovos-dinkum-listener#140

the current default value in config is no longer valid
JarbasAl added a commit to OpenVoiceOS/ovos-config that referenced this pull request Oct 19, 2024
companion to OpenVoiceOS/ovos-dinkum-listener#140

the current default value in config is no longer valid
JarbasAl added a commit that referenced this pull request Oct 19, 2024
backwards compat check wasnt quite right in #140

```
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: 2024-10-19 08:04:47.463 - voice - ovos_dinkum_listener.service:_stt_audio:747 - ERROR - Error while saving STT audio
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: Traceback (most recent call last):
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 742, in _stt_audio
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     stt_context["filename"] = self._save_stt(audio_bytes, stt_context)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 707, in _save_stt
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     filename = formatter.format(utterance_filename)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/_util.py", line 107, in format
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     fmtkw = self._build_fmtkw(template, **kwargs)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/_util.py", line 91, in _build_fmtkw
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     fmtkw[key] = builder()
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                  ^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 705, in transcription_md5
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     return hash_sentence(text)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:            ^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_plugin_manager/utils/tts_cache.py", line 20, in hash_sentence
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     encoded_sentence = sentence.encode("utf-8", "ignore")
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                        ^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: AttributeError: 'NoneType' object has no attribute 'encode'

```

also needs OpenVoiceOS/ovos-config#171
@JarbasAl JarbasAl mentioned this pull request Oct 19, 2024
JarbasAl added a commit that referenced this pull request Oct 19, 2024
* fix:save utterances

backwards compat check wasnt quite right in #140

```
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: 2024-10-19 08:04:47.463 - voice - ovos_dinkum_listener.service:_stt_audio:747 - ERROR - Error while saving STT audio
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: Traceback (most recent call last):
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 742, in _stt_audio
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     stt_context["filename"] = self._save_stt(audio_bytes, stt_context)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 707, in _save_stt
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     filename = formatter.format(utterance_filename)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/_util.py", line 107, in format
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     fmtkw = self._build_fmtkw(template, **kwargs)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/_util.py", line 91, in _build_fmtkw
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     fmtkw[key] = builder()
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                  ^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_dinkum_listener/service.py", line 705, in transcription_md5
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     return hash_sentence(text)
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:            ^^^^^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:   File "/home/goldyfruit/.venvs/ovos/lib64/python3.11/site-packages/ovos_plugin_manager/utils/tts_cache.py", line 20, in hash_sentence
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:     encoded_sentence = sentence.encode("utf-8", "ignore")
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]:                        ^^^^^^^^^^^^^^^
Oct 19 08:04:47 x270.home.lan ovos-dinkum-listener[152114]: AttributeError: 'NoneType' object has no attribute 'encode'

```

also needs OpenVoiceOS/ovos-config#171

* Update requirements.txt
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants