0.3.0 minor issues #40

ghost · 2024-06-05T20:33:37Z

Just did some testing on 0.3.0, pretty good!
Android 14 (GrapheneOS)

Minor issues:

If app is started and microphone is blocked, even though app has microphone permission, the app doesn't trigger the "unblock microphone" notification.
The app will give "♪ ♪" when you aren't talking and you tap the microphone. 🤷🏻‍♂️ Nothing big, doesn't seem to do it if you talk, even a little.
I still think there should be some type of trigger that outputs the text without stopping. Output is very fast when you trigger it via stop, so tapping such a trigger key would output what's been dictated so far while the user continues on. Best example is when finishing a paragraph: tap the trigger → output → return key → continue talking 😁

All in all though, great upgrade!

Edit

Not sure about this stuff:

What is 1200 divided by 6?

16 times 12 equals

🤷🏻‍♂️

soupslurpr · 2024-06-05T23:23:17Z

Oh hey there again! Thanks for the feedback on 0.3.0!

Welp, unfortunately this one seems to be an upstream issue with the SpeechRecognizer class. It also happens with Google's SpeechRecognizer, which you can see by trying to use Google Maps with it selected. There's also an existing issue for that: "App won't work if global mic toggle is enabled when you try an initial transcription, even if you re-enable it later" (#3).
Ah, that's because of the new sound effect that gets played. Those music notes shouldn't actually be appearing when using whisper.cpp's suppress_non_speech_tokens option, so it seems to be a bug on their end. I believe there was a PR resolving it. I could just manually find those and replace them with nothing, but ideally they would be suppressed by the suppress_non_speech_tokens option.
Yeah that does sound good, but I don't think I'm going to implement it soon. There needs to be a way to customize which tiles are visible and their size before adding more.
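
For the music notes, the manual workaround mentioned above could look roughly like the sketch below. It is written in Python only for illustration and is not code from the app. The function name `strip_music_notes` is hypothetical, and the set of characters beyond "♪" is an assumption about what else might show up.

```python
import re

# Assumption: only "♪" is confirmed in this thread; "♫" and "♬" are guesses
# at similar markers and can be dropped from the set if they never appear.
_NOTE_CHARS = re.compile(r"[♪♫♬]")
_EXTRA_SPACES = re.compile(r"\s{2,}")


def strip_music_notes(text: str) -> str:
    """Remove music-note characters and tidy the whitespace left behind."""
    without_notes = _NOTE_CHARS.sub("", text)
    # Removing a marker from the middle of a sentence leaves a double space.
    return _EXTRA_SPACES.sub(" ", without_notes).strip()


if __name__ == "__main__":
    print(repr(strip_music_notes("♪ ♪")))
    print(repr(strip_music_notes("hello ♪ ♪ world")))
```

A post-processing filter like this only hides the symptom, which is why suppressing the tokens at the whisper.cpp level would still be the preferable fix.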

Oh what do you mean this stuff? What's that?

ghost · 2024-06-05T23:30:13Z

> Oh what do you mean this stuff? What's that?

🤣

When saying phrases like "16*7=" they come out as "16 times 7 equals"

I mention it in my suggestion #41

soupslurpr · 2024-06-05T23:43:25Z

Ah, well that's probably difficult to accomplish, as you can't just replace every instance of "times" with "*" and "equals" with "=" or it would break nonmathematical uses of those words. Maybe prompting Whisper would work, but I haven't tested that and it might also affect the quality of other things. The best solution would probably be using a local AI to process the outputs if prompting doesn't work well. Not sure how much the AI would affect speed on a relatively modern device (such as a Pixel 7). It would also be several gigabytes large and take up several gigabytes in memory.
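
To illustrate the problem with blanket replacement, here is a rough Python sketch, not something the app does. Both function names are hypothetical. The second function assumes a narrower rule that only rewrites "times" between two numbers and "equals" right after a number, and assumes the transcription already contains digits, as in the examples from this thread.

```python
import re


def naive_replace(text: str) -> str:
    """Blanket replacement: also rewrites ordinary, nonmathematical uses."""
    return text.replace("times", "*").replace("equals", "=")


# Assumption: only rewrite when digits sit next to the word.
_TIMES = re.compile(r"(\d+)\s+times\s+(?=\d)")  # number, "times", then a number
_EQUALS = re.compile(r"(\d+)\s+equals\b")       # number followed by "equals"


def digit_context_replace(text: str) -> str:
    """Rewrite 'times' and 'equals' only when they are adjacent to digits."""
    text = _TIMES.sub(r"\1*", text)
    return _EQUALS.sub(r"\1=", text)


if __name__ == "__main__":
    print(naive_replace("I went three times"))          # ordinary sentence gets mangled
    print(digit_context_replace("16 times 7 equals"))   # math phrase is rewritten
    print(digit_context_replace("I called 3 times 4 days ago"))  # still misfires
```

Even the narrower rule turns "I called 3 times 4 days ago" into "I called 3*4 days ago", so pattern matching alone does not fully solve this.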

ghost · 2024-06-06T00:56:52Z

> It would also be several gigabytes large and take up several gigabytes in memory.

😱

Yeah, I'm for lean and fast 😁
