Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speech Recognition v2 - StartListening/StopListening #1382

Merged
merged 33 commits into from Sep 23, 2023

Conversation

VladislavAntonyuk
Copy link
Collaborator

@VladislavAntonyuk VladislavAntonyuk commented Sep 4, 2023

Description of Change

StartListeningAsync is to start listening for speech. The method returns "immediately", it doesn't wait for the recognition result. I would not name it Background, to not confuse users, that it works like a background service and allows to spy for the user.
so flow for users:
StartListeningAsync
Subscribe for events (SpeechCompleted/PartialSpeech)
do some other stuff
Whenever I want, I can stop listening without caring for the result.

In future we can extend the API with Pause/Resume async

trying to replicate this API: https://learn.microsoft.com/en-us/uwp/api/windows.media.speechrecognition.speechcontinuousrecognitionsession.startasync?view=winrt-22621

Added new APIs :

  • StartListeting
  • StopListening
  • SpeechToTextState
  • SpeechRecognitionUpdated event
  • SpeechRecognitionCompleted event

Removed:
Offline Speech Recognition on Windows. Maybe that is my pronunciation, but the output result is much worse than with online recognition.

Linked Issues

PR Checklist

Additional information

@VladislavAntonyuk VladislavAntonyuk self-assigned this Sep 4, 2023
@VladislavAntonyuk VladislavAntonyuk requested a review from a team September 4, 2023 19:01
pictos
pictos previously requested changes Sep 6, 2023
Copy link
Member

@pictos pictos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good in general, just a friendly reminder this will be a breaking change since you're removing a feature on Windows. Also, I could see a lot of places where the CancellationToken provided by the user is being ignored, and this isn't ideal.

Copy link
Collaborator

@brminnick brminnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Vlad!

Looks like we just need to add unit tests for the new public APIs and write the the docs.

Otherwise, the implementation looks good.

I made a couple tweaks:

  • Added event ISpeechToText.StateChanged
  • Renamed ISpeechToText.State -> ISpeechToText.CurrentState
  • Renamed ISpeechToText.StartListeningAsync -> ISpeechToText.StartListenAsync
  • Renamed ISpeechToText.StopListeningAsync -> ISpeechToText.StopListenAsync
  • Renamed OnSpeechToTextRecognitionResultCompleted -> SpeechToTextRecognitionResultCompletedEventArgs
  • Renamed OnSpeechToTextRecognitionResultUpdated -> SpeechToTextRecognitionResultUpdatedEventArgs
  • Updated XML Comments for new APIs
  • Updated Sample App to be more descriptive

Screenshot_1694303409

@brminnick brminnick added pending documentation This feature requires documentation do not merge Do not merge this PR labels Sep 9, 2023
@brminnick
Copy link
Collaborator

Adding the do not merge and pending documentation labels until we have the Docs PR opened

@VladislavAntonyuk
Copy link
Collaborator Author

VladislavAntonyuk commented Sep 10, 2023

Thank you @brminnick.
StateChanged event only makes sense on Android, where we manually switch the state, all other platforms have get-only property, that returns the real state of speech recognizer.

@brminnick brminnick removed the do not merge Do not merge this PR label Sep 16, 2023
Copy link
Collaborator

@brminnick brminnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoonghyunCho Hey Jay!

Could you incorporate the newly added event, ISpeechToText.StateChanged into the Tizen implementation?

We have a new method, void OnSpeechToTextStateChanged(SpeechToTextState), that you can call any time the SpeechToText state changes on Tizen.

void OnSpeechToTextStateChanged(SpeechToTextState speechToTextState)
{
speechToTextStateChangedWeakEventManager.HandleEvent(this, new SpeechToTextStateChangedEventArgs(speechToTextState), nameof(StateChanged));
}

I'm not sure if Tizen has it's own event, like Windows' SpeechRecognizer.StateChanged event, that you can subscribe to, or if you'll need to call OnSpeechToTextStateChanged() manually, like we're doing for iOS, MacCatalyst and Windows.

@brminnick
Copy link
Collaborator

brminnick commented Sep 16, 2023

Thanks @VladislavAntonyuk! I think we're super close 💯

Once Jay implements StateChanged for Tizen, we should be ready to merge!

Copy link
Member

@pictos pictos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found a couple of areas that we can improve and a question regarding an opened issue.

@JoonghyunCho
Copy link
Member

Sorry I'm late!!!!! Just added StateChanged to Tizen :)

@JoonghyunCho
Copy link
Member

Not sure why it suddenly cause the vulnerability error with the SkiaSharp. Can you guess anything?

@brminnick brminnick dismissed their stale review September 22, 2023 16:08

Tizen implementation added

@VladislavAntonyuk
Copy link
Collaborator Author

Thank you @brminnick , @JoonghyunCho and @pictos !
@brminnick Can we merge a PR?

@brminnick
Copy link
Collaborator

brminnick commented Sep 23, 2023

For sure! I was just giving it a final look right now 🙌

#1398 is about to merge and then this one will be next

@brminnick brminnick enabled auto-merge (squash) September 23, 2023 19:21
@brminnick brminnick merged commit 429b205 into main Sep 23, 2023
7 checks passed
@brminnick brminnick deleted the speech-recognition-v2 branch September 23, 2023 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending documentation This feature requires documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants