Requirement: support speech recognition APIs from cloud service vendors

Speech-to-text currently relies only on the local large model. On a Windows computer that has a lot of software open, or that has a weak configuration, recording about a minute of speech makes the CPU fan spin up and CPU usage spike the moment the recording ends.
In my measurements with about a minute of speech, the text took 7-9 seconds or more to appear. After deducting the 1 second or more that the cloud model spends on error correction, local transcription accounts for roughly 6-8 seconds.
Please consider adding a new option: just as the large-model API can be filled in, let users fill in a speech recognition API from a cloud service provider such as Tencent Cloud, Alibaba Cloud, or iFLYTEK.
Personal usage is low, so the cost is small, and these providers also offer a free quota. Moving speech-to-text to the cloud would improve the latency.
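
A rough sketch of how such an option could be wired up, in case it helps the discussion. Everything in it is hypothetical: `SttConfig`, `pick_transcriber`, the `"cloud_vendor"` key, and the two stub backends are made-up names for illustration, not part of the app or of any vendor SDK. It only shows the idea of choosing a backend from a user setting and falling back to the local model when no cloud API key has been filled in.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: a backend takes raw audio bytes and returns text.
Transcriber = Callable[[bytes], str]


@dataclass
class SttConfig:
    """Hypothetical user setting, analogous to the existing large-model API field."""
    provider: str = "local"  # "local" or a key naming a cloud backend
    api_key: str = ""        # stays empty when the local model is used


def pick_transcriber(config: SttConfig, registry: Dict[str, Transcriber]) -> Transcriber:
    """Return the configured backend, or the local one if the choice is unusable."""
    backend = registry.get(config.provider)
    missing_key = config.provider != "local" and not config.api_key
    if backend is None or missing_key:
        return registry["local"]
    return backend


# Stub backends so the sketch runs without a model or a network call.
def local_stub(audio: bytes) -> str:
    return "local:" + str(len(audio))


def cloud_stub(audio: bytes) -> str:
    return "cloud:" + str(len(audio))


REGISTRY: Dict[str, Transcriber] = {"local": local_stub, "cloud_vendor": cloud_stub}
```

With this shape, a real cloud backend would replace `cloud_stub` with a call to the chosen vendor's speech recognition API, and users who fill in nothing keep today's local behavior.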

Status: In Review
Board: 💡 Feature Request
Date: 4 months ago
Author: Zhi Qin
