Real-time Speech Recognition C SDK
Before using the SDK, please read the Interface Protocol first. For details, refer to Cloud API.
1 How to Use SDK Demo
The SDK is available for Windows and Linux. Each version includes x86 and x64 libraries (stored in the libs directory) and a run script for each platform:
- Windows-x86 : run-x86.bat
- Windows-x64 : run-x64.bat
- Linux-x86 : run-x86.sh
- Linux-x64 : run-x64.sh
When compiling your own code with g++, link against the SDK library by passing -lasr.
2 Key Interface Description
| Function Name | Parameters | Description |
|---|---|---|
| getAsrVersion | None | Get SDK version number |
| asrGetToken | char* appId : appId, char* appSecret : appSecret, bool proEnvironment : whether to use the production environment | Get a token. Returns JSON: {"status":"00000","message":"success","data":{"app_id":"string","token":"string","expiration_time":1686622013}}. expiration_time is the token expiration time; the token can be reused until it expires. Extract data.token and pass it to the interfaces that require a token |
| getAsrParams | None | Get recognition parameters, refer to AsrParams structure description |
| startTranscribe | char* token : obtained through the asrGetToken interface, AsrParams params : recognition parameters (see the AsrParams structure description), TranscriberListener listener : recognition callbacks (see the TranscriberListener structure description) | Start real-time speech recognition |
| asrFeed | char* taskId : taskId returned by onStart, char* data : audio data, int length : length of the data array | Send audio data. Must be called repeatedly, from within the listener's onStart callback |
| stopTranscribe | None | Stop real-time speech recognition |
| sentenceEnd | char* taskId | Force sentence ending (only valid for real-time speech recognition) |
| speakerStart | char* taskId, char* speakerId | Customize speaker number (only valid for real-time speech recognition) |
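The asrGetToken description says that data.token must be extracted from the JSON response and that the token can be reused until expiration_time. The following is a minimal sketch of that step, not SDK code: extractJsonString, extractJsonNumber and tokenIsUsable are hypothetical helpers, and the sketch assumes the response is compact JSON in exactly the documented shape, that status "00000" indicates success, and that expiration_time is a Unix timestamp in seconds. A production client would more likely use a real JSON parser.

```cpp
#include <cstdlib>
#include <ctime>
#include <string>

// Hypothetical helper (not part of the SDK): returns the string value of
// "key":"value" in a compact JSON text, or an empty string if absent.
// Assumes the documented response shape with no escaped quotes in values.
std::string extractJsonString(const std::string& json, const std::string& key) {
    const std::string pattern = "\"" + key + "\":\"";
    const std::size_t start = json.find(pattern);
    if (start == std::string::npos) return "";
    const std::size_t valueStart = start + pattern.size();
    const std::size_t valueEnd = json.find('"', valueStart);
    if (valueEnd == std::string::npos) return "";
    return json.substr(valueStart, valueEnd - valueStart);
}

// Hypothetical helper: returns the integer value of "key":123, or -1 if absent.
long long extractJsonNumber(const std::string& json, const std::string& key) {
    const std::string pattern = "\"" + key + "\":";
    const std::size_t start = json.find(pattern);
    if (start == std::string::npos) return -1;
    return std::strtoll(json.c_str() + start + pattern.size(), nullptr, 10);
}

// True if the response reports success, carries a token, and the token has
// not expired at time `now`.
// Assumption: expiration_time is a Unix timestamp in seconds.
bool tokenIsUsable(const std::string& json, std::time_t now) {
    if (extractJsonString(json, "status") != "00000") return false;
    if (extractJsonString(json, "token").empty()) return false;
    return extractJsonNumber(json, "expiration_time") > static_cast<long long>(now);
}
```

A caller could cache the response string, check tokenIsUsable(response, std::time(nullptr)) before each session, and request a new token only when the check fails.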
AsrParams Structure
| Field Definition | Description | Required | Default Value |
|---|---|---|---|
| const char *langType | Language option | Yes | None |
| const char *format | Audio encoding format | No | pcm |
| int sampleRate | Audio sampling rate | No | 16000 |
| bool enableIntermediateResult | Whether to return intermediate recognition results. If enabled, the intermediate recognition results will be returned in the Listener's onIntermediateResult function | No | false |
| bool enablePunctuationPrediction | Whether to add punctuation in post-processing | No | false |
| bool enableInverseTextNormalization | Whether to perform ITN in post-processing | No | false |
| int maxSentenceSilence | Sentence break detection threshold. Silence longer than this threshold is treated as a sentence break. Valid range: 200 to 1200. Unit: milliseconds | No | 450 |
| bool enableWords | Whether to return word information | No | false |
| bool enableModalParticleFilter | Whether to enable modal particle filtering | No | false |
| const char *hotwordsId | Hotwords ID | No | None |
| float hotwordsWeight | Hotwords weight, the range is [0.1, 1.0] | No | 0.4 |
| const char *correctionWordsId | Forced correction vocabulary ID. Supports multiple IDs separated by a vertical bar; the value all selects all IDs | No | None |
| const char *forbiddenWordsId | Forbidden words ID. Supports multiple IDs separated by a vertical bar; the value all selects all IDs | No | None |
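The table above implies a few constraints that can be checked before calling startTranscribe: langType is required, maxSentenceSilence must lie in 200 to 1200, and hotwordsWeight must lie in [0.1, 1.0]. The following is a minimal sketch of such a check. AsrParamsSketch is a hypothetical local mirror of the documented fields and defaults, so that the example compiles without the SDK header; the real AsrParams definition may differ. validateParams is likewise a hypothetical helper.

```cpp
#include <string>

// Hypothetical local mirror of the AsrParams fields documented above.
// The real definition comes from the SDK header and may differ in layout.
struct AsrParamsSketch {
    const char* langType = nullptr;               // required, no default
    const char* format = "pcm";
    int sampleRate = 16000;
    bool enableIntermediateResult = false;
    bool enablePunctuationPrediction = false;
    bool enableInverseTextNormalization = false;
    int maxSentenceSilence = 450;                 // milliseconds, 200..1200
    bool enableWords = false;
    bool enableModalParticleFilter = false;
    const char* hotwordsId = nullptr;
    float hotwordsWeight = 0.4f;                  // valid range [0.1, 1.0]
    const char* correctionWordsId = nullptr;
    const char* forbiddenWordsId = nullptr;
};

// Returns an empty string if the parameters satisfy the documented
// constraints, otherwise a short description of the first violation found.
std::string validateParams(const AsrParamsSketch& p) {
    if (p.langType == nullptr || p.langType[0] == '\0')
        return "langType is required";
    if (p.maxSentenceSilence < 200 || p.maxSentenceSilence > 1200)
        return "maxSentenceSilence must be within 200..1200 ms";
    if (p.hotwordsWeight < 0.1f || p.hotwordsWeight > 1.0f)
        return "hotwordsWeight must be within [0.1, 1.0]";
    return "";
}
```

Checking locally gives a readable message up front instead of a 200000 (Invalid Parameter) status from the service. Valid langType values are defined by the Interface Protocol and are not validated here.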
TranscriberListener Structure
| Function Definition | Description |
|---|---|
| void (*onStart)(const char * taskId) | Recognition start callback. Call asrFeed inside this callback to send audio data |
| void (*onSentenceBegin)(const char * taskId, const char *msg) | Sentence start callback |
| void (*onSentenceEnd)(const char * taskId, const char *msg) | Sentence end callback |
| void (*onIntermediateResult)(const char * taskId, int index, const char *msg) | Intermediate result callback. Only invoked when enableIntermediateResult is set to true in the parameters |
| void (*onStop)(const char * taskId) | Recognition stop callback |
| void (*onError)(const char *taskId, const char *code, const char *msg) | Recognition error callback. code is the error code, msg is the error message |
| void (*onWarning)(const char *taskId, const char *code, const char *msg) | Recognition warning callback. code is the warning code, msg is the warning message |
| void (*onGetVolume)(const char *taskId, float volume) | Real-time volume callback |
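Because the listener is a set of plain function pointers and audio must be sent from within onStart, a typical integration keeps its state outside the callbacks and feeds the audio in fixed-size chunks. The sketch below illustrates that pattern under stated assumptions: ListenerSketch is a hypothetical local mirror of four of the documented callbacks, FeedFn models the documented asrFeed parameters with an assumed void return type, fakeFeed stands in for asrFeed so the example runs without the SDK, and the 3200-byte chunk size (100 ms of 16 kHz, 16-bit mono PCM) is an arbitrary choice.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical local mirror of part of the TranscriberListener table.
struct ListenerSketch {
    void (*onStart)(const char* taskId);
    void (*onSentenceEnd)(const char* taskId, const char* msg);
    void (*onStop)(const char* taskId);
    void (*onError)(const char* taskId, const char* code, const char* msg);
};

// Modeled on the documented asrFeed parameters; the return type is assumed.
using FeedFn = void (*)(const char* taskId, const char* data, int length);

// Sends `audio` through `feed` in chunks of at most chunkSize bytes.
// Returns the number of feed calls made.
int feedInChunks(const char* taskId, const std::vector<char>& audio,
                 int chunkSize, FeedFn feed) {
    assert(chunkSize > 0);
    int calls = 0;
    for (std::size_t offset = 0; offset < audio.size(); offset += chunkSize) {
        const std::size_t remaining = audio.size() - offset;
        const int length = static_cast<int>(std::min<std::size_t>(chunkSize, remaining));
        feed(taskId, audio.data() + offset, length);
        ++calls;
    }
    return calls;
}

// Demo state lives in globals because the callbacks carry no user pointer.
static std::vector<char> g_audio;          // audio waiting to be sent
static std::vector<std::string> g_events;  // callbacks seen, in order
static int g_bytesFed = 0;

// Stand-in for asrFeed: only counts the bytes it receives.
void fakeFeed(const char*, const char*, int length) { g_bytesFed += length; }

void handleStart(const char* taskId) {
    g_events.push_back("start");
    feedInChunks(taskId, g_audio, 3200, fakeFeed);
}
void handleSentenceEnd(const char*, const char* msg) {
    g_events.push_back(std::string("sentence:") + msg);
}
void handleStop(const char*) { g_events.push_back("stop"); }
void handleError(const char*, const char* code, const char*) {
    g_events.push_back(std::string("error:") + code);
}

ListenerSketch makeListener() {
    return ListenerSketch{handleStart, handleSentenceEnd, handleStop, handleError};
}
```

With the real SDK, the callbacks would be assigned to a TranscriberListener and fakeFeed replaced by asrFeed. The content of msg is defined by the Interface Protocol, so the sketch stores it unparsed.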
3 Sample Code
Please refer to asr-demo.cpp
4 Status Code List
| Error Code | Error Message | Description | Solution |
|---|---|---|---|
| 110002 | Unauthorized | Unauthorized | Change the parameters to use an authorized capability, or contact the business department to add AI capabilities |
| 110003 | APP ID Expired | APP ID expired | Contact the business department to extend the validity period of the APP ID |
| 110004 | Calling Quota Exceeded | The call quota has been exceeded | Contact the business department to extend the approved period or increase the number of calls |
| 110010 | Online Capabilities Not Authorized | Cloud API capabilities are not approved | Contact the business department to add authorization for Cloud API capabilities |
| 120000 | Network Error | Network error | Check the network connection status of the device |
| 200000 | Invalid Parameter | The parameter is incorrect | Check the parameters (a parameter is non-compliant, incorrect, or an empty string) |
| 210203 | Invalid Number Of Channels | The number of audio channels is incorrect | Check the number of audio channels (WAV audio must have exactly 1 channel) |
| 210000 | Gateway Timeout In Receiving Data | The gateway timed out when receiving data | Check if data is being sent normally |
| 210100 | Invalid Calling Sequence | The calling sequence is incorrect | Call in the normal sequence |
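Since onError and onWarning pass the code as a string, an application can map it to the table above to log an actionable hint. The following is a minimal sketch of such a lookup. StatusEntry, kStatusTable and findStatus are hypothetical names, the solution texts are abbreviated from the table, and the sketch assumes the code arrives as the decimal string shown in the Error Code column.

```cpp
#include <cstring>

// One row of the status code table (solution text abbreviated).
struct StatusEntry {
    const char* code;
    const char* message;
    const char* solution;
};

static const StatusEntry kStatusTable[] = {
    {"110002", "Unauthorized", "Use an authorized capability or contact the business department"},
    {"110003", "APP ID Expired", "Contact the business department to extend the APP ID validity"},
    {"110004", "Calling Quota Exceeded", "Contact the business department to raise the quota"},
    {"110010", "Online Capabilities Not Authorized", "Contact the business department to authorize Cloud API capabilities"},
    {"120000", "Network Error", "Check the network connection of the device"},
    {"200000", "Invalid Parameter", "Check the parameters"},
    {"210203", "Invalid Number Of Channels", "Check the number of audio channels"},
    {"210000", "Gateway Timeout In Receiving Data", "Check that data is being sent normally"},
    {"210100", "Invalid Calling Sequence", "Call the interfaces in the normal sequence"},
};

// Returns the table entry for `code`, or nullptr if the code is not listed.
const StatusEntry* findStatus(const char* code) {
    if (code == nullptr) return nullptr;
    for (const StatusEntry& entry : kStatusTable) {
        if (std::strcmp(entry.code, code) == 0) return &entry;
    }
    return nullptr;
}
```

An onError handler could call findStatus(code) and fall back to the msg argument supplied by the SDK when the code is not in the table.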