Short Speech Recognition Android SDK
- When integrating multiple SDKs at the same time, ".so" file conflicts may occur. To resolve them, unpack the aar package with an archive tool (such as 7-Zip) and delete the conflicting ".so" files.
- Supports Android 5.0 and later.
- Before using the SDK, please read the Interface Protocol first. For details, refer to Cloud API.
1 Integration Steps
1.1 Add aar Dependency
Place asr-sdk.aar in the project's libs directory and modify the build.gradle of the app module to add the aar file and okhttp as dependencies.
implementation fileTree(dir: "libs", include: ["*.jar","*.aar"])
implementation 'com.squareup.okhttp3:okhttp:4.10.0'//Here we add okhttp. If the project is relatively old, you can use version 3.14.2.
implementation 'org.java-websocket:Java-WebSocket:1.5.3'
implementation 'com.google.code.gson:gson:2.10.1'
If you are using a 32-bit SDK, you need to add the following code:
defaultConfig {
    //Add the following code
    externalNativeBuild {
        ndk {
            abiFilters "armeabi-v7a"
        }
    }
}
1.2 Add App-Related Permissions
Modify the AndroidManifest.xml file
<!--Recording permission must be added-->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!--Network permission must be added-->
<uses-permission android:name="android.permission.INTERNET" />
1.3 Invocation Steps/Sample Code
1.3.1 Obtain Recording Permissions
registerForActivityResult(new ActivityResultContracts.RequestPermission(), success -> {
    //Report whether the user granted the recording permission
    String msg = "Recording permission " + (success ? "granted" : "denied");
    tip(msg);
}).launch(Manifest.permission.RECORD_AUDIO);
1.3.2 Create a Speech Recognition Class
private Recognizer recognizer;//Short Speech Recognition
private Transcriber transcriber;//Returned by getAsr() for any other type
//Returns the engine that matches the currently selected type
private BaseAsr getAsr() {
    if (type == Asr.Type.RECOGNIZER) {
        return recognizer;
    }
    return transcriber;
}
//Initialize the SDK in onCreate
transcriber = Transcriber.getInstance(activity);
recognizer = Recognizer.getInstance(activity);
1.3.3 Initialize the Microphone
getAsr().initRecorder();//Initialize the microphone when creating the recognition class
1.3.4 Set Up Callbacks
(1) Callback Parameter Description
| Name | Type | Description | Callback Arguments |
|---|---|---|---|
| onStart | Function | Called when the engine connection starts | Current task ID (String) |
| onResult | Function | Called with the engine's recognition result | Result data (String) |
| onIntermediateResult | Function | Called with the engine's intermediate results | Intermediate result data (String) |
| onWarning | Function | Called when the engine reports a warning | Task ID (String) and status code (Errors.Err) |
| onError | Function | Called when the engine reports an error | Task ID (String) and status code (Errors.Err) |
| onGetAudio | Function | Called with the audio data being recognized | Audio data (byte[]) |
| onStop | Function | Called when the engine ends | None |
(2) Parameter Example
getAsr().setListener(new Asr.Listener() {
    @Override
    public void onStart(String taskId) {
    }
    @Override
    public void onError(String taskId, Errors.Err err) {
    }
    @Override
    public void onResult(String msg) {
    }
    @Override
    public void onIntermediateResult(String msg) {
    }
    @Override
    public void onWarning(String taskId, Errors.Err err) {
    }
    @Override
    public void onGetAudio(byte[] data) {
    }
    @Override
    public void onStop() {
    }
});
1.3.5 Activate Permissions
(1) Configuration Parameters
| Name | Type | Description | Default Value |
|---|---|---|---|
| onSuccess | Function | Callback method for successful initialization | None |
| onFail | Function | Callback method for failed initialization | None |
(2) Parameter Example
@NonNull
private Asr.InitListener getInitListener() {
    return new Asr.InitListener() {
        @Override
        public void onSuccess() {
            tip("Initialization successful");
        }
        @Override
        public void onFail(Errors.Err err) {
        }
    };
}
Online Authentication
getAsr().initOnline(appId, appSecret, getInitListener());
1.3.6 Set Parameters
Interface Parameter Description
| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | Required |
| format | String | No | Audio encoding format | mp3 |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | true |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | true |
| enable_inverse_text_normalization | Boolean | No | Whether to perform ITN in post-processing | true |
| max_sentence_silence | Integer | No | Speech sentence breaking detection threshold. Silence longer than this threshold is considered as a sentence break. The valid parameter range is 200~1200. Unit: Milliseconds | 450 |
| enable_words | Boolean | No | Whether to enable returning word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | true |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight, the range is [0.1, 1.0] | 0.4 |
| correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs, separated by a vertical bar. All indicates using all IDs. | None |
| forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs, separated by a vertical bar. All indicates using all IDs. | None |
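The ranges in the table above can be checked on the client before the parameters are sent. The following is a minimal sketch, not part of the SDK: `AsrParamCheck`, `validate`, and `joinIds` are hypothetical names, a plain `Map` stands in for the JSON object, and the limits are taken at face value from the table (200 to 1200 ms for `max_sentence_silence`, 0.1 to 1.0 for `hotwords_weight`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: checks values against the
// ranges listed in the parameter table.
class AsrParamCheck {

    // Returns human-readable problems; an empty list means the parameters
    // pass the checks implemented here (other fields are not inspected).
    static List<String> validate(Map<String, Object> params) {
        List<String> problems = new ArrayList<>();

        // lang_type is the only parameter marked as required in the table.
        Object lang = params.get("lang_type");
        if (!(lang instanceof String) || ((String) lang).isEmpty()) {
            problems.add("lang_type is required");
        }

        // max_sentence_silence: integer, 200..1200 ms according to the table.
        Object silence = params.get("max_sentence_silence");
        if (silence != null) {
            if (!(silence instanceof Integer)) {
                problems.add("max_sentence_silence must be an integer");
            } else {
                int ms = (Integer) silence;
                if (ms < 200 || ms > 1200) {
                    problems.add("max_sentence_silence must be within 200..1200 ms");
                }
            }
        }

        // hotwords_weight: number in [0.1, 1.0] according to the table.
        Object weight = params.get("hotwords_weight");
        if (weight != null) {
            if (!(weight instanceof Number)) {
                problems.add("hotwords_weight must be a number");
            } else {
                double w = ((Number) weight).doubleValue();
                if (w < 0.1 || w > 1.0) {
                    problems.add("hotwords_weight must be within 0.1..1.0");
                }
            }
        }
        return problems;
    }

    // Joins several vocabulary IDs with a vertical bar, the separator the
    // table gives for correction_words_id and forbidden_words_id.
    static String joinIds(List<String> ids) {
        return String.join("|", ids);
    }
}
```

An app could call `validate` just before starting recognition and show the returned messages instead of sending a request that the service is likely to reject.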
//Set recognition parameters
JsonObject params = new JsonObject();
params.addProperty("lang_type", langType);//The language type to be recognized (Required)
params.addProperty("sample_rate", 16000);//Audio sampling rate
params.addProperty("enable_intermediate_result", true);//Whether to return intermediate recognition results
params.addProperty("enable_punctuation_prediction", true);//Whether to add punctuation in post-processing
params.addProperty("enable_inverse_text_normalization", true);//Whether to perform ITN in post-processing
params.addProperty("max_sentence_silence", 800);//Sentence-break detection threshold in milliseconds. Silence longer than this is treated as a sentence break; see the parameter table above for the valid range and default
params.addProperty("enable_words", true);//Whether to enable returning word information
1.3.7 Start/Stop Recognition
| Name | Type | Description | Default Value |
|---|---|---|---|
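| autoRecording | boolean | true to record from the SDK's built-in microphone (the microphone must be initialized); false to supply audio externally | true |
| onlineAsr | boolean | true for cloud (online) recognition, false for on-device recognition | true |
| params | JsonObject | Recognition parameters as a JSON object (see 1.3.6) | None |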
Start Recognition (SDK Built-in Mic)
getAsr().start(autoRecording, onlineAsr, params);
Start Recognition (SDK External Audio Transfer)
//1. Feeding audio recorded by the app itself
//1.1 Start without the built-in microphone
getAsr().start(false, onlineAsr, params);
//1.2 Feed audio data
getAsr().feed(data, false);
//1.3 End recognition
getAsr().stop();
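The start, feed, stop sequence above can be driven from a buffer by slicing it into fixed-size chunks. The sketch below is illustrative only: `AudioSink` and `ChunkedFeeder` are hypothetical stand-ins, and only the `feed(data, isLast)` shape is taken from the calls shown in this section. A chunk size of 3200 bytes corresponds to 100 ms of audio if the data is 16 kHz, 16-bit mono PCM, which is an assumption rather than something this guide states.

```java
import java.util.Arrays;

// Hypothetical stand-in for the SDK object; it mirrors only the
// feed(data, isLast) call shape used in this section.
interface AudioSink {
    void feed(byte[] data, boolean isLast);
}

class ChunkedFeeder {

    // Splits audio into chunks of at most chunkSize bytes and feeds them in
    // order. When markTail is true, the final chunk is sent with
    // isLast = true; otherwise every chunk is sent with isLast = false and
    // the caller is expected to end recognition itself.
    // Returns the number of chunks sent.
    static int feedAll(byte[] audio, int chunkSize, boolean markTail, AudioSink sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int sent = 0;
        for (int offset = 0; offset < audio.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, audio.length);
            boolean isLast = markTail && end == audio.length;
            sink.feed(Arrays.copyOfRange(audio, offset, end), isLast);
            sent++;
        }
        return sent;
    }
}
```

With the real SDK, the sink could be a lambda such as `(data, last) -> getAsr().feed(data, last)`. Passing `markTail = false` matches the flow that ends with `stop()`, while `markTail = true` matches the file-reading flow that ends with a tail packet.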
//2. Feeding audio that the app reads from a file
//2.1 Start without the built-in microphone
getAsr().start(false, onlineAsr, params);
//2.2 Feed audio data
getAsr().feed(data, false);
//2.3 End: send the final data with the tail-packet flag set to true
getAsr().feed(data, true);
Start Recognition (File Recognition Method)
getAsr().startPath(onlineAsr,params,"Local audio address");
//No need to stop manually: recognition stops automatically and the onStop callback is invoked when the file transfer is complete.
Force Sentence Ending
getAsr().sentenceEnd();
Customize Speaker Number
getAsr().speakerStart("speaker_name");
Stop Recognition
getAsr().stop();
Save Recognition Audio
getAsr().setSaveAudio(true);
//Saving is disabled by default. Set to true to save the audio under context.getExternalFilesDir(null)/asrCache
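Once saving is enabled, the saved audio can be located by listing the asrCache directory. The sketch below uses only `java.io.File`; `SavedAudio` and `listSaved` are hypothetical names, and `baseDir` stands in for the value of `context.getExternalFilesDir(null)`. The names and formats of the files the SDK writes are not documented here, so the code makes no assumption about them.

```java
import java.io.File;
import java.util.Arrays;

// Hypothetical helper, not part of the SDK.
class SavedAudio {

    // Returns the regular files inside <baseDir>/asrCache sorted by path,
    // or an empty array if the directory does not exist or cannot be read.
    static File[] listSaved(File baseDir) {
        File cacheDir = new File(baseDir, "asrCache");
        File[] files = cacheDir.listFiles(File::isFile);
        if (files == null) {
            return new File[0];
        }
        Arrays.sort(files); // File compares by pathname
        return files;
    }
}
```

On Android the call would look like `SavedAudio.listSaved(context.getExternalFilesDir(null))`.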