Android SDK
Real-time Speech Recognition Android SDK
Integrating several SDKs at the same time can cause conflicts between native "so" files. If this happens, unpack the aar package with decompression software (such as 7-Zip) and delete the conflicting "so" files. The SDK supports Android 5.0 and above.
Before using the SDK, please read the Interface Protocol first. For details, refer to Cloud API.
1 Integration Steps
1.1 Add aar Dependency
Place asr-sdk.aar in the project's libs directory, then modify the build.gradle of the app module to add the aar file, okhttp, Java-WebSocket, and gson as dependencies.
implementation fileTree(dir: "libs", include: ["*.jar", "*.aar"])
// okhttp; if the project is relatively old, you can use version 3.14.2 instead
implementation 'com.squareup.okhttp3:okhttp:4.10.0'
implementation 'org.java-websocket:Java-WebSocket:1.5.3'
implementation 'com.google.code.gson:gson:2.10.1'
If you are using a 32-bit SDK, you need to add the following code:
defaultConfig {
    // Add the following code
    externalNativeBuild {
        ndk {
            abiFilters "armeabi-v7a"
        }
    }
}
1.2 Add App-Related Permissions
Add the following permissions to the AndroidManifest.xml file:
<!-- Recording permission -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!-- Network permission -->
<uses-permission android:name="android.permission.INTERNET" />
1.3 Invocation Steps/Sample Code
1.3.1 Obtain Recording Permissions
registerForActivityResult(new ActivityResultContracts.RequestPermission(), success -> {
    // success is true when the user granted the permission
    tip(success);
}).launch(Manifest.permission.RECORD_AUDIO);
1.3.2 Create a Speech Recognition Class
private Transcriber transcriber; // Real-time speech recognition
private Recognizer recognizer;   // Second engine, returned when type is Asr.Type.RECOGNIZER
// Returns the engine that matches the selected type
private BaseAsr getAsr() {
    if (type == Asr.Type.RECOGNIZER) {
        return recognizer;
    }
    return transcriber;
}
// Initialize the SDK in onCreate
transcriber = Transcriber.getInstance(activity);
recognizer = Recognizer.getInstance(activity);
1.3.3 Initialize the Microphone
getAsr().initRecorder(); // Initialize the microphone once the recognition class has been created
1.3.4 Set Up Callbacks
(1) Callback Parameter Description
| Name | Type | Description | Callback Arguments |
|---|---|---|---|
| onStart | Function | Called when the engine connection starts | Current task ID (String) |
| onResult | Function | Called with the engine's recognition result | Result data (String) |
| onIntermediateResult | Function | Called with the engine's intermediate results | Intermediate result data (String) |
| onWarning | Function | Called when the engine reports a warning | Task ID (String) and status code (Errors.Err) |
| onError | Function | Called when the engine reports an error | Task ID (String) and status code (Errors.Err) |
| onGetAudio | Function | Called with the audio data being recognized | Audio data (byte[]) |
| onStop | Function | Called when the engine ends | None |
(2) Parameter Example
getAsr().setListener(new Asr.Listener() {
    @Override
    public void onStart(String taskId) {
    }
    @Override
    public void onError(String taskId, Errors.Err err) {
    }
    @Override
    public void onResult(String msg) {
    }
    @Override
    public void onIntermediateResult(String msg) {
    }
    @Override
    public void onWarning(String taskId, Errors.Err err) {
    }
    @Override
    public void onGetAudio(byte[] data) {
    }
    @Override
    public void onStop() {
    }
});
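A common way to use onIntermediateResult and onResult together is to show the latest intermediate text as a preview and replace it once the final result arrives. The sketch below assumes that each intermediate result supersedes the previous one and that the text has already been extracted from the callback's msg string; neither point is stated in this document, and TranscriptBuffer is a hypothetical helper rather than part of the SDK.

```java
/** Hypothetical helper, not part of the SDK: combines final and intermediate text for display. */
final class TranscriptBuffer {
    private final StringBuilder committed = new StringBuilder();
    private String pending = "";

    /** Call from onIntermediateResult: the newest preview replaces the previous one (assumption). */
    synchronized void onIntermediate(String text) {
        pending = text;
    }

    /** Call from onResult: the final text is appended and the preview is cleared. */
    synchronized void onFinal(String text) {
        committed.append(text);
        pending = "";
    }

    /** Text to show in the UI: all final results followed by the current preview. */
    synchronized String display() {
        return committed + pending;
    }
}
```

The methods are synchronized because the document does not say which thread the callbacks run on.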
1.3.5 Activate Permissions
(1) Configuration Parameters
| Name | Type | Description | Default Value |
|---|---|---|---|
| onSuccess | Function | Callback method for successful initialization | None |
| onFail | Function | Callback method for failed initialization | None |
(2) Parameter Example
@NonNull
private Asr.InitListener getInitListener() {
    return new Asr.InitListener() {
        @Override
        public void onSuccess() {
            tip("onSuccess");
        }
        @Override
        public void onFail(Errors.Err err) {
        }
    };
}
Online Authentication
getAsr().initOnline(appId, appSecret, getInitListener());
1.3.6 Set Parameters
Interface Parameter Description
| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | Required |
| format | String | No | Audio encoding format | pcm |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | true |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | true |
| enable_inverse_text_normalization | Boolean | No | Whether to perform inverse text normalization (ITN) in post-processing | true |
| max_sentence_silence | Integer | No | Sentence-break detection threshold: silence longer than this value is treated as a sentence break. Valid range: 200 to 1200. Unit: milliseconds | 450 |
| enable_words | Boolean | No | Whether to enable returning word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | true |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight, the range is [0.1, 1.0] | 0.4 |
| correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs, separated by a vertical bar; "all" means use all IDs. | None |
| forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs, separated by a vertical bar; "all" means use all IDs. | None |
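The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch in plain Java, using a Map in place of the JsonObject so that it runs without Gson; AsrParamCheck and its methods are hypothetical names, and whether the service itself rejects out-of-range values is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the SDK: applies the limits listed in the parameter table. */
final class AsrParamCheck {
    private AsrParamCheck() {}

    /** Returns the problems found; an empty list means every checked value is in range. */
    static List<String> validate(Map<String, Object> params) {
        List<String> problems = new ArrayList<>();

        // lang_type is the only parameter marked as required.
        Object lang = params.get("lang_type");
        if (!(lang instanceof String) || ((String) lang).isEmpty()) {
            problems.add("lang_type is required");
        }

        // max_sentence_silence must be an integer between 200 and 1200 ms.
        Object silence = params.get("max_sentence_silence");
        if (silence != null) {
            if (!(silence instanceof Integer)) {
                problems.add("max_sentence_silence must be an Integer");
            } else {
                int ms = (Integer) silence;
                if (ms < 200 || ms > 1200) {
                    problems.add("max_sentence_silence out of range: " + ms);
                }
            }
        }

        // hotwords_weight must lie in [0.1, 1.0].
        Object weight = params.get("hotwords_weight");
        if (weight != null) {
            if (!(weight instanceof Number)) {
                problems.add("hotwords_weight must be a number");
            } else {
                double w = ((Number) weight).doubleValue();
                if (w < 0.1 || w > 1.0) {
                    problems.add("hotwords_weight out of range: " + w);
                }
            }
        }
        return problems;
    }

    /** Joins several vocabulary IDs with a vertical bar, as described for the *_words_id parameters. */
    static String joinIds(List<String> ids) {
        return String.join("|", ids);
    }
}
```

Returning a list of problems instead of throwing on the first one lets the caller report every invalid value at once.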
JsonObject params = new JsonObject();
params.addProperty("lang_type", langType);
params.addProperty("sample_rate", 16000);
params.addProperty("enable_intermediate_result", true);
params.addProperty("enable_punctuation_prediction", true);
params.addProperty("enable_inverse_text_normalization", true);
params.addProperty("max_sentence_silence", 800);
params.addProperty("enable_words", true);
1.3.7 Start/Stop Recognition
| Name | Type | Description | Default Value |
|---|---|---|---|
| autoRecording | boolean | Whether the SDK records from the microphone itself; pass false to feed audio externally | true |
| onlineAsr | boolean | true for Cloud API, false for On-Device | true |
| params | JsonObject | Parameter JSON | None |
Start Recognition
getAsr().start(autoRecording, onlineAsr, params);
Start Recognition (SDK External Audio Transfer)
1. Feeding audio from an external recorder
1.1 Start
getAsr().start(false, onlineAsr, params);
1.2 Feed audio
getAsr().feed(data, false);
1.3 End
getAsr().stop();
2. Feeding audio read from a file
2.1 Start
getAsr().start(false, onlineAsr, params);
2.2 Feed audio
getAsr().feed(data, false);
2.3 End (the last chunk must be sent as the tail packet, with the second argument set to true)
getAsr().feed(data, true);
Start Recognition (File Recognition Method)
getAsr().startPath(onlineAsr, params, "Local audio address");
// No manual stop is needed: recognition stops automatically and the onStop callback is invoked when the file transfer is complete.
Force Sentence Ending
getAsr().sentenceEnd();
Customize Speaker Number
getAsr().speakerStart("speaker_name");
Stop Recognition
getAsr().stop();
Save Recognition Audio
getAsr().setSaveAudio(true);
// Audio is not saved by default. When set to true, it is saved under context.getExternalFilesDir(null)/asrCache
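For the file-reading variant of external audio transfer described earlier (start, repeated feed calls, then a final feed with the tail flag set to true), the audio has to be split into chunks and only the last chunk flagged. The sketch below shows one way to do that in plain Java. PcmChunker and its Sink interface are hypothetical stand-ins for getAsr().feed(data, isLast), and the chunk size calculation assumes 16-bit mono PCM, which this document does not confirm.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of the SDK: splits PCM data into chunks and flags the last one. */
final class PcmChunker {
    private PcmChunker() {}

    /** Stand-in for getAsr().feed(data, isLast). */
    interface Sink {
        void feed(byte[] data, boolean isLast);
    }

    /** Bytes in one chunk, assuming 16-bit mono PCM (2 bytes per sample). */
    static int bytesPerChunk(int sampleRate, int chunkMillis) {
        return sampleRate * 2 * chunkMillis / 1000;
    }

    /**
     * Feeds pcm to the sink in chunks of at most chunkBytes and marks only the final chunk as the tail.
     * Returns the number of chunks sent; empty input sends nothing.
     */
    static int feedAll(byte[] pcm, int chunkBytes, Sink sink) {
        if (chunkBytes <= 0) {
            throw new IllegalArgumentException("chunkBytes must be positive");
        }
        int count = 0;
        for (int offset = 0; offset < pcm.length; offset += chunkBytes) {
            int end = Math.min(offset + chunkBytes, pcm.length);
            boolean isLast = (end == pcm.length);
            sink.feed(Arrays.copyOfRange(pcm, offset, end), isLast);
            count++;
        }
        return count;
    }
}
```

With the SDK, the sink would be a lambda such as `(data, isLast) -> getAsr().feed(data, isLast)`, called after `getAsr().start(false, onlineAsr, params)`.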