Short Speech Recognition Android SDK

  • When integrating multiple SDKs at the same time, .so file conflicts may occur. Use an archive tool (such as 7-Zip) to open the aar package and delete the conflicting .so files.
  • Supports Android 5.0 and above.
  • Before using the SDK, read the interface protocol first. For details, refer to the Cloud API documentation.

1 Integration Steps

1.1 Add aar Dependency

Place asr-sdk.aar in the project's libs directory, then modify the app module's build.gradle to add the aar file and the third-party dependencies (okhttp, Java-WebSocket, Gson) shown below.

implementation fileTree(dir: "libs", include: ["*.jar","*.aar"])
implementation 'com.squareup.okhttp3:okhttp:4.10.0'//okhttp dependency; for older projects, version 3.14.2 can be used instead
implementation 'org.java-websocket:Java-WebSocket:1.5.3'
implementation 'com.google.code.gson:gson:2.10.1'

If you are using a 32-bit SDK, you need to add the following code:

defaultConfig {
    //Add the following code
    externalNativeBuild {
        ndk {
            abiFilters  "armeabi-v7a" 
        }
    }
}

1.2 Modify AndroidManifest.xml

<!--Recording permission must be added-->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!--Network permission must be added-->
<uses-permission android:name="android.permission.INTERNET" />

1.3 Invocation Steps/Sample Code

1.3.1 Obtain Recording Permissions

registerForActivityResult(new ActivityResultContracts.RequestPermission(), success -> {
    String msg = "Recording permission " + (success ? "granted" : "denied");
    tip(msg);
}).launch(Manifest.permission.RECORD_AUDIO);
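
If the permission may already have been granted, you can check it first and only launch the request when needed. A minimal sketch using the standard AndroidX ContextCompat API; tip() is the same hypothetical helper used above, and the launcher must be registered before the activity reaches the STARTED state:

//Check the current permission state before requesting it (sketch)
ActivityResultLauncher<String> recordPermissionLauncher =
        registerForActivityResult(new ActivityResultContracts.RequestPermission(),
                success -> tip("Recording permission " + (success ? "granted" : "denied")));

if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        == PackageManager.PERMISSION_GRANTED) {
    tip("Recording permission already granted");
} else {
    recordPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO);
}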

1.3.2 Create a Speech Recognition Class

private Recognizer recognizer;   //Short speech recognition instance
private Transcriber transcriber; //Transcriber instance (used when the type is not RECOGNIZER)
private Asr.Type type;           //Currently selected recognition type

private BaseAsr getAsr() {
    if (type == Asr.Type.RECOGNIZER) {
        return recognizer;
    }
    return transcriber;
}

//Initialize the SDK in onCreate
transcriber = Transcriber.getInstance(activity);
recognizer = Recognizer.getInstance(activity);

1.3.3 Initialize the Microphone

getAsr().initRecorder();//Initialize the microphone when creating the recognition class    

1.3.4 Set Up Callbacks

(1) Callback Parameter Description

Name | Type | Description | Return Parameters
onStart | Function | Callback when the engine connection starts | Current task ID (String)
onResult | Function | Callback for engine result content | Result data (String)
onIntermediateResult | Function | Callback for engine intermediate results | Intermediate result data (String)
onWarning | Function | Callback for engine warnings | Task ID (String) and status code (Errors.Err)
onError | Function | Callback for engine errors | Task ID (String) and status code (Errors.Err)
onGetAudio | Function | Callback for the audio data being recognized | Audio data (byte[])
onStop | Function | Callback when the engine ends | None

(2) Parameter Example

getAsr().setListener(new Asr.Listener() {
   @Override
   public void onStart(String taskId) {   
   }

   @Override
   public void onError(String taskId, Errors.Err err) {
   }

   @Override
   public void onResult(String msg) {
   }

   @Override
   public void onIntermediateResult(String msg) {
   }

   @Override
   public void onWarning(String taskId, Errors.Err err) {
   }

   @Override
   public void onGetAudio(byte[] data) {
   }

   @Override
   public void onStop() {
   }
});         
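
The strings delivered to onResult/onIntermediateResult follow the interface protocol described in the Cloud API documentation. If the payload is JSON, it can be parsed with the bundled Gson dependency; the sketch below is illustrative only, and the "result" field name is an assumption that must be checked against the actual protocol:

@Override
public void onResult(String msg) {
    //Hypothetical parsing sketch: the actual field names are defined by the Cloud API protocol
    JsonObject obj = JsonParser.parseString(msg).getAsJsonObject();
    if (obj.has("result")) {
        Log.d("Asr", "final result: " + obj.get("result").getAsString());
    }
}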

1.3.5 Activate Authorization

(1) Configuration Parameters

Name | Type | Description | Default Value
onSuccess | Function | Callback for successful initialization | None
onFail | Function | Callback for failed initialization | None

(2) Parameter Example

@NonNull
private Asr.InitListener getInitListener() {
    return new Asr.InitListener() {
        @Override
        public void onSuccess() {
            tip("Initialization successful");
        }
        @Override
        public void onFail(Errors.Err err) {
          
        }
    };
}

Online Authentication

getAsr().initOnline(appId, appSecret, getInitListener());
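
Putting the previous steps together, a typical initialization order in onCreate might look as follows (a sketch only; recording permission from 1.3.1 must already be granted, and appId/appSecret are placeholders for your own credentials):

//Typical initialization order in onCreate (sketch)
recognizer = Recognizer.getInstance(activity);            //1.3.2 create the recognition class
getAsr().initRecorder();                                  //1.3.3 initialize the microphone
getAsr().setListener(asrListener);                        //1.3.4 asrListener is the Asr.Listener shown above
getAsr().initOnline(appId, appSecret, getInitListener()); //1.3.5 online authentication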

1.3.6 Set Parameters

Interface Parameter Description

Parameter | Type | Required | Description | Default Value
lang_type | String | Yes | Language option | Required
format | String | No | Audio encoding format | mp3
sample_rate | Integer | No | Audio sampling rate | 16000
enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | true
enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | true
enable_inverse_text_normalization | Boolean | No | Whether to perform ITN in post-processing | true
max_sentence_silence | Integer | No | Speech sentence-break detection threshold; silence longer than this threshold is treated as a sentence break. Valid range: 200~1200 ms | 450
enable_words | Boolean | No | Whether to return word-level information | false
enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | true
hotwords_id | String | No | Hotwords ID | None
hotwords_weight | Float | No | Hotwords weight, in the range [0.1, 1.0] | 0.4
correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs separated by a vertical bar; All indicates using all IDs | None
forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs separated by a vertical bar; All indicates using all IDs | None

//4.1 Set recognition parameters
JsonObject params = new JsonObject();
params.addProperty("lang_type", langType);//The language type to be recognized (Required)
params.addProperty("sample_rate", 16000);//Audio sampling rate
params.addProperty("enable_intermediate_result", true);//Whether to return intermediate recognition results
params.addProperty("enable_punctuation_prediction", true);//Whether to add punctuation in post-processing
params.addProperty("enable_inverse_text_normalization", true);//Whether to perform ITN in post-processing
params.addProperty("max_sentence_silence", 800);//Speech sentence breaking detection threshold. Silence longer than this threshold is considered as a sentence break. The valid parameter range is 200〜2000(ms), and the default value is 800ms
params.addProperty("enable_words", true);//Whether to enable returning word information

1.3.7 Start/Stop Recognition

Name | Type | Description | Default Value
autoRecording | boolean | Record with the SDK's built-in microphone (the microphone must be initialized) | true
onlineAsr | boolean | true for the Cloud API, false for on-device recognition | true
params | JsonObject | Recognition parameters (JSON) | None

Start Recognition (SDK Built-in Mic)

getAsr().start(autoRecording, onlineAsr, params);
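
For example, recording with the built-in microphone and recognizing via the Cloud API (a minimal sketch using the params object built in 1.3.6):

//Built-in microphone + Cloud API (sketch)
getAsr().start(true, true, params);
//...speak...
getAsr().stop();//stop when the user finishes speaking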

Start Recognition (SDK External Audio Transfer)

//1. The method of external recording transmission
//1.1 start
getAsr().start(false, onlineAsr, params);
//1.2 feed audio
getAsr().feed(data, false);
//1.3 end
getAsr().stop();

//2. The method of transmitting audio by reading files externally
//2.1 start
getAsr().start(false, onlineAsr, params);
//2.2 feed audio
getAsr().feed(data, false);
//2.3 end (the final chunk must be sent as a tail packet: pass true as the second argument)
getAsr().feed(data, true);
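
A minimal sketch of the file-based approach, assuming 16 kHz 16-bit mono PCM; the file path and the 3200-byte chunk size (100 ms of audio) are illustrative, and if the SDK does not accept an empty tail packet, mark the last audio chunk itself as the tail instead:

//Read a local PCM file and feed it in chunks (sketch)
getAsr().start(false, onlineAsr, params);
try (FileInputStream in = new FileInputStream("/sdcard/test.pcm")) {
    byte[] buffer = new byte[3200];//100 ms of 16 kHz 16-bit mono PCM
    int len;
    while ((len = in.read(buffer)) > 0) {
        getAsr().feed(Arrays.copyOf(buffer, len), false);//intermediate packets
    }
    getAsr().feed(new byte[0], true);//tail packet (assumes an empty tail packet is accepted)
} catch (IOException e) {
    e.printStackTrace();
}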

Start Recognition (File Recognition Method)

getAsr().startPath(onlineAsr, params, "local audio file path");
//No need to stop manually; recognition stops automatically and the onStop callback fires when the file transfer is complete.

Force Sentence Ending

getAsr().sentenceEnd();

Customize Speaker Number

getAsr().speakerStart("speaker_name");

Stop Recognition

getAsr().stop();

Save Recognition Audio

getAsr().setSaveAudio(true);
//Saving is disabled by default. Set to true to save; the audio is written to context.getExternalFilesDir(null)/asrCache
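
To locate the saved recordings afterwards, the directory named in the comment above can be listed directly (a small sketch; only that path is assumed):

//List the audio files saved under getExternalFilesDir(null)/asrCache (sketch)
File cacheDir = new File(context.getExternalFilesDir(null), "asrCache");
File[] saved = cacheDir.listFiles();
if (saved != null) {
    for (File f : saved) {
        Log.d("Asr", "saved audio: " + f.getAbsolutePath());
    }
}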

2 SDK Download

Android SDK

Android Demo