Real-time Speech Recognition Android SDK

The SDK supports Android 5.0 and above. When multiple SDKs are integrated at the same time, their "so" files may conflict. In that case, unpack the aar package with an archive tool (such as 7-zip) and delete the conflicting "so" files.
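Before deleting anything, it can help to check which native libraries actually collide. An aar is an ordinary zip archive with native libraries stored under jni/<ABI>/, so its entries can be listed with java.util.zip. The following is a minimal sketch under that assumption; the class AarNativeLibs and its method names are illustrative and not part of the SDK.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Illustrative helper, not part of the SDK.
final class AarNativeLibs {
    private AarNativeLibs() {}

    /** Returns the entry paths of all .so files inside the given aar (a zip archive). */
    static List<String> listSoEntries(String aarPath) throws IOException {
        List<String> result = new ArrayList<>();
        try (ZipFile zip = new ZipFile(aarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".so")) {
                    result.add(entry.getName());
                }
            }
        }
        return result;
    }

    /** Returns the entry paths from the second list that also appear in the first. */
    static List<String> conflicts(List<String> first, List<String> second) {
        Set<String> seen = new HashSet<>(first);
        List<String> duplicated = new ArrayList<>();
        for (String path : second) {
            if (seen.contains(path)) {
                duplicated.add(path);
            }
        }
        return duplicated;
    }
}
```

Calling listSoEntries on asr-sdk.aar and on the other SDK's aar, then passing both lists to conflicts, yields the paths that are candidates for deletion from one of the two packages.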

Before using the SDK, please read the Interface Protocol first. For details, refer to Cloud API.

1 Integration Steps

1.1 Add aar Dependency

Place asr-sdk.aar in the project's libs directory and modify the build.gradle of the app module to add the aar file and okhttp as dependencies.

implementation fileTree(dir: "libs", include: ["*.jar","*.aar"])
implementation 'com.squareup.okhttp3:okhttp:4.10.0' // okhttp; older projects can use version 3.14.2 instead
implementation 'org.java-websocket:Java-WebSocket:1.5.3'
implementation 'com.google.code.gson:gson:2.10.1'

If you are using the 32-bit SDK, add the following code to defaultConfig:

    defaultConfig {
        // Add the following code
        externalNativeBuild {
            ndk {
                abiFilters "armeabi-v7a"
            }
        }
    }

1.2 Modify the AndroidManifest.xml File

<!--Recording permission-->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<!--Network permission-->
<uses-permission android:name="android.permission.INTERNET" />

1.3 Invocation Steps/Sample Code

1.3.1 Obtain Recording Permissions

    // The callback receives true when the user grants the permission
    registerForActivityResult(new ActivityResultContracts.RequestPermission(), success -> {
        tip(success);
    }).launch(Manifest.permission.RECORD_AUDIO);

1.3.2 Create a Speech Recognition Class

    private Transcriber transcriber; // Real-time speech recognition
    private Recognizer recognizer;   // Returned when type == Asr.Type.RECOGNIZER

    // Returns the engine that matches the current type
    private BaseAsr getAsr() {
        if (type == Asr.Type.RECOGNIZER) {
            return recognizer;
        }
        return transcriber;
    }

    // Initialize the SDK in onCreate
    transcriber = Transcriber.getInstance(activity);
    recognizer = Recognizer.getInstance(activity);

1.3.3 Initialize the Microphone

    getAsr().initRecorder(); // Initialize the microphone after creating the recognition class

1.3.4 Set Up Callbacks

(1) Callback Parameter Description

| Name | Type | Description | Return Parameters |
|---|---|---|---|
| onStart | Function | Called when the engine connection starts | Current task ID (String) |
| onResult | Function | Called with the engine's result content | Result data (String) |
| onIntermediateResult | Function | Called with the engine's intermediate results | Intermediate result data (String) |
| onWarning | Function | Called when the engine reports a warning | Task ID (String) and status code (Errors.Err) |
| onError | Function | Called when the engine reports an error | Task ID (String) and status code (Errors.Err) |
| onGetAudio | Function | Called with the audio data being recognized | Audio data (byte[]) |
| onStop | Function | Called when the engine ends | None |

(2) Parameter Example

 getAsr().setListener(new Asr.Listener() {
            @Override
            public void onStart(String taskId) {
                
            }

            @Override
            public void onError(String taskId, Errors.Err err) {

            }

            @Override
            public void onResult(String msg) {

            }

            @Override
            public void onIntermediateResult(String msg) {

            }

            @Override
            public void onWarning(String taskId, Errors.Err err) {

            }

            @Override
            public void onGetAudio(byte[] data) {

            }

            @Override
            public void onStop() {

            }
        });     
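A common use of onIntermediateResult and onResult together is to show a live transcript in which the in-progress sentence is overwritten until its final result arrives. The following is a minimal sketch of that bookkeeping. It assumes the display text has already been extracted from each callback message according to the Interface Protocol; the class TranscriptBuffer is illustrative and not part of the SDK.

```java
// Illustrative helper, not part of the SDK.
final class TranscriptBuffer {
    private final StringBuilder finalized = new StringBuilder();
    private String pending = "";

    /** Call from onIntermediateResult: replaces the in-progress sentence. */
    synchronized void onIntermediate(String text) {
        pending = (text == null) ? "" : text;
    }

    /** Call from onResult: appends the finished sentence and clears the in-progress one. */
    synchronized void onFinal(String text) {
        if (text != null) {
            finalized.append(text);
        }
        pending = "";
    }

    /** Text to display: all finished sentences followed by the in-progress one. */
    synchronized String display() {
        return finalized.toString() + pending;
    }
}
```

The methods are synchronized because the thread on which the SDK invokes its callbacks is not stated here, and the transcript is typically read from the UI thread.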

1.3.5 Activate Authorization

(1) Configuration Parameters

| Name | Type | Description | Default Value |
|---|---|---|---|
| onSuccess | Function | Callback method for successful initialization | None |
| onFail | Function | Callback method for failed initialization | None |

(2) Parameter Example

  @NonNull
    private Asr.InitListener getInitListener() {
        return new Asr.InitListener() {
            @Override
            public void onSuccess() {
                tip("onSuccess");
            }
            @Override
            public void onFail(Errors.Err err) {
              
            }
        };
    }

Online Authentication

   getAsr().initOnline(appId, appSecret, getInitListener());

1.3.6 Set Parameters

Interface Parameter Description

| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | None (required) |
| format | String | No | Audio encoding format | pcm |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | true |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | true |
| enable_inverse_text_normalization | Boolean | No | Whether to perform inverse text normalization (ITN) in post-processing | true |
| max_sentence_silence | Integer | No | Sentence-break detection threshold, in milliseconds. Silence longer than this threshold is treated as a sentence break. Valid range: 200 to 1200. | 450 |
| enable_words | Boolean | No | Whether to return word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | true |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight, in the range [0.1, 1.0] | 0.4 |
| correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs separated by a vertical bar; "all" means use all IDs. | None |
| forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs separated by a vertical bar; "all" means use all IDs. | None |
    
    JsonObject params = new JsonObject();
    params.addProperty("lang_type", langType);
    params.addProperty("sample_rate", 16000);
    params.addProperty("enable_intermediate_result", true);	
    params.addProperty("enable_punctuation_prediction", true);
    params.addProperty("enable_inverse_text_normalization", true);
    params.addProperty("max_sentence_silence", 800);
    params.addProperty("enable_words", true);
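Several parameters have documented ranges or formats (max_sentence_silence between 200 and 1200, hotwords_weight in [0.1, 1.0], multiple IDs joined by a vertical bar), so it may be convenient to normalize values before adding them to params. The sketch below shows one way to do that with plain Java. The class AsrParamHelper is illustrative and not part of the SDK, and clamping out-of-range values is a design choice made for this example; rejecting them would be equally valid.

```java
import java.util.List;

// Illustrative helper, not part of the SDK.
final class AsrParamHelper {
    private AsrParamHelper() {}

    /** Clamps max_sentence_silence to the documented range of 200 to 1200 ms. */
    static int clampSentenceSilence(int millis) {
        return Math.max(200, Math.min(1200, millis));
    }

    /** Clamps hotwords_weight to the documented range [0.1, 1.0]. */
    static float clampHotwordsWeight(float weight) {
        return Math.max(0.1f, Math.min(1.0f, weight));
    }

    /** Joins several vocabulary IDs with a vertical bar, as the ID parameters expect. */
    static String joinIds(List<String> ids) {
        return String.join("|", ids);
    }

    /** lang_type is the only required parameter, so fail early when it is missing. */
    static String requireLangType(String langType) {
        if (langType == null || langType.trim().isEmpty()) {
            throw new IllegalArgumentException("lang_type is required");
        }
        return langType;
    }
}
```

The returned values can then be passed to params.addProperty, for example params.addProperty("max_sentence_silence", AsrParamHelper.clampSentenceSilence(800)).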

1.3.7 Start/Stop Recognition

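| Name | Type | Description | Default Value |
|---|---|---|---|
| autoRecording | boolean | true: the SDK records from the microphone it initialized; false: the caller supplies the audio through feed | true |
| onlineAsr | boolean | true for Cloud API, false for On-Device | true |
| params | JsonObject | Recognition parameters as JSON | None |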

Start Recognition

   getAsr().start(autoRecording, onlineAsr, params);

Start Recognition (Audio Supplied from Outside the SDK)

        // 1. Feeding audio from an external recorder

        // 1.1 Start (autoRecording = false)
        getAsr().start(false, onlineAsr, params);
        // 1.2 Feed audio
        getAsr().feed(data, false);
        // 1.3 End
        getAsr().stop();

        // 2. Feeding audio read from a file

        // 2.1 Start (autoRecording = false)
        getAsr().start(false, onlineAsr, params);
        // 2.2 Feed audio
        getAsr().feed(data, false);
        // 2.3 End: the final chunk must be sent as the tail packet
        getAsr().feed(data, true);
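When audio is read from a file or buffer, it has to be split into chunks, with only the final chunk flagged as the tail packet. The following is a minimal sketch of that loop. The Feeder interface is a stand-in for getAsr().feed(data, isLast) so the example runs without the SDK, and the chunk-size arithmetic assumes 16-bit mono PCM, which this document does not state; AudioChunker and the 100 ms chunk length are illustrative choices.

```java
import java.util.Arrays;

// Illustrative helper, not part of the SDK.
final class AudioChunker {
    private AudioChunker() {}

    /** Stand-in for getAsr().feed(data, isLast). */
    interface Feeder {
        void feed(byte[] data, boolean isLast);
    }

    /** Bytes in one chunk of raw PCM: sampleRate * bytesPerSample * channels * chunkMs / 1000. */
    static int chunkBytes(int sampleRate, int bytesPerSample, int channels, int chunkMs) {
        return sampleRate * bytesPerSample * channels * chunkMs / 1000;
    }

    /**
     * Feeds the audio in chunks of at most chunkSize bytes and flags only the
     * final chunk as the tail packet. Returns the number of chunks fed.
     * Empty input feeds nothing, so no tail packet is sent in that case.
     */
    static int feedAll(byte[] audio, int chunkSize, Feeder feeder) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int count = 0;
        for (int offset = 0; offset < audio.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, audio.length);
            boolean isLast = (end == audio.length);
            feeder.feed(Arrays.copyOfRange(audio, offset, end), isLast);
            count++;
        }
        return count;
    }
}
```

With the SDK, the call would look like AudioChunker.feedAll(pcmBytes, AudioChunker.chunkBytes(16000, 2, 1, 100), (data, isLast) -> getAsr().feed(data, isLast)), where pcmBytes is the caller's own audio buffer.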

Start Recognition (File Recognition Method)

      getAsr().startPath(onlineAsr, params, "Local audio file path");
      // No manual stop is needed: when the file transfer completes, recognition stops automatically and onStop is called.

Force Sentence Ending

  getAsr().sentenceEnd();

Customize Speaker Number

   getAsr().speakerStart("speaker_name");

Stop Recognition

   getAsr().stop();

Save Recognition Audio

  getAsr().setSaveAudio(true);
  // Audio is not saved by default. When set to true, it is saved under context.getExternalFilesDir(null)/asrCache

2 SDK Download

Android SDK

Android Demo