Short Speech Recognition iOS SDK

  • Supports iOS versions 11.0 and above.
  • Before using the SDK, please read the Interface Protocol first. For details, refer to Cloud API.

1 Integration Steps

  1. Manual import: Drag SpeechEvaluate.framework into your project. Then, under General -> Frameworks, Libraries, and Embedded Content, set the Embed option for SpeechRecognitionSDK.framework to Embed & Sign.
  2. Ensure the following pods are installed: SocketRocket 0.6.0 and AFNetworking.

Add the Privacy - Microphone Usage Description key (NSMicrophoneUsageDescription) to your project's Info.plist file so that the app can request microphone access.
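The Info.plist entry only supplies the text shown in the system prompt; the app still has to ask for permission at runtime. A minimal sketch using AVFoundation, assuming you want to request access before starting recognition (the helper name RequestMicrophoneAccess is hypothetical and not part of the SDK):

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: asks the user for microphone access and reports the
// answer on the main queue.
static void RequestMicrophoneAccess(void (^completion)(BOOL granted)) {
    [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
        // The permission callback is not guaranteed to run on the main thread,
        // so hop to the main queue before touching UI or starting the SDK.
        dispatch_async(dispatch_get_main_queue(), ^{
            completion(granted);
        });
    }];
}
```

requestRecordPermission: is available on the iOS versions this SDK supports; Apple deprecated it in iOS 17 in favor of AVAudioApplication, so newer deployment targets may prefer that API.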

1.2 Invocation Steps/Sample Code

In the file that uses speech recognition, adopt the EvalListener delegate protocol.

// Configure main parameters
SDKParams *params = [[SDKParams alloc] init];
params.appId = @"";
params.appSecret = @"";
params.sample_rate = 16000; // Audio sampling rate
params.format = @"pcm"; // Audio encoding format
params.realtime = YES; // YES: Real-time Recognition; NO: Short Speech Recognition
params.langType = @"zh-cmn-Hans-CN"; // Language type (required)
params.enable_intermediate_result = YES; // Whether to return intermediate results
params.enable_punctuation_prediction = YES; // Whether to add punctuation in post-processing
params.max_sentence_silence = 450; // Sentence-break detection threshold in ms. Silence longer than this is treated as a sentence break. Valid range: 200~1200 ms; default: 450 ms
params.enable_words = YES; // Whether to return word information

1.2.1 Create a Speech Recognition Class and Grant Authorization

| Name | Type | Description |
|---|---|---|
| listener | id | Recognition class |
| params | SDKRecognitionParams | Parameters and configuration |

// Initialize the engine
SpeechRecognition *speechManger = [[SpeechRecognition alloc] init];
[speechManger setInitSDK:self params:params];
self.speechManger = speechManger;

1.2.2 Callback Methods

| Name | Type | Description |
|---|---|---|
| onRecognitionStart | String | Called when the engine connection starts |
| onRecognitionResult | String | Called when the engine returns recognition results |
| onRecognitionRealtimeResult | String | Called when the engine returns intermediate results |
| onRecognitionWarning | String | Called when the engine returns a warning |
| onRecognitionError | String | Called when the engine returns an error |

/**
 * Return intermediate recognition results
 */
- (void) onRecognitionRealtimeResult: (NSString *) result;
/**
 * Return recognition results
 */
- (void) onRecognitionResult: (NSString *) result;
/**
 * Indicates successful start of recording
 */
- (void) onRecognitionStart: (NSString *) taskId;

/**
 * Indicates successful end of recognition
 */
- (void) onRecognitionStop;

/**
 * Return real-time recorded audio data
 */
- (void) onRecognitionGetAudio: (NSData *)data;

/**
 * Error callback, return error code and message
 */
- (void) onRecognitionError: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;

/**
 * Warning callback
 */
- (void) onRecognitionWarning: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;
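As an illustration, the callbacks above could be implemented in a view controller roughly as follows. This is a sketch, not the SDK's reference implementation: the class name RecognitionViewController is hypothetical, the SDK header name is not given in this document and must be filled in, and whether every method is required by EvalListener is an assumption.

```objc
#import <UIKit/UIKit.h>
// Also import the SDK header that declares EvalListener and SpeechRecognition
// (the header name is not stated in this document).

// Hypothetical view controller that adopts the EvalListener protocol.
@interface RecognitionViewController : UIViewController <EvalListener>
@property (nonatomic, strong) SpeechRecognition *speechManger;
@end

@implementation RecognitionViewController

- (void)onRecognitionStart:(NSString *)taskId {
    NSLog(@"Recognition started, taskId=%@", taskId);
}

- (void)onRecognitionRealtimeResult:(NSString *)result {
    NSLog(@"Intermediate result: %@", result);
}

- (void)onRecognitionResult:(NSString *)result {
    NSLog(@"Result: %@", result);
}

- (void)onRecognitionStop {
    NSLog(@"Recognition stopped");
}

- (void)onRecognitionGetAudio:(NSData *)data {
    // Receives the audio recorded by the SDK; here we only log its size.
    NSLog(@"Received %lu bytes of audio", (unsigned long)data.length);
}

- (void)onRecognitionError:(NSString *)code msg:(NSString *)msg taskId:(nullable NSString *)taskId {
    NSLog(@"Error %@: %@ (taskId=%@)", code, msg, taskId);
}

- (void)onRecognitionWarning:(NSString *)code msg:(NSString *)msg taskId:(nullable NSString *)taskId {
    NSLog(@"Warning %@: %@ (taskId=%@)", code, msg, taskId);
}

@end
```

This document does not say which thread the callbacks arrive on, so dispatching to the main queue before updating UI is a reasonable precaution.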

1.2.3 Parameter Description

| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | Required |
| format | String | No | Audio encoding format | pcm |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | false |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | false |
| enable_inverse_text_normalization | Boolean | No | Whether to perform ITN in post-processing | false |
| max_sentence_silence | Integer | No | Sentence-break detection threshold. Silence longer than this threshold is treated as a sentence break. Valid range: 200~1200 ms | 450 |
| enable_words | Boolean | No | Whether to return word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | false |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight. Range: [0.1, 1.0] | 0.4 |
| correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs, separated by a vertical bar; all indicates using all IDs | None |
| forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs, separated by a vertical bar; all indicates using all IDs | None |
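Two of the parameters above have documented ranges: max_sentence_silence (200~1200 ms) and hotwords_weight ([0.1, 1.0]). This document does not say how the service treats out-of-range values, so one defensive option is to clamp them on the client before assigning. A minimal sketch; the helper names are hypothetical and not part of the SDK:

```objc
#import <Foundation/Foundation.h>
#import <math.h>

// Hypothetical helper: keeps the sentence-break threshold within 200~1200 ms.
static NSInteger ClampedSentenceSilence(NSInteger milliseconds) {
    return MAX(200, MIN(1200, milliseconds));
}

// Hypothetical helper: keeps the hotwords weight within [0.1, 1.0].
static float ClampedHotwordsWeight(float weight) {
    return fmaxf(0.1f, fminf(1.0f, weight));
}
```

For example, params.max_sentence_silence = ClampedSentenceSilence(150); would assign 200 rather than an out-of-range value.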

1.2.4 Start/Stop Recognition

<1> Start recognition (audio recorded internally by the SDK)
    [self.speechManger startRecording];

End recognition
    [self.speechManger stopRecording];

<2> File recognition (pass the local path of the audio file directly)
    [self.speechManger startRecognitionOralWithWavPath:@"wav audio path"];

<3> Audio data recognition (audio recorded outside the SDK, or a file converted to NSData)
- (void)doStart:(FinishBlock)finishBlock;
- (BOOL)doSetData:(NSData *)data isLast:(bool)isLast;

Usage
[self.speechManger doStart:^(_Bool success) {
    if (success) {
        // isLastSegment is supplied by your code: YES only when data holds the final audio segment.
        if (isLastSegment) {
            [self.speechManger doSetData:data isLast:YES];
        } else {
            [self.speechManger doSetData:data isLast:NO];
        }
    }
}];
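When the audio is already in memory as one NSData object, it can be split into segments and passed to doSetData:isLast: one at a time, marking only the final segment with isLast:YES. The sketch below rests on assumptions: the helper names SplitAudio and FeedAudio are hypothetical, the 3200-byte segment size (100 ms of 16 kHz, 16-bit mono PCM) is an illustrative choice rather than a documented requirement, and calling doSetData:isLast: back to back without pacing is not confirmed by this document.

```objc
#import <Foundation/Foundation.h>
// Also import the SDK header that declares SpeechRecognition
// (the header name is not stated in this document).

// Hypothetical helper: splits a buffer into fixed-size chunks.
// The last chunk may be shorter than chunkSize.
static NSArray<NSData *> *SplitAudio(NSData *audio, NSUInteger chunkSize) {
    if (chunkSize == 0) {
        return @[];  // Avoid an endless loop on a zero chunk size.
    }
    NSMutableArray<NSData *> *chunks = [NSMutableArray array];
    for (NSUInteger offset = 0; offset < audio.length; offset += chunkSize) {
        NSUInteger length = MIN(chunkSize, audio.length - offset);
        [chunks addObject:[audio subdataWithRange:NSMakeRange(offset, length)]];
    }
    return chunks;
}

// Hypothetical helper: starts a session and feeds every chunk in order.
static void FeedAudio(SpeechRecognition *manager, NSData *pcmData) {
    NSArray<NSData *> *chunks = SplitAudio(pcmData, 3200);  // Assumed segment size.
    [manager doStart:^(_Bool success) {
        if (!success) {
            return;
        }
        [chunks enumerateObjectsUsingBlock:^(NSData *chunk, NSUInteger idx, BOOL *stop) {
            // Only the final chunk is sent with isLast:YES.
            BOOL isLast = (idx == chunks.count - 1);
            if (![manager doSetData:chunk isLast:isLast]) {
                NSLog(@"doSetData returned NO for chunk %lu", (unsigned long)idx);
            }
        }];
    }];
}
```

Note that an empty buffer produces no chunks, so nothing is sent with isLast:YES in that case; callers may want to check pcmData.length first.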

1.2.5 Force Sentence Ending

[self.speechManger sentenceEnd];

1.2.6 Customize Speaker

[self.speechManger speakerStart:@"speaker_name"];

2 SDK Download

iOS SDK

iOS Demo