Short Speech Recognition iOS SDK
- Supports iOS versions 11.0 and above.
- Before using the SDK, please read the interface protocol first. For details, refer to the Cloud API documentation.
1 Integration Steps
- Manual import: Drag SpeechEvaluate.framework into your project. Then, under General -> Frameworks, Libraries, and Embedded Content, change the Embed setting for SpeechRecognitionSDK.framework to Embed & Sign.
- Ensure the following pods are installed: SocketRocket 0.6.0, AFNetworking.
1.1 Add App-Related Permissions
Add the Privacy - Microphone Usage Description key (NSMicrophoneUsageDescription) to your project's Info.plist file to request microphone access permission.
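For reference, the corresponding entry in the Info.plist source view might look like the following (the usage string shown here is illustrative; use wording appropriate for your app):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>Microphone access is required for speech recognition.</string>
```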
1.2 Invocation Steps/Sample Code
In the file that requires the speech recognition function, conform to the EvalListener delegate protocol.
//Configure main parameters
SDKParams *params = [[SDKParams alloc] init];
params.appId = @"";
params.appSecret = @"";
params.sample_rate = 16000;//Sampling rate
params.format = @"pcm";//Audio encoding format
params.realtime = YES;//Whether it is Real-time Recognition, true represents Real-time Recognition, false represents Short Speech Recognition
params.langType = @"zh-cmn-Hans-CN";//Required Language Type
params.enable_intermediate_result = YES;//Whether to return intermediate results
params.enable_punctuation_prediction = YES;//Whether to add punctuation in post-processing
params.max_sentence_silence = 450;//Speech sentence breaking detection threshold. Silence longer than this threshold is considered a sentence break. The valid range is 200~1200 (ms); the default value is 450 ms
params.enable_words = YES;//Whether to return word information

1.2.1 Create a Speech Recognition Class and Grant Authorization
| Name | Type | Description |
|---|---|---|
| listener | id | Delegate object that receives recognition callbacks |
| params | SDKParams | Parameters and configuration |
//Initialize the engine
SpeechRecognition *speechManger = [[SpeechRecognition alloc] init];
[speechManger setInitSDK:self params:params];
self.speechManger = speechManger;

1.2.2 Callback Methods
| Name | Parameter Type | Description |
|---|---|---|
| onRecognitionStart | String | Callback when the engine connection starts |
| onRecognitionResult | String | Callback when the engine returns final results |
| onRecognitionRealtimeResult | String | Callback when the engine returns intermediate results |
| onRecognitionStop | - | Callback when recognition ends |
| onRecognitionGetAudio | NSData | Callback returning real-time recorded audio data |
| onRecognitionWarning | String | Callback when the engine returns a warning |
| onRecognitionError | String | Callback when the engine returns an error |
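A minimal sketch of implementing some of these callbacks in a class that conforms to EvalListener might look like this (the class context and logging are illustrative; the format of the result payload is defined by the Cloud API):

```objectivec
// Illustrative implementations inside a class that adopts EvalListener
- (void)onRecognitionStart:(NSString *)taskId {
    NSLog(@"Recognition started, taskId: %@", taskId);
}

- (void)onRecognitionRealtimeResult:(NSString *)result {
    // Intermediate hypotheses; delivered only when enable_intermediate_result is YES
    NSLog(@"Intermediate result: %@", result);
}

- (void)onRecognitionResult:(NSString *)result {
    NSLog(@"Final result: %@", result);
}

- (void)onRecognitionStop {
    NSLog(@"Recognition finished");
}

- (void)onRecognitionError:(NSString *)code msg:(NSString *)msg taskId:(nullable NSString *)taskId {
    NSLog(@"Error %@: %@ (taskId: %@)", code, msg, taskId);
}
```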
/**
* Return intermediate recognition results
*/
- (void) onRecognitionRealtimeResult: (NSString *) result;
/**
* Return recognition results
*/
- (void) onRecognitionResult: (NSString *) result;
/**
* Indicates successful start of recording
*/
- (void) onRecognitionStart: (NSString *) taskId;
/**
* Indicates successful end of recognition
*/
- (void) onRecognitionStop;
/**
* Return real-time recorded audio data
*/
- (void) onRecognitionGetAudio: (NSData *)data;
/**
* Error callback, return error code and message
*/
- (void) onRecognitionError: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;
/**
* Warning callback
*/
- (void) onRecognitionWarning: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;

1.2.3 Parameter Description
| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | None |
| format | String | No | Audio encoding format | pcm |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | false |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | false |
| enable_inverse_text_normalization | Boolean | No | Whether to perform ITN in post-processing | false |
| max_sentence_silence | Integer | No | Speech sentence breaking detection threshold. Silence longer than this threshold is considered as a sentence break. The valid parameter range is 200~1200. Unit: Milliseconds | 450 |
| enable_words | Boolean | No | Whether to return word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | false |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight, the range is [0.1, 1.0] | 0.4 |
| correction_words_id | String | No | Forced correction vocabulary ID. Supports multiple IDs, separated by a vertical bar; all indicates using all IDs. | None |
| forbidden_words_id | String | No | Forbidden words ID. Supports multiple IDs, separated by a vertical bar; all indicates using all IDs. | None |
1.2.4 Start/Stop Recognition
<1> Start Recognition (internal recording by the SDK)
[self.speechManger startRecording];
End Recognition
[self.speechManger stopRecording];
<2> File Recognition (pass the local path of the audio file directly)
[self.speechManger startRecognitionOralWithWavPath:@"wav audio path"];
<3> Audio Data Recognition (recording handled outside the SDK, or a file converted to NSData for recognition)
- (void)doStart:(FinishBlock)finishBlock;
- (BOOL)doSetData:(NSData *) data isLast:(bool)isLast;
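The two methods above can be combined to stream audio in chunks. The following sketch reads a local PCM file and feeds it to the engine in fixed-size segments (the file path and chunk size are illustrative assumptions; 3200 bytes corresponds to 100 ms of 16 kHz, 16-bit mono audio):

```objectivec
// Illustrative: stream a local PCM file to the recognizer in chunks
NSData *audio = [NSData dataWithContentsOfFile:@"/path/to/audio.pcm"];
NSUInteger chunkSize = 3200; // 100 ms at 16000 Hz, 16-bit mono

[self.speechManger doStart:^(_Bool success) {
    if (!success) { return; }
    for (NSUInteger offset = 0; offset < audio.length; offset += chunkSize) {
        NSUInteger len = MIN(chunkSize, audio.length - offset);
        NSData *chunk = [audio subdataWithRange:NSMakeRange(offset, len)];
        BOOL isLast = (offset + len >= audio.length); // YES for the final chunk
        [self.speechManger doSetData:chunk isLast:isLast];
    }
}];
```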
Call Method
[self.speechManger doStart:^(_Bool success) {
    if (success) {
        // isLastSegment: YES when this is the final chunk of audio
        if (isLastSegment) {
            [self.speechManger doSetData:data isLast:YES];
        } else {
            [self.speechManger doSetData:data isLast:NO];
        }
    }
}];

1.2.5 Force Sentence Ending
[self.speechManger sentenceEnd];

1.2.6 Customize Speaker
[self.speechManger speakerStart:@"speaker_name"];