iOS SDK
Note:
Supports iOS versions 11.0 and above.
Only activation via internet connection is supported.
1 Integration Steps
Note: If you are interested in trying the on-device recognition feature, please get in touch with our business contact in advance to receive an APPID. Also, please specify the language type you want to try. Contact: voice.contact@dolphin-ai.jp.
- Manual import: Drag SpeechEvaluate.framework into your project. Then, under General -> Frameworks, Libraries, and Embedded Content, change the Embed setting for SpeechRecognitionSDK.framework to Embed & Sign.
- Ensure the following pods are included: SocketRocket 0.6.0, AFNetworking, SSZipArchive 2.4.0.
- Offline SDK model file: Place the model file (.zip) in the framework folder.
1.1 Add App-Related Permissions
- Add Privacy - Microphone Usage Description (NSMicrophoneUsageDescription) to your project's Info.plist file so that the app can request microphone access.
1.2 Invocation Steps/Sample Code
In the file that requires the recognition function, conform to the EvalListener delegate protocol.
// Configure the main parameters
SDKParams *params = [[SDKParams alloc] init];
params.appId = @"";
params.appSecret = @"";
params.sample_rate = 16000;
params.format = @"pcm";
params.realtime = NO;
params.langType = @"ja-JP";
params.enable_intermediate_result = YES;
params.enable_punctuation_prediction = YES;
params.max_sentence_silence = 450;
params.enable_words = YES;
/**
 sourcePath: custom path to the model directory.
 Example: Documents/userPath/model. The part of the path before "model" can be named arbitrarily.
 The vad files must be placed under the model directory, for example: Documents/userPath/model/vad
 The language files must be placed under model/asr, for example: Documents/userPath/model/asr/ja-JP
 */
params.sourcePath = [[NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) firstObject] stringByAppendingPathComponent:@"/userPath/model"];
1.2.1 Create a Speech Recognition Class and Grant Authorization
| Name | Type | Description |
|---|---|---|
| SpeechRecognition | SpeechRecognition | Recognition class |
| params | SDKParams | Parameters and settings |
params.initType = YES;
params.online = NO;
SpeechRecognition *speechManger = [[SpeechRecognition alloc] init];
[speechManger setInitSDK:self params:params];
self.speechManger = speechManger;
1.2.2 Callback Method Description
| Name | Type | Description |
|---|---|---|
| onStart | String | Callback method when engine connection starts |
| onResult | String | Callback method when engine returns results |
| onRealtimeResult | String | Callback method when engine returns intermediate results |
| onWarning | String | Callback method when engine returns a result warning |
| onError | String | Callback method when engine returns a result error |
| onMachineCode | String | Engine returns machine code |
- (void) onRealtimeResult: (NSString *) result;
- (void) onResult: (NSString *) result;
- (void) onStart: (NSString *) taskId;
- (void) onStop;
- (void) onGetAudio: (NSData *)data;
- (void) onError: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;
- (void) onWarning: (NSString *)code msg:(NSString*)msg taskId:(nullable NSString*)taskId;
- (void) onMachineCode: (NSString *)MachineCode;
1.2.3 Interface Parameter Description
| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| lang_type | String | Yes | Language option | Required |
| format | String | No | Audio encoding format | pcm |
| sample_rate | Integer | No | Audio sampling rate | 16000 |
| enable_intermediate_result | Boolean | No | Whether to return intermediate recognition results | false |
| enable_punctuation_prediction | Boolean | No | Whether to add punctuation in post-processing | false |
| enable_inverse_text_normalization | Boolean | No | Whether to perform ITN in post-processing | false |
| max_sentence_silence | Integer | No | Silence threshold for sentence-break detection. Silence longer than this threshold is treated as a sentence break. Valid range: 200 to 1200. Unit: milliseconds | 450 |
| enable_words | Boolean | No | Whether to return word information | false |
| enable_modal_particle_filter | Boolean | No | Whether to enable modal particle filtering | false |
| hotwords_id | String | No | Hotwords ID | None |
| hotwords_weight | Float | No | Hotwords weight, the range is [0.1, 1.0] | 0.4 |
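The table gives valid ranges for max_sentence_silence (200 to 1200 ms) and hotwords_weight (0.1 to 1.0), but it does not say how the SDK reacts to out-of-range values. A minimal sketch of checking values before they are assigned to SDKParams follows, written in plain C so that it compiles unchanged inside an Objective-C file. The helper names are hypothetical and are not part of the SDK.

```c
#include <stdbool.h>

/* Ranges copied from the parameter table above. */
#define MAX_SENTENCE_SILENCE_MIN_MS 200
#define MAX_SENTENCE_SILENCE_MAX_MS 1200
#define HOTWORDS_WEIGHT_MIN 0.1
#define HOTWORDS_WEIGHT_MAX 1.0

/* Hypothetical helper: true when ms lies inside the documented range. */
static bool is_valid_max_sentence_silence(int ms) {
    return ms >= MAX_SENTENCE_SILENCE_MIN_MS && ms <= MAX_SENTENCE_SILENCE_MAX_MS;
}

/* Hypothetical helper: true when weight lies inside the documented range. */
static bool is_valid_hotwords_weight(double weight) {
    return weight >= HOTWORDS_WEIGHT_MIN && weight <= HOTWORDS_WEIGHT_MAX;
}

/* Hypothetical helper: pulls an out-of-range value back to the nearest bound. */
static int clamp_max_sentence_silence(int ms) {
    if (ms < MAX_SENTENCE_SILENCE_MIN_MS) {
        return MAX_SENTENCE_SILENCE_MIN_MS;
    }
    if (ms > MAX_SENTENCE_SILENCE_MAX_MS) {
        return MAX_SENTENCE_SILENCE_MAX_MS;
    }
    return ms;
}
```

In an Objective-C caller this could be used as params.max_sentence_silence = clamp_max_sentence_silence(userValue); where userValue is whatever the app read from its own settings.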
1.2.4 Start/Stop Recognition
<1>Start Recognition (Internal Recording by SDK)
[self.speechManger startRecording];
Stop Recognition
[self.speechManger stopRecording];
<2>File Recognition (pass the local path of the audio file directly)
[self.speechManger startRecognitionOralWithWavPath:@"wav audio path"];
<3>Audio Data Recognition (audio recorded outside the SDK, or a file converted to NSData)
- (void)doStart:(FinishBlock)finishBlock;
- (BOOL)doSetData:(NSData *) data isLast:(bool)isLast;
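When audio is supplied through doSetData:isLast:, the caller decides how the audio is cut into segments and which segment carries isLast = YES. The sketch below shows one way to do that bookkeeping in plain C, which compiles unchanged inside an Objective-C file. The segment_sink callback type, feed_in_segments, and counting_sink are hypothetical names used only for illustration. In a real app the sink would wrap each segment in NSData and forward it to doSetData:isLast: once doStart has reported success.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for doSetData:isLast:. Return false to stop feeding. */
typedef bool (*segment_sink)(const uint8_t *bytes, size_t length, bool is_last, void *context);

/*
 * Cuts `audio` into segments of at most `segment_size` bytes and hands each one to `sink`.
 * Only the final segment is flagged with is_last. Returns how many segments the sink accepted.
 */
static size_t feed_in_segments(const uint8_t *audio, size_t total, size_t segment_size,
                               segment_sink sink, void *context) {
    if (audio == NULL || total == 0 || segment_size == 0 || sink == NULL) {
        return 0;
    }
    size_t accepted = 0;
    size_t offset = 0;
    while (offset < total) {
        size_t remaining = total - offset;
        size_t length = remaining < segment_size ? remaining : segment_size;
        bool is_last = (offset + length == total);
        if (!sink(audio + offset, length, is_last, context)) {
            break; /* the sink rejected this segment, so stop here */
        }
        accepted++;
        offset += length;
    }
    return accepted;
}

/* Example sink that only records what it received. */
typedef struct {
    size_t segments;
    size_t bytes;
    size_t last_flags;
} feed_stats;

static bool counting_sink(const uint8_t *bytes, size_t length, bool is_last, void *context) {
    (void)bytes; /* contents are not inspected in this example */
    feed_stats *stats = (feed_stats *)context;
    stats->segments++;
    stats->bytes += length;
    if (is_last) {
        stats->last_flags++;
    }
    return true;
}
```

Flagging the last segment by position rather than by a separate call keeps the final doSetData call and the isLast marker together, which matches the shape of the method signature above.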
Usage
[self.speechManger doStart:^(_Bool success) {
    if (success) {
        // isLastSegment is a placeholder: YES when data is the final audio segment
        if (isLastSegment) {
            [self.speechManger doSetData:data isLast:YES];
        } else {
            [self.speechManger doSetData:data isLast:NO];
        }
    }
}];
1.2.5 Load Hotwords
[self.speechManger loadWords];
1.2.6 Destroy on Exit
//Release the current model
[self.speechManger offLineRelese];
//Release all models
[self.speechManger offLineReleseAll];
1.2.7 Cancel the generation of logs
To stop log files from being generated, set the following parameters in model/asr/ja-JP/config:
log-level = 6
log-rotate-days=0
2 Status Code Table
| Error Code | Error Message | Description | Solution |
|---|---|---|---|
| 110100 | Unauthorized | Unauthorized | Change the parameter to an authorized capability or contact the business department to add AI capabilities |
| 110101 | Invalid Activation Code | Activation key is incorrect | Enter the correct activation key |
| 110102 | Model Count Exceeds Limit | The number of models exceeds the limit | Please contact the business department |
| 110103 | Authorization Expired | Authorization has expired | Contact the business department to extend the authorization period |
| 110104 | Authorization Failed | Authorization failed | Please contact the business department |
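The onError callback delivers its code as a string, so an app may want to turn the codes in this table into something it can show or log. The sketch below is a plain C lookup that compiles unchanged inside an Objective-C file. It assumes the code argument of onError carries exactly the values listed above, which the table suggests but does not state. The status_entry type and find_status function are hypothetical names, not part of the SDK.

```c
#include <stddef.h>
#include <string.h>

/* One row of the status code table. */
typedef struct {
    const char *code;
    const char *message;
    const char *solution;
} status_entry;

/* Rows copied from the status code table above. */
static const status_entry STATUS_TABLE[] = {
    {"110100", "Unauthorized",
     "Change the parameter to an authorized capability or contact the business department to add AI capabilities"},
    {"110101", "Invalid Activation Code", "Enter the correct activation key"},
    {"110102", "Model Count Exceeds Limit", "Please contact the business department"},
    {"110103", "Authorization Expired",
     "Contact the business department to extend the authorization period"},
    {"110104", "Authorization Failed", "Please contact the business department"},
};

/* Hypothetical helper: returns the matching row, or NULL for an unknown or missing code. */
static const status_entry *find_status(const char *code) {
    if (code == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < sizeof STATUS_TABLE / sizeof STATUS_TABLE[0]; i++) {
        if (strcmp(STATUS_TABLE[i].code, code) == 0) {
            return &STATUS_TABLE[i];
        }
    }
    return NULL;
}
```

From Objective-C, the string received in onError could be passed in as find_status(code.UTF8String), with a NULL result treated as an unlisted code.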
3 SDK Download & On-Device Model Download
3.1 SDK Download
The download link for the SDK is as follows: iOS SDK
3.2 On-Device Model Download
| Language Type | Language Code | Download Link | Remarks |
|---|---|---|---|
| Japanese - General | ja-JP | Japanese - General | Japanese only |
| Japanese - Hotel | ja-JP | Japanese - Hotel | Supports mixed speech of Japanese and English |
| English | en-US | English | |
| Mandarin | zh-cmn-Hans-CN | Mandarin | Supports mixed speech of Mandarin and English |
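After a model is downloaded, it has to sit in the directory layout described in section 1.2: the vad files under model/vad and each language under model/asr/ followed by its language code. The sketch below builds those paths in plain C, which compiles unchanged inside an Objective-C file, and rejects language codes that are not in the table above. The function names are hypothetical and not part of the SDK. The list of codes reflects only the rows shown here.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Language codes copied from the model download table above. */
static const char *const LISTED_LANGUAGE_CODES[] = {"ja-JP", "en-US", "zh-cmn-Hans-CN"};

/* Hypothetical helper: true when code appears in the table. */
static bool is_listed_language_code(const char *code) {
    if (code == NULL) {
        return false;
    }
    for (size_t i = 0; i < sizeof LISTED_LANGUAGE_CODES / sizeof LISTED_LANGUAGE_CODES[0]; i++) {
        if (strcmp(LISTED_LANGUAGE_CODES[i], code) == 0) {
            return true;
        }
    }
    return false;
}

/* Writes "<model_root>/asr/<code>" into out. Fails on an unlisted code or a buffer that is too small. */
static bool build_language_model_path(const char *model_root, const char *code,
                                      char *out, size_t out_size) {
    if (model_root == NULL || out == NULL || out_size == 0 || !is_listed_language_code(code)) {
        return false;
    }
    int written = snprintf(out, out_size, "%s/asr/%s", model_root, code);
    return written >= 0 && (size_t)written < out_size;
}

/* Writes "<model_root>/vad" into out. Fails when the buffer is too small. */
static bool build_vad_path(const char *model_root, char *out, size_t out_size) {
    if (model_root == NULL || out == NULL || out_size == 0) {
        return false;
    }
    int written = snprintf(out, out_size, "%s/vad", model_root);
    return written >= 0 && (size_t)written < out_size;
}
```

Here model_root corresponds to the value assigned to params.sourcePath, for example a path ending in userPath/model.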