
FAQs

1 General Questions

1. How do I apply for a DolphinVoice account?

Navigate to the DolphinVoice website and click on the login button to register. For detailed steps, please refer to the Quick Start.

2. How do I create a project?

Once your account application is approved, a project is created automatically by default. If you need another project, you can create one by clicking the New Project button.

Note: Each project has all AI capabilities enabled by default.

3. Where can I check the AppID and AppSecret?

Please refer to the Connection URL on the DolphinVoice User Center.

4. Does the existing Token become invalid when a new one is obtained?

Obtaining a new Token does not affect the validity of a Token you already have. A Token's validity depends only on its validity period.

5. How long is the default validity period for the Token?

The default validity period for the Token is 7 days. If the capabilities of the APPID in use are changed, the previously generated Token will become invalid and need to be regenerated.
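Putting questions 4 and 5 together: a client can cache a token and refresh it only once the 7-day validity window has passed. The sketch below is a minimal illustration of that policy; `fetch_token` is a placeholder for whatever token-creation call your integration uses, not a real DolphinVoice API.

```python
import time

TOKEN_TTL_SECONDS = 7 * 24 * 3600  # default 7-day validity stated in the FAQ


class TokenCache:
    """Caches a token and refreshes it only after its validity window ends.

    Per the FAQ, fetching a new token does not invalidate the old one,
    so refreshing late rather than eagerly is safe.
    """

    def __init__(self, fetch_token):
        self._fetch = fetch_token  # placeholder for your token-creation call
        self._token = None
        self._issued_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        # Refresh only when no token is cached or the TTL has elapsed.
        if self._token is None or now - self._issued_at >= TOKEN_TTL_SECONDS:
            self._token = self._fetch()
            self._issued_at = now
        return self._token
```

Remember that per question 5, changing the capabilities of the AppID also invalidates existing tokens, so a real client should additionally refresh on a Token Expired or Invalid Token error.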

2 Speech Synthesis

1. What are the commonly used terms in speech synthesis, and what do they mean?

Concurrency: The number of simultaneous service requests, i.e., the number of requests being processed by the backend service at a given moment.

Real-Time Factor (RTF): For synthesis scenarios, RTF = Synthesis Time / Audio Duration.

First Packet Delay: The delay from when the user initiates a synthesis call to when they first receive the audio data.

Synthesis Delay: The delay until the user has received the audio data they need. For example, if the user needs the complete audio before playback, the synthesis delay is the overall delay of the request; if the user can use partial audio data (e.g., feeding it into a player as it arrives), the synthesis delay equals the first packet delay.

Streaming Synthesis: The text is transmitted all at once, and audio is returned as it is produced. The delay the user perceives is the first packet delay.

Non-Streaming Synthesis: The text is transmitted all at once, and the audio is returned only after the backend has fully processed it. The delay the user perceives is strongly related to the RTF and the length of the synthesized audio: Synthesis Time = RTF * Audio Duration.
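The definitions above can be made concrete with a small calculation. The RTF value of 0.2 below is an illustrative assumption, not a measured figure for this service.

```python
def synthesis_time(rtf: float, audio_duration_s: float) -> float:
    """Estimated processing time for a non-streaming request:
    Synthesis Time = RTF * Audio Duration (from the definitions above)."""
    return rtf * audio_duration_s


# Example: an engine with an assumed RTF of 0.2 producing 10 s of audio.
# Non-streaming: the user waits for the full synthesis time (2.0 s here).
non_streaming_delay = synthesis_time(0.2, 10.0)

# Streaming: the perceived delay is only the first packet delay,
# regardless of the total audio length.
```

This is why streaming is preferred for interactive playback: the wait no longer grows with the length of the synthesized audio.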

1. What languages does speech synthesis support?

The platform currently supports three languages: Japanese, English, and Chinese.

2. How many voices does speech synthesis support?

Currently, speech synthesis supports 19 voices: 3 Japanese, 2 English, and 14 Chinese. You can choose freely according to your business needs. Each voice supports adjusting parameters such as volume and speech rate. For details, see Speech Synthesis - Development Guide.

3. What sampling rates does speech synthesis support?

The speech synthesis service currently supports audio with sampling rates of 8000 Hz, 16000 Hz, and 24000 Hz.
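The sampling rate determines the size of raw PCM output, which matters when buffering streamed audio. The sketch below assumes 16-bit mono PCM; the bit depth and channel count are assumptions for illustration, not figures stated in this FAQ.

```python
def pcm_bytes_per_second(sample_rate_hz: int,
                         bits_per_sample: int = 16,
                         channels: int = 1) -> int:
    """Raw PCM data rate in bytes per second of audio.

    Assumes 16-bit mono by default; adjust to match the actual
    format your request specifies.
    """
    return sample_rate_hz * bits_per_sample // 8 * channels


# At 16000 Hz, 16-bit mono PCM, one second of audio is 32000 bytes;
# at 24000 Hz it is 48000 bytes.
```

A client can use this to size receive buffers or to convert a byte count of received audio back into a playback duration.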

4. What interface types does speech synthesis support?

Speech synthesis supports both streaming (WebSocket) and non-streaming (HTTP) interfaces. Non-streaming speech synthesis returns the audio data after the entire sentence has been synthesized, while streaming speech synthesis returns the audio data while synthesizing.

5. What text encoding formats does speech synthesis support?

The input text for speech synthesis must be UTF-8 encoded and cannot exceed 1024 bytes, approximately 300 Chinese characters.

6. What audio encoding formats does the speech synthesis service support?

It supports outputting audio in WAV, PCM, and MP3 formats. Note: streaming synthesis does not support the WAV format.

7. Does the speech synthesis service support emotion settings?

Yes, for specific supported voice types, see Speech Synthesis - Development Guide.

8. Does speech synthesis support generating video files?

No, it does not support generating video files.

9. If there is a pronunciation error after speech synthesis, can the pronunciation be corrected?

Currently, this is not supported.

10. Can the silence duration between two pieces of text be set in speech synthesis?

Currently, this is not supported.

11. What SDKs does speech synthesis support?

Speech synthesis provides SDKs for Python, Android, iOS, and H5/JS.

3 Status Code Query

General Error Codes

| Error Code | Error Message | Description | Solution |
| --- | --- | --- | --- |
| 110000 | Token Missing | token is missing | Add the token parameter in the header |
| 110001 | Invalid Token | token is incorrect | Pass the correct token value |
| 110005 | Concurrency Quota Exceeded | Concurrency limit exceeded | Contact business for additional concurrency |
| 110006 | Failed To Create Token | Failed to create token | Recreate the token |
| 110007 | APP ID Not Found | app_id does not exist | Enter the correct app_id |
| 110008 | Invalid Signature | Invalid signature | Regenerate the correct signature |
| 110009 | Token Expired | Token expired | Reacquire the token |
| 110011 | Illegal Current Time | Invalid current time | Check whether the time is correct |
| 110012 | Payment Status Abnormal, Service Unavailable | Payment status abnormal, service unavailable | Contact business |
| 120000 | Network Error | Network error | Check your network |
| 120001 | Lack Of Network Permissions | Lack of network access rights | Check your network |
| 120002 | Network Disconnected | Network connection has been disconnected | Check your network |
| 120003 | No Network Connection | No network connection | Check your network |
| 140000 | Database Is Busy, Please Try Again Later | The database is busy; try again later | Contact the business department |
| 140004 | APP ID/APP Secret Cannot Be Null | APPID/APPSecret cannot be null | Enter APPID/APPSecret |
| 140005 | Listener Is Null, Please Call setListener Method First | Listener is null | Call the setListener method first |
| 140006 | InitListener Cannot Be Null | InitListener cannot be null | Set InitListener before use |
| 140007 | Language Not Set | Language not set | Set the language |
| 140010 | Invalid Parameter | Parameter error | Check the parameters (ensure they are not unspecified, incorrect, or empty strings) |
| 140011 | Parameter Missing | A required parameter is missing | Provide the required parameter |
| 140012 | Invalid Parameter Type | Parameter type error | Check the type of the parameter |
| 140013 | Invalid Parameter Format | Parameter format error | Check the format of the parameter |
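For error handling in a client, it can help to classify codes by whether the request is worth retrying after recovering state. The mapping below covers only a subset of the codes in the table above, and the retryable set reflects the solutions given there (reacquire the token for 110009; wait for the network for the 1200xx codes); it is a sketch, not an official classification.

```python
# Subset of the general error codes from the table above.
GENERAL_ERRORS = {
    110000: "Token Missing",
    110001: "Invalid Token",
    110009: "Token Expired",
    120000: "Network Error",
    140011: "Parameter Missing",
}

# Codes a client could reasonably retry after refreshing state:
# 110009 after reacquiring the token, 1200xx after the network recovers.
RETRYABLE = {110009, 120000, 120001, 120002, 120003}


def describe(code: int) -> str:
    """Return a human-readable summary of an error code and a retry hint."""
    message = GENERAL_ERRORS.get(code, "Unknown error")
    action = "retryable" if code in RETRYABLE else "fix request and resend"
    return f"{code}: {message} ({action})"
```

Codes pointing at malformed requests (the 1400xx parameter errors) should not be retried as-is, since the same request will fail again until it is corrected.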

Online Synthesis for Short Text

| Error Code | Error Message | Description | Solution |
| --- | --- | --- | --- |
| 300000 | Invalid Parameter | Parameter error | Check the parameters (ensure they are not unspecified, incorrect, or empty strings) |
| 300001 | Parameter Missing | A required parameter is missing | Provide the required parameter |
| 300002 | Invalid Parameter Type | Parameter type error | Check the type of the parameter |
| 300003 | Invalid Parameter Format | Parameter format error | Check the format of the parameter |
| 300004 | Text For Speech Synthesis Is Null | Synthesis text is empty | Enter valid text (content is not empty, matches the selected language, and contains words other than punctuation) |
| 300005 | Text Length Exceeds Limit | Synthesis text length exceeds the limit | Maximum length is 1024 bytes |
| 300100 | Text Is Being Synthesized, Please Try Again Later | Synthesis in progress | Wait for synthesis to complete before operating |
| 300101 | Audio Is Being Played, Please Try Again Later | Playback in progress | Wait for playback to complete before operating |
| 310500 | Failed To Call Engine | Engine call failed | Contact the business department |
| 310000 | Gateway Timeout In Receiving Data | Gateway timeout in receiving data | Resend the data |
| 310001 | Connection Error 1 | Connection error 1 | Contact the business department |
| 310002 | Connection Error 2 | Connection error 2 | Contact the business department |
| 310003 | Disconnected | The connection has been disconnected | Contact the business department |
| 310201 | Failed To Encode Audio | Audio encoding failed | Resynthesize |