FAQs
1 General Questions
1. How do I apply for a DolphinVoice account?
Navigate to the DolphinVoice website and click on the login button to register. For detailed steps, please refer to the Quick Start.
2. How do I create a project?
Once your account application is successful, a project is automatically created by default. If you need a new project, you can create it by clicking the New Project button.
Note: Each project has all AI capabilities enabled by default.
3. Where can I find the AppID and AppSecret?
Please refer to the Connection URL on the DolphinVoice User Center.
4. Will the existing Token become invalid upon re-obtaining a new one?
No. Obtaining a new Token does not invalidate a Token you already hold. A Token's validity depends on its validity period, not on whether newer Tokens have been issued.
5. How long is the default validity period for the Token?
The default validity period for the Token is 7 days. If the capabilities of the AppID in use are changed, previously generated Tokens become invalid and need to be regenerated.
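As a rough illustration of the 7-day validity period, the sketch below decides when a cached Token should be replaced. The function name `token_needs_refresh` and the one-hour safety margin are assumptions made for this example, not part of the DolphinVoice API, and the sketch does not cover the case where a Token is invalidated early by a capability change.

```python
from datetime import datetime, timedelta, timezone

# Default validity period stated in this FAQ.
TOKEN_LIFETIME = timedelta(days=7)


def token_needs_refresh(issued_at, now=None, margin=timedelta(hours=1)):
    """Return True if a Token issued at `issued_at` has expired,
    or will expire within `margin` (a hypothetical safety buffer)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= issued_at + TOKEN_LIFETIME - margin
```

A client would record the time at which it obtained the Token and call this check before each request, requesting a new Token when it returns True.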
Speech Synthesis
1. What are the commonly used terms in synthesis, and what do they mean?
Concurrency: The number of requests being processed simultaneously by the backend service at a given moment. It is counted at a single point in time rather than over an interval.
Real-Time Factor (RTF): The ratio of processing time to audio length. For synthesis scenarios: RTF = Synthesis Time / Audio Duration.
First Packet Delay: The delay from when the user initiates a synthesis call to when they first receive the audio data.
Synthesis Delay: The delay corresponding to the user receiving the required audio data. For example, if the user needs the complete audio data for playback, then the synthesis delay is the overall delay of the single request. If the user can use partial audio data (e.g., feeding it into a player for playback), then the synthesis delay is the first packet delay.
Streaming Synthesis: The text is transmitted all at once, and the audio is returned while being processed. The perceived delay by the user is the first packet delay.
Non-Streaming Synthesis: The text is transmitted all at once, and the audio is returned after being fully processed by the backend. The delay perceived by the user depends strongly on the RTF and the length of the synthesized audio: Synthesis Time = RTF * Audio Duration.
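The relationships between these terms can be worked through numerically. The sketch below is only an illustration of the two formulas above; the function names are made up for this example, and network transfer time is ignored.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = Synthesis Time / Audio Duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds


def perceived_delay(rtf, audio_seconds, first_packet_seconds, streaming):
    """Delay the user perceives before audio is usable.

    Streaming: the first packet delay.
    Non-streaming: the full synthesis time, RTF * Audio Duration.
    """
    if streaming:
        return first_packet_seconds
    return rtf * audio_seconds
```

For example, if 10 seconds of audio take 2 seconds to synthesize, the RTF is 0.2. A non-streaming caller waits about 2 seconds, while a streaming caller waits only for the first packet.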
1. What languages does speech synthesis support?
The platform currently supports 3 languages: Japanese, English, and Chinese.
2. How many voices does speech synthesis support?
Currently, speech synthesis supports 19 voices: 3 Japanese voices, 2 English voices, and 14 Chinese voices. You can choose freely according to your business needs. Each voice supports adjusting parameters such as volume, speed, and speech rate. For details, see Speech Synthesis - Development Guide.
3. What sampling rates does speech synthesis support?
The speech synthesis service currently supports audio with sampling rates of 8000 Hz, 16000 Hz, and 24000 Hz.
4. What interface types does speech synthesis support?
Speech synthesis supports both streaming (WebSocket) and non-streaming (HTTP) interfaces. Non-streaming speech synthesis returns the audio data after the entire sentence has been synthesized, while streaming speech synthesis returns the audio data while synthesizing.
5. What text encoding formats does speech synthesis support?
The input text for speech synthesis only supports UTF-8 encoding. The input text cannot exceed 1024 bytes, approximately 300 Chinese characters.
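Because the limit is expressed in bytes rather than characters, it can help to measure the encoded size before sending a request. The sketch below assumes nothing about the DolphinVoice API; `utf8_length` and `split_text` are hypothetical helper names, and splitting on arbitrary character boundaries is a simplification (splitting at sentence boundaries would usually sound more natural).

```python
MAX_TEXT_BYTES = 1024  # limit stated in this FAQ


def utf8_length(text):
    """Size of the text in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def split_text(text, limit=MAX_TEXT_BYTES):
    """Split text into chunks of at most `limit` UTF-8 bytes,
    never cutting through a multi-byte character."""
    chunks, current, size = [], [], 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if current and size + n > limit:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks
```

Most Chinese characters occupy 3 bytes in UTF-8, which is why the byte limit corresponds to only a few hundred Chinese characters, while plain ASCII text can use all 1024 characters.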
6. What audio encoding formats does the speech synthesis service support?
It supports outputting data in wav, pcm, and mp3 encoding formats. Note: Streaming does not support the wav format.
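The constraints on sampling rate, output format, and interface type can be checked on the client before a request is sent. The following is a minimal sketch built only from the values listed in this FAQ; the function and parameter names are assumptions for illustration and do not correspond to actual request fields.

```python
SUPPORTED_SAMPLE_RATES = {8000, 16000, 24000}  # Hz
SUPPORTED_FORMATS = {"wav", "pcm", "mp3"}


def check_output_options(sample_rate, audio_format, streaming):
    """Return a list of problems with the requested options (empty if none)."""
    problems = []
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        problems.append(f"unsupported sampling rate: {sample_rate} Hz")
    if audio_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported audio format: {audio_format}")
    elif streaming and audio_format == "wav":
        problems.append("streaming synthesis does not support the wav format")
    return problems
```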
7. Does the speech synthesis service support emotion settings?
Yes. For the voices that support emotion settings, see Speech Synthesis - Development Guide.
8. Does speech synthesis support generating video files?
No, it does not support generating video files.
9. If there is a pronunciation error after speech synthesis, can the pronunciation be corrected?
Currently, this is not supported.
10. Can the silence duration between two pieces of text be set in speech synthesis?
Currently, this is not supported.
11. What SDKs does speech synthesis support?
Speech synthesis provides SDKs for the following platforms: Python, Android, iOS, and H5/JS.
Status Code Query
General Error Codes
| Error Code | Error Message | Description | Solution |
|---|---|---|---|
| 110000 | Token Missing | Token is missing | Add the token parameter to the header |
| 110001 | Invalid Token | Token is incorrect | Pass the correct token value |
| 110005 | Concurrency Quota Exceeded | Concurrency limit exceeded | Concurrency exceeds the limit; contact the business department for additional concurrency |
| 110006 | Failed To Create Token | Failed to create token | Recreate the token |
| 110007 | APP ID Not Found | app_id does not exist | Please enter the correct app_id |
| 110008 | Invalid Signature | Invalid Signature | Please regenerate the correct signature |
| 110009 | Token Expired | Token expired | Reacquire the token |
| 110011 | Illegal Current Time | Invalid current time | Check if the time is correct |
| 110012 | Payment Status Abnormal, Service Unavailable | Payment status is abnormal and the service is unavailable | Please contact the business department |
| 120000 | Network Error | Network error | Check your network |
| 120001 | Lack Of Network Permissions | Lack of network access rights | Check your network |
| 120002 | Network Disconnected | Network connection has been disconnected | Check your network |
| 120003 | No Network Connection | No network connection | Check your network |
| 140000 | Database is busy, please try again later | The database is busy, please try again later | Contact the business department |
| 140004 | APP ID/APP Secret Cannot Be Null | APPID/APPSecret cannot be null | Enter APPID/APPSecret |
| 140005 | Listener is null, please call setListener method first | Listener is null. Please call the setListener method first | Call the setListener method first |
| 140006 | InitListener Cannot Be Null | InitListener cannot be null | Pass a non-null InitListener |
| 140007 | Language Not Set | Language not set | Please set the language |
| 140010 | Invalid Parameter | Parameter error | Check the parameters (to ensure they are not unspecified, incorrect, or empty strings) |
| 140011 | Parameter Missing | Parameter missing | Add the required parameter |
| 140012 | Invalid Parameter Type | Parameter type error | Check the type of the parameter |
| 140013 | Invalid Parameter Format | Parameter format error | Check the format of the parameter |
Online Synthesis for Short Text
| Error Code | Error Message | Description | Solution |
|---|---|---|---|
| 300000 | Invalid Parameter | Parameter error | Check the parameters (to ensure they are not unspecified, incorrect, or empty strings) |
| 300001 | Parameter Missing | Parameter is missing | Add the required parameter |
| 300002 | Invalid Parameter Type | Parameter type error | Check the type of the parameter |
| 300003 | Invalid Parameter Format | Parameter format error | Check the format of the parameter |
| 300004 | Text For Speech Synthesis Is Null | Synthesis text is empty | Please enter valid text (text content is not empty, text corresponds to the language, text contains words other than punctuation) |
| 300005 | Text Length Exceeds Limit | Synthesis text length exceeds limit | Maximum length is 1024 bytes |
| 300100 | Text is being synthesized, please try again later | Being synthesized, please try again later | Wait for synthesis to complete before operating |
| 300101 | Audio is being played, please try again later | Being played, please try again later | Wait for playback to complete before operating |
| 310500 | Failed To Call Engine | Engine call failed | Please contact the business department |
| 310000 | Gateway Timeout In Receiving Data | Gateway timeout in receiving data | Please resend the data |
| 310001 | Connection Error 1 | Connection error 1 | Please contact the business department |
| 310002 | Connection Error 2 | Connection Error 2 | Please contact the business department |
| 310003 | Disconnected | The connection has been disconnected | Please contact the business department |
| 310201 | Failed To Encode Audio | Audio encoding failed | Please resynthesize |
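
Client code often needs to turn an error code into a next step. The sketch below groups a few codes from the tables above according to their Solution column. The grouping and the action names are one reading of those tables made for illustration, not an official classification, and codes not listed fall through to manual inspection.

```python
# Codes whose listed solution is to recreate or reacquire the token.
REFRESH_TOKEN_CODES = {110006, 110009}
# Codes whose listed solution is to wait, resend, or resynthesize.
RETRY_CODES = {300100, 300101, 310000, 310201}


def suggested_action(code):
    """Map an error code to a coarse, hypothetical action label."""
    if code in REFRESH_TOKEN_CODES:
        return "refresh_token"
    if 120000 <= code <= 120003:  # network errors in the general table
        return "check_network"
    if code in RETRY_CODES:
        return "retry"
    return "inspect"  # fix the request or contact the business department
```

For instance, `suggested_action(110009)` yields `"refresh_token"`, while a parameter error such as 300000 yields `"inspect"` because retrying the same request would fail again.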