OpenAI Audio operations#
Use these operations to generate audio, or to transcribe or translate a recording, in OpenAI. Refer to OpenAI for more information on the OpenAI node itself.
Generate Audio#
Use this operation to create audio from a text prompt.
Enter these parameters:
- Credential to connect with: Create or select an existing OpenAI credential.
- Resource: Select Audio.
- Operation: Select Generate Audio.
- Model: Select the model you want to use to generate the audio. Refer to TTS | OpenAI for more information.
  - TTS-1: Use this to optimize for speed.
  - TTS-1-HD: Use this to optimize for quality.
- Text Input: Enter the text to generate the audio for. The maximum length is 4096 characters.
- Voice: Select a voice to use when generating the audio. Listen to the previews of the voices in Text to speech guide | OpenAI.
Options#
- Response Format: Select the format for the audio response. Choose from MP3 (default), OPUS, AAC, FLAC, WAV, and PCM.
- Audio Speed: Enter the speed for the generated audio, from 0.25 to 4.0. Defaults to 1.
- Put Output in Field: Defaults to data. Enter the name of the output field to put the binary file data in.
Refer to Create speech | OpenAI documentation for more information.
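The limits listed above (4096 characters, a speed between 0.25 and 4.0, six response formats) can be checked before the node runs, for example in a preceding step. The following is a minimal sketch, not the node's actual implementation: `build_speech_request` is a hypothetical helper, the field names are assumed to mirror the OpenAI Create speech request body, and `alloy` is used only as an example voice.

```python
# Limits taken from the parameter descriptions in this section.
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0
RESPONSE_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MODELS = {"tts-1", "tts-1-hd"}


def build_speech_request(text, voice, model="tts-1", response_format="mp3", speed=1.0):
    """Validate Generate Audio settings and return a request body dict.

    Hypothetical helper: the key names are an assumption about the
    request body, not something this page specifies.
    """
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if not text or len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"text must be 1 to {MAX_INPUT_CHARS} characters")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response format: {response_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


# Example: defaults match the node's documented defaults (MP3, speed 1).
body = build_speech_request("Hello from the workflow.", voice="alloy")
print(body)
```

Rejecting out-of-range values early keeps the failure close to its cause instead of surfacing as an API error later in the workflow.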
Transcribe a Recording#
Use this operation to transcribe audio into text. OpenAI API limits the size of the audio file to 25 MB. OpenAI will use the whisper-1 model by default.
Enter these parameters:
- Credential to connect with: Create or select an existing OpenAI credential.
- Resource: Select Audio.
- Operation: Select Transcribe a Recording.
- Input Data Field Name: Defaults to data. Enter the name of the binary property that contains the audio file in one of these formats: .flac, .mp3, .mp4, .mpeg, .mpga, .m4a, .ogg, .wav, or .webm.
Options#
- Language of the Audio File: Enter the language of the input audio in ISO-639-1. Use this option to improve accuracy and latency.
- Output Randomness (Temperature): Defaults to 1.0. Adjust the randomness of the response. The range is between 0.0 (deterministic) and 1.0 (maximum randomness). We recommend altering this or Output Randomness (Top P) but not both. Start with a medium temperature (around 0.7) and adjust based on the outputs you observe. If the responses are too repetitive or rigid, increase the temperature. If they’re too chaotic or off-track, decrease it.
Refer to Create transcription | OpenAI documentation for more information.
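The constraints in this section (file format, the 25 MB limit, an ISO-639-1 language code, and a temperature between 0.0 and 1.0) can be checked up front. The sketch below is illustrative only: `check_transcription_input` is a hypothetical helper, "25 MB" is assumed to mean 25 × 1024 × 1024 bytes, and the language check only verifies the two-letter shape of the code, not that the code is actually assigned.

```python
import os
import re

# Assumption: "25 MB" is read as 25 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}


def check_transcription_input(file_name, size_bytes, language=None, temperature=None):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 25 MB")
    # ISO-639-1 codes are two lowercase letters, for example "en" or "de".
    if language is not None and not re.fullmatch(r"[a-z]{2}", language):
        problems.append(f"language is not a two-letter code: {language!r}")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        problems.append("temperature must be between 0.0 and 1.0")
    return problems


ok = check_transcription_input("meeting.mp3", 1_000_000, language="en", temperature=0.2)
bad = check_transcription_input("notes.txt", 30 * 1024 * 1024, language="english", temperature=1.5)
print(ok)
print(bad)
```

Returning a list of problems rather than raising on the first one lets a workflow report every issue with a file in a single pass.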
Translate a Recording#
Use this operation to translate audio into English. OpenAI API limits the size of the audio file to 25 MB. OpenAI will use the whisper-1 model by default.
Enter these parameters:
- Credential to connect with: Create or select an existing OpenAI credential.
- Resource: Select Audio.
- Operation: Select Translate a Recording.
- Input Data Field Name: Defaults to data. Enter the name of the binary property that contains the audio file in one of these formats: .flac, .mp3, .mp4, .mpeg, .mpga, .m4a, .ogg, .wav, or .webm.
Options#
- Output Randomness (Temperature): Defaults to 1.0. Adjust the randomness of the response. The range is between 0.0 (deterministic) and 1.0 (maximum randomness). We recommend altering this or Output Randomness (Top P) but not both. Start with a medium temperature (around 0.7) and adjust based on the outputs you observe. If the responses are too repetitive or rigid, increase the temperature. If they’re too chaotic or off-track, decrease it.
Refer to Create translation | OpenAI documentation for more information.
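Transcribe a Recording and Translate a Recording take the same input but differ in their options: only transcription accepts a language, because translation always produces English. The sketch below shows one way to model that difference. `build_audio_form` is a hypothetical helper, and the endpoint paths and field names are assumptions based on the OpenAI audio API rather than anything the node exposes.

```python
# Assumed OpenAI API paths for the two operations.
ENDPOINTS = {
    "transcribe": "/v1/audio/transcriptions",
    "translate": "/v1/audio/translations",
}


def build_audio_form(operation, language=None, temperature=None, model="whisper-1"):
    """Return (endpoint path, form fields) for an audio operation.

    The audio file itself would be attached separately as a file part.
    """
    if operation not in ENDPOINTS:
        raise ValueError(f"unknown operation: {operation!r}")
    fields = {"model": model}
    if temperature is not None:
        fields["temperature"] = str(temperature)
    if language is not None:
        if operation == "translate":
            # Translate a Recording lists no language option; output is English.
            raise ValueError("language applies to transcription only")
        fields["language"] = language
    return ENDPOINTS[operation], fields


path, fields = build_audio_form("translate", temperature=0.2)
print(path, fields)
```

Raising on a language passed to a translation makes the mismatch visible instead of silently dropping the setting.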
Common issues#
For common errors or issues and suggested resolution steps, refer to Common Issues.