
toCaptions()

Converts the output from transcribe() into an array of Caption objects, so you can use the functions from @remotion/captions.

import {toCaptions, transcribe, resampleTo16Khz} from '@remotion/whisper-web';

const file = new File([], 'audio.wav');

const channelWaveform = await resampleTo16Khz({
  file,
});

const whisperWebOutput = await transcribe({
  channelWaveform,
  model: 'tiny.en',
});

const {captions} = toCaptions({
  whisperWebOutput,
});

console.log(captions); /*
[
  {
    text: "William",
    startMs: 40,
    endMs: 420,
    timestampMs: 240,
    confidence: 0.813602,
  }, {
    text: " just",
    startMs: 420,
    endMs: 650,
    timestampMs: 480,
    confidence: 0.990905,
  }, {
    text: " hit",
    startMs: 650,
    endMs: 810,
    timestampMs: 700,
    confidence: 0.981798,
  }
]
*/
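
Once the captions are in this shape, they can be handled like any other array of objects. The following is a minimal sketch of two post-processing steps, not part of the documented API: `captionsToPlainText` and `lowConfidenceCaptions` are hypothetical helpers, and the local `Caption` type is an assumption that mirrors the fields shown in the output above (in a real project, the type would come from @remotion/captions). The nullable `timestampMs` and `confidence` fields are also an assumption.

```typescript
// Local mirror of the caption shape shown in the output above.
// Assumption: in a real project this type is imported from @remotion/captions.
type Caption = {
  text: string;
  startMs: number;
  endMs: number;
  timestampMs: number | null;
  confidence: number | null;
};

// Hypothetical helper: joins word-level captions into one transcript string.
// Each token carries its own leading space, so an empty separator is used.
const captionsToPlainText = (captions: Caption[]): string =>
  captions
    .map((caption) => caption.text)
    .join('')
    .trim();

// Hypothetical helper: returns the captions whose confidence is below the
// given threshold, e.g. to flag words for manual review.
const lowConfidenceCaptions = (
  captions: Caption[],
  threshold: number,
): Caption[] =>
  captions.filter(
    (caption) => caption.confidence !== null && caption.confidence < threshold,
  );

// Sample data copied from the output above.
const sampleCaptions: Caption[] = [
  {text: 'William', startMs: 40, endMs: 420, timestampMs: 240, confidence: 0.813602},
  {text: ' just', startMs: 420, endMs: 650, timestampMs: 480, confidence: 0.990905},
  {text: ' hit', startMs: 650, endMs: 810, timestampMs: 700, confidence: 0.981798},
];

console.log(captionsToPlainText(sampleCaptions)); // "William just hit"
console.log(lowConfidenceCaptions(sampleCaptions, 0.9).map((c) => c.text)); // ["William"]
```

Because the helpers only read plain fields, they work on the result of `toCaptions()` without any further conversion.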

See also