Option to upload text and then upload MP3 without generating transcription

Hi, I appreciate the product you’ve created; I’ve explored various tools, and Readlang is definitely the ultimate one. Something I’m struggling with a bit is that I enjoy reading along with the audiobook at the same time. I see there’s an option to upload the MP3 and then have it transcribed, but I’d like to first upload my book, then upload my MP3, and only have it help me generate timestamps. I understand that currently, you can upload the MP3, generate the text, and then modify it, but I think it takes quite a while to wait for the generation when I can just provide the book myself. Currently, what I’m doing is having Audible open in another tab and using keyboard shortcuts to pause and play it, but I’d prefer to have everything integrated. LingQ had this feature with podcasts, and it was very helpful. Thank you!

PS: Recently, reading in the forum, I discovered that the voice provided by Microsoft Edge sounds quite natural compared to what I’ve endured with my other browser. I would have liked to know this earlier; maybe it would be good to mention it somewhere in case anyone wants to try it. It’s quick, and seriously, not hearing a robotic voice anymore is glorious.

I’d second that. Podcasts sometimes already have transcripts. And although having it synced is nice, I can easily live without it, scrolling manually.

On the subject, it would be nice to be able to remove the audio track without deleting the whole project. At the moment I copy the text, delete the whole thing (twice) and then paste the text into a new project. It’s no big deal, but could be neater. :}


Right now there’s no way to have Readlang just generate timestamps for an existing text. I use the OpenAI Whisper API and, as far as I know, it doesn’t offer that functionality. But the workflow I suggest achieves the same goal:

  1. Upload the MP3 and have Readlang auto-transcribe it with timestamps.
  2. Replace the entire text with your own using the Edit tab in the reader page sidebar. Readlang will attempt to maintain the timestamps as best it can.
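For anyone curious how timestamps can survive a text replacement, here is a minimal sketch of one possible approach. It is not Readlang's actual implementation, which isn't described in this thread. It assumes the transcription yields one start time per word and uses Python's standard `difflib.SequenceMatcher` to line up the transcribed words with the replacement text. The function name `transfer_timestamps` and the input shape are hypothetical.

```python
from difflib import SequenceMatcher


def _normalize(word):
    # Compare words case-insensitively and without surrounding punctuation.
    return word.strip(".,!?;:\"'()").lower()


def transfer_timestamps(transcribed, new_words):
    """Carry word start times from a transcript over to a replacement text.

    transcribed: list of (word, start_seconds) pairs (assumed input shape).
    new_words:   list of words from the text the user pasted in.
    Returns a list of (word, start_seconds or None), one per new word.
    Words with no match in the transcript get None.
    """
    old_norm = [_normalize(w) for w, _ in transcribed]
    new_norm = [_normalize(w) for w in new_words]
    result = [(w, None) for w in new_words]

    matcher = SequenceMatcher(a=old_norm, b=new_norm, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Each block is a run of identical words in both sequences.
        for k in range(block.size):
            start = transcribed[block.a + k][1]
            result[block.b + k] = (new_words[block.b + k], start)
    return result


if __name__ == "__main__":
    transcript = [("the", 0.0), ("cat", 0.4), ("sat", 0.9), ("down", 1.3)]
    book_text = ["The", "cat", "quietly", "sat", "down."]
    for word, start in transfer_timestamps(transcript, book_text):
        print(word, start)
```

Words the transcript missed or misheard end up without a timestamp here. A fuller version could interpolate between the nearest matched neighbours.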

I know that the transcription can take a bit of time, about 1 minute per 10 minutes of audio. Is this a big problem for you? It’s still a relatively short amount of time compared to how long it will take to listen to and read it. Even if I found another tool to directly generate timestamps for an existing transcript (this is called forced alignment), I doubt it would be that much faster. If transcribing is taking significantly longer than this, please let me know.
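To put the rough figure of 1 minute per 10 minutes of audio in perspective, here is a small back-of-the-envelope calculation. The ratio is only the approximate number quoted above, and the function name is hypothetical.

```python
def estimated_transcription_minutes(audio_minutes, audio_minutes_per_minute=10):
    """Estimate the transcription wait, assuming roughly 1 minute of
    processing per 10 minutes of audio (the approximate figure quoted above)."""
    return audio_minutes / audio_minutes_per_minute


if __name__ == "__main__":
    # A 30-minute podcast episode and a 10-hour audiobook.
    print(estimated_transcription_minutes(30))
    print(estimated_transcription_minutes(10 * 60))
```

Under that assumption a 30-minute podcast takes about 3 minutes, while a 10-hour audiobook uploaded in one go would take about an hour.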

True, I agree this could be neater.
