I’m trying to find a replacement for NaturalReader in Linux but I’m not finding anything as good.
I have played around with different engines, such as Espeak (too robotic), Mozilla TTS and Coqui, and Piper. But I’m looking for an application, not just an engine, something that would allow me to open up a PDF, pick a spot and read from there, then be able to move back and forth on the document. Ideally, I would like to also be able to tell the application how to pronounce certain words.
The best I have found is ReadAloud, but it’s just a browser add-on. Okular doesn’t seem to be able to use something like Piper. EDIT: Pied exists (https://github.com/Elleo/pied), which makes it work.
Any ideas?
(I use Debian btw :P )
I am using Linux Mint and only recently discovered that text-to-speech platforms exist. Could you advise a beginner on which one to use? Do I have to install it from the software manager, or use the terminal to install whatever is recommended? Not very tech savvy, but not naïve either.
If you’re seeking a pre-packaged solution for leveraging the Kokoro-82M text-to-speech model, you might find the ‘Kokoro-FastAPI’ Dockerized wrapper… adequate. It seems to function, at least for me.
My mouth.
…and my bow
Kokoro is absolutely incredible for how small it is. It can run on CPU fairly quickly and the results are so consistent. I’m even working on an absolutely wacky idea to use a genetic algorithm for voice cloning, because the tensors it uses for voice style are just so small. It’s an awesome application.
Do you just want a screen reader maybe? Gnome has Orca, and KDE has Kreader. Orca is much more polished.
A screen reader reads what’s on the screen. What I’m describing is reading a document. ReadAloud does exactly that for Firefox, I am just asking for standalone applications.
I think Piper being able to take input and read it from the client is simple enough that most people wouldn’t make an entire GUI just to avoid doing that, so it might be hard to find something like that.
If you’re specifically talking about reading PDFs aloud, you could do something like:
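pdftotext file.pdf - | piper --model en_US-lessac-medium.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw -
and it will read the whole thing. (The lone "-" after the file name tells pdftotext to write to stdout instead of a .txt file. The model file name is only an example, so use whichever Piper voice you have downloaded; the 22050 sample rate assumes a medium-quality voice.) If you only mean reading from a specific selection of text, I’ve never seen something that does that, and it would have to operate more like a fully fledged screen reader, because you’d have content rendered on screen that would then have to be piped to a TTS engine.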
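A rough approximation of "pick a spot and read from there" is to start at a page number, and custom pronunciations can be faked by respelling words before they reach the engine. Here is a minimal sketch, assuming bash, GNU sed, pdftotext (poppler-utils), piper and aplay are installed. The function names, the word respellings and the model file name are made up for illustration, not part of any of these tools.

```shell
#!/usr/bin/env bash

# Hypothetical helper: respell words the voice gets wrong.
# The pairs below are invented examples; edit them to taste.
# \b (word boundary) is a GNU sed extension.
fix_pronunciation() {
    sed -e 's/\bDebian\b/Debb-ian/g' \
        -e 's/\bGNOME\b/guh-nome/g'
}

# Hypothetical helper: read a PDF aloud starting at a given page.
#   $1 = PDF file, $2 = first page to read, $3 = Piper voice model (.onnx)
read_from_page() {
    local pdf="$1" first_page="$2" model="$3"
    # -f sets the first page to convert; the lone "-" sends text to stdout.
    # 22050 Hz assumes a medium-quality Piper voice.
    pdftotext -f "$first_page" "$pdf" - \
        | fix_pronunciation \
        | piper --model "$model" --output-raw \
        | aplay -r 22050 -f S16_LE -t raw -
}

# Example (assumed file and model names):
#   read_from_page file.pdf 12 en_US-lessac-medium.onnx
```

It is still nowhere near a real reader app: there is no pause or skip-back, and the starting point is a page, not a sentence. Ctrl+C and re-running with a different page number is the "navigation".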