I’m trying to find a replacement for NaturalReader in Linux but I’m not finding anything as good.

I have played around with different engines, such as eSpeak (too robotic), Mozilla TTS and Coqui, and Piper. But I’m looking for an application, not just an engine: something that would allow me to open up a PDF, pick a spot and read from there, then move back and forth in the document. Ideally, I would also like to be able to tell the application how to pronounce certain words.

The best I have found is ReadAloud, but it’s just a browser addon. Okular doesn’t seem to be able to use something like Piper. EDIT: but Pied exists: https://github.com/Elleo/pied which makes it work.

Any ideas?

(I use Debian btw :P )

  • hummy_bee@mander.xyz
    16 days ago

    I am using Linux Mint and only recently discovered that text-to-speech platforms exist. Could you advise a beginner on which one to use? Do I have to install it from the software manager, or use the terminal to install whatever is recommended? Not very tech savvy, but not naïve either.

  • rodbiren@midwest.social
    16 days ago

    Kokoro is absolutely incredible for how small it is. It can run on CPU fairly quickly and the results are so consistent. I’m even working on an absolutely wacky idea to use a genetic algorithm for voice cloning, because the tensors it uses for voice style are just so small. It’s an awesome application.

    • acargitz@lemmy.caOP
      16 days ago

      A screen reader reads what’s on the screen. What I’m describing is reading a document. ReadAloud does exactly that for Firefox; I am just asking for standalone applications.

      • just_another_person@lemmy.world
        16 days ago

        I think Piper being able to take input and read it from the client is simple enough that most people wouldn’t make an entire GUI just to avoid doing that, so it might be hard to find something like that.

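        If you’re specifically talking about reading PDFs aloud, you could do something like: pdftotext file.pdf - | piper --model en_US-lessac-medium.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw - and it will read the whole thing. (The - after the filename makes pdftotext write to stdout instead of creating file.txt; Piper needs a voice model, and the one named here is only an example.)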

        If you only mean reading a file from a specific selection of text, I’ve never seen something like that, and it would have to operate more like a fully fledged screen reader, because you’d have content rendered on screen that would then have to be piped to a TTS engine.
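
A rough middle ground between "read the whole thing" and a full GUI might be a small script that starts at a chosen page and respells awkward words before handing the text to Piper. The following is only a sketch under some assumptions: pdftotext (from poppler-utils), piper, and aplay are installed and on the PATH, the voice model outputs 22050 Hz audio, and the pronunciation table is a made-up illustration rather than anything Piper defines. The function names are hypothetical too.

```python
import re
import subprocess

# Hypothetical overrides: word as written -> respelling the voice says better.
PRONUNCIATIONS = {"Okular": "Ocular", "Coqui": "Ko-kee"}


def apply_pronunciations(text, overrides):
    """Replace whole-word, case-sensitive matches with their respelled form."""
    for word, spoken in overrides.items():
        pattern = rf"\b{re.escape(word)}\b"
        # A function replacement keeps backslashes in `spoken` literal.
        text = re.sub(pattern, lambda match, s=spoken: s, text)
    return text


def pdftotext_command(pdf_path, first_page, last_page=None):
    """Build a pdftotext call that prints the chosen page range to stdout."""
    cmd = ["pdftotext", "-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    # The trailing "-" sends the text to stdout instead of a .txt file.
    return cmd + [pdf_path, "-"]


def read_aloud(pdf_path, model_path, first_page, last_page=None, sample_rate=22050):
    """Extract text from the page range, respell it, and play it via Piper."""
    extracted = subprocess.run(
        pdftotext_command(pdf_path, first_page, last_page),
        check=True, capture_output=True, text=True,
    ).stdout
    text = apply_pronunciations(extracted, PRONUNCIATIONS)

    # piper turns stdin text into raw 16-bit audio; aplay plays that stream.
    piper = subprocess.Popen(
        ["piper", "--model", model_path, "--output-raw"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    )
    aplay = subprocess.Popen(
        ["aplay", "-r", str(sample_rate), "-f", "S16_LE", "-t", "raw", "-"],
        stdin=piper.stdout,
    )
    piper.stdout.close()  # aplay now holds the only reader end of the pipe
    piper.stdin.write(text.encode("utf-8"))
    piper.stdin.close()
    aplay.wait()
    piper.wait()
```

Calling read_aloud("file.pdf", "en_US-lessac-medium.onnx", first_page=12) would start reading at page 12. It still works at page granularity, so it does not replace picking an exact spot in a viewer, and skipping back and forth means stopping it and starting again from another page.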