

You seem to imply that we can only use the raw output of the LLM, but that’s not true. We can add deterministic safeguards afterwards to reduce hallucinations and improve relevance. For example, if you use an LLM to generate SQL, you can verify that the query respects the data schemas and the relationship graph. That’s a pretty hot subject right now, and I don’t see why it couldn’t be done for video game dialogue.
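To make the SQL case concrete, here is a minimal sketch of one such safeguard, assuming SQLite and a made-up two-table schema (the table names, column names, and the `is_valid_against_schema` helper are all hypothetical). It only checks that the query compiles against the schema, so unknown tables and columns are caught; it does not verify that joins follow the intended relationships.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
"""


def is_valid_against_schema(sql: str) -> bool:
    """Return True if SQLite can compile `sql` against SCHEMA."""
    conn = sqlite3.connect(":memory:")  # empty throwaway database
    try:
        conn.executescript(SCHEMA)
        # EXPLAIN compiles the statement without running it, so a
        # reference to an unknown table or column raises an error here.
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown names, or more than one statement.
        return False
    finally:
        conn.close()


# Example: the second query refers to a column that does not exist.
print(is_valid_against_schema("SELECT name FROM customers"))
print(is_valid_against_schema("SELECT email FROM customers"))
```

A query that fails the check can be rejected or sent back to the model for another attempt, without a human ever seeing it.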
That said, I also agree that the resources it requires may not be worth the output.
It would not be a schema that fully determines whether any arbitrary output is valid. I would guess that is impossible for natural language, and if it were possible, the same schema could just as well be used for procedural generation. It would be just enough to make LLM output good enough. It doesn’t need to be perfect, because human output is not perfect either.
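For dialogue, a partial check of this kind could look something like the sketch below. Everything in it is an assumption for illustration: that the model is asked to write references to characters and places as `{entity}` tokens, plus the entity list, the banned words, and the length limit. It makes no attempt to judge whether the line is well written, only whether it breaks a few hard rules.

```python
import re

# Hypothetical game data, for illustration only.
KNOWN_ENTITIES = {"mira", "captain_holt", "eastwatch"}
BANNED_WORDS = {"smartphone", "internet"}  # out of place in this setting
MAX_CHARS = 200


def check_dialogue(line: str) -> list[str]:
    """Return a list of rule violations; an empty list means the line passes."""
    problems = []
    if not line.strip():
        problems.append("empty")
    if len(line) > MAX_CHARS:
        problems.append("too long")
    # Every {entity} token must refer to something that exists in the game.
    for ref in re.findall(r"\{(\w+)\}", line):
        if ref not in KNOWN_ENTITIES:
            problems.append(f"unknown entity: {ref}")
    # Reject words that do not belong in the setting.
    words = set(re.findall(r"[a-z']+", line.lower()))
    for word in sorted(words & BANNED_WORDS):
        problems.append(f"banned word: {word}")
    return problems


print(check_dialogue("Meet {mira} at {eastwatch} before dawn."))
print(check_dialogue("Ask {zorg} about the internet."))
```

Lines that fail can be regenerated or replaced with a hand-written fallback, which is the "good enough" bar rather than a guarantee of quality.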