
It’s likely first and foremost a resource problem. “How different would the output be if that text hadn’t been part of the training data?” can _in principle_ be answered by training N models instead of one, where N is the number of texts in the training data, omitting text i from the training data of model i. At inference time you run all N models in parallel and apply some distance metric to their outputs. In the case of a verbatim quote, at least one of the models will stand out in that comparison, which lets you infer the source. The difficulty would be finding a way to do something along those lines efficiently enough to be practical.
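
A minimal sketch of that leave-one-out scheme, with the training, generation, and distance functions passed in as placeholders (they are assumptions, not any real API); the N full retrainings inside the loop are exactly the part that makes this impractical:

    def loo_attribution(corpus, prompt, train_model, generate, distance):
        # Baseline: the model trained on the full corpus.
        full_model = train_model(corpus)
        reference = generate(full_model, prompt)

        scores = {}
        for i in range(len(corpus)):
            loo_corpus = corpus[:i] + corpus[i + 1:]  # omit text i
            loo_model = train_model(loo_corpus)       # one full retraining per text
            output = generate(loo_model, prompt)
            scores[i] = distance(reference, output)   # how far did the output move?

        # The text whose removal shifts the output the most is the likely
        # source of a verbatim quote.
        return max(scores, key=scores.get)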


Each LLM costs $10-100 million to train, times billions of training texts, comes out to roughly $100 quadrillion, so that is unfortunately out of reach of most countries.
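
Back-of-the-envelope version of that estimate (both figures are order-of-magnitude assumptions):

    $10^8 (cost per model) x 10^9 (texts, one retraining each) = $10^17
    = about $100 quadrillion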



