Data usage and energy cost while running it - these you would have under control when self-hosting
Training of the LLM: You have no control over how the LLM was trained - neither which data sources were used, nor the workforce that cleaned up the data, nor the amount and type of energy used to do so
Criticism of LLMs has two angles: