LLM-co-running: Concurrent Training and Serving of Large Language Models on Distributed Systems

[Figure: An illustration of Separate and Mix under concurrent workloads. Panels: "Separate training and inference" and "Co-locate training and inference".]