CPU inference

With some optimizations, it is possible to efficiently run large model inference on a CPU.