And then, to refine this approach we should have exceptions: All tests that need to download a heavy set of weights or a dataset that is larger than ~50MB (e.g., model or tokenizer integration tests, pipeline integration tests) should be set to slow.