inference time reduction