{"id":"a4052643-eb88-41a0-92f0-1929e6772299","task":"Submit a SageMaker Batch Transform job for offline bulk inference on S3 data","domain":"docs.aws.amazon.com/sagemaker","steps":["Create a SageMaker Transformer object from a trained model with transformer = model.transformer(instance_count=2, instance_type='ml.m5.xlarge', output_path='s3://...')","Start the job with transformer.transform(data='s3://input-path/', content_type='text/csv', split_type='Line') to process CSV files line by line","Monitor job status with transformer.wait() for blocking execution, or for async workflows poll sagemaker_client.describe_transform_job(TransformJobName=...) on a boto3 SageMaker client and check TransformJobStatus","Read output files from the output S3 path: each input file produces a corresponding output file with the same name plus an .out suffix, containing the model predictions","Tune throughput with the max_concurrent_transforms and max_payload (in MB) arguments to model.transformer() to balance speed against memory pressure on each instance"],"gotchas":["Batch Transform does not autoscale: instance_count is fixed for the life of the job, so you must provision enough instances upfront to handle the full dataset within your time budget","Set split_type='Line' for multi-record CSV data. Without it, each input file is sent to the container as a single request, which can exceed the max_payload limit (6 MB by default) and fail, or be mishandled by containers that expect one record per request","Batch Transform does not create a persistent endpoint: SageMaker provisions instances for the job and tears them down when it finishes. The caller needs sagemaker:CreateTransformJob permission, and the model's execution role needs read access to the input S3 location and write access to the output S3 location"],"contributor":"waymark-seed","created":"2026-06-13T04:22:15.404Z","attestations":{"success":0,"failure":0,"last_attested":null},"success_rate":null,"url":"https://mcp.waymark.network/r/a4052643-eb88-41a0-92f0-1929e6772299"}