Workload migration from on-prem to AWS - MAP Assessment
A Mid-size K-12 Ed-tech SaaS
Client
A mid-size K-12 ed-tech SaaS company with 20+ years of experience builds cloud learning platforms used nationwide. Its platform handles heavy concurrent traffic, which makes this case relevant to any content-heavy SaaS provider managing large volumes of unstructured data in an aging co-location environment.
Challenge
- Move a stateful PHP learning platform from co-location to AWS.
- Migrate a 15-node web tier and a 4-node MariaDB cluster.
- 340 TB of unstructured data (~914 M small objects) with no defined migration path.
- 15 Mbps bandwidth made online data transfer impractical.
- PHP app relied on local sessions, filesystem cache, and temp files, complicating scaling.
- Database required a near-zero downtime migration while serving live users.
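The bandwidth constraint above can be checked with a quick back-of-envelope calculation. The sketch below is illustrative only: it assumes decimal units (1 TB = 10^12 bytes), a link running continuously at its full nominal rate, and no protocol or per-object overhead, none of which are stated in the assessment. The function names are hypothetical.

```python
# Back-of-envelope transfer-time estimate for the figures quoted above.
# Assumptions (not from the assessment): decimal units (1 TB = 10**12 bytes),
# the link runs at its full nominal rate 24/7, and overhead is ignored.

DATA_TB = 340          # total unstructured data
OBJECT_COUNT = 914e6   # approximate number of objects
LINK_MBPS = 15         # available bandwidth, megabits per second


def transfer_days(data_tb: float, link_mbps: float) -> float:
    """Days needed to push data_tb terabytes through a link_mbps link."""
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 86_400


def average_object_kb(data_tb: float, object_count: float) -> float:
    """Mean object size in decimal kilobytes."""
    return data_tb * 1e12 / object_count / 1e3


if __name__ == "__main__":
    days = transfer_days(DATA_TB, LINK_MBPS)
    print(f"Best-case online transfer: {days:,.0f} days ({days / 365:.1f} years)")
    print(f"Average object size: {average_object_kb(DATA_TB, OBJECT_COUNT):,.0f} KB")
```

Even under these best-case assumptions the estimate lands well beyond a year, which is consistent with the conclusion that an online-only transfer was impractical. The small average object size also suggests that per-object overhead, not just raw bandwidth, would matter in practice.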
Key Results
- Cut 340 TB data transfer from ~1+ year to ~5–6 weeks using a hybrid Snowball Edge + DataSync approach.
- Slashed cloud storage costs ~62% with lifecycle transition to Glacier Instant Retrieval, paying back the one-time investment in ~6 months.
- Enabled near-zero-downtime DB migration with AWS DMS CDC and a cross-region replica, achieving an RPO measured in seconds of replication lag and an RTO measured in minutes.
- Saved ~20–25% in app infrastructure costs by using a fixed 15-instance EC2 setup instead of Auto Scaling, avoiding ~$800–$1,100/mo in excess S3 request fees.
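The storage lifecycle transition behind the ~62% figure could be expressed as an S3 lifecycle configuration along the following lines. This is a minimal sketch, not the configuration produced by the assessment: it assumes "Year 1" means the first 365 days after object creation, that objects are moved into Intelligent-Tiering immediately, and that one rule covers the whole bucket. The rule ID and helper name are hypothetical.

```python
import json

# Assumption: "Year 1" ends 365 days after an object is created.
YEAR_ONE_DAYS = 365


def build_lifecycle_configuration(prefix: str = "") -> dict:
    """Return a dict in the shape S3 expects for a bucket lifecycle configuration.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "tiering-then-glacier-ir",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # Year 1: let Intelligent-Tiering manage access tiers.
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"},
                    # Year 2 onward: move to Glacier Instant Retrieval.
                    {"Days": YEAR_ONE_DAYS, "StorageClass": "GLACIER_IR"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # The dict could be passed to boto3's
    # s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)
    # with a real bucket name; here it is only printed for review.
    print(json.dumps(build_lifecycle_configuration(), indent=2))
```

Building the configuration as plain data keeps it reviewable and testable before anything is applied to a bucket. Glacier Instant Retrieval carries minimum object size and minimum storage duration charges, so with hundreds of millions of small objects the actual saving should be validated against the real object size distribution.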
Solution

- Structured Engagement: Delivered a MAP Assessment with a migration roadmap, architecture design, and TCO analysis across four workstreams.
- Infrastructure Discovery: Documented the co-location stack (15-node web tier, 4-node Galera cluster, load balancers, network), baseline OpEx, and stateful app characteristics to inform architecture.
- Storage Migration Plan: Designed a hybrid Seed & Sync path using Snowball Edge + DataSync to migrate ~340 TB (~914 M objects) with validation layers and rollback procedures.
- Database Migration: Used AWS DMS with Full Load + CDC to achieve near-zero downtime to RDS MariaDB (Multi-AZ + cross-region replica) with cost-saving reserved instance guidance.
- App Architecture Evaluation: Compared fixed 15-instance EC2 (recommended) vs Auto Scaling with externalized session/cache; recommended fixed for cost efficiency given workload patterns.
- S3 Strategy & Lifecycle: Phased Intelligent-Tiering in Year 1 and Glacier Instant Retrieval from Year 2 onward (~62% storage cost reduction), with strong security controls (encryption, bucket policies, MFA Delete).
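For the database workstream, the Full Load + CDC pattern corresponds to a DMS replication task whose migration type is `full-load-and-cdc`. The sketch below only assembles the task parameters as plain data; it assumes every schema and table is in scope, and the task identifier and ARNs are hypothetical placeholders rather than values from the engagement.

```python
import json


def build_table_mappings(schema: str = "%") -> str:
    """Return DMS table-mapping JSON that includes every table in `schema`.

    "%" is the DMS wildcard, so the default selects all schemas and tables.
    """
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all-tables",  # hypothetical name
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules)


def build_replication_task_params(
    task_id: str, source_arn: str, target_arn: str, instance_arn: str
) -> dict:
    """Assemble keyword arguments for boto3's dms.create_replication_task."""
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Bulk-copy existing rows, then stream ongoing changes until cutover.
        "MigrationType": "full-load-and-cdc",
        "TableMappings": build_table_mappings(),
    }


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    params = build_replication_task_params(
        "mariadb-to-rds",
        "arn:aws:dms:example:source-endpoint",
        "arn:aws:dms:example:target-endpoint",
        "arn:aws:dms:example:replication-instance",
    )
    print(json.dumps(params, indent=2))
```

With real ARNs, the resulting dict could be unpacked into `create_replication_task(**params)` on a boto3 DMS client. Because the task keeps applying changes after the initial load, the application can stay live on the source cluster until replication lag is low enough to cut over.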
Technologies Used
- AWS Snowball Edge Storage Optimized (Physical Data Transfer)
- AWS DataSync (Enhanced Mode — Delta Sync)
- Amazon S3 (Intelligent-Tiering, Glacier Instant Retrieval, Lifecycle Management)
- AWS Database Migration Service (DMS) with Change Data Capture (CDC)
- Amazon RDS for MariaDB (Multi-AZ, Cross-Region Read Replica)
- Amazon EC2 (t3a.large) with Application Load Balancer (ALB)
- Amazon CloudWatch (Monitoring, Transfer Validation, DB Insights)
Summary
An ed-tech company needed to move 340 TB of unstructured data and a stateful PHP platform from an aging co-location facility to AWS, despite a 15 Mbps link that made online transfer impractical. The MAP Assessment defined a hybrid Snowball Edge + DataSync “Seed and Sync” approach that cut the data migration from an infeasible 1+ year to ~5–6 weeks, enabled phased S3 tiering for ~62% lower ongoing storage costs, and used AWS DMS CDC to achieve a near-zero-downtime database migration with near-real-time RPO.


