
A research team built a working proof-of-concept (POC) in R/Shiny to validate a geospatial recommendation algorithm. The science was sound. The system wasn't.
Frontline users needed location-specific recommendations based on complex geospatial analysis. The POC could generate them, but each request took 20 to 30 minutes. During peak seasons, when dozens of users needed answers simultaneously, the system became unusable.
Three technical barriers stood between the POC and production.
The question wasn't whether the science worked. It was whether the system could be re-architected to deliver that science reliably, at scale, without starting from scratch.


The Six Feet Up engineering team treated the POC as validated research, not as a codebase to incrementally improve. The R/Shiny implementation had proven the algorithms worked. The remaining challenge was delivering them predictably.
Every technology choice was driven by a specific production requirement: horizontal scaling, spatial query performance, artifact storage, and infrastructure repeatability.
The application was rebuilt on a Python/Django backend with a React frontend, deployed on Kubernetes. PostgreSQL with PostGIS extensions handled spatial queries. Raster datasets moved to AWS S3 storage. Terraform managed infrastructure-as-code to eliminate environment drift between deployments.
The biggest unknown was how to handle multi-gigabyte raster datasets without introducing latency spikes or dependencies on external services.
Three strategies were tested against real production data. The hybrid approach proved most reliable: rasters were preprocessed and stored in S3, then cached locally on Kubernetes persistent volumes. This eliminated repeated downloads while reducing dependence on external services during live requests.
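The cache-aside pattern behind the hybrid approach can be sketched in a few lines. This is an illustrative stand-in, not the project's actual code: `fetch_from_s3` is a hypothetical callable (in practice a boto3 download), and the persistent-volume mount path is assumed.

```python
from pathlib import Path

def get_raster(key: str, fetch_from_s3, cache_dir: Path = Path("/mnt/raster-cache")) -> Path:
    """Return a local path for the raster, downloading from S3 only on a cache miss.

    `cache_dir` is assumed to sit on a Kubernetes persistent volume, so the
    one-time download cost is paid per pod, not per request.
    """
    local = cache_dir / key
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        local.write_bytes(fetch_from_s3(key))  # one-time cost per pod
    return local
```

Subsequent requests for the same raster never touch S3, which is what removed the latency spikes tied to external services.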
With data access stabilized, the next bottleneck became clear: too much data was being processed per request.
The fix was architectural. Instead of loading entire regional rasters and filtering down, the pipeline was redesigned to extract only the raster windows intersecting the user's drawn boundary (think of a polygon drawn on an interactive map). Request runtime now scaled with field size, not dataset size.
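The core of that redesign is mapping the polygon's geographic bounding box to a pixel window so only those rows and columns are read. In production this would typically go through a raster library's windowing utilities; the stdlib sketch below just shows the arithmetic for a north-up raster with square pixels (the geotransform values are illustrative).

```python
def polygon_window(bounds, origin_x, origin_y, pixel_size):
    """Map geographic bounds (minx, miny, maxx, maxy) to a (row, col) pixel window.

    Assumes a north-up raster: x grows with columns, y shrinks as rows increase.
    Returns half-open ranges (row_start, row_stop), (col_start, col_stop).
    """
    minx, miny, maxx, maxy = bounds
    col_start = int((minx - origin_x) // pixel_size)
    col_stop = int(-(-(maxx - origin_x) // pixel_size))   # ceiling division
    row_start = int((origin_y - maxy) // pixel_size)       # rows count downward
    row_stop = int(-(-(origin_y - miny) // pixel_size))
    return (row_start, row_stop), (col_start, col_stop)

# A small field inside a large regional raster yields a tiny window:
# reading cost now tracks the drawn boundary, not the dataset.
rows, cols = polygon_window((25.0, 55.0, 45.0, 75.0),
                            origin_x=0.0, origin_y=100.0, pixel_size=10.0)
print(rows, cols)  # (2, 5) (2, 5)
```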
That change moved 20–30 minute workflows into an interactive range. Some completed in seconds.
Speed alone wasn't sufficient. The system also needed to be predictable under load.
Geospatial clustering (identifying zones with similar characteristics) was removed from the live request path entirely. Clusters were pre-calculated and versioned as static files loaded at startup. When clustering parameters changed, a batch job regenerated the files. The complexity was still there. It just wasn't happening during a user's request.
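Loading versioned, precomputed artifacts at startup can be as simple as picking the newest file by version number. The file naming scheme and JSON layout below are illustrative assumptions; the real batch job would write a new version whenever clustering parameters change.

```python
import json
from pathlib import Path

def load_clusters(cluster_dir):
    """Load the newest versioned cluster file, e.g. clusters_v3.json.

    Versions are compared numerically so clusters_v10 sorts after clusters_v2.
    Called once at application startup; no clustering runs during requests.
    """
    versions = sorted(Path(cluster_dir).glob("clusters_v*.json"),
                      key=lambda p: int(p.stem.split("_v")[1]))
    if not versions:
        raise FileNotFoundError("no precomputed cluster files found")
    return json.loads(versions[-1].read_text())
```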
The recommendation engine was rebuilt using a linear programming model (PuLP) that evaluated availability alongside optimality. When a preferred option was out of stock, the system returned feasible alternatives rather than failing or ignoring the constraint. This shifted the tool from an academic exercise into something a frontline user could act on.
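A minimal sketch of that idea in PuLP: maximize the suitability score of the chosen option, with a hard constraint that out-of-stock options cannot be selected. The option names, scores, and stock flags are invented for illustration; the production model would carry many more variables and constraints.

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, PULP_CBC_CMD

# Hypothetical candidates: a suitability score plus current availability.
options = {
    "variety_a": {"score": 0.92, "in_stock": False},  # best fit, but unavailable
    "variety_b": {"score": 0.85, "in_stock": True},
    "variety_c": {"score": 0.71, "in_stock": True},
}

prob = LpProblem("recommendation", LpMaximize)
pick = {name: LpVariable(f"pick_{name}", cat="Binary") for name in options}

# Objective: maximize suitability of the selected option.
prob += lpSum(options[n]["score"] * pick[n] for n in options)

# Availability constraint: out-of-stock options are infeasible, not ignored.
for n, opt in options.items():
    if not opt["in_stock"]:
        prob += pick[n] == 0

# Exactly one recommendation per request.
prob += lpSum(pick.values()) == 1

prob.solve(PULP_CBC_CMD(msg=0))
chosen = [n for n in options if pick[n].value() == 1]
print(chosen)  # best feasible option given stock: ['variety_b']
```

Because availability is a constraint rather than a post-hoc filter, the solver returns the best *feasible* option instead of failing when the top-scoring one is out of stock.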
The re-architecture made the system production-ready. What had been a 20–30 minute workflow became interactive. What had been unpredictable became reliable.
Three architectural principles drove the improvement: reading only the data a request needs, moving expensive computation out of the live request path, and caching shared artifacts close to the workload.
The results confirmed that the underlying science could work at production scale, but only with an architecture designed for it.
If you're scaling a POC into a production system: test data access patterns with real datasets early. Measure where time is actually spent. Lock in your approach to data before building features on top of it. Experimentation beats assumptions.
Scaling a data-intensive POC into production? The decisions that matter most happen before the first feature ships. Learn more about Six Feet Up's approach to data challenges.


