Introduction
Incedo Lighthouse™ began as an experimental proof of concept (PoC) to validate that AI could automate problem discovery and insight generation through KPI Trees. Over time, it evolved into a robust, multi-tenant SaaS platform deployed on AWS, enabling enterprise customers to discover problems through AI-powered analytics.
This blog outlines the technical and strategic journey of transforming Incedo Lighthouse™ from an early prototype into a production-grade SaaS platform, highlighting the key AWS services leveraged, the architectural decisions made, and the lessons learned along the way.
Phase 1: Business Planning & Strategy
At inception, Incedo Lighthouse™ was designed to address a common pain point: how can large organizations (enterprises with 1,000+ employees) discover problems in their business without manual data analysis?
We started by identifying our target personas:
- Business users who need quick insights into operational breakdowns, digestible in under 2 minutes
- C-level executives who need an at-a-glance view of business performance
- Data analysts who need to trace anomalies to their root cause
- Data engineers who require consistent data pipelines and model performance
We aligned our product goals to AWS’s SaaS Journey Framework, ensuring we had a strong foundation:
- Define measurable KPIs (metrics)
- Identify filters and cohorts
- Identify compliance and isolation requirements for multi-tenancy
Phase 2: MVP Development & Early Architecture
The MVP was built using a monolithic backend and basic Python data pipelines. Key capabilities included:
- Reading and transforming Excel, CSV, and other data sources
- Validating schemas and generating KPI Trees (a minimal sketch follows this list)
- Returning insights via an API consumed by a React Frontend
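To make this concrete, here is a minimal sketch of the MVP-era validation step. The column names and dtypes are hypothetical illustrations, not the product's actual schemas, which were tenant-specific.

```python
import pandas as pd

# Hypothetical expected schema; real schemas were defined per tenant.
EXPECTED_SCHEMA = {"order_id": "int64", "region": "object", "revenue": "float64"}

def load_and_validate(path: str) -> pd.DataFrame:
    """Read an uploaded CSV and check it against the expected schema."""
    df = pd.read_csv(path)
    missing = set(EXPECTED_SCHEMA) - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
    # Coerce dtypes so downstream KPI aggregations behave predictably
    return df.astype(EXPECTED_SCHEMA)
```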
The architecture included:
- React: frontend
- Spring Boot API: REST endpoints for fetching data from database tables
- Python layer: data science batch jobs using Pandas to generate insights and run machine learning tasks
- PostgreSQL: central data store
Although functional, this approach could not handle the volume, variety, and concurrency of real-world production scenarios (processing 0.5M+ records per hour). Memory and CPU constraints and slow batch execution soon became apparent.
Phase 3: Replatforming for Scale
We adopted a modular microservices architecture and replaced bottlenecks with scalable AWS-native services. The replatforming effort focused on decomposing the monolith into specialized services, each owning a distinct step in the data processing pipeline. This not only enabled better fault isolation but also allowed services to scale independently.
Microservices Introduced:
- Core API Service
- Built with Spring Boot
- Accepts user-uploaded files
- Tenant management
- Stores files and their metadata in PostgreSQL
- Login Service
- Provides all authentication services
- Isolates authorization complexity
- Data Science Service
- Implemented using PySpark jobs on Amazon EMR
- Performs schema validation, typecasting, date conversions, and missing value imputation
- Dynamically adjusts resources based on file size and complexity using EMR auto scaling
- Writes transformed data to disk in Parquet format with partitioned paths (see the PySpark sketch after this list)
- Detects anomalies
- React Service
- Serves the React frontend through Amazon CloudFront, which is AWS native
- Used to create and update KPI Trees
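The transformation steps in the Data Science Service roughly follow the shape below. This is a sketch, not the production job: the column names, S3 paths, and tenant/date partitioning scheme are illustrative assumptions.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("lighthouse-transform").getOrCreate()

# Paths and columns are placeholders; tenant_id would come from upload metadata.
df = spark.read.csv("s3://example-bucket/uploads/orders.csv", header=True)

df = (
    df.withColumn("revenue", F.col("revenue").cast("double"))           # typecasting
      .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))  # date conversion
      .fillna({"revenue": 0.0})                                         # missing value imputation
      .withColumn("tenant_id", F.lit("tenant_acme"))                    # illustrative tenant tag
)

# Partitioned Parquet lets downstream KPI jobs prune by tenant and date
df.write.mode("overwrite").partitionBy("tenant_id", "order_date") \
    .parquet("s3://example-bucket/curated/orders/")
```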
By decomposing the system into these modular components, we enabled independent scaling, faster deployment cycles, and better maintainability. Each service can be versioned, monitored, and secured individually, allowing Incedo Lighthouse™ to meet the performance and compliance expectations of a modern SaaS offering.
Core Technology Changes:
- Replaced Pandas batch jobs with PySpark on AWS EMR, sized at 4 executors (8 GB RAM each) and a 16 GB driver (a configuration sketch follows the metrics below)
- Deployed AI models for real-time anomaly detection and clustering
- Adopted additional core AWS services, including Amazon GuardDuty, AWS Network Firewall, Amazon API Gateway, Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Registry (Amazon ECR), Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Simple Storage Service (Amazon S3), Amazon CloudWatch, AWS CloudTrail, Amazon Cognito, Amazon Relational Database Service (Amazon RDS), NAT Gateway, Amazon Route 53, Amazon CloudFront, and AWS Web Application Firewall (AWS WAF), among others
- Performance metrics:
- Average throughput: 0.8M records processed per hour under peak load
- API response time: up to 3,000 ms for standard queries
- Scalability limits: Tested up to 4M records/hour with auto-scaling EMR clusters
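For reference, the executor sizing above maps to a Spark configuration along these lines; in production the equivalent values were supplied to spark-submit on the EMR cluster, with EMR auto scaling adjusting the cluster itself.

```python
from pyspark.sql import SparkSession

# Mirrors the sizing described above: 4 executors x 8 GB, 16 GB driver.
spark = (
    SparkSession.builder.appName("lighthouse-batch")
    .config("spark.executor.instances", "4")
    .config("spark.executor.memory", "8g")
    .config("spark.driver.memory", "16g")
    .getOrCreate()
)
```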
Phase 4: Multi-Tenancy and SaaS Readiness
We selected a schema-based multi-tenancy model in PostgreSQL to isolate tenant configurations while keeping operational overhead low.
IAM roles and resource policies ensured tenant-level data isolation, and our onboarding workflow automatically provisioned schemas for each new tenant (a provisioning sketch follows the list below). Key SaaS capabilities included:
- Automated provisioning
- Role-based access control
- Custom metadata management per tenant
- Tenant-specific configurations for model thresholds, KPI rules, and transformation logic
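A simplified sketch of the schema provisioning step, assuming psycopg2 and a per-tenant kpi_config table; both the connection string and the table layout are illustrative, not the production design.

```python
import psycopg2
from psycopg2 import sql

def provision_tenant(conn, tenant_schema: str) -> None:
    """Create an isolated schema and a per-tenant config table."""
    with conn.cursor() as cur:
        cur.execute(sql.SQL("CREATE SCHEMA IF NOT EXISTS {}")
                    .format(sql.Identifier(tenant_schema)))
        cur.execute(sql.SQL(
            "CREATE TABLE IF NOT EXISTS {}.kpi_config "
            "(key TEXT PRIMARY KEY, value JSONB)"
        ).format(sql.Identifier(tenant_schema)))
    conn.commit()

conn = psycopg2.connect("dbname=lighthouse user=admin")  # placeholder DSN
provision_tenant(conn, "tenant_acme")
```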
Security Compliance:
- Data protection: encryption at rest and in transit
- Access control: IAM role-based access, least privilege
- Monitoring: Amazon GuardDuty, AWS CloudTrail, AWS Config
- Network security: AWS WAF, security groups
- Compliance frameworks: AWS Well-Architected security pillar
Phase 5: CI/CD and Operational Readiness
We formalized development workflows using:
- GitHub Actions for CI/CD
- SonarQube for code quality gates
- Pre-commit hooks and branch protection
- EKS: Infrastructure was managed with AWS CDK and CloudFormation. We containerized all microservices using Docker and deployed them on Amazon Elastic Kubernetes Service (Amazon EKS), which gave us an environment with advanced security features and efficient horizontal scaling across services. Each microservice runs as a container in its own Kubernetes pod, allowing better fault isolation, autoscaling, and zero-downtime deployments. Helm charts package and deploy Kubernetes resources consistently across environments, simplifying version management, parameterization, and rollout strategies for each service. Kubernetes ConfigMaps, Secrets, and Horizontal Pod Autoscalers manage configuration and scale services on metrics such as CPU usage or request volume. AWS Secrets Manager holds protected credentials, and AWS Systems Manager Parameter Store handles configuration management (a retrieval sketch follows).
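At startup, each service pulls its credentials and configuration roughly as follows; the secret and parameter names are placeholders, not our actual naming scheme.

```python
import json
import boto3

secrets = boto3.client("secretsmanager")
ssm = boto3.client("ssm")

# Fetch database credentials from AWS Secrets Manager
db_creds = json.loads(
    secrets.get_secret_value(SecretId="lighthouse/db-credentials")["SecretString"]
)

# Fetch pipeline configuration from AWS Systems Manager Parameter Store
batch_size = int(
    ssm.get_parameter(Name="/lighthouse/pipeline/batch-size")["Parameter"]["Value"]
)
```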
Enhanced Tracking and Observability:
- Amazon CloudWatch for logs and metrics
- Amazon GuardDuty and Config for security posture
- AWS WAF to protect APIs
- Alarms and dashboards via Amazon CloudWatch (an example alarm definition follows this list)
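As an illustration, an alarm on API Gateway 5XX errors can be defined programmatically; the API name, threshold, and SNS topic ARN below are assumptions, not our exact values.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="lighthouse-api-5xx",
    Namespace="AWS/ApiGateway",
    MetricName="5XXError",
    Dimensions=[{"Name": "ApiName", "Value": "lighthouse-core-api"}],  # assumed name
    Statistic="Sum",
    Period=300,                    # evaluate over 5-minute windows
    EvaluationPeriods=1,
    Threshold=10,                  # alert on more than 10 errors per window
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111122223333:ops-alerts"],    # placeholder ARN
)
```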
Phase 6: Real-Time Anomaly Detection
To increase responsiveness, we introduced a real-time anomaly detection pipeline:
- Ingest data continuously via Amazon Kinesis or Kafka
- Apply window-based transformations using Spark Streaming
- Store results back in PostgreSQL
This evolution positioned Incedo Lighthouse™ as a proactive analytics engine.
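A condensed sketch of this streaming path, assuming a Kafka source and Spark Structured Streaming; the topic name, event schema, window sizes, and JDBC target are all illustrative.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("lighthouse-streaming").getOrCreate()

event_schema = StructType([
    StructField("metric", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Continuous ingestion from a Kafka topic (broker and topic are placeholders)
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "kpi-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Window-based transformation: per-metric mean and stddev over 5-minute windows
windowed = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "metric")
    .agg(F.avg("value").alias("mean"), F.stddev("value").alias("std"))
    .select(F.col("window.start").alias("window_start"),
            F.col("window.end").alias("window_end"),
            "metric", "mean", "std")
)

def write_to_postgres(batch_df, batch_id):
    # JDBC URL, table, and credentials are placeholders
    batch_df.write.jdbc(
        "jdbc:postgresql://db:5432/lighthouse", "kpi_window_stats",
        mode="append", properties={"user": "admin", "password": "..."},
    )

windowed.writeStream.outputMode("append").foreachBatch(write_to_postgres).start()
```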
Phase 7: Cost Optimization
Operating large Spark jobs and ML inference at scale can be expensive. We adopted:
- Spot Instances for EMR jobs
- S3 Intelligent Tiering for data storage
These optimizations reduced costs by roughly 40% within two months of replatforming (exact figures are confidential).
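The S3 side of this is a one-time lifecycle rule; here is a sketch using a placeholder bucket, prefix, and transition age.

```python
import boto3

s3 = boto3.client("s3")

# Move curated data to Intelligent-Tiering after 30 days (bucket/prefix are placeholders)
s3.put_bucket_lifecycle_configuration(
    Bucket="example-lighthouse-data",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "intelligent-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "curated/"},
            "Transitions": [{"Days": 30, "StorageClass": "INTELLIGENT_TIERING"}],
        }]
    },
)
```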
Phase 8: Go-To-Market and Beyond
After internal production validation, we prepared for public onboarding:
- Implemented SaaS trial tenant flow
- Integrated with AWS Marketplace
- Added support workflows with auto-escalation via email/Amazon SNS (a minimal publish sketch follows this list)
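The escalation hook itself reduces to an SNS publish; the topic ARN and message content below are placeholders.

```python
import boto3

sns = boto3.client("sns")

sns.publish(
    TopicArn="arn:aws:sns:us-east-1:111122223333:support-escalations",  # placeholder
    Subject="[Lighthouse] Support ticket escalated",
    Message="Ticket breached its first-response SLA and was escalated to tier 2.",
)
```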
System reliability data:
- Uptime: 99.95% over the past 12 months
- Recovery Time Objective (RTO): < 15 minutes for service disruptions
- Redundancy measures: Multi-AZ deployments for PostgreSQL and EKS clusters, daily backups with point-in-time recovery
Lessons Learned
- Build SaaS mindset from day 1
Designing Incedo Lighthouse™ with a SaaS-first mindset helped us ensure that every tenant, regardless of size, had access to a consistent, scalable experience. This meant investing early in capabilities such as tenant onboarding automation, isolated configurations, self-service features, and role-based access control. It also meant architecting for self-serve provisioning, licensing control, and operational visibility across tenants.
- Use AWS managed services wherever possible
Rather than managing Spark clusters, ML model deployments, or workflow engines ourselves, we relied on services that auto-scale, recover automatically, and integrate natively with IAM and CloudWatch. This allowed us to focus on application logic rather than infrastructure.
- Design for failure and scale early
One early challenge was the failure of batch jobs during peak usage (when processing exceeds 5K records/minute). We addressed this by implementing retry mechanisms, using AWS Step Functions for fault-tolerant workflows, and monitoring each microservice with alerts. By embracing eventual consistency, idempotent operations, and queue-based decoupling, we made the system markedly more resilient (a minimal retry sketch appears after these lessons). This mindset enabled us to absorb 2x spikes in load without service degradation.
- Start with real use cases
Rather than building abstract data capabilities, we built Incedo Lighthouse™ features around actual business scenarios, like root cause analysis for delayed shipments or fluctuating revenues. These use cases helped define what metrics mattered, how anomaly detection thresholds should be tuned, and what visualizations users needed. They kept us grounded and ensured product-market fit from the outset.
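A minimal sketch of the retry pattern referenced in the "design for failure" lesson; the real workflows used AWS Step Functions, so the decorator below is only an in-process illustration of the idea.

```python
import time
from functools import wraps

def retry(attempts: int = 3, base_delay: float = 1.0):
    """Retry a failed call with exponential backoff."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise
                    time.sleep(base_delay * 2 ** attempt)  # exponential backoff
        return wrapper
    return decorator

@retry(attempts=3)
def process_batch(batch_id: str) -> None:
    # Idempotent by design: re-running the same batch_id must be safe
    ...
```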
Customer Success Metrics:
- Use cases: Automated anomaly detection for supply chain delays, financial performance monitoring
- Quantifiable benefits: Reduced issue resolution time by 45%, increased operational efficiency by 30%
- Average implementation timeframe: 2-4 weeks from onboarding to production usage
Conclusion
The journey from PoC to production for Incedo Lighthouse™ has been an iterative, insight-rich experience. By closely aligning with AWS SaaS best practices, adopting serverless and elastic compute, and embracing DevOps and cost optimization early, we built a scalable AI SaaS platform that’s helping organizations discover root causes faster and smarter.
As we expand into generative AI, deeper anomaly detection, and smarter metric exploration, Incedo Lighthouse™ continues to shine a light on what matters most: actionable insights, at scale.