InstaVision: AI Model Evaluation for Reduced Inference Cost

"Partnering with Tech 42 has been a fantastic experience. They helped us tackle one of our most critical business challenges: reducing AI inference costs without compromising the quality of the real-time generative event descriptions that power our platform.

Their team conducted a thorough and well-structured model evaluation POC on AWS, benchmarking a comprehensive set of vision models against our existing OpenAI GPT-4o implementation — including AWS Bedrock models such as Nova 2 Lite, Llama 4 Scout, and Llama 4 Maverick, as well as self-hosted open-source models including Qwen3-VL and InternVL3.5 across multiple instance types. They built out SageMaker-based Jupyter notebooks for both API-based and self-hosted evaluations, delivered prompt optimization strategies, and provided a clear, data-driven cost and performance analysis that gave us exactly what we needed to make confident decisions.

Throughout the engagement, the Tech 42 team was communicative, responsive, and a true partner in every sense of the word. They met our goals and expectations, and we are now moving forward with confidence based on their findings. We look forward to continuing this partnership into the next phase and highly recommend Tech 42 to any team looking to accelerate their AI initiatives."

Sagar Setu

VP of Engineering

Project summary

InstaVision, a provider of AI-driven video analytics and cloud-based surveillance solutions, needed to significantly reduce the AI inference costs of generating real-time, contextual event descriptions from security camera footage, without degrading the quality its residential and commercial customers depend on. Tech 42 designed and executed a structured proof of concept (POC) on AWS, benchmarking leading vision models available through Amazon Bedrock alongside self-hosted open-source models deployed via Amazon SageMaker, and evaluating each against InstaVision's existing validation dataset and real-world use cases. The engagement included prompt optimization strategies and a comprehensive unit economics and performance analysis across both API-based and self-hosted deployment approaches. Armed with clear, data-driven findings, InstaVision was able to make an informed model migration decision and engaged Tech 42 for a follow-on phase to implement the recommended solution.
