Software Engineer (AI Platforms, ML Systems, and Backend Infrastructure)

Jatin Kulkarni

I am a software engineer building production systems for AI platforms and data-intensive products. My recent work spans AWS platform engineering, scalable infrastructure, and applied ML projects in retrieval, vision, and multimodal systems.

Software Engineering · AWS AI Platforms · Seattle, WA · Cornell Tech M.Eng. '25 · UT Austin B.S. '24

Selected Work

Software engineering impact across platform and ML systems

AWS AI Platforms: Scalable ML Infrastructure and Production Systems

Built production ML systems infrastructure for SageMaker Training Plans, focused on reserved-capacity procurement, allocation, and validation for training and inference workloads.

  • Reduced deployment/configuration turnaround from 6–10 days to 1–2 days
  • Reduced customer friction by ~96% through data-driven reserved-capacity limit redesign
  • Contributed to inference-related Training Plans workflows for reserved GPU/accelerator capacity
View full project

HEDWIG: Learning Geospatial Embeddings for Large-Scale Retrieval

Built a ViCLIP-based geolocation system that learns richer geospatial embeddings from multi-frame panoramic imagery and captions.

  • Reduced median top-1 prediction error by over 1,600 km vs. CLIP baseline
  • Increased predictions within 750 km by nearly 4x
  • Improved top-1 retrieval quality across distance thresholds
View full project

Multimodal Medical Image Classification using CLIP and ResNet

Compared multimodal and image-only approaches to diabetic retinopathy classification, including interpretability and embedding analyses.

  • ResNet-50 reached 92.52% classification accuracy
  • Two-stage CLIP pipeline reached 89.31% accuracy
  • Used Grad-CAM and t-SNE for model behavior analysis
View full project

About

From machine learning curiosity to production platform work.

I was first drawn to machine learning because it sat at the intersection of mathematics, software, and human behavior. That curiosity turned into research in computer vision and geolocation, internships building AI-assisted quality tooling, and now full-time engineering work on AI platforms at AWS.

What keeps me engaged is the translation layer between an idea and a dependable system: reading through messy constraints, understanding how customers actually experience them, and shaping software that makes the right thing easier. My background across computer science, entrepreneurship, and design helps me stay technical without losing the product lens.

View My Resume

What I Work On

ML systems infrastructure and AI platform engineering

I enjoy building backend systems for reserved compute capacity, shaping reliability-focused platform workflows, and making product decisions that keep ML infrastructure dependable in practice.

How I Think

Deep technical analysis with clear business tradeoffs

My best work usually starts with digging into root causes, simplifying the moving pieces, and then finding the version of the solution that is scalable for both engineers and users.

Recent Signals

  • Reduced support friction by about 96% through reserved-capacity limit analysis and redesign
  • Helped enable zero-touch region expansion by removing manual configuration work
  • Built full-stack and research projects spanning FastAPI, Angular, FAISS, and ViCLIP

Experience

Building across research, product, and platform engineering.

My recent work spans production AI infrastructure, applied ML internships, and research that keeps me grounded in the underlying models.

AI Platforms

Amazon Web Services

Software Development Engineer

Bellevue, WA

July 2025 - Present

  • Build systems for reserved-capacity procurement, allocation, and validation for ML training and inference workloads on SageMaker Training Plans.
  • Contributed to Training Plans support for inference-related reserved-capacity workflows, including GPU/accelerator capacity paths used for production ML inference deployments.
  • Developed customer-facing and internal validation paths for reserved-capacity APIs and resource flows, including safeguards that keep restricted capacity out of unsupported customer experiences.
  • Reduced feature deployment lead time from 6–10 days to 1–2 days by helping move configuration storage to AWS AppConfig.
  • Analyzed 290+ limit-increase tickets and drove a redesign that reduced customer friction by about 96%.
  • Implemented dynamic region configuration across partner codebases, enabling zero-touch region expansion and removing manual launch coordination.
Public AWS write-up

Visual Computing Group

Cornell University

Researcher

New York, NY

Fall 2024 - Spring 2025

  • Researched deep learning-based image geolocation using hierarchical embeddings and ViCLIP-based representations.
  • Ran ablation studies across embedding strategies and training setups to improve geolocation accuracy.
  • Contributed to an ongoing manuscript on geospatial reasoning and visual representation learning.

AI Research & Development

Aristocrat Technologies

Intern

Austin, TX

January 2024 - June 2024

  • Built LLM-driven workflows with Retrieval-Augmented Generation to improve test case generation and root cause analysis.
  • Developed defect prediction approaches aimed at improving software reliability and reducing flaky tests.
  • Partnered with engineering teams to translate research ideas into practical quality-assurance tooling.

Technical

Kellogg Brown & Root International, Inc.

Intern

Houston, TX

May 2023 - August 2023

  • Implemented an automation pipeline for migrating enterprise documents into Azure Cognitive Search.
  • Configured AWS IAM and Azure Active Directory access patterns for new projects and roles.
  • Analyzed more than 600,000 support tickets to identify trends and improve IT department efficiency.

Software Engineering

Responsible Artificial Intelligence Institute

Intern

Austin, TX

January 2022 - December 2022

  • Helped redesign the RAI Collab platform to improve performance and reduce maintenance overhead.
  • Reworked front-end UX and tooling for the RAI Collab Portal to accelerate delivery.
  • Built the AI Regulatory Tracker with React and Firebase to track 170+ global AI policies.

Undergraduate Research

UT Austin Autonomous Robotics

Researcher

Austin, TX

August 2021 - December 2021

  • Studied perceived safety in human-robot interaction experiments involving Boston Dynamics Spot.
  • Used ROS, RVIZ, Azure Kinect, and SLAM workflows to model test spaces and track subjects in real time.
  • Combined technical implementation with experimental analysis to understand comfort and safety tradeoffs.

Projects

Software engineering projects across AI platforms, ML systems, and product infrastructure.

Structured for quick scanning: problem, implementation approach, and measurable impact. Featured projects are formatted so they can be linked directly from job applications.

Fall 2024 - Spring 2025

HEDWIG: Learning Geospatial Embeddings for Large-Scale Retrieval

Representation Learning, Multimodal Retrieval, and Geolocation

Built a ViCLIP-based geolocation system that learns richer geospatial embeddings from multi-frame panoramic imagery and captions.

  • Reduced median top-1 prediction error by over 1,600 km vs. CLIP baseline
  • Increased predictions within 750 km by nearly 4x
  • Improved top-1 retrieval quality across distance thresholds

Spring 2024

Multimodal Medical Image Classification using CLIP and ResNet

Multimodal Learning and Interpretability

Explored multimodal and image-only approaches for diabetic retinopathy classification, comparing semantic alignment against fine-grained visual discrimination.

  • ResNet-50 achieved 92.52% accuracy
  • Two-stage CLIP approach achieved 89.31% accuracy
  • Used Grad-CAM and t-SNE to analyze explainability and embedding structure
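The t-SNE step mentioned above can be sketched in a few lines: project high-dimensional image embeddings down to 2-D to inspect whether classes cluster. The random vectors here are stand-ins for the actual CLIP/ResNet features, and the parameter choices are illustrative, not the project's settings.

```python
# Minimal sketch of a t-SNE embedding analysis, assuming scikit-learn.
# Random vectors stand in for real CLIP/ResNet image embeddings.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(42)
embeddings = rng.standard_normal((60, 128)).astype("float32")  # 60 samples, 128-D

# Perplexity must be smaller than the number of samples; PCA init is a
# common choice for more stable layouts.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=42)
projected = tsne.fit_transform(embeddings)  # shape (60, 2)
```

Plotting `projected` colored by class label is what reveals whether the embedding space separates the diagnostic categories.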

2025 - Present

AWS AI Platforms: Scalable ML Infrastructure and Production Systems

Production ML Infrastructure and Reliability

Software engineering work on SageMaker Training Plans infrastructure for reserved-capacity procurement, allocation, and validation across ML training and inference workloads.

  • Reduced deployment/configuration turnaround from 6–10 days to 1–2 days
  • Reduced customer friction by ~96% through data-driven reserved-capacity limit redesign
  • Contributed to inference-related Training Plans workflows for reserved GPU/accelerator capacity

Fall 2024

Handwritten Equation Recognition using CNNs, Vision Transformers, and Seq2Seq Models

Computer Vision + Sequence Modeling

Built a multi-stage pipeline to convert handwritten equations into LaTeX using visual recognition and sequence generation models.

  • ViT-Base reached 72.87% symbol recognition accuracy
  • Seq2Seq model achieved 66.19% exact match
  • Reached a BLEU score of 0.8428

Spring 2025

FinQ-RAG

Full-Stack Retrieval-Augmented Generation System

Built a full-stack application for answering natural language questions over financial PDFs using FastAPI, Angular, FAISS, and Hugging Face models.

Spring 2024

Neural Network Final Project: Harmonizing Genres

CS 342 Neural Networks

Predicted music genres by combining CNN-based spectrogram understanding with RNN/LSTM sequence modeling for richer audio representation.

Fall 2023

Method Cards

Design Thinking Capstone Project

Built an interactive web experience that turns design thinking workflows into guided, game-like problem-solving steps.

Writing

Notes on technology, learning, and the ideas behind the work.

I use the blog to think in public about machine learning, software engineering, and the projects that shaped how I work.


Contact

Let's build something thoughtful.

I'm especially interested in machine learning engineering, applied AI, and product-minded software work. If you want to talk about an opportunity, a project, or just compare notes on building useful ML systems, I'd love to hear from you.