About me
Hello, I'm Mohan Bhosale, a Data Scientist with 4+ years of experience building ML systems that actually ship to production. I'm currently completing my Master's in Data Science at Northeastern University (GPA: 3.96). Most recently, I worked as a Data Scientist Co-op at Cohere Health, where I built predictive models analyzing 10M+ healthcare claims on AWS SageMaker and implemented end-to-end MLOps pipelines that helped optimize $8.5M in annual medical expenses.
My industry experience spans healthcare, ad-tech, and ed-tech. At Mediamint, I designed real-time Spark/Kafka pipelines, built customer segmentation models that lifted engagement by 25%, and led A/B testing that drove a 30% increase in ad performance. At Allround Club, I built a hybrid recommendation engine with TensorFlow and Apache Spark that boosted course purchases by 25% and ran churn analysis that cut attrition by 12%. I don't just train models. I build the data infrastructure, run the experiments, and deliver dashboards that stakeholders actually use.
My recent work focuses on LLMs and multi-agent AI systems in production settings. I've architected ClassifyAI, an 8-agent LLM system that automates full ML classification pipelines via LangGraph. I built HIMAS, a federated learning platform for privacy-preserving ICU mortality prediction that won 1st Place at the Google Cambridge MLOps Hackathon. I've also developed a multi-modal video ad classifier (F1: 0.81) using PyTorch and Transformers for Prof. Yakov Bart, and built Extended-Reality remote assistance systems with LLM-driven avatars in Prof. Mallesham's EXP research lab. My toolkit covers the full pipeline: PySpark ETL, feature engineering, LangChain/RAG, Docker/Kubernetes deployments, and cloud infrastructure across AWS, GCP, and Azure.
I'm looking for roles where I can combine deep ML expertise with real engineering discipline to solve hard problems at scale, especially in healthcare AI, ML infrastructure, or any team where data science means shipping production systems, not just notebooks.
What I'm doing
- Building Production-Ready ML Solutions
Experienced in architecting and deploying scalable machine learning models in production environments. Most recently, I worked with healthcare claims data, building predictive models with PySpark and ML frameworks on AWS SageMaker that helped optimize $8.5M in annual medical expenses while improving patient outcomes.
- Research
Actively engaged in research in AI, computer vision, and NLP. My research interests include multi-modal learning, generative AI, and large language models. I've built multi-modal video ad classifiers and RAG-based document analysis systems.
- Data Analytics
Experienced in extracting actionable insights from complex, large-scale datasets using statistical methods and visualization tools. Proficient in designing real-time data pipelines, running A/B tests, and building dashboards that inform business decisions and strategy.
- Problem Solving
I take on data science problems across domains, from healthcare optimization to marketing automation, and deliver analytics and machine learning solutions that improve efficiency, reduce costs, and produce measurable business impact.