AI/ML — Sagar Tetali

AI | ML Portfolio

I’m a Machine Learning R&D Engineer currently working at a startup on bringing intelligence to everyday objects. These are some of my independent projects, including a project I completed as part of CIS 519: Applied Machine Learning at UPenn with Dr. Eric Eaton, where we researched Transfer Learning, NLP, and word embedding methods.

Download CV

Diabetes Risk Predictor

Tools: Flask, Docker, MLflow, Jupyter Notebook, VS Code, XGBoost, GitHub, AWS ECR, Render

A containerized, cloud-deployed Flask web application that predicts diabetes risk from the answers to a questionnaire. The app currently uses an XGBoost (Extreme Gradient Boosting) model trained on a Kaggle dataset of CDC data.
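
As a rough illustration of the prediction step, the sketch below turns questionnaire answers into a feature row and asks a classifier for a risk probability. The feature names, the `StubModel` class, and the `predict_risk` helper are all hypothetical, since the app's real fields and model loading are not described here. Any object exposing a scikit-learn-style `predict_proba` method could take the place of the stub.

```python
# Hypothetical questionnaire fields; the actual app's features may differ.
FEATURES = ("high_bp", "high_chol", "bmi", "age_group", "physically_active")


class StubModel:
    """Stand-in for a trained classifier, present only so the sketch runs."""

    def predict_proba(self, rows):
        probabilities = []
        for row in rows:
            # Toy rule (an assumption, not a medical model): two binary
            # risk factors plus any BMI above 25 raise the score.
            score = 0.2 * row[0] + 0.2 * row[1] + 0.01 * max(row[2] - 25.0, 0.0)
            p = min(max(score, 0.0), 1.0)
            probabilities.append([1.0 - p, p])
        return probabilities


def predict_risk(model, answers):
    """Return the positive-class probability for one questionnaire."""
    missing = [name for name in FEATURES if name not in answers]
    if missing:
        raise ValueError(f"missing answers: {missing}")
    # Fix the column order so it matches what the model was trained on.
    row = [float(answers[name]) for name in FEATURES]
    return float(model.predict_proba([row])[0][1])


if __name__ == "__main__":
    example = {"high_bp": 1, "high_chol": 0, "bmi": 31,
               "age_group": 9, "physically_active": 1}
    print(f"Estimated risk: {predict_risk(StubModel(), example):.2f}")
```

In a web app, a function like this would typically sit behind a form-handling route, with the model loaded once at startup rather than per request.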

Open App

View Code on GitHub

Document ChatBot: Retrieval Augmented Generation (RAG)

Tools: OpenAI API, GPT-3.5, Pinecone, Streamlit, Python, VS Code, GitHub

A Streamlit app that uses LangChain, OpenAI Embeddings, GPT-3.5 Turbo, and a Pinecone vector database to process a user-provided document. The document is chunked, and each chunk is converted to an embedding vector using OpenAI Embeddings. The embeddings are inserted into a Pinecone index, which is deleted after runtime. LangChain is used to retrieve information through the QA
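
The chunk, embed, and retrieve flow described above can be sketched without any external services. This is a simplified stand-in, not the app's code: word-count vectors replace OpenAI Embeddings, an in-memory list replaces the Pinecone index, and no language model is called. The helper names (`chunk_text`, `embed`, `cosine`, `retrieve`) and the chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    document = ("Pinecone stores vectors for similarity search. "
                "Streamlit renders the user interface in the browser.")
    pieces = chunk_text(document, chunk_size=6, overlap=0)
    print(retrieve("Where are vectors stored?", pieces, top_k=1))
```

In a full retrieval-augmented pipeline, the retrieved chunks would be placed in the prompt sent to the language model along with the user's question.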

View Code on GitHub

Research Project @UPenn

Transfer Learning: Opinion Mining in Product Reviews

Tools: Python, Jupyter Notebook, GPU-accelerated Machine Learning

A team project on which I worked alongside Tien Pham and Grace Boatman.

We compared transfer learning techniques built on word embeddings to evaluate classification performance for opinion mining. We applied transfer learning at two tiers. First, we used word embedding methods such as GloVe, BERT, and ULMFiT that had been pretrained on very large text corpora.

Second, we trained models built on these embeddings at different instance-size combinations of two datasets: a “source” dataset of Amazon tech product reviews and a “target” dataset of TripAdvisor reviews. We then evaluated predictive performance on the target TripAdvisor dataset and compared how well the models generalized across non-i.i.d. datasets.
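
One way to picture the instance-size combinations is as a grid of (source count, target count) pairs, each producing one mixed training set. The sketch below is an illustration only: the sizes, the toy review tuples, and the helper names `size_grid` and `build_training_set` are hypothetical, and the project's actual sizes and sampling scheme are not stated here.

```python
import itertools
import random


def size_grid(source_sizes, target_sizes):
    """All (n_source, n_target) pairs that yield a non-empty training set."""
    return [
        (n_source, n_target)
        for n_source, n_target in itertools.product(source_sizes, target_sizes)
        if n_source + n_target > 0
    ]


def build_training_set(source, target_train, n_source, n_target, seed=0):
    """Sample n_source source and n_target target examples, then shuffle."""
    if n_source > len(source) or n_target > len(target_train):
        raise ValueError("requested more examples than are available")
    rng = random.Random(seed)  # seeded so each combination is reproducible
    mixed = rng.sample(source, n_source) + rng.sample(target_train, n_target)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Toy stand-ins for (review text, label) pairs from each domain.
    source = [(f"amazon review {i}", i % 2) for i in range(100)]
    target_train = [(f"tripadvisor review {i}", i % 2) for i in range(50)]
    for n_source, n_target in size_grid([0, 50, 100], [0, 25, 50]):
        train = build_training_set(source, target_train, n_source, n_target)
        # A model would be trained on `train` here and scored on a
        # held-out target test set that never overlaps target_train.
        print(n_source, n_target, len(train))
```

Keeping the target test set fixed across all combinations is what makes the scores comparable from one grid cell to the next.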

Project Report

Certifications

I’m a TensorFlow Certified Developer, skilled in building neural networks for Image Classification, NLP, and Time Series Analysis.