List of all completed Data Science projects
Project Name | Description | Domain | Technical Stack |
---|---|---|---|
Partisanship Analysis | extract features from newspaper issues, predict their partisanship using Machine Learning algorithms, and analyse the results | Information Retrieval, Data Analysis, Data Visualization | PySpark, Seaborn, SQL |
Diabetes Prediction | predict diabetes based on a user profile using an ML algorithm | Supervised Machine Learning | XGBoost |
Jhakaas Papervala | use collaborative filtering to recommend articles | Recommendation System | LangChain, Python, Jupyter |
Dr. Jarvis | healthcare voice assistant in regional Indian languages | Natural Language Processing | Gemini, LangChain, Python, gTTS |
QnA bot | document-based question answering chatbot implemented with the RAG architecture (using a Llama model on CPU and the Hugging Face API) | Generative AI | Llama2, Python, LangChain |
Text Generation | generate descriptions from given inputs using an LLM | Generative AI | Llama2, Python, LangChain |
Bedrock.ipynb | Jupyter notebook demonstrating the use of AWS Bedrock foundation models (FMs) from a Python client | Generative AI | AWS Bedrock, Claude, Jurassic, Jupyter |
Generative Adversarial Network | build a GAN from scratch on handwritten images | Deep Learning | PyTorch |
Visualising NYPD shooting incidents | analyse the dataset and visualise insights using Python and R | Data Wrangling, Data Visualisation | Plotly, Pandas, R, Tableau, Jupyter |
Analysing Titanic Dataset | perform statistical tests and visualise insights using ggplot | Statistical Analysis, Data Visualisation | R |
PDF Extraction | use AWS services and an LLM to perform OCR on an invoice | Optical Character Recognition | Python, AWS Textract, GPT |
Introduction to MongoDB | practised information retrieval from collections using standard commands | Information Retrieval, NoSQL | MongoDB, Linux |
Exploring Hadoop | tried my hand at MapReduce programming and the Hadoop ecosystem | Big Data, Data Engineering | Hadoop, HDFS |
Cheatsheets | reference material related to Data Science. Includes cloud services, programming in Python, and interview preparation | Data Science, Machine Learning, AI, Cloud Computing, Big Data, Python, Statistics, Anomaly Detection, Data Manipulation | SQL, Pandas, SciPy, NumPy, AWS, GCP, MongoDB, ChatGPT, Docker, Kubernetes, Linux, Matplotlib, scikit-learn, OpenCV, CNN, RNN, PyTorch, Keras, Large Language Models, LangChain, Prompt Engineering, Git, MATLAB, Tableau, Plotly, ggplot, flextable |