Connects to OpenAI, applies Large Language Models (LLMs) and LangChain, and builds a platform to chat with coffee customer review text data using Python. Visualizes the text data with R.

Xin-Bu/Coffee_review_text_QA_LLMs


Chat with coffee review text data

NCA_Web_Medium_Decaf_1

Image source

Dataset

The dataset in this project consists of two files from the same data source: coffee_review.pdf and coffee_review.csv. The file coffee_review.pdf contains only the customer review text; its basic statistics are shown in the table below. The file coffee_review.csv has 2,095 rows and 12 columns: name, roaster, roast, loc_country, origin_1, origin_2, 100g_USD, rating, review_date, desc_1, desc_2, and desc_3.

| Items | Descriptive statistics |
| --- | --- |
| tokens | 41,064 |
| unique_tokens | 3,070 |
| avg_token_length | 6.45 |
| lexical_diversity | 0.07 |
| top_n | cup, aroma, mouthfeel, acidity, structure, finish, notes, sweet, cocoa, chocolate, syrupy |
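
The statistics in the table can be computed in a few lines. The sketch below is not the project's code: the tokenizer (lowercased alphabetic runs, no stop-word removal), the function name `descriptive_stats`, and the sample sentence are all assumptions made for illustration. It treats lexical diversity as unique tokens divided by total tokens, which is consistent with the reported values (3,070 / 41,064 ≈ 0.07).

```python
import re
from collections import Counter


def descriptive_stats(text, top_n=5):
    """Return basic corpus statistics for a string of review text.

    Hypothetical helper: the tokenizer used in the original project is
    not known, so lowercased alphabetic runs are assumed here.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_unique = len(counts)
    return {
        "tokens": n_tokens,
        "unique_tokens": n_unique,
        # Mean number of characters per token.
        "avg_token_length": sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0,
        # Share of tokens that are distinct word types.
        "lexical_diversity": n_unique / n_tokens if n_tokens else 0.0,
        # Most frequent tokens, highest count first.
        "top_n": [word for word, _ in counts.most_common(top_n)],
    }


# Made-up sample text, not taken from coffee_review.pdf.
sample = "Sweet cocoa aroma. Syrupy mouthfeel, sweet finish, cocoa notes."
stats = descriptive_stats(sample, top_n=2)
print(stats)
```

On the real review text, results would depend on the tokenizer and on whether stop words are removed, so they may not match the table exactly.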

This project has two goals: to visualize the text data in coffee_review.csv using R, and to build a question-answering platform in Python over the file coffee_review.pdf. The R code for the visuals was written in R Markdown and knitted to HTML.

Selected text visuals with R

  • A wordcloud of the most common words: word_cloud

  • Term frequency by roast: term_frequency

  • A network of bigrams: bigrams_visual

  • A world map illustrating the median coffee price by region: coffee_price

Procedures of LangChain QA application with Python

  • Load documents
  • Split documents
  • Define embedding
  • Create vector database from data
  • Define retriever
  • Create a chatbot chain
  • Create a panel-based interactive dashboard
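
The retrieval steps listed above can be sketched without any external services. The example below is a minimal, standard-library stand-in, not the project's LangChain code: a bag-of-words count vector replaces the OpenAI embedding model, an in-memory list replaces the vector database, and the "chain" returns the retrieved context instead of calling an LLM. All names (`embed`, `VectorStore`, `answer`) and the sample reviews are hypothetical, and the dashboard step is omitted.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self, chunks):
        # Store each chunk alongside its embedding.
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def answer(question, store, k=1):
    """Placeholder for the chatbot chain: returns retrieved context, no LLM call."""
    return {"question": question, "context": store.retrieve(question, k=k)}


# 1. Load documents (made-up review text, not from coffee_review.pdf).
document = (
    "This Ethiopia coffee is bright with juicy acidity and floral aroma. "
    "The decaf blend has a syrupy mouthfeel and a cocoa finish. "
    "A dark roast with smoky notes and low acidity."
)
# 2. Split documents (here: one chunk per sentence).
chunks = [s.strip() for s in re.split(r"(?<=\.)\s+", document) if s.strip()]
# 3-5. Define embedding, build the vector store, and expose a retriever.
store = VectorStore(chunks)
# 6. Ask a question through the placeholder chain.
result = answer("Which coffee has a syrupy mouthfeel?", store)
print(result["context"][0])
```

In the real application, each stand-in would be replaced by the corresponding LangChain component, and the retrieved chunks would be passed to an LLM along with the question and chat history.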

Data source

Coffee Review

References

Silge, J., & Robinson, D. (2023). Text mining with R: A tidy approach. O'Reilly Media, Inc.

DeepLearning.AI. LangChain: Chat with Your Data (course).
