Include GPT-4 V model to be able to search for images and embedding images. #323
Comments
Reference here: Azure-Samples/azure-search-openai-demo#1056
Update 22nd April: After spiking possible technology choices, I believe the best way forward is to:
Then when querying, generate embeddings of the question using both Azure Computer Vision and
Note: this does require us to change the index to allow for an additional
I was initially going to create an ADR deciding which tools would be best to use, but given my research, spike and investigation into how this is implemented in Azure-Samples/azure-search-openai-demo#1056, I now believe using both approaches combined will give the best results. The next step is to start building this into CWYDSA.
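For illustration only, here is a minimal sketch of the dual-embedding query idea from this update: the question is embedded twice, and each indexed document is compared against both query vectors. Everything in it is an assumption made for the example, not the CWYDSA or Azure AI Search implementation: the field names `text_vector` and `image_vector`, the in-memory list standing in for the index, the toy vectors standing in for real embeddings, and the choice to combine the two similarities by taking the maximum.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, text_query_vector, image_query_vector, top=3):
    """Rank documents by their best match against either query vector.

    `index` is a list of dicts with hypothetical fields `id`, `text_vector`
    and an optional `image_vector` (the additional field for image embeddings).
    """
    scored = []
    for doc in index:
        score = cosine(text_query_vector, doc["text_vector"])
        # Documents without an image embedding are scored on text alone.
        if doc.get("image_vector") is not None:
            score = max(score, cosine(image_query_vector, doc["image_vector"]))
        scored.append((score, doc["id"]))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top]]


# Toy index: vectors are made up and stand in for real embeddings.
index = [
    {"id": "text-doc", "text_vector": [1.0, 0.0], "image_vector": None},
    {"id": "diagram", "text_vector": [0.0, 1.0], "image_vector": [1.0, 0.0, 0.0]},
    {"id": "other", "text_vector": [0.0, 1.0], "image_vector": [0.0, 1.0, 0.0]},
]
```

With this toy data, a question whose text embedding matches only `text-doc` returns that document, while a question whose image-space embedding matches the diagram returns `diagram` even though its text vector does not match. A real search service would fuse the two rankings in its own way rather than with a simple maximum.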
Update 23rd April:
Update 28th May: The core tasks for this story have been completed, namely uploading images with advanced image processing and querying data based on these images, passing them to the LLM. Some tasks remain outstanding: updating the prompts to include the images that are passed to the LLM, and getting this to work with integrated vectorisation. However, it may be better to move these into their own issues so this main epic can be closed.
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 30 days.
Motivation
Company data often comprises various types of images, including screenshots, maps, and diagrams. By enabling the chat admin app to ingest and process these images, it can provide more accurate and relevant responses to user queries that involve visual data. This ensures that the chat app can fully utilise all available company data to deliver an improved user experience.
Note: Image processing is only available using GPT-4 https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-4-and-gpt-4-turbo-preview
How would you feel if this feature request was implemented?
Requirements
gpt-4-vision to generate a response
Stretch: Fallback to OCR/document intelligence if image of a document detected
Stretch: Allow images to be uploaded when chatting
Tasks
/api/conversation/custom #750
/api/conversation/azure_byod #752
Bugs