feat: [WIP] Integration of SmolDocling #708

Draft
wants to merge 14 commits into
base: main
from

Conversation

maxmnemonic
Contributor

@maxmnemonic maxmnemonic commented Jan 8, 2025

Preliminary integration of the SmolDocling model and a VLM pipeline:

  • SmolDocling inference model
  • New VLM pipeline that uses the SmolDocling model
  • Assembly code that builds a Docling document from the Doc-tags format predicted by SmolDocling
  • Usage example
  • Rudimentary speed-measurement logging
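
For readers wondering what rudimentary speed-measurement logging can look like, here is a minimal sketch. It is not the code from this PR: the logger name, the `timed` helper, and `fake_page_inference` are all hypothetical stand-ins, and the only assumption is that each inference step can be wrapped in a context manager that records wall-clock time.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("vlm_timing")


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[label] = elapsed
        logger.info("%s took %.3f s", label, elapsed)


def fake_page_inference(page_no):
    # Placeholder standing in for a model call; this is not SmolDocling.
    return f"page {page_no}"


timings = {}
with timed("page_1", timings):
    output = fake_page_inference(1)
```

After the `with` block, `timings["page_1"]` holds the elapsed seconds for that step, and the same value is emitted at INFO level if logging is configured.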

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.


mergify bot commented Jan 8, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
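
The title rule above can be checked locally with a short script. This is a sketch that assumes the rule behaves like an anchored Python `re` match; the pattern is copied from the rule, the sample titles are the three titles this pull request has carried, and the helper name `is_conventional` is made up for illustration.

```python
import re

# Pattern copied verbatim from the merge-protection rule.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.match(title) is not None


# The three titles recorded in this pull request's history.
titles = [
    "WIP: Integration of SmolDocling pipeline",
    "WIP: Integration of SmolDocling",
    "feat: [WIP] Integration of SmolDocling",
]
results = {title: is_conventional(title) for title in titles}
```

Under that assumption, only the `feat: [WIP] ...` title matches, because `WIP` is not one of the allowed type prefixes.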

@maxmnemonic maxmnemonic changed the title WIP: Integration of SmolDocling pipeline WIP: Integration of SmolDocling Jan 8, 2025
@maxmnemonic maxmnemonic force-pushed the mly/smol-docling-integration branch from e4a60ae to 48faf18 Compare January 8, 2025 15:12
@cau-git cau-git changed the title WIP: Integration of SmolDocling feat: [WIP] Integration of SmolDocling Jan 10, 2025
cau-git and others added 13 commits January 16, 2025 16:23
Signed-off-by: Christoph Auer <[email protected]>
Signed-off-by: Maksym Lysak <[email protected]>
…e assembly code, example included.

Signed-off-by: Maksym Lysak <[email protected]>
…s in VLM pipeline. This enables correct figure extraction and page numbers in provenances

Signed-off-by: Maksym Lysak <[email protected]>
…easurement in smol_docling models

Signed-off-by: Maksym Lysak <[email protected]>
@maxmnemonic maxmnemonic force-pushed the mly/smol-docling-integration branch from 64e854e to 354c90a Compare January 16, 2025 15:52
2 participants