Refactor TextGeneration class to support development usecases (quic#165)
* Refactor TextGeneration class to support development usecases

Development use cases require:
 - Serving successive requests in the same session.
In the current class structure, TextGeneration object initialization
is coupled to a specific combination of prompt and generation length.
Each request therefore pays the overhead of creating a
QAICInferenceSession/loading the qpc and extracting its characteristics.
 - Yielding tokens as they are generated.
The current call flow uses TextStreamer to print generated tokens
to the console. An API that yields tokens as they are decoded
offers a cleaner solution for development purposes. Adopting
this approach for the high-level APIs within QEfficient would be
overkill.

- Move components of the TextGeneration class into a base class that
primarily handles loading a QAICInferenceSession and the low-level
methods that fetch and use information from the session object.
- Code maintenance to indicate the scope of variables/methods in the
base class.
- Add a setup method to the TextGeneration class to reset storage
variables for a new request.
- Add an API to yield decoded tokens as they are generated.
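
The structure described above can be illustrated with a minimal sketch.
It is not the QEfficient implementation: FakeSession, SessionBase,
StreamingGenerator, stream and every method name below are hypothetical
stand-ins, and the "decoder" is a toy that echoes prompt words. The sketch
only shows the pattern: the session is loaded once in a base class,
per-request state is reset by a setup method, and tokens are yielded as
they are produced.

```python
from typing import Iterator, List


class FakeSession:
    """Hypothetical stand-in for an expensive-to-create inference session."""

    loads = 0  # number of sessions created so far

    def __init__(self, model_path: str) -> None:
        FakeSession.loads += 1
        self.model_path = model_path

    def run(self, prompt: str, step: int) -> str:
        # Toy decoder: cycle through the prompt words in upper case.
        words = prompt.split() or ["<empty>"]
        return words[step % len(words)].upper()


class SessionBase:
    """Owns the session and the low-level helpers that use it."""

    def __init__(self, model_path: str) -> None:
        # Created once per object, not once per request.
        self._session = FakeSession(model_path)

    def _decode_step(self, prompt: str, step: int) -> str:
        return self._session.run(prompt, step)


class StreamingGenerator(SessionBase):
    """Serves successive requests on the same session."""

    def __init__(self, model_path: str) -> None:
        super().__init__(model_path)
        self._setup()

    def _setup(self) -> None:
        # Reset per-request storage before handling a new request.
        self.generated: List[str] = []

    def stream(self, prompt: str, generation_len: int) -> Iterator[str]:
        # Yield each token as soon as it is decoded.
        self._setup()
        for step in range(generation_len):
            token = self._decode_step(prompt, step)
            self.generated.append(token)
            yield token


gen = StreamingGenerator("dummy-path")  # placeholder path, nothing is read
for tok in gen.stream("hello world", 3):
    print(tok)
```

Because stream is a generator, the caller decides what to do with each
token (print it, forward it, collect it), and a second call to stream
reuses the session created in the constructor.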

Signed-off-by: quic-suppugun <[email protected]>

* Revert reordering of methods

Signed-off-by: quic-suppugun <[email protected]>

* Update docstrings, Cleanup TextGeneration class

Signed-off-by: quic-suppugun <[email protected]>

* Format and lint

Signed-off-by: quic-suppugun <[email protected]>

* Added test module and minor fix

Signed-off-by: Rishin Raj <[email protected]>

* Format

Signed-off-by: Rishin Raj <[email protected]>

* Device ID fix

Signed-off-by: Rishin Raj <[email protected]>

---------

Signed-off-by: quic-suppugun <[email protected]>
Signed-off-by: Rishin Raj <[email protected]>
Co-authored-by: Rishin Raj <[email protected]>
quic-suppugun and quic-rishinr authored Dec 5, 2024
1 parent 7c61470 commit f6bffae
Showing 3 changed files with 482 additions and 194 deletions.