Better registration support for a wide range of third-party hardware #20349

Open: uniartisan wants to merge 14 commits into master from device-enhance

@uniartisan uniartisan commented Oct 19, 2024

What does this PR do?

Thank you to the Lightning team for providing such an easy-to-use, clearly designed library.

This draft PR aims to provide better registration support for a wide range of third-party hardware. It is designed to integrate such hardware with minimally intrusive changes, covering Intel XPU, Huawei Ascend NPU, Cambricon, Moore Threads, and more.

Fixes #<issue_number>

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

📚 Documentation preview 📚: https://pytorch-lightning--20349.org.readthedocs.build/en/20349/

@github-actions github-actions bot added the fabric (lightning.fabric.Fabric) and pl (Generic label for PyTorch Lightning package) labels Oct 19, 2024
@uniartisan
Author

Examples here: https://github.com/uniartisan/RWKV-PEFT/blob/device-enhance/train.py#L499

There are still a lot of things to check. I will try to do that later and make it clearer in the documentation.

@github-actions github-actions bot added the docs (Documentation related) label Oct 19, 2024
@uniartisan uniartisan marked this pull request as ready for review October 19, 2024 07:13
@uniartisan uniartisan changed the title from "Device enhance" to "Better registration support for a wide range of third-party hardware" Oct 19, 2024
@uniartisan uniartisan force-pushed the device-enhance branch 11 times, most recently from 8f0b3d6 to 2a89640, October 22, 2024 06:58

codecov bot commented Oct 22, 2024

Codecov Report

Attention: Patch coverage is 72.63158% with 26 lines in your changes missing coverage. Please review.

Project coverage is 87%. Comparing base (9177ec0) to head (dee68b1).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #20349    +/-   ##
========================================
- Coverage      88%      87%    -1%     
========================================
  Files         267      267            
  Lines       23380    23459    +79     
========================================
- Hits        20481    20377   -104     
- Misses       2899     3082   +183     

@uniartisan uniartisan force-pushed the device-enhance branch 2 times, most recently from 1c83154 to 15595bf, October 22, 2024 08:00
Collaborator

@lantiga lantiga left a comment

Thank you for the interesting PR! I added a few comments.

src/lightning/fabric/accelerators/accelerator.py (outdated, resolved)
src/lightning/fabric/plugins/precision/amp.py (outdated, resolved)
src/lightning/fabric/plugins/precision/fsdp.py (outdated, resolved)
src/lightning/fabric/plugins/precision/fsdp.py (outdated, resolved)
src/lightning/fabric/plugins/precision/fsdp.py (outdated, resolved)
src/lightning/fabric/strategies/ddp.py (outdated, resolved)
src/lightning/fabric/strategies/deepspeed.py (outdated, resolved)
@lantiga lantiga added the waiting on author (Waiting on user action, correction, or update) label Dec 10, 2024
@lantiga
Collaborator

lantiga commented Dec 10, 2024

hey @uniartisan are you willing to finish this one up? It would be a welcome contribution

@uniartisan
Author

> hey @uniartisan are you willing to finish this one up? It would be a welcome contribution

Sorry for my late reply. I will address them tomorrow!
😝😝😝😝

@uniartisan uniartisan force-pushed the device-enhance branch 3 times, most recently from bdb81d4 to 01a931d, December 20, 2024 03:18
@fritol

fritol commented Dec 23, 2024

will you add DirectML?

@uniartisan
Author

> will you add DirectML?

What I've added is the plug-in registration code, which means you can register DirectML yourself. Just write a few simple functions 🤓
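
As a rough illustration of what "register it yourself" could look like, here is a minimal, self-contained sketch of a plug-in registry. It does not use Lightning's actual API or the code in this PR: the names `BaseAccelerator`, `ACCELERATOR_REGISTRY`, `register_accelerator`, `resolve_accelerator`, and `DirectMLAccelerator` are all hypothetical, and the device probing is a stub.

```python
from typing import Callable, Dict, Type


class BaseAccelerator:
    """Minimal stand-in interface (hypothetical, not Lightning's Accelerator)."""

    @staticmethod
    def is_available() -> bool:
        raise NotImplementedError

    @staticmethod
    def device_count() -> int:
        raise NotImplementedError


# Hypothetical registry: maps a short name to an accelerator class.
ACCELERATOR_REGISTRY: Dict[str, Type[BaseAccelerator]] = {}


def register_accelerator(
    name: str,
) -> Callable[[Type[BaseAccelerator]], Type[BaseAccelerator]]:
    """Class decorator that records an accelerator class under `name`."""

    def decorator(cls: Type[BaseAccelerator]) -> Type[BaseAccelerator]:
        if name in ACCELERATOR_REGISTRY:
            raise ValueError(f"accelerator {name!r} is already registered")
        ACCELERATOR_REGISTRY[name] = cls
        return cls

    return decorator


@register_accelerator("directml")
class DirectMLAccelerator(BaseAccelerator):
    """Assumed third-party backend; a real plug-in would probe its runtime."""

    @staticmethod
    def is_available() -> bool:
        # Stub: always reports the backend as unavailable.
        return False

    @staticmethod
    def device_count() -> int:
        # Stub: no devices are detected.
        return 0


def resolve_accelerator(name: str) -> Type[BaseAccelerator]:
    """Look up a registered accelerator class by its short name."""
    try:
        return ACCELERATOR_REGISTRY[name]
    except KeyError:
        known = sorted(ACCELERATOR_REGISTRY)
        raise KeyError(f"unknown accelerator {name!r}; known: {known}") from None
```

With a registry like this, the core library only needs a name lookup such as `resolve_accelerator("directml")`, and the hardware-specific functions live in the plug-in.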

@uniartisan uniartisan requested a review from lantiga December 25, 2024 09:11
@lantiga
Collaborator

lantiga commented Jan 6, 2025

Thank you @uniartisan!

@lantiga lantiga removed the waiting on author (Waiting on user action, correction, or update) label Jan 6, 2025
Labels: accelerator, docs (Documentation related), fabric (lightning.fabric.Fabric), pl (Generic label for PyTorch Lightning package)