Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to the LST workflows and small general LST fixes #46967

Conversation

VourMa
Copy link
Contributor

@VourMa VourMa commented Dec 16, 2024

This PR follows up on some items from #46746. More specifically, I have:

  • Moved the LST workflows to use the latest default geometry (D110)
  • Made the 0.704 run on a GPU only if there is one, otherwise fall back to CPU (hardware-agnostic workflow, in accordance with the rest of heterogeneous workflows in CMSSW).
  • Replaced the custom binary search function with std::lower_bound.
  • Got rid of the warning about the SeedStopInfo size mismatch between the TrackCandidate and the TrajectorySeed collections.
  • Made a couple more minor fixes following up on PR integration comments (e.g. using MessageLogger instead of printf, renamings, etc.)

I have tested the changes privately and the performance is exactly the same, so this is a technical PR.

@cmsbuild
Copy link
Contributor

cmsbuild commented Dec 16, 2024

cms-bot internal usage

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @VourMa for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv, upgrade)
  • DQM/TrackingMonitorSource (dqm)
  • RecoTracker/LST (reconstruction)
  • RecoTracker/LSTCore (reconstruction)

@AdrianoDee, @Moanwar, @antoniovagnerini, @cmsbuild, @DickyChant, @jfernan2, @mandrenguyen, @miquork, @rseidita, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @Martin-Grunewald, @VinInn, @VourMa, @arossi83, @dgulhan, @fabiocos, @felicepantaleo, @fioriNTU, @gpetruc, @idebruyn, @jandrea, @makortel, @missirol, @mmusich, @mtosi, @rovere, @slomeo, @sroychow, @threus this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented Dec 16, 2024

test parameters:

  • enable_tests = gpu
  • workflows = 29634.703,29834.703
  • workflows_gpu = 29634.704,29834.704
  • relvals_opt = -w upgrade
  • relvals_opt_gpu = -w upgrade

@slava77
Copy link
Contributor

slava77 commented Dec 16, 2024

@cmsbuild please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals
Size: This PR adds an extra 96KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ee0831/43483/summary.html
COMMIT: 1ff4d57
CMSSW: CMSSW_15_0_X_2024-12-16-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/46967/43483/install.sh to create a dev area with all the needed externals and cmssw changes.

  • DAS Queries: The DAS query tests failed, see the summary page for details.

RelVals

  • 2024.000001DAS Error
  • 2024.101001DAS Error
  • 2024.202001DAS Error
Expand to see more relval errors ...
  • 2024.303001

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 7
  • DQMHistoTests: Total histograms compared: 53058
  • DQMHistoTests: Total failures: 876
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 52182
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 6 files compared)
  • Checked 24 log files, 30 edm output root files, 7 DQM output files
  • TriggerResults: no differences found

@slava77
Copy link
Contributor

slava77 commented Dec 17, 2024

@smuzaffar
are the CPU comparisons missing due to failing DAS queries? Perhaps these can still be allowed to run for the passing tests.
Please check.
Thank you.

@smuzaffar
Copy link
Contributor

are the CPU comparisons missing due to failing DAS queries?

yes @slava77 , comparison job is not run if there are relval failures.

Perhaps these can still be allowed to run for the passing tests.

sure, sounds reasonable. We will update bot to run comparison job for passed relvals

@slava77
Copy link
Contributor

slava77 commented Dec 17, 2024

are the CPU comparisons missing due to failing DAS queries?

the question is somewhat academic for this PR though: the workflow numbers in this PR are not present in the baseline and the affected code does not run in other requested workflows.
So, changes in the code can not be tested in the bot comparisons (only that the code runs).

@smuzaffar
Copy link
Contributor

smuzaffar commented Dec 17, 2024

For valid additional workflows, bot does try to generate baseline. For this PR it did try to generate baseline for 29634.703,29834.703 but failed as these workflows are not available at least runTheMatrix.py -n -w upgrade -l 29634.703,29834.703 failed https://cmssdt.cern.ch/jenkins/job/ib-run-baseline/13661/console

Note, that bot can only generate baselines for workflows if they are available in IB. Any new workflow added by PR will not have any baseline

@slava77
Copy link
Contributor

slava77 commented Dec 17, 2024

Note, that bot can only generate baselines for workflows if they are available in IB. Any new workflow added by PR will not have any baseline

Yes, that's what I implied in my "academic" comment

Actually, only one workflow is not defined in the baseline (29834), let me update the test parameters to just 634, also hoping the DAS errors go away

@slava77
Copy link
Contributor

slava77 commented Dec 17, 2024

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Size: This PR adds an extra 12KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ee0831/43506/summary.html
COMMIT: 1ff4d57
CMSSW: CMSSW_15_0_X_2024-12-17-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/46967/43506/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 22 lines from the logs
  • Reco comparison results: 7 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3655224
  • DQMHistoTests: Total failures: 454
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3654750
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.053 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 29634.703 ): 0.053 KiB Tracking/TrackParameters
  • Checked 210 log files, 180 edm output root files, 48 DQM output files
  • TriggerResults: found differences in 1 / 46 workflows

GPU Comparison Summary

Summary:

  • You potentially removed 18 lines from the logs
  • Reco comparison results: 199 differences found in the comparisons
  • DQMHistoTests: Total files compared: 8
  • DQMHistoTests: Total histograms compared: 85200
  • DQMHistoTests: Total failures: 8622
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 76578
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.053 KiB( 7 files compared)
  • DQMHistoSizes: changed ( 29634.704 ): 0.053 KiB Tracking/TrackParameters
  • Checked 28 log files, 33 edm output root files, 8 DQM output files
  • TriggerResults: no differences found

@slava77
Copy link
Contributor

slava77 commented Dec 17, 2024

FractionOfCandOverSeeds_highPtTripletStepTrackCandidates_highPtTripletStep changed in 29634.703
image

I guess it's from https://github.com/cms-sw/cmssw/pull/46967/files#diff-a67b503ef4abafb8f3149bdb2412b9e125b2b0de50f969d52c5400e8267823aaR398-R402
TrackSeedMonhighPtTripletStep.SeedProducer = "lstPixelSeedInputProducer"
@VourMa do you agree?

@VourMa
Copy link
Contributor Author

VourMa commented Dec 17, 2024

FractionOfCandOverSeeds_highPtTripletStepTrackCandidates_highPtTripletStep changed in 29634.703 image

I guess it's from https://github.com/cms-sw/cmssw/pull/46967/files#diff-a67b503ef4abafb8f3149bdb2412b9e125b2b0de50f969d52c5400e8267823aaR398-R402 TrackSeedMonhighPtTripletStep.SeedProducer = "lstPixelSeedInputProducer" @VourMa do you agree?

Yes

@slava77
Copy link
Contributor

slava77 commented Dec 19, 2024

(after 2 days of no indication of review)
@cms-sw/dqm-l2 @cms-sw/reconstruction-l2 @cms-sw/upgrade-l2 @cms-sw/pdmv-l2
please note that tests in this PR have completed and show expected performance.
Please take a look and comment or perhaps sign.
Thank you.

@jfernan2
Copy link
Contributor

+1

@Moanwar
Copy link
Contributor

Moanwar commented Dec 19, 2024

+Upgrade

@AdrianoDee
Copy link
Contributor

+pdmv

@antoniovagnerini
Copy link

+dqm

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @mandrenguyen, @antoniovilela, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 74198d8 into cms-sw:master Dec 21, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants