data curation guide with narrative, 1st draft #7

Draft
wants to merge 20 commits into base: main

wd15
Collaborator

@wd15 wd15 commented Dec 30, 2024

Draft of the following sections

  • Overview
  • Definitions
  • Data Generation
    • Intro
    • Choosing data to generate
    • File Formats
    • Recovering from crashes and restarts
    • Using Workflow Tools
    • HPC and Parallel Data (merge with restarts maybe)
  • Data Curation
    • Overview
    • Metadata Standards
    • Publish the codes and workflows during development
    • Identifying the significant data assets
    • Licensing
    • Selecting a data repository
  • Case Study
  • Bibliography

Ideas and questions

  • Possibly merge HPC, Parallel and restarts as they overlap a lot
  • Possibly don't have HPC section since most of data generation is really HPC related
  • Should we WERB this?
  • Can we number the sections?

@wd15 wd15 requested a review from tkphd January 3, 2025 16:06
wd15 and others added 9 commits January 3, 2025 12:06
 - update .gitignore to ignore _build subfolder
 - update README.md to include use of Python's simple web server
 - Add Nix build
 - Add access to sphinxcontrib.mermaid to build mermaid diagram
 - Ensure Overview section of data generation is using citations
   correctly.
 - Make Mermaid diagram work correctly
 - Include draft of automation section
 - Add Bibliography section
Collaborator

@tkphd tkphd left a comment

Looks good!

@wd15
Collaborator Author

wd15 commented Jan 6, 2025

> Looks good!

Thanks for working on it. I was about halfway through cleaning it up, so some of it was a big old mess. I still need to work on the metadata section, and we need a short example of curation, but I need to chat with you about that.

@wd15
Collaborator Author

wd15 commented Jan 6, 2025

What should I set my text editor column width to?

@tkphd
Collaborator

tkphd commented Jan 6, 2025

Oh, I dunno. I default to 79 chars, because that's the maximum width for nicely formatted visual diffs on GitHub. Shorter lines can be better for text, but certainly not longer. I think your edits were made at 70 characters, which is probably reasonable. I'll try not to make Git bounce back and forth over this.

@wd15
Collaborator Author

wd15 commented Jan 6, 2025

Shall we meet to discuss? I sent an email.

2 participants