One benefit of version control is the ability to trace back the history of changes in your code. In this section we will try out a couple of options for exploring repository history.
- Inspect repo history on Github
- LICENSE.md, Contributor guidelines
- Inspect repo history with Git-GUI and gitk
- Git history: diffs, blames, branches
Choose an existing repo with some history. It could be this repository or maybe the Github repository for a Python or R package that you have used in the past. Maybe:
The repository page has tons of information. You can explore the current code, see what bugs have been reported in the code, and what changes have been suggested by the community.
Since the README file is used as the home page for a repo on Github, this is frequently the place to go to learn about a piece of software you haven't used before.
Navigate into a code folder and open up a code file.
Reading well-written code can be a great way to improve your own programming, particularly when you want your code to function like other projects that you have seen.
Github offers many different ways to explore the history of an individual code file. In the upper right of the file view you should see several alternative views of the file you are browsing.
Explore the Raw, Blame, and History views -- what are they showing?
Are you having trouble using a package? Maybe someone else is having a similar problem and has requested a fix. From the repository home, go to the issues tab.
You can search through open or closed issues, and see what discussion there has been around solutions so far.
(Note: new repos won't have issues, so take a look at a widely used one. The repo for the Atom text editor has more than 600 issues, for example.)
Some open source projects attract a lot of users contributing code improvements. You can take a look at the changes users have proposed on Github. Open the pull requests tab.
Some questions that come up when working on open-source code are:
- What license should I use?
- What guidance should I provide to potential contributors to the project?
- What about a code of conduct?
Addressing what EPA needs for all of these is beyond the scope here, but while we are browsing repos, take a look at how other projects have addressed these questions.
For example:
- dplyr's license, contributor guidelines, code of conduct, and issue template
- pandas' .github folder has many of these same items. It also has an extensive contributing guide linked from the README.
The main folder of a repo or the .github folder are common places to find these files.
Github is great, but you can also explore repo history locally. When you clone a repo, you are getting the entire history of the repo. This is an area where visual tools really shine compared to the command line.
Follow the same approach as before.
e.g.: git clone https://github.com/tidyverse/dplyr.git
(we will see if this stresses the internet connection too much....)
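If you want to convince yourself that a clone really carries the whole history, here is a minimal offline sketch. It assumes nothing about dplyr: the `origin-repo` and `my-clone` directories, the `log.txt` file, and the example identity are all made up for illustration, and a local folder stands in for the remote.

```shell
#!/bin/sh
set -e

# Example identity so the sketch does not depend on your global git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# A small stand-in "remote" repository with three commits.
git init -q origin-repo
cd origin-repo
for n in 1 2 3; do
    printf 'change %s\n' "$n" >> log.txt
    git add log.txt
    git commit -q -m "Change $n"
done
cd ..

# Cloning copies every commit, not just the latest version of the files.
git clone -q origin-repo my-clone

original_count=$(git -C origin-repo rev-list --count HEAD)
clone_count=$(git -C my-clone rev-list --count HEAD)
echo "commits in original: $original_count"
echo "commits in clone:    $clone_count"
```

The two counts should match, which is why you can browse history locally without going back to the server.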
In Git GUI, under the 'Repository' menu, go to 'Visualize master's History' (the main branch of code is usually called master, or main in newer repos).
Looks something like this:
If you squint, you can sort of imagine how this relates to what you could see on Github, but it is not quite the same. More modern git clients offer better visualization.
This page shows some screenshots of what this sort of thing looks like in GitKraken.
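For comparison, git can also draw a rough text version of the same kind of history graph. The sketch below is only an illustration: it builds a throwaway repo with one branch and one merge (the `feature` branch name, the commit messages, and the example identity are invented), then prints the graph. In a real clone you would only need the last command, and `gitk --all` opens the graphical equivalent.

```shell
#!/bin/sh
set -e

# Example identity so the sketch does not depend on your global git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q

# Empty commits are enough to give the graph some shape.
git commit -q --allow-empty -m "Start project"
trunk=$(git rev-parse --abbrev-ref HEAD)   # default branch name varies (master or main)

git checkout -q -b feature
git commit -q --allow-empty -m "Work on feature"

git checkout -q "$trunk"
git commit -q --allow-empty -m "Work on trunk"

# The two branches have diverged, so this creates a merge commit.
git merge -q --no-edit feature

# One line per commit, with branch labels and the branching structure drawn in ASCII.
git log --graph --oneline --decorate --all
```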
It is possible to explore diffs, blames, branches, commits, etc, all from the command line. I have to admit I don't have much experience with this, but you could try out:
This article from Atlassian goes through some commands you can try.
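As a starting point, here is a minimal sketch of the basic commands for commits, diffs, blames, and branches. It runs against a throwaway repo so you can try it without a network connection; the `notes.txt` file, the `experiment` branch, the commit messages, and the example identity are all invented for illustration. In a real clone you would run just the `git log`, `git diff`, `git blame`, and `git branch` lines.

```shell
#!/bin/sh
set -e

# Example identity so the sketch does not depend on your global git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Build a throwaway repository with two commits to one file.
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'first line\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

printf 'second line\n' >> notes.txt
git add notes.txt
git commit -q -m "Extend notes"

# Commits: one line per commit, newest first.
git log --oneline

# Diffs: what changed between the previous commit and the current one.
git diff HEAD~1 HEAD

# Blames: which commit (and author) last touched each line of a file.
git blame notes.txt

# Branches: create one, then list all local branches.
git branch experiment
git branch --list
```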
This is an area where a GUI really shines.
Return to Section 3 - Collaboration workflows | Proceed to Section 5 - Git automation