Series
Version Control with GitHub
This series teaches why version control is important and how to use Git, a common version control tool, together with GitHub, which hosts Git repositories and allows for collaboration within the environment.
Series Objectives
After completing the series, you will be able to:
-
Git 01: Intro to Git Version Control
- Explain what version control is and how it can be used.
- Explain why version control is important.
- Discuss the basics of how the Git version control system works.
- Discuss how GitHub can be used as a collaboration tool.
-
Git 02: GitHub.com - Repos & Forks
- Create a GitHub account.
- Know how to navigate to and between GitHub repositories.
- Create your own fork, or copy, of a GitHub repository.
- Explain the relationship between your forked repository and the master repository it was created from.
-
Git 03: Git Clone - Work Locally On Your Computer
- Be able to use the git clone command to create a local version of a GitHub repository on your computer.
-
Git 04: Markdown Files
- Create a Markdown (.md) file using a text editor.
- Use basic markdown syntax to format a document including: headers, bold and italics.
-
Git 05: Git Add Changes - Commit
- Add new files or changes to existing files to your repo.
- Document changes using the commit command with a message describing what has changed.
- Describe the difference between git add and git commit.
- Sync changes to your local repository with the repo stored on GitHub.com.
- Use and interpret the output from the following commands:
  - git status
  - git add
  - git commit
  - git push
-
Git 06: Pull Request to Add Changes to a Central Repo
- Explain the concept of base fork and head fork.
- Know how to transfer changes between a fork & a central repo in GitHub.
- Create a Pull Request on the GitHub.com website.
-
Git 07: Updating Your Repo by Setting Up a Remote
- Explain why it is important to update a local repo before beginning edits.
- Update your local repository from a remote (upstream) central repo.
Things You’ll Need To Complete This Series
- You will need Git and bash installed on your computer.
- You will also need an active internet connection to access GitHub.
Git 01: Intro to Git Version Control
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
In this page, you will be introduced to the importance of version control in scientific workflows.
- Explain what version control is and how it can be used.
- Explain why version control is important.
- Discuss the basics of how the Git version control system works.
- Discuss how GitHub can be used as a collaboration tool.
The text and graphics in the first three sections were borrowed, with some modifications, from Software Carpentry's Version Control with Git lessons.
What is Version Control?
A version control system maintains a record of changes to code and other content. It also allows us to revert changes to a previous point in time.
Types of Version Control
There are many forms of version control. Some not as good:
- Save a document with a new date (we’ve all done it, but it isn’t efficient)
- Google Docs "history" function (not bad for some documents, but limited in scope).
Some better:
- Mercurial
- Subversion
- Git - which we’ll be learning much more about in this series.
More Resources:
Visit the Wikipedia list of version control platforms.
Why Version Control is Important
Version control facilitates two important aspects of many scientific workflows:
- The ability to save and review or revert to previous versions.
- The ability to collaborate on a single project.
This means that you don’t have to worry about a collaborator (or your future self) overwriting something important. It also allows two people working on the same document to efficiently combine ideas and changes.
- Why would version control have been helpful to your project & work flow?
- What were the consequences of not having a version control system in place?
How Version Control Systems Work
Simple Version Control Model
A version control system keeps track of what has changed in one or more files over time. The way this tracking occurs is slightly different between various version control tools, including git, mercurial and svn. However, the principle is the same.
Version control systems begin with a base version of a document. They then save the committed changes that you make. You can think of version control as a tape: if you rewind the tape and start at the base document, then you can play back each change and end up with your latest version.
Once you think of changes as separate from the document itself, you can then think about “playing back” different sets of changes onto the base document. You can then retrieve, or revert to, different versions of the document.
The benefit of version control when you are in a collaborative environment is that two users can make independent changes to the same document.
If there aren’t conflicts between the users' changes (a conflict is an area where both users modified the same part of the same document in different ways), you can review two sets of changes on the same base document.
A version control system is a tool that keeps track of these changes for us. Each version of a file can be viewed and reverted to at any time. That way if you add something that you end up not liking or delete something that you need, you can simply go back to a previous version.
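The "tape" model described above can be tried out in a throwaway Git repository. The sketch below is only an illustration under stated assumptions: the temporary directory, the file name notes.txt, the commit messages and the demo identity are all made up for the example and are not part of the lesson materials.

```shell
#!/usr/bin/env bash
set -e

# Work in a temporary directory so nothing else on your computer is touched.
workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo

# Hypothetical identity, set for this demo repo only.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Base version of the document.
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "add base version of notes"

# A later change recorded on top of the base version.
echo "second draft" > notes.txt
git add notes.txt
git commit --quiet -m "revise notes"

# Play back the history: one line per recorded change, newest first.
git --no-pager log --oneline

# Retrieve the previous version without altering the current file.
previous=$(git show HEAD~1:notes.txt)
echo "previous version: $previous"
echo "current version:  $(cat notes.txt)"
```

Here `HEAD~1` means "one commit before the latest", so the last lines print the older and the newer text side by side.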
Git & GitHub - A Distributed Version Control Model
Git uses a distributed version control model. This means that there can be many copies (or forks, in the GitHub world) of the repository.
Have a look at the graphic below. Notice that in the example, there is a "central" version of our repository. Joe, Sue and Eve are all working together to update the central repository. Because they are using a distributed system, each user (Joe, Sue and Eve) has their own copy of the repository and can contribute to the central copy of the repository at any time.
Create A Working Copy of a Git Repo - Fork
There are many different Git and GitHub workflows. In the NEON Data Institute, we will use a distributed workflow with a Central Repository. This allows us all (all of the Institute participants) to work independently. We can then contribute our changes to update the Central (NEON) Repository. Our collaborative workflow goes like this:
- NEON "owns" the Central Repository.
- You will create a copy of this repository (known as a fork) in your own GitHub account.
- You will then clone (copy) the repository to your local computer. You will do your work locally on your laptop.
- When you are ready to submit your changes to the NEON repository, you will:
- Sync your local copy of the repository with NEON's central repository so you have the most up to date version, and then,
- Push the changes you made to your local copy (or fork) of the repository to NEON's main repository.
Each participant in the institute will be contributing to the NEON central repository using the same workflow! Pretty cool stuff.
Let's get some terms straight before we go any further.
- Central repository - the central repository is what all participants will add to. It is the "final working version" of the project.
- Your forked repository - is a "personal" working copy of the central repository stored in your GitHub account. This is called a fork. When you are happy with your work, you first update your repo from the central repo, and then you can submit your changes to the central NEON repository.
- Your local repository - this is a local version of your fork on your own computer. You will most often do all of your work locally on your computer.
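The relationship between the three repositories can be mimicked entirely on one computer. The following is a rough sketch, assuming local "bare" repositories as stand-ins for the two copies that would normally live on github.com; the directory names central.git, fork.git and local, and the remote name upstream, are choices made for this example.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Central repository: a bare repo stands in for the copy that NEON owns.
git init --quiet --bare central.git

# Fork: on github.com the Fork button would create this copy in your account.
# (Git may warn that the repository is empty; that is expected here.)
git clone --quiet --bare "$workdir/central.git" fork.git

# Local repository: a clone of the fork on your own computer.
git clone --quiet "$workdir/fork.git" local
cd local

# "origin" was set by git clone; "upstream" is a name added by hand.
git remote add upstream "$workdir/central.git"

# List which remote name points at which repository.
git remote -v
remotes=$(git remote | sort | tr '\n' ' ')
```

In this layout, origin refers to the fork and upstream refers to the central repository, which matches the terms defined above.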
Additional Resources:
Further documentation and how-to directions for Git are provided by the Pro Git book (2nd edition) by Scott Chacon and Ben Straub, available in print or online. If you enjoy learning from videos, the site hosts several.
Git 02: GitHub.com - Repos & Forks
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
In this tutorial, we will fork an existing GitHub repository, that is, create a copy of it in your github.com account. We will also explore the github.com interface.
- Create a GitHub account.
- Know how to navigate to and between GitHub repositories.
- Create your own fork, or copy, of a GitHub repository.
- Explain the relationship between your forked repository and the master repository it was created from.
Additional Resources
- Diagram of Git Commands -- this diagram includes more commands than we will learn in this series but includes all that we use for our standard workflow.
- GitHub Help Learning Git resources
Create An Account
If you do not already have a GitHub account, go to GitHub and sign up for your free account. Pick a username that you like! This username is what your colleagues will see as you work with them in GitHub and Git.
Take a minute to set up your account. If you want to make your account more recognizable, be sure to add a profile picture to your account!
If you already have a GitHub account, simply sign in.
Navigate GitHub
Repositories, AKA Repos
Let's first discuss the repository or "repo". (The cool kids say repo, so we will jump on the git cool kid bandwagon and use "repo" from here on in.) According to the GitHub glossary:
A repository is the most basic element of GitHub. They're easiest to imagine as a project's folder. A repository contains all of the project files (including documentation), and stores each file's revision history. Repositories can have multiple collaborators and can be either public or private.
In the Data Institute, we will share our work in the DI-NEON-participants repo.
Find an Existing Repo
The first thing that you'll need to do is find the DI-NEON-participants repo. You can find repos in two ways:
- Type “DI-NEON-participants” in the github.com search bar to find the repository.
- Use the repository URL if you have it - like so: https://github.com/NEONScience/DI-NEON-participants.
Navigation of a Repo Page
Once you have found the Data Institute participants repo, take 5 minutes to explore it.
Git Repo Names
First, get to know the repository naming convention. Repository names all take the format:
OrganizationName/RepositoryName
So the full name of our repository is:
NEONScience/DI-NEON-participants
Header Tabs
At the top of the page you'll notice a series of tabs. Please focus on the following 3 for now:
- Code: Click here to view structure & contents of the repo.
- Issues: Submit discussion topics, or problems that you are having with the content in the repo, here.
- Pull Requests: Submit changes to the repo for review / acceptance. We will explore pull requests more in the Git 06 tutorial.
Other Text Links
A bit further down the page, you'll notice a few other links:
- commits: a commit is a saved and documented change to the content or structure of the repo. The commit history contains all changes that have been made to that repo. We will discuss commits more in Git 05: Git Add Changes -- Commits .
Fork a Repository
Next, let's discuss the concept of a fork on the github.com site. A fork is a copy of the repo that you create in your account. You can fork any repo at any time by clicking the fork button in the upper right hand corner on github.com.
Create your own fork of the DI-NEON-participants now.
Check Out Your Data Institute Fork
Now, check out your new fork. Its name should be:
YOUR-USER-NAME/DI-NEON-participants.
It can get confusing sometimes moving between a central repo and your forked repo.
A good way to figure out which repo you are viewing is to look at the name of the repo. Does it contain your username? Or your colleagues'? Or NEON's?
Your Fork vs the Central Repo
Your fork is an exact copy of, or completely in sync with, the NEON central repo. You could confirm this by comparing your fork to the NEON central repository using the pull request option. We will learn about pull requests in Git 06: Sync GitHub Repos with Pull Requests. For now, take our word for it.
The fork will remain in sync with the NEON central repo until:
- You begin to make changes to your forked copy of the repo.
- The central repository is changed or updated by a collaborator.
If you make changes to your forked repo, the changes will not be added to the NEON central repo until you sync your fork with the NEON central repo.
Summary Workflow -- Fork a GitHub Repository
On the github.com website:
- Navigate to desired repo that you want to fork.
- Click Fork button.
Have questions? No problem. Leave your question in the comment box below. It's likely some of your colleagues have the same question, too! And also likely someone else knows the answer.
Git 03: Git Clone - Work Locally On Your Computer
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
This tutorial covers how to clone a github.com repo to your computer so that you can work locally on files within the repo.
- Be able to use the git clone command to create a local version of a GitHub repository on your computer.
Additional Resources
- Diagram of Git Commands -- this diagram includes more commands than we will cover in this series but includes all that we use for our standard workflow.
- GitHub Help Learning Git resources.
Clone - Copy Repo To Your Computer
In the previous tutorial, we used the github.com interface to fork the central NEON repo. By forking the NEON repo, we created a copy of it in our github.com account.
Now we will learn how to create a local version of our forked repo on our laptop, so that we can efficiently add to and edit repo content.
Copy Repo URL
Start from the github.com interface:
- Navigate to the repo that you want to clone (copy) to your computer -- this should be YOUR-USER-NAME/DI-NEON-participants.
- Click on the Clone or Download dropdown button and copy the URL of the repo.
Then on your local computer:
- Your computer should already be set up with Git and a bash shell interface. If not, please refer to the Institute setup materials before continuing.
- Open bash on your computer and navigate to the local GitHub directory that you created using the Set-up Materials.
To do this, at the command prompt, type:
$ cd ~/Documents/GitHub
Note: If you have stored your GitHub directory in a different location, i.e. it is not ~/Documents/GitHub, be sure to adjust the above code to represent the actual path to the GitHub directory on your computer.
Now use git clone to clone, or create a copy of, the entire repo in the GitHub directory on your computer.
# clone the forked repo to our computer
$ git clone https://github.com/YOUR-USER-NAME/DI-NEON-participants.git
The output shows you what is being cloned to your computer.
Cloning into 'DI-NEON-participants'...
remote: Counting objects: 3808, done.
remote: Total 3808 (delta 0), reused 0 (delta 0), pack-reused 3808
Receiving objects: 100% (3808/3808), 2.92 MiB | 2.17 MiB/s, done.
Resolving deltas: 100% (2185/2185), done.
Checking connectivity... done.
$
Note: The output numbers that you see on your computer, representing the total file size, etc, may differ from the example provided above.
View the New Repo
Next, let's make sure the repository is created on your computer in the location where you think it is.
At the command line, type ls to list the contents of the current directory.
# view directory contents
$ ls
Next, navigate to your copy of the data institute repo using cd, or change directory:
# navigate to the NEON participants repository
$ cd DI-NEON-participants
# view repository contents
$ ls
404.md _includes code
ISSUE_TEMPLATE.md _layouts images
README.md _posts index.md
_config.yml _site institute-materials
_data assets org
Alternatively, we can view the local repo DI-NEON-participants in a Finder (Mac) or Windows Explorer (Windows) window. Simply open your Documents in a window and navigate to the new local repo.
Using either method, we can see that the file structure of our cloned repo exactly mirrors the file structure of our forked GitHub repo.
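The same check can be done from the command line. The sketch below is a hedged illustration that does not need a network connection: a small local repository (the made-up name source-repo) stands in for the fork on github.com, and the file names and demo identity are assumptions for the example only.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a fork hosted on github.com, seeded with two files.
git init --quiet source-repo
cd source-repo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "# Demo repo" > README.md
mkdir code
echo "x <- 1" > code/example.R
git add README.md code/example.R
git commit --quiet -m "seed the demo repo"

# Clone it, as git clone URLhere would do for a GitHub URL.
cd "$workdir"
git clone --quiet "$workdir/source-repo" my-clone
cd my-clone

# The clone contains the same files as the repo it was copied from.
ls
git ls-files

# origin records where the clone came from.
git remote -v
file_count=$(git ls-files | wc -l | tr -d ' ')
```

`git ls-files` lists every file Git is tracking, which is a quick way to compare the clone with what you see on github.com.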
Summary Workflow -- Create a Local Repo
In the github.com interface:
- Copy URL of the repo you want to work on locally
In shell:
- git clone URLhere
Note that you can copy the URL of your repository directly from GitHub.
Git 04: Markdown Files
Authors: Megan A. Jones
Last Updated: Jun 9, 2024
This tutorial covers how to create and format Markdown files.
Learning Objectives
At the end of this activity, you will be able to:
- Create a Markdown (.md) file using a text editor.
- Use basic markdown syntax to format a document including: headers, bold and italics.
What is the .md Format?
Markdown is a human readable syntax for formatting text documents. Markdown can be used to produce nicely formatted documents including pdfs, web pages and more. In fact, this web page that you are reading right now is generated from a markdown document!
In this tutorial, we will create a markdown file that documents both who you are and also the project that you might want to work on at the NEON Data Institute.
Markdown Formatting
Markdown is simple plain text that is styled using symbols, including:
- #: a header element
- **: bold text
- *: italic text
Let's review some basic markdown syntax.
Plain Text
Plain text will appear as text in a Markdown document. You can format that text in different ways.
For example, if we want to highlight a function or some code within a plain text paragraph, we can use one backtick on each side of the text ( ` ), like this: `Here is some code`. This is the backtick, or grave accent, not an apostrophe (on most US keyboards it is on the same key as the tilde).
To add emphasis to other text you can use bold or italics.
Have a look at the markdown below:
The use of the highlight ( `text` ) will be reserved for denoting code.
To add emphasis to other text use **bold** or *italics*.
Notice that this sentence uses a code highlight "``", bold and italics. As a rendered markdown chunk, it looks like this:
The use of the highlight ( text ) will be reserved for denoting code. To add emphasis to other text use bold or italics.
Horizontal Lines (rules)
Create a rule:
***
Below is the rule rendered:
Section Headings
You can create a heading using the pound (#) sign. For the headers to render properly there must be a space between the # and the header text. Heading one is 1 pound sign, heading two is 2 pound signs, etc as follows:
Heading two
## Heading two
Heading three
### Heading three
Heading four
#### Heading four
For a more thorough list of markdown syntax, please read this GitHub Guide on Markdown.
Data Tip: There are many free Markdown editors out there! The atom.io editor is a powerful text editor package by GitHub, that also has a Markdown renderer allowing you to see what your Markdown looks like as you are working.
Activity: Create A Markdown Document
Now that you are familiar with the Markdown syntax, use it to create a brief biography that:
- Introduces yourself to the other participants.
- Documents the project that you have in mind for the Data Institute.
Add Your Bio
First, create a .md file using the text editor of your preference. Name the file with the naming convention: LastName-FirstName.md
Save the file to the participants/2017-RemoteSensing/pre-institute2-git directory in your local DI-NEON-participants repo (the copy on your computer).
Add a brief bio using headers, bold and italic formatting as makes sense. In the bio, please provide basic information including:
- Your Name
- Domain of interest
- One goal for the course
Add a Capstone Project Description
Next, add a revised Capstone Project idea to the Markdown document using the heading ## Capstone Project. Be sure to specify in the document the types of data that you think you may require to complete your project.
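If you prefer to stay at the command line, a file like this can also be written from bash. The following is a minimal sketch only: the file name follows the LastName-FirstName.md convention with placeholder values, and all of the bio text is invented for illustration.

```shell
#!/usr/bin/env bash
set -e

# Write into a temporary directory; in practice you would save the file
# inside your local DI-NEON-participants repo instead.
workdir=$(mktemp -d)
cd "$workdir"

bio_file="Lastname-Firstname.md"   # placeholder name

# The quoted 'EOF' keeps the shell from interpreting *, # or ` in the text.
cat > "$bio_file" <<'EOF'
## About Me

**Name:** Firstname Lastname

*Domain of interest:* forest ecology

One goal for the course: learn a reproducible workflow.

***

## Capstone Project

A short description of the project and the types of data it may require.
EOF

# Show the raw Markdown that was written.
cat "$bio_file"
```

Any text editor produces the same result; the heredoc is just a convenient way to create a small file without leaving the shell.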
NOTE: The Data Institute repository is a public repository visible to anyone with internet access. If you prefer to not share your bio information publicly, please submit your Markdown document using a pseudonym for your name. You may also want to use a pseudonym for your GitHub account. HINT: cartoon character names work well. Please email us with the pseudonym so that we can connect the submitted document to you.
Git 05: Git Add Changes - Commit
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
This tutorial reviews how to add and commit changes to a Git repo.
- Add new files or changes to existing files to your repo.
- Document changes using the commit command with a message describing what has changed.
- Describe the difference between git add and git commit.
- Sync changes to your local repository with the repo stored on GitHub.com.
- Use and interpret the output from the following commands:
  - git status
  - git add
  - git commit
  - git push
Additional Resources
- Diagram of Git Commands -- this diagram includes more commands than we will learn in this series but includes all that we use for our standard workflow.
- GitHub Help Learning Git resources
- Information on branches in Git -- we do not focus on the use of branches in Git or GitHub, however, if you want more information on this structure, this Git documentation may be of use.
In the previous lesson, we created a markdown (.md) file in our forked version of the DI-NEON-participants central repo. In order for Git to recognize this new file and track it, we need to:
- Add the file to the repository using git add.
- Commit the file to the repository as a set of changes to the repo (in this case, a new document with some text content) using git commit.
- Push or sync the changes we've made locally with our forked repo hosted on github.com using git push.
Check Repository Status -- git status
Let's first run through some basic commands to get going with Git at the command line. First, it's always a good idea to check the status of your repository. This allows us to see any changes that have occurred.
Do the following:
- Open bash if it's not already open.
- Navigate to the DI-NEON-participants repository in bash.
- Type: git status.
The commands that you type into bash should look like the code below:
# Change directory
# The directory containing the git repo that you wish to work in.
$ cd ~/Documents/GitHub/DI-NEON-participants
# check the status of the repo
$ git status
Output:
On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
Untracked files:
(use "git add <file>..." to include in what will be committed)
_posts/ExampleFile.md
Let's make sense of the output of the git status command.
- On branch master: This tells us that we are on the master branch of the repo. Don't worry too much about branches just yet. We will work on the master branch throughout the Data Institute.
- Changes not staged for commit: This lists any file(s) that is/are currently being tracked by Git but have new changes that have not yet been staged with git add.
- Untracked files: These are all new files that have never been added to or tracked by Git.
Use git status anytime to view any untracked changes that have occurred, what is being tracked and what is not currently being tracked.
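To see the difference between a modified tracked file and an untracked file, it can help to reproduce both states in a scratch repository. This is a small sketch under stated assumptions: the file names tracked.md and new.md and the demo identity are hypothetical, and the compact `--short` form of git status is used in place of the long output shown above.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Commit one file so that Git is tracking it.
echo "original text" > tracked.md
git add tracked.md
git commit --quiet -m "add tracked file"

# Change the tracked file and create a brand new, untracked file.
echo "edited text" > tracked.md
echo "new text" > new.md

# Long form: the same sections discussed above.
git status

# Short form: " M" marks a modified, unstaged file; "??" marks an untracked one.
status_out=$(git status --short)
echo "$status_out"
```

The short form is handy once the long output is familiar, since each file takes a single line.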
Add a File - git add
Next, let's add the Markdown file containing our bio and short project summary using the command git add FileName.md. Replace FileName.md with the name of your markdown file.
# add a file, so that changes are tracked
$ git add ExampleBioFile.md
# check status again
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: _posts/ExampleBioFile.md
Understand the output:
- Changes to be committed: This lists the new files or files with changes that have been added to the Git tracking system but need to be committed as actual changes in the git repository history.
Commit Changes - git commit
When we add a file in the command line, we are telling Git to recognize that a change has occurred. The file moves to a "staging" area where Git recognizes a change has happened but the change has not yet been formally documented. When we want to permanently document those changes, we commit the change. A single commit will work for all files that are currently added to and in the Git staging area (anything in green when we check the status).
Commit Messages
When we commit a change to the Git version control system, we need to add a commit message. This message describes the changes made in the commit. This commit message is helpful to us when we review commit history to see what has changed over time and when those changes occurred. Be sure that your message covers the change.
# commit changes with message
$ git commit -m "new example file for demonstration"
[master e3cd622] new example file for demonstration
1 file changed, 56 insertions(+), 4 deletions(-)
create mode 100644 _posts/ExampleFile.md
Understand the output: Each commit will look slightly different but the important parts include:
- master xxxxxxx: master is the branch, and xxxxxxx is the unique identifier for this set of changes or this commit. You will always be able to track this specific commit (this specific set of changes) using this identifier.
- _ file changed, _ insertions(+), _ deletions(-): this tells us how many files have changed and the number and type of changes made to the files, including insertions and deletions.
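The identifier can be used to look a commit up again later. The sketch below is illustrative only and runs in a scratch repository; the file name, the demo identity and the variable names are assumptions made for the example.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "# Example" > ExampleFile.md
git add ExampleFile.md
git commit --quiet -m "new example file for demonstration"

# Ask Git for the abbreviated identifier of the latest commit.
commit_id=$(git rev-parse --short HEAD)
echo "commit identifier: $commit_id"

# Use the identifier to display that commit and a summary of its changes.
git --no-pager show --stat "$commit_id"

# Retrieve only the commit message that belongs to the identifier.
subject=$(git log -1 --format=%s "$commit_id")
echo "message: $subject"
```

The identifier printed on your computer will differ from any example, because it is derived from the commit's content, author and time.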
Why Add, then Commit?
To understand what is going on with git add and git commit, it is important to understand that Git has a staging area that we add items to with git add. Changes are not actually documented and permanently tracked until we commit them. This allows us to commit specific groups of files at the same time if we wish. For instance, we may decide to add and commit all R scripts together, and Markdown files in another, separate commit.
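The grouping described here can be sketched as follows. This is a minimal example in a scratch repository, assuming made-up file names (analysis.R, plots.R, notes.md) and a demo identity; it is one way to split work into commits, not the only one.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"
git init --quiet demo
cd demo
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Three new files: two R scripts and one Markdown document.
echo "x <- 1" > analysis.R
echo "y <- 2" > plots.R
echo "# Notes" > notes.md

# Stage and commit only the R scripts.
git add analysis.R plots.R
git commit --quiet -m "add R scripts"

# Stage and commit the Markdown file separately.
git add notes.md
git commit --quiet -m "add Markdown notes"

# Each commit lists only the files that were staged for it.
git --no-pager log --format='%s' --name-only
```

Because only staged files go into a commit, the history ends up with one entry per logical group of changes.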
Transfer Changes (Commits) from a Local Repo to a GitHub Repo - git push
When we are done editing our files and have committed the changes locally, we are ready to transfer or sync these changes to our forked repo on github.com. To do this we need to push our changes from the local Git version control to the remote GitHub repo.
To sync local changes with github.com, we can do the following:
- Check the status of our repo using git status. Are all of the changes added and committed to the repo?
- Use git push origin master. origin tells Git to push the files to the originating repo, which in this case is our fork on github.com which we originally cloned to our local computer. master is the repo branch that you are currently working on.
Let's push the changes that we made to the local version of our Git repo to our fork, in our github.com account.
# check the repo status
$ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
# transfer committed changes to the forked repo
$ git push origin master
Counting objects: 1, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 1.51 KiB | 0 bytes/s, done.
Total 6 (delta 4), reused 0 (delta 0)
To https://github.com/mjones01/DI-NEON-participants.git
5022aca..e3cd622 master -> master
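NOTE: You may be asked for your username and password! This is your github.com username. Note that GitHub no longer accepts account passwords for Git operations over HTTPS, so enter a personal access token in place of the password.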
Understand the output:
- Pay attention to the repository URL - the "origin" is the repository that the commit was pushed to, here https://github.com/mjones01/DI-NEON-participants.git. Note that because this repo is a fork, your URL will have your GitHub username in it instead of "mjones01".
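One way to confirm that a push arrived is to compare the commit identifier on both sides. The sketch below assumes a local bare repository (fork.git) as a stand-in for your fork on github.com, so it runs without a network connection or login; the file name and demo identity are also made up for the example.

```shell
#!/usr/bin/env bash
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the fork hosted on github.com.
git init --quiet --bare fork.git

# Local clone of the (still empty) fork; Git may warn that it is empty.
git clone --quiet "$workdir/fork.git" local
cd local
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Make sure the branch is called master, as in this tutorial.
git symbolic-ref HEAD refs/heads/master

echo "# Bio" > Lastname-Firstname.md
git add Lastname-Firstname.md
git commit --quiet -m "add bio file"

# Transfer the committed change to the fork.
git push --quiet origin master

# The latest commit on master should now be identical locally and remotely.
local_id=$(git rev-parse master)
remote_id=$(git ls-remote origin refs/heads/master | cut -f1)
echo "local:  $local_id"
echo "remote: $remote_id"
```

With a real fork the only differences are the URL behind origin and the authentication prompt.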
View Commits in GitHub
Let’s view our recent commit in our forked repo on GitHub.
- Go to github.com and navigate to your forked Data Institute repo - DI-NEON-participants.
- Click on the commits link at the top of the page.
- Look at the commits - do you see your recent commit message that you typed into bash on your computer?
- Next, click on the <>CODE link which is ABOVE the commits link in github.
- Is the Markdown file that you added and committed locally at the command line on your computer, there in the same directory (participants/pre-institute2-git) that you saved it on your laptop?
Is Your File in the NEON Central Repo Yet?
Next, do the following:
- Navigate to the NEON central NEONScience/DI-NEON-participants repo. (The easiest method to do this is to click the link at the top of the page under your repo name).
- Look for your file in the same directory. Is your new file there? If not, why?
Remember the structure of our workflow.
We’ve added changes from our local repo on our computer and pushed them to our fork on github.com. But this fork is in our individual user account, not NEON's. This fork is separate from the central repo. Changes to a fork in our github.com account do not automatically transfer to the central repo. We need to sync them! We will learn how to sync these two repos in the next tutorial, Git 06: Sync GitHub Repos with Pull Requests.
Summary Workflow - Committing Changes
On your computer, within your local copy of the Git repo:
- Create a new markdown file and edit it in your favorite text editor.
On your computer, in shell (at the command line):
- git status
- git add FileName
- git status - make sure everything is added and ready for commit
- git commit -m "messageHere"
- git push origin master
On the github.com website:
- Check to make sure commit is added.
- Check to see if the file that you added is visible online in your Git repo.
Git 06: Sync GitHub Repos with Pull Requests
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
This tutorial covers adding new edits or contents from your forked repo on github.com to a central repo.
- Explain the concept of base fork and head fork.
- Know how to transfer changes (sync) between a fork & a central repo in GitHub.
- Create a Pull Request on the GitHub.com website.
Additional Resources
- Diagram of Git Commands: this diagram includes more commands than we will learn in this series.
- GitHub Help Learning Git resources
We now have done the following:
- We've forked (made an individual copy of) the NEONScience/DI-NEON-participants repo to our github.com account.
- We've cloned the forked repo - making a copy of it on our local computers.
- We've added files and content to our local copy of the repo and committed the changes.
- We've pushed those changes back up to our forked repo on github.com.
Once you've forked and cloned a repo, you are all set up to work on your project. You won't need to repeat those steps.
In this tutorial, we will learn how to transfer changes from our forked repo in our github.com account to the central NEON Data Institute repo. Adding information from your forked repo to the central repo in GitHub is done using a pull request.
- It allows you to contribute to another repo without needing administrative privileges to make changes to the repo.
- It allows others to review your changes and suggest corrections, additions, edits, etc.
- It allows repo administrators control over what gets added to their project repo.
The ability to suggest changes to ANY (public) repo, without needing administrative privileges is a powerful feature of GitHub. In our case, you do not have privileges to actually make changes to the DI-NEON-participants repo. However you can make as many changes as you want in your fork, and then suggest that NEON add those changes to their repo, using a pull request. Pretty cool!
Adding to a Repo Using Pull Requests
Pull Requests in GitHub
Step 1 - Start Pull Request
To start a pull request, click the pull request button on the main repo page.
Alternatively, you can click the Pull requests tab, then on this new page click the "New pull request" button.
Step 2 - Choose Repos to Update
Select your fork to compare with the NEON central repo. When you begin a pull request, the head and base will auto-populate as follows:
- base fork: NEONScience/DI-NEON-participants
- head fork: YOUR-USER-NAME/DI-NEON-participants
The above pull request configuration tells GitHub to sync (or update) the NEON repo with contents from your repo.
Head vs Base
- Base: the repo that will be updated; the changes will be added to this repo.
- Head: the repo from which the changes come.
One way to remember this is that the “head” is always ahead of the base, so we must add from the head to the base.
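The base and head choice can also be written out as a GitHub compare URL, which opens the same comparison page that the pull request button leads to. The following is a minimal sketch, assuming the default branch is master on both repos (as used in this series); YOUR-USER-NAME and the variable names are placeholders, not values from the tutorial.

```shell
# Placeholders (assumptions): replace with your own GitHub user name and branch.
base_repo="NEONScience/DI-NEON-participants"   # base: the repo that will be updated
base_branch="master"
head_user="YOUR-USER-NAME"                     # head: owner of the fork holding the changes
head_branch="master"

# GitHub's compare view has the form /compare/<base>...<head-owner>:<head-branch>
compare_url="https://github.com/${base_repo}/compare/${base_branch}...${head_user}:${head_branch}"
echo "$compare_url"
```

Pasting the printed URL into a browser (with your real user name) should show the same diff view described in the next step.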
Step 3 - Verify Changes
When you compare two repos in a pull request page, GitHub will provide an overview of the differences (diffs) between the files. Line-by-line diffs are shown for text files, such as code or Markdown. Binary files (for example, images) cannot be compared line by line, so they simply show up as changed. Look over the changes and make sure nothing looks surprising.
Step 4 - Create Pull Request
Click the green Create Pull Request button to create the pull request.
Step 5 - Title Pull Request
Give your pull request a title and write a brief description of your changes. When you’re done with your message, click Create pull request!
Check out the repo name up at the top (in your repo and in the screenshot above). When you create the pull request, you will be automatically transferred to the base repo. Since the central repo was the base, GitHub will automatically transfer you to the central repo landing page.
Step 6 - Merge Pull Request
In this final step, it’s time to merge your changes into the NEONScience/DI-NEON-participants repo.
NOTE 1: You are only able to merge a pull request in a repo that you have permissions to!
NOTE 2: When collaborating, it is generally poor form to merge your own Pull Request, better to tag (@username) a collaborator in the comments so they know you want them to look at it. They can then review and, if acceptable, merge it.
To merge your (or someone else's) PR, click the green "Merge Pull Request" button to "accept" or merge the updated commits from the head repo (your fork) into the base repo (the central repo). Then click Confirm Merge.
We have now synced our forked repo with the central NEON repo. The next step in a GitHub workflow is to transfer any changes in the central repository into your local repo so you can work with them.
Data Institute Activity: Submit Pull Request for Week 2 Assignment
Submit a pull request containing the .md file that you created in this tutorial series. Before you submit your PR, review the Week 2 Assignment page to ensure you have all of the required elements in your .md file.
To submit your PR:
Repeat the pull request steps above. As before, your base will be the NEON central repo and your head will be YOUR forked repo:
- base fork: NEONScience/DI-NEON-participants
- head fork: YOUR-USER-NAME/DI-NEON-participants
When you get to Step 6 - Merge Pull Request (PR), are you able to merge the PR?
- Finally, go to the NEON Central Repo page in github.com. Look for the Pull Requests link at the top of the page. How many Pull Requests are there?
- Click on the link - do you see your Pull Request?
You can only merge a PR if you have permissions in the base repo that you are adding to. At this point you don’t have contributor permissions to the NEON repo. Instead someone who is a contributor on the repository will need to review and accept the request.
After completing the pull request to upload your bio markdown file, be sure to continue on to Git 07: Updating Your Repo by Setting Up a Remote to learn how to update your local fork and really begin the cycle of working with Git & GitHub in a collaborative manner.
Workflow Summary
Add updates to Central Repo with Pull Request
On github.com
- Button: Create New Pull Request
- Set base: central Institute repo, set head: your Fork
- Make sure changes are what you want to sync
- Button: Create Pull Request
- Add Pull Request title & comments
- Button: Create Pull Request
- Button: Merge Pull Request - if working collaboratively, poor style to merge your own PR, and you only can if you have contributor permissions
Git 07: Updating Your Repo by Setting Up a Remote
Authors: Megan A. Jones
Last Updated: Apr 8, 2021
This tutorial covers how to set up a Central Repo as a remote to your local repo in order to update your local repo with changes from the Central Repo. You want to do this every time before starting new edits in your local repo.
Learning Objectives
At the end of this activity, you will be able to:
- Explain why it is important to update a local repo before beginning edits.
- Update your local repository from a remote (upstream) central repo.
Additional Resources
- Diagram of Git Commands: this diagram includes more commands than we will learn in this series.
- GitHub Help Learning Git resources
We now have done the following:
- We've forked (made an individual copy of) the NEONScience/DI-NEON-participants repo to our github.com account.
- We've cloned the forked repo - making a copy of it on our local computers.
- We've added files and content to our local copy of the repo and committed the changes.
- We've pushed those changes back up to our forked repo on github.com.
- We've completed a Pull Request to update the central repository with our changes.
Once you're all set up to work on your project, you won't need to repeat the fork and clone steps. But you do want to update your local repository with any changes others may have added to the central repository. How do we do this?
We will do this by directly pulling the updates from the central repo to our local repo, after setting up the central repo as a "remote" of the local repo. A "remote" repo is any repo which is not the repo that you are currently working in.
Update, then Work
Once you've established working in your repo, you should follow these steps each time you start to work in the repo:
1. Update your local repo from the central repo (git pull upstream master).
2. Make edits, save, git add, and git commit all in your local repo.
3. Push changes from your local repo to your fork on github.com (git push origin master).
4. Update the central repo from your fork (Pull Request).
5. Repeat.
Notice that we've already learned how to do steps 2-4, now we are completing the circle by learning to update our local repo directly with any changes from the central repo.
The order of steps above is important as it ensures that you incorporate any changes that have been made to the NEON central repository into your forked & local repos prior to adding changes to the central repo. If you do not sync in this order, you are at greater risk of creating a merge conflict.
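The cycle above can be tried safely in a throwaway sandbox before using it on the real repos. The sketch below is an illustration under stated assumptions, not the Data Institute setup itself: the directories central, fork.git, and local are hypothetical local stand-ins for the NEON central repo, your fork on github.com, and your clone, and the user names and file name are placeholders.

```shell
sandbox=$(mktemp -d)

# --- one-time scratch setup (local stand-ins, not the real GitHub repos) ---
git init -q "$sandbox/central"                        # plays the central repo
cd "$sandbox/central" || exit 1
git symbolic-ref HEAD refs/heads/master               # name the first branch master
git config user.name "Central Maintainer"             # placeholder identity
git config user.email "maintainer@example.com"
echo "# Participants" > README.md
git add README.md
git commit -q -m "Initial commit"

git clone -q --bare "$sandbox/central" "$sandbox/fork.git"   # plays your fork
git clone -q "$sandbox/fork.git" "$sandbox/local"            # plays your local clone
cd "$sandbox/local" || exit 1
git config user.name "Participant"                    # placeholder identity
git config user.email "participant@example.com"
git remote add upstream "$sandbox/central"

# --- the cycle ---
git pull -q upstream master            # 1. update local repo from the central repo
echo "My bio" > my-bio.md              # 2. make edits ...
git add my-bio.md                      #    ... stage them ...
git commit -q -m "Add my bio"          #    ... and commit them
git push -q origin master              # 3. push to the fork
# 4. is a Pull Request on github.com, which has no equivalent in this sandbox

# Read the latest commit message straight from the stand-in fork.
fork_head=$(git --git-dir="$sandbox/fork.git" log -1 --format=%s master)
echo "Latest commit on the fork: $fork_head"
```

Because everything lives under a temporary directory, the sandbox can be deleted afterwards without affecting any real repository.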
What's A Merge Conflict?
A merge conflict occurs when two users edit the same part of a file at the same time. Git cannot decide which edit was first and which was last, and therefore which edit should be in the most current copy. Hence the conflict.
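To see what a conflict looks like, the following sketch reproduces one in a throwaway repository. It is an illustration only: two branches stand in for two users, and the file name, branch name, and identity are placeholders chosen for the example.

```shell
# Throwaway demo: two branches edit the same line, then we try to merge them.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master      # name the first branch master
git config user.name "Demo User"             # placeholder identity for this repo only
git config user.email "demo@example.com"

echo "Favorite tool: none yet" > bio.md
git add bio.md
git commit -q -m "Add bio"

git checkout -q -b colleague                 # stands in for a second user
echo "Favorite tool: R" > bio.md
git commit -q -am "Colleague edits bio"

git checkout -q master
echo "Favorite tool: Python" > bio.md
git commit -q -am "I edit bio"

# Both sides changed the same line, so Git stops and asks a person to choose.
if git merge colleague > merge.log 2>&1; then
    merge_result="merged cleanly"
else
    merge_result="conflict"
fi
echo "Merge result: $merge_result"
cat bio.md                                   # shows both versions between conflict markers
```

Git marks the competing versions in the file with lines starting with <<<<<<<, =======, and >>>>>>>. Resolving the conflict means editing the file to keep the version you want, then running git add and git commit.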
Set up Upstream Remote
We want to directly update our local repo with any changes made in the central repo prior to starting our next edits or additions. To do this we need to set up the central repository as an upstream remote for our repo.
Step 1: Get Central Repository URL
First, we need the URL of the central repository. Navigate to the central repository in GitHub NEONScience/DI-NEON-participants. Select the green Clone or Download button (just like we did when we cloned the repo) to copy the URL of the repo.
Step 2: Add the Remote
Second, we need to connect the upstream remote (the central repository) to our local repo.
Make sure you are still in your local repository in bash. First, navigate to the desired directory.
$ cd ~/Documents/GitHub/DI-NEON-participants
and then type:
$ git remote add upstream https://github.com/NEONScience/DI-NEON-participants.git
Here you are identifying that this is a git command with git, and then that you are adding a remote named upstream with the given URL.
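One way to confirm that the remote was added is to list the remotes with git remote -v. The sketch below uses a scratch repository (an assumption, so that the example runs anywhere); in practice you would run only the git remote commands from inside your cloned DI-NEON-participants directory.

```shell
# Scratch repository so the example is self-contained.
scratch=$(mktemp -d)
git init -q "$scratch"
cd "$scratch" || exit 1

git remote add upstream https://github.com/NEONScience/DI-NEON-participants.git

# List every remote together with the URLs used to fetch from and push to it.
git remote -v

# Read the stored URL back from the repo configuration as a quick check.
upstream_url=$(git config --get remote.upstream.url)
echo "upstream points to: $upstream_url"
```

In your real clone, the list should also include origin, pointing at your fork on github.com.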
Step 3: Update Local Repo
Use git pull to sync your local repo with the central repo. Update your local repo using git pull with the added directions of upstream, indicating the central repository, and master, specifying which branch you are pulling down (remember, branches are a great tool to look into once you're comfortable with Git and GitHub, but we aren't going to focus on them. Just use master).
$ git pull upstream master
remote: Counting objects: 25, done.
remote: Compressing objects: 100% (15/15), done.
remote: Total 25 (delta 16), reused 19 (delta 10), pack-reused 0
Unpacking objects: 100% (25/25), done.
From https://github.com/NEONScience/DI-NEON-participants
74d9b7b..463e6f0 master -> origin/master
Auto-merging _posts/institute-materials/example.md
Understand the output: The output will change with every update. Several things to look for in the output:
- remote: … : tells you how many items have changed.
- From https:URL : which remote repository the data is being pulled from. We set up the central repository as the remote, but it can be lots of other repos too.
- Section with + and - : this visually shows you which documents are updated and the types of edits (additions/deletions) that were made.
Now that you've synced your local repo, let's check the status of the repo.
$ git status
Step 4: Complete the Cycle
Now that your local repo has the additions from the central repo, you can make your own edits, then add and commit those changes. Once you've done that, you can push the changes back up to your fork on github.com.
$ git push origin master
Now your commits are added to your forked repo on github.com and you're ready to repeat the loop with a Pull Request.
Workflow Summary
Syncing Central Repo with Local Repo
Setting It Up (only do this the initial time)
- Find & copy Central Repo URL
- git remote add upstream https://github.com/NEONScience/DI-NEON-participants.git
After Initial Set Up
Update your Local Repo & Push Changes
- git pull upstream master - pull down any changes and sync the local repo with the central repo
- make changes, git add and git commit
- git push origin master - push your changes up to your fork
- Repeat