
akaninghat/datalab


Data Analysis Lab

The purpose of this course is to provide an insight into the field of Data Analysis with large sets of experimental data. The students will learn to use and understand basic tools and methods which are used in real searches in gravitational wave and gamma-ray astronomy, such as those currently employed at AEI and LIGO.

Table of Contents

Resources

  1. Lab 1 - 13.10.2021 - Setting up
    1. Prerequisites
    2. Task 1
    3. Task 2
  2. Exercises
  3. Parallel Programming

Lab 1

Prerequisites

Get Git and a GitHub account

Generate an SSH key

To make uploading your solutions to the upcoming exercises easier, generate an SSH key on your machine now. It lets you work with your repository without being asked for a username and password each time.

  • Generate an SSH key using the terminal/command line. (Try this link first and try to figure it out. Also complete step 7, adding your SSH key to your GitHub account! If all else fails, here is a more detailed guide.)
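
As a rough sketch of what the linked guides walk through, the commands below create an Ed25519 key pair if none exists yet and print the public half, which is the part you paste into your GitHub account settings. The e-mail label and the KEY_FILE variable are placeholders. The empty passphrase (-N "") is only there to keep the sketch non-interactive; you may prefer to drop that option and choose a passphrase when prompted.

```shell
# Where the key lives; ~/.ssh/id_ed25519 is the usual default location.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_ed25519}"

# Make sure the directory exists (created private to you if it is new).
mkdir -p -m 700 "$(dirname "$KEY_FILE")"

# Generate a key pair only if there is none yet (never overwrite an existing key).
# "you@example.com" is a placeholder label; -N "" means "no passphrase".
if [ ! -f "$KEY_FILE" ]; then
  ssh-keygen -q -t ed25519 -C "you@example.com" -f "$KEY_FILE" -N ""
fi

# Print the public key; this is what you add to your GitHub account.
cat "$KEY_FILE.pub"

# After adding the key on GitHub, you can check the connection (needs network access):
#   ssh -T git@github.com
```

Only the .pub file is meant to be shared; the file without the extension is the private key and should stay on your machine.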

Get the code

  • Once you are logged in with your account, fork this repository by pressing the fork button on the upper right corner of this repository's page.


Now you should have your own copy of the repository in your namespace: <username>/datalab.

  • You need an SSH key added to your account to continue. If you do not have one, use the 'HTTPS' link for the repository instead, but you will be prompted for a username and password every time. Copy the git URL of your repository by going to your GitHub page, opening the repository, and clicking Code > SSH > copy.


  • Open a command line/terminal and clone your repository. The command should look something like:
git clone git@github.com:<username>/datalab.git

This automatically creates a new folder called datalab inside the folder where you ran the command, and fails with an error if a non-empty folder with that name already exists. If you want the folder to have another name, run git clone git@github.com:<username>/datalab.git <new_folder_name>. If you want to move the entire folder after cloning, everything will keep working, because the git references are kept in hidden files inside the folder.

Your first commit

  • Create a new file datalab/solutions/exercise_1.py and push your changes to your repository.
Solution here

Go to your datalab folder. Make a new folder called solutions:

$ mkdir solutions

Create a new file called exercise_1.py using any method you like, for example:

$ touch solutions/exercise_1.py

Check the changes to your repository:

$ git status

Commit the changes and then push them:

$ git add . 
$ git commit -m "Saving my changes."
$ git log
$ git push origin main

Rebase from upstream

The simplest way to get new changes that are pushed to this main repository is to add an upstream remote and rebase your code onto it. Before you rebase, commit all local changes that you want to keep. Try it yourself using this link.

Solution here

Go to your datalab folder. To see which repositories you are tracking, run git remote -v. The output will probably look like this:

$ git remote -v
origin	git@github.com:<your_username>/datalab.git (fetch)
origin	git@github.com:<your_username>/datalab.git (push)

Because you created the fork through the web interface, you can also get the new changes through the interface. The better way, however, is to add a 'remote' (a short name) pointing to the main repository you forked from. The textbook name for such a remote is upstream. Add a remote named upstream pointing to this repo using: git remote add upstream git@github.com:alebot/datalab.git. Now when you run git remote -v you should see something like this:

$ git remote -v
origin	git@github.com:<your_username>/datalab.git (fetch)
origin	git@github.com:<your_username>/datalab.git (push)
upstream	git@github.com:alebot/datalab.git (fetch)
upstream	git@github.com:alebot/datalab.git (push)

The best way to pull the new changes is with the rebase command. This means that any commits you have made will be 'rebased' onto the new changes in the repository you forked from. (Make sure you have committed all your changes before proceeding.)

$ git status
$ git add . 
$ git commit -m "Saving my changes."
$ git log
$ git fetch upstream
$ git rebase upstream/main
$ git log
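
To see what the fetch and rebase steps do without touching your real fork, the sketch below replays the same workflow on throwaway repositories in a temporary directory. Everything in it (the directory layout, the commit messages, the Demo identity) is made up for illustration; only the git commands mirror the ones above.

```shell
demo="$(mktemp -d)"
cd "$demo"

# Stand-in for the main course repository, with one commit on main.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" on any git version
git config user.name "Demo"
git config user.email "demo@example.com"
echo "Data Analysis Lab" > README.md
git add README.md
git commit -q -m "Initial commit"
cd ..

# Stand-in for your fork on GitHub (a bare repository) and your local clone of it.
git clone -q --bare upstream fork.git
git clone -q fork.git datalab
cd datalab
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add upstream "$demo/upstream"

# You commit a solution locally ...
mkdir solutions
touch solutions/exercise_1.py
git add .
git commit -q -m "Add exercise 1"

# ... while the main repository receives a new commit.
( cd "$demo/upstream" && echo "New exercise sheet" >> README.md && git commit -q -am "Update README" )

# Fetch the new commit and replay your own commit on top of it.
git fetch -q upstream
git rebase upstream/main
git log --oneline
```

In the final log, your own commit should sit on top of the commit that arrived from upstream. If you had already pushed those commits to your fork before rebasing, a plain git push will likely be rejected because the rebased commits have new hashes; git push --force-with-lease origin main is the usual way to update the fork in that case.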

Task 1

The first task is to compile the two C source files. Go to your datalab/code folder and simply try running in your command line:

make

I expect you might get some errors, such as missing libraries or missing executables. Try to solve them.

You can either do this in your local environment or use Docker and run a container with a C++ toolchain, for example this one.

Test that this is working correctly by running:

generate_source --help
prober --help

You should get no errors, just a help message.
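
If the build fails right away, a quick first check is whether the basic build tools are installed at all. The helper below is a minimal sketch: missing_tools is a made-up function name, and make and gcc are only a guess at what the Makefile needs, so adjust the list after reading it.

```shell
# Print the name of every given command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list; replace with whatever the Makefile actually calls.
missing_tools make gcc

# If you would rather not install anything locally, one option (an assumption,
# not necessarily the image linked above) is the official gcc image on Docker Hub:
#   docker run --rm -it -v "$PWD":/src -w /src gcc make
```

An empty output means every listed tool was found; missing libraries will still only show up as compiler or linker errors during the build itself.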

Task 2

  • Prepare your Python environment. To solve the following exercises you will need Python, preferably Python 3, and at least a plotting library such as matplotlib (numpy, pandas, etc. will probably be useful as well). If you are using Anaconda or Miniconda, make a new Python environment for the datalab.

  • Have an IDE or editor prepared, whether it is Jupyter, PyCharm, Notepad++, etc. The most important thing is that you can work with it easily. Try to write a script that prints "Hello World!" and run it.
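
One possible way to go through both steps from the command line is sketched below. The environment names (datalab-env, datalab) and the file name hello.py are placeholders, and the environment commands are left as comments because they download packages and depend on whether you use venv or conda.

```shell
# Optional: create an isolated environment first (pick one variant).
#   python3 -m venv datalab-env && . datalab-env/bin/activate
#   python -m pip install numpy pandas matplotlib
# or, with Anaconda/Miniconda:
#   conda create -n datalab python=3 numpy pandas matplotlib

# Write a one-line Python script and run it.
cat > hello.py <<'EOF'
print("Hello World!")
EOF
python3 hello.py

# Report which of the suggested libraries the current interpreter can import.
for module in numpy pandas matplotlib; do
  if python3 -c "import $module" 2>/dev/null; then
    echo "$module: available"
  else
    echo "$module: missing"
  fi
done
```

If a library is reported as missing, install it into the environment you actually run your scripts from, since each environment keeps its own set of packages.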

Exercises

Open Exercise_1.pdf, read the theory, and solve the tasks. Complete solutions here. The same applies to Exercise 3 and Exercise 4, with solutions. The final assignment sheet and data are in the assignment folder in this repo.

Parallel Programming

Resources:
