This project covers the Two Sigma competition on Kaggle (see the Two Sigma description).
To get started, clone the repository, cd to the project directory, and run the make command.
The notebook conventions are taken from here; below is a summary of some of the ideas from that page.
- Each notebook keeps a historical (and dated) record of the analysis as it’s being explored.
- The notebook is not meant to be anything other than a place for experimentation and development.
- Each notebook is controlled by a single author: a data scientist on the team (marked with initials).
- Notebooks can be split when they get too long.
- Notebooks can be split by topic, if it makes sense.
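The dating and initials conventions above could be encoded directly in the notebook filename. The sketch below is one possible scheme, not something this project prescribes: the function name `notebook_filename`, the `date-initials-topic` ordering, and the example values are all assumptions made for illustration.

```python
import re
from datetime import date


def notebook_filename(created, initials, topic):
    """Build a dated, author-tagged notebook name (hypothetical scheme)."""
    # Lowercase the topic and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return f"{created.isoformat()}-{initials.upper()}-{slug}.ipynb"


# Illustrative values only; "AB" stands in for an author's initials.
print(notebook_filename(date(2017, 1, 15), "ab", "Feature exploration"))
```

Putting the ISO date first means a plain directory listing sorts notebooks chronologically, and splitting a notebook by topic only requires a new slug.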
When issuing pull requests, the diffs between updated .ipynb files are hard to read, because .ipynb files are saved as JSON. One common solution is to commit a .py conversion instead. This makes changes to the input code easy to see, but it discards the output. When reviewing data science work, it is also incredibly important to see the output itself.
We get around these difficulties by committing the .ipynb, .py, and .html of every notebook. The .py and .html files can be created automatically by editing the config file at ~/.jupyter/jupyter_notebook_config.py. If you don’t have this file, run jupyter notebook --generate-config to create it. Then add the following code to the config file:
c = get_config()
### If you want to auto-save .html and .py versions of your notebook:
# modified from: https://github.com/ipython/ipython/issues/8009
import os
from subprocess import check_call

def post_save(model, os_path, contents_manager):
    """post-save hook for converting notebooks to .py scripts and .html files"""
    if model['type'] != 'notebook':
        return  # only do this for notebooks
    # run nbconvert from the notebook's own directory so outputs land beside it
    d, fname = os.path.split(os_path)
    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
    check_call(['jupyter', 'nbconvert', '--to', 'html', fname], cwd=d)

c.FileContentsManager.post_save_hook = post_save
Seems pretty self-explanatory, but you can look at the blog linked above if you want some clarification.
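The post-save hook only fires when a notebook is saved, so notebooks saved before the hook was installed will not have .py or .html companions. A minimal sketch of a one-off catch-up script follows, assuming jupyter is on your PATH. The helper names `nbconvert_commands` and `convert_existing` are hypothetical and not part of this repository; the nbconvert arguments mirror the ones used in the hook.

```python
from pathlib import Path
from subprocess import check_call


def nbconvert_commands(notebook_name):
    """Return the two nbconvert commands the post-save hook runs for one notebook."""
    return [
        ["jupyter", "nbconvert", "--to", fmt, notebook_name]
        for fmt in ("script", "html")
    ]


def convert_existing(root):
    """Convert every notebook under root, skipping Jupyter's checkpoint copies."""
    converted = []
    for nb in sorted(Path(root).rglob("*.ipynb")):
        if ".ipynb_checkpoints" in nb.parts:
            continue
        # Run from the notebook's directory so outputs land next to it.
        for cmd in nbconvert_commands(nb.name):
            check_call(cmd, cwd=nb.parent)
        converted.append(nb)
    return converted
```

Calling `convert_existing("develop")` from the project directory would convert the notebooks under that folder, assuming that is where they are kept.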
- Move final products from /develop to /src.
This is a team project for the Kaggle Two Sigma competition. Team "NULL": Xu Gao, Yaxiong Huang, Scott Edenbaum, Dodge Coates