Workflow

Many modern statistical methods require some programming. This is especially true of Bayesian modeling. But how do I best write code that others can use, understand, and collaborate on? How do I seek help effectively? We take a look at writing reproducible examples, the version control tool Git, and the code collaboration platform GitHub.

Problem

Generally, we want to

  • Do Bayesian data analyses
  • Write reproducible code
  • Seek help effectively
  • Collaborate with others
  • Make code/work available?

In this intro, we look at tools that help us achieve these goals. The more complex your analyses get, the more helpful these tools are likely to be.

Set-up

Let’s first make sure we’ve set up the tools:

    1. Ensure Git works
    2. Get help connecting to GitHub
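
A quick way to check the first item from a terminal is sketched below. It only inspects the local installation and does not contact GitHub; the fallback messages are illustrative wording, not Git output.

```shell
# Print the installed Git version; an error here means Git is not installed or not on PATH
git --version

# Show the name and email Git would attach to your commits (the fallback text is ours, not Git's)
git config --get user.name || echo "user.name is not set"
git config --get user.email || echo "user.email is not set"
```

If either value is unset, `git config --global user.name "Your Name"` and `git config --global user.email "you@example.com"` set them for all repositories on your computer.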

Git

  1. Git is a version control tool—a program on your computer
  2. Organize projects into repositories
    1. Local repository: /Users/matti/Documents/workshop/ (actually /Users/matti/Documents/workshop/.git)
    2. Remote repository: https://github.com/mvuorre/workshop.git
  3. Functions to
    1. Commit states to history
    2. Push and pull history to/from a remote repository
    3. and more…
  4. Powers most software collaborations
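
The commit and push steps above can be sketched entirely offline. In this minimal example a bare repository in a temporary directory stands in for the remote on GitHub; the directory names (`remote.git`, `local`, `collaborator`), the file `analysis.R`, and the demo identity are all hypothetical.

```shell
set -e
# Work in a throwaway directory so nothing else on your machine is touched
demo=$(mktemp -d)
cd "$demo"

# A bare repository stands in for the remote (on GitHub this would be a URL)
git init --quiet --bare remote.git

# Create the local repository and set an identity for this repository only
git init --quiet local
cd local
git config user.name "Workshop Demo"
git config user.email "demo@example.com"

# Commit a state to history
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"

# Connect the local repository to the remote and push the history to it
git remote add origin "$demo/remote.git"
git push --quiet origin HEAD

# A collaborator obtains the same history by cloning the remote
git clone --quiet "$demo/remote.git" "$demo/collaborator"
git -C "$demo/collaborator" log --oneline
```

After the initial clone, the collaborator would run `git pull` to fetch and merge any commits pushed later.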

Git

  1. Git can get extremely complicated
  2. I wrote a whole paper about it (Vuorre and Curley 2018), but I still look everything up on Kagi
  3. We want to know just enough and not more
    1. https://happygitwithr.com/
    2. https://docs.github.com/en/get-started/using-github/github-flow
    3. https://www.atlassian.com/git/tutorials/comparing-workflows

GitHub

  1. GitHub is a developer platform owned by Microsoft
  2. GitHub hosts remote Git repositories with interesting additions (live demo)
  3. Get the workshop’s source code from GitHub:
# In a directory where you're comfortable putting stuff
git clone https://github.com/mvuorre/workshop.git
# The clone lands in a directory named after the repository, without the .git suffix
cd workshop

There are many alternative services such as GitLab and Codeberg.

Collaborating with Git and GitHub

General workflow for contributing to others’ projects

  1. Find a problem and let the author know about it
    1. –> Submit an issue
  2. Fix the problem and submit your fix
    1. –> Submit a “pull request”
  3. In many cases you want to show examples of what’s going wrong and how
    1. Reproducible example
    2. Idea applies equally to e.g. seeking help for your own problems on forums etc.

Reproducible examples

  1. Learn: https://speakerdeck.com/jennybc/reprex-reproducible-examples-with-r
  2. Example: https://github.com/mvuorre/brms-workshop/issues/1
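
The linked material covers reprexes in R. The same idea carries over to other tools: a good example is small, builds everything it needs from scratch, and reports the software version. As a sketch in shell, here is what a reprex for a hypothetical Git question ("why doesn't my data file show up in `git status`?") might look like. The file names and the ignore rule are made up for illustration.

```shell
set -e
# Start from a clean, throwaway repository so anyone can rerun this
repro=$(mktemp -d)
cd "$repro"
git init --quiet

# Minimal setup that triggers the behaviour in question
echo "*.csv" > .gitignore
echo "a,b" > data.csv

# What I see: data.csv is missing from the list of untracked files
git status --short

# Ask Git which ignore rule, if any, matches the file
git check-ignore -v data.csv

# Include version information so helpers can match your environment
git --version
```

Pasting the script together with its output into an issue or forum post gives helpers everything they need to reproduce the behaviour on their own machine.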

Practice 1

  1. Create a reproducible example
  2. Submit your reprex as a new “example” issue at https://github.com/mvuorre/workshop/issues
  3. We’ll solve your problems together

Practice 2

Live example: Contributing to common repo (https://www.atlassian.com/git/tutorials/comparing-workflows)

  1. Get added as collaborator to https://github.com/brms-workshop/stuff
  2. git clone https://github.com/brms-workshop/stuff.git
  3. Find code that needs fixing, and let others know with an issue
  4. Fix code in a new branch
    1. e.g. Create a new file—this is an example.
  5. Submit branch to GitHub and open a pull request
  6. Discuss changes with others in pull request
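
The command-line part of steps 2, 4, and 5 can be sketched offline as follows. A local bare repository (`shared.git`) stands in for the shared GitHub repository, and the branch name `fix-script`, the file `script.R`, and the demo identity are hypothetical. Opening and discussing the pull request happens on GitHub and is not shown.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the shared repository on GitHub
git init --quiet --bare shared.git

# Clone it (Git warns that the repository is empty; that is expected here)
git clone --quiet shared.git work
cd work
git config user.name "Workshop Demo"
git config user.email "demo@example.com"

# Seed the default branch with code that needs fixing
echo 'print("hello"' > script.R
git add script.R
git commit --quiet -m "Add script"
git push --quiet origin HEAD

# Fix the code in a new branch (git switch needs Git 2.23+; older versions use git checkout -b)
git switch -c fix-script
echo 'print("hello")' > script.R
git commit --quiet -am "Fix missing parenthesis in script.R"

# Submit the branch to the remote; on GitHub you would then open a pull request from it
git push --quiet -u origin fix-script
git branch -r
```

Keeping the fix on its own branch leaves the default branch untouched until collaborators have reviewed the change.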

Practice 3

Live example: Contributing to someone else’s repo

  1. Fork the workshop repo to your GitHub account
  2. Clone your remote repo to your computer
    1. git clone https://github.com/{your-name}/workshop.git
  3. Make changes
    1. For example, fix the reprex.R file
  4. Push local changes to your remote
  5. Open a pull request
  6. Discuss changes with others in pull request
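
The fork-based steps can be simulated offline in the same spirit. In this sketch `upstream.git` plays the role of the original repository and `fork.git` plays the role of your fork on GitHub; both are local bare repositories, and the branch name `fix-reprex`, the file contents, and the demo identities are hypothetical. On GitHub, the fork is created with the Fork button rather than with `git clone --bare`.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# The original author's repository, seeded with a file that needs fixing
git init --quiet --bare upstream.git
git clone --quiet upstream.git author
git -C author config user.name "Original Author"
git -C author config user.email "author@example.com"
echo 'mean(c(1, 2, NA))' > author/reprex.R
git -C author add reprex.R
git -C author commit --quiet -m "Add reprex.R"
git -C author push --quiet origin HEAD

# "Fork": your own copy of the upstream repository
git clone --quiet --bare upstream.git fork.git

# Clone your fork and keep a reference to the original as "upstream"
git clone --quiet fork.git workshop
cd workshop
git config user.name "Workshop Demo"
git config user.email "demo@example.com"
git remote add upstream "$demo/upstream.git"

# Make changes on a branch and push them to your fork (origin), not to upstream
git switch -c fix-reprex
echo 'mean(c(1, 2, NA), na.rm = TRUE)' > reprex.R
git commit --quiet -am "Handle missing values in reprex.R"
git push --quiet -u origin fix-reprex
git remote -v
```

The pull request is then opened on GitHub from the branch in your fork to the original repository, whose history only changes once the author merges it.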

Wrap-up

  • Bayesian statistics?!?
  • Reproducible examples are essential for seeking help
    • There will come a time when you need help!
  • Proper tools help us collaborate better
  • Visibility
    • Can choose public/private repos
    • Be careful if this is something you’re concerned about

References

Vuorre, Matti, and James P. Curley. 2018. “Curating Research Assets: A Tutorial on the Git Version Control System.” Advances in Methods and Practices in Psychological Science 1 (2): 219–36. https://doi.org/10.1177/2515245918754826.