Note: The practices and tools of open science are in constant development. As such, this is a living document that is updated often and should become more refined over time. Because of this, some parts of this document may be incomplete.
The main philosophy of this toolkit is to encourage reproducible and open scientific practices by automating difficult tasks and providing a (strongly) opinionated view on which tools and processes to use when doing open science. The goal is to reduce the burden on researchers of actually doing open and reproducible science, in particular when creating abstracts, slides, posters, and manuscripts. Currently, the target users are biomedical, medical, and health researchers, though this toolkit could easily be used in other disciplines.
The range of tools and services available for doing open science is broad and diverse. This diversity has many advantages, especially now, while open science is still growing and evolving. However, it is also a huge barrier for researchers and scientists who are just starting out: there are too many choices and very little guidance on what to use. For that reason, this toolkit takes a strong stance on which tools and services to use. Below is a brief overview of the tools used in the larger steps involved in creating scientific output, followed by more detailed explanations.
All projects, files, coding, writing, and other activities will be done in RStudio. RStudio is an exceptional environment for working with R (and other languages) and has many features that make it simpler to follow open scientific workflows.
Version control is vital to an open scientific workflow. Since Git is well established in the open source, software development, and R package development worlds, it is the natural choice for version control. RStudio also has an excellent and fairly easy-to-use interface to Git.
Analyses should follow a style similar to that used when creating R packages (read this excellent book on R package creation for more details).
All writing should be done using R Markdown. Big-picture output such as figures and tables should be created as single-purpose functions (e.g. figure_two()), with each function inserted into its own code chunk (especially important for figures). Inserting citations is fairly simple, with the bibliography file preferably saved in the same folder as the manuscript or other scientific output (in the doc/ folder). Exploratory analyses should be done in an R Markdown file in the doc/ folder as well.
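As an illustration of the single-purpose function idea, here is a minimal sketch in base R. Only the name figure_two() comes from the text above; the data set (the built-in mtcars) and the contents of the plot are placeholder assumptions, not part of the toolkit.

```r
# Hypothetical figure function: the data and plot contents are placeholders.
figure_two <- function(data = datasets::mtcars) {
  # Base graphics keep this sketch free of extra package dependencies.
  plot(
    x = data$wt,
    y = data$mpg,
    xlab = "Weight (1000 lbs)",
    ylab = "Miles per gallon",
    main = "Figure 2"
  )
  # Return the data invisibly so the function can be checked in tests.
  invisible(data)
}

# The code chunk in the R Markdown file would then contain only this call.
figure_two()
```

Keeping all plotting logic inside the function means the manuscript chunk stays a single line, and the figure can be regenerated or modified without touching the text of the document.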
General collaboration should be done through GitHub Pull Requests or GitLab Merge Requests. If the team's technical skill with Git is low, designate a team member with a bit more Git skill (or one who is willing and able to put time into learning) as the “Git liaison”.
Slides should preferably be created using the options available in R Markdown. The best way to create posters is still in development.
Once the manuscript, slides, or poster have been finalized, several actions should be taken:
For number 1 above, increasing the version number of the project is a way to track bigger-picture milestones in the project. For small abstracts this isn’t really necessary, since they are likely part of a larger scientific output.
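To make the version increase concrete, here is a rough sketch of bumping a version string in base R. It assumes a three-part "major.minor.patch" scheme, which the text above does not specify, and bump_version() is a hypothetical helper written for this example, not a function provided by the toolkit.

```r
# Hypothetical helper: assumes a three-part "major.minor.patch" version string.
bump_version <- function(version, level = c("patch", "minor", "major")) {
  level <- match.arg(level)
  # Split "0.1.0" into its integer components: 0, 1, 0.
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3, !anyNA(parts))
  # Find which component to increase.
  index <- match(level, c("major", "minor", "patch"))
  parts[index] <- parts[index] + 1L
  # Reset every component below the one that was increased.
  if (index < 3) {
    parts[(index + 1):3] <- 0L
  }
  paste(parts, collapse = ".")
}

# A finalized manuscript might count as a "minor" milestone.
bump_version("0.1.0", "minor")
```

Under this assumed scheme, a finished manuscript or poster could correspond to a minor or major increase, while small fixes would only touch the patch component.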