Lesson | Overview |
---|---|
1. Download and verify data | Downloading data with wget/curl and checking the transferred data's integrity with checksums |
2. Streams, Redirection and Pipe | Combining pipes and redirection, using exit statuses |
3. Inspecting and Manipulating Text Data with UNIX Tools - Part 1 | Inspecting files with utilities such as head and less. Extracting and formatting tabular data. Magical grep. |
4. Inspecting and Manipulating Text Data with UNIX Tools - Part 2 | Substituting matching patterns with sed. Text processing with awk and bioawk |
5. Automating File-Processing with find and xargs | Search files by pattern with find and use xargs to execute a command for those objects matching the pattern |
6. Puzzles 🧩 | Can you use shell scripts to solve these "real" life challenges in molecular biology? |
7. Supplementary - 1 | Recap - Unix, Linux and Unix shell |
8. Supplementary - 2 | Recap - Shell basics and commands |
9. Supplementary - 3 | Escaping, Special Characters |
Attribution Notice
- This workshop material is heavily inspired by:
    - Buffalo, V. (2015). Bioinformatics Data Skills. O'Reilly Media, Inc.
    - The Carpentries. The Unix Shell. https://swcarpentry.github.io/shell-novice/
    - The Carpentries. Introduction to Command Line for Genomics. https://datacarpentry.org/shell-genomics/
    - Rosalind Project. https://rosalind.info/about/
License
Genomics Aotearoa / New Zealand eScience Infrastructure "Intermediate Shell for Bioinformatics" is licensed under the GNU General Public License v3.0, 29 June 2007. (Follow this link for more information)
Setup
- If possible, we recommend using the Remote option over Local (especially for Windows hosts). This eliminates the need to install any additional applications.
- The Remote option requires an existing NeSI account.
Remote
Log into the NeSI Mahuika Jupyter Service (not required if the workshop is running on the OpenOnDemand-based training environment)
- Follow https://jupyter.nesi.org.nz/hub/login
- Enter your NeSI username, HPC password and 6-digit second factor token
- Choose the server options: make sure to select the correct project code nesi02659, number of CPUs (CPUs=2) and memory (4 GB) before pressing the button.
Local
Local host setup - Windows, MacOS & Linux
- Windows: install either
    - Git for Windows from https://git-scm.com/download/win OR
    - MobaXterm Home (Portable or Installer edition) from https://mobaxterm.mobatek.net/download-home-edition.html
        - The Portable edition does not require administrative privileges
- MacOS: the native terminal client is sufficient.
    - It might not come with wget, which is used to download data via the command line (it can be installed with $ brew install wget)
    - However, wget is not required, as we provide a direct link to download the data in .zip format
- Linux: the native terminal client is sufficient.
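To confirm which command-line download tool a local host already has, a check along the following lines may help. This is a minimal sketch assuming a bash-compatible shell; the function name pick_downloader is hypothetical and not part of the workshop material.

```bash
#!/usr/bin/env bash
# Report which command-line download tool is available on this host.
# pick_downloader is a hypothetical helper name used only for illustration.
pick_downloader() {
    if command -v wget >/dev/null 2>&1; then
        echo "wget"
    elif command -v curl >/dev/null 2>&1; then
        echo "curl"
    else
        echo "none"
    fi
}

downloader=$(pick_downloader)
case "$downloader" in
    wget|curl) echo "Found $downloader; command-line downloads should work." ;;
    none)      echo "Neither wget nor curl found; use the direct .zip download link instead." ;;
esac
```

If the script reports that neither tool is present, the data can still be fetched through the direct .zip link, so installing wget remains optional.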
Install bioawk on all hosts
One of the tools used in this workshop is bioawk, which is not a native Linux/UNIX utility. It can be installed on MacOS with $ brew install bioawk and on Linux with $ sudo apt install bioawk. Windows hosts might have to install it via conda according to these instructions. However, this requires a prior install of Anaconda or Miniconda.
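Before the workshop, it may be worth checking whether bioawk is already on the PATH. The sketch below assumes a bash-compatible shell and uses uname to guess the host type; suggest_bioawk_install is a hypothetical helper name, and the Linux branch assumes a Debian/Ubuntu-style system where apt is available.

```bash
#!/usr/bin/env bash
# Check whether bioawk is on the PATH and, if not, print the install
# command mentioned in the setup notes for the detected operating system.
# suggest_bioawk_install is a hypothetical helper name for illustration.
suggest_bioawk_install() {
    case "$(uname -s)" in
        Darwin) echo "brew install bioawk" ;;
        Linux)  echo "sudo apt install bioawk" ;;  # assumes a Debian/Ubuntu-style system
        *)      echo "install via conda (requires Anaconda or Miniconda)" ;;
    esac
}

if command -v bioawk >/dev/null 2>&1; then
    bioawk_status="installed"
    echo "bioawk found at: $(command -v bioawk)"
else
    bioawk_status="missing"
    echo "bioawk not found; try: $(suggest_bioawk_install)"
fi
```

On hosts where uname reports something other than Darwin or Linux (for example Git for Windows), the sketch falls back to the conda suggestion.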