Lesson Overview
1. Download and verify data: Download data with wget/curl and check the integrity of the transferred data with checksums.
2. Streams, Redirection and Pipe: Combine pipes and redirection; use exit statuses.
3. Inspecting and Manipulating Text Data with UNIX Tools - Part 1: Inspect files with utilities such as head and less. Extract and format tabular data. Magical grep.
4. Inspecting and Manipulating Text Data with UNIX Tools - Part 2: Substitute matching patterns with sed. Process text with awk and bioawk.
5. Automating File-Processing with find and xargs: Search for files by pattern with find, and use xargs to execute a command on the files that match.
6. Puzzles 🧩: Can you use shell scripts to solve these "real-life" challenges in molecular biology?
7. Supplementary - 1: Recap - Unix, Linux and the Unix shell
8. Supplementary - 2: Recap - Shell basics and commands
9. Supplementary - 3: Escaping, Special Characters


License

Genomics Aotearoa / New Zealand eScience Infrastructure "Intermediate Shell for Bioinformatics" is licensed under the GNU General Public License v3.0, 29 June 2007.


Setup

  • If possible, we recommend the Remote option over Local (especially for Windows hosts), as it eliminates the need to install any additional applications.

  • The Remote option requires an existing NeSI account.

Remote

Log in to the NeSI Mahuika Jupyter Service (not required if the workshop is running on an OpenOnDemand-based training environment)
  1. Go to https://jupyter.nesi.org.nz/hub/login
  2. Enter your NeSI username, HPC password and 6-digit second-factor token

  3. Choose the server options as follows
    >>Make sure to choose the correct project code (nesi02659), number of CPUs (CPUs=2) and memory (4 GB) before starting the server.


Local ⚠

Local host setup - Windows, MacOS & Linux
  • Native terminal client is sufficient.
  • It might not come with wget for downloading data from the command line (it can be installed with, for example, $ brew install wget)
  • However, wget is not required, as we provide a direct link to download the data in .zip format
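
To see which download tool a local terminal already provides, a quick check along these lines may help. This is only a sketch: the function name pick_downloader is made up for illustration, and curl is included as a fallback because it is the other downloader covered in lesson 1.

```shell
#!/usr/bin/env bash
# Report which command-line download tool is available on this machine.
# pick_downloader is a hypothetical helper name, not part of the workshop material.
pick_downloader() {
    if command -v wget >/dev/null 2>&1; then
        echo "wget"
    elif command -v curl >/dev/null 2>&1; then
        echo "curl"
    else
        echo "none"
    fi
}

downloader=$(pick_downloader)
case "$downloader" in
    wget|curl) echo "Command-line downloads can use: $downloader" ;;
    none)      echo "Neither wget nor curl found; use the direct .zip link instead" ;;
esac
```

If the script reports "none", nothing needs to be installed: the direct .zip link covers that case.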

bioawk install on all hosts

One of the tools used in this workshop is bioawk, which is not a native Linux/UNIX utility. It can be installed on MacOS with $ brew install bioawk and on Linux with $ sudo apt install bioawk. Windows hosts might have to install it via conda according to these instructions; this requires a prior install of Anaconda or Miniconda.
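
The install routes above can be wrapped in a small check that only prints a suggestion when bioawk is missing. This is a rough sketch under a few assumptions: suggest_bioawk_install is a hypothetical helper name, the operating system is detected from the output of uname -s, the Linux branch assumes a Debian/Ubuntu-style system with apt, and anything else (including Windows shells) is pointed to conda.

```shell
#!/usr/bin/env bash
# Map an operating system name (as printed by `uname -s`) to the install
# route mentioned in the text. suggest_bioawk_install is a hypothetical helper.
suggest_bioawk_install() {
    case "$1" in
        Darwin) echo "brew install bioawk" ;;        # MacOS with Homebrew
        Linux)  echo "sudo apt install bioawk" ;;    # assumes an apt-based distribution
        *)      echo "conda (requires Anaconda or Miniconda)" ;;
    esac
}

# Only print a suggestion when bioawk is not already on the PATH.
if command -v bioawk >/dev/null 2>&1; then
    echo "bioawk is already installed"
else
    echo "bioawk not found; suggested install: $(suggest_bioawk_install "$(uname -s)")"
fi
```

The script never installs anything itself; it only reports which command is likely to apply, so it is safe to run on any host.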