=CGRL User Guide for Computing Workshop=

Introduction
As a CGRL user, you have access to computing resources in the Berkeley Research Computing (BRC) high-performance computing (HPC) environment, including a wide range of genomics and bioinformatics programs. Learning to use these resources efficiently is a challenge, even for those familiar with the command line. In this workshop, we'll introduce the resources available to CGRL users, including new computing resources for your largest genomics projects, and recommend best practices.

Starting point: Berkeley Research Computing (BRC) supercluster high-performance computing (HPC) user guide

CGRL-specific user guide for the Vector cluster and Rosalind condo in Savio
 * How do I sign up for an account, link my account to BRC HPC, and log in?
 * Where should I store data and how should I move it around?
 * What compute nodes are available?
 * How do I run my jobs?

Getting data into the BRC HPC environment
code
# log in to the DTN
ssh username@dtn.brc.berkeley.edu

# log in to the Genomic Sequencing Laboratory's FTP server
lftp ftp://gslftp@gslserver.qb3.berkeley.edu
code
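
After logging in to the FTP server, the data still has to be copied into the cluster. As a rough sketch (not part of the original workshop material), the helper below only prints a non-interactive lftp command that would mirror a remote folder into scratch, so the command can be inspected before it is run from the DTN. The function name `gsl_mirror_cmd`, the folder name `run_folder`, and the destination path are hypothetical; the FTP address and the `/global/scratch/$USER` layout are taken from the commands in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) an lftp command that mirrors a folder
# from the sequencing lab's FTP server into a directory on the cluster.
gsl_mirror_cmd() {
    local remote_dir="$1"                      # folder on the FTP server (assumed name)
    local dest="${2:-/global/scratch/$USER}"   # destination; defaults to your scratch folder
    # "mirror" copies a remote directory tree; "bye" closes the session afterwards.
    printf 'lftp -e "mirror %s %s; bye" ftp://gslftp@gslserver.qb3.berkeley.edu\n' \
        "$remote_dir" "$dest"
}

# Dry run: review the printed command, then paste it into your DTN session.
gsl_mirror_cmd run_folder /global/scratch/username/run_folder
```

Printing the command rather than running it keeps the sketch safe to try anywhere; lftp will still prompt for the FTP password when the printed command is actually executed.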

Example interactive Slurm session
code
# see what Slurm jobs are running on CGRL's Vector cluster
squeue -p vector

# see all of my jobs running at the moment
squeue -u $USER

# start an interactive bash session as a Slurm job with 1 CPU on a Savio2 HTC node in CGRL's Rosalind condo
srun --pty --partition=savio2_htc --account=co_rosalind --qos=rosalind_htc2_normal --time=00:30:00 bash -i

# see information about the job you're currently running
echo $SLURM_JOB_ID
scontrol show job $SLURM_JOB_ID

# see all of the jobs running on the HTC node partition of the Savio2 cluster
squeue -p savio2_htc

# exit your bash session
exit
code
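
Because the interactive session looks just like an ordinary shell, it is easy to lose track of whether you are still inside the Slurm job. The snippet below is a small sketch of a check built on the same `$SLURM_JOB_ID` variable used above, which Slurm sets inside a job. The function name `slurm_job_status` is made up for illustration.

```shell
#!/usr/bin/env bash
# Report whether the current shell is running inside a Slurm job.
# SLURM_JOB_ID is set by Slurm inside a job and is normally unset elsewhere.
slurm_job_status() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside Slurm job ${SLURM_JOB_ID}"
    else
        echo "not inside a Slurm job"
    fi
}

slurm_job_status
```

Running this before starting a heavy command is a cheap way to confirm the work will land on a compute node rather than on the node you logged in to.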

Example Slurm batch job: RNA-Seq quantification with kallisto
code
# download some example data to your Savio (Rosalind condo) scratch folder
cd /global/scratch/$USER
curl -L "https://www.dropbox.com/sh/0abnf67z8m9iv02/AADbq28QEqBXmfPFe7jVvbiLa?dl=1" > download.zip
unzip download.zip

# edit the batch script
vim kallisto_for_workshop.sh

# run the Slurm batch job with 4 CPUs on a Savio2 HTC node in CGRL's Rosalind condo
sbatch kallisto_for_workshop.sh test_genome_index test 40 test_R1.fastq test_R2.fastq

# check the output
head test/abundance.tsv
code
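
The contents of `kallisto_for_workshop.sh` are not reproduced in this guide. The sketch below shows what a batch script of this kind might look like, written to a separate, hypothetical file (`kallisto_sketch.sh`) so it cannot overwrite the workshop script. The partition, account, and QoS values are copied from the interactive `srun` example, and the 4 CPUs from the description of the batch job. Two things are assumptions: that the third argument (`40`) is the number of bootstrap samples, and that kallisto is made available by a module named `kallisto`.

```shell
#!/usr/bin/env bash
# Write a hypothetical batch script; the quoted 'EOF' keeps $1, $2, ... literal.
cat > kallisto_sketch.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=kallisto_sketch
#SBATCH --partition=savio2_htc
#SBATCH --account=co_rosalind
#SBATCH --qos=rosalind_htc2_normal
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

# Positional arguments, in the order used by the sbatch command in this guide.
index="$1"        # kallisto index
outdir="$2"       # output directory (kallisto writes abundance.tsv here)
bootstraps="$3"   # ASSUMPTION: the third argument is the bootstrap count
reads1="$4"       # forward reads
reads2="$5"       # reverse reads

# ASSUMPTION: kallisto is provided by an environment module with this name.
module load kallisto

# Use as many threads as CPUs requested with --cpus-per-task above.
kallisto quant -i "$index" -o "$outdir" -b "$bootstraps" \
    -t "$SLURM_CPUS_PER_TASK" "$reads1" "$reads2"
EOF

# Check the generated script for syntax errors without running it.
bash -n kallisto_sketch.sh && echo "syntax OK"
```

Under these assumptions the sketch would be submitted the same way as the workshop script, with the five arguments after the script name. Keeping the resource requests in `#SBATCH` lines means the `sbatch` command itself stays short.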

Customizing, locally installing
code
# set a local library directory for installing R packages
cd ~
vim .bashrc
# add something like the following: export R_LIBS_USER="/global/home/users/$USER/R"
source ~/.bashrc

module load r/3.2.5
R
code
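
Editing `.bashrc` by hand works, but R only uses an `R_LIBS_USER` directory that already exists, so the folder has to be created as well. The following is a minimal sketch that does both steps and can be run more than once without adding duplicate lines. The function name `setup_r_libs` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the R library directory and make sure the given
# rc file exports R_LIBS_USER. Safe to run repeatedly.
setup_r_libs() {
    local rcfile="$1"   # shell startup file, e.g. ~/.bashrc
    local libdir="$2"   # library directory, e.g. /global/home/users/$USER/R
    mkdir -p "$libdir"
    # Append the export line only if the rc file does not already set the variable.
    if ! grep -q '^export R_LIBS_USER=' "$rcfile" 2>/dev/null; then
        printf 'export R_LIBS_USER="%s"\n' "$libdir" >> "$rcfile"
    fi
}

# Example call (uncomment to apply it to your own account):
# setup_r_libs ~/.bashrc "/global/home/users/$USER/R"
```

After running it, `source ~/.bashrc` (or a fresh login) picks up the new variable, and packages installed from within R should then go to that directory.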