Closed: Understanding RNA-Seq Pipeline: newbie
4.7 years ago
WUSCHEL ▴ 750

I am new to bioinformatics and RNA-seq. I plan to use the workflow below to analyze my Arabidopsis thaliana RNA-seq data, but I am not sure how to get started because I do not understand the purpose of each command. Could someone explain the purpose of the main steps in this pipeline, and the general workflow?

#!/bin/bash

# Use kallisto to perform k-mer based transcript quantification
# https://www.nature.com/articles/nbt.3519
# Build the annotation index beforehand with: kallisto index -i annotation.idx annotation.fa

set -eu

if [ "$#" -lt 5 ]; then
    echo "Missing arguments!"
    echo "USAGE: kallisto.sh <SE,PE> <R1> <R2> <strandedness> <index> <name>"
    echo "strand: unstranded, fr_stranded, rf_stranded"
    echo "EXAMPLE: kallisto.sh PE SRR5724597_1.fastq.gz SRR5724597_2.fastq.gz unstranded AtRTD2_19April2016.idx col0-r1"
    exit 1
fi

dow=$(date +"%F")

###########
### SINGLE END
###########

if [ "$1" == "SE" ]; then
    # requirements
    if [ "$#" -ne 5 ]; then
        echo "Missing required arguments for single-end!"
        echo "USAGE: kallisto.sh <SE> <R1> <strandedness> <index> <name>"
        exit 1
    fi

type=$1
R1=$2
strand=$3
annotation=$4
name=$5

echo "##################"
echo "Performing single-end alignments with kallisto"
echo "Type: $type"
echo "Input Files: $R1"
echo "Annotation: $annotation"
echo "Sample: $name"
echo "Time of analysis: $dow"
echo "##################"

# file structure (assumes $R1 is a file name in the current directory, not a path)
mkdir "${name}_kallisto_${dow}"
mv "$R1" -t "${name}_kallisto_${dow}"
cd "${name}_kallisto_${dow}"

mkdir 0_fastq
mv "$R1" -t 0_fastq/

### Read trimming & FastQC
echo "Read trimming and FastQC"

mkdir 1_trimmed_fastq
cd 1_trimmed_fastq
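# Trim adapters and low-quality bases with Trim Galore, then run FastQC on the trimmed reads
trim_galore --fastqc --fastqc_args "--threads 4" "../0_fastq/$R1" 2>&1 | tee -a "../${name}_logs_${dow}.log"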
cd ../

mkdir 2_quant/
mv 1_trimmed_fastq/*fq.gz 2_quant/
cd 2_quant/

echo "                      "
echo "kallisto"
echo "                      "

# -b 50: number of bootstrap samples; -l 300 / -s 100: assumed mean and SD of the
# fragment length, which kallisto requires for single-end reads; --bias: sequence bias correction
if [ "$strand" == "unstranded" ]; then
    kallisto quant -i "$annotation" -t 4 --bias --single "${R1%%.fastq*}"_trimmed.fq* -b 50 -l 300 -s 100 -o ./ 2>&1 | tee -a "../${name}_logs_${dow}.log"
elif [ "$strand" == "fr_stranded" ]; then
    kallisto quant -i "$annotation" --fr-stranded -t 4 --bias --single "${R1%%.fastq*}"_trimmed.fq* -b 50 -l 300 -s 100 -o ./ 2>&1 | tee -a "../${name}_logs_${dow}.log"
else
    # any other value, including rf_stranded, is treated as rf-stranded
    kallisto quant -i "$annotation" --rf-stranded -t 4 --bias --single "${R1%%.fastq*}"_trimmed.fq* -b 50 -l 300 -s 100 -o ./ 2>&1 | tee -a "../${name}_logs_${dow}.log"
fi

mv *fq.gz ../1_trimmed_fastq/

echo "complete"

fi
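
As far as I can tell, two small pieces of shell logic drive the kallisto step: the `${R1%%.fastq*}` expansion, which strips everything from `.fastq` onward so the script can find the `_trimmed.fq.gz` file that Trim Galore writes for single-end input, and the strandedness argument, which only decides whether `--fr-stranded` or `--rf-stranded` is passed to `kallisto quant`. Below is a minimal sketch of that logic in isolation. The helper names `trimmed_name` and `strand_flag` are hypothetical and not part of the script above, and unlike the original `else` branch this sketch rejects unknown strandedness values.

```shell
#!/bin/bash
# Sketch only: trimmed_name and strand_flag are hypothetical helpers,
# not part of the original kallisto.sh.
set -eu

# Remove the longest suffix that starts at ".fastq", then append the suffix
# Trim Galore uses for single-end output (sample.fastq.gz -> sample_trimmed.fq.gz).
trimmed_name() {
    local r1=$1
    printf '%s_trimmed.fq.gz\n' "${r1%%.fastq*}"
}

# Map the script's strandedness argument to the matching kallisto quant flag.
# "unstranded" needs no flag, so an empty line is printed.
strand_flag() {
    case "$1" in
        unstranded)  printf '\n' ;;
        fr_stranded) printf '%s\n' '--fr-stranded' ;;
        rf_stranded) printf '%s\n' '--rf-stranded' ;;
        *) echo "Unknown strandedness: $1" >&2; return 1 ;;
    esac
}

trimmed_name "SRR5724597_1.fastq.gz"   # prints SRR5724597_1_trimmed.fq.gz
strand_flag "rf_stranded"              # prints --rf-stranded
```

With helpers like these, the three near-identical `kallisto quant` lines could in principle collapse into one call that inserts the flag returned by `strand_flag`, but that is a design choice rather than something the original script does.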

Your help is greatly appreciated. Thank you.

RNA-Seq next-gen sequencing • 133 views