Question: Pipeline for >100 RNA-Seq PE samples
3.1 years ago, umn_bist wrote:

Although I am familiar with the RNA-seq workflow for n<20 samples, this is my first time handling a large RNA-seq data set. The samples are tumors with matched normals.

Are there any automated pipelines commonly used in the field for pre-alignment QC, alignment, and post-alignment QC of a large set of RNA-seq data?

Two areas that I am having difficulty automating are:

  • [cutadapt] Providing an adapter list for both the forward and reverse reads of each pair. I have a single list, but I do not know whether cutadapt will automatically reverse the adapter sequences. I also need to determine a value for --overlap=LENGTH. I may use BBMap in place of cutadapt.
cutadapt -q 10,10 -a "${adapter}" -A "${adapter}" -o "${file1%_1.fastq}_1_trimmed.fastq" -p "${file2%_2.fastq}_2_trimmed.fastq" "${file1}" "${file2}"
  • [TopHat2] Providing --mate-inner-dist and --mate-std-dev, as these will vary from sample to sample.
tophat -p 10 --mate-inner-dist {} --mate-std-dev {} --no-coverage-search --output-dir "${file}" --transcriptome-index
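
To make the two automation problems above concrete, here is a rough dry-run sketch in bash. It only prints the commands rather than running them. The function names, the separate forward/reverse adapter arguments, and the example numbers are all assumptions for illustration and are not part of my actual pipeline. The inner-distance arithmetic follows the TopHat manual's example (300 bp fragments with 50 bp reads give an inner distance of 200). The mean fragment length and standard deviation would still have to be estimated per sample, for example from a pilot alignment.

```bash
#!/usr/bin/env bash
# Dry-run helpers: they print command lines, they do not execute cutadapt or tophat.
# All function names here are hypothetical, chosen for this sketch only.

# Strip the "_1.fastq" suffix to get the prefix shared by both mates.
sample_prefix() {
    local r1="$1"
    printf '%s\n' "${r1%_1.fastq}"
}

# Inner distance between mates = mean fragment length - 2 * read length.
inner_dist() {
    local fragment_mean="$1" read_len="$2"
    echo $(( fragment_mean - 2 * read_len ))
}

# Print the cutadapt command for one pair, given the read-1 file name and
# one adapter per read (assumed to be supplied separately for -a and -A).
cutadapt_cmd() {
    local r1="$1" adapter_fwd="$2" adapter_rev="$3"
    local prefix
    prefix="$(sample_prefix "$r1")"
    printf 'cutadapt -q 10,10 -a %s -A %s -o %s_1_trimmed.fastq -p %s_2_trimmed.fastq %s_1.fastq %s_2.fastq\n' \
        "$adapter_fwd" "$adapter_rev" "$prefix" "$prefix" "$prefix" "$prefix"
}

# Print the per-sample TopHat2 insert-size options from estimated values.
tophat_insert_args() {
    local fragment_mean="$1" read_len="$2" std_dev="$3"
    printf -- '--mate-inner-dist %s --mate-std-dev %s\n' \
        "$(inner_dist "$fragment_mean" "$read_len")" "$std_dev"
}
```

In a directory of paired files, something like `for r1 in *_1.fastq; do cutadapt_cmd "$r1" "$adapter_fwd" "$adapter_rev"; done` would list every command for review before anything is run. The printed lines are unquoted, so they are meant for inspection and would need quoting for file names containing spaces.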

Also, I am using cutadapt to trim bases with quality scores below 10. Is this acceptable if I am going to filter out variants with MQ <30 and QUAL <20 using SnpSift?
