Question: Do fastq files downloaded from NCBI (SRA) need preprocessing before being used with bowtie2 and featureCounts?
3 months ago, RS wrote:

I am new to RNA-Seq data analysis. Sorry for the very basic question :)

I was given the task of generating read counts for a published dataset. I followed the pipeline below and generated read counts. But this pipeline has no preprocessing of the fastq files, such as removal of ribosomal sequences. Do these fastq files contain ribosomal sequences? Is any other preprocessing always required?

bowtie2-build TAIR10_chr_all.fas Araindex

bowtie2 -x Araindex -1 SRR6371142_1.fastq -2 SRR6371142_2.fastq -S SRR6371142.SAM

sam.files <- list.files(path = "sf_SHARED-VB/", pattern = ".SAM$", full.names = TRUE)

fc <- featureCounts(sam.files, annot.ext="Arabidopsis_thaliana.TAIR10.41.gtf",isPairedEnd=TRUE,isGTFAnnotationFile=TRUE)

Thanks for help!


For a published dataset, you can check the paper to find out which preprocessing steps, if any, were performed. Otherwise, you may have to align the reads to the ribosomal RNA genes and check whether any of them map there!

written 3 months ago by Sej Modha
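One way the rRNA check suggested above might look in practice is sketched here. It assumes you have put together your own FASTA file of ribosomal RNA sequences and built a bowtie2 index from it; the file name `rRNA.fasta`, the index prefix `rRNAindex`, and both function names are hypothetical. bowtie2 also prints an overall alignment rate to stderr, so the awk helper is only a way to get the same kind of number from an existing SAM file.

```shell
#!/usr/bin/env bash
# Sketch only: estimate what percentage of reads align to rRNA sequences.
# Hypothetical inputs: rRNA.fasta (rRNA sequences you collected yourself)
# and rRNAindex (the index prefix built from it).

# Align paired-end reads to an rRNA-only bowtie2 index.
align_to_rrna() {
    local index=$1 r1=$2 r2=$3 out_sam=$4
    bowtie2 -x "$index" -1 "$r1" -2 "$r2" -S "$out_sam"
}

# Print the percentage of read records in a SAM file that are mapped.
# Header lines and secondary/supplementary records are skipped; a record
# counts as mapped when FLAG bit 0x4 (read unmapped) is not set.
percent_mapped() {
    awk -F'\t' '
        /^@/ { next }                                 # SAM header
        {
            flag = $2 + 0
            if (int(flag / 256) % 2 == 1) next        # secondary alignment
            if (int(flag / 2048) % 2 == 1) next       # supplementary alignment
            total++
            if (int(flag / 4) % 2 == 0) mapped++      # bit 0x4 unset: mapped
        }
        END {
            if (total > 0) printf "%.1f\n", 100 * mapped / total
            else print "0.0"
        }
    ' "$1"
}

# Example usage from a terminal:
#   bowtie2-build rRNA.fasta rRNAindex
#   align_to_rrna rRNAindex SRR6371142_1.fastq SRR6371142_2.fastq rRNA_check.sam
#   percent_mapped rRNA_check.sam
```

A high percentage would suggest the library still contains a lot of rRNA; what counts as "high" depends on the library preparation, so compare against what the paper reports.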

You should not use bowtie2 to align eukaryotic RNAseq reads to a reference genome, as bowtie2 is not splice-aware. Use STAR or HISAT2 for this purpose.

Also, I hope you noticed you are mixing command-line and R commands.

written 3 months ago by h.mon
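As a rough illustration of the splice-aware alternative mentioned above, the alignment step could be redone with HISAT2 along these lines. This is a minimal sketch and not a tuned pipeline: the function names and the index prefix `Araindex_hisat2` are hypothetical, default HISAT2 settings are assumed, and the file names are taken from the question. These commands are run in a terminal, while featureCounts is then called separately from an R session.

```shell
#!/usr/bin/env bash
# Sketch only: splice-aware alignment with HISAT2 in place of bowtie2.
# The index prefix and function names are hypothetical choices.

# Build a HISAT2 index from the genome FASTA.
build_hisat2_index() {
    local genome_fasta=$1 index_prefix=$2
    hisat2-build "$genome_fasta" "$index_prefix"
}

# Align paired-end reads and write the alignments to a SAM file.
align_spliced() {
    local index_prefix=$1 r1=$2 r2=$3 out_sam=$4
    hisat2 -x "$index_prefix" -1 "$r1" -2 "$r2" -S "$out_sam"
}

# Example usage from a terminal (not from within R):
#   build_hisat2_index TAIR10_chr_all.fas Araindex_hisat2
#   align_spliced Araindex_hisat2 SRR6371142_1.fastq SRR6371142_2.fastq SRR6371142.SAM
```

The resulting SAM file can be passed to featureCounts in R exactly as in the question.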