Can I use a *.bam file generated by TopHat as input for coverageBed to count RNA-seq reads?

10.4 years ago

whuanfan • 0

The data are from this experiment: http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-822/
I downloaded a *.bam file for this experiment from ENA: ftp://ftp.sra.ebi.ac.uk/vol1/ERA066/ERA066395/bam/MCF7_E2-12h_tophat.bam

Judging by the file name, the reads were mapped with TopHat.

Can I use this TopHat BAM file as input for coverageBed to count the reads?
If yes, which GTF/GFF file should I use?

Separately, I also tried the Cufflinks pipeline.

The input:
the same TopHat BAM file, ftp://ftp.sra.ebi.ac.uk/vol1/ERA066/ERA066395/bam/MCF7_E2-12h_tophat.bam
the control BAM file
the GTF file from
ftp://ftp.ensembl.org/pub/release-74/gtf/homo_sapiens/Homo_sapiens.GRCh37.74.gtf.gz

I ran cufflinks, cuffmerge, and cuffdiff, but the result shows no significantly differentially expressed genes. What did I do wrong, and how can I improve the analysis?

Thank you very much!

rna-seq tophat • 2.7k views

Regardless of which genome those BAM files were aligned to, don't use coverageBed to get per-gene counts. If a read overlaps more than one feature, coverageBed increments the counter for each of them, which is the wrong way to count reads in RNA-seq. Use featureCounts (from the Subread package) or htseq-count instead.
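
To make the double-counting problem concrete, here is a small toy sketch in Python. It is not the actual implementation of coverageBed, featureCounts, or htseq-count; the gene names, coordinates, and reads are all made up for illustration, and the function names are hypothetical. It only contrasts "count every overlap" with "count a read only when it overlaps exactly one gene", which is roughly how the dedicated counters treat ambiguous reads by default.

```python
from collections import Counter

# Toy gene intervals (hypothetical coordinates, half-open [start, end)).
# geneA and geneB overlap between positions 180 and 200.
genes = {"geneA": (100, 200), "geneB": (180, 300)}

# Toy reads as (start, end) intervals on the same chromosome (made up).
reads = [(110, 160), (185, 195), (250, 290), (400, 450)]


def overlapping_genes(read, genes):
    """Return the names of all genes whose interval overlaps the read."""
    r_start, r_end = read
    return [name for name, (g_start, g_end) in genes.items()
            if r_start < g_end and g_start < r_end]


def count_every_overlap(reads, genes):
    """Overlap-style counting: a read adds one to every gene it overlaps."""
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        for name in overlapping_genes(read, genes):
            counts[name] += 1
    return counts


def count_unambiguous(reads, genes):
    """Count a read only if it overlaps exactly one gene; tally the rest."""
    counts = Counter({name: 0 for name in genes})
    ambiguous = 0
    for read in reads:
        hits = overlapping_genes(read, genes)
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif len(hits) > 1:
            ambiguous += 1
    return counts, ambiguous


if __name__ == "__main__":
    print(dict(count_every_overlap(reads, genes)))
    unique_counts, n_ambiguous = count_unambiguous(reads, genes)
    print(dict(unique_counts), "ambiguous:", n_ambiguous)
```

In this toy data the read at 185-195 falls in the region shared by both genes, so the first scheme counts it twice (once per gene), while the second sets it aside as ambiguous. The inflated totals from the first scheme are what make it a poor fit for differential expression testing.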