Question: featureCounts reports zero counts when using GTF Repeatmasker track from UCSC and Bowtie2 SAM mapped RNA-seq reads
3 months ago by
Gema Sanz30
Karolinska Institutet
Gema Sanz30 wrote:

Hi,

I want to calculate the read counts mapping to repeats in my RNA-seq data. I mapped the reads with Bowtie2 against an index built from the UCSC RepeatMasker track sequences (FASTA format). Next, I need the read counts per repeat type, so I tried the featureCounts tool (on a Galaxy server), but the output gives me zero counts for all repeat types.

Any ideas? I thought maybe some IDs or columns are not matching but I don't know.

Thank you very much in advance, Gema

rna-seq software error • 198 views
modified 12 weeks ago • written 3 months ago by Gema Sanz30

featureCounts is probably discarding multi-mapped reads, hence the zero counts for repeats. You have to keep only one copy of each repeat on the genome, or count with RSEM or Salmon, which use expectation-maximization to estimate how to distribute the counts of multi-mapped reads.

modified 3 months ago • written 3 months ago by h.mon9.8k

Thanks for your answer, h.mon! In Bowtie2 I used the default mode (search for multiple alignments, report the best one). I'm not sure what you mean by "keep only one copy of the repeat": the reads are mapped directly to the repeat indices, not to the human genome. I think there is an option in featureCounts to keep the multi-mapped reads; I will check.
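Since the reads are aligned directly to the repeat sequences, one possible cross-check is to tally mapped reads per reference name straight from the SAM file, without any annotation. The following is only a minimal sketch under assumptions: the function name `count_reads_per_reference` and the example records are made up for illustration, and it assumes the SAM reference names identify the repeat type.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count primary mapped alignments per reference name (RNAME column)."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...)
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname = fields[2]
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary
        if flag & 0x4 or flag & 0x100 or flag & 0x800 or rname == "*":
            continue
        counts[rname] += 1
    return counts


# Hypothetical records, NOT taken from the real data
example = [
    "@SQ\tSN:hg19_rmsk_AluSp\tLN:310",
    "r1\t0\thg19_rmsk_AluSp\t5\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\thg19_rmsk_AluSp\t9\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t0\thg19_rmsk_AluY\t1\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = count_reads_per_reference(example)
for name, n in sorted(counts.items()):
    print(name, n)
```

For a real file, the lines could be read with `open("example.sam")` and passed to the same function.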

written 12 weeks ago by Gema Sanz30
12 weeks ago by
Gema Sanz30
Karolinska Institutet
Gema Sanz30 wrote:

I tried the -M option in featureCounts (count multi-mapped reads), but I'm still getting zeros everywhere:

Geneid        example.sam
AluSp         0
AluY          0
L2b           0
L1PA10        0
L1PA2         0
L1MB7         0
ERVL-E-int    0

The RepeatMasker GTF file looks like this:

Seqname Source Feature Start End Score Strand Frame Attributes
chr1 hg19_rmsk exon 16777161 16777470 2147.000000 + . gene_id "AluSp"; transcript_id "AluSp";

And the SAM file:

QNAME FLAG RNAME POS MAPQ CIGAR MRNM MPOS ISIZE SEQ QUAL OPT
@HD VN:1.0 SO:unsorted
@SQ SN:hg19_rmsk_AluSp LN:310
@SQ SN:hg19_rmsk_AluY LN:289
@SQ SN:hg19_rmsk_L2b LN:1040
@SQ SN:hg19_rmsk_L1PA10 LN:2090

My feeling is that something is not matching but I don't know what...
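One thing visible in the two excerpts above is that the GTF's first column is a chromosome name (chr1), while the SAM reference names are repeat names (hg19_rmsk_AluSp, and so on). featureCounts only assigns a read to a feature on the same reference sequence, so a quick check of whether the two files share any sequence names may be worthwhile. The sketch below is illustrative only: the helper names `sam_reference_names` and `gtf_seqnames` are hypothetical, the input lines are copied from the excerpts above, and tab-separated fields are assumed.

```python
def sam_reference_names(sam_lines):
    """Collect the SN: values from @SQ header lines of a SAM file."""
    names = set()
    for line in sam_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_seqnames(gtf_lines):
    """Collect the first column (seqname) of every GTF record."""
    names = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


# Lines taken from the excerpts in this post (tabs assumed as separators)
sam_header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:hg19_rmsk_AluSp\tLN:310",
    "@SQ\tSN:hg19_rmsk_AluY\tLN:289",
]
gtf = [
    'chr1\thg19_rmsk\texon\t16777161\t16777470\t2147.000000\t+\t.\t'
    'gene_id "AluSp"; transcript_id "AluSp";',
]

shared = sam_reference_names(sam_header) & gtf_seqnames(gtf)
print("Shared sequence names:", shared)
```

If the set of shared names comes out empty, as it does for these excerpts, no read can overlap any annotated feature, which would be consistent with zero counts regardless of the -M setting.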

written 12 weeks ago by Gema Sanz30