Question

why the featurecounts result is empty

0

Entering edit mode

7.6 years ago

zizigolu ★ 4.3k

hi,

I want to count the miRNA reads using the sorted .bam file containing only mapped reads (all generated by samtools from Bowtie output --SAM file) by featurecounts for Aspergillus fumigatus but the number of reads for all of the gene_ids are zero, what happened? :(

this is my command

featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam

I downloaded genome fasta and GTF from esembl

RNA-Seq featurecounts • 3.1k views

ADD COMMENT • link 7.6 years ago by zizigolu ★ 4.3k

0

Entering edit mode

What is printed on the screen ? It prints the log information on terminal.

Whats the output of following commands ?

head -5 annotation.gtf
samtools view mapping_results_SE.bam | head -5

'exon' and 'gene_id' are default parameters anyway.

ADD REPLY • link 7.6 years ago by GouthamAtla 12k

0

Entering edit mode

#!genome-build AspFumNiv1_0
#!genome-version AspFumNiv1_0
#!genome-date 2015-10
#!genome-build-accession GCA_000731615.1
#!genebuild-last-updated 2015-10
[izadi@lbox146 bin]$

ADD REPLY • link 7.6 years ago by zizigolu ★ 4.3k

0

Entering edit mode

first I removed unmapped reads by

samtools view -S -h -F 4 -q 42

then converted to bam

samtools view -b -S

then sorting bam

samtools sort

ADD REPLY • link updated 7.6 years ago by GouthamAtla 12k • written 7.6 years ago by zizigolu ★ 4.3k

0

Entering edit mode

It would be useful if you provide the information asked for.

ADD REPLY • link 7.6 years ago by GouthamAtla 12k

0

Entering edit mode

sorry Goutham I am with remote access then took long time to provide commands

[izadi@lbox146 new]$ head -5 Aspergillus_fumigatus.CADRE.32.gtf

I   JCVI    gene    216 836 .   +   .   gene_id "CADAFUAG00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding";
I   JCVI    transcript  216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
I   JCVI    exon    216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; exon_id "CADAFUAE00025367";
I   JCVI    CDS 216 833 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; protein_id "CADAFUAP00008664";
I   JCVI    start_codon 216 218 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
[izadi@lbox146 new]$ 



[izadi@lbox146 samtools-1.3.1]$ samtools view high-quality1-sorted | head -5

HISEQ2500:468:C8JLLACXX:4:1306:12340:99476  0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHFHHHJJIGHHJJJJIGIGJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2214:2219:40231   0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHHHHHJJJGHIJJJJJJJJJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2112:13311:58196  0   I   954 42  17M *   0   0   CTCGGACCTTTCTCGCC   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2110:8568:78842   0   I   25284   42  17M *   0   0   AGATACGGAAAGGCATG   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:1108:15749:19044  0   I   25328   42  21M *   0   0   CCTGCGGTGTAGTTGTGGCCC   CCCFFFFDHHHHHJJJJJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:21 YT:Z:UU

[izadi@lbox146 samtools-1.3.1]$

ADD REPLY • link updated 7.6 years ago by Devon Ryan 104k • written 7.6 years ago by zizigolu ★ 4.3k

1

Entering edit mode

Hmm, those should be getting counted. Hopefully the -R option will clarify things.

ADD REPLY • link 7.6 years ago by Devon Ryan 104k

0

Entering edit mode

thank you,

you mean I should -R in the same screen I am running featurecounts?

ADD REPLY • link 7.6 years ago by zizigolu ★ 4.3k

1

Entering edit mode

You should use that option in featureCounts.

ADD REPLY • link 7.6 years ago by Devon Ryan 104k

0

Entering edit mode

You might try the -R option, the resulting file will note for each alignment/pair why they weren't included. Note that this will be a really big text file, so maybe eyeball some alignments that you think should be getting counted and make a BAM file containing only them.

ADD REPLY • link 7.6 years ago by Devon Ryan 104k

0

Entering edit mode

sorry in ensembl there are 4 Aspergillus fumigatus, I only received linke for downloading fastq files, might be the errors is because of outmatching between fastq files and genome fasta???

ADD REPLY • link 7.6 years ago by zizigolu ★ 4.3k

1

Entering edit mode

No, fastq files don't have associated genomes.

ADD REPLY • link 7.6 years ago by Devon Ryan 104k

score 1 · Accepted Answer · 2016-09-01

1

Entering edit mode

7.6 years ago

zizigolu ★ 4.3k

I think problem was from messing between 4 genome fasta and GTF from 4 Aspergillus fumigatus in ensembl. I should use coordinated genome fasta and GTF files

ADD COMMENT • link 7.6 years ago by zizigolu ★ 4.3k