why the featurecounts result is empty
1
0
Entering edit mode
7.6 years ago
zizigolu ★ 4.3k

hi,

I want to count the miRNA reads using the sorted .bam file containing only mapped reads (all generated by samtools from Bowtie output --SAM file) by featurecounts for Aspergillus fumigatus but the number of reads for all of the gene_ids are zero, what happened? :(

this is my command

featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam

I downloaded genome fasta and GTF from esembl

RNA-Seq featurecounts • 3.1k views
ADD COMMENT
0
Entering edit mode

What is printed on the screen ? It prints the log information on terminal.

Whats the output of following commands ?

head -5 annotation.gtf
samtools view mapping_results_SE.bam | head -5

'exon' and 'gene_id' are default parameters anyway.

ADD REPLY
0
Entering edit mode
#!genome-build AspFumNiv1_0
#!genome-version AspFumNiv1_0
#!genome-date 2015-10
#!genome-build-accession GCA_000731615.1
#!genebuild-last-updated 2015-10
[izadi@lbox146 bin]$
ADD REPLY
0
Entering edit mode

first I removed unmapped reads by

samtools view -S -h -F 4 -q 42

then converted to bam

samtools view -b -S

then sorting bam

samtools sort

ADD REPLY
0
Entering edit mode

It would be useful if you provide the information asked for.

ADD REPLY
0
Entering edit mode

sorry Goutham I am with remote access then took long time to provide commands

[izadi@lbox146 new]$ head -5 Aspergillus_fumigatus.CADRE.32.gtf

I   JCVI    gene    216 836 .   +   .   gene_id "CADAFUAG00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding";
I   JCVI    transcript  216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
I   JCVI    exon    216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; exon_id "CADAFUAE00025367";
I   JCVI    CDS 216 833 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; protein_id "CADAFUAP00008664";
I   JCVI    start_codon 216 218 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
[izadi@lbox146 new]$ 



[izadi@lbox146 samtools-1.3.1]$ samtools view high-quality1-sorted | head -5

HISEQ2500:468:C8JLLACXX:4:1306:12340:99476  0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHFHHHJJIGHHJJJJIGIGJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2214:2219:40231   0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHHHHHJJJGHIJJJJJJJJJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2112:13311:58196  0   I   954 42  17M *   0   0   CTCGGACCTTTCTCGCC   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2110:8568:78842   0   I   25284   42  17M *   0   0   AGATACGGAAAGGCATG   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:1108:15749:19044  0   I   25328   42  21M *   0   0   CCTGCGGTGTAGTTGTGGCCC   CCCFFFFDHHHHHJJJJJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:21 YT:Z:UU

[izadi@lbox146 samtools-1.3.1]$
ADD REPLY
1
Entering edit mode

Hmm, those should be getting counted. Hopefully the -R option will clarify things.

ADD REPLY
0
Entering edit mode

thank you,

you mean I should -R in the same screen I am running featurecounts?

ADD REPLY
1
Entering edit mode

You should use that option in featureCounts.

ADD REPLY
0
Entering edit mode

You might try the -R option, the resulting file will note for each alignment/pair why they weren't included. Note that this will be a really big text file, so maybe eyeball some alignments that you think should be getting counted and make a BAM file containing only them.

ADD REPLY
0
Entering edit mode

sorry in ensembl there are 4 Aspergillus fumigatus, I only received linke for downloading fastq files, might be the errors is because of outmatching between fastq files and genome fasta???

ADD REPLY
1
Entering edit mode

No, fastq files don't have associated genomes.

ADD REPLY
1
Entering edit mode
7.6 years ago
zizigolu ★ 4.3k

I think problem was from messing between 4 genome fasta and GTF from 4 Aspergillus fumigatus in ensembl. I should use coordinated genome fasta and GTF files

ADD COMMENT

Login before adding your answer.

Traffic: 1964 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6