Question: why the featurecounts result is empty
0
gravatar for A
3.4 years ago by
A3.7k
A3.7k wrote:

hi,

I want to count the miRNA reads using the sorted .bam file containing only mapped reads (all generated by samtools from Bowtie output --SAM file) by featurecounts for Aspergillus fumigatus but the number of reads for all of the gene_ids are zero, what happened? :(

this is my command

featureCounts -t exon -g gene_id -a annotation.gtf -o counts.txt mapping_results_SE.bam

I downloaded genome fasta and GTF from esembl

rna-seq featurecounts • 1.3k views
ADD COMMENTlink modified 3.4 years ago • written 3.4 years ago by A3.7k

What is printed on the screen ? It prints the log information on terminal.

Whats the output of following commands ?

head -5 annotation.gtf
samtools view mapping_results_SE.bam | head -5

'exon' and 'gene_id' are default parameters anyway.

ADD REPLYlink written 3.4 years ago by geek_y10k
#!genome-build AspFumNiv1_0
#!genome-version AspFumNiv1_0
#!genome-date 2015-10
#!genome-build-accession GCA_000731615.1
#!genebuild-last-updated 2015-10
[izadi@lbox146 bin]$
ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by A3.7k

first I removed unmapped reads by

samtools view -S -h -F 4 -q 42

then converted to bam

samtools view -b -S

then sorting bam

samtools sort

ADD REPLYlink modified 3.4 years ago by geek_y10k • written 3.4 years ago by A3.7k

It would be useful if you provide the information asked for.

ADD REPLYlink written 3.4 years ago by geek_y10k

sorry Goutham I am with remote access then took long time to provide commands

[izadi@lbox146 new]$ head -5 Aspergillus_fumigatus.CADRE.32.gtf

I   JCVI    gene    216 836 .   +   .   gene_id "CADAFUAG00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding";
I   JCVI    transcript  216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
I   JCVI    exon    216 836 .   +   .   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; exon_id "CADAFUAE00025367";
I   JCVI    CDS 216 833 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding"; protein_id "CADAFUAP00008664";
I   JCVI    start_codon 216 218 .   +   0   gene_id "CADAFUAG00008664"; transcript_id "CADAFUAT00008664"; exon_number "1"; gene_name "AFUA_1G00100"; gene_source "JCVI"; gene_biotype "protein_coding"; transcript_name "AFUA_1G00100"; transcript_source "JCVI"; transcript_biotype "protein_coding";
[izadi@lbox146 new]$ 



[izadi@lbox146 samtools-1.3.1]$ samtools view high-quality1-sorted | head -5

HISEQ2500:468:C8JLLACXX:4:1306:12340:99476  0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHFHHHJJIGHHJJJJIGIGJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2214:2219:40231   0   I   421 42  28M *   0   0   CCACCTCAGTCCCAGGTCGGATTGTTCC    CCCFFFFFHHHHHJJJGHIJJJJJJJJJ    AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:28 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2112:13311:58196  0   I   954 42  17M *   0   0   CTCGGACCTTTCTCGCC   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:2110:8568:78842   0   I   25284   42  17M *   0   0   AGATACGGAAAGGCATG   CCCFFFFFHHHHHJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:17 YT:Z:UU
HISEQ2500:468:C8JLLACXX:4:1108:15749:19044  0   I   25328   42  21M *   0   0   CCTGCGGTGTAGTTGTGGCCC   CCCFFFFDHHHHHJJJJJJJJ   AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:21 YT:Z:UU

[izadi@lbox146 samtools-1.3.1]$
ADD REPLYlink modified 3.4 years ago by Devon Ryan93k • written 3.4 years ago by A3.7k
1

Hmm, those should be getting counted. Hopefully the -R option will clarify things.

ADD REPLYlink written 3.4 years ago by Devon Ryan93k

thank you,

you mean I should -R in the same screen I am running featurecounts?

ADD REPLYlink written 3.4 years ago by A3.7k
1

You should use that option in featureCounts.

ADD REPLYlink written 3.4 years ago by Devon Ryan93k

You might try the -R option, the resulting file will note for each alignment/pair why they weren't included. Note that this will be a really big text file, so maybe eyeball some alignments that you think should be getting counted and make a BAM file containing only them.

ADD REPLYlink written 3.4 years ago by Devon Ryan93k

sorry in ensembl there are 4 Aspergillus fumigatus, I only received linke for downloading fastq files, might be the errors is because of outmatching between fastq files and genome fasta???

ADD REPLYlink modified 3.4 years ago • written 3.4 years ago by A3.7k
1

No, fastq files don't have associated genomes.

ADD REPLYlink written 3.4 years ago by Devon Ryan93k
1
gravatar for A
3.4 years ago by
A3.7k
A3.7k wrote:

I think problem was from messing between 4 genome fasta and GTF from 4 Aspergillus fumigatus in ensembl. I should use coordinated genome fasta and GTF files

ADD COMMENTlink written 3.4 years ago by A3.7k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 794 users visited in the last hour