Question

Subread featureCounts produces zero percent successful alignment

0

2.3 years ago

mathavanbioinfo ▴ 80

Dear all, I am working on a rice genome transcriptome analysis. I aligned the reads with HISAT2 and got an alignment rate greater than 80%. I then generated the count matrix using the Subread package.

Subread command

/apps/subread-1.6.2-source/bin/featureCounts -p -B -a all.gtf -o counts  *bam

http://rice.uga.edu/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_7.0/all.dir/

Reference fasta file

>LOC_Os01g01010 genomic|TBC domain containing protein, expressed
AGATGAGCTGGTGGGGATGCTCTAAGAGAACGAGAGAAGCACAGAGCAGATAAACCACAC
CCACAGGCACCACCGTCCTTGTTGGTAATGAAGAAGACGAGACGACGACTTCCCCACTAG
GAAACACGACGGAGGCGGAGATGATCGACGGCGGAGAGAGCTACAGAAACATCGATGCCT
CCTGTCCAATCCCCCCATCCCATTCGGTAGTTGGATTGAAGACTACCGAATAAGAGAAGC

GTF file all.gtf

Chr1    MSU_osa1r7  exon    2903    3268    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010"; gene_name "LOC_Os01g01010";
Chr1    MSU_osa1r7  exon    3354    3616    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010"; gene_name "LOC_Os01g01010";
Chr1    MSU_osa1r7  exon    4357    4455    .   +   .   transcript_id "LOC_Os01g01010.1"; gene_id "LOC_Os01g01010";gene_name "LOC_Os01g01010";

zero subread featureCounts alignment • 741 views

updated 2.3 years ago by Carlo Yague 8.6k • written 2.3 years ago by mathavanbioinfo ▴ 80

score 1 · Answer 1 · 2022-01-04

1

2.3 years ago

Istvan Albert 100k

As you can see in your files, the FASTA file does not match the GTF file.

Note how your reference file names the sequences >LOC_Os01g01010, whereas the feature file designates them as Chr1.

Make sure to use a genomic file, not a transcriptome file as your alignment target.
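
A quick way to check for this kind of mismatch before counting is to compare the sequence names in the two files. The following is only a minimal sketch: the function name shared_seqnames is made up for illustration, and it assumes a plain-text (uncompressed) FASTA and GTF. It prints the names found in both files; empty output means featureCounts has nothing to match reads against.

```shell
# Hypothetical helper (not part of Subread): print the sequence names
# that occur in both a FASTA file and a GTF file.
shared_seqnames() {
    fasta="$1"
    gtf="$2"
    tmpdir=$(mktemp -d)
    # FASTA: header lines start with '>'; the name ends at the first whitespace.
    awk '/^>/ {sub(/^>/, ""); print $1}' "$fasta" | LC_ALL=C sort -u > "$tmpdir/fasta_names"
    # GTF: the first column is the sequence name; skip '#' comment lines.
    awk '!/^#/ {print $1}' "$gtf" | LC_ALL=C sort -u > "$tmpdir/gtf_names"
    # comm -12 keeps only the lines present in both sorted lists.
    LC_ALL=C comm -12 "$tmpdir/fasta_names" "$tmpdir/gtf_names"
    rm -rf "$tmpdir"
}
```

Call it as `shared_seqnames reference.fa all.gtf`, where reference.fa is a placeholder for whatever FASTA the HISAT2 index was built from. With the files shown in the question it would print nothing, because LOC_Os01g01010 and Chr1 never coincide.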
