Zero counts for all genes in RNA-seq data from ferret
5 months ago
Sara ▴ 240

I have bulk RNA-seq data from ferret and am trying to get counts per gene. To do so, I aligned the reads with HISAT2, using the genome from here: https://hgdownload.soe.ucsc.edu/goldenPath/musFur1/bigZips/musFur1.2bit

After aligning the FASTQ files, I ran htseq-count with the following command:

htseq-count \
  -f bam \
  -r pos \
  -a 27 \
  -m intersection-strict \
  --stranded=no \
  -i gene_id \
  test_R1_001.fastq.sam.bam Mustela_putorius_furo.MusPutFur1.0.110.gtf > test_R1_001.fastq.sam.bam_sorted.bam.txt

with the GTF file from here: https://ftp.ensembl.org/pub/release-110/gtf/mustela_putorius_furo/Mustela_putorius_furo.MusPutFur1.0.110.gtf.gz

But all the counts are 0. Do you know what the problem could be?

RNA-seq DGE • 369 views
5 months ago
GenoMax 141k

> got the genome from here: https://hgdownload.soe.ucsc.edu/goldenPath/musFur1/bigZips/musFur1.2bit

Just to be certain: you did convert the 2bit file to FASTA using twoBitToFa and then build a HISAT2 index from that FASTA? Matching GTF files from UCSC should be in https://hgdownload.soe.ucsc.edu/goldenPath/musFur1/bigZips/genes/
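
For reference, the conversion and indexing steps could look roughly like the sketch below. It assumes `twoBitToFa` (from the UCSC utilities) and `hisat2-build` are on your PATH. The `run` helper, the `DRY_RUN` variable, and the function name are hypothetical conveniences added here so the plan can be printed before anything is executed.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Convert a UCSC .2bit genome to FASTA, then build a HISAT2 index from it.
# $1: path to the .2bit file
# $2: basename used for both the FASTA file and the index (an assumption)
build_index_from_2bit() {
  run twoBitToFa "$1" "$2.fa" &&
  run hisat2-build "$2.fa" "$2"
}
```

With `DRY_RUN=1`, calling `build_index_from_2bit musFur1.2bit musFur1` only prints the two commands; without it, the commands are executed and the resulting index basename (`musFur1`) is what you would pass to `hisat2 -x`.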

You are almost certainly going to run into trouble trying to mix and match sources of genome and annotation.

If you are getting the GTF from Ensembl, then you could get the genome from the same location. That way you are assured that the sequence identifiers will match between the genome and the GTF. Mismatched identifiers are one of the top causes of problems with counting. Use https://ftp.ensembl.org/pub/release-110/fasta/mustela_putorius_furo/dna/Mustela_putorius_furo.MusPutFur1.0.dna.toplevel.fa.gz
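
A quick way to check whether the identifiers match is to compare the sequence names in the alignment header with those in column 1 of the GTF. The sketch below uses only awk, sort, and comm; the function names are made up for illustration. It expects the header as plain text, which for a BAM file you could produce with `samtools view -H`.

```shell
# Print the sequence names declared in a SAM header (SN: field of @SQ lines).
sam_seqnames() {
  awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1" |
    LC_ALL=C sort -u
}

# Print the sequence names used in column 1 of a GTF, skipping comment lines.
gtf_seqnames() {
  awk -F'\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | LC_ALL=C sort -u
}

# Print the names present in both files.
# $1: SAM header text file; $2: GTF file.
shared_seqnames() {
  tmp_sam=$(mktemp) && tmp_gtf=$(mktemp) || return 1
  sam_seqnames "$1" > "$tmp_sam"
  gtf_seqnames "$2" > "$tmp_gtf"
  LC_ALL=C comm -12 "$tmp_sam" "$tmp_gtf"
  rm -f "$tmp_sam" "$tmp_gtf"
}
```

For example, after `samtools view -H test_R1_001.fastq.sam.bam > header.sam`, running `shared_seqnames header.sam Mustela_putorius_furo.MusPutFur1.0.110.gtf | wc -l` shows how many names overlap. If that number is 0, no read can be assigned to any feature, which would explain counts of zero for every gene.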
