Closed: How to build an index for zebrafish with a reference genome and SNP file using hisat2?
3.3 years ago
tara ▴ 30

Hello, I want to align RNA-seq data to the zebrafish reference genome with SNPs using hisat2. I followed the "How to" tutorial on the hisat2 homepage, but I cannot build an index. Perhaps the VCF file containing the SNPs is not compatible with the reference genome. Unfortunately, there are no downloads for zebrafish on the hisat2 homepage.

What I did:

1) download, unzip and rename reference genome
wget ftp://ftp.ensembl.org/pub/release-102/fasta/danio_rerio/dna/Danio_rerio.GRCz11.dna.primary_assembly.fa.gz
gzip -d Danio_rerio.GRCz11.dna.primary_assembly.fa.gz
mv Danio_rerio.GRCz11.dna.primary_assembly.fa genome.fa

2) download and unzip SNP file
wget ftp://ftp.ensembl.org/pub/release-102/variation/vcf/danio_rerio/danio_rerio.vcf.gz
gzip -d danio_rerio.vcf.gz

3) extract SNP to hisat2 format
hisat2_extract_snps_haplotypes_VCF.py genome.fa danio_rerio.vcf genome
-> there are a lot of errors like: "Error: the reference genome you provided seems to be incompatible with the VCF file at 654 of chromosome KZ116062.1 where C is in the reference genome while G is in the VCF file", but a file is generated

4) build HGFM index with SNPs
hisat2-build -p 16 --snp genome.snp --haplotype genome.haplotype genome.fa genome_snp
-> there are a lot of warnings and the program stops: "Warning: single type should have a different base than T (rs505251572) at 58622972 on 3
Time to read SNPs and splice sites: 00:00:16
Killed"
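
The incompatibility reported in step 3 could be checked directly by comparing the REF column of the VCF with the bases found in genome.fa. The following is only a rough sketch of such a check, not part of hisat2: the function name check_ref_alleles is made up for illustration, it assumes uncompressed plain-text files, and it holds one whole sequence in memory, so it is meant for a single scaffold such as KZ116062.1 rather than a full chromosome.

```shell
# Hypothetical helper (not a hisat2 tool): for one sequence, compare the REF
# allele of every VCF record with the bases at that position in the FASTA.
# Assumes uncompressed inputs and 1-based VCF positions.
check_ref_alleles() {
    fasta="$1"
    vcf="$2"
    contig="$3"
    awk -v contig="$contig" '
        # First file (FASTA): concatenate the sequence lines of the chosen contig
        NR == FNR {
            if (substr($0, 1, 1) == ">") {
                keep = (substr($1, 2) == contig)
            } else if (keep) {
                seq = seq $0
            }
            next
        }
        # Second file (VCF): skip header lines
        /^#/ { next }
        # Compare REF (column 4) with the FASTA bases starting at POS (column 2)
        $1 == contig {
            total++
            ref = toupper(substr(seq, $2, length($4)))
            if (ref != toupper($4)) {
                mismatch++
                print $1, $2, "fasta=" ref, "vcf=" $4
            }
        }
        END { printf "%d of %d records disagree\n", mismatch, total }
    ' "$fasta" "$vcf"
}
```

It would be called as `check_ref_alleles genome.fa danio_rerio.vcf KZ116062.1`. If only a few records disagree, the two files probably describe the same assembly and the mismatches are isolated; if most disagree, the VCF may refer to a different assembly or coordinate system.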

I also tried the SNP file from NCBI (https://ftp.ncbi.nlm.nih.gov/snp/organisms/archive/zebrafish_7955/VCF/00-All.vcf.gz), but that didn't work at all. Do you know where I can download the latest zebrafish reference genome with the corresponding SNP file or how I can build the index with the SNP file?
Thank you very much!
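
For a VCF from another source, such as the NCBI file mentioned above, one quick sanity check is whether its chromosome names occur in genome.fa at all. This is a minimal sketch under assumptions: the function name vcf_names_missing_from_fasta is hypothetical, both files are assumed to be uncompressed, and I do not know which naming scheme or assembly the NCBI file actually uses.

```shell
# Hypothetical helper (not a hisat2 tool): print sequence names that appear
# in the VCF but not in the FASTA. Assumes uncompressed, plain-text inputs.
vcf_names_missing_from_fasta() {
    fasta="$1"
    vcf="$2"
    fa_names=$(mktemp)
    vcf_names=$(mktemp)
    # FASTA names: first word of each header line, without the leading ">"
    awk '/^>/ { print substr($1, 2) }' "$fasta" | LC_ALL=C sort -u > "$fa_names"
    # VCF names: first column of every non-header line
    awk '!/^#/ { print $1 }' "$vcf" | LC_ALL=C sort -u > "$vcf_names"
    # Keep only the lines unique to the second list (the VCF names)
    LC_ALL=C comm -13 "$fa_names" "$vcf_names"
    rm -f "$fa_names" "$vcf_names"
}
```

Called as `vcf_names_missing_from_fasta genome.fa 00-All.vcf` (after unzipping), an empty result means every name in the VCF exists in the FASTA; any names printed would have to be renamed or removed before the SNP extraction step could use them.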

SNP hisat2 zebrafish reference genome • 371 views
This thread is not open. No new answers may be added