I want to load a single indexed FASTA file containing many virus genomes into IGV
8.2 years ago

I have created a single FASTA file containing the genomes of many viruses. I have indexed both the FASTA file and the BAM file. Now I want to load the indexed FASTA file into IGV, but when I do, IGV gives the error "Contig 'gi|168480155|ref|NC_010354.1|' already exists in fasta index." How can I fix this problem? Please help!

RNA-Seq • 2.5k views

If you are trying to load this as a custom genome, you should select the original FASTA file (not the index file). Is that what you are trying to do?


I want to view the indexed BAM files against the indexed pathogen (virus and bacteria genomes) FASTA files in IGV to inspect the alignments. While loading the indexed FASTA files I got this error: "Contig 'gi|168480155|ref|NC_010354.1|' already exists in fasta index."

8.2 years ago

What is the output of

grep "^>" in.fasta | sort | uniq -c |  sort -k1,1nr | head

It's most likely caused by duplicate FASTA headers, i.e. duplicate sequence names in the FASTA file.
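One caveat with the check above: samtools faidx names each record by the text between ">" and the first whitespace, so two headers that differ only in their description still collide in the .fai index even though the full header lines look unique. A minimal sketch of a check on that name field alone follows. The file toy.fasta and its contents are made up for illustration; point the pipeline at your own FASTA file instead.

```shell
# Toy input for illustration only (hypothetical names and sequences).
cat > toy.fasta <<'EOF'
>virusA first copy
ACGT
>virusB
GGCC
>virusA second copy
ACGT
EOF

# Keep only the sequence name (text after '>' up to the first whitespace),
# count occurrences, and list every name that appears more than once.
grep '^>' toy.fasta | sed 's/^>//' | awk '{print $1}' | sort | uniq -c \
  | awk '$1 > 1 {print $2, $1}' > dup_names.txt

cat dup_names.txt
```

Each line of dup_names.txt is a duplicated name followed by how many times it occurs. An empty file means the names are unique as far as the index is concerned.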


Thanks Goutham Atla, yeah, perhaps that could be the reason. How can I fix it?


If there are duplicate entries in the FASTA file, you can remove them with commands like these:

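# samtools names each sequence by the first word of its header, so keep only that word
grep "^>" in.fasta | sed 's/^>//' | awk '{print $1}' | sort -u > uniq_ids.txt
# extract one record per unique name, then index the de-duplicated file
samtools faidx in.fasta $(cat uniq_ids.txt) > new.fasta
samtools faidx new.fasta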

But if the headers are duplicated while the sequences differ, it's a bit tricky, since you have already aligned the data.
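To find out which of the two cases applies, one option is to compare the sequences behind each repeated name. The following is only a sketch under assumptions: toy.fasta and its records are invented for illustration, names are taken as the first word of each header, and sequences are compared as exact strings (so a difference in upper/lower case counts as a difference).

```shell
# Toy input for illustration only (hypothetical names and sequences).
cat > toy.fasta <<'EOF'
>virusA first copy
ACGT
ACGT
>virusB
GGCC
>virusA second copy
ACGTACGT
>virusB other strain
GGCA
EOF

# 1. Flatten each record to "name<TAB>sequence" (sequence lines joined).
# 2. Drop exact duplicate (name, sequence) pairs.
# 3. Any name still present more than once has copies with different sequences.
awk '
  /^>/ { if (name != "") print name "\t" seq; name = substr($1, 2); seq = ""; next }
       { seq = seq $0 }
  END  { if (name != "") print name "\t" seq }
' toy.fasta | LC_ALL=C sort -u | cut -f1 | LC_ALL=C sort | uniq -d > conflicting_names.txt

cat conflicting_names.txt
```

Names listed in conflicting_names.txt are the tricky ones: the same name refers to different sequences, so simply dropping one copy may discard the sequence the reads were aligned to. Duplicated names that do not appear in the list have identical copies and can be de-duplicated safely.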
