Question: GL000201, GL000202, etc. What are these?
2.6 years ago by
morovatunc380
Turkey
morovatunc380 wrote:

Hi,

I have two BAM files from ICGC. During mutation calling, VarDict/MuTect/FreeBayes/VarScan split the BAM by chromosome. The GL00... entries show up alongside chr1, chr2, chr3 and so on. I checked what they are and found that they are contigs (forgive my ignorance). How should I treat them? Will they cause problems when interpreting the variant calling output? I read a previous thread about this, but it had no information on how to treat them. I need your help.

Best,

Tunc.

bam icgc wgs • 950 views
2.6 years ago by
igor7.1k
United States
igor7.1k wrote:

You can check http://genome.ucsc.edu/cgi-bin/hgGateway where they provide details for every assembly.

This was addressed on the UCSC genome support forum: http://redmine.soe.ucsc.edu/forum/index.php?t=msg&goto=8701&S=5061c99adf3f4c4ee90f0ea362d838c7

chr*_random - are called "unlocalized" sequences. The chromosome is known, but not the location on the chromosome.

chrUn_* - are called "unplaced" sequences. They probably belong to the sequenced genome, but placement is unknown at this time.

The GL numbers (or other types of numbers) in these names are the GenBank identification numbers, which can be used in a nucleotide search at Entrez. For example:

chr1_GL456211_random - unlocalized sequence belonging to chr1, NCBI identification: GL456211

chrUn_JH584304 - unplaced sequence, NCBI identification: JH584304
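The naming rule quoted above can be sketched in a few lines of Python. This is only an illustration, assuming UCSC-style names of the form chr<N>_<accession>_random and chrUn_<accession>; the function names classify_contig and accession are made up for this example. Bare names such as GL000201 (as in the question) carry no prefix or suffix, so this sketch reports them as "other" rather than guessing.

```python
import re

# Assumed UCSC-style patterns, taken from the examples quoted above.
_UNPLACED = re.compile(r"^chrUn_.+$")
_UNLOCALIZED = re.compile(r"^chr[0-9A-Za-z]+_.+_random$")


def classify_contig(name):
    """Return 'unplaced', 'unlocalized', or 'other' for a sequence name."""
    if _UNPLACED.match(name):
        return "unplaced"
    if _UNLOCALIZED.match(name):
        return "unlocalized"
    return "other"


def accession(name):
    """Return the accession part of an unplaced/unlocalized name, else None."""
    if classify_contig(name) == "other":
        return None
    # The accession is the second underscore-separated field in both schemes.
    return name.split("_")[1]


for contig in ("chr1_GL456211_random", "chrUn_JH584304", "chr1"):
    print(contig, classify_contig(contig), accession(contig))
```

The returned accession is the string you would paste into a nucleotide search, as the quoted answer describes.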


Thank you very much!

Reply written 2.6 years ago by morovatunc

Igor,

I have a question. Can I remove those contigs from my BAM file? Would that be feasible? Since their location is unknown, are they of any value for obtaining data?

Thank you for the information again,

Best,

Tunc.

Reply written 2.6 years ago by morovatunc

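If you remove them, you may lose multi-mapping reads, or reads that would have aligned to the regular chromosomes if the random ones had not been present.

If you really want to do that, you should be able to do something like samtools view -h file.bam | grep -v "_random" ... Note that this pattern only matches UCSC-style names; contigs named GL000201 and so on, as in your BAM, would need a different pattern.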
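A slightly stricter variant of that grep idea, as a rough Python sketch: instead of matching the pattern anywhere on the line, it looks only at the SN: tag of @SQ header lines and at the RNAME column (column 3) of alignment lines. The names filter_sam and is_unwanted_contig are hypothetical, and the set of prefixes and suffixes treated as unwanted is an assumption you would adapt to your own reference.

```python
def is_unwanted_contig(name):
    # Assumption: these patterns cover the contigs you want to drop.
    return (
        name.endswith("_random")
        or name.startswith("chrUn_")
        or name.startswith("GL")
    )


def filter_sam(lines, is_unwanted):
    """Yield SAM text lines, skipping unwanted @SQ entries and alignments.

    Caveat: mate fields (RNEXT) of the reads that are kept are not updated,
    so a kept read may still point at a removed contig.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                names = [f[3:] for f in fields[1:] if f.startswith("SN:")]
                if names and is_unwanted(names[0]):
                    continue
            yield line
        elif len(fields) >= 3 and is_unwanted(fields[2]):
            continue
        else:
            yield line
```

In a script, passing sys.stdin to filter_sam and writing the result to sys.stdout would let it sit in the same place as grep in the pipeline above. The caveat from the answer still applies: reads that aligned to the removed contigs are simply dropped, not realigned.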

Reply modified 2.5 years ago, written 2.5 years ago by igor