Sequence Reads Unmapped To Human Genome
3
1
12.6 years ago
Liyf ▴ 300

I did exome sequencing, and some of the reads do not map to the human genome. I have read papers in which such unmapped reads were mapped to virus genomes, and I wonder whether the same can be done with exome data. I know that much of the non-coding sequence will be missing, but I just want to see whether any virus genomes are present in the exome data.

exome map • 3.9k views
1

If you have the sequence of your virus, you can certainly map back to it and let us know about the results. Is there something that is stopping you from trying?

0

In fact, I am very busy with other research. I am not familiar with BWA; I have never even used it once. This idea came to me while I was reading some whole-genome sequencing papers. So if you all say it is a waste of time, I will not try. But since you all think it is worth trying, I will do it, and when the results are out I will tell you. Thanks. It may take a long time, because I am doing other things right now and the data are not completely ready.

2
12.6 years ago
seidel 11k

What fraction of reads are you talking about? How did you isolate the molecules you are sequencing? (i.e. was it exome capture with specific probes or was it an oligo-dT based method?). If you expect all of your reads from a human sample to map to the human genome you will never be happy because there are too many opportunities for DNA from other sources to be present in your samples. It can come from viruses, or other pathogens or non-human symbionts. Depending on how your cells were prepared it can come from other organisms that were in the media, or for instance if it was a tissue sample other organisms that may be associated with the tissue (care to guess how many critters are on the surface of your skin?). It can come from contaminant DNA that comes along with some of the enzymes used in library preparation. It can come from other molecules in the lab that your lab mates are studying (often people find sequence reads from genes studied in the lab popping up in their samples - contamination artifacts).

In most cases, however, these contaminants make up a small portion of the reads and can usually be ignored as par for the course. But if your goal is to detect which viruses are present in human samples, then why not add a virus mapping step to your alignment pipeline?
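As a rough illustration of the first part of such a virus mapping step, here is a minimal sketch that assumes the human alignments are available as SAM text. It collects the records whose FLAG has the 0x4 (unmapped) bit set and reports what fraction of reads they represent, which also answers the "what fraction" question above. The function names and the toy records are made up for illustration; the resulting FASTQ would then be aligned to the viral references with whatever mapper you use.

```python
def unmapped_reads(sam_lines):
    """Return (unmapped, fraction) from an iterable of SAM text lines.

    unmapped is a list of (name, sequence, quality) tuples for records whose
    FLAG has bit 0x4 set, which the SAM format defines as 'segment unmapped'.
    """
    total = 0
    unmapped = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary/supplementary records so each read counts once
        total += 1
        if flag & 0x4:
            unmapped.append((fields[0], fields[9], fields[10]))
    fraction = len(unmapped) / total if total else 0.0
    return unmapped, fraction


def to_fastq(records):
    """Format (name, sequence, quality) tuples as 4-line FASTQ records."""
    return "".join(f"@{name}\n{seq}\n+\n{qual}\n" for name, seq, qual in records)


# Toy input (hypothetical reads): one mapped record, one unmapped record.
sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tFFFF",
]
reads, fraction = unmapped_reads(sam)
print(f"{len(reads)} unmapped read(s), fraction {fraction:.2f}")
print(to_fastq(reads), end="")
```

For paired-end data both mates appear as separate records, so the fraction here is per record rather than per fragment.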

0

I am doing whole human exome sequencing. Because the disease is related to virus infection, I also want to map the reads to virus genomes. What is more, I am afraid that because exome data are not continuous, any virus genomes that are present will be split up and the reads cannot be mapped back to them. Thanks.

2
12.6 years ago
Pablo ★ 1.9k

How about adding the contaminant candidates (e.g. your virus) as 'additional chromosomes' to your reference genome and then mapping? I would also add 'usual contaminants' and phiX.

If the contaminant genome is circular, you may need to add 200 bp from the beginning of the sequence to the end (in order to be able to map reads at the end of the sequence).
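The padding trick can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 200 bp default is the figure suggested above, and the FASTA is handled as a plain string with sequences unwrapped onto one line.

```python
def pad_circular(sequence, pad=200):
    """Append the first `pad` bases of a circular sequence to its end."""
    return sequence + sequence[:pad]


def pad_fasta(fasta_text, pad=200):
    """Pad every record of a FASTA string; each sequence is written on one line."""
    out = []
    name, chunks = None, []
    # The trailing ">" is a sentinel that flushes the last record.
    for line in fasta_text.splitlines() + [">"]:
        if line.startswith(">"):
            if name is not None:
                out.append(f">{name}\n{pad_circular(''.join(chunks), pad)}\n")
            name, chunks = line[1:], []
        elif line.strip():
            chunks.append(line.strip())
    return "".join(out)


# Toy example with a 6-base "genome" and a 2-base pad.
print(pad_fasta(">toy_virus\nACGT\nAC\n", pad=2), end="")
```

Alignments that start past the original sequence length then correspond to the beginning of the genome, so coordinates in the padded region need to be wrapped back when interpreting the results.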

If you are using bwa to map, make sure that the total length (of the reference) is not over 4GB.
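A quick way to check this before indexing is to add up the sequence lengths of the combined reference. The sketch below is an assumption-laden illustration: it reads "4GB" as 4 * 1024**3 bases, which may not match the exact limit of your bwa version, and the function names are made up.

```python
# Assumption: the "4GB" above is taken as 4 * 1024**3 bases.
# Check the documentation of your bwa version for the real limit.
LIMIT = 4 * 1024**3


def fasta_lengths(fasta_text):
    """Return {sequence name: length} for a FASTA string."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""  # name is the first word of the header
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line.strip())
    return lengths


def fits_limit(lengths, limit=LIMIT):
    """True if the total reference length is not over the limit."""
    return sum(lengths.values()) <= limit


# Toy combined reference: one "chromosome" plus one contaminant.
lengths = fasta_lengths(">chr1 toy\nACGT\nAC\n>phiX\nGGG\n")
print(lengths, "total:", sum(lengths.values()), "ok:", fits_limit(lengths))
```

For a real genome you would stream the file line by line instead of holding it in a string, but the bookkeeping is the same.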

0

This is really helpful for me. Thanks very much.

0
12.6 years ago
ALchEmiXt ★ 1.9k

I am not sure exactly what the experiment is and how you did the mapping, but quite a lot of data can contain artefacts or contaminants, as indicated above. From a representative sample you can easily do a bowtie mapping to potential contaminants or to some viral sequences you may have around.

I also strongly suggest that you map against a so-called contaminants database of sequences (commonly used adapters and such), since sequence data are often full of them (if you are unlucky).

You can have a look at these tools to get the idea: fastq_screen (which basically uses bowtie) and FastQC.
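To give a feel for what a contaminant screen reports, here is a deliberately naive sketch. It is not how the tools above work: it uses exact substring matching instead of alignment and assumes plain 4-line FASTQ records. The function name is made up, and the adapter shown is only an example entry for a contaminant list.

```python
def screen_reads(fastq_text, contaminants):
    """Count reads containing each contaminant sequence (exact substring match).

    Assumes 4-line FASTQ records; returns (number of reads, {name: hit count}).
    """
    lines = fastq_text.splitlines()
    reads = lines[1::4]  # the sequence is the 2nd line of each 4-line record
    hits = {name: 0 for name in contaminants}
    for seq in reads:
        for name, contaminant in contaminants.items():
            if contaminant in seq:
                hits[name] += 1
    return len(reads), hits


# Example contaminant list (illustrative only) and two toy reads.
contaminants = {"illumina_adapter": "AGATCGGAAGAGC"}
fastq = (
    "@r1\nTTTTAGATCGGAAGAGCAAAA\n+\nIIIIIIIIIIIIIIIIIIIII\n"
    "@r2\nACGTACGTACGTACGTACGTA\n+\nIIIIIIIIIIIIIIIIIIIII\n"
)
n_reads, hits = screen_reads(fastq, contaminants)
for name, count in hits.items():
    print(f"{name}: {count}/{n_reads} reads")
```

A real screen would tolerate mismatches and partial adapter matches at read ends, which is why the aligner-based tools are the better choice for actual data.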

0

Thanks. I will.

