Novel Sequence Insertions 1000 Genomes Project
2
1
10.8 years ago
Ahdf-Lell-Kocks ★ 1.6k

I am looking for novel sequence insertions identified in the 1000 genomes project, and I found 3 files in this directory:

ftp://ftp.ebi.ac.uk/pub/databases/dgva/estd59_Durbin_et_al_2010/gvf/estd59_Durbin_2010_highquality_novel_sequence_insertion_pilot2.gvf
ftp://ftp.ebi.ac.uk/pub/databases/dgva/estd59_Durbin_et_al_2010/gvf/estd59_Durbin_2010_highquality_mobile_element_insertion_pilot1.gvf
ftp://ftp.ebi.ac.uk/pub/databases/dgva/estd59_Durbin_et_al_2010/gvf/estd59_Durbin_2010_highquality_mobile_element_insertion_pilot2.gvf


It seems that, for non-mobile-element insertions, there are only about 400 novel sequence insertions. Is there anywhere else I can find more?
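To double-check a count like this, the feature types in a GVF file can be tallied directly. The following is only a minimal sketch, assuming the files follow the usual GVF layout (GFF3-style rows with nine tab-separated columns, the feature type in the third column, and pragma/comment lines starting with `#`). The function name `count_gvf_types` and the sample records are hypothetical and are not taken from the DGVa files.

```python
from collections import Counter


def count_gvf_types(lines):
    """Tally the feature type (column 3) of each record in GVF-formatted lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and pragma/comment lines, which start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: valid records have nine tab-separated columns.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Hypothetical records for illustration only.
sample = [
    "##gvf-version 1.06",
    "1\tDGVa\tnovel_sequence_insertion\t100\t100\t.\t.\t.\tID=example1",
    "1\tDGVa\tmobile_element_insertion\t200\t200\t.\t.\t.\tID=example2",
    "2\tDGVa\tnovel_sequence_insertion\t300\t300\t.\t.\t.\tID=example3",
]
print(count_gvf_types(sample))
```

For a downloaded file, the same function can be given an open file handle instead of the sample list, since it only iterates over lines.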

EDIT: for mobile elements, Casey Bergman's answer seems to be the best available. Still, of the 7830 entries in that table, only 3089 include a sequence for the prediction; the rest are blank.

genome indel • 2.8k views
2

I believe the SV people have a consensus about how to define "novel". Every paper I have read on "novel" sequences/insertions defines "novel" in essentially the same way.

1

Mobile element insertions are not novel.

1

Neither are segmental duplications or CNVs, for that matter. Virtually all new sequence comes from pre-existing sequences in the genome. I think "novel" here is shorthand for "not in the reference genome".

3
10.8 years ago

Look in Table S1 of Stewart et al (2011) A Comprehensive Map of Mobile Element Insertion Polymorphisms in Humans: http://www.plosgenetics.org/article/info%3Adoi/10.1371/journal.pgen.1002236

1

If so, then I would amend your question to clearly state that you are looking for non-mobile element sequences.

0

@Casey Bergman: as far as I can see, Table S1 in the PLoS Genetics paper (http://www.plosgenetics.org/article/info%3Adoi/10.1371/journal.pgen.1002236) only contains Alu, L1 and SVA elements. I was hoping to find a place that lists the sequences of non-mobile-element insertions.

0
8.4 years ago

In the 1000G pilot paper (A map of human genome variation from population-scale sequencing. The 1000 Genomes Consortium. Nature 467, 1061-73 (2010)), we assembled 164 human genomes with Cortex and assembled novel sequence. The file is here:

ftp.1000genomes.ebi.ac.uk:/vol1/ftp/pilot_data/paper_data_sets/a_map_of_human_variation/low_coverage/sv/low_coverage.2010_10.novel_sequence


and the method is explained in the Supp Info.