Individual whole genome sequence data download FASTQ
1
0
9.6 years ago

Hey there,

I am looking for a site where I can download whole-genome sequence (WGS) data from individuals, to compare against the WGS data sets of patients that we sequenced in house.

We also sequenced one HapMap sample to use as a reference but more references are always good ;)

I looked at the NCBI Sequence Read Archive, the 1000 Genomes Project, etc., but I seem to be too stupid to find suitable data, preferably Illumina HiSeq 2000 FASTQ files so that I can do the mapping myself.

Does anyone have a good resource? I would appreciate it a lot.

Thanks in advance and cheers

stefan

sequence sequencing alignment • 5.0k views
0

I didn't get it. Are you looking for the human reference genome?

0

hey,

Sorry if I didn't explain too well, I will try again ;)

I am not looking for the human reference genome. I already have that, and I use it to map the FASTQ files and create the BAM files.

I have raw sequence reads (FASTQ files) from patients that I sequenced myself, and for further analysis I need raw sequence reads from more individuals to compare against the DNA of my patients.

I hope that makes my problem clearer; if not, keep firing questions.

Cheers
Stefan

3
9.6 years ago

I looked at the NCBI Sequence Read Archive, the 1000 Genomes Project, etc., but I seem to be too stupid to find suitable data, preferably Illumina HiSeq 2000 FASTQ files so that I can do the mapping myself.

Search the FTP index files of the 1000 Genomes Project:

$ curl -s "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/sequence.index" | grep  -E '(FASTQ_FILE|WGS)' | grep -E '(HiSeq 2000|FASTQ_FILE)' | grep -v '/ERR' | head -n 10 | verticalize

>>> 2
$1            FASTQ_FILE : data/HG02654/sequence_read/SRR588495.filt.fastq.gz
$2                   MD5 : dd5d09d06e1d9480d0344ed8bdc10007
$3                RUN_ID : SRR588495
$4              STUDY_ID : SRP004077
$5            STUDY_NAME : 1000 Genomes PJL WGS sequencing
$6           CENTER_NAME : BI
$7         SUBMISSION_ID : SRA059511
$8       SUBMISSION_DATE :
$9             SAMPLE_ID : SRS290936
$10          SAMPLE_NAME : HG02654
$11           POPULATION : PJL
$12        EXPERIMENT_ID : SRX194638
$13  INSTRUMENT_PLATFORM : ILLUMINA
$14     INSTRUMENT_MODEL : Illumina HiSeq 2000
$15         LIBRARY_NAME : Sage-109754
$16             RUN_NAME : C0W2YACXX120811.5.tagged_474.bam
$17       RUN_BLOCK_NAME :
$18          INSERT_SIZE : 402
$19       LIBRARY_LAYOUT : PAIRED
$20         PAIRED_FASTQ :
$21            WITHDRAWN : 0
$22       WITHDRAWN_DATE :
$23              COMMENT :
$24           READ_COUNT : 14621
$25           BASE_COUNT : 1476721
$26       ANALYSIS_GROUP : low coverage
<<< 2

>>> 3
$1            FASTQ_FILE : data/HG02654/sequence_read/SRR588495_1.filt.fastq.gz
$2                   MD5 : 12d15fb64f40c930ad567e06d60784a5
$3                RUN_ID : SRR588495
$4              STUDY_ID : SRP004077
$5            STUDY_NAME : 1000 Genomes PJL WGS sequencing
$6           CENTER_NAME : BI
$7         SUBMISSION_ID : SRA059511
$8       SUBMISSION_DATE :
$9             SAMPLE_ID : SRS290936
$10          SAMPLE_NAME : HG02654
$11           POPULATION : PJL
$12        EXPERIMENT_ID : SRX194638
$13  INSTRUMENT_PLATFORM : ILLUMINA
$14     INSTRUMENT_MODEL : Illumina HiSeq 2000
$15         LIBRARY_NAME : Sage-109754
$16             RUN_NAME : C0W2YACXX120811.5.tagged_474.bam
$17       RUN_BLOCK_NAME :
$18          INSERT_SIZE : 402
$19       LIBRARY_LAYOUT : PAIRED
$20         PAIRED_FASTQ : data/HG02654/sequence_read/SRR588495_2.filt.fastq.gz
$21            WITHDRAWN : 0
$22       WITHDRAWN_DATE :
$23              COMMENT :
$24           READ_COUNT : 2415523
$25           BASE_COUNT : 243967823
$26       ANALYSIS_GROUP : low coverage
<<< 3

>>> 4
$1            FASTQ_FILE : data/HG02654/sequence_read/SRR588495_2.filt.fastq.gz
$2                   MD5 : da0f8bec3077c6e84ea700f742390dae
$3                RUN_ID : SRR588495
$4              STUDY_ID : SRP004077
$5            STUDY_NAME : 1000 Genomes PJL WGS sequencing
$6           CENTER_NAME : BI
$7         SUBMISSION_ID : SRA059511
$8       SUBMISSION_DATE :
$9             SAMPLE_ID : SRS290936
$10          SAMPLE_NAME : HG02654
$11           POPULATION : PJL
$12        EXPERIMENT_ID : SRX194638
$13  INSTRUMENT_PLATFORM : ILLUMINA
$14     INSTRUMENT_MODEL : Illumina HiSeq 2000
$15         LIBRARY_NAME : Sage-109754
$16             RUN_NAME : C0W2YACXX120811.5.tagged_474.bam
$17       RUN_BLOCK_NAME :
$18          INSERT_SIZE : 402
$19       LIBRARY_LAYOUT : PAIRED
$20         PAIRED_FASTQ : data/HG02654/sequence_read/SRR588495_1.filt.fastq.gz
$21            WITHDRAWN : 0
$22       WITHDRAWN_DATE :
$23              COMMENT :
$24           READ_COUNT : 2415523
$25           BASE_COUNT : 243967823
$26       ANALYSIS_GROUP : low coverage
<<< 4

>>> 5
$1            FASTQ_FILE : data/HG02696/sequence_read/SRR588497.filt.fastq.gz
$2                   MD5 : 04182f5744883fac63c0f7bfe3b56fe2
$3                RUN_ID : SRR588497
$4              STUDY_ID : SRP004077
$5            STUDY_NAME : 1000 Genomes PJL WGS sequencing
$6           CENTER_NAME : BI
$7         SUBMISSION_ID : SRA059511
$8       SUBMISSION_DATE :
$9             SAMPLE_ID : SRS290951
$10          SAMPLE_NAME : HG02696
$11           POPULATION : PJL
$12        EXPERIMENT_ID : SRX194639
$13  INSTRUMENT_PLATFORM : ILLUMINA
$14     INSTRUMENT_MODEL : Illumina HiSeq 2000
$15         LIBRARY_NAME : Sage-109762
$16             RUN_NAME : D1314ACXX120814.6.tagged_581.bam
$17       RUN_BLOCK_NAME :
$18          INSERT_SIZE : 391
$19       LIBRARY_LAYOUT : PAIRED
$20         PAIRED_FASTQ :
$21            WITHDRAWN : 0
$22       WITHDRAWN_DATE :
$23              COMMENT :
$24           READ_COUNT : 13615
$25           BASE_COUNT : 1375115
$26       ANALYSIS_GROUP : low coverage
<<< 5
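
The grep chain above matches text anywhere on a line. As an alternative, here is a minimal sketch that filters on the columns themselves, assuming the index is tab-separated with the column order shown in the output above ($1 FASTQ_FILE, $2 MD5, $14 INSTRUMENT_MODEL, $20 PAIRED_FASTQ, $21 WITHDRAWN). The function name `filter_index` and the `BASE_URL` variable are made up for illustration, and the base URL is only an assumption derived from the location of the index file used above; adjust it if the files have moved. It does not reproduce the WGS or /ERR filters from the command above.

```shell
# Assumed base URL: the directory that held sequence.index in the command above.
BASE_URL="${BASE_URL:-ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp}"

# filter_index (hypothetical helper): reads a tab-separated sequence index on
# stdin and prints "URL<TAB>MD5" for every FASTQ file that
#   - was sequenced on an Illumina HiSeq 2000 (column 14),
#   - has a mate file listed, i.e. belongs to a read pair (column 20),
#   - has not been withdrawn (column 21 is 0).
filter_index() {
  awk -F '\t' -v base="$BASE_URL" '
    $1 == "FASTQ_FILE" { next }
    $14 == "Illumina HiSeq 2000" && $20 != "" && $21 == "0" { print base "/" $1 "\t" $2 }
  '
}
```

With the index saved locally as sequence.index, `filter_index < sequence.index | head` would list the first few candidate files. The MD5 in the second column can be compared against the output of `md5sum` after the download.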
0

File not found... Can you please check?

0

This thread is 2.6 years old. Can YOU please search for it?
