Hello, can I get a brief overview of the data stored in the 1000 Genomes Phase 3 dataset for a single sample? For a sample name such as HG00113, there are a multitude of paired-end FASTQ read files on their FTP server:
ERR020088_1.filt.fastq.gz 24.6 GB
ERR020088_2.filt.fastq.gz 24.7 GB
ERR229776.filt.fastq.gz 360 MB
ERR229776_1.filt.fastq.gz 9.4 GB
ERR229776_2.filt.fastq.gz 9.7 GB
SRR070517.filt.fastq.gz 7.4 MB
SRR070517_1.filt.fastq.gz 2.2 GB
SRR070517_2.filt.fastq.gz 2.3 GB
SRR070802.filt.fastq.gz 6.8 MB
SRR070802_1.filt.fastq.gz 2.2 GB
SRR070802_2.filt.fastq.gz 2.3 GB
Can someone explain how to interpret this data? Is it the same sample, or are different samples included in the same run accession?
Why do I get multiple sets of paired-end reads?
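To make the listing easier to read, here is a minimal Python sketch that groups the file names above by run accession. The regular expression, the `group_by_run` helper, and the labels `mate1`, `mate2`, and `unpaired` are my own illustrative names, not anything defined by the 1000 Genomes project. It assumes the pattern `<run accession>[_1|_2].filt.fastq.gz`, and that a file without a `_1`/`_2` suffix holds reads without a mate; that reading should be checked against the project's sequence README.

```python
import re
from collections import defaultdict

# File names copied from the listing in the question.
FILES = [
    "ERR020088_1.filt.fastq.gz", "ERR020088_2.filt.fastq.gz",
    "ERR229776.filt.fastq.gz", "ERR229776_1.filt.fastq.gz",
    "ERR229776_2.filt.fastq.gz",
    "SRR070517.filt.fastq.gz", "SRR070517_1.filt.fastq.gz",
    "SRR070517_2.filt.fastq.gz",
    "SRR070802.filt.fastq.gz", "SRR070802_1.filt.fastq.gz",
    "SRR070802_2.filt.fastq.gz",
]

# Assumed naming pattern: <run accession>[_<mate>].filt.fastq.gz
PATTERN = re.compile(
    r"^(?P<run>[A-Z]{3}\d+)(?:_(?P<mate>[12]))?\.filt\.fastq\.gz$"
)


def group_by_run(names):
    """Map each run accession to its files, keyed by mate label."""
    runs = defaultdict(dict)
    for name in names:
        match = PATTERN.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed pattern
        mate = match.group("mate")
        # "unpaired" is a hypothetical label for files with no _1/_2 suffix.
        label = f"mate{mate}" if mate else "unpaired"
        runs[match.group("run")][label] = name
    return dict(runs)


if __name__ == "__main__":
    for run, files in sorted(group_by_run(FILES).items()):
        print(run, sorted(files))
```

Run on this listing, the files fall into four run accessions (ERR020088, ERR229776, SRR070517, SRR070802), each with a `_1`/`_2` pair, and three of them also have an unsuffixed file.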
Thanks a ton, Kevin, this cleared up the confusion I had. Thanks for the sources, I will go through them!
If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.
Thanks Wouter! You the man!
I feel like a bot sometimes though. Anyways, keeps me busy when waiting for scripts to finish/travis to check my build/...
Okay great - good luck with it! It's a lot of data