I am trying to get aligned data from an NA12878 sample.
I am only interested in chromosome 20.
I tried downloading a file directly, but after a while the download just stops and I get a message that it failed, with no reason given.
After many attempts I tried the SRA Toolkit,
but I cannot get it working.
I use the following command line:
sam-dump --aligned-region 20:1-16444167 --output-file SRR1976040_chr20.sam SRR1976040
When I run the command, it does something for a couple of seconds.
When I look at the output file, it does contain data, but it is only 948 bytes and I can't work with it.
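A file this small may hold only SAM header lines and no alignments, which is one thing worth ruling out. Here is a minimal sketch of that check. It relies only on the SAM convention that header lines start with "@". The function name count_alignments is made up for illustration and is not part of the SRA Toolkit.

```shell
# Count alignment records (non-header lines) in a SAM file.
# SAM header lines start with '@'; every other line is one alignment.
# count_alignments is a hypothetical helper name, not an SRA Toolkit command.
count_alignments() {
    # grep -c prints the count; it exits with status 1 when the count is 0,
    # so the status is deliberately ignored here.
    grep -vc '^@' "$1" || true
}

# Example use:
#   count_alignments SRR1976040_chr20.sam
```

If this prints 0, the region string probably matched no reference name in the run, so only the header was written.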
I am new to this kind of software. I tried googling but could not find an answer that I understood and could use.
Maybe someone here could help me out.
I am now downloading the whole BAM file from the download section (NA12878_WGS_possorted_bam.bam). This file is 120 GB.
So far that file is downloading at around 2.5 Mbit/s.
I also tried another download for just chr20. That file is downloading at around 50 kb/s, with an unknown amount of time left.
- Edit 2
Someone showed me the correct command line to use:
sam-dump --aligned-region chr20 --output-file SRR1976036_chr20.sam SRR1976036
I had used "20" instead of "chr20".
However, while running the task I get the following message:
" sam-dump.2.8.2 sys: timeout exhausted while reading file within network system module - mbedtls_ssl_read returned -76 ( NET - Reading information from the socket failed "
But the command is still running.
Now, using the correct command line, I run into the next problem:
2017-04-13T07:52:59 sam-dump.2.8.2 sys: error unknown while reading file within network system module - mbedtls_ssl_read returned -76 ( NET - Reading information from the socket failed )
The funny part is that I get the same error message when I use the fastq-dump command. The IT guy is now also looking into it.
I have now asked for just a fragment of chr20,
using the following command:
sam-dump --aligned-region chr20:2500000-2600000 --output-file SRR1976036_chr20.sam SRR1976036
With this I don't encounter any problems, so it may well be that the size of the file is the problem after all, as people mentioned below.
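Since a 100 kb fragment came through cleanly, one possible workaround is to request the chromosome in small windows and retry any window that fails. This is only a sketch under some assumptions. The helper names make_windows and fetch_window are hypothetical. The window size, the three attempts and the 30-second pause are arbitrary. I have not verified that sam-dump returns a non-zero exit status after a network timeout, and the retry depends on that.

```shell
# Print "chrom:start-end" windows covering [start, end] in steps of `step` bases.
# make_windows and fetch_window are hypothetical helper names for illustration.
make_windows() {
    chrom=$1; start=$2; end=$3; step=$4
    while [ "$start" -le "$end" ]; do
        stop=$((start + step - 1))
        if [ "$stop" -gt "$end" ]; then
            stop=$end
        fi
        echo "${chrom}:${start}-${stop}"
        start=$((stop + 1))
    done
}

# Fetch one window into its own file, retrying up to three times.
# Assumption: sam-dump exits non-zero when the transfer fails.
fetch_window() {
    acc=$1; region=$2; out=$3
    for attempt in 1 2 3; do
        # --no-header keeps each piece to alignment lines only.
        sam-dump --no-header --aligned-region "$region" "$acc" > "$out" && return 0
        echo "attempt $attempt failed for $region" >&2
        sleep 30
    done
    return 1
}

# Example use (not run automatically):
#   i=0
#   for region in $(make_windows chr20 2500000 2799999 100000); do
#       i=$((i + 1))
#       fetch_window SRR1976036 "$region" "SRR1976036_chr20_part${i}.sam" || break
#   done
```

Reads that span a window boundary could show up in two neighbouring pieces, so the parts may need de-duplicating before they are combined.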
The IT guy looked into it.
He said there is a bandwidth issue because there is a lot of traffic on NCBI's network, which is why I get so many timeout errors.