Fastq-dump warning: too many reads at spot id
2.6 years ago

I want to download the dataset PRJNA281410 from SRA, along with the corresponding reference genome (FASTA format). My code:

esearch -db sra -query PRJNA281410 \
| elink -target assembly \
| efetch -format docsum \
| xtract -pattern DocumentSummary -element FtpPath GenBank \
| cut -d ',' -f 1 \
| grep SRR \
| xargs -n 1 -P 4 fastq-dump --split-files --gzip --skip-technical SRR18781516

This produces a warning about skipped spots (many lines like the following): fastq-dump warn: too many reads at spot id XXX, maximum YY supported, skipped


If you are using xargs to pass values, why do you have a fixed SRR18781516 at the end of your command? Additionally, your command as posted does not work past the first search step. Many of these datasets are PacBio, so I don't think that blanket fastq-dump command will work.

Finally, I'm not sure what you mean by

corresponding reference genome (fasta format)

Are you looking to get the reference genomes for the two bacteria that are part of the data?
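To illustrate the xargs point above, here is a minimal sketch of one way to feed run accessions into fastq-dump instead of hard-coding a single SRR. It assumes the runinfo CSV from efetch has the run accession in its first column; the function name extract_runs is hypothetical and only introduced for illustration. Whether these fastq-dump flags suit the PacBio runs is a separate question.

```shell
# Sketch: pull run accessions out of SRA runinfo CSV read from stdin.
# extract_runs is a hypothetical helper name, not part of EDirect or sra-tools.
extract_runs() {
    # Column 1 of the runinfo CSV is assumed to be the run accession.
    # The pattern keeps SRR/ERR/DRR ids and drops the header and blank lines.
    cut -d ',' -f 1 | grep -E '^[SED]RR[0-9]+$'
}

# Intended use (needs network access, EDirect and sra-tools; not run here).
# Note there is no fixed accession at the end: xargs appends one per call.
# esearch -db sra -query PRJNA281410 \
#   | efetch -format runinfo \
#   | extract_runs \
#   | xargs -n 1 -P 4 fastq-dump --split-files --gzip --skip-technical
```

With -n 1, xargs runs fastq-dump once per accession read from the pipe, so nothing needs to be typed after the flags.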


fastq-dump threw an error when I did not specify the SRR.


You can check out ENA which also provides fastq files.
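As a rough sketch of the ENA route, the ENA Portal API can return a per-run file report for a project, which can then be turned into download links. This assumes the report is requested as TSV with the fields run_accession and fastq_ftp in that order; the function name ena_fastq_urls is hypothetical, and ENA may not list FASTQ files for every run (the PacBio ones in particular).

```shell
# Sketch: turn an ENA file report (TSV on stdin) into one download URL per line.
# Assumes a header row and two columns: run_accession, fastq_ftp.
# ena_fastq_urls is a hypothetical helper name.
ena_fastq_urls() {
    awk -F '\t' 'NR > 1 && $2 != "" {
        # fastq_ftp holds one or more paths separated by ";" (e.g. paired files)
        n = split($2, paths, ";")
        for (i = 1; i <= n; i++)
            if (paths[i] != "") print "ftp://" paths[i]
    }'
}

# Intended use (needs network access; not run here):
# curl -s 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=PRJNA281410&result=read_run&fields=run_accession,fastq_ftp&format=tsv' \
#   | ena_fastq_urls \
#   | xargs -n 1 -P 4 wget -q
```

Runs with an empty fastq_ftp field are skipped, so it is worth comparing the number of runs in the report with the number of files downloaded.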
