Question: SRA to paired fastq per read group
4 months ago, MAPK wrote:

I am trying to download SRA data and create paired-end FASTQ files per read group. Could someone share how to do this? I would really appreciate a shell script.

I tried the script below, which only splits the FASTQ output per read group (RG). I also need each RG split into FQ1 and FQ2.

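SRR="SRR1350739"
# Split only on newlines so that each @RG header line becomes one array element.
IFS=$'\n'
# Print the SAM header (stop at the first non-header line) and keep the @RG lines.
RGLINES=($(sam-dump --ngc XXXX.ngc ./${SRR} | sed -n '/^[^@]/!p;//q' | grep ^@RG))
# Build a tee command with one process substitution per read group.
args=(tee)
for RGLINE in ${RGLINES[@]}; do
  unset IFS
  # Split the @RG line into fields; RG[1] is assumed to be the ID:<name> field.
  RG=(${RGLINE})
  # The pattern matches both mates (/1 and /2), so each read group gets one combined file.
  args+=(\>\(grep -A3 --no-group-separator \"\\.${RG[1]#ID:}/[12]$\" \| gzip \> "./${SRR}.${RG[1]#ID:}.fastq-dump.split.defline.z.tee.fq.gz"\))
done

echo "Splitting ${SRR} into ${#RGLINES[@]} ReadGroups"
fastq-dump --ngc XXXX.ngc --split-e --defline-seq '@$ac.$si.$sg/$ri' --defline-qual '+' -Z "${SRR}" | eval ${args[@]}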
4 months ago, GenoMax (United States) wrote:

Use bamtofastq from biobambam2. It can separate the data into read-group-specific files.
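
If installing biobambam2 is not an option, one rough alternative is to keep the fastq-dump stream from the question and route each record by read group and mate with awk. The sketch below is not part of either post. It assumes four-line FASTQ records, deflines of the form `@<accession>.<spot>.<read group>/<1|2>` (as produced by the `--defline-seq '@$ac.$si.$sg/$ri'` setting in the question), and read group IDs that contain neither "." nor "/". The function name `split_by_rg_and_mate`, its prefix argument, and the output file naming are hypothetical.

```shell
#!/usr/bin/env bash

# Read a FASTQ stream on stdin and write one file per read group and mate:
#   <prefix>.<read group>_R1.fq and <prefix>.<read group>_R2.fq
split_by_rg_and_mate() {
  local prefix="$1"   # hypothetical output prefix, e.g. the SRR accession
  awk -v prefix="$prefix" '
    # Every 4th line starting at line 1 is a defline. Counting lines avoids
    # mistaking a quality line that begins with "@" for a defline.
    NR % 4 == 1 {
      n = split($0, parts, "/")        # parts[n] is the mate number (1 or 2)
      m = split(parts[1], fields, ".") # the last dot-separated field is the read group
      out = prefix "." fields[m] "_R" parts[n] ".fq"
    }
    # All four lines of a record go to the file chosen at its defline.
    { print > out }
  '
}
```

Under those assumptions, the last line of the script in the question could end in `| split_by_rg_and_mate "${SRR}"` instead of `| eval ${args[@]}`, with the resulting files compressed afterwards using gzip. Some awk implementations limit the number of simultaneously open files, so a run with many read groups may need gawk or a different approach.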
