Question: How to make the script run the stringtie command first and then merge the outputs?
Biologist190 wrote:

Hi,

I ran hisat2, which gave me bam files, and sorted them to use as input for stringtie.

I have the bash script below. It first takes the sorted.bam files as input and writes a gtf for each sample. The path of each sample gtf then goes into mergelist.txt, and stringtie --merge is run on that list.

I have 40 sorted.bam files in total.

for sample in /path/*.sorted.bam
do
  dir="/pathto/hisat2_output"
  dir2="/pathto/folder"
  base=`basename $sample '.sorted.bam'`
  "stringtie -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf \
    -o ${dir2}/stringtie_output/${base}/${base}_GRCh38.gtf \
    -l ${dir2}/stringtie_output/${base}/${base} ${dir}/${base}.sorted.bam; 
  ls ${dir2}/stringtie_output/*/*_GRCh38.gtf > mergelist.txt; 
  stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf \
    -o ${dir2}/stringtie_output/stringtie_merged.gtf mergelist.txt"
done

I separated the commands with ;. After the job had finished on all sorted.bam files, mergelist.txt contained paths for only 33 sample gtfs, so the paths for the other 7 sample gtfs are missing from mergelist.txt.

How can I make the script finish the first command for all samples and only then run the next command?

linux rna-seq merge bash stringtie
modified 23 months ago by h.mon29k • written 23 months ago by Biologist190

Maybe run stringtie manually on one or two of the missing samples and check if that generates anything or whether it fails for those samples.

On a side note, I would move the ls ${dir2}/stringtie_output/*/*_GRCh38.gtf > mergelist.txt; and stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/stringtie_merged.gtf mergelist.txt out of the loop. You only want to do that once after all samples have been processed. You can also move the dir="/pathto/hisat2_output" and dir2="/pathto/folder" to be executed before the loop, as they don't seem to be sample-specific.

written 23 months ago by cschu1812.0k

Is this what you want me to change in the script?

dir="/pathto/hisat2_output"
dir2="/pathto/folder"
for sample in /path/*.sorted.bam
do
  base=`basename $sample '.sorted.bam'`
  "stringtie -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf \
    -o ${dir2}/stringtie_output/${base}/${base}_GRCh38.gtf \
    -l ${dir2}/stringtie_output/${base}/${base} ${dir}/${base}.sorted.bam"
done

ls ${dir2}/stringtie_output/*/*_GRCh38.gtf > mergelist.txt
stringtie --merge -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf \
  -o ${dir2}/stringtie_output/stringtie_merged.gtf mergelist.txt
modified 23 months ago • written 23 months ago by Biologist190

Yes, but make sure to check what stringtie does with the bam files for which you don't get output. The script changes should only affect run-time.
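To find the bam files for which no output was written, a check along these lines may help. It is only a sketch: `list_missing_gtfs` is a made-up helper name, and it assumes the `<base>/<base>_GRCh38.gtf` layout used in the script above.

```shell
#!/usr/bin/env bash
# Print the sample name of every *.sorted.bam in bam_dir that has no
# non-empty <out_dir>/<base>/<base>_GRCh38.gtf.
# list_missing_gtfs is a hypothetical helper, not part of stringtie.
list_missing_gtfs() {
  local bam_dir=$1 out_dir=$2 bam base
  for bam in "$bam_dir"/*.sorted.bam; do
    [ -e "$bam" ] || continue   # the glob matched nothing
    base=$(basename "$bam" .sorted.bam)
    # -s is true only if the file exists and is larger than zero bytes
    [ -s "$out_dir/$base/${base}_GRCh38.gtf" ] || printf '%s\n' "$base"
  done
}
```

Called as `list_missing_gtfs /pathto/hisat2_output /pathto/folder/stringtie_output`, it should print one sample name per line, and those samples can then be rerun by hand.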

written 23 months ago by cschu1812.0k

Yes, when I ran stringtie manually on the bam files that are missing from mergelist.txt, it worked.

written 23 months ago by Biologist190

With the script changed as above, I submitted the job and it gives an error:

ls: cannot access /path/*/*_GRCh38.gtf: No such file or directory
Error: no transcripts were found in input file mergelist.txt
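One possible reading of this error, assuming the script ran exactly as posted: a command wrapped entirely in double quotes is looked up as a single program name, so stringtie never starts inside the loop, no gtf is written, and ls then has nothing to list. The toy example below uses echo instead of stringtie to show the difference.

```shell
#!/usr/bin/env bash
# Quoted: the shell searches for ONE program literally named
# "echo hello world", which does not exist.
quoted_status=0
"echo hello world" 2>/dev/null || quoted_status=$?

# Unquoted: the shell runs echo with two arguments.
unquoted_status=0
echo hello world || unquoted_status=$?

# 127 is the shell's "command not found" exit status.
echo "quoted exit status:   $quoted_status"
echo "unquoted exit status: $unquoted_status"
```

The quoted form fails before any work is done, which would match a loop that finishes without producing a single gtf.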

written 23 months ago by Biologist190

Remove the double quotes ("") around "stringtie -p 8 -G gencode.v27.primary_assembly.annotation_nochr.gtf -o ${dir2}/stringtie_output/${base}/${base}_GRCh38.gtf -l ${dir2}/stringtie_output/${base}/${base} ${dir}/${base}.sorted.bam"?
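Putting the suggestions in this thread together, the script could look roughly like the sketch below. The function name `assemble_and_merge` and its three arguments are made up for illustration. Three things differ from the posted script and are assumptions rather than anything confirmed here: the per-sample folder is created with `mkdir -p` as a precaution, `-l` gets the sample name instead of a path (it sets the prefix of the transcript names), and mergelist.txt is written into the output folder rather than the current directory.

```shell
#!/usr/bin/env bash
# Sketch: assemble every sample first, then merge once.
# assemble_and_merge is a hypothetical helper name; all paths are placeholders.
assemble_and_merge() {
  local bam_dir=$1 out_dir=$2 annotation=$3
  local bam base sample_dir

  # Step 1: one stringtie run per sorted bam. The command is NOT wrapped in
  # quotes, so the shell runs stringtie itself.
  for bam in "$bam_dir"/*.sorted.bam; do
    [ -e "$bam" ] || { echo "no sorted.bam files in $bam_dir" >&2; return 1; }
    base=$(basename "$bam" .sorted.bam)
    sample_dir="$out_dir/$base"
    mkdir -p "$sample_dir"   # assumption: create the sample folder up front
    stringtie -p 8 -G "$annotation" \
      -o "$sample_dir/${base}_GRCh38.gtf" \
      -l "$base" "$bam" \
      || { echo "stringtie failed for $base" >&2; return 1; }
  done

  # Step 2: reached only after the loop has finished, so the list is complete.
  ls "$out_dir"/*/*_GRCh38.gtf > "$out_dir/mergelist.txt"
  stringtie --merge -p 8 -G "$annotation" \
    -o "$out_dir/stringtie_merged.gtf" "$out_dir/mergelist.txt"
}
```

It would be called as `assemble_and_merge /pathto/hisat2_output /pathto/folder/stringtie_output gencode.v27.primary_assembly.annotation_nochr.gtf`. If each stringtie run is instead submitted as a separate cluster job, the merge step would also have to wait for those jobs to finish, which this sketch does not handle.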

written 23 months ago by cschu1812.0k