Question: Can I use Picard (SortSam) instead of samtools (sort) to sort BAM files in an RNA-seq pipeline where HISAT2 is used for alignment?
shuksi1984 wrote:

My pipeline includes the following steps:

STEP-1: Alignment with HISAT2

path/to/hisat2 -f -x /path/to/in-built/genome -1 /path/to/SRR925687_1.fa -2 /path/to/SRR925687_2.fa -S /path/to/RNA.sam

STEP-2: SAM-to-BAM conversion

samtools view -S -b  /path/to/RNA.sam > /path/to/RNA.bam

STEP-3: BAM sorting

sudo java -jar /path/to/picard.jar SortSam INPUT=RNA.bam OUTPUT=RNA.sorted.bam SORT_ORDER=coordinate

STEP-4: Assemble transcripts with StringTie

/path/to/stringtie RNA.sorted.bam -A RNA.gene.abudance.tab -C RNA.cov.refs.gtf -G Homo_sapiens.GRCh38.86.gtf -B -e -o RNA.gtf -p 4

STEP-5: Prepare for DESeq2

cd /path/to/RNA #(where ballgown subdirectory is created)
python ./prepDE.py

Error: sub-directory 'ballgown' not found!

I created a subdirectory named "ballgown", placed the *.ctab and GTF files in it, and ran the above command again. I then got the following error:

Error: no GTF files found under ./ballgown !

I believe the error might be due to sorting the BAM file with SortSam, but I didn't get any error messages in the other steps.

modified 9 days ago • written 10 days ago by shuksi1984 (20)

To me, those errors do not suggest that anything is wrong with your BAM file sorting. What is the output of the following command, run on the directory where the ballgown subdirectory is created?

ls /path/to/RNA

written 10 days ago by WouterDeCoster (28k)

I moved the following files into the ballgown subdirectory:

e2t.ctab
e_data.ctab
i2t.ctab
i_data.ctab
t_data.ctab
RNA.cov.refs.gtf
RNA.gtf
RNA.gene.abudance.tab

When that did not resolve the error, I also moved Homo_sapiens.GRCh38.86.gtf there.

modified 9 days ago by Ram (15k) • written 10 days ago by shuksi1984 (20)

I looked at the code of prepDE.py, and it suggests that the script really cannot find the GTF files in that directory (perhaps the names or locations are not what it expects). This has nothing to do with how the BAM files were sorted.

modified 10 days ago • written 10 days ago by WouterDeCoster (28k)

Can the code of prepDE.py not recognize a GTF file with a .gtf extension?

Shall I perform the sorting step with SAMTOOLS?

written 10 days ago by shuksi1984 (20)

Can the code of prepDE.py not recognize a GTF file with a .gtf extension?

The code looks for a *.gtf file, but I'm not sure whether it imposes other naming constraints. This is the line that searches for GTF files:

samples = [(i,glob.iglob(os.path.join(opts.input,i,"*.gtf")).next()) for i in next(os.walk(opts.input))[1] if re.search(opts.pattern,i)]
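Read literally, the quoted line walks the immediate subdirectories of the input directory and takes the first *.gtf inside each one, so GTF files placed directly in ballgown/ would not be picked up. The sketch below is a Python 3 paraphrase of that line, not the script itself: the function name `find_sample_gtfs`, the default pattern ".", and the sample folder name "RNA" are assumptions made for illustration. Unlike the original, it skips subdirectories without a GTF instead of raising.

```python
import glob
import os
import re
import tempfile


def find_sample_gtfs(input_dir, pattern="."):
    """Return (sample_name, gtf_path) for each matching subdirectory of input_dir."""
    samples = []
    # next(os.walk(d))[1] lists only the immediate subdirectories of d.
    for sub in sorted(next(os.walk(input_dir))[1]):
        if not re.search(pattern, sub):
            continue
        gtfs = sorted(glob.glob(os.path.join(input_dir, sub, "*.gtf")))
        if gtfs:  # the quoted line would raise here instead of skipping
            samples.append((sub, gtfs[0]))
    return samples


with tempfile.TemporaryDirectory() as root:
    # Layout 1: GTF placed directly in the input directory.
    flat = os.path.join(root, "flat_ballgown")
    os.makedirs(flat)
    open(os.path.join(flat, "RNA.gtf"), "w").close()

    # Layout 2: one subdirectory per sample, each holding its GTF.
    nested = os.path.join(root, "ballgown")
    os.makedirs(os.path.join(nested, "RNA"))
    open(os.path.join(nested, "RNA", "RNA.gtf"), "w").close()

    flat_result = find_sample_gtfs(flat)
    nested_result = find_sample_gtfs(nested)

nested_names = [(name, os.path.basename(path)) for name, path in nested_result]
print(flat_result)   # no sample subdirectories, so nothing is found
print(nested_names)  # the per-sample layout is found
```

If this reading is right, moving RNA.gtf and the *.ctab files into a per-sample folder such as ballgown/RNA/ may be enough for the script to find them.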

Shall I perform the sorting step with SAMTOOLS?

If that makes you happy, go for it.

modified 10 days ago • written 10 days ago by WouterDeCoster (28k)

I used the following command:

samtools sort RNA.bam -o RNA.sorted.bam

It got stuck.

modified 10 days ago • written 10 days ago by shuksi1984 (20)

Stuck in what way? Depending on the size of the file, sorting can take a while.

written 10 days ago by genomax (48k)

It has been running for the past 24 hours. RNA.bam is 3.5 GB.

written 10 days ago by shuksi1984 (20)

At that size sorting should not take 24h. How much memory do you have? Are you able to see if the samtools process is doing anything? Are there *tmp* files?
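One way to check whether the sort is still making progress is to list any temporary files next to the output and see how recently they were written. This is a minimal Python sketch, assuming the temporary files are named after the output file with an extra suffix; the helper `list_sort_chunks` and the fake chunk names in the demo are illustrative and not part of samtools.

```python
import glob
import os
import tempfile
import time


def list_sort_chunks(output_path):
    """Return (name, size_bytes, age_seconds) for files named '<output_path>.*'."""
    now = time.time()
    chunks = []
    for path in sorted(glob.glob(glob.escape(output_path) + ".*")):
        info = os.stat(path)
        chunks.append((os.path.basename(path), info.st_size, now - info.st_mtime))
    return chunks


# Demo on a throwaway directory with fake 10-byte chunk files.
with tempfile.TemporaryDirectory() as workdir:
    out = os.path.join(workdir, "RNA.sorted.bam")
    for i in range(3):
        with open(f"{out}.{i:04d}.bam", "wb") as fh:
            fh.write(b"\0" * 10)
    demo = list_sort_chunks(out)

demo_names = [name for name, _, _ in demo]
demo_sizes = [size for _, size, _ in demo]
for name, size, age in demo:
    print(f"{name}\t{size} bytes\tlast written {age:.0f} s ago")
```

On a real run, calling `list_sort_chunks("RNA.sorted.bam")` a few minutes apart and comparing sizes and ages would show whether new chunks are still being written.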

written 9 days ago by genomax (48k)

Numerous files with names like RNA.sorted.bam.0000.bam are generated and then disappear after some time. After that, nothing is generated except multiple lines of "c62;c62;c62;c62;c62;c62;c62;" in the terminal.

My machine has 16 GB of RAM and a 1 TB HDD.

modified 9 days ago • written 9 days ago by shuksi1984 (20)

Please let me know if anybody can find a solution.

modified 9 days ago • written 9 days ago by shuksi1984 (20)

Based on your description of the hardware and file size, this should be a trivial operation taking less than 20 minutes.

What OS are you using? Did you compile samtools yourself? What version of samtools are you using?

written 8 days ago by genomax (48k)

OS description:

Distributor ID: Ubuntu
Description: Ubuntu 15.04
Release: 15.04
Codename: vivid

Yes, I compiled samtools myself. The samtools version is 1.2 (using htslib 1.2.1).

modified 8 days ago • written 8 days ago by shuksi1984 (20)

Samtools is currently at v1.8. I suggest that you upgrade and see if that helps.
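After upgrading, the sort call could be wrapped in a small script so that the thread count, memory per thread, and temporary-file prefix are explicit. The sketch below only assembles and prints the command. `-@`, `-m`, `-T` and `-o` are documented `samtools sort` options in recent 1.x releases, while the helper names, the values of 4 threads and 1G per thread (chosen to fit well within 16 GB of RAM), and the prefix `RNA.sort.tmp` are assumptions for illustration.

```python
import subprocess


def build_sort_command(in_bam, out_bam, threads=4, mem_per_thread="1G", tmp_prefix=None):
    """Assemble a coordinate-sort command line for samtools as a list of arguments."""
    cmd = ["samtools", "sort", "-@", str(threads), "-m", mem_per_thread, "-o", out_bam]
    if tmp_prefix is not None:
        cmd += ["-T", tmp_prefix]  # prefix for temporary chunk files
    cmd.append(in_bam)             # input goes last, after all options
    return cmd


def run_sort(cmd):
    """Run the command and raise CalledProcessError if samtools exits non-zero."""
    subprocess.run(cmd, check=True)


cmd = build_sort_command("RNA.bam", "RNA.sorted.bam", tmp_prefix="RNA.sort.tmp")
print(" ".join(cmd))
# Call run_sort(cmd) from the directory that holds RNA.bam to actually sort.
```

Placing every option before the input file keeps the command unambiguous across samtools versions that parse positional arguments differently.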

written 8 days ago by genomax (48k)