Are my RNA-seq datasets strand-specific?
1
1
Entering edit mode
8.2 years ago
zizigolu ★ 4.3k

Hi

How I could know whether my data sets coming from RNA-seq are strand-specific or not? I emailed the company but they are in Chinese new year and could not help me yet but I should start the analysis.

Thank you

alignment next-gen-sequencing RNA-Seq • 4.0k views
ADD COMMENT
2
Entering edit mode
8.2 years ago

Align a million or so reads (using the unstranded mode, if your aligner has one) and then use RSeQC to infer the experiment type.

ADD COMMENT
2
Entering edit mode

Just to complete your suggestion with a link =)

http://rseqc.sourceforge.net/#infer-experiment-py

ADD REPLY
1
Entering edit mode

Thank you, then I need both bam and bed files? But in UCSC Table Browser there is no such a bed for Arabidopsis, then what to do?

ADD REPLY
2
Entering edit mode

Get the bed from BioMart (Ensembl plant): http://plants.ensembl.org/Arabidopsis_thaliana/Info/Index

Make sure the genome build matches.

ADD REPLY
1
Entering edit mode

Thank you, for aligning I used gtf coming from ensembl but there is gff3. no problem?

The main problem is this, I tried to convert my gtf to bed by bedops but some bug that I could not solve then I tried galaxy but in galaxy there is only gff to bed then I used my gtf instead of gff but

[izadi@lbox161 bin]$ infer_experiment.py -r Galaxy142-[GFF-to-BED_on_data_141].bed -i 204.bam
infer_experiment.py: No match.
[izadi@lbox161 bin]$

Then do you know another solution please?

ADD REPLY
1
Entering edit mode
./infer_experiment.py -r Galaxy142-[GFF-to-BED_on_data_141].bed -i 204.bam
ADD REPLY
1
Entering edit mode

Looks like you need to provide the path for infer_experiment.py since that does not seem to be in that bin directory.

ADD REPLY
0
Entering edit mode

Thank you but please consider

[izadi@lbox161 bin]$ pwd
/usr/data/nfs6/izadi/RSeQC-2.6.3/usr/bin
[izadi@lbox161 bin]$ ls
204.bam                                 junction_saturation.py
bam2fq.py                               mismatch_profile.py
bam2wig.py                              normalize_bigwig.py
bam_stat.py                             overlay_bigwig.py
clipping_profile.py                     read_distribution.py
deletion_profile.py                     read_duplication.py
divide_bam.py                           read_GC.py
FPKM_count.py                           read_hexamer.py
Galaxy142-[GFF-to-BED_on_data_141].bed  read_NVC.py
geneBody_coverage2.py                   read_quality.py
geneBody_coverage.py                    RNA_fragment_size.py
infer_experiment.py                     RPKM_saturation.py
inner_distance.py                       split_bam.py
insertion_profile.py                    split_paired_bam.py
junction_annotation.py                  tin.py
[izadi@lbox161 bin]$
ADD REPLY
1
Entering edit mode

Try this in case Devon's suggestion did not work

$ python ./infer_experiment.py -r Galaxy142-[GFF-to-BED_on_data_141].bed -i 204.bam
ADD REPLY
1
Entering edit mode

Thank you,

[izadi@lbox161 bin]$ python infer_experiment.py -r Galaxy142-[GFF-to-BED_on_data_141].bed -i 204.bam
python: No match.
[izadi@lbox161 bin]$

:(

ADD REPLY
1
Entering edit mode

You don't have python installed on this machine? What OS are you using?

$ which python
ADD REPLY
1
Entering edit mode

fedora22

[izadi@lbox161 bin]$ which python
/bin/python
[izadi@lbox161 bin]$
ADD REPLY
1
Entering edit mode

Looks like python/infer_experiment.py does not like that atrocious BED file name. Can you move/copy it to something else and try.

$ cp Galaxy142-[GFF-to-BED_on_data_141].bed simple.bed
$ ./infer_experiment.py -r simple.bed -i 204.bam
ADD REPLY
1
Entering edit mode

Thank you,

[izadi@lbox161 bin]$ infer_experiment.py -r simple.bed -i 204.bam
Traceback (most recent call last):
  File "infer_experiment.py", line 49, in <module>
    from bx.bitset import *
ImportError: No module named bx.bitset
[izadi@lbox161 bin]$
ADD REPLY
1
Entering edit mode

Thank you

[izadi@lbox161 RSeQC-2.6.3]$ python setup.py build
running build
running build_py

running build_ext
running build_scripts
[izadi@lbox161 RSeQC-2.6.3]$
[izadi@lbox161 RSeQC-2.6.3]$ sudo python setup.py install
[sudo] password for izadi:
Sorry, user izadi is not allowed to execute '/bin/python setup.py install' as root on lbox161.mpikg.mpg.de.
[izadi@lbox161 RSeQC-2.6.3]$
ADD REPLY
1
Entering edit mode
python setup.py install --root=/home/user/XXX/
ADD REPLY
0
Entering edit mode

You could ask your sys admins to install the modules or try to install them in your own directory: https://www.seas.upenn.edu/cets/answers/install-python-module.html

It is a bit late in this thread but can't you ask whoever produced this data if they used a "strand-specific" protocol?

ADD REPLY
1
Entering edit mode

Thank you

I emailed the company but they are in holiday

ADD REPLY
0
Entering edit mode

Not ideal but you could look at the alignments in IGV/Tablet to see if you can see an identifiable pattern in the alignment of reads.

ADD REPLY
1
Entering edit mode

Thank you so much.

In a biostars post I read something about sam flags. Then I saw abundances of 83, 99, 147 and 163 flags of my sam file is the same=1, then might library IS sequenced strand-specific. Anyway I think they will back from holiday Feb 14 and they may reply to my email

ADD REPLY
1
Entering edit mode

Sorry, I installed Python then this is my result

Reading reference gene model simple.bed ... Done
Loading SAM/BAM file ... Total 200000 usable reads were sampled

This is PairEnd Data
Fraction of reads failed to determine: 0.0068
Fraction of reads explained by "1++,1--,2+-,2-+": 0.4992
Fraction of reads explained by "1+-,1-+,2++,2--": 0.4940

My library was sequenced in strand-specific mode? Sorry if so first or second strand?

Thank you

ADD REPLY
1
Entering edit mode

No it was not strand-specific. See the explanation here: http://rseqc.sourceforge.net/#infer-experiment-py

ADD REPLY
1
Entering edit mode

thank youuu

ADD REPLY
0
Entering edit mode

Eeek, I had a few missing words in there too! Thanks for adding the link!

ADD REPLY
1
Entering edit mode

Thank you, I aligned but totally on transcriptom and I have accepted_hits.bam now. I am in this path: /usr/data/nfs6/izadi/RSeQC-2.6.3/usr/bin. How can I detect if is strand-specific?

ADD REPLY

Login before adding your answer.

Traffic: 3120 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6