Question: Find number of reads from particular transcript
shmaisrael wrote (2.3 years ago):

I got a few FASTQ files from an RNA-seq experiment, and somebody asked me to check the number of reads that come from the retroelements L1-ORF1/2p. Is there a simple way to BLAST the whole FASTQ file against the sequence of the elements (instead of aligning it to the reference genome using hisat/bowtie/bwa etc.) and check the number of reads in each sample? I could try grep, but it may take a long time. Thank you for any advice.

rna-seq fastq • 741 views

"instead of align it to the reference genome using hisat/bowtie/bwa etc"

Since you are talking about RNA-seq, you want a splice-aware aligner, so bwa and bowtie will not suffice. HISAT is a good one; alternatives are STAR and BBMap.

written 2.3 years ago by WouterDeCoster
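
To illustrate the point about splice-aware aligners: a read that spans an exon junction aligns to the genome with a gap over the intron, which the SAM format records as an `N` operation in the CIGAR string. Below is a minimal Python sketch that checks a CIGAR string for such a gap. The helper names (`parse_cigar`, `is_spliced`, `intron_lengths`) are hypothetical and not part of any aligner.

```python
import re

# Hypothetical helpers for illustration only.
# A SAM CIGAR string is a run of <length><operation> pairs, e.g. "30M1200N46M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) tuples."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]


def is_spliced(cigar):
    """True if the alignment skips reference bases ('N'), as a read crossing an intron does."""
    return any(op == "N" for _, op in parse_cigar(cigar))


def intron_lengths(cigar):
    """Lengths of all skipped reference regions in the alignment."""
    return [length for length, op in parse_cigar(cigar) if op == "N"]
```

An aligner that is not splice-aware cannot produce the `N` gap, so a junction-spanning read would typically end up soft-clipped or unaligned.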

Thank you. I'll try to run it after indexing. By the way, is it possible to run BAM files against an indexed sequence?

written 2.3 years ago by shmaisrael

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

If you only need a rough idea of the number of reads that could be from that transcript, you may be able to use kallisto or Salmon to do pseudoalignment.

written 2.3 years ago by genomax
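
As a rough illustration of the pseudoalignment route: Salmon writes a tab-separated `quant.sf` file with the columns `Name`, `Length`, `EffectiveLength`, `TPM` and `NumReads`, so the estimated read count for one transcript can be pulled out with a few lines of Python. This is only a sketch; the function name `reads_for_transcript`, the transcript IDs and the numbers in the example table are all invented for illustration.

```python
import csv
import io


def reads_for_transcript(quant_sf, transcript_id):
    """Return the estimated read count (NumReads) for one transcript, or None if it is absent.

    quant_sf is any open text file or file-like object in Salmon's quant.sf layout.
    """
    reader = csv.DictReader(quant_sf, delimiter="\t")
    for row in reader:
        if row["Name"] == transcript_id:
            return float(row["NumReads"])
    return None


# Invented example table (not real data), in the quant.sf column layout.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "L1_ORF1\t1017\t850.0\t12.5\t342.0\n"
    "L1_ORF2\t3828\t3661.0\t3.1\t365.5\n"
)
l1_orf1_reads = reads_for_transcript(example, "L1_ORF1")
```

With a real run, the same function could be called on `open("quant.sf")` for each sample (the path depends on the output directory chosen). Note that `NumReads` is an estimate and can be fractional.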

Thanks. Could you please update the link to instructions?

written 2.3 years ago by shmaisrael

Again. Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized. I moved your post now.

written 2.3 years ago by WouterDeCoster
Santosh Anand wrote (2.3 years ago):

You are better off with NGS aligners like bwa. Reasons:

1) They would be faster than BLAST or grep.

2) They are quality aware.

3) grep cannot do a fuzzy search, so reads with sequencing errors or polymorphisms will not be found with grep. grep is also terribly slow for NGS searches unless you optimize the search parameters and restrict the locale (e.g. LC_ALL=C).

4) It's super easy to do! (Just index the spliced sequence and run bwa mem, for example.)
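
Once the reads are aligned to the element sequence (for example with bwa mem, writing SAM text), the number of reads per sample can be counted from the SAM FLAG field. The following is a minimal Python sketch, not a replacement for dedicated tools; the function name `count_mapped_reads` is hypothetical. It skips unmapped records (FLAG bit 0x4) as well as secondary (0x100) and supplementary (0x800) alignments, so each read end is counted at most once.

```python
def count_mapped_reads(sam_lines):
    """Count primary alignments per reference sequence from SAM text lines.

    For paired-end data each mate is counted separately.
    """
    UNMAPPED, SECONDARY, SUPPLEMENTARY = 0x4, 0x100, 0x800
    counts = {}
    for line in sam_lines:
        if line.startswith("@"):  # header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag = int(fields[1])
        reference = fields[2]
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        counts[reference] = counts.get(reference, 0) + 1
    return counts
```

It could be used as `count_mapped_reads(open("sample.sam"))`, where `sample.sam` is a placeholder for the aligner output of one sample.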

Devon Ryan (Freiburg, Germany) wrote (2.3 years ago):

Since you're interested in a repeat, you might find TEtranscripts useful, as it will allow more accurate quantification. You might also try TEcount from my pull request on that repository.
