Question

How to fetch the sequence of a specific gene from RNA-seq data

5.5 years ago

fatimarasool135 ▴ 90

Can we retrieve the sequence of a specific gene from the RNA-seq data of one species?

For this purpose I performed the following steps, but the resulting output file contains long strings of N (nnnn) mixed with nucleotide sequence.

Indexing with bowtie2

bowtie2-build --large-index -f wheat.fa wheat

Mapping with TopHat

tophat -p 30 -G wheat.gff3 -o Alignment wheat G1_cleaned_R1.fastq G1_cleaned_R2.fastq

Get consensus FASTQ file

samtools mpileup -uf REFERENCE.fasta accepted_hits.bam | bcftools call -c | vcfutils.pl vcf2fq > batis_cns.fastq

Convert .fastq to .fasta

seqtk seq -aQ64 batis_cns.fastq > batis_cns.fasta

Coordinate fetching

zcat Triticum_aestivum.gff3.gz |grep "TraesCS5A02G213300" > coordinate.bed12
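
A note on the coordinate step: the `grep` above keeps every GFF3 line that mentions the gene ID (gene, mRNA, exon, CDS and so on), and the lines stay in GFF3 format even though the file is named `.bed12`. `bedtools getfasta` accepts GFF input as well as BED, so this should still run, but it would write one sequence per matching feature. A minimal sketch of one alternative, assuming only the `gene` feature is wanted, is to convert that single line to BED with `awk`. The file `demo.gff3`, its contents and the output name `coordinate.bed` are made up for illustration, and the attribute layout of the real annotation may differ.

```shell
# Hypothetical two-line GFF3 excerpt so the sketch runs without the real annotation.
printf '5A\tsrc\tgene\t101\t200\t.\t+\t.\tID=gene:TraesCS5A02G213300;biotype=protein_coding\n' > demo.gff3
printf '5A\tsrc\tmRNA\t101\t200\t.\t+\t.\tID=transcript:TraesCS5A02G213300.1;Parent=gene:TraesCS5A02G213300\n' >> demo.gff3

gene_id="TraesCS5A02G213300"

# Keep only the "gene" feature whose attributes mention the ID and write BED6.
# GFF3 starts are 1-based and inclusive, BED starts are 0-based, so subtract 1.
awk -F'\t' -v OFS='\t' -v id="$gene_id" '
    $3 == "gene" && index($9, id) { print $1, $4 - 1, $5, id, ".", $7 }
' demo.gff3 > coordinate.bed

cat coordinate.bed
```

With the real data, the `awk` command would read from `zcat Triticum_aestivum.gff3.gz` instead of `demo.gff3`. The sequence names in column 1 must match the headers of the FASTA file passed to `-fi`.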

Command for sequence fetching

bedtools getfasta -fi batis_cns.fasta -bed coordinate.bed12 -fo OUTPUT-GENE-SEQUENCE.fa
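
On the runs of `n`: `vcfutils.pl vcf2fq` fills positions that are missing from the pileup with `n`, so stretches of the reference without read coverage (for example introns, or a gene that is weakly expressed in this sample) are expected to show up as `n` in a consensus built from RNA-seq. That is probably what is happening here, although it cannot be confirmed from the post alone. A small sketch for measuring how much of each extracted record is `N`, using only `awk`, follows; `demo_gene.fa` is a hypothetical stand-in for `OUTPUT-GENE-SEQUENCE.fa`.

```shell
# Hypothetical stand-in for OUTPUT-GENE-SEQUENCE.fa (one record, two sequence lines).
printf '>demo_gene\nACGTnnnnNNNN\nacgtACGT\n' > demo_gene.fa

# For every FASTA record print: name, total length, number of N/n, percent N.
awk '
    function report() {
        printf "%s\t%d\t%d\t%.1f%%\n", name, total, n, (total ? 100 * n / total : 0)
    }
    /^>/ {
        if (name != "") report()      # flush the previous record
        name = substr($0, 2); total = 0; n = 0
        next
    }
    {
        total += length($0)           # count bases before gsub edits the line
        n += gsub(/[Nn]/, "")         # gsub returns the number of N/n removed
    }
    END { if (name != "") report() }
' demo_gene.fa > n_content.tsv

cat n_content.tsv
```

A record that is mostly `N` suggests the gene has little or no read coverage in this sample, in which case no consensus method can recover its sequence from these reads.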

RNA-Seq gene alignment • 1.5k views

Updated 5.5 years ago by finswimmer 16k • written 5.5 years ago by fatimarasool135 ▴ 90

Hello fatimarasool135,

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer advisable for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using salmon.

Please stop using Tophat https://t.co/Es4ohxOEyx Cole and I developed the method in *2008*. It was greatly improved in TopHat2 then HISAT & HISAT2. There is no reason to use it anymore. I have been saying this for years yet it has more citations this year than last #methodsmatter
— Lior Pachter (@lpachter) December 2, 2017

Thank you!

Reply • 5.5 years ago by finswimmer 16k

Thank you. Please tell me the steps to get the sequence of a specific gene from RNA-seq data. For example, if there is a gene abc in wheat, I want to retrieve the sequence of abc from my RNA-seq sample.

Reply • 5.5 years ago by fatimarasool135 ▴ 90

In your original post you have already described the procedure to get the consensus sequence for the gene you are interested in. So is there a question beyond that?

Reply • 5.5 years ago by GenoMax 142k

I simply followed these steps to get the gene sequence. I want to know how to fetch the desired gene sequence from the RNA-seq data of my sample.

Reply • 5.5 years ago by fatimarasool135 ▴ 90

Hi. Which reference sequence should be used for mapping RNA-seq reads? Should it be cDNA or genomic sequence?

Reply • 5.4 years ago by fatimarasool135 ▴ 90

Both can potentially be used. If you are not interested in discovering novel transcripts, you could map against the transcriptome (and if you are using salmon or kallisto you would go this route). The general recommendation is to align against the genome and then use a GTF (gene model file) to count reads that fall within the boundaries defined in that file.

Reply • 5.4 years ago by GenoMax 142k

Where can I get a GTF file?

Reply • 5.4 years ago by fatimarasool135 ▴ 90

Looks like you have a GFF3 file. That can work.
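
One caveat worth checking before relying on the GFF3: an annotation only lines up with an alignment if the sequence names in its first column are the same as the names in the reference FASTA that the reads were mapped to. A minimal sketch of that check with standard Unix tools follows; `demo_ref.fa` and `demo_annot.gff3` are hypothetical miniature files standing in for the real reference and annotation.

```shell
# Use one collation order so that sort and comm agree.
export LC_ALL=C

# Hypothetical miniature inputs so the sketch runs on its own.
printf '>5A some description\nACGT\n>5B\nACGT\n' > demo_ref.fa
printf '##gff-version 3\n5A\tsrc\tgene\t1\t4\t.\t+\t.\tID=gene:g1\nchrUn\tsrc\tgene\t1\t4\t.\t+\t.\tID=gene:g2\n' > demo_annot.gff3

# Sequence names in the FASTA: header text up to the first whitespace, without ">".
grep '^>' demo_ref.fa | sed 's/^>//; s/[[:space:]].*//' | sort -u > fasta_names.txt

# Sequence names in column 1 of the GFF3, skipping comment and directive lines.
grep -v '^#' demo_annot.gff3 | cut -f1 | sort -u > gff_names.txt

# Names used by the annotation that are absent from the FASTA.
comm -13 fasta_names.txt gff_names.txt > missing_in_fasta.txt

cat missing_in_fasta.txt
```

An empty `missing_in_fasta.txt` means every annotated sequence name exists in the reference. Any name listed there points to a mismatch (for example `chr5A` versus `5A`) that would need to be resolved first.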

Reply • 5.4 years ago by GenoMax 142k

Do I have to generate it with a tool or download it from Ensembl?

Reply • 5.4 years ago by fatimarasool135 ▴ 90