5.5 years ago
fatimarasool135
Can we retrieve the specific sequence of a gene from the RNA-seq data of one species?
For this purpose I performed the following steps, but the resulting output file contains long strings of nnnn mixed with nucleotide sequence.
Indexing with bowtie2
bowtie2-build --large-index -f wheat.fa wheat
Mapping with tophat
tophat -p 30 -G wheat.gff3 -o Alignment wheat G1_cleaned_R1.fastq G1_cleaned_R2.fastq
Get consensus fastq file
samtools mpileup -uf REFERENCE.fasta Alignment/accepted_hits.bam | bcftools call -c | vcfutils.pl vcf2fq > batis_cns.fastq
Convert .fastq to .fasta
seqtk seq -aQ64 batis_cns.fastq > batis_cns.fasta
Coordinate fetching
zcat Triticum_aestivum.gff3.gz | grep "TraesCS5A02G213300" > coordinate.bed12
Command for sequence fetching
bedtools getfasta -fi batis_cns.fasta -bed coordinate.bed12 -fo OUTPUT-GENE-SEQUENCE.fa
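A note on the coordinate step: the grep keeps every GFF3 line that mentions the gene ID (gene, mRNA, exon, CDS), so the file named coordinate.bed12 actually holds GFF3 records, and getfasta will emit one sequence per matching feature. If a single gene-level interval is wanted, the sketch below may help. It assumes a standard nine-column, tab-separated GFF3 whose gene rows carry the ID in column 9; the function name `gff3_gene_to_bed` and the sample line are hypothetical and only for illustration. Stretches of n in the consensus most likely mark positions with no read coverage (introns, or regions not expressed in the sample), so they are not necessarily a sign of a wrong command.

```shell
# Hypothetical helper (not part of bedtools): read GFF3 on stdin and print
# BED6 lines for "gene" features whose attribute column mentions the given ID.
gff3_gene_to_bed() {
    awk -F'\t' -v OFS='\t' -v id="$1" '
        /^#/ { next }                      # skip GFF3 header and comment lines
        $3 == "gene" && index($9, id) > 0 {
            # GFF3 start is 1-based inclusive; BED start is 0-based, end is unchanged
            print $1, $4 - 1, $5, id, ".", $7
        }'
}

# Example with a made-up coordinate line (chromosome and positions are invented):
printf '5A\tEnsemblPlants\tgene\t1001\t2000\t.\t+\t.\tID=gene:TraesCS5A02G213300\n' |
    gff3_gene_to_bed TraesCS5A02G213300
```

On the real annotation this could be used as `zcat Triticum_aestivum.gff3.gz | gff3_gene_to_bed TraesCS5A02G213300 > gene.bed` (gene.bed is an assumed file name), followed by the same getfasta call with `-bed gene.bed`. The sequence names in column 1 must match the headers of the consensus FASTA, and adding `-s` to getfasta reverse-complements genes on the minus strand.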
Hello fatimarasool135,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or bbmap, or pseudo-alignment using salmon.
Thank you!
Thank you. Please tell me the steps to get the specific sequence of a gene from RNA-seq data. For example, there is a gene abc in wheat, and I want to retrieve this abc from my own RNA-seq sample.
In your original post you have already described the procedure to get the consensus sequence for the gene you are interested in. So is there a question beyond that?
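With the procedure already in place, one way to judge how much of the extracted gene is actually supported by reads is to measure the share of N bases in the output FASTA. The following is only a sketch using awk; the function name `fasta_n_fraction` and the sample record are hypothetical.

```shell
# Hypothetical helper: for each FASTA record on stdin, print
# name, sequence length, number of N/n bases, and the N fraction.
fasta_n_fraction() {
    awk -v OFS='\t' '
        function report() {
            if (name != "") print name, len, n, (len > 0 ? n / len : 0)
        }
        /^>/ { report(); name = substr($0, 2); len = 0; n = 0; next }
        {
            len += length($0)       # count bases before gsub edits the line
            n += gsub(/[Nn]/, "")   # gsub returns how many N/n characters it replaced
        }
        END { report() }'
}

# Example with a made-up record: 4 of 10 bases are N.
printf '>geneX\nACGTNN\nnnAC\n' | fasta_n_fraction
```

Run against the real output it would be `fasta_n_fraction < OUTPUT-GENE-SEQUENCE.fa`. A fraction close to 1 would suggest the gene is barely covered by reads in this sample, while N runs confined to parts of the gene are expected where introns lie.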
I simply followed these steps to get the gene sequence. I want to know how to fetch the desired gene sequence from the RNA-seq data of my sample.
Hi. What reference sequence should be used for mapping RNA-seq reads? Should it be cDNA or genomic sequence?
Both can potentially be used. If you are not interested in discovery of novel transcripts you could map against the transcriptome (and if you are using salmon or kallisto you would go this route). The general recommendation is to align against the genome and then use a GTF (gene model file) to count reads that fall within the boundaries defined in that file.
Where can I get a GTF file?
Looks like you have a GFF3 file. That can work.
Do I have to generate it with a tool, or download it from Ensembl?