Question: Fetching gene sequences from BAM or FASTQ files of RNA-seq data
8 weeks ago by
fatimarasool13520 wrote:

I have 6 wheat samples sequenced using RNA-seq. I received forward and reverse FASTQ files and generated BAM files with HISAT2, aligned to the wheat reference genome. I have been asked to build a multiple sequence alignment for 3 genes from this RNA-seq data. I believe I need to extract each gene's sequence from all the samples and then align them, but I am stuck on fetching the gene sequences from the BAM files. How do I extract the sequence of one gene for all 6 samples? Any suggestions? Kindly send me commands for fetching gene sequences from this RNA-seq data, from either the FASTQ files or the aligned BAM files.

modified 7 weeks ago • written 8 weeks ago by fatimarasool13520

Are you looking for isoforms? Read about transcript assembly with StringTie, or about isoform detection. Once the transcripts are detected or assembled, you can extract their sequences and align them.
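As a starting point for extracting anything per gene, the gene's coordinates can be looked up in the annotation. The following is only a minimal sketch, assuming a tab-separated GTF with `gene` features in column 3 and a `gene_id` attribute in column 9; the function name `gene_region` is hypothetical and not part of any tool.

```shell
# Print the "chrom:start-end" region of one gene from a GTF file.
# Assumption: the GTF is tab-separated and the gene has a "gene" feature line.
# Usage: gene_region annotation.gtf GENE_ID
gene_region() {
    awk -F'\t' -v id="$2" '
        # skip header/comment lines, keep only the "gene" line of the requested ID
        !/^#/ && $3 == "gene" && index($9, "gene_id \"" id "\";") {
            print $1 ":" $4 "-" $5
            exit
        }' "$1"
}
```

The printed region string can then be handed to tools that accept regions, for example `samtools view -b G1_sorted.bam "$region"` on an indexed BAM to pull out the reads over that gene, or `samtools faidx genome.fa "$region"` to get the reference sequence (`genome.fa` is an assumed file name). Neither of these yields a per-sample gene sequence by itself; that still requires a consensus or transcript assembly step.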

modified 8 weeks ago • written 8 weeks ago by Buffo1.5k
7 weeks ago by
fatimarasool13520 wrote:

Hi Buffo, thanks. I ran StringTie and got the error below, which I do not understand and could not solve. Kindly check it and adjust the command for my sample.

./stringtie G1_sorted.bam -B -o G1.gtf -G Triticum_aestivum.IWGSC.42.gtf -p 4 -C G1.refs.gtf -A G1.abund.tab

WARNING: no reference transcripts were found for the genomic sequences where reads were mapped! Please make sure the -G annotation file uses the same naming convention for the genome sequences.

written 7 weeks ago by fatimarasool13520

If you use -C

-C output a file with reference transcripts that are covered by reads

and get

WARNING: no reference transcripts were found for the genomic sequences where reads were mapped!

it literally means that your reads have mapped to regions without annotated transcripts. Be sure you are using the correct annotation: column 3 of the GTF must contain "transcript" features and, as the rest of the warning says, the sequence names in the annotation must match those of the genome the reads were aligned to.
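The naming convention mentioned in the warning can be checked by comparing the sequence names in the BAM header with those in column 1 of the GTF. This is a rough sketch, assuming the header has first been written to a text file with `samtools view -H G1_sorted.bam > G1.header.sam` (`G1.header.sam` is a hypothetical file name); the function names are illustrative only.

```shell
# Sequence names declared in a SAM/BAM header (@SQ lines, SN: tag).
sam_seqnames() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | sort -u
}

# Sequence names used in column 1 of a tab-separated GTF (header lines skipped).
gtf_seqnames() {
    awk -F'\t' '!/^#/ && NF >= 9 { print $1 }' "$1" | sort -u
}

# Names present in BOTH files; runs in a subshell so its variables stay local.
# Usage: shared_seqnames header.sam annotation.gtf
shared_seqnames() (
    a=$(mktemp)
    b=$(mktemp)
    sam_seqnames "$1" > "$a"
    gtf_seqnames "$2" > "$b"
    comm -12 "$a" "$b"      # lines common to both sorted lists
    rm -f "$a" "$b"
)
```

If `shared_seqnames G1.header.sam Triticum_aestivum.IWGSC.42.gtf` prints nothing, the genome used for alignment and the annotation do not share a single sequence name (for instance `chr3B` versus `3B`), which would be consistent with the warning above.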

written 7 weeks ago by Buffo1.5k

I have now run the command without the -C option and got the same error again. Please suggest a solution.

./stringtie G1_sorted.bam -G Triticum_aestivum.IWGSC.42.gtf -l G1-Label -o G1_ST.gtf -p 15

WARNING: no reference transcripts were found for the genomic sequences where reads were mapped! Please make sure the -G annotation file uses the same naming convention for the genome sequences.

written 7 weeks ago by fatimarasool13520

Here is the GTF file:

#!genome-build IWGSC
#!genome-version IWGSC
#!genome-date 2018-07
#!genome-build-accession GCA_900519105.1
3B IWGSC gene 212892 214491 . - . gene_id "TraesCS3B02G000100"; gene_source "IWGSC"; gene_biotype "protein_coding";
3B IWGSC transcript 212892 214491 . - . gene_id "TraesCS3B02G000100"; transcript_id "TraesCS3B02G000100.1"; gene_source "IWGSC"; gene_biotype "protein_coding"; transcript_source "IWGSC"; transcript_biotype "protein_coding";
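To see which feature types an annotation actually carries in column 3, the column can be tallied. This is a small sketch, assuming the file on disk is tab-separated as GTF files normally are (the excerpt above shows spaces only because of how it was pasted); `gtf_feature_counts` is a hypothetical helper name.

```shell
# Count how often each feature type (column 3) occurs in a GTF file.
# Usage: gtf_feature_counts annotation.gtf
gtf_feature_counts() {
    awk -F'\t' '
        !/^#/ && NF >= 9 { n[$3]++ }                 # skip header/comment lines
        END { for (f in n) print f "\t" n[f] }       # feature<TAB>count
    ' "$1" | sort
}
```

For an annotation like the excerpt, `gtf_feature_counts Triticum_aestivum.IWGSC.42.gtf` should list both `gene` and `transcript` among the feature types. If `transcript` is present, missing transcript features are not the cause of the StringTie warning.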

modified 7 weeks ago • written 7 weeks ago by fatimarasool13520

Before performing any bioinformatic analysis, I would recommend that you ask yourself:

What exactly am I looking for?
Is it possible to get it by in silico analysis?
What tools are available to do it, and how do they work?
Do I need professional help?

Sorry, but it looks like you do not have a clear idea of what you are doing; read about the GTF file format. When you run StringTie with an annotation, it searches for transcripts (column 3). Therefore, if you do not have annotated transcripts, or your reads map to unannotated regions, you will get a warning like:

WARNING: no reference transcripts were found for the genomic sequences where reads were mapped! Please make sure the -G annotation file uses the same naming convention for the genome sequences.
modified 6 weeks ago • written 6 weeks ago by Buffo1.5k