Question: How can I view and manage Trinity.fasta with bash?
3 months ago by
kamoltip.lao0 wrote:

Hi everyone,

I'm a beginner and am learning sequence analysis by practicing. Could you suggest how I can view and manage a transcriptome assembly from Trinity (Trinity.fasta) using bash?

$ less Trinity.fasta 
>TRINITY_DN30765_c0_g1_i1 len=228 path=[0:0-227]
TGGCGAAGTTTAGAGCACGGTGTTATCGGTGCTAAAGCAGGTTTTATGGGTAGCATAGCTAAATCGCATAAATATATGCTACACATATGGCTTTCCATTGCCAACAACATGCTTACCACCTGCACTGGGTTCCACAGAAACTGGGTTCCACAGCTGTGTATTGGTCCAAGTGTTGT
AACATGCTTACCACCTGCACTGGGTTCCACAGAAATATGCAGTGTTATCTCTTTACATGCTTTCTGTGTATTGTGCGCGTTC
>TRINITY_DN30719_c0_g1_i1 len=202 path=[0:0-201]
CGCCGATATAAAAGATGGAGCACCCTGTATATGTATATACGTTCATGTCTTAATACAACTGTTGTTGTATACTTATATAAATACAAATCTGTTAATTCGTGGAATAGCAATTTACCACCCATGAATAAAGTGAATTGTTCTCAGTACCTTTGAAATACGTTTAAGTAATGTAATTTACATAA
>TRINITY_DN30753_c0_g1_i1 len=336 path=[0:0-335]
AAAAAATTAGCTTTATTTTTACTTTATGGTAATAGCTTTGGTGAAATATCGAAATTTTGACTTGAATTGTACCTATCAAACCATCTGAAATCGTACATTAGTACACAAGCAAATCATTTAGACTCTTTCGTCTATCTTCGGAACAAAAACTACCACTGCTTATATGTTTGGTTTTTAATGACGTGCTGGGACCATGTAATAAGGAGTT

I tried to use 'grep' to search for the transcript that contains the sequence 'TTTGGTGAAATATCGAAA'.

grep "TTTGGTGAAATATCGAAA" Trinity.fasta
AAAAAATTAGCTTTATTTTTACTTTATGGTAATAGCTTTGGTGAAATATCGAAATTTTGACTTGAATTGTACCTATCAAACCATCTGAAATCGTACATTAGTACACAAGCAAATCATTTAGACTCTTTCGTCTATCTTCGGAACAAAAACTACCACTGCTTATATGTTTGGTTTTTAATGACGTGCTGGGACCATGTAATAAGGAGTT

But the output does not include the transcript ID. How can I view the transcript name and the sequence at the same time?

Or, if I already know the transcript ID, how can I view its sequence?

Thank you so much


try using tools like seqkit:

written 3 months ago by cpad011211k
3 months ago by
lieven.sterck4.5k (VIB, Ghent, Belgium) wrote:
grep -B1 "TTTGGTGAAATATCGAAA" Trinity.fasta

will do the trick, and

grep -A1 "transcriptID" Trinity.fasta

gets you the reverse: the ID plus its sequence.

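Depending on your goal, this is likely not the best approach for analysing or managing FASTA files: -B1 and -A1 only show one neighbouring line, so they assume each sequence sits on a single line, and if the pattern you are looking for is split over two lines you will miss it altogether.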
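
To illustrate the caveat about patterns split over two lines, here is a minimal sketch (not part of the original answer) that first joins wrapped sequence lines with awk and then looks records up by motif or by ID. The file name demo.fasta and the function names linearize, find_motif and find_id are made up for this example; swap in Trinity.fasta and your own motif or transcript ID.

```shell
#!/usr/bin/env bash
# Hypothetical demo file: the first record is wrapped over two lines.
cat > demo.fasta <<'EOF'
>seq1 len=12
ACGTAC
GTTTGG
>seq2 len=8
CCCCAAAA
EOF

# Join wrapped sequence lines so every record is: header line, one sequence line.
linearize() {   # usage: linearize FASTA
    awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
         { seq = seq $0 }
         END { if (seq != "") print seq }' "$1"
}

# Print header + sequence of every record whose sequence contains MOTIF.
# index() does a plain substring match, and header lines are never searched.
find_motif() {  # usage: find_motif MOTIF FASTA
    linearize "$2" | awk -v motif="$1" '
        /^>/ { header = $0; next }
        index($0, motif) { print header; print }'
}

# Print the record whose ID (first word after ">") is exactly ID.
find_id() {     # usage: find_id ID FASTA
    linearize "$2" | awk -v id="$1" '
        /^>/ { keep = (substr($1, 2) == id) }
        keep'
}

# TACGTT spans the line break in seq1, so a plain
# "grep TACGTT demo.fasta" finds nothing, while this does:
find_motif TACGTT demo.fasta
find_id seq2 demo.fasta
```

On the demo file, the first call prints the seq1 header followed by the joined sequence ACGTACGTTTGG, and the second prints the seq2 record. Matching the ID exactly, rather than with a bare grep, avoids also picking up IDs that merely start with the same text.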


Thank you so much, I got it.

written 3 months ago by kamoltip.lao0