Question: transcript and fasta sequence
2.6 years ago by
qudrat70
NATIONAL INSTITUTE OF IMMUNOLOGY, INDIA
qudrat70 wrote:

Hello all! I have two files containing tens of thousands of transcript FASTA sequences, each with different IDs, and I am interested in finding the sequences common to both files. Somebody please help me, as it is hindering my work. Thank you in advance.

sequence assembly • 1.0k views
modified 2.6 years ago by glihm620 • written 2.6 years ago by qudrat70

Somebody please help me as it is hindering my work.

In what way?

How about this tool that merges assemblies? See more options in the thread "High quality de novo transcriptome assembly rely on merging multiple assembly?". Specifically, dedupe.sh from BBMap should be very simple to use.

modified 2.6 years ago • written 2.6 years ago by genomax83k

ten thousands of transcripts fasta sequences each with different ids

Can you post an example? Is this a de novo assembly?

written 2.6 years ago by st.ph.n2.5k

Actually, this is a de novo assembly produced with two different software packages to minimize false positives.

written 2.6 years ago by qudrat70
2.6 years ago by
glihm620
France
glihm620 wrote:

Hello qudrat,

  1. If you are interested in IDENTICAL sequences, you can simply write a very short script to extract identical sequences in both files.

  2. If you want to apply a "similarity" score, I strongly suggest using alignment tools (BLAST or MUSCLE, for instance) and then parsing the results to get a global overview of the similarity between sequences from your two files.

  3. EDIT, following @genomax's comment: use an assembly-merging tool.
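
For option 1, a short script could look something like the sketch below. It is only an illustration under a few assumptions: "identical" means an exact match of the whole sequence after upper-casing, reverse complements are not considered, and the ID is taken as the first word of each header line. The function names `read_fasta` and `common_sequences` are made up for this example.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (id, sequence) pairs from an iterable of FASTA-formatted lines.

    The id is assumed to be the first whitespace-separated word of the header.
    """
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # An empty header (">" alone) gets an empty id instead of failing.
            seq_id, chunks = (line[1:].split() or [""])[0], []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def common_sequences(lines_a, lines_b):
    """Return {sequence: (ids_in_a, ids_in_b)} for sequences present in both inputs."""
    # Index the first file by sequence so that lookups are fast.
    ids_a = defaultdict(list)
    for seq_id, seq in read_fasta(lines_a):
        ids_a[seq.upper()].append(seq_id)

    shared = {}
    for seq_id, seq in read_fasta(lines_b):
        key = seq.upper()
        if key in ids_a:
            shared.setdefault(key, (ids_a[key], []))[1].append(seq_id)
    return shared


# Hypothetical usage with two files (the file names are placeholders):
#
#   with open("assembly_a.fasta") as fa, open("assembly_b.fasta") as fb:
#       for seq, (ids_a, ids_b) in common_sequences(fa, fb).items():
#           print(",".join(ids_a), ",".join(ids_b), len(seq), sep="\t")
```

Note that this only finds exact duplicates. Transcripts from two different assemblers often differ by a few bases or in length, in which case option 2 or 3 is the better fit.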

modified 2.6 years ago • written 2.6 years ago by glihm620

Hi glihm, actually I was thinking of sort, but I do not know how to write scripts. This is a de novo assembly made with two different software packages, and I am doing this to minimize false positives.

written 2.6 years ago by qudrat70

Your request is now clearer. In this case, @genomax's answer, which relies on assembly merging, is well suited to your problem.

written 2.6 years ago by glihm620