Match trimmed sequences with full length reference set

9.0 years ago

pmr • 0

I have a set of sequences that have been trimmed so that they consist only of the ITS2 subregion. Is there a way to match this set against the untrimmed dataset and retrieve the full-length read for every sequence in the trimmed dataset?

sequence dna • 1.5k views

updated 14 months ago by Ram 43k • written 9.0 years ago by pmr • 0

Are these FASTA sequences? Are the headers in the trimmed dataset the same as in the untrimmed dataset? If so, you can use those headers to retrieve the original sequences. Otherwise, you can "grep" the trimmed sequences against the untrimmed file to identify the original sequences. Note that this will also return extra sequences whenever a trimmed sequence has more than one match in the untrimmed file.
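As a rough illustration of the two options above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names (`parse_fasta`, `match_by_header`, `match_by_sequence`) are hypothetical, the toy records are invented, headers are compared as the full text after `>`, and sequences are compared as exact, case-sensitive substrings. Unlike a plain line-based grep, the parser joins wrapped sequence lines before matching.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def match_by_header(trimmed: List[Record], full: List[Record]) -> List[Record]:
    """Return full-length records whose header also occurs in the trimmed set."""
    wanted = {header for header, _ in trimmed}
    return [(header, seq) for header, seq in full if header in wanted]


def match_by_sequence(trimmed: List[Record], full: List[Record]) -> Dict[str, List[str]]:
    """Map each trimmed header to the headers of all full-length reads
    that contain the trimmed sequence as an exact substring."""
    return {
        t_header: [f_header for f_header, f_seq in full if t_seq in f_seq]
        for t_header, t_seq in trimmed
    }


# Toy data (invented for illustration only).
full_fasta = """>read1
AAACCCGGGTTT
ACGT
>read2
TTTTGGGGCCCC
>read3
GGGTTTACGTAA
"""

trimmed_fasta = """>read1
GGGTTTACGT
>read3
GGGTTTACGT
"""

full_records = list(parse_fasta(full_fasta.splitlines()))
trimmed_records = list(parse_fasta(trimmed_fasta.splitlines()))

# Option 1: headers are identical in both files.
for header, seq in match_by_header(trimmed_records, full_records):
    print(f">{header}\n{seq}")

# Option 2: substring search; one trimmed sequence may hit several reads.
print(match_by_sequence(trimmed_records, full_records))
```

With real data, the same functions should work on open file handles, for example `list(parse_fasta(open(path)))`, where `path` is a placeholder for your own file. In the toy data both trimmed sequences are identical, so the substring search reports two full-length reads for each of them, which is the "extra sequences" caveat mentioned above; the header lookup does not have that ambiguity.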

updated 14 months ago by Ram 43k • written 9.0 years ago by Ashutosh Pandey 12k

Yes, they are in FASTA format and the headers are the same. How would I go about doing that?

written 9.0 years ago by pmr • 0

Please search for similar posts on Biostars using the search feature. Similar posts include:

updated 14 months ago by Ram 43k • written 9.0 years ago by Ashutosh Pandey 12k
