7.9 years ago
pmr
I have a set of sequences that have been trimmed so that they consist only of the ITS2 subregion. Is there a way to match this set against the untrimmed dataset and retrieve the full-length read for every sequence in the trimmed dataset?
Are these FASTA sequences? Are the headers in the trimmed dataset the same as in the untrimmed dataset? If so, you can use the headers to pull the original sequences out of the untrimmed file. Otherwise, you can "grep" for the trimmed sequences themselves to identify the original sequences in the untrimmed file. Note that this will also return extra sequences whenever a trimmed sequence has more than one match in the untrimmed file.
Yes, they are in FASTA format and the headers are the same. How would I go about doing that?
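Since the headers match, one way to do this is to collect the IDs from the trimmed file and keep only the untrimmed records with those IDs. Below is a minimal sketch in plain Python, not a definitive implementation. The function names (`read_fasta`, `seq_id`, `full_length_for`) are made up for illustration, and it assumes that the first whitespace-delimited word of each header is the read ID and that this ID is identical in both files.

```python
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def seq_id(header):
    """Assumption: the ID is the first whitespace-delimited word of the header."""
    parts = header.split()
    return parts[0] if parts else ""


def full_length_for(trimmed_lines, untrimmed_lines):
    """Return the untrimmed records whose ID also occurs in the trimmed set."""
    wanted = {seq_id(h) for h, _ in read_fasta(trimmed_lines)}
    return [(h, s) for h, s in read_fasta(untrimmed_lines) if seq_id(h) in wanted]


# Hypothetical usage: python get_full_length.py trimmed.fasta untrimmed.fasta
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as trimmed, open(sys.argv[2]) as untrimmed:
        for header, seq in full_length_for(trimmed, untrimmed):
            print(f">{header}\n{seq}")
```

The records are written in the order they appear in the untrimmed file, with each sequence on a single line. If the trimming tool changed the headers (for example by appending coordinates), `seq_id` would need to be adjusted accordingly.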
Please search for similar posts on Biostars using the search feature. Similar posts include: