Question

How to remove redundant and poor-quality ESTs?

0

9.4 years ago

dinesh ▴ 50

Hi community, how can I remove redundant and poor-quality ESTs from a whole data set online?

genome blast alignment • 2.2k views

updated 2.2 years ago by Ram 43k • written 9.4 years ago by dinesh ▴ 50

Ram · Answer 1 · 2014-11-17

1

9.4 years ago

Prakki Rama ★ 2.7k

These are not online tools, but they may be helpful.

Redundant sequences

Check this Biostars post to remove exact duplicate sequences.
If you want to remove sequences based on a similarity cutoff, you can try cd-hit-est, UCLUST, etc.
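
For the exact-duplicate case, a minimal Python sketch could look like the one below. It is not the method from the linked post, only an illustration: the function names and the example records are made up, sequences are compared case-insensitively, and reverse complements are not treated as duplicates.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(records):
    """Keep the first record seen for each distinct sequence."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()  # treat 'acgt' and 'ACGT' as the same sequence
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical input: est2 is an exact (lower-case) copy of est1.
example = """>est1
ACGTACGT
>est2
acgtacgt
>est3
ACGTTTTT
""".splitlines()

for header, seq in drop_exact_duplicates(read_fasta(example)):
    print(f">{header}\n{seq}")
```

For a real file, pass an open file handle to read_fasta instead of the example list.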

Poor quality sequences

If you have reads, you can map them with a mapping tool such as Bowtie or BWA and check whether each sequence has sufficient coverage. Sequences with insufficient coverage are likely to be of poor quality.
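
Once the reads are mapped, the coverage check could be sketched roughly as follows. This assumes a three-column, tab-separated per-position depth table (sequence name, position, depth), which is the layout `samtools depth` writes for a single BAM file. The function names, the example numbers, and the cutoff of 5 are hypothetical; choose a cutoff that suits your data.

```python
from collections import defaultdict


def mean_coverage(depth_lines, lengths):
    """Return the mean depth per sequence.

    depth_lines: lines of 'name<TAB>position<TAB>depth'.
    lengths: dict of sequence name -> sequence length.
    Positions missing from depth_lines count as zero depth, which is
    why the sequence lengths are needed.
    """
    totals = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        name, _position, depth = line.split("\t")
        totals[name] += int(depth)
    return {name: totals[name] / length for name, length in lengths.items()}


def low_coverage(coverage, min_mean_depth):
    """List the sequences whose mean depth is below the cutoff."""
    return sorted(name for name, cov in coverage.items() if cov < min_mean_depth)


# Hypothetical data: est3 has no mapped reads at all.
lengths = {"est1": 4, "est2": 4, "est3": 4}
depth_lines = [
    "est1\t1\t10", "est1\t2\t12", "est1\t3\t11", "est1\t4\t9",
    "est2\t1\t1", "est2\t2\t1",
]

coverage = mean_coverage(depth_lines, lengths)
print(low_coverage(coverage, min_mean_depth=5))
```

Low coverage can also reflect low expression rather than a bad sequence, so treat the resulting list as candidates to inspect, not as a final verdict.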

updated 2.2 years ago by Ram 43k • written 9.4 years ago by Prakki Rama ★ 2.7k

0

1) Sir, are the default options in cd-hit-est enough to run the program, or do I have to change any values? Please advise.

2) I found an online tool called EGassembler. Can I use it for my work?

9.4 years ago by dinesh ▴ 50

0

It depends on your requirements. Suppose, for example, that you want to collapse shorter sequences covering 80% of the longer sequence with 70% identity: you would need to change the parameters accordingly. Look at the -aS and -c options in cd-hit. I have not used EGassembler before, so I cannot comment on it.
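
As a rough sketch of how those two options fit into a command line, the Python snippet below only builds the cd-hit-est call and prints it. In the CD-HIT documentation, -c is the sequence identity threshold and -aS is the alignment coverage for the shorter sequence (the matching option for the longer sequence is -aL). The file names and the helper function are hypothetical. The identity value 0.8 is an illustrative choice: as far as I know, cd-hit-est ties the word size (-n) to the identity threshold and may not accept thresholds as low as 0.70, so check the user guide before lowering -c.

```python
import shlex


def build_cd_hit_est_command(fasta_in, fasta_out, identity, short_coverage):
    """Build a cd-hit-est argument list with -c and -aS set explicitly."""
    for label, value in (("identity", identity), ("short_coverage", short_coverage)):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{label} must be a fraction in (0, 1], got {value}")
    return [
        "cd-hit-est",
        "-i", fasta_in,            # input FASTA
        "-o", fasta_out,           # output FASTA of cluster representatives
        "-c", str(identity),       # sequence identity threshold
        "-aS", str(short_coverage),  # alignment coverage for the shorter sequence
    ]


# Hypothetical file names; values are examples, not recommendations.
cmd = build_cd_hit_est_command("ests.fasta", "ests_nr.fasta", 0.8, 0.8)
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True) on a machine where cd-hit-est is installed. Depending on the threshold you choose, you may also need to add a suitable -n value.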

updated 2.2 years ago by Ram 43k • written 9.4 years ago by Prakki Rama ★ 2.7k