Question: How to remove redundant and poor quality ESTs
dinesh (India) wrote, 4.3 years ago:

Hi community, how can I remove redundant and poor-quality ESTs from a whole data set using online tools?

blast alignment genome • 1.3k views
modified 4.3 years ago by Devon Ryan • written 4.3 years ago by dinesh
Prakki Rama (Singapore) wrote, 4.3 years ago:

These tools are not online, but they may be helpful.

Redundant sequences

1) To remove exact duplicate sequences, see the Biostar thread "How To Remove The Same Sequences In The Fasta Files?".

2) If you want to remove sequences based on a similarity cutoff, you can try cd-hit-est, UCLUST, etc.
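
As a rough illustration of point 1, exact duplicates can be dropped with a few lines of plain Python. This is only a sketch, not the method from the linked thread: the function names `read_fasta` and `drop_exact_duplicates` and the three example records are made up for illustration, sequences are compared case-insensitively, and reverse complements are not treated as duplicates.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_exact_duplicates(
    records: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Keep the first record seen for each distinct sequence."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()  # compare case-insensitively
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical example data: est2 is the same sequence as est1.
example = """>est1
ACGTACGT
>est2
acgtacgt
>est3
ACGTTTTT
""".splitlines()

unique_records = drop_exact_duplicates(read_fasta(example))
for header, seq in unique_records:
    print(f">{header}\n{seq}")
```

For a real file, the same functions can be fed an open file handle instead of the example lines.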

Poor quality sequences

1) If you have reads, you can map them to the sequences with a mapping tool such as Bowtie or BWA and check whether each sequence has sufficient coverage. Sequences with insufficient coverage are likely to be of poor quality.
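
The coverage check could look something like the sketch below. It assumes that, after mapping, you have exported a tab-separated per-position depth table with three columns (sequence name, position, depth), which is the layout a tool such as `samtools depth` writes; that tool is my assumption and is not mentioned above. The sequence lengths, the example depths, and the cutoff of 5.0 are all hypothetical, and a sensible cutoff depends on your data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def mean_depth_per_sequence(
    depth_lines: Iterable[str], lengths: Dict[str, int]
) -> Dict[str, float]:
    """Average depth per sequence; positions missing from the table count as 0."""
    totals: Dict[str, int] = defaultdict(int)
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        name, depth = fields[0], fields[2]
        totals[name] += int(depth)
    # Divide by the full sequence length so uncovered positions lower the mean.
    return {
        name: totals.get(name, 0) / length
        for name, length in lengths.items()
        if length > 0
    }


def low_coverage(mean_depths: Dict[str, float], min_mean_depth: float) -> List[str]:
    """Names of sequences whose mean depth is below the cutoff."""
    return sorted(n for n, d in mean_depths.items() if d < min_mean_depth)


# Hypothetical example: est1 is well covered, est2 is only partly covered.
lengths = {"est1": 4, "est2": 4}
depth_table = [
    "est1\t1\t10", "est1\t2\t10", "est1\t3\t10", "est1\t4\t10",
    "est2\t1\t2", "est2\t2\t2",
]

means = mean_depth_per_sequence(depth_table, lengths)
flagged = low_coverage(means, min_mean_depth=5.0)
print(means)
print(flagged)
```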

 


1) Sir, are the default options in cd-hit-est enough to run the programme, or do I have to change the values? Please suggest.

2) I found the online software EGassembler. Can I use it for my work?

written 4.3 years ago by dinesh

It depends on your requirements. Suppose you want to collapse a shorter sequence into a longer one when the alignment covers 80% of the shorter sequence at 70% identity; then you need to change the parameters accordingly. Look at the options -aS (alignment coverage of the shorter sequence) and -c (sequence identity threshold) in cd-hit. I have not used EGassembler before, so I cannot comment on it.
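
To make the two options concrete, here is a small sketch that only assembles the command line and prints it; it does not run anything. The helper name `build_cd_hit_est_command` and the file names are hypothetical. The numbers are the 80% and 70% from the example above, used verbatim; as far as I recall, cd-hit-est documents a lower limit for -c and ties the word size (-n) to the chosen threshold, so check the user's guide before running with an identity this low.

```python
import shlex
from typing import List


def build_cd_hit_est_command(
    input_fasta: str, output_fasta: str, identity: float, short_coverage: float
) -> List[str]:
    """Assemble a cd-hit-est argument list using -c and -aS."""
    for name, value in (("identity", identity), ("short_coverage", short_coverage)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    return [
        "cd-hit-est",
        "-i", input_fasta,              # input FASTA file
        "-o", output_fasta,             # output file of representative sequences
        "-c", f"{identity:.2f}",        # sequence identity threshold
        "-aS", f"{short_coverage:.2f}", # alignment coverage of the shorter sequence
    ]


# Hypothetical file names; thresholds taken from the example in the reply.
command = build_cd_hit_est_command("ests.fasta", "ests_nr.fasta", 0.70, 0.80)
print(shlex.join(command))
```

The list form can be passed to `subprocess.run` once you are happy with the parameters.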

written 4.3 years ago by Prakki Rama