I need a little help working with genomic DNA sequences. I would like to trim the reads (.fastq files) up to the first 'TA' position, and then remove all reads shorter than 40 bp. It would be helpful if someone could share some useful commands for this. Thanks very much!
For example:
@K00302:80:HLTWCBBXX:3:1101:3478:1402 1:N:0:NTAGGC
GGCGATGCGGCGGCGTTATTCCCATGACCCGCCGGGCAGCTTCCGGGAAACCAAAGTCTTTGGGTTCCGGGGGGAGTATGGTTGCAAAGCTGAAACAAAAA
+
FAAA<JAA7AAFJ<AJJ-AJAFAFJFJ<-A<7<AAA-7AA-A7<F7-AJF7AA-77FFAJFFFFFJA<JAFJJ-A77AJFJFJF7F<FA7FJ<JJ7<J-A-
@K00302:80:HLTWCBBXX:3:1101:2199:1402 1:N:0:NTAGGC
GATAAATGCATTGTCCACTAAGAAGTTCTGAGCTGGAAAAAAAAAAAAAAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCTTCACGTATTCCGT
+
JAFFFJF-<FFF-<-A--FFF-7JJFJJ--<A<<J-7FFFJJFFJF<JF7A7<--77J-AA-7AA-AAJ-7FFFA7<-7-7--7J--<---<)---7-7-----)7<
** that were not part of the sequence. @genomax
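One possible way to do this is a short Python script. This is only a minimal sketch under a few assumptions that the question does not settle: "trim until the first 'TA'" is read as removing every base before the first 'TA' and keeping the read from that 'TA' onward; reads containing no 'TA' at all are discarded; the input is an uncompressed, single-end FASTQ file with exactly four lines per record. The names `trim_record`, `read_fastq`, `MOTIF` and `MIN_LEN` are made up for illustration. The quality string is cut at the same position as the sequence so the two stay the same length.

```python
import sys

MOTIF = "TA"   # trim everything before the first occurrence of this motif
MIN_LEN = 40   # reads shorter than this after trimming are dropped


def trim_record(header, seq, plus, qual, motif=MOTIF, min_len=MIN_LEN):
    """Return the trimmed record as a 4-tuple, or None if it is discarded."""
    pos = seq.find(motif)
    if pos == -1:
        return None  # assumption: reads without the motif are discarded
    # Cut sequence and quality at the same offset so they stay in sync.
    seq, qual = seq[pos:], qual[pos:]
    if len(seq) < min_len:
        return None
    return header, seq, plus, qual


def read_fastq(handle):
    """Yield (header, seq, plus, qual) from a 4-line-per-record FASTQ file."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def main(in_path, out_path):
    with open(in_path) as fin, open(out_path, "w") as fout:
        for rec in read_fastq(fin):
            trimmed = trim_record(*rec)
            if trimmed is not None:
                fout.write("\n".join(trimmed) + "\n")


# Only runs when called as: python trim_ta.py input.fastq output.fastq
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

If the intent is instead to keep only the part of the read before the first 'TA', the slices would become `seq[:pos]` and `qual[:pos]`. For paired-end data, both mates would need to be filtered together so the two files stay in sync, which this sketch does not handle.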