I have Illumina MiSeq paired-end (2x250) data.
I got the following info from the sequencing (some irrelevant or private data omitted):
[Header]
IEMFileVersion,4
Investigator Name,xxxx
Experiment Name,xxxxxx
Date,x/x/xxxx
Workflow,Assembly
Application,Assembly
Assay,TruSeq HT
Description,
Chemistry,Amplicon
[Reads]
301
301
[Settings]
ReverseComplement,0
kmer,31
Adapter,AGATCGGAAGAGCACACGTCTGAACTCCAGTCA
AdapterRead2,AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
[Data]
Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,I5_Index_ID,index2,GenomeFolder,Sample_Project,Description
1,AB1,,,D701,ATTACTCG,D501,TATAGCCT,,,
2,AB2,,,D701,ATTACTCG,D502,ATAGAGGC,,,
3,AB3,,,D701,ATTACTCG,D503,CCTATCCT,,,
4,AB4,,,D701,ATTACTCG,D504,GGCTCTGA,,,
5,AB5,,,D701,ATTACTCG,D505,AGGCGAAG,,,
6,AB6,,,D701,ATTACTCG,D506,TAATCTTA,,,
7,AB7,,,D701,ATTACTCG,D507,CAGGACGT,,,
I want to trim with Trimmomatic and then assemble with SPAdes (and maybe other assemblers). I do not have the data yet, so I do not know whether the adapters and indexes have already been trimmed away. Let's assume they are still there.
Now some questions:
1. Is it necessary to add the indexes to Trimmomatic's adapter file (so that they get removed as well)? Or do such short sequences generally not interfere with assembly? Or could Trimmomatic even report "false positives" because those sequences are so short?
2. Even if the adapters and indexes have already been trimmed away, would it still be necessary to provide the sequences because of read-through? (For example, R1 reading through into the adapter of R2.)
3. How do I have to configure Trimmomatic's adapter file? For which sequences do I have to provide the reverse complement? And what about the /1 and /2 suffixes in Trimmomatic's adapter file? I did not really understand them.
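
To make question 3 concrete, here is a rough Python sketch of how I would currently draft the adapter file. It takes the two adapter sequences from the sample sheet above, computes their reverse complements, and prints FASTA records. The record names are only my guess at a layout: as far as I understand the Trimmomatic manual, names ending in /1 are checked only against forward reads and names ending in /2 only against reverse reads, but whether this is the right combination for my data is exactly what I am unsure about. The function and record names (`reverse_complement`, `build_adapter_fasta`, `Adapter_R1`, ...) are hypothetical and do not come from Trimmomatic.

```python
# Draft adapter FASTA for Trimmomatic (unverified layout, for discussion only).

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Adapter sequences copied from the [Settings] section of the sample sheet.
ADAPTER_READ1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"  # "Adapter"
ADAPTER_READ2 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT"  # "AdapterRead2"


def build_adapter_fasta() -> str:
    """Build FASTA text with each adapter and its reverse complement."""
    records = [
        # Assumption: /1 restricts matching to forward reads, /2 to reverse reads.
        ("Adapter_R1/1", ADAPTER_READ1),
        ("Adapter_R2/2", ADAPTER_READ2),
        # Assumption: names without a suffix are checked against both reads.
        ("Adapter_R1_rc", reverse_complement(ADAPTER_READ1)),
        ("Adapter_R2_rc", reverse_complement(ADAPTER_READ2)),
    ]
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


if __name__ == "__main__":
    print(build_adapter_fasta(), end="")
```

Redirecting the output to a file would give something I could pass to the ILLUMINACLIP step, if this layout turns out to be sensible.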