Hello everyone,
I know this may be quite irritating for those who have good experience, but it will definitely be helpful for newbies.
I am looking for a better method for capturing chimeric reads from millions of Illumina reads. I have read a few papers but did not find anything conclusive. What I have so far is:
1st step: convert the SRA file to a FASTQ/FASTA file.
2nd step: map the FASTQ/FASTA file to the reference genome (which is the backbone of the further study). There are a number of freely available tools such as Bowtie 1, Bowtie 2, NovoAlign, etc., but I still don't know WHICH ONE IS BETTER FOR HI-C DATA, or which mapping parameters are suitable for capturing more chimeric reads from millions of reads [seeking experts' comments or suggestions].
3rd step: filter the chimeric reads that mapped properly to the reference genome. Here "properly mapped" means a chimeric read that maps to chromosome A at two different positions: for example, the first portion of the read (of length read length - N, where N could be any integer) maps at positions 5000-5030 and the remaining portion maps at positions 10000-10068 of chromosome A. How can I extract this type of mapping from the output .sam file [the syntax that extracts this type of information]?
4th step: visualize the mapping. There are a number of tools for visualizing mapping data.
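For the 3rd step, here is a rough sketch in Python of one possible way to pull such reads out of a .sam file. It is not a definitive method and rests on an assumption: the aligner must report split alignments with the `SA:Z:` optional tag (BWA-MEM does this, for example; an aligner that only soft-clips the read without writing `SA:Z:` will give no hits here). The function names `ref_span` and `same_chrom_splits` and the `min_gap` threshold are made up for illustration.

```python
import re
import sys

# CIGAR operations that consume reference bases (per the SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def ref_span(pos, cigar):
    """Return the 1-based inclusive (start, end) covered on the reference."""
    length = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                 if op in REF_CONSUMING)
    return pos, pos + length - 1


def same_chrom_splits(sam_lines, min_gap=1000):
    """Yield (read, chrom, span1, span2) for reads split on one chromosome.

    min_gap is a hypothetical threshold: the two pieces must be at least
    this many bases apart on the reference to be reported.
    """
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, cigar = int(fields[3]), fields[5]
        # Keep primary records only: skip unmapped (0x4),
        # secondary (0x100) and supplementary (0x800) lines.
        if flag & (0x4 | 0x100 | 0x800):
            continue
        # Assumption: the aligner wrote an SA:Z: tag for the other piece(s).
        sa = next((t[5:] for t in fields[11:] if t.startswith("SA:Z:")), None)
        if sa is None:
            continue
        primary = ref_span(pos, cigar)
        for entry in sa.rstrip(";").split(";"):
            sa_rname, sa_pos, _strand, sa_cigar, _mapq, _nm = entry.split(",")
            if sa_rname != rname:         # other piece is on another chromosome
                continue
            other = ref_span(int(sa_pos), sa_cigar)
            # Distance from the end of the left piece to the start of the right one.
            gap = max(primary[0], other[0]) - min(primary[1], other[1])
            if gap >= min_gap:
                yield qname, rname, primary, other


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for read, chrom, span1, span2 in same_chrom_splits(handle):
            print(read, chrom, *span1, *span2, sep="\t")
```

Saved under any file name, it would be run with the .sam file as its only argument, and it prints one tab-separated line per split read: read name, chromosome, then the start and end of each piece. For the example above, a 100 bp read with CIGAR `31M69S` at position 5000 and an `SA:Z:` entry of `chrA,10000,+,31S69M,...` would be reported with the spans 5000-5030 and 10000-10068.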
Your valuable comments are always welcome.