Hi there. I'm looking to extract the chromosome, the start coordinate of read 1, and the end coordinate of read 2 from paired-end NGS data into one BED file. I have over 7 million reads, and I am not sure that every read has a mate. I have used the bamtobed function of bedtools and sorted the output by read name (e.g. M01269....). Here is an example of what I need. For the following:
chrII 404128 404259 M01269:176:000000000-BW364:1:1101:10000:12221/1 60 -
chrII 404126 404251 M01269:176:000000000-BW364:1:1101:10000:12221/2 60 +
chrVII 350990 351120 M01269:176:000000000-BW364:1:1101:10000:24715/1 60 -
chrVII 350971 351093 M01269:176:000000000-BW364:1:1101:10000:24715/2 60 +
chrXII 527617 527747 M01269:176:000000000-BW364:1:1101:10000:26164/1 60 +
chrXII 527627 527753 M01269:176:000000000-BW364:1:1101:10000:26164/2 60 -
chrVII 826318 826449 M01269:176:000000000-BW364:1:1101:10000:8567/1 60 +
chrVII 826335 826461 M01269:176:000000000-BW364:1:1101:10000:8567/2 60 -
chrXII 880431 880562 M01269:176:000000000-BW364:1:1101:10001:14255/1 60 +
chrXII 880448 880574 M01269:176:000000000-BW364:1:1101:10001:14255/2 60 -
I need:
chrII 404128 404251
chrVII 350990 351093
chrXII 527617 527753
chrVII 826318 826461
chrXII 880431 880574
Any help would be much appreciated. Thanks.
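One possible way to do this, sketched in Python under a few assumptions: the input is the whitespace-separated output of bamtobed shown above, mates are identified by the `/1` and `/2` suffix on the read name, and only names that have both mates on the same chromosome should be reported. The function names `pair_bed_records` and `write_paired_bed` are hypothetical, chosen only for illustration.

```python
def pair_bed_records(lines):
    """Yield (chrom, read1_start, read2_end) for read names that have both mates.

    Assumes bamtobed-style lines: chrom, start, end, name, score, strand,
    where the name ends in /1 or /2.
    """
    mates = {}  # read name without the /1 or /2 suffix -> {"1": record, "2": record}
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end, name = fields[:4]
        base, sep, mate = name.rpartition("/")
        if not sep or mate not in ("1", "2"):
            continue  # name carries no mate suffix
        mates.setdefault(base, {})[mate] = (chrom, int(start), int(end))

    for pair in mates.values():
        # Report only complete pairs whose mates map to the same chromosome.
        if "1" in pair and "2" in pair and pair["1"][0] == pair["2"][0]:
            yield pair["1"][0], pair["1"][1], pair["2"][2]


def write_paired_bed(in_path, out_path):
    """Read a bamtobed file and write one tab-separated line per complete pair."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for chrom, start, end in pair_bed_records(src):
            dst.write(f"{chrom}\t{start}\t{end}\n")
```

Calling `write_paired_bed("reads.bed", "pairs.bed")` (both file names are placeholders) would write the three-column file; reads without a mate are silently dropped. The sketch keeps all records in memory, which should be manageable for a few million reads, and it does not depend on the sort order. Note that it takes the start of read 1 and the end of read 2 literally, as in the example; if the full fragment span is wanted instead, the minimum start and maximum end of the two mates would be the values to report. Alternatively, `bedtools bamtobed -bedpe` on a name-sorted BAM reports both mates on one line.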
Hello rjobmc,
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time. Thank you!
Hello. Could you please explain what the 60 and the - or + stand for?
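rania.hamdy1: the 60 is most likely the mapping quality (MAPQ), which bedtools bamtobed writes to the score column by default, rather than the read length. The + and - are the strand.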