Question: Matching a list of barcodes to specific bases of read 2 in a paired-end file
4 weeks ago, bibaswan.ghoshal wrote:

Hi,

I have a list of barcodes, each 8 nt long. I want to look for these barcodes at specific positions (bases 11-18, 47-54, and 89-96) of the read 2 sequences, allowing 1 mismatch in each region, then filter for the reads that have these barcodes and match them with their read 1.

I tried multiple programs, such as cutadapt and barcode_splitter, but they don't seem able to handle this. How can this be achieved?

Thanks for any help you can offer.

written 4 weeks ago by bibaswan.ghoshal
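
The per-read check described in the question can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions that the post does not state: the positions 11-18, 47-54 and 89-96 are taken as 1-based and inclusive, an N in the read counts as a mismatch, and the helper names (hamming, match_window, match_read) are made up for this example.

```python
# Windows from the question, assumed 1-based and inclusive.
WINDOWS = [(11, 18), (47, 54), (89, 96)]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def match_window(seq, start, end, barcodes, max_mismatch=1):
    """Return the first barcode within max_mismatch of seq[start..end], else None.

    start and end are 1-based inclusive, so the Python slice is seq[start-1:end].
    """
    window = seq[start - 1:end]
    if len(window) != end - start + 1:
        return None  # read is too short to cover this window
    for bc in barcodes:
        if len(bc) == len(window) and hamming(window, bc) <= max_mismatch:
            return bc
    return None


def match_read(seq, barcodes, windows=WINDOWS, max_mismatch=1):
    """Return one entry per window: the matched barcode, or None."""
    return [match_window(seq, s, e, barcodes, max_mismatch) for s, e in windows]
```

A read would then be kept if every entry returned by match_read is not None, or if any entry is not None, depending on which rule is intended.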

An example of the input and the desired output is needed.

written 4 weeks ago by Pierre Lindenbaum

Input files:

sample1_R1.fastq.gz:

@NS500310:191:H3WFYBGX9:1:11101:10722:1034 1:N:0:GATCAG
TATTANAGTTGCTGCAGTTAAAAAGCTCNTAGTTGGATCTTGGGAGCGGGCGGGCGGTCCGCCGCG
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEE#EEEEEEEEEEEEEEE/EEEEEEEEEEEEEEE<EAEE/
@NS500310:191:H3WFYBGX9:1:11101:9361:1035 1:N:0:GATCAG
GATTGNTTCTCAGTTGGACATGGTGGTGCAGGCCTGGTACTTGGAAGGTGTCCTAGGAGTCCTAGA
+
AAAAA#EEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEAAEA

sample1_R2.fastq.gz:

@NS500310:191:H3WFYBGX9:1:11101:10722:1034 2:N:0:GATCAG
NCGAGCAGNCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGTAANCNNNNNNNNNNNNAGNNNANNNCNNTNNNATNCNNN
+
#/A/A//E#//########################################//A///#/############//###/###/##/###//#/###
@NS500310:191:H3WFYBGX9:1:11101:9361:1035 2:N:0:GATCAG
NTGCGTGTNTANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTTCGAANCNNNNNNNNNNNNAGNNNAGNNCNNTCNNCCNCNNN
+
#AAAAEEE#A6########################################/6/E//#/############/A###/6##/##//##//#/###

barcodes.txt:

AACGTGAT
AAACATCG
ATGCCTAA
AGTGGTCA
ACCACTGT
ACATTGGC
CAGATCTG
CATCAAGT
CGCTGATC
ACAAGCTA

Desired output: matched paired-end FASTQ files

modified 4 weeks ago by Pierre Lindenbaum • written 4 weeks ago by bibaswan.ghoshal
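
Given input like the above, one possible way to produce the matched paired-end output is a small standard-library Python script. This is a sketch, not a tested pipeline, and it rests on several assumptions: positions are 1-based inclusive, both files list the reads in the same order, records are plain 4-line FASTQ, and a pair is kept only when all three windows match (set require_all=False to keep pairs where any window matches). All function names are hypothetical. Instead of comparing each window against every barcode, it precomputes every sequence within one mismatch of a barcode, so each window is a single dictionary lookup.

```python
import gzip
from itertools import islice

WINDOWS = [(11, 18), (47, 54), (89, 96)]  # assumed 1-based, inclusive


def load_barcodes(path):
    """Read one barcode per line, skipping blank lines."""
    with open(path) as fh:
        return [line.strip().upper() for line in fh if line.strip()]


def one_mismatch_lookup(barcodes, alphabet="ACGTN"):
    """Map every sequence within 1 mismatch of a barcode to that barcode.

    Exact barcodes are inserted first; if a variant is within 1 mismatch of
    two barcodes, the barcode listed first wins (an arbitrary choice).
    """
    lookup = {bc: bc for bc in barcodes}
    for bc in barcodes:
        for i in range(len(bc)):
            for base in alphabet:
                lookup.setdefault(bc[:i] + base + bc[i + 1:], bc)
    return lookup


def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines; stop at an incomplete record."""
    while True:
        rec = list(islice(handle, 4))
        if len(rec) < 4:
            return
        yield rec


def filter_pairs(r1_in, r2_in, r1_out, r2_out, barcodes,
                 windows=WINDOWS, require_all=True):
    """Write pairs whose read 2 carries barcodes in the windows; return the count."""
    lookup = one_mismatch_lookup(barcodes)
    decide = all if require_all else any
    kept = 0
    with gzip.open(r1_in, "rt") as f1, gzip.open(r2_in, "rt") as f2, \
         gzip.open(r1_out, "wt") as o1, gzip.open(r2_out, "wt") as o2:
        for rec1, rec2 in zip(fastq_records(f1), fastq_records(f2)):
            # The read name is the header text before the first whitespace.
            if rec1[0].split()[0] != rec2[0].split()[0]:
                raise ValueError("R1/R2 out of sync at " + rec1[0].strip())
            seq = rec2[1].strip()
            # A window that runs past the end of the read never matches.
            if decide(seq[s - 1:e] in lookup for s, e in windows):
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept
```

It could be called as filter_pairs("sample1_R1.fastq.gz", "sample1_R2.fastq.gz", "sample1_R1.filtered.fastq.gz", "sample1_R2.filtered.fastq.gz", load_barcodes("barcodes.txt")), where the two output names are placeholders. Note that the two example R2 reads shown above are mostly N in these windows, so they would not pass this filter.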