cutadapt 'cut_prefix' substring
6 months ago
noodle ▴ 560

Dear Biostars,

I'd like to add a substring of the UMI to the name of each fastq read. Below is an example of what I'd like to do (that doesn't work): I want to cut the first 7 bp of R1 but append only the first 6 bp to the read name. Should this be possible? Can someone tell me how to correct my code to make this work? TIA!

cutadapt -u 7 --rename='{id} {comment} $(echo {cut_prefix} | cut -c1-6)'

The output is:

1:2101:13928:1000 1:N:0:GTCGCCTT+AAA/ACTAATT $(echo NTTTATT | cut -c1-6)

But what I want is:

1:2101:13928:1000 1:N:0:GTCGCCTT+AAA/ACTAATT NTTTAT

cutadapt • 487 views

A side note to GenoMax's response, on why that doesn't work as-is: cutadapt fills in the {cut_prefix} placeholder while it runs, but your shell's command substitution would have to happen earlier, while the shell is still preparing the cutadapt call. The single quotes around the rename pattern stop your shell from interpreting the $(...) command at all, so cutadapt receives it as literal text. If you switched to double quotes, the shell would run the command, but it would only see a literal "{cut_prefix}" and cut it down to "{cut_p", which clearly wouldn't work either.

The question made me curious whether you can put arbitrary Python into cutadapt's format strings to get what you want, but apparently not. I like making use of cutadapt's impressive flexibility, but I think this task is beyond it.
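One possible workaround (my own sketch, not something suggested in this thread) is to let cutadapt write the full prefix with --rename='{id} {comment} {cut_prefix}' and shorten it afterwards. The Python below assumes plain 4-line FASTQ records and that the prefix is the last space-separated field of the header; the function names are made up for illustration.

```python
def shorten_last_field(header, keep=6):
    """Truncate the last space-separated field of a header to `keep` characters."""
    header = header.rstrip("\n")
    head, sep, last = header.rpartition(" ")
    if not sep:
        # No space in the header, so there is no appended field to shorten.
        return header
    return f"{head} {last[:keep]}"


def shorten_fastq_headers(lines, keep=6):
    """Yield FASTQ lines with only the header line of each record modified.

    Assumes strict 4-line records (header, sequence, '+', qualities).
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield shorten_last_field(line, keep) if i % 4 == 0 else line


# Demo on the header from the question.
print(shorten_last_field("1:2101:13928:1000 1:N:0:GTCGCCTT+AAA/ACTAATT NTTTATT"))
```

To use it on real data, you could feed sys.stdin into shorten_fastq_headers and print each yielded line, piping cutadapt's output through the script. Multi-line FASTQ or headers where the prefix isn't the last field would need a different approach.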


thanks, feature request incoming ;)
