Fastq header modification
6.1 years ago
Guillaume • 0

Hi everyone,

I hope this question has not been posted before; I searched the forum for an answer but didn't find anything that fits my problem.

I need to modify the headers of my fastq files (generated by MinION) so that they match the new storage location of the fast5 files.

For example, the current header is:

@ba66450b /root/**barcode09/15/**file.fast5

And I need:

@ba66450b /root/file.fast5

The number can be 15, but also 20, 21, 25, etc.

Thanks in advance for any help!

G

sequence next-gen • 1.7k views

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

OK thank you, I'll do that next time.

Is sed an option? If the number is the only variable, you can write a regex around it.

How about the following sed command?

   sed -e 's/\*\*barcode09\/[0-9]*\/\*\*//g'

Or perl:

perl -pe 's/\*\*barcode\d+\/\d+\/\*\*//' < file_in > file_out

The reason I asked "Is sed OK" and did not give sed/perl code was so OP could work on writing their own code.

Give up on that dream @Ram :) Not going to happen.

I care more about people providing ready-to-use code than people asking for such code. The latter depend on the former, and the former encourage the latter. Askers will always exist; providers need to create scarcity so that askers learn to rely on themselves.

Many of us are guilty of handing out free drinks, erm, solutions. I agree that it would be nice if askers could learn, but often the way they phrase their question already indicates they have no idea how to solve their issue.

Hence, the pointers. Use tool X, with options y and z, focusing on solving the challenge around concept c. Use sed to substitute a regex focusing on the number part of the identifier.
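
For readers who want to see that pointer spelled out, here is a minimal sed sketch. It is not a definitive solution: the function name `strip_barcode_dirs` is made up for illustration, and because it is unclear from the example whether the `**` are literally in the headers or were only meant as bold markup, the pattern treats them as optional.

```shell
# Hypothetical helper (name invented for this sketch): reads FASTQ on stdin,
# writes it to stdout with the barcode and numbered directories removed.
strip_barcode_dirs() {
  # -E            use extended regular expressions
  # s#old#new#    '#' is the delimiter, so '/' needs no escaping
  # (\*\*)?       an optional literal "**" (assumption: it may or may not be in the file)
  # barcode[0-9]+/[0-9]+/   the barcode directory, then the variable number (15, 20, ...)
  sed -E 's#(\*\*)?barcode[0-9]+/[0-9]+/(\*\*)?##'
}

# Made-up record, only to show the effect on the header line
printf '@ba66450b /root/barcode09/15/file.fast5\nACGT\n+\nIIII\n' | strip_barcode_dirs
```

On a real file this would be called as `strip_barcode_dirs < file_in > file_out`. Note that the substitution is applied to every line, not only to headers; a quality string could in principle contain the same pattern, although that is very unlikely.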

I wonder is pepsi OK?

I prefer Thums Up

perl -pe 's/\*\*barcode\d+\/\d+\/\*\*//' < file_in > file_out

Nice, this one is working. Thanks

The reason I asked "Is sed OK" and did not give sed code was so OP could work on writing their own code.

I totally understand that you want askers to be able to write their own code. However, for people like me who don't have strong bioinformatics skills, it's hard to understand the code unless you give us enough detail to explain each part of the line. You will tell me "follow bioinformatics courses", and yes, you're right, I should do that.

Thanks a lot for your kind help

I understand that people might be just starting out, but one has to start _somewhere_. You could google for the code and adapt it; that way, you'll learn both Google searching and code tweaking. If you copy-paste code from here, how would you know what each part of the code does?

One can either claim to be a beginner and stay there with the code others provide, or work their way towards not being a beginner any more.

But not everyone has the same ambitions - to become a real bioinformagician.

+1 for Bioinformagicians :)

Fair point. OP seems like they're willing to learn, so I'm doing the Socratic method spiel.
