Question

NGS raw data manipulation using Linux/Unix

2.6 years ago

rijithraj • 0

Hi,

I want to add a sample ID alongside the read IDs in my NGS data on a Unix platform. Can anyone help me with this?

Thanks in advance

Rijith Jayarajan

datamanipulation • 1.1k views

ADD COMMENT • link updated 2.6 years ago by 5heikki 11k • written 2.6 years ago by rijithraj • 0

add sample ID along with read IDs

Add it where, exactly? If you mean the FASTQ header, that would break the Illumina sequence identifier format.

ADD REPLY • link 2.6 years ago by GenoMax 141k

If you go from:

@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

to:

@SEQ_ID;SAMPLE_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

How would that break the header format?
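
A minimal sketch of the transformation above, assuming strict four-line FASTQ records (no wrapped sequence or quality lines). The function name `add_sample_id` and the `;` separator are illustrative choices, not part of any standard. Header lines are picked by position rather than by a leading `@`, because a quality line can also start with `@`. If the header has a comment after a space, as Illumina headers do, the sample ID is placed before that space so it stays part of the first whitespace-delimited token.

```shell
#!/bin/sh
# Append a sample ID to the read ID on every FASTQ header line.
# Assumption: each record is exactly 4 lines (header, sequence, "+", quality).
# add_sample_id is a hypothetical helper; ";" is an arbitrary separator.
add_sample_id() {
    # $1 = sample ID; FASTQ is read from stdin and written to stdout
    awk -v sample="$1" '
        NR % 4 == 1 {               # first line of each record is the header
            n = index($0, " ")      # position of the first space, 0 if none
            if (n > 0)
                print substr($0, 1, n - 1) ";" sample substr($0, n)
            else
                print $0 ";" sample
            next
        }
        { print }                   # sequence, "+" and quality pass through unchanged
    '
}

# Usage (file names are placeholders):
#   add_sample_id SAMPLE_ID < reads.fastq > reads.tagged.fastq
```

Whether downstream tools accept the modified read names depends on the tool, so it is worth testing on a small subset first.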

ADD REPLY • link 2.6 years ago by 5heikki 11k

I meant to say Illumina sequence identifier format. Amended above.

ADD REPLY • link 2.6 years ago by GenoMax 141k

I mostly deal with Ion Torrent reads. Are there any programs that would stop working properly if Illumina read headers were modified by the user?

ADD REPLY • link 2.6 years ago by 5heikki 11k

score 0 · Answer 1 · 2021-09-22

2.6 years ago

swbarnes2 14k

This is... uncommon. Usually each FASTQ file's name indicates its sample, or you can add the sample ID to the read group information in a BAM file. For many applications, though, the sample name can simply be indicated in the BAM file name as well.
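
As a rough illustration of the file-name convention, the sketch below recovers a sample name from a FASTQ path by stripping the directory and the extension. The helper `sample_from_path` is hypothetical, and it assumes the file is named `<sample>.fastq`, `<sample>.fq`, or a gzipped variant of either; other naming schemes would need their own rules.

```shell
#!/bin/sh
# Derive a sample name from a FASTQ file path.
# Assumption: the file is named <sample>.fastq, <sample>.fq, or either with .gz.
# sample_from_path is a hypothetical helper, not part of any toolkit.
sample_from_path() {
    name=$(basename "$1")     # drop the directory part
    name=${name%.gz}          # drop a trailing .gz, if present
    name=${name%.fastq}       # drop a trailing .fastq, if present
    name=${name%.fq}          # drop a trailing .fq, if present
    printf '%s\n' "$name"
}

# Usage (path is a placeholder):
#   sample_from_path /data/run1/sampleA.fastq.gz
```

For the BAM route, samtools has an `addreplacerg` subcommand for setting read group fields such as `SM`; check its manual for the exact options in your version.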

ADD COMMENT • link 2.6 years ago by swbarnes2 14k