Tutorial: What is new in samtools release 1.5 [Solstice Release] (21st June 2017)
6.8 years ago

Official announcement:


Samtools Release 1.5 [Solstice Release] (21st June 2017)

  • Samtools fastq now has a -i option to create a fastq file from an index tag, and a -T option (similar to -t) to add user specified aux tags to the fastq header line.
  • Samtools fastq can now create compressed fastq files, by giving the output filenames an extension of .gq, .bgz, or .bgzf
  • Samtools sort has a -t TAG option, that allows records to be sorted by the value of the specified aux tag, then by position or name. Merge gets a similar option, allowing files sorted this way to be merged.

Let's go over each item and see how it works in practice.

Samtools fastq now has a -i option to create a fastq file from an index tag, and a -T option (similar to -t) to add user specified aux tags to the fastq header line.

Help flags for samtools fastq:

...
    -i          add Illumina Casava 1.8 format entry to header (eg 1:N:0:ATCACG)
    -T TAGLIST  copy arbitrary tags to the FASTQ header line
...

Let's give it a go. Get the test file.

curl https://raw.githubusercontent.com/samtools/samtools/develop/test/dat/bam2fq.005.sam > test.sam

Now run:

samtools fastq test.sam | head -4

It prints:

@ref1_grp1_p001/1
CGAGCTCGGT
+
!!!!!!!!!!

whereas:

samtools fastq -T MD,BC,za test.sam | head -4

prints:

@ref1_grp1_p001/1   MD:Z:10 BC:Z:AC-GT  za:Z:Hello world!
CGAGCTCGGT
+
!!!!!!!!!!

The -i flag is poorly documented; I only figured it out by scouring the test examples in the GitHub repository. It has to be combined with the --index-format parameter and with --i1, which names the file the indices are written to.

samtools fastq -i --i1 indices.fq --index-format 'i2' -T MD,BC,za test.sam | head -4

will produce:

@ref1_grp1_p001/1   MD:Z:10 BC:Z:AC-GT  za:Z:Hello world! 1:N:0:AC
CGAGCTCGGT
+
!!!!!!!!!!

and a file called indices.fq that contains:

@ref1_grp1_p001/1   MD:Z:10 BC:Z:AC-GT  za:Z:Hello world! 1:N:0:AC
AC
+
""
@ref1_grp1_p002/1   MD:Z:10 BC:Z:AATT+CCGG  za:Z:Another string 1:N:0:AA
AA
+
""
@ref1_grp2_p001/1   MD:Z:8  BC:Z:TG+CA  za:Z:!"$%^&*() 1:N:0:TG
TG
+
ab
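
To make the header suffix that -i adds a little more concrete, here is a minimal sketch that pulls the index sequence back out of FASTQ headers like the ones above. The function name casava_index is made up for illustration and is not part of samtools. It assumes four-line records with the Casava-style field (for example 1:N:0:AC) as the last whitespace-separated token of the header.

```shell
#!/bin/sh
# Print "read_name<TAB>index" for every record of a FASTQ stream on stdin.
# Assumptions: four-line records, and the Casava-style field
# (e.g. 1:N:0:AC) is the last whitespace-separated token of the header.
casava_index() {
    awk 'NR % 4 == 1 {
        n = split($NF, parts, ":")          # 1:N:0:AC -> parts[n] is "AC"
        print substr($1, 2) "\t" parts[n]   # substr drops the leading "@"
    }'
}

# Demo on two records shaped like the indices.fq listing above.
printf '%s\n' \
    '@ref1_grp1_p001/1 MD:Z:10 BC:Z:AC-GT za:Z:Hello world! 1:N:0:AC' \
    'AC' '+' '""' \
    '@ref1_grp2_p001/1 MD:Z:8 BC:Z:TG+CA za:Z:!"$%^&*() 1:N:0:TG' \
    'TG' '+' 'ab' | casava_index
```

On a real run the same function could be fed with the FASTQ stream from the command above, or with the contents of indices.fq.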

Samtools fastq can now create compressed fastq files, by giving the output filenames an extension of .gq, .bgz, or .bgzf

Example:

samtools fastq -1 read1.fq.gz -2 read2.fq.gz align.bam

The release note is a bit confusing, though. It is not clear what the .gq extension means; it is perhaps a typo for .gz, since that extension works, as demonstrated above. The difference between .bgz and .bgzf is also unclear.
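
One way to check whether an output file really came out compressed is to look at its first two bytes rather than its extension. The sketch below assumes the usual gzip magic number 1f 8b, which BGZF files also start with, since BGZF is a gzip-compatible block format. The helper name is_gzip is hypothetical and the demo files are throwaways.

```shell
#!/bin/sh
# Succeed when FILE starts with the gzip magic bytes 1f 8b.
# BGZF output is a series of gzip members, so it begins the same way.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Demo with throwaway files (names are hypothetical).
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\n!!!!\n' > "$tmp/plain.fq"
gzip -c "$tmp/plain.fq" > "$tmp/packed.fq.gz"
for f in "$tmp/plain.fq" "$tmp/packed.fq.gz"; do
    if is_gzip "$f"; then
        echo "${f##*/}: gzip-compatible"
    else
        echo "${f##*/}: not compressed"
    fi
done
rm -rf "$tmp"
```

After the samtools fastq example above, is_gzip read1.fq.gz should succeed if the extension really did switch compression on.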

Samtools sort has a -t TAG option, that allows records to be sorted by the value of the specified aux tag, then by position or name. Merge gets a similar option, allowing files sorted this way to be merged.

samtools sort align.bam | samtools view | cut -f 1,12-25 | head -5

prints:

SRR343051.887   NM:i:0  MD:Z:101    AS:i:101    XS:i:101    RG:Z:foo    XA:Z:NC_020370.1,-55728,101M,0;
SRR343051.542   NM:i:0  MD:Z:101    AS:i:101    XS:i:101    RG:Z:foo    XA:Z:NC_020370.1,-55615,101M,0;
SRR343051.9863  NM:i:0  MD:Z:101    AS:i:101    XS:i:101    RG:Z:foo    XA:Z:NC_020370.1,-55587,101M,0;
SRR343051.887   NM:i:0  MD:Z:101    AS:i:101    XS:i:101    RG:Z:foo    XA:Z:NC_020370.1,+55573,101M,0;
SRR343051.9863  NM:i:0  MD:Z:101    AS:i:101    XS:i:101    RG:Z:foo    XA:Z:NC_020370.1,+55479,101M,0;

whereas:

samtools sort -t AS align.bam | samtools view | cut -f 1,12-25 | head -5

prints:

SRR343051.1909  AS:i:0  XS:i:0  RG:Z:foo
SRR343051.5040  AS:i:0  XS:i:0  RG:Z:foo
SRR343051.22    AS:i:0  XS:i:0  RG:Z:foo
SRR343051.2588  AS:i:0  XS:i:0  RG:Z:foo
SRR343051.3324  AS:i:0  XS:i:0  RG:Z:foo
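
Eyeballing five lines only goes so far, so here is a rough sketch that checks the tag order over the whole output. The helper tag_is_sorted is a made-up name, not a samtools command. It assumes the tag holds integer values (TAG:i:N), skips header lines and records without the tag, and only tests that values never decrease; it does not check the secondary ordering by position or name.

```shell
#!/bin/sh
# Read SAM text on stdin and succeed if the integer values of tag $1
# never decrease from one record to the next.
# Assumptions: only integer (TAG:i:N) values are compared; header lines
# and records that lack the tag are skipped.
tag_is_sorted() {
    awk -F '\t' -v tag="$1" '
        BEGIN { bad = 0; seen = 0 }
        /^@/ { next }                      # skip SAM header lines
        {
            for (i = 12; i <= NF; i++) {   # aux tags start at column 12
                if (index($i, tag ":i:") == 1) {
                    v = substr($i, length(tag) + 4) + 0
                    if (seen && v < prev) { bad = 1 }
                    prev = v; seen = 1
                    break
                }
            }
        }
        END { exit bad }
    '
}

# Demo on two hand-written records: AS rises even though position falls.
printf 'r1\t0\tref\t5\t60\t4M\t*\t0\t0\tACGT\t!!!!\tAS:i:0\nr2\t0\tref\t1\t60\t4M\t*\t0\t0\tACGT\t!!!!\tAS:i:4\n' |
    if tag_is_sorted AS; then echo "sorted by AS"; else echo "not sorted by AS"; fi
```

With the data above, the check would be run as: samtools sort -t AS align.bam | samtools view | tag_is_sorted AS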

Why is this post labeled "forum" and not "news"?

Or 'tutorial', maybe?

I was not sure what it ought to be. News would fit if there was just the initial statement. A tutorial label felt like giving it too much importance. Either way would work.

Posts labeled with "forum" have some component that warrants discussion/generates opposing opinions (in my mind). Since you are demonstrating some of the features with example data tutorial may fit better.

I'll make it a tutorial then.
