Question: Output From Picard And Samtools Flagstat Not Very Useful From My Sambam Files:
2
gravatar for Angel
7.8 years ago by
Angel210
United States
Angel210 wrote:

Sorry for another post ... as I could not fit this message as a "comment" in my last post of alignment statistics:

I got the following from Picard CollectAligment and samtools flagstat ... I found flagstat result more informative than picard but no software allows me to fill the info in table at the end that I want. Sorry for my dumb questions. Any comments? Is there stand-alone perl or any other script that I can use? Thanks very much.

CATEGORY FIRSTOFPAIR SECONDOFPAIR PAIR

TOTAL_READS 269056355 269056355 538112710

PF_READS 269056355 269056355 538112710

PCTPFREADS 1 1 1

PFNOISEREADS 256 288 544

PFREADSALIGNED 0 0 0

PCTPFREADS_ALIGNED 0 0 0

PFALIGNEDBASES 0 0 0

PFHQALIGNED_READS 0 0 0

PFHQALIGNED_BASES 0 0 0

PFHQALIGNEDQ20BASES 0 0 0

PFHQMEDIAN_MISMATCHES 0 0 0

PFMISMATCHRATE 0 0 0

PFHQERROR_RATE 0 0 0

PFINDELRATE 0 0 0

MEANREADLENGTH 100 100 100

READSALIGNEDIN_PAIRS 0 0 0

PCTREADSALIGNEDINPAIRS 0 0 0

BAD_CYCLES 0 0 0

STRAND_BALANCE 0 0 0

PCT_CHIMERAS 0 0 0

PCT_ADAPTER 0.001418 0.00058 0.000999


I tried: samtools flagstat as well and I got:

538112710 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 duplicates

512844854 + 0 mapped (95.30%:-nan%)

538112710 + 0 paired in sequencing

269056355 + 0 read1

269056355 + 0 read2

501199328 + 0 properly paired (93.14%:-nan%)

509847920 + 0 with itself and mate mapped

2996934 + 0 singletons (0.56%:-nan%)

728423 + 0 with mate mapped to a different chr

568502 + 0 with mate mapped to a different chr (mapQ>=5)


Table that I want to fill is:

Total read = 538112710

Uniquely paired read#

Non-uniquely paired read#

Uniquely mapped read1#

Non-uniquely mapped read1#

Uniquely mapped read2#

Non-uniquely mapped read2#

Unmapped read1#

Unmapped read2#

Uniquely paired read%

Non-uniquely paired read%

Uniquely mapped read1%

Non-uniquely mapped read1%

Uniquely mapped read2%

Non-uniquely mapped read2%

Unmapped read1%

Unmapped read2%

exome picard samtools • 3.9k views
ADD COMMENTlink modified 7.7 years ago by Daniel Swan13k • written 7.8 years ago by Angel210

What is your mapping program ?

ADD REPLYlink written 7.8 years ago by toni2.1k

have you tried Bioconductor or Biopython? If you are desperate consider placing the project up for bid on Guru or Elance.

ADD REPLYlink written 7.8 years ago by Jeremy Leipzig18k

Hi Jeremy, I looked into http://www.bioconductor.org/developers/new_packages/ and do not see a package mentioned that can read bam file and give me the metrics I am looking for. Am I sorry if I missed it. Any comments? My bam file are of the size of 60-70GB, would bioconductor be able to handle these files?

ADD REPLYlink written 7.8 years ago by Angel210

Hi Tony, I am using bwa for alignment and samtools for the rest of the pipeline.

ADD REPLYlink written 7.8 years ago by Angel210
2
gravatar for Daniel Swan
7.8 years ago by
Daniel Swan13k
Aberdeen, UK
Daniel Swan13k wrote:

You might want to try CollectMultipleMetrics from Picard instead. For whole exome work I use that and CalculateHsMetrics and get as much information as I could possibly need..

ADD COMMENTlink written 7.8 years ago by Daniel Swan13k

Hi Daniel, I could not understand/find info on how to get input "Bait_file" and "Interval_file". COuld you point me to a command or link on how to get these input files?

ADD REPLYlink written 7.8 years ago by Angel210

Depends what your capture method is. We use mainly Agilent SureSelect, so we get this information from E-Arrayhttps://earray.chem.agilent.com/earray/ I'm sure whoever makes your capture kits will have a similar resource

ADD REPLYlink written 7.8 years ago by Daniel Swan13k

Hi Daniel Thanks!

I am being told the data was created using Agilent SureSelect whole exome 50MB kit for the hybrid capture (in a solution, not on a chip).

I registered at the site you gave me but sorry I still don't know how to get the "bait" and "interval" file. Would you be able to provide little bit more info on how can I use the site to get these files? I highly appreciate your help.

ADD REPLYlink written 7.8 years ago by Angel210

Angel, I have these files for the 50MB kit for hg19. Drop me an email: dan[dot]swan[at]gmail[dot]com

ADD REPLYlink written 7.8 years ago by Daniel Swan13k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2294 users visited in the last hour