how to use the option of --rf and the options of -a and -a -a (-aa) in samtools mpileup
1
1
Entering edit mode
6.2 years ago

I am now using samtools (1.3.1-33) mpileup to create mpileup files, and would like to remove reads that have flags of 4, 8, 256, 512 and 1024. So should the --rf be set to 1804 (4+8+256+512+1024) or the --ff set to 1804? Also, what is the difference between options of -a and -a -a (-aa)? Thanks you!

sequence genome • 2.8k views
ADD COMMENT
1
Entering edit mode
6.2 years ago

If you want to remove reads that have certain bits set, then you need to use --ff, --excl-flags. If you don't set anything, the default is:

--ff [UNMAP,SECONDARY,QCFAIL,DUP]

That is, don't look at unaligned, secondary, QC-failed, or duplicate reads.

You should look at the flag table to see which you require: flags
upload image

Kevin

ADD COMMENT
0
Entering edit mode

Thanks Kevin for your answer. Do you know what is the difference between -a and -a -a (-aa)?

ADD REPLY
0
Entering edit mode

By using either of these options, you should get a very large file.

  • -a, I believe, will output a record for every base that's listed in your reference sequence and for which there has been some form of alignment.
  • -aa or -a -a will get you an even larger file because it will output a record for every base in your reference, irrespective of whether there has been alignment or not.

You should check how they both behave, just to make sure.

I think that these options are mainly fr use in conjunction with a target BED file

ADD REPLY
0
Entering edit mode

Thanks for your time. The following is the reply from samtools developers about the difference between -a and -a -a (-aa):

-a and -aa is all vs absolutely all (bad term I know). The difference is to do with references with zero coverage.

For example if we have a file with @SQ headers for all chromosomes, but only data on chr 1, then -a would show all of chr1 irrespective of coverage, but none on chr2, 3, etc, while -aa would show the zero coverage on all the other chromosomes too.

ADD REPLY
0
Entering edit mode

Thanks for your time. The following is the reply from samtools developers about the difference between -a and -a -a (-aa):

-a and -aa is all vs absolutely all (bad term I know). The difference is to do with references with zero coverage.

For example if we have a file with @SQ headers for all chromosomes, but only data on chr 1, then -a would show all of chr1 irrespective of coverage, but none on chr2, 3, etc, while -aa would show the zero coverage on all the other chromosomes too.

ADD REPLY

Login before adding your answer.

Traffic: 2192 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6