Question: Filtering uniquely mapping reads from BAM files generated by STAR
I am learning to use MACS2 to call peaks for CLIP-Seq. I have BAM files generated by STAR using the following command:

    # get the input data file
    module load STAR
    STAR --genomeDir /scratch/midway2/caiqi/GRCh37_star_index_150bp \
    --runThreadN 16 \
    --readFilesIn ${INPUT}_1.fastq ${INPUT}_2.fastq \
    --outFileNamePrefix GRCh37${INPUT} \
    --twopassMode Basic \
    --sjdbOverhang 150 \
    --outSAMtype BAM SortedByCoordinate \
    --outFilterMultimapNmax 20 \
    --outFilterMismatchNmax 999 \
    --outFilterMismatchNoverLmax 0.06 \
    --alignIntronMin 70 \
    --alignIntronMax 500000 \
    --alignMatesGapMax 500000 \
    --alignSJoverhangMin 8 \
    --alignSJDBoverhangMin 1 \
    --outSAMstrandField intronMotif \
    --outFilterType BySJout

Now I would like to filter the BAM files using sambamba, as advised:

    $ sambamba view -h -t 2 -f bam \
        -F "[XS] == null and not unmapped and not duplicate" \
        H1hesc_Input_Rep1_chr12_aln_sorted.bam > H1hesc_Input_Rep1_chr12_aln.bam

The filter expression following -F tests the [XS] tag, which is not appropriate for me: ‘XS’ is a tag generated by Bowtie2 that gives the alignment score of the second-best alignment, and it is only present if the read aligned and more than one alignment was found for it.

How can I modify the argument after -F to make it suitable for the BAM files from STAR?
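
For reference, here is a minimal sketch of the kind of filter I have in mind. It assumes STAR's default settings: with the default --outSAMattributes STAR writes an NH tag (the number of loci a read maps to), and with the default --outSAMmapqUnique it gives uniquely mapped reads a MAPQ of 255. The file names are hypothetical placeholders, and "not duplicate" only has an effect if duplicates were marked in an earlier step.

```shell
# Sketch only. Assumes STAR defaults: NH tag present, MAPQ 255 for unique reads.
# IN_BAM and OUT_BAM are hypothetical placeholder names.
IN_BAM="sample_Aligned.sortedByCoord.out.bam"
OUT_BAM="sample_unique.bam"

# Keep reads that map to exactly one locus (NH == 1).
# A MAPQ-based alternative under the same assumption: 'mapping_quality == 255'
FILTER='[NH] == 1 and not unmapped and not duplicate'

# Only run if sambamba is installed and the input file exists.
if command -v sambamba >/dev/null 2>&1 && [ -f "$IN_BAM" ]; then
    sambamba view -h -t 2 -f bam -F "$FILTER" "$IN_BAM" > "$OUT_BAM"
else
    echo "sambamba or $IN_BAM not found; skipping" >&2
fi
```

If a non-default --outSAMattributes or --outSAMmapqUnique was used in the STAR run, the tag name or MAPQ value in the filter would need to change accordingly.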


