What does this mean in my read (samtools)
8.9 years ago

What does this tag mean?

XP:Z:1,+150884058,55M45S,0,0;

This is the entire read

HS2000-645_410:4:2313:12585:49651    353    22    37097777    0    54S46M    =    37097858    181    GCCATCACACCTGGCTAATTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTACACAAATACACACACGCAGCCAGCCACCTCGAGTGATATGTGTG    <=<ABBCABA@AB@@A@@@@?@;A<?@?@@@@A=?@@A@>9A=:21,/5=?1@.29@A@A@@<A@A@A@B?=:;B+&;=>BBB/=7=?C==DCDCDB@AA    BD:Z:JJNMPPOKJJNMNMNMKJJJCCCKHKHJCCCJLLKLJKJLJJIILJCJKHLLKLHKKHHHIBJHKHHIIIIKMNNONMNONMJNNLMMNOKPOLMQIIII    PG:Z:MarkDuplicates    RG:Z:XXXXXXXXXX(sample id)     BI:Z:LLONPPQLLKOOPNPPNLLKFFFNKNKKFFFMONNOLMLMMLKKOLGLOKONNNLONKKKMFMKNKKKKKKOOQQRPOQRPOMQQNNQQRPTSPQSLLLL    NM:i:0    XP:Z:1,+150884058,55M45S,0,0;    MQ:i:60    AS:i:46    XS:i:20

Is there a bitwise flag with which I can view only the reads that carry this XP tag? Something like samtools view in.bam | grep "XP:" but without the grep, so that I can run samtools view -f ??? -c in.bam and get the count back.

Super-duper thanks.


As I said in the other thread, update to the latest version of bwa.

8.9 years ago

XP is an old tag that has since been renamed SA in the SAM spec:

http://samtools.github.io/hts-specs/SAMv1.pdf

Other canonical alignments in a chimeric alignment, formatted as a semicolon-delimited list: (rname,pos,strand,CIGAR,mapQ,NM;)+. Each element in the list represents a part of the chimeric alignment. Conventionally, at a supplementary line, the first element points to the primary line.
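
To make the field layout concrete, here is a minimal Python sketch that splits such a value into its parts. The names ChimericPart, parse_sa and parse_xp are made up for illustration. The XP layout (strand fused onto the position, five fields instead of six) is an assumption read off the single example in the question, not something taken from the spec.

```python
from typing import List, NamedTuple


class ChimericPart(NamedTuple):
    """One element of the semicolon-delimited list (hypothetical helper type)."""
    rname: str
    pos: int
    strand: str
    cigar: str
    mapq: int
    nm: int


def parse_sa(value: str) -> List[ChimericPart]:
    """Parse an SA value laid out as rname,pos,strand,CIGAR,mapQ,NM; (no 'SA:Z:' prefix)."""
    parts = []
    for entry in value.split(";"):
        if not entry:  # the trailing ';' leaves an empty string behind
            continue
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        parts.append(ChimericPart(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return parts


def parse_xp(value: str) -> List[ChimericPart]:
    """Parse the older XP value, ASSUMED to be rname,<strand><pos>,CIGAR,mapQ,NM;"""
    parts = []
    for entry in value.split(";"):
        if not entry:
            continue
        rname, signed_pos, cigar, mapq, nm = entry.split(",")
        # First character is the strand sign, the rest is the position
        strand, pos = signed_pos[0], int(signed_pos[1:])
        parts.append(ChimericPart(rname, pos, strand, cigar, int(mapq), int(nm)))
    return parts


# The value from the read in the question
for part in parse_xp("1,+150884058,55M45S,0,0;"):
    print(part)
```

Under that assumption, the XP value in the question would describe the other piece of the read: chromosome 1, forward strand, position 150884058, CIGAR 55M45S, mapping quality 0, edit distance 0.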

Thanks, is there a bitwise flag I can use to find all the reads?

$ samtools view -f ??? in.bam 1:1000-2000
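
For context, the FLAG column has only twelve defined bits, and none of them records which optional tags a read carries, so no -f value can select on XP or SA directly. A minimal Python sketch decoding the 353 from the read in the question (FLAG_BITS and decode_flag are illustrative names, not part of samtools):

```python
# Bit meanings as defined for the FLAG column in the SAM spec
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list:
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


print(decode_flag(353))  # ['paired', 'mate_reverse', 'read1', 'secondary']
```

Bit 0x800 (2048) marks supplementary alignments, so samtools view -f 2048 selects those lines, but that describes the alignment type rather than the tag. The read in the question is flagged secondary (0x100) instead.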

Using my tool samjs

java -jar dist-1.133/samjs.jar -e 'record.getAttribute("SA")!=null' input.bam
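
The same "tag is present" test can be sketched in plain Python on SAM text, for anyone who would rather not install anything. The function names and the two toy records are made up for illustration. I believe recent samtools releases also have a tag filter option on samtools view (-d TAG), but check samtools view --help for your version before relying on it.

```python
def has_tag(sam_line: str, tag: str) -> bool:
    """True if the SAM record carries the given two-letter optional tag."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE
    return any(field.startswith(tag + ":") for field in fields[11:])


def count_with_tag(lines, tag: str) -> int:
    """Count alignment records (header lines skipped) that carry the tag."""
    return sum(
        1 for line in lines
        if not line.startswith("@") and has_tag(line, tag)
    )


# Two made-up records: only the first one has an XP tag
records = [
    "r1\t0\t22\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0\tXP:Z:1,+500,5M5S,0,0;",
    "r2\t0\t22\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:0",
]
print(count_with_tag(records, "XP"))  # 1
```

On real data, count_with_tag(sys.stdin, "XP") could be fed from samtools view in.bam through a pipe. Unlike a bare grep "XP:", this only looks at the optional columns, so a read name or quality string that happens to contain "XP:" is not counted.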
8.9 years ago
Ying W ★ 4.2k

What tool are you using to generate these BAM files? According to the SAM spec, tags that start with X are reserved for local use, so their meaning depends on the program that wrote them. There may be hints about what the field means in the header (run samtools view -H file.bam).


BWA-MEM with split read option on

ID:GATK IndelRealigner    VN:2.5-2-gf57256b    CL:knownAlleles=[(RodBinding name=knownAlleles source=/----/----/----/temp_project/broad_bundles/2.3/b37/Mills_and_1000G_gold_standard.indels.b37.vcf)] targetIntervals=/----/---/----/----/----/----/----/---- LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@PG    ID:MarkDuplicates    PN:MarkDuplicates    VN:1.86(1363)    CL:net.sf.picard.sam.MarkDuplicates INPUT=[/----/----/----/-----/----/----/----/----] OUTPUT=/----/----/----/----/----/----/----/---- METRICS_FILE=/----/ REMOVE_DUPLICATES=true ASSUME_SORTED=true MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000 VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true    PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false
@PG    ID:GATK PrintReads    VN:2.5-2-gf57256b    CL:readGroup=null platform=null number=-1 downsample_coverage=1.0 sample_file=[] sample_name=[] simplify=false no_pg_tag=false
@CO    Sorting ----.bam file with piacrd tool