I'm trying to modify the header and chromosome to include the file prefix and output this to a new bam file. I was going to do this with sed but would rather do it with pysam if it is possible.
A line of my bam file is as follows:
GWNJ-0901:658:GW2006263225th:6:1101:12824:2610 99 NC_011993.1_Escherichia_coli_LF82_complete_genome_length_4773108 2056740 42 27M = 2056869 150 CGGCTGCACGGGCGAAGTTTCCGCCGC FJ-AJAJJJ<AAFJJJ<J<JFJ<-A7A AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:27 YS:i:-3 YT:Z:CP
I want to access the 3rd column (the chromosome column, I think it is called) and concatenate a prefix to it.
I'm having trouble writing out this amended column to an output file using fetch:
for read in input_bam.fetch(reference=species):
    print(input_bam.get_reference_name(read.reference_id))  # Returns chromosome column
    prefixed_chrom = prefix + '_' + input_bam.get_reference_name(read.reference_id)
    with pysam.AlignmentFile(full_output_path, "w", template=input_bam) as outf:
        a = pysam.AlignedSegment()
        a.query_name = read.query_name
        a.query_sequence = read.query_sequence
        a.get_reference_name(read.reference_id) = prefixed_chrom
        a.flag = read.flag
        a.reference_id = read.reference_id
        a.reference_start = read.reference_start
        a.mapping_quality = read.mapping_quality
        a.cigar = read.cigar
        a.next_reference_id = read.next_reference_id
        a.next_reference_start = read.next_reference_start
        a.template_length = read.template_length
        a.query_qualities = read.query_qualities
        a.tags = read.tags
        outf.write(a)

SyntaxError: cannot assign to function call
How can I write this amended chromosome column to an output file using pysam? Many thanks!
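One possible approach, sketched under some assumptions: the SyntaxError comes from using a function call as the target of an assignment, which Python does not allow. Beyond that, a BAM record stores only a numeric reference_id; the reference names themselves live in the @SQ lines of the header. So renaming the references in the header and writing the reads unchanged should be enough. The function names prefix_reference_names and write_prefixed_bam below are made up for illustration, and the sketch assumes a pysam version that provides AlignmentHeader.to_dict() (0.15 or later, as far as I know).

```python
import copy


def prefix_reference_names(header_dict, prefix):
    """Return a copy of a SAM header dict with every @SQ name prefixed."""
    new_header = copy.deepcopy(header_dict)
    for sq in new_header.get("SQ", []):
        sq["SN"] = f"{prefix}_{sq['SN']}"
    return new_header


def write_prefixed_bam(input_path, output_path, prefix):
    """Copy a BAM file, renaming its reference sequences with a prefix."""
    # Imported here so the header helper above can be used without pysam.
    import pysam

    with pysam.AlignmentFile(input_path, "rb") as input_bam:
        new_header = prefix_reference_names(input_bam.header.to_dict(), prefix)
        # Open the output once, outside the read loop, with the renamed header.
        with pysam.AlignmentFile(output_path, "wb", header=new_header) as outf:
            # until_eof=True iterates over every record, including unmapped reads.
            for read in input_bam.fetch(until_eof=True):
                # The numeric reference_id is unchanged, so it now resolves
                # to the prefixed name in the new header.
                outf.write(read)
```

Note that the original loop opens the output file in "w" mode inside the loop, so each iteration would overwrite the previous read; opening it once before iterating avoids that. If only the header needs to change, samtools reheader is an alternative outside pysam.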
Linda: This is a specific question, so don't use Forum. Forum posts are generally open-ended discussions that may have more than one point of view.
My bad. This is my first question. Hopefully it's updated now.
see also Bam File: Change Chromosome Notation