I read about removing sequences with >33% gaps from the alignment after alignment trimming, before phylogenetic tree construction. I am wondering if this makes sense, or whether one should realign the sequences instead. Is there a paper that discusses the effect of gaps in the alignment, or of alignment programs, on maximum likelihood phylogenetic tree construction?
It really depends a lot on what you are doing. Is this a single-gene or a multi-gene/phylogenomic analysis? Those incomplete or heavily gapped sequences may be of interest, in which case you don't want to remove them. Amino acid or nucleotide alignments? What model of evolution? What phylogenetic program? Some programs handle gaps differently than others, for instance. People may have started using a 33% rule of thumb, but only because it means the sequence is "missing" or lacking information at 1/3 of the positions in your alignment. If that is because it is a partial sequence, that is one thing, but in large datasets such sequences often represent partial pseudogenes or paralogous sequences, which is why they are often removed.
There are really no hard and fast cutoffs in phylogenetics, because a lot of factors go into setting up your experiment.
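If you want to see which sequences a cutoff like 33% would actually hit before deciding anything, here is a minimal Python sketch. It assumes an aligned FASTA where gaps are written as `-`. The function names (`read_fasta`, `gap_fraction`, `split_by_gap_fraction`) and the toy alignment are made up for illustration and are not part of any package. It only flags sequences rather than deleting them, since whether to drop them depends on the questions above.

```python
def read_fasta(text):
    """Parse aligned FASTA text into {name: sequence} (hypothetical helper)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def gap_fraction(seq, gap_chars="-"):
    """Fraction of alignment columns where this sequence has a gap.

    Assumption: only '-' counts as a gap. Pass gap_chars="-?" (or add
    N/X) if you also want to treat missing or ambiguous data as gaps.
    """
    if not seq:
        return 1.0
    return sum(seq.count(c) for c in gap_chars) / len(seq)


def split_by_gap_fraction(alignment, max_gap=0.33):
    """Split into (kept, flagged); flagged means gap fraction > max_gap."""
    kept, flagged = {}, {}
    for name, seq in alignment.items():
        target = flagged if gap_fraction(seq) > max_gap else kept
        target[name] = seq
    return kept, flagged


# Toy alignment, invented for this example.
example = """>seqA
ACGTACGTAC
>seqB
ACG---GTAC
>seqC
A----CG---
"""

alignment = read_fasta(example)
kept, flagged = split_by_gap_fraction(alignment, max_gap=0.33)
for name, seq in flagged.items():
    print(f"{name}\t{gap_fraction(seq):.2f}")
```

The cutoff is a parameter on purpose: try a few values and look at what gets flagged (partial sequences of taxa you care about versus likely pseudogenes or paralogs) instead of applying 0.33 blindly. If you do remove sequences, all-gap columns may be left behind, so re-trimming or realigning afterwards is worth considering.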