Handle HETATM bioassembly cases and related processing updates #13
jhmlam wants to merge 1 commit into THGLab:main
@Ericwang6
# Ligands with elements other than this list will be discarded
COMMON_ELEMENTS = ['H', 'C', 'N', 'O', 'F', 'P', 'S', 'Cl', 'Br', 'I']
#
AMBER14_TIP3PXML_ACCEPTABLE_RESNAME = [
- For HETATM records, only water and the metal ions defined in https://github.com/openmm/openmm/blob/master/wrappers/python/openmm/app/data/amber14/tip3p.xml are included. This covers most of the metals of interest, e.g. MG, ZN, CA, MN. Other cofactors and artifact solvents from the crystal are excluded, and a message is printed.
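The rule described in this comment (a distance cutoff followed by a residue-name whitelist) can be sketched roughly as below. This is a standard-library illustration, not the PR's code: `classify_hetatm`, `min_distance_nm`, and the shortened `ACCEPTABLE_RESNAMES` set are hypothetical names, and the sketch assumes coordinates in nm and a cutoff in angstrom, as the `* 10` factor in the diff suggests.

```python
import math

# Hypothetical, shortened whitelist for illustration only; the PR's actual
# list is taken from OpenMM's amber14/tip3p.xml and is not reproduced here.
ACCEPTABLE_RESNAMES = {"HOH", "MG", "ZN", "CA", "MN"}


def min_distance_nm(coords_a, coords_b):
    """Smallest pairwise distance between two lists of (x, y, z) points."""
    return min(math.dist(a, b) for a in coords_a for b in coords_b)


def classify_hetatm(resname, het_coords_nm, protein_coords_nm, cutoff_angstrom):
    """Return 'keep', 'too_far', or 'unsupported' for one HETATM residue."""
    # Assumption: coordinates are in nm, so multiply by 10 to compare in angstrom.
    if min_distance_nm(protein_coords_nm, het_coords_nm) * 10 > cutoff_angstrom:
        return "too_far"
    # Residues the force field file cannot parameterize are reported, not kept.
    if resname not in ACCEPTABLE_RESNAMES:
        return "unsupported"
    return "keep"
```

Returning a reason string rather than a bare boolean makes it easy to print a message for each excluded residue, which is the behaviour the comment describes.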
include['hetatm'].append(residue)
if np.min(cdist(positions, het_positions)) * 10 > hetatm_cutoff:
continue
if residue.name not in AMBER14_TIP3PXML_ACCEPTABLE_RESNAME:
for residue in self.topology.residues():
system = forcefield.createSystem(self.topology, nonbondedMethod=app.CutoffNonPeriodic, constraints=None, rigidWater=False)
original_masses = [system.getParticleMass(i) for i in range(system.getNumParticles())]
#omega_atom_quads = self._get_nonpro_peptide_omega_atom_quads()
Remaining issue: I still see some cis-peptide bonds at the residues added by PDBFixer. This is also visible in the Figshare data for 1ork.
For now I have commented out this block. It cures some of the cis-peptide bonds but not all of them, so I am reserving it for the next upgrade.
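For readers who want to check for the cis-peptide bonds mentioned above, here is a minimal sketch of the usual test: compute the omega dihedral over CA(i), C(i), N(i+1), CA(i+1) and flag it when it is close to 0 degrees. The function names and the 30 degree threshold are assumptions for illustration. The implementation of `_get_nonpro_peptide_omega_atom_quads` is not shown in this PR excerpt, and this is not it.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four (x, y, z) points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def is_cis_peptide(ca_prev, c_prev, n_next, ca_next, threshold_deg=30.0):
    """True when omega is within threshold_deg of 0 (assumed cutoff)."""
    return abs(dihedral_deg(ca_prev, c_prev, n_next, ca_next)) < threshold_deg
```

Running such a check over every non-proline peptide bond before and after the fix would give a count of how many cis bonds remain, which may help decide when the commented-out block is ready to be re-enabled.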
It also writes a JSON file that lets us inspect how many systems contain certain ions and whether the bioassembly step changed the PDB model (most PDB models have only one assembly). Also attached is a plot generated with the following code, which reads the JSON files after running 1000 systems (sorted PDB IDs, covering 10** to 1l**). For these early PDB structures, ~80% carry water and 15% have a meaningful assembly.
Note that water, ions, and bioassemblies may occur less frequently in later PDB IDs, since the early entries are mostly crystal structures.
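As a rough illustration of how per-system JSON reports could be aggregated into fractions like those quoted above, here is a minimal sketch. It is not the script from this PR, and the JSON schema is not shown in this thread: the keys `hetatm_resnames` and `assembly_changed`, and the helper names, are hypothetical.

```python
import json
from pathlib import Path


def load_reports(directory):
    """Read every *.json file in a directory into a list of dicts."""
    return [json.loads(p.read_text()) for p in sorted(Path(directory).glob("*.json"))]


def summarize_reports(reports):
    """Fraction of systems with water and with a changed bioassembly.

    Assumes each report has a list under 'hetatm_resnames' and a boolean
    under 'assembly_changed' (both keys are hypothetical).
    """
    total = len(reports)
    if total == 0:
        return {"n_systems": 0, "frac_with_water": 0.0, "frac_assembly_changed": 0.0}
    with_water = sum(1 for r in reports if "HOH" in r.get("hetatm_resnames", []))
    changed = sum(1 for r in reports if r.get("assembly_changed", False))
    return {
        "n_systems": total,
        "frac_with_water": with_water / total,
        "frac_assembly_changed": changed / total,
    }
```

Calling `summarize_reports(load_reports("some_output_dir"))` would return the two fractions for whatever directory holds the reports; the directory name here is a placeholder.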