Motivation

I noticed whenever I built 3D conformers of molecules containing piperazine (or cyclohexane) using RDKit, I tend to get a distorted ring conformation. RDKit's ETKDG (Experimental Torsion angle Knowledge-based Distance Geometry) algorithm works really well in general, but, in this case, it was not doing a good job at coming up with a reasonable initial conformation. I wanted to first quantify how much chair vs boat vs twisted conformer I get, so either I could use it to filter out non-desirable conformers or improve the RDKit's conformer generation routine.

%matplotlib inline
import matplotlib.pyplot as plt

from io import BytesIO
import pandas as pd
import numpy as np
from rdkit.Chem import PandasTools

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import DataStructs
from rdkit.Chem import rdMolDescriptors
from rdkit.Chem import rdRGroupDecomposition

from rdkit.Chem.Draw import IPythonConsole #Needed to show molecules
from rdkit.Chem import Draw
from rdkit.Chem import rdDepictor
from rdkit.Chem.Draw import rdMolDraw2D
from rdkit.Chem.Draw.MolDrawing import MolDrawing, DrawingOptions #Only needed if modifying defaults

DrawingOptions.bondLineWidth=1.8
IPythonConsole.ipython_useSVG=True
from rdkit import RDLogger
RDLogger.DisableLog('rdApp.warning')
import rdkit
import py3Dmol
print(rdkit.__version__)
2020.03.2

Ring conformation using ETKDG v1

Let's make a piperazine molecule for testing. I put atom map in my SMILES, so I can refer them easily later.

smiles = '[NH:4]1[CH2:3][CH2:2][NH:1][CH2:6][CH2:5]1'
mol = Chem.MolFromSmiles(smiles)
mol
N:4H C:3 C:2 HN:1 C:6 C:5

Let's use default parameter for embedding, which uses ETKDG version 1, and generate 500 conformers.

params = Chem.rdDistGeom.EmbedParameters()
params.randomSeed = 0xf00d
params.clearConfs = True

mol_h = Chem.AddHs(mol)
cids = AllChem.EmbedMultipleConfs(mol_h, numConfs=500, params=params)
atoms = {a.GetAtomMapNum(): a.GetIdx() for a in mol.GetAtoms() if a.GetAtomMapNum() > 0}

# torsion angle atom indices
torsions = [
    (atoms[1], atoms[2], atoms[3], atoms[4]),
    (atoms[2], atoms[3], atoms[4], atoms[5]),
    (atoms[3], atoms[4], atoms[5], atoms[6]),
    (atoms[4], atoms[5], atoms[6], atoms[1]),
    (atoms[5], atoms[6], atoms[1], atoms[2]),
    (atoms[6], atoms[1], atoms[2], atoms[3]),
]

I'm going to define a simple check in chain vs boat vs twisted conformers of cyclohexane ring. If all six torsions are near +/- 60˚ (margin 30˚), it is considered a chair conformation. If two opposite torsions are close to 0˚ (margin 30˚) and the other remaining torsion angles are around +/- 60˚, then it is considered a boat conformation. The rest are considered as a twisted conformation.

def ring_conformers(mol):
    angles = []
    confs = []

    for conf in mol.GetConformers():
        angs = np.abs([Chem.rdMolTransforms.GetDihedralDeg(conf, *t) for t in torsions])
        angles.append(angs)

        if np.all(np.abs((angs - 60)/60) < 0.66): 
            confs.append('chair')
        elif angs[0] < 30 and angs[3] < 30 and np.all(np.abs((angs[[1,3,4,5]] - 60)/60) < 0.5):
            confs.append('boat')
        elif angs[1] < 30 and angs[4] < 30 and np.all(np.abs((angs[[0,2,3,4]] - 60)/60) < 0.5):
            confs.append('boat')        
        elif angs[2] < 30 and angs[5] < 30 and np.all(np.abs((angs[[0,1,3,4]] - 60)/60) < 0.5):
            confs.append('boat')        
        else:
            confs.append('twisted')
    return confs
from collections import Counter
confs = ring_conformers(mol_h)
Counter(confs)
Counter({'chair': 254, 'twisted': 228, 'boat': 18})

So, I got 203 chair (40%), 18 boat (4%), and 279 twisted (56%) conformers. Let's take a look at some conformers and make sure we got the right conformers. First chair conformation.

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'chair']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

And a boat conformation:

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'boat']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

And finally a twisted conformation:

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'twisted']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

Optimize Geometry using MMFF

Optimizing the Geometry using MMFF does the number of chair conformations and reduces twisted conformers.

for conf in mol_h.GetConformers():
    AllChem.MMFFOptimizeMolecule(mol_h, confId=conf.GetId())
confs = ring_conformers(mol_h)
Counter(confs)
Counter({'chair': 366, 'twisted': 131, 'boat': 3})

Still about 50% of the conformers remain in the twisted conformation.

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'twisted']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

ETKDG version 3

I commented this issue on Greg's blog and was suggested to try ETKDG version 3 algorithm [ref]. Here's the result.

params = Chem.rdDistGeom.srETKDGv3()
params.randomSeed = 0xf00d
params.clearConfs = True

cids = AllChem.EmbedMultipleConfs(mol_h, numConfs=500, params=params)
confs = ring_conformers(mol_h)
Counter(confs)
Counter({'chair': 500})

All 500 chair conformation even without doing minimization! But strange things were happening when I actually looked at some conformers. It appears the torsion angle requirements are all satisfied, but the ring was actually very much twisted.

from random import shuffle
#conf_idx = [i for i, c in enumerate(confs) if c == 'chair']
#shuffle(conf_idx)
conf_idx = [443]

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

conf_idx = 443
conf = mol_h.GetConformer(conf_idx)
angs = [Chem.rdMolTransforms.GetDihedralDeg(conf, *t) for t in torsions]
print(angs)
[-46.976379180669305, -46.49507991432351, 70.73380111873767, -47.28801834351138, -46.701253412582766, 70.21989187631925]

In the six-membered ring, the signs of the torsion angle should alternate around positive and negative 60 degree. In my naive torsion angle check, I only checked if the absolute value of the torsion angles were around 60˚. However, in this case the torsion angles are arranged in around (-60, -60, +60, -60, -60, +60), which makes it twisted. Let's count how many of these are in our conformers.

confs = []
sums = []

for conf in mol_h.GetConformers():
    angs = np.array([Chem.rdMolTransforms.GetDihedralDeg(conf, *t) for t in torsions])
    
    # check if pos/neg values are alternating
    signs = angs / np.abs(angs)
    if np.sum(signs[1:] + signs[:-1]) == 0:
        confs.append('chair')
    else:
        confs.append('twisted')
Counter(confs)
Counter({'chair': 358, 'twisted': 142})

OK, so turns out, 358 (72%) of the conformers were actually in chair conformation and 142 (28%) of the conformers were in twisted conformers. Let's take a look at some of the chair and twisted conformers.

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'chair']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

A twisted conformer:

from random import shuffle
conf_idx = [i for i, c in enumerate(confs) if c == 'twisted']
shuffle(conf_idx)

viewer = py3Dmol.view(width=300, height=300)
viewer.addModel(Chem.MolToMolBlock(mol_h, confId=conf_idx[0]), 'mol')
viewer.setStyle({"stick":{}})
viewer.zoomTo()
viewer.show()

You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

Conclusion

I have used piperazine as an example because the molecule I was working had one, but I imagine cyclohexane may have the similar issue. The new ETKDG v3 algorithm does a good job getting chair conformation without FF optimization, but it could still produce seriously twisted conformers sometimes. User should check their conformer in such case with care before proceeding.

I have not looked into how RDKit's ETKDG algorithm is implemented yet. If it is simple, I might be able to contribute. Otherwise, I can use the conformation checker I used here to filter out non-chair conformers.