Automatization of Tinker for conformational search Tinker is a freely available, open source collection of molecular mechanics programs, which supports a variety of force fields, including MMFF, specifically designed for small organic molecules. One of the programs in Tinker is the scan conformers generator. The feature of this generator is the thoroughness of conformational search, which clearly distinguishes it from its contemporary analogues aimed at quick work with huge databases of drug-like compounds rather than targeting completeness of the conformational space. Some our recent examples indicating this are http://limor1.nioch.nsc.ru/quant/conformers/POXTRD/ http://limor1.nioch.nsc.ru/quant/conformers/1w96/ http://limor1.nioch.nsc.ru/quant/conformers/Verticillatriene/ http://limor1.nioch.nsc.ru/quant/conformers/tentoxin/ However, Tinker does not have scripts to automatically assign the atom types needed to run the force field calculations, e.g. MMFF94, the most reliable force field for small organic molecules. Fortunately, sdf2xyz2sdf can automatically generate Tinker inputs with MMFF94 atom types. Here, we propose an algorithm to automate an application of Tinker for conformational search. The algorithm is implemented in an operable bash script, see below. All explanations are given in the comments. #!/bin/bash # It's the public domain software # Dependencies: # 1) scan program from Tinker package, http://dasher.wustl.edu/tinker/ # Based on our expertise, we recommend the scan program from Tinker v.7 as # scan subroutine from Tinker v.8 (last release) works less reliable. # 2) sdf2tinkerxyz to be downloaded from sdf2xyz2sdf, http://sdf2xyz2sdf.sourceforge.net/ # 3) obabel to be downloaded from Open Babel, http://openbabel.org/wiki/Main_Page # Useful link about Tinker+sdf2xyz2sdf: # https://www.wildcardconsulting.dk/useful-information/scripting-molecular-mechanics-calculations-using-tinker-and-sdf2xyz2sdf/ TINKER_DIR="/usr/local/lib/tinker" scan="$TINKER_DIR/bin/scan" obabel="/usr/local/bin/obabel" sdf2tinkerxyz="$TINKER_DIR/bin/sdf2tinkerxyz" #tinkerxyz2sdf="$TINKER_DIR/bin/tinkerxyz2sdf" # Let's exemplified an application of this algorithm by # conformational search in cyclodecene. # You may draw 2D cyclodecene i.e. in MarvinSketch, add hydrogenes, # transform it in 3D and save it as sdf OR # generate from smile with Open Babel echo 'Generate 3D structure' $obabel -:"C1CCCC=CCCCC1" -O cyclodecene.sdf --gen3D #$obabel -:"C(=O)(O)C(C)(C)Oc1ccc(cc1)CC(=O)Nc1cc(cc(c1)C)C" -O 1g9w.sdf --gen3D echo # Likewise, any other format can be converted to sdf, i.e. #$obabel cyclodecene.xyz -O cyclodecene.sdf # But please check bonds in sdf if initial format doesn't contain connectivity. #for file in "$@"; do # For tinker_scan *.sdf for file in cyclodecene.sdf; do mol=`basename $file .sdf` # Tinker scan needs "good" numeration of atoms to work automatically # For example, cyclodecene drawn in Marvin Sketch # has "bad" numeration and scan fails with error # 'INITROT -- Rotation about 1 2 occurs more than once in Z-matrix' # Molden can renumerate atoms in "good" manner, but it isn't scriptable. # We found that canonical numeration produced by Open Babel is "good" for scan. echo 'Renumerate structure' $obabel $mol.sdf -O $mol.sdf --canonical --title $mol echo # Explicit title (1st line of sdf file) is needed for sdf2tinkerxyz # Convert SDF files into TINKER XYZ files performing automatic assignment # of MMFF94 atom types, bond types and charges. # http://dx.doi.org/10.1007/s00894-011-1111-7 echo "Convert SDF into TINKER XYZ" [ -e $mol.xyz ] && mv $mol.xyz $mol.xyzSA $sdf2tinkerxyz < $mol.sdf # title.xyz and title.key files are generated (title is 1st line of sdf file) mv $mol.xyz $mol.txyz [ -e $mol.xyzSA ] && mv $mol.xyzSA $mol.xyz echo # MMFF94 parameters echo "PARAMETERS $TINKER_DIR/params/mmff" >> $mol.key # To keep the chirality found at chiral tetravalent centers echo "ENFORCE-CHIRALITY" >> $mol.key ############ This paragraph does not apply to scan, but is relevant for ############ other tinker programs that compute the single point energy. # When dealing with the pi-sistems, sdf2tinkerxyz adds lines of type # 'mmff-pibond 2 3' into key file. Depending on the numbering of atoms, # this line may have an inverted order of numbers: 'mmff-pibond 3 2'. # This does not affect the optimized structure and its energy, but only the 1st # case gives correct single point energy. Although this has not been thoroughly # checked, it is possible that sorting the numbers in ascending order is the # solution to the problem. Our perl-script 'conformers' does this just in case. ############ # !!!!! RUN GENERATION OF CONFORMERS !!!!!! # Detailed description of Tinker's scan algorithm: # http://dasher.wustl.edu/ponder/papers/ccb-report-1998-01.pdf echo "Run Tinker scan for $mol" $scan $mol.txyz 0 5 20 0.0001 | tee $mol.out | \ grep -e 'Number of Torsions' -e 'Potential Surface Map' # Parameters '0 5 20 0.0001' for scan: # 0 - Automatic Selection of Torsional Angles # 5 - Number Search Directions for Local Search # 20 - Energy Threshold for Local Minima # 0.0001 - RMS Gradient per Atom Criterion # We recommend to try to increase 2nd and 3rd parameters for more full # conformational space (but sometimes incteasing of Search Directions # results in less space) # Extract from out file lines like # ` Potential Surface Map Minimum 54 36.9909` # Resulting line is '054 36.9909' echo "Extract energies from out file" perl -ane '/Potential Surface Map/ && printf "%03d %s\n", @F[4,5]' \ $mol.out > $mol.energy echo # Tinker format isn't common. # Unfortunately, program tinkerxyz2sdf from sdf2xyz2sdf doesn't work # well and obabel doesn't recognize Tinker's MMFF atoms. # Therefore we will convert Tinker format to XMol xyz format manually, # numbering the files according to the energy and # placing the energy in the comments (the 2nd line). # It takes a lot of time, perl script 'conformers' completes it more faster. echo "Sort conformers by energy and convert Tinker format into XMol xyz" count=1 sort -n -k 2 $mol.energy | while read N E; do fcount=`printf %03d $count` echo "$mol.$N => $mol.$fcount.xyz" (read N_atoms _ echo " $N_atoms" > $mol.$fcount.xyz echo " Energy $E" >> $mol.$fcount.xyz while read _ at x y z _; do at=`echo "$at" | perl -ne 's/^W+//; s/ZINC/ZN/; /^(C[LAU]|NA|SI|FE|[HCONSPFK]|BR|LI|ZN|MG)/ && print ucfirst lc $1'` echo " $at $x $y $z" >> $mol.$fcount.xyz done) < $mol.$N let count++ done echo # !!!!! END GENERATION OF CONFORMERS !!!!!! # The same work conformers perl-script can do # http://limor1.nioch.nsc.ru/quant/program/conformers/ #conformers -tinker -only_gen *.sdf # -only_gen eliminates additional optimization and removal of duplicates. # Often Tinker scan provides already optimized and only unique conformers, # therefore post-processing of the conformers is not required. # But sometimes removing of duplicates is required, i.e. by conformers script: # cat $mol.[0-9][0-9][0-9]*.xyz > t.xyz; conformers -no_mopac -no_ret t.xyz # Tinker scan does'n keep cis/trans configuration of double bonds. Is's bad. # As workaround, we use clusterization of conformers by smiles. echo "Clusterization of conformers by smiles" $obabel $mol.[0-9][0-9][0-9]*.xyz -o smi - | perl -lane \ '($smi,$name) = /(\S+)\s+(.+)/; push @{$h{$smi}}, $name; END {while (($k,$v) = each %h) {print "$k ",$#$v+1; print "@$v"}}' # In case of cyclodecene, result is # C\1=C\CCCCCCCC1: 32 conformers with cis double bond # C\1=C/CCCCCCCC1: 9 trans # [CH]1[CH]CCCCCCCC1: 13 also trans with strained double bond # which isn't recognized by obabel as double bond. # Perl script 'sort_by_smi' produce more readable and compact ouput # http://limor1.nioch.nsc.ru/quant/program/sort_by_smi/ # Delete conformers in Tinker format rm -f $mol.{[0-9][0-9][0-9],[0-9][0-9][0-9][0-9],[0-9][0-9][0-9][0-9][0-9]} rm -f $mol.{txyz,key,out,energy} done