A pinboard by
Chuan Tian

I'm a PhD candidate at Stony Brook University studying computational chemistry.


Understanding the mechanisms that govern biological processes requires scrutiny at spatial and temporal resolutions that can be challenging for current experimental techniques. Over the past three decades, molecular dynamics (MD) simulation has become an indispensable tool for studying biological phenomena with atomic precision and at relevant timescales. Such simulations have served as a “computational microscope”, providing information previously unattainable by experiment. Today, MD simulations are having a profound impact on the treatment of diseases and the development of new, potent drugs. For example, researchers have used MD simulations to capture the assembly of a whole virus inside host cells and to help explain how viruses cause disease. Insights from these findings could be particularly informative for developing new drugs. Currently, an experimental drug entering a Phase I clinical trial has only a 10% chance of reaching the market, and lack of clinical efficacy has become the most frequent cause of failure. A deeper understanding of the fundamental mechanisms that underlie disease pathologies will improve the success rate. To this end, MD simulations have made and will continue to make vital contributions.

In general, all biochemical processes in MD are governed by a set of physical equations termed a force field (FF). The FF mimics the forces acting on a molecule, and the accuracy of those forces determines the predictive capability of every MD simulation. As such, iteratively developing more precise FFs is vital to ensure that the theoretically derived models map onto physical reality. My current research focuses on improving the predictive power of FFs by performing quantum calculations and numerical optimizations on state-of-the-art supercomputers. Specifically, I proposed a robust modelling framework capable of providing a more rigorous description of the field of forces, followed by extensive testing against experimental benchmarks. My research has revealed that this ‘new’ FF provides a more reliable description of reality, which is a prerequisite for any MD simulation. Moreover, the new FF is superior to our lab’s prior FF, which is the standard in the field with over 4000 citations. Applying a more accurate model will greatly enhance our understanding of biological processes and aid our ability to control, manipulate and arrest such processes, which will contribute profoundly to new drug development.


ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB.

Abstract: Molecular mechanics is powerful for its speed in atomistic simulations, but an accurate force field is required. The Amber ff99SB force field improved protein secondary structure balance and dynamics from earlier force fields like ff99, but weaknesses in side chain rotamer and backbone secondary structure preferences have been identified. Here, we performed a complete refit of all amino acid side chain dihedral parameters, which had been carried over from ff94. The training set of conformations included multidimensional dihedral scans designed to improve transferability of the parameters. Improvement in all amino acids was obtained as compared to ff99SB. Parameters were also generated for alternate protonation states of ionizable side chains. Average errors in relative energies of pairs of conformations were under 1.0 kcal/mol as compared to QM, reduced 35% from ff99SB. We also took the opportunity to make empirical adjustments to the protein backbone dihedral parameters as compared to ff99SB. Multiple small adjustments of φ and ψ parameters were tested against NMR scalar coupling data and secondary structure content for short peptides. The best results were obtained from a physically motivated adjustment to the φ rotational profile that compensates for lack of ff99SB QM training data in the β-ppII transition region. Together, these backbone and side chain modifications (hereafter called ff14SB) not only better reproduced their benchmarks, but also improved secondary structure content in small peptides and reproduction of NMR χ1 scalar coupling measurements for proteins in solution. We also discuss the Amber ff12SB parameter set, a preliminary version of ff14SB that includes most of its improvements.
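
The error measure quoted in this abstract, the average error in relative energies of pairs of conformations, can be illustrated with a minimal sketch. The function name and the sample energies are hypothetical, and the exact definition used in the paper may differ; this version simply averages the absolute difference between force-field (MM) and QM relative energies over all conformation pairs.

```python
from itertools import combinations


def average_pairwise_error(mm_energies, qm_energies):
    """Mean absolute error in relative energies over all pairs of conformations.

    Both inputs hold one energy per conformation, in the same order and units
    (for example kcal/mol). Working with pairs removes the arbitrary energy
    offset between the two methods.
    """
    if len(mm_energies) != len(qm_energies):
        raise ValueError("energy lists must have the same length")
    if len(mm_energies) < 2:
        raise ValueError("need at least two conformations")
    errors = [
        abs((mm_energies[i] - mm_energies[j]) - (qm_energies[i] - qm_energies[j]))
        for i, j in combinations(range(len(mm_energies)), 2)
    ]
    return sum(errors) / len(errors)


# Hypothetical energies (kcal/mol) for three conformations of one side chain.
qm = [0.0, 1.2, 2.5]
mm = [10.0, 11.0, 13.0]
print(round(average_pairwise_error(mm, qm), 3))
```

Because only energy differences enter, shifting every MM energy by a constant leaves the result unchanged, which is why a pairwise measure is convenient when comparing two methods with different zero points.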

Pub.: 18 Nov '15, Pinned: 30 Aug '17

Comparison of multiple Amber force fields and development of improved protein backbone parameters.

Abstract: The ff94 force field that is commonly associated with the Amber simulation package is one of the most widely used parameter sets for biomolecular simulation. After a decade of extensive use and testing, limitations in this force field, such as over-stabilization of alpha-helices, were reported by us and other researchers. This led to a number of attempts to improve these parameters, resulting in a variety of "Amber" force fields and significant difficulty in determining which should be used for a particular application. We show that several of these continue to suffer from inadequate balance between different secondary structure elements. In addition, the approach used in most of these studies neglected to account for the existence in Amber of two sets of backbone phi/psi dihedral terms. This led to parameter sets that provide unreasonable conformational preferences for glycine. We report here an effort to improve the phi/psi dihedral terms in the ff99 energy function. Dihedral term parameters are based on fitting the energies of multiple conformations of glycine and alanine tetrapeptides from high level ab initio quantum mechanical calculations. The new parameters for backbone dihedrals replace those in the existing ff99 force field. This parameter set, which we denote ff99SB, achieves a better balance of secondary structure elements as judged by improved distribution of backbone dihedrals for glycine and alanine with respect to PDB survey data. It also accomplishes improved agreement with published experimental data for conformational preferences of short alanine peptides and better accord with experimental NMR relaxation data of test protein systems.

Pub.: 19 Sep '06, Pinned: 30 Aug '17

Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ(1) and χ(2) dihedral angles.

Abstract: While the quality of the current CHARMM22/CMAP additive force field for proteins has been demonstrated in a large number of applications, limitations in the model with respect to the equilibrium between the sampling of helical and extended conformations in folding simulations have been noted. To overcome this, as well as make other improvements in the model, we present a combination of refinements that should result in enhanced accuracy in simulations of proteins. The common (non Gly, Pro) backbone CMAP potential has been refined against experimental solution NMR data for weakly structured peptides, resulting in a rebalancing of the energies of the α-helix and extended regions of the Ramachandran map, correcting the α-helical bias of CHARMM22/CMAP. The Gly and Pro CMAPs have been refitted to more accurate quantum-mechanical energy surfaces. Side-chain torsion parameters have been optimized by fitting to backbone-dependent quantum-mechanical energy surfaces, followed by additional empirical optimization targeting NMR scalar couplings for unfolded proteins. A comprehensive validation of the revised force field was then performed against data not used to guide parametrization: (i) comparison of simulations of eight proteins in their crystal environments with crystal structures; (ii) comparison with backbone scalar couplings for weakly structured peptides; (iii) comparison with NMR residual dipolar couplings and scalar couplings for both backbone and side-chains in folded proteins; (iv) equilibrium folding of mini-proteins. The results indicate that the revised CHARMM 36 parameters represent an improved model for the modeling and simulation studies of proteins, including studies of protein folding, assembly and functionally relevant conformational changes.

Pub.: 24 Jan '13, Pinned: 30 Aug '17

Further along the Road Less Traveled: AMBER ff15ipq, an Original Protein Force Field Built on a Self-Consistent Physical Model.

Abstract: We present the AMBER ff15ipq force field for proteins, the second-generation force field derived using the Implicitly Polarized Q (IPolQ) scheme for deriving implicitly polarized atomic charges in the presence of explicit solvent. The ff15ipq force field is a complete rederivation including more than 300 unique atomic charges, 900 unique torsion terms, 60 new angle parameters, and new atomic radii for polar hydrogens. The atomic charges were derived in the context of the SPC/Eb water model, which yields more accurate rotational diffusion of proteins and enables direct calculation of NMR relaxation parameters from molecular dynamics simulations. The atomic radii improve the accuracy of modeling salt bridge interactions relative to contemporary fixed-charge force fields, rectifying a limitation of ff14ipq that resulted from its use of pair-specific Lennard-Jones radii. In addition, ff15ipq reproduces penta-alanine J-coupling constants exceptionally well, gives reasonable agreement with NMR relaxation rates, and maintains the expected conformational propensities of structured proteins/peptides as well as disordered peptides - all on the μs time scale, which is a critical regime for drug design applications. These encouraging results demonstrate the power and robustness of our automated methods for deriving new force fields. All parameters described here and the mdgx program used to fit them are included in the AmberTools16 distribution.

Pub.: 12 Jul '16, Pinned: 30 Aug '17

Binding Mechanism of Inhibitors to CDK5/p25 Complex: Free Energy Calculation and Ranking Aggregation Analysis.

Abstract: Cyclin-dependent kinase-5 (CDK5) plays an indispensable role in the central nervous system. Competitive inhibition of the ATP-binding pocket of CDK5 is involved in fighting neurodegenerative diseases, diabetes, tumors, inflammations etc. To better design ATP-binding competitive inhibitors, the binding mechanism of three important inhibitors of kinases, (R)-roscovitine (RRC), aloisine-A (ALH) and indirubin-3'-oxime (IXM), together with their receptor CDK5, were studied by molecular dynamics simulations. The H-bond analysis demonstrated that a strong bond was formed between the CO or NH groups in the backbone of Cys83 and the N or NH groups on the nitrogen-containing ring of inhibitors. These hydrogen bonds significantly increase the binding and inhibitory efficiency. The free energy analysis shows that the order of predicted binding affinities of these three inhibitors toward CDK5/p25 is IXM>ALH>RRC, which is consistent with the experimental data. Besides the hydrogen bond formation, the van der Waals interactions between residues Ile10, Val18, and Leu133 of CDK5 and inhibitors were discovered to constitute another substantial component of their binding mode. Worth mentioning is that the conformational turnover of the inhibitor RRC was observed during the course of molecular dynamics simulations. We believe that this is the reason why RRC has lower H-bond occupancy and binding affinity than the other two inhibitors. Furthermore, during the analysis of the per-residue decomposition, the ranking aggregation method was first employed to rank the contribution of different residues. The results demonstrated that the top five residues in the active pocket of CDK5 were Cys83, Leu133, Ile10, Phe82, and Glu81, which is in good agreement with the results of H-bond analysis and binding free energy analysis. These findings should provide insights into the inhibition mechanism of the CDK5/p25 complex and be useful for the rational design of novel ATP-binding competitive inhibitors in the near future.

Pub.: 01 Mar '13, Pinned: 30 Aug '17

Intrinsic backbone preferences are fully present in blocked amino acids.

Abstract: The preferences of amino acid residues for ϕ,ψ backbone angles vary strikingly among the amino acids, as shown by the backbone angle ϕ found from the (3)J(H(α),H(N)) coupling constant for short peptides in water. New data for the (3)J(H(α),H(N)) values of blocked amino acids (dipeptides) are given here. Dipeptides exhibit the full range of coupling constants shown by longer peptides such as GGXGG and dipeptides present the simplest system for analyzing backbone preferences. The dipeptide coupling constants are surprisingly close to values computed from the coil library (conformations of residues not in helices and not in sheets). Published coupling constants for GGXGG peptides agree closely with dipeptide values for all nonpolar residues and for some polar residues but not for X = D, N, T, and Y, which are probably affected by polar side chain-backbone interactions in GGXGG peptides. Thus, intrinsic backbone preferences are already determined at the dipeptide level and remain almost unchanged in GGXGG peptides and are strikingly similar in the coil library of conformations from protein structures. The simplest explanation for the backbone preferences is that backbone conformations are strongly affected by electrostatic dipole-dipole interactions in the peptide backbone and by screening of these interactions with water, which depends on nearby side chains. Strong backbone electrostatic interactions occur in dipeptides. This is shown by calculations both of backbone electrostatic energy for different conformers of the alanine dipeptide in the gas phase and by electrostatic solvation free energies of amino acid dipeptides.

Pub.: 21 Jan '06, Pinned: 30 Aug '17

A Novel Approach for Deriving Force Field Torsion Angle Parameters Accounting for Conformation-Dependent Solvation Effects.

Abstract: A procedure for deriving force field torsion parameters including certain previously neglected solvation effects is suggested. In contrast to the conventional in vacuo approaches, the dihedral parameters are obtained from the difference between the quantum-mechanical self-consistent reaction field and Poisson-Boltzmann continuum solvation models. An analysis of the solvation contributions shows that two major effects neglected when torsion parameters are derived in vacuo are (i) conformation-dependent solute polarization and (ii) solvation of conformation-dependent charge distribution. Using the glycosidic torsion as an example, we demonstrate that the corresponding correction for the torsion potential is substantial and important. Our approach avoids double counting of solvation effects and provides parameters that may be used in combination with any of the widely used nonpolarizable discrete solvent models, such as TIPnP or SPC/E, or with continuum solvent models. Differences between our model and the previously suggested solvation models are discussed. Improvements were demonstrated for the latest AMBER RNA χOL3 parameters derived with inclusion of solvent effects in a previous publication (Zgarbova et al. J. Chem. Theory Comput. 2011, 7, 2886). The described procedure may help to provide consistently better force field parameters than the currently used parametrization approaches.

Pub.: 11 Sep '12, Pinned: 30 Aug '17

Context-independent, temperature-dependent helical propensities for amino acid residues.

Abstract: Assigned from data sets measured in water at 2, 25, and 60 °C containing (13)C=O NMR chemical shifts and [θ](222) ellipticities, helical propensities are reported for the 20 genetically coded amino acids, as well as for norvaline and norleucine. These have been introduced by chemical synthesis at central sites within length-optimized, spaced, solubilized Ala(19) hosts. The resulting polyalanine-derived, quantitative propensity sets express for each residue its temperature-dependent but context-independent tendency to forego a coil state and join a preexisting helical conformation. At 2 °C their rank ordering is: P < G < H < C, T, N < S < Y, F, W < V, D < K < Q < I < R, M < L < E < A; at 60 °C the rank becomes: H, P < G < C < R, K < T, Y, F < N, V < S < Q < W, D < I, M < E < A < L. The ΔΔG values, kcal/mol, relative to alanine, for the cluster T, N, S, Y, F, W, V, D, Q, imply that at 2 °C all are strong breakers: ΔΔG(mean) = +0.63 ± 0.11, but at 60 °C their breaking tendencies are dramatically attenuated and converge toward the mean: ΔΔG(mean) = +0.25 ± 0.07. Accurate modeling of helix-rich proteins found in thermophiles, mesophiles, and organisms that flourish near 0 °C thus requires appropriately matched propensity sets. Comparisons are offered between the temperature-dependent propensity assignments of this study and those previously assigned by the Scheraga group; the special problems that attend propensity assignments for charged residues are illustrated by lysine guest data; and comparisons of errors in helicity assignments from shifts and ellipticity data show that the former provide superior precision and accuracy.

Pub.: 26 Aug '09, Pinned: 30 Aug '17

Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions.

Abstract: We present a new continuum solvation model based on the quantum mechanical charge density of a solute molecule interacting with a continuum description of the solvent. The model is called SMD, where the "D" stands for "density" to denote that the full solute electron density is used without defining partial atomic charges. "Continuum" denotes that the solvent is not represented explicitly but rather as a dielectric medium with surface tension at the solute-solvent boundary. SMD is a universal solvation model, where "universal" denotes its applicability to any charged or uncharged solute in any solvent or liquid medium for which a few key descriptors are known (in particular, dielectric constant, refractive index, bulk surface tension, and acidity and basicity parameters). The model separates the observable solvation free energy into two main components. The first component is the bulk electrostatic contribution arising from a self-consistent reaction field treatment that involves the solution of the nonhomogeneous Poisson equation for electrostatics in terms of the integral-equation-formalism polarizable continuum model (IEF-PCM). The cavities for the bulk electrostatic calculation are defined by superpositions of nuclear-centered spheres. The second component is called the cavity-dispersion-solvent-structure term and is the contribution arising from short-range interactions between the solute and solvent molecules in the first solvation shell. This contribution is a sum of terms that are proportional (with geometry-dependent proportionality constants called atomic surface tensions) to the solvent-accessible surface areas of the individual atoms of the solute. 
The SMD model has been parametrized with a training set of 2821 solvation data including 112 aqueous ionic solvation free energies, 220 solvation free energies for 166 ions in acetonitrile, methanol, and dimethyl sulfoxide, 2346 solvation free energies for 318 neutral solutes in 91 solvents (90 nonaqueous organic solvents and water), and 143 transfer free energies for 93 neutral solutes between water and 15 organic solvents. The elements present in the solutes are H, C, N, O, F, Si, P, S, Cl, and Br. The SMD model employs a single set of parameters (intrinsic atomic Coulomb radii and atomic surface tension coefficients) optimized over six electronic structure methods: M05-2X/MIDI!6D, M05-2X/6-31G*, M05-2X/6-31+G**, M05-2X/cc-pVTZ, B3LYP/6-31G*, and HF/6-31G*. Although the SMD model has been parametrized using the IEF-PCM protocol for bulk electrostatics, it may also be employed with other algorithms for solving the nonhomogeneous Poisson equation for continuum solvation calculations in which the solute is represented by its electron density in real space. This includes, for example, the conductor-like screening algorithm. With the 6-31G* basis set, the SMD model achieves mean unsigned errors of 0.6-1.0 kcal/mol in the solvation free energies of tested neutrals and mean unsigned errors of 4 kcal/mol on average for ions with either Gaussian03 or GAMESS.
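
The two-component split described in this abstract can be sketched in a few lines. This is an illustration based only on the abstract's wording: a bulk electrostatic term plus a sum over atoms of surface tension times solvent-accessible surface area. The function names, units and numbers are hypothetical, and the published model contains details not captured here.

```python
def cds_term(surface_tensions, surface_areas):
    """Cavity-dispersion-solvent-structure term as a sum over solute atoms.

    surface_tensions: per-atom surface tensions (kcal/mol per square angstrom).
    surface_areas: per-atom solvent-accessible surface areas (square angstroms).
    """
    if len(surface_tensions) != len(surface_areas):
        raise ValueError("need one surface tension per atomic surface area")
    return sum(s * a for s, a in zip(surface_tensions, surface_areas))


def solvation_free_energy(bulk_electrostatic, surface_tensions, surface_areas):
    """Total solvation free energy = bulk electrostatics + CDS term (kcal/mol)."""
    return bulk_electrostatic + cds_term(surface_tensions, surface_areas)


# Hypothetical three-atom solute; none of these values come from the paper.
tensions = [0.010, -0.020, 0.005]
areas = [12.0, 8.0, 20.0]
print(solvation_free_energy(-6.3, tensions, areas))
```

In a real calculation the electrostatic term would come from a self-consistent reaction field computation and the surface tensions would depend on the solute geometry and solvent descriptors; here both are plain inputs.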

Pub.: 16 Apr '09, Pinned: 30 Aug '17

A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations.

Abstract: Molecular mechanics models have been applied extensively to study the dynamics of proteins and nucleic acids. Here we report the development of a third-generation point-charge all-atom force field for proteins. Following the earlier approach of Cornell et al., the charge set was obtained by fitting to the electrostatic potentials of dipeptides calculated using B3LYP/cc-pVTZ//HF/6-31G** quantum mechanical methods. The main-chain torsion parameters were obtained by fitting to the energy profiles of Ace-Ala-Nme and Ace-Gly-Nme di-peptides calculated using MP2/cc-pVTZ//HF/6-31G** quantum mechanical methods. All other parameters were taken from the existing AMBER database. The major departure from previous force fields is that all quantum mechanical calculations were done in the condensed phase with continuum solvent models and an effective dielectric constant of epsilon = 4. We anticipate that this force field parameter set will address certain critical shortcomings of previous force fields in condensed-phase simulations of proteins. Initial tests on peptides demonstrated a high degree of similarity between the calculated and the statistically measured Ramachandran maps for both Ace-Gly-Nme and Ace-Ala-Nme di-peptides. Some highlights of our results include (1) well-preserved balance between the extended and helical region distributions, and (2) favorable type-II poly-proline helical region in agreement with recent experiments. Backward compatibility between the new and Cornell et al. charge sets, as judged by overall agreement between dipole moments, allows a smooth transition to the new force field in the area of ligand-binding calculations. Test simulations on a large set of proteins are also discussed.

Pub.: 08 Oct '03, Pinned: 30 Aug '17

Populations of the three major backbone conformations in 19 amino acid dipeptides.

Abstract: The amide III region of the peptide infrared and Raman spectra has been used to determine the relative populations of the three major backbone conformations (P(II), β, and α(R)) in 19 amino acid dipeptides. The results provide a benchmark for force field or other methods of predicting backbone conformations in flexible peptides. There are three resolvable backbone bands in the amide III region. The major population is either P(II) or β for all dipeptides except Gly, whereas the α(R) population is measurable but always minor (≤ 10%) for 18 dipeptides. (The Gly ϕ,ψ map is complex and so is the interpretation of the amide III bands of Gly.) There are substantial differences in the relative β and P(II) populations among the 19 dipeptides. The band frequencies have been assigned as P(II), 1,317-1,306 cm(-1); α(R), 1,304-1,294 cm(-1); and β, 1,294-1,270 cm(-1). The three bands were measured by both attenuated total reflection spectroscopy and by Raman spectroscopy. Consistent results, both for band frequency and relative population, were obtained by both spectroscopic methods. The β and P(II) bands were assigned from the dependence of the (3)J(H(N),H(α)) coupling constant (known for all 19 dipeptides) on the relative β population. The P(II) band assignment agrees with one made earlier from Raman optical activity data. The temperature dependences of the relative β and P(II) populations fit the standard model with Boltzmann-weighted energies for alanine and leucine between 30 and 60 °C.
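
The temperature dependence mentioned at the end of this abstract can be illustrated with a generic two-state Boltzmann sketch. The enthalpy and entropy differences below are hypothetical placeholders, not values from the paper, and the function name is an assumption made for this example.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def two_state_populations(delta_h, delta_s, temp_c):
    """Relative populations of two states, e.g. P(II) (A) and beta (B).

    delta_h (kcal/mol) and delta_s (kcal/(mol*K)) describe the change A -> B.
    Returns (fraction_A, fraction_B) at temp_c degrees Celsius.
    """
    temp_k = temp_c + 273.15
    delta_g = delta_h - temp_k * delta_s
    ratio = math.exp(-delta_g / (R_KCAL * temp_k))  # population B / population A
    fraction_b = ratio / (1.0 + ratio)
    return 1.0 - fraction_b, fraction_b


# Hypothetical parameters, scanned over the 30-60 degree range of the study.
for temp_c in (30.0, 45.0, 60.0):
    frac_a, frac_b = two_state_populations(1.0, 0.002, temp_c)
    print(temp_c, round(frac_a, 3), round(frac_b, 3))
```

With a positive enthalpy difference, the higher-energy state gains population as the temperature rises, which is the qualitative behaviour such a fit is meant to capture.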

Pub.: 06 Jan '11, Pinned: 30 Aug '17

Structure and dynamics of the homologous series of alanine peptides: a joint molecular dynamics/NMR study.

Abstract: The phi,psi backbone angle distribution of small homopolymeric model peptides is investigated by a joint molecular dynamics (MD) simulation and heteronuclear NMR study. Combining the accuracy of the measured scalar coupling constants and the atomistic detail of the all-atom MD simulations with explicit solvent, the thermal populations of the peptide conformational states are determined with an uncertainty of <5%. Trialanine samples mainly (approximately 90%) a poly-l-proline II helix-like structure, some (approximately 10%) beta extended structure, but no alphaR helical conformations. No significant change in the distribution of conformers is observed with increasing chain length (Ala(3) to Ala(7)). Trivaline samples all three major conformations significantly. Triglycine samples the four corner regions of the Ramachandran space and exists in a slow conformational equilibrium between the cis and trans conformation of peptide bonds. The backbone angle distribution was also studied for the segment Ala3 surrounded by either three or eight amino acids on both N- and C-termini from a sequence derived from the protein hen egg white lysozyme. While the conformational distribution of the central three alanine residues in the 9mer is similar to that for the small peptides Ala(3)-Ala(7), major differences are found for the 19mer, which significantly (30-40%) samples alphaR helical structures.

Pub.: 01 Feb '07, Pinned: 30 Aug '17

Revised RNA dihedral parameters for the Amber force field improve RNA molecular dynamics.

Abstract: The backbone dihedral parameters of the Amber RNA force field were improved by fitting using multiple linear regression to potential energies determined by quantum chemistry calculations. Five backbone and four glycosidic dihedral parameters were fit simultaneously to reproduce the potential energies determined by a high-level density functional theory calculation (B97D3 functional with the AUG-CC-PVTZ basis set). Umbrella sampling was used to determine conformational free energies along the dihedral angles, and these better agree with the population of conformations observed in the protein data bank for the new parameters than for the conventional parameters. Molecular dynamics simulations performed on a set of hairpin loops, duplexes and tetramers with the new parameter set show improved modeling for the structures of tetramers CCCC, CAAU and GACC, and an RNA internal loop of non-canonical pairs, as compared to the conventional parameters. For the tetramers, the new parameters largely avoid the incorrect intercalated structures that dominate the conformational samples from the conventional parameters. For the internal loop, the major conformation solved by NMR is stable with the new parameters, but not with the conventional parameters. The new force field performs similarly to the conventional parameters for the UUCG and GCAA hairpin loops and the [U(UA)6A]2 duplex.
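
Fitting dihedral terms by linear regression, as this abstract describes, works because a cosine term with a phase can be expanded into cosine and sine components whose amplitudes enter the energy linearly. The following is a minimal single-dihedral sketch under that assumption; the paper fits nine dihedrals simultaneously against DFT energies, and the function names and synthetic data here are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; add more conformations")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_dihedral_terms(angles_deg, ref_energies, periodicities=(1, 2, 3)):
    """Least-squares fit of E(phi) = c0 + sum_n [a_n cos(n phi) + b_n sin(n phi)].

    Returns the coefficients in the order [c0, a_1, b_1, a_2, b_2, ...].
    """
    rows = []
    for angle in angles_deg:
        phi = math.radians(angle)
        row = [1.0]
        for n in periodicities:
            row += [math.cos(n * phi), math.sin(n * phi)]
        rows.append(row)
    k = len(rows[0])
    # Normal equations: (X^T X) c = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * e for r, e in zip(rows, ref_energies)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic reference energies standing in for a quantum chemistry scan.
angles = list(range(0, 360, 15))
energies = [1.0 + 0.5 * math.cos(math.radians(t)) + 0.2 * math.sin(math.radians(2 * t))
            for t in angles]
print([round(c, 6) for c in fit_dihedral_terms(angles, energies)])
```

Solving the normal equations directly keeps the sketch dependency-free; for larger or poorly conditioned fits, a library least-squares routine would be the more robust choice.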

Pub.: 04 Jan '17, Pinned: 30 Aug '17