10[1]化学分子计算机代数下载_在线阅读_25

is_659975

暂无简介

10[1]化学分子计算机代数 Computer algebra methods for studying and computing molecular conformations Ioannis Z. Emiris and Bernard Mourrain Institut National de Recherche en Informatique et Automatique (INRIA) 2004, route des Lucioles, B.P. 93, 06902 Sophia-Antipolis, France. {emiris,...

Computer algebra methods for studying and computing molecular conformations Ioannis Z. Emiris and Bernard Mourrain Institut National de Recherche en Informatique et Automatique (INRIA) 2004, route des Lucioles, B.P. 93, 06902 Sophia-Antipolis, France. {emiris,mourrain}@sophia.inria.fr June 15, 1997 Abstract A relatively new branch of computational biology has been emerging as an e�ort to supplement traditional tech- niques of large scale search in drug design by structure-based methods, in order to improve e�ciency and guarantee completeness. This paper studies the geometric structure of cyclic molecules, in particular the enumeration of all possible conformations, which is crucial in �nding the energetically favorable geometries, and the identi�cation of all degenerate conformations. Recent advances in computational algebra are exploited, including distance geometry, sparse polynomial theory, and matrix methods for numerically solving nonlinear multivariate polynomial systems. Moreover, we propose a complete array of computer algebra and symbolic computational geometry methods for modeling the rigidity constraints, formulating the problems in algebraic terms and, lastly, visualizing the computed conformations. The use of computer algebra systems and of public domain software is illustrated, in addition to more specialized programs developed by the authors, which are also freely available. Throughout our discussion, we show the relevance of successful paradigms and algorithms from geometry and robot kinematics to computational biology. Keywords: Molecular conformations, structure-based design, geometric and kinematic constraints, computer algebra, equation solving. 1 Introduction Identifying molecular structure is a critical question in pharmaceutical drug design and discovery for human medicine as well as veterinary products, insecticides and herbicides [BMB94]. More speci�cally, geometric structure is essential in function identi�cation concerning intramolecular and intermolecular properties [HK88], docking of relatively small �exible ligands to macromolecules, especially when the structure and the chemically predominant features at the receptor site's are unknown [LK92] and, lastly, pharmacophoric pattern matching [CJW + 94]. A contemporary e�ort is to supplement traditional techniques of large scale search by structure-based design methods in order to accelerate the screening process or guarantee its completeness. To realize the e�ort and cost involved in these domains we note that human drugs take an average of 8 to 12 years of research until a new product arrives in the market [BMB94]. This paper proposes computer algebra and symbolic computational geometry methods for modeling, computing and visualizing all spatial con�gurations, or conformations, of molecules. These methods are applied to small or medium-sized molecules, a domain that has been very active lately [Lea91, BMB94, FHK + 96]. In particular, we concentrate on cyclic, or ring, molecules; more generally, we consider constraints on cyclic portions of molecules. Enumeration of all possible conformations provides a complete and rigorous way to �nd an optimal or a few acceptable con�gurations with respect to energy minimization. The basic hypothesis is that, at a �rst approximation, energy depends only upon the torsion dihedral angles about certain bonds while bond angles and bond lengths may 1 be considered as rigid [GS70, CJW + 94]. Analogous methods are used to examine the intimately related problem of identifying singular or unstable conformations. An immediate generalization of these problems is to computing local deformations, where conditions are sought on the rotatable bonds connecting two rigid bodies so that they are in a given relative position [SNW94], and to embedding molecules in 3-dimensional Euclidean space. We propose general methods for modeling each problem in algebraic terms so that the resulting polynomial system has small dimension. We provide a brief introduction to relatively new and very powerful algorithms for studying basic properties of systems of simultaneous polynomial equations, in particular for counting and computing their common roots and identifying their singularities. We illustrate the use of computer algebra packages, such as Maple, which are straightforward to use, as well as software libraries and specialized implementations, which include algebraic and numeric algorithms of higher performance. Most of the implementations referred to in this paper are already publicly available, otherwise they can be obtained by the authors. Lastly, we demonstrate the use of two freely available visualization tools. The principal merits of the proposed methods are their completeness, their mathematical rigor and the limited need for user intervention. Furthermore, our methods are able to deal with approximate inputs or inputs of limited accuracy and produce the best possible output under some measure. Lastly, although not required, these methods can incorporate probabilistic techniques such as those in [FHK + 96] for dealing with large molecules. An overview of the paper follows. The next section points to related work. Section 3 examines algorithms to identify all possible conformations of a molecule, given some rigidity constraints. The problem is translated into algebraic terms in sections 3.1.1 and 3.1.2, �rst by a �ap angle formulation and then by distance geometry, both approaches yielding a reduction in the apparent problem dimension. Sections 3.2 and 3.3 apply recent advances in computational algebra that exploit the sparseness of the equations in order to estimate the number of solutions, then compute them numerically. We compare di�erent alternatives with respect to their speed and accuracy. Section 3.4 extends our methods from 6-atommolecules to 5- and 7-atommolecules, which is interesting algebraically, since the respective polynomial systems are over- and under-constrained. Section 4 proposes constructive methods for identifying the in�nitesimally unstable conformations. Section 5 discusses the software used, especially an object-oriented Maple modeler of geometric constraints and the visualization tools employed. A summary of our contributions concludes the paper in section 6. 2 Related work Several algorithmic approaches have been proposed for modeling, then solving problems of geometric nature con- cerning molecules. For conformational search (�nding all conformations minimizing a given energy function), di�erent search approaches have mostly been used, besides molecular dynamics, randomized and optimization tech- niques [HK88, Lea91, CJW + 94]. Such methods su�er from lack of completeness and are not always su�ciently fast. The problem addressed here is a critical subproblem of conformational search. We focus on constraints entailed by rings, which has been extensively studied by G o and Scheraga in the 1970's [GS70, GS73]. They established the viewpoint adopted in this paper, namely of considering the dihedral torsion angles as the only non-rigid parameters. At a �rst level of approximation, the search in the space of dihedral angles yields all minimum-energy conformations. Formalizing the rigidity constraints into algebraic terms may be harder than one imagines, especially since we try to minimize the dimension of the resulting polynomial system. Our �rst approach discussed in section 3.1.1 uses �ap angles and has been proposed by Parsons [PC94] for the cyclohexane, but had never been generalized. Distance geometry, used in section 3.1.2, provides a quite general approach and has been also used in [Hav91, HN95, HN97]. Although we do not develop these applications in detail, our methods can be used to solve kinematic and geometric constraints appearing in ligand docking as well as in searching for pharmacophores (3-dimensional structures exhibiting certain chemical properties); see [LK92, CJW + 94, PC94, FHK + 96]. The underlying premise is that such constraints are expressed by polynomial equations whose solution indicates either the position or the space transformation for the molecule in question. In solving the algebraic system, several ad hoc methods have been proposed, including [HN95]; see also [HN97]. The main contribution of [MZW95] is to use matrix methods for e�ciently solving geometric problems regarding cyclic molecules with 6 to 8 rotational degrees of freedom. In particular, extending to molecules with more than six varying dihedrals has been approached by a grid search of the space of the two last dihedrals, each grid point giving 2 rise to a six-dimensional subproblem [GS70, MZW95]. For molecules with six free dihedrals and an arbitrary number of atoms, the problem is identical to the inverse kinematics of a general serial manipulator with six rotational joints, which is a settled question for roboticists [MC92]. We suggest new methods that are general and guarantee the computation of all solutions, while exploiting the sparseness of the polynomial equations. Figure 1: One conformation of a cyclic 6-atom molecule. We point out the relevance of our work with respect to a relatively recent e�ort for applying successful paradigms from combinatorial geometry, robot kinematics, mechanism theory as well as vision to predicting the structure of molecules, embedding them in Euclidean space and �nding the energetically favorable conformations [PC94, FHK + 96]. The main premise for this interaction is the observation that various structural requirements on molecules can be modeled as macroscopic geometric or kinematic constraints. This work focuses on algebraic algorithms that have been very useful in the areas mentioned above but are not well-known among structural biologists and computational chemists. Incorporating energy minimization into the conformational analysis is the �nal objective and is most often performed independently, after the conformational analysis [HK88, FHK + 96]. However, this is a hard enough problem to warrant separate consideration. 3 Molecular conformations This section examines the problem of computing all conformations of cyclic molecules of 6 atoms, then section 3.4 extends to rings of 5 and 7 atoms. Conformations specify the 3-dimensional structure of the molecule under the ring closure requirement. Energy minima are typically found among the conformations obtained by allowing only the torsion dihedral angles to vary about single covalent bonds, while keeping bond lengths and bond angles �xed. A torsion dihedral angle is the solid angle between two consecutive planes, each de�ned by an atom and its two bonds. The medium of the molecule is ignored. Figure 1 shows a molecule drawn by izic; see section 5. The relationship to geometry and robotics is obvious once bonds are thought of as rigid joints and atoms as the mechanism's links or articulations. Figure 2 by D. Parsons [Par94] illustrates the kinematic equivalence between a cyclic 6-atom molecule and a robot with six rotational degrees of freedom (6R). The only movement allowed is rotation around the bonds' axes, so the question of identifying a conformation reduces to �nding the respective pose. In kinematic terms, the molecule is equivalent to a serial mechanism in which each pair of consecutive axes intersects at a link. This implies that the link o�sets are zero for all six links, which will allow us to reduce the 3 Figure 2: A cyclic 6-atom molecule and the equivalent 6R robot. 6-dimensional problem to a system of 3 polynomials in 3 unknowns. The product of all link transformation matrices is the identity matrix, since the end-e�ector is at the same position and orientation as the base link. 3.1 Algebraic formulation The molecule has a cyclic backbone of 6 atoms, typically of carbon. They determine primary structure, the object of our study. Carbon-hydrogen or other bonds outside the backbone are ignored. The bond lengths and angles provide the constraints while the six dihedral angles are allowed to vary. 3.1.1 A formulation using angles We adopt an approach proposed by Parsons [PC94, Par94]. His approach has been generalized and used on other molecules, including section 3.4, since it possesses the merit of capturing the e�ective dimension of the underlying system and of facilitating the visualization of the molecule once all parameters have been computed. Notation is de�ned in �gure 3. Backbone atoms are regarded as points p 1 ; : : : ; p 6 2 R 3 ; the unknown dihedrals are the angles ! 1 ; : : : ; ! 6 about axes (p 6 ; p 1 ) and (p i�1 ; p i ) for i = 2; : : : ; 6. Each of triangles T 1 = 4(p 1 ; p 2 ; p 6 ), T 2 =4(p 2 ; p 3 ; p 4 ) and T 3 = 4(p 4 ; p 5 ; p 6 ) is �xed for constant bond lengths L 1 ; : : : ; L 6 and bond angles � 1 ; � 3 ; � 5 . Then the lengths of (p 2 ; p 6 ), (p 2 ; p 4 ) and (p 4 ; p 6 ) are constant, hence base triangle 4(p 2 ; p 4 ; p 6 ) is �xed in space, de�ning the xy-plane of a coordinate frame. Let � 1 be the (dihedral) angle between the plane of T 1 and the xy-plane. Clearly, for any conformation, � 1 is well-de�ned. Similarly we de�ne angles � 2 and � 3 . We call them �ap (dihedral) angles to distinguish them from the bond dihedrals. Figure 3 attempts to capture the 3-dimensional nature of the molecule by showing the normal vectors to the triangles T i , denoted by N(T i) in the �gure, i = 1; 2; 3. Conversely, given lengths L i , angles � i for i = 1; : : : ; 6 and �ap angles � i for i = 1; : : : ; 3 the coordinates of all points p i are uniquely determined and hence the bond dihedral angles and the associated conformation are all well-de�ned. We have therefore reduced the problem to computing the three �ap angles � i which satisfy the constraints on bond angles � 2 ; � 4 ; � 6 . Hence we obtain polynomial system � 11 + � 12 cos � 2 + � 13 cos � 3 + � 14 cos � 2 cos � 3 + � 15 sin � 2 sin � 3 = 0; � 21 + � 22 cos � 3 + � 23 cos � 1 + � 24 cos � 3 cos � 1 + � 25 sin � 3 sin � 1 = 0; � 31 + � 32 cos � 1 + � 33 cos � 2 + � 34 cos � 1 cos � 2 + � 35 sin � 1 sin � 2 = 0; (1) 4 1 θ1 �� ω �� N(T2) �� L1 L2 L6 p1 b2 b1p5 a a a3 2 1 r p6 φ φ φ 1 5 6 ω2 ω6 θ 3 N(T1) N(T3) Figure 3: The parameters of the cyclohexane. cos 2 � 1 + sin 2 � 1 � 1 = 0; cos 2 � 2 + sin 2 � 2 � 1 = 0; cos 2 � 3 + sin 2 � 3 � 1 = 0; where the � ij are input coe�cients. For our resultant solvers we prefer an equivalent formulation with a smaller number of polynomials, obtained by applying the standard transformation to half-angles that gives rational equations in the new unknowns t i : t i = tan � i 2 : cos � i = 1� t 2 i 1 + t 2 i ; sin � i = 2t i 1 + t 2 i ; i = 1; 2; 3: This transformation captures automatically the last three equations in (1). By multiplying both sides of the i-th equation by (1 + t 2 j )(1 + t 2 k ), where (i; j; k) is a permutation of f1; 2; 3g, the polynomial system becomes f 1 = � 11 + � 12 t 2 2 + � 13 t 2 3 + � 14 t 2 t 3 + � 15 t 2 2 t 2 3 = 0 f 2 = � 21 + � 22 t 2 3 + � 23 t 2 1 + � 24 t 3 t 1 + � 25 t 2 3 t 2 1 = 0 (2) f 3 = � 31 + � 32 t 2 1 + � 33 t 2 2 + � 34 t 1 t 2 + � 35 t 2 1 t 2 2 = 0 where � ij are input coe�cients. 3.1.2 A formulation using distances Another formulation of the problem relying on the geometry of distances is presented here. It aims to show how Computational Symbolic Geometry can help in easily deriving constraints for such problems. Another application of this approach can be found in [Hav91, HN95]. Let us �rst have a glimpse on the space of spheres. A sphere has a unique equation (up to a nonzero scalar) of the form u 0 (x 2 + y 2 + z 2 )� 2u 1 x� 2u 2 y � 2u 3 z + u 4 = 0; (3) therefore we can identify the space of spheres with a subset of the projective space P 4 of classes (u 0 : u 1 : : : : : u 4 ). We also denote this projective space by S. 5 Conversely, given a point (u 0 : u 1 : : : : : u 4 ) 2 S = P 4 , if u 0 6= 0, the quadric (3) de�nes a sphere of center (1; u 1 u 0 ; u 2 u 0 ; u 3 u 0 ) (or (u 0 : � � � : u 3 ) in P 3 ) and radius R such that R 2 = Q(u 0 ;:::;u 4 ) u 2 0 where Q(u 0 ; : : : ; u 4 ) = u 2 1 + u 2 2 + u 2 3 � u 0 u 4 : If u 0 = 0, the quadric obtained by homogenization of (3) is the union of an a�ne plane and the plane at in�nity of P 3 (de�ned by t = 0, where t is the homogenization variable in P 3 ). It corresponds to a �point at in�nity� in P 4 . Moreover, if u 0 = u 1 = : : : = u 3 = 0, we obtain the special �sphere� ! = (0 : : : : : 0 : 1), whose center is nowhere. The special spheres of radius 0, will be called point-spheres. To any a�ne point A = (1 : a 1 : a 2 : a 3 ) 2 A 3 , we associate the point-sphere _ A = (1 : a 1 : a 2 : a 3 : a 2 1 + a 2 2 + a 2 3 ). This gives an embedding of A 3 in S. The space of spheres S has a natural inner-product given by the following formula: for any spheres S = (u 0 : � � � : u 4 ); S 0 = (u 0 0 : � � � : u 0 4 ) in S, (SjS 0 ) = u 1 u 0 1 + u 2 u 0 2 + u 3 u 0 3 � 1 2 (u 0 u 0 4 + u 4 u 0 0 ): It is not hard to check that if u 0 = u 0 0 = 1, then (SjS 0 ) = 1 2 (R 2 +R 02 � d 2 (O;O 0 )) where R;R 0 are the radii of the spheres, O;O 0 their centers and d 2 (O;O 0 ) the square of the distance between the two centers. Therefore, for two points A;B 2 A 3 , the inner-product is ( _ Aj _ B) = � 1 2 d 2 (A;B) (4) and we also have ( _ Aj _ A) = 0. We also check that for any point A 2 A 3 , ( _ Aj!) = � 1 2 : (5) See, for instance, [MS94, DH91] for more information. Let us come back now to the con�gurations of p 1 ; : : : ; p 6 and let us denote by L the list of point-spheres (!; _p 1 ; : : : ; _p 6 ). Consider the 7� 7 symmetric matrix whose coe�cient (i; j) is the inner-product of the i-th element of L and its j-th element. It is the Gramm-Schmidt matrix, constructed with the list L and the inner-product ( j ). It is also called the Cayley-Menger matrix of the points p 1 ; : : : ; p 6 ; see, for instance, [Ber77]. The spheres, being represented by a vector with 5 coordinates, it is not hard to check that this matrix is at most of rank 5 (spheres are represented by vectors with 5 coordinates). Using the equations (4), (5) and dividing by � 1 2 , we obtain a matrix of the form ! p 1 p 2 p 3 p 4 p 5 p 6 ! p 1 p 2 p 3 p 4 p 5 p 6 0 B B B B B B B B @ 0 1 1 1 1 1 1 1 0 c 1;2 c 1;3 u 1 c 1;5 c 1;6 1 c 1;2 0 c 2;3 c 2;4 u 2 c 2;6 1 c 1;3 c 2;3 0

本文档为【10[1]化学分子计算机代数】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。

10[1]化学分子计算机代数

热门搜索

历史搜索