Research Article Open Access
Are Proteins Dynamically Heterogeneous? Neutron Scattering Analysis of Hydrogen Displacement Distributions
Wolfgang Doster1
1Technical University Munich, Physics-Department, D-85748 Garching, Germany and Jülich Centre for Neutron Science, FRM II, Lichtenbergstr. 1, D-85747 Garching, Germany.
*Corresponding author: Wolfgang Doster, Technical University Munich, Physics-Department, D-85748 Garching, Germany and Jülich Centre for Neutron Science, FRM II, Lichtenbergstr. 1, D-85747 Garching, Germany. Tel: (+49)8122903562; E-mail: @
Received: March 4, 2018; Accepted: April 2, 2018; Published: April 6, 2018
Citation: Doster W (2018) Are Proteins Dynamically Heterogeneous? Neutron Scattering Analysis of Hydrogen Displacement Distributions. Int J Mol Theor Phy. 2(1):1-14.
Abstract Top
Dynamic neutron scattering is a technique to record time resolved the molecular displacements of hydrogen atoms in biomolecules. With spectrometers, measuring the energy exchange, ħω, and momentum exchange, ħQ, of neutrons with the hydrogen atoms of the sample, one determines the scattering function S (Q, ω). The central physical quantity of interest, which can be derived from the scattering function by a double Fourier transform, is the displacement distribution Gs(r, t) . In this article we explain by three theoretical methods, how to determine the displacement distribution of hydrogen atoms in proteins from experimental data: (1) a moment expansion, (2) a numerical transform of the scattering function and (3) an analytical model of two principal dynamic components. The expansion yields a temperature dependent second moment of the displacement distribution and a Gauss-deviation, suggesting asymmetric motions. The numerical transform reveals a temperature dependent bimodal distribution function of small scale diffusive displacements and of rotational jumps on a larger scale. The principal components of the analytical model are thus rotational transitions of side chains and local diffusion of residues in a quasi-harmonic potential. The local diffusion is modelled by an over-damped Brownian oscillator derived from the Smoluchowski equation. The rotational transitions are treated by a three equivalent site model of the methyl group. Comparison of the model predictions with experimental data in the time domain show a remarkable agreement within a wide range of time scales, momentum exchange and temperature. The microscopic dynamics are compared with functional kinetic studies of ligand binding to myoglobin on a longer time scale. Dynamical heterogeneity is reduced to a bimodal distribution of molecular processes. Multiple scattering calculations provide an alternative explanation to heterogeneous quasi-elastic neutron scattering spectra and energy landscapes.

Keywords: Protein dynamical heterogeneity; Neutron scattering; Myoglobin; Methyl groups; Hydration water; Energy landscape, Ornstein-Uhlenbeck process.
Introduction
Proteins are hetero-polymers exhibiting well-defined, compact structures in contrast to artificial homo-polymers. Molecular dynamics in proteins thus differs from disordered open polymer structures. Depending on the sequence of amino acids one expects a heterogeneous system of local sites, which gives rise to considerable dynamical heterogeneity. Site sensitive techniques like neutron scattering and molecular dynamic simulations should be able to characterize the properties of fast motions in proteins. Is heterogeneity a dominant feature of protein dynamics or is it possible to focus on a small set of essential motions? The biological activity of enzymes is generally characterized by few well defined rate coefficients, while heterogeneity plays a minor role. This raises the question, whether the catalytic power of proteins requires the assistance of molecular dynamics. Arieh Warshel, co-Nobel prize winner of 2013 in Chemistry, argues that “flexibility is interfering with rate acceleration, catalysis requires rigid stereo-chemical structures“[1]. However conformational changes play an important role in regulation and specific catalytic steps. This open question can be addressed by combining dynamic studies with functional experiments. The idea of functional heterogeneity came from low temperature kinetic studies of ligand binding to the oxygen storage protein myoglobin. Flash photolysis experiments in various cryo-solvents revealed non-exponential kinetic steps with distributed rate coefficients [2]. Wide temperature range neutron scattering experiments and simulations of myoglobin were performed since 1988 to identify molecular motions on a pico-second time scale [4-7]. Instead of protein solutions, thermally stable hydrated powders were investigated [3]. This method avoids crystallization and reduces solvent scattering, moreover, the lack of solvent inhibits blurring global translational and rotational motions, while preserving the biological function. Two kinds of molecular motions, rotational transitions of side chains and small scale diffusive displacements were identified by low temperature transitions and the analysis of the non-Gaussian elastic scattering function [4]. An alternative interpretation of “dynamical heterogeneity” in terms of continuous displacement distributions was proposed simultaneously by Smith [6]. The two interpretations persist independently up to date, which is one of the main topics in this work. The new approach with myoglobin was soon applied to numerous other proteins. But instead of expanding the original spectral analysis of protein dynamics [4], the main effort until today is devoted to elastic scattering, which reflects mainly the ‘rigid’ part of the molecule [8-10]. The elastic spectrum is the most intense and thus most easily measurable component, allowing a simple analysis in analogy to static low angle scattering. Even more recent publications analyze elastic scattering curves complemented by simulations, which provide the missing spectral information [11]. Elastic scans were performed with numerous protein hydrated powders. Absolute Mean Square Displacement (MSD) can be determined from the slope of the scattering function versus momentum exchange (Figure 2) at low values as discussed below. Figure 1 shows
Figure 1: Second moment of hydrogen atom displacement distribution in D2O-hydrated myoglobin (green diamonds) and lysozyme (blue squares), instrument IN13, ${\tau }_{res}$ = 140 ps [4, 8, 9], heme iron of myoglobin, crystal: red circles, in sucrose solution: red triangles by Mössbauer spectroscopy [12, 13], short dashed line: vibrational displacements calculated from density of states [5]. Full line: model calculation, type II. The onset temperatures of type I and II motions are indicated. Dashed arrow: onset of vibrational displacements (Bose factor), arrow: unresolved transition, magenta circles: δB2 of the TR model.
Figure 2: Normalized elastic scattering profiles of D2O-hydrated myoglobin versus Q2 and temperature, backscattering spectrometer IN13 [4, 8, 7]. The full lines were adjusted to the TR model of equation (19-21)
typical temperature scans of the hydrogen atom mean square displacements comparing two proteins with different structures, hydrated lysozyme (β-sheets) and myoglobin (α-helices) [4, 8, 9]. There is little difference in MSD between the two proteins at low temperature, slight deviations emerge however above 240 K. The low temperature plateau is terminated at TL ≅ 50 K by the condition ħωL ≅ KBTL, the Bose factor of low frequency vibrations, crossing over to a linear regime of harmonic excitations [5]. A first deviation from linearity occurs at about 180 K, which is denoted as type I transition identical for both proteins. A second transition emerges at 240 K, hardly visible as a deviation from the dashed line, which we call type II transition. The type II displacements in the α-helical protein seem to be slightly larger than in the β-sheet protein. But the discrepancy could equally well reflect a different degree of hydration. The figure is meant to illustrate that the popular method based on the interpretation of displacement scans alone is difficult. The two transition temperatures were already identified in 1989 [4]. The assignment was only possible with the assistance of an analytical model of the complete elastic scattering function as shown in Figure 2. A combination of two molecular processes had to be assumed, rotational transitions (type I) and local translational diffusion (type II). The distinction became convincing, when it was realized that type II requires the presence of water, while type I is nearly independent of the protein environment [4, 8]. This model and its extensions will be discussed below.

Elastic scans are based on a fixed energy window method, which implies that the resulting displacements carry the time tag of the elastic instrumental resolution. For the back-scattering data of IN13 above, the relevant resolution time is τres ≅ 140 ps. Fixed energy window scans have been used with other methods. For myoglobin, the γ-resonance spectroscopy of the heme iron (Mössbauer) is often compared with neutron scattering, which is therefore displayed in Figure 1 [12, 13]. Although the γ- MSD scans resemble those obtained with neutrons, there are three main differences: (1) Mössbauer spectroscopy records the local displacements of the heme iron, while neutrons record the hydrogen atoms, which are distributed evenly across the protein, (2) the spatial scale, about 0.1 Å, is much smaller than for neutron scattering, which is in the range of a few Å. The γ-resonance thus allows one to measure protein global diffusion on a very small scale at very high resolution. From this method we know, that global diffusion is arrested in protein crystals and hydrated powders below 0.45 g /g [14]. (3) the instrumental resolution τresγ = 142 ns is by a factor of 1000 higher than with neutron backscattering (IN13), τresn ≅ 140 ps. Protein crystals have about the same water content as fully hydrated powders, suggesting similar dynamical properties. In contrast to neutron scattering, there is only a single transition with Mössbauer at Tγ ≅ 205 K. To investigate the effect of the solvent on the transition, we have performed γ-resonance experiments with myoglobin in a highly viscous sucrose solution [13]. As a result, the onset temperature is shifted to 240 K. This experiment shows that the heme interacts strongly with the surrounding solvent and is thus associated with type II motions. The difference by 35 degrees in the onset temperatures of γ/n in the hydrated/crystal case is a consequence of the different resolution times of the two methods. With this concept the temperature dependent iron displacement could be reproduced, with the 142 ns resolution time and the water relaxation time as input [10]. Frauenfelder et al. by contrast consider neutron and γ-resonance displacements as being essentially identical, which ignores the vastly different resolution times of the two methods [15]. A more detailed discussion of these discrepancies is given in reference [16]. Blinded by the narrow view of elastic scattering, one easily is led to incorrect conclusions. Similar experiments with different solvents have been performed with neutron scattering: Myoglobin in D-exchanged sucrose did not show a single transition and was classified as rigid [17]. By contrast, with myoglobin embedded in per-deuterated glucose glass, the type I transition could be fully observed as with dehydrated myoglobin [8, 9]. Sucrose contains non-exchangeable protons, which dominate the elastic intensity of the glassy myoglobin-solvent system, overwhelming the protein-internal motions.

After 30 years of effort, combining experiments and molecular dynamic simulations, all major questions should have been resolved. Instead, up to now, there is no accepted molecular model available in the literature, combining elastic and inelastic neutron scattering spectra of proteins. At present, this field is controlled by phenomenological models, which play a major role in the standard list of citations of neutron scattering work:

The Bicout-Zaccai model derives protein dynamics exclusively from elastic displacements and their temperature dependence as in Figure 1 [17, 18]. Reducing protein dynamics to harmonic oscillators, the central goal is to determine a global protein force constant from the displacement temperature slope. A change in the slope at a particular temperature is interpreted as indication of a softening transition of the general force constant [9]. Energy landscape models picture protein dynamics as single particle random walks across a complex surface. Frauenfelder has launched recently a massive attack on well-established neutron scattering theory, which is supposed to fail for complex systems [15, 19]. In this view, quasi-elastic neutron scattering spectra have to be analyzed by a heterogeneous superposition of resolution-limited narrow lines, dictated by the energy landscape. We show below that some of these conclusions are based on incorrect assumptions, ignoring for instance the effects of multiple scattering.

More detailed information can be derived from molecular dynamic simulations, by calculating the correlation function for each site. Protein structures are lacking translational symmetry, which suggests a broad distribution of molecular motions. In the limit of small momentum exchange, the scattering function may be expanded by a sum of Gaussian local site distributions. This is the ‘dynamical heterogeneity’ concept of protein sites [6, 20, 26]. Several distributions, Gamma, exponential, bimodal, derived by simulations, were found to be compatible with the data, the experimental curves however are site-averaged. Experimentally, we have deduced a bimodal distribution (8), which suggests a discrete set of processes.

In this article we follow the concept of a pluralistic methodology: It is shown, that the dynamic displacement distribution can determined theoretically by applying three methods: (1) a moment expansion of the distribution function, (2) a numerical transform of the elastic scattering functions and (3) an analytical model assuming two principal components of protein dynamics. The methods are explained by using a novel analysis of previously published data.
Methods
Neutron scattering requires big machines, reactors or accelerators, which are located in a dozen research facilities scattered over the world. The access is free to anybody, who can come up with good proposals for experiments. One of the goals of this article is to explain the method, trying to generate interest and understanding for its potential application in a biological context. With the advent of reactors and more recently of accelerators, neutrons have become a standard tool of investigating density fluctuations in condensed matter. The de Broglie wavelength of thermal neutrons, a few Å, is close to interatomic distances in liquids and solids. Thus interference effects occur, which yield information on the molecular structure of the scattering system. Second, since the neutron is uncharged, there is no Coulomb barrier to overcome. The scattering of nuclear forces can be treated by a delta-function Fermi pseudopotential [21-23]. For certain nuclides the scattering cross section is large, the hydrogen atom has a nearly ten times larger cross section than the other atoms of organic matter and the deuteron. In proteins typically 84% of the scattering fraction is due to the hydrogen atoms [24, 25]. Because of the low cross section of the deuteron, one can diminish the contribution of water by using D2O to about 4% at full hydration, 0.38g water per g protein [25]. Thirdly, the energy of thermal neutrons compares favorably with thermal excitations in biomolecules, which can be studied by energy exchange experiments on pico- to nanosecond time scale.

Below we focus on experiments performed mainly with the back-scattering spectrometer IN13 at the ILL in Grenoble [23]. Back-scattering by Bragg reflection maximizes the instrumental resolution (FWHM =7μeV), corresponding to a resolution time window, τres= ħ/FWHM ≅140 ps. Neutrons are selected from the primary beam by Bragg reflection at λ = 2.23 Å. The scattered beam hits the same type of analyzer crystal, whose lattice constants are modified by heating. A temperature scan of the mismatched crystal produces the spectral information from the scattered beam. The most important property of IN13 is the exceptionally large Q2 range up to 25 Å-2 combined with a high energy resolution due to the short primary wavelength. This combination allows in particular conclusive tests of spatial displacement distributions of molecular motions. A comparable back-scattering spectrometer, HFBS at NIST, has a Q2 range of only 4 Å-2.
Theoretical background [10, 21-23, 26]
The basic theoretical quantity of incoherent neutron scattering is the single particle density-density correlation function, ${G}_{s}\left(\stackrel{\to }{r},t\right)$ of displacements $\stackrel{\to }{r}\left(t\right)$ . Ignoring quantum effects, it may be interpreted as the conditional probability, averaged over all initial positions ${\stackrel{\to }{r}}_{0}$ , that a particle has moved across a distance $\stackrel{\to }{r}.$ during the time interval ‘t’.
where $p\left({\stackrel{\to }{r}}_{0},{t}_{0}/{\stackrel{\to }{r}}_{0}+\stackrel{\to }{r},{t}_{0}+t\right)$ denotes the conditional probability for this transition.

The Fourier transform of equation-(1) defines the selfintermediate scattering function ${\Phi }_{s}\left(\stackrel{\to }{Q},t\right)$ versus momentum exchange $\stackrel{\to }{Q}$ , which is measured in a scattering experiment:
The scattering function of equation-(2) can be expanded in powers of $\stackrel{\to }{Q}\cdot \stackrel{\to }{r}$ . and the spatial moments of $G\left(\stackrel{\to }{r},t\right)$ resulting in [8, 21-23, 26]:
For isotropic systems the average covers all orientations leading to:
For a Gaussian Gs(r, t) one has $〈{r}^{4}\left(t\right)〉=\frac{5}{3}〈{r}^{2}\left(t\right)〉$ which of course gives:
Often it is useful to split the intermediate scattering function into a time-dependent component F (Q, t) with F (Q, t → ∝) = 0 and a time-independent part, the EISF (Q) according to:
The EISF (Q) is also known as the “elastic incoherent structure factor” of the spectrum, since, in the frequency domain, it gives rise to elastic scattering. It is defined as the long time limit of the averaged scattering function according to equation- 2:
From the EISF (Q) one can reconstruct the powder averaged displacement distribution of proteins, based on experiments, which is the focus of this work. The Gaussian distribution of equation (5 ) yields at long times:
With $〈{\left(\Delta r\right)}_{G}^{2}〉=\frac{1}{2}〈{r}^{2}\left(t\to \infty \right)〉$ . On a semi-log scale versus Q2 one thus expects for the EISFG (Q) straight lines with slopes yielding the Gaussian MSD = <Δx2> = <ΔrG2>/3.

Experimentally, one is limited by the finite resolution time τres of the spectrometer. Thus instead of equation-7, one needs to consider a resolution corrected function, the REISF (Q), which accounts for the displacements at a particular time τres [10]:
Elastic scattering experiments are always performed in the frequency domain, which involves the Fourier transform of equation (2), given by the incoherent dynamic structure factor SR (Q, ω), convoluted with the instrumental resolution function R (ω, Δω = 1/τres):
Sqel (Q, ω) denotes the quasi-elastic spectrum, the Fourier transform of F (Q, t) in equation (9), the star is a short notion of the convolution integral. In the following we analyze the normalized elastic fraction REISF (Q, τres) with respect to a particular observation time, τres. A low temperature spectrum (10 K), whose shape is defined by the resolution function R (ω,Δω), serves to normalize the data obtained at the higher temperatures:
The normalized elastic intensity is then determined from the experiment:
The normalized elastic intensity in frequency space can be interpreted as an approximate intermediate scattering function in the time domain at the resolution time t ≅ τres, which is one basic result of “elastic resolution spectroscopy” [10, 27]:
τc(T) denotes the correlation time of a molecular process. The REISF (Q, τres, τc) differs from SNel (Q, τres, τc), since the latter includes a quasi-elastic intensity at ω = 0, while the former only records the true elastic intensity. This difference, usually ignored in elastic scattering studies, will be small only for small ratios τresc.
Results
Elastic backscattering experiments with D2O-hydrated myoglobin [4, 8]
The elastic intensity was determined with a fixed energy window method at a resolution of 7 μeV, full width at half maximum, applying the back-scattering spectrometer IN13 at the incident neutron wavelength of 2,23 Å: The primary beam crystal and the analyzer crystal are kept at the same temperature. The sample, 350 mg of D2O-hydrated horse myoglobin (SIGMA Chemical, h = 0.38 g/g), was continuously heated from 4 K to 330 K over a period of 24 h. The samples were measured in a thin walled, vacuum tight aluminum cell, diameter 50 mm and interspacing of 1 mm with the sample at 135° to the incident beam. With 350 mg of hydrated sample the neutron transmission was close to a tolerable 90%. Raw data were corrected for detector response and cell scattering. The elastic scattering curves, normalized at the lowest temperature according to equation-11, are displayed in figure 2. With normalization at 10 K, one removes the zero point vibrations, which are not relevant in the present context [4, 8]. However the second normalization step at Q = 0 Å-1 at each temperature as in references 4 and 19 was omitted. The data of figure 2 exhibit three main features:

(1) The elastic intensity decreases with increasing Q and temperature. The loss in intensity is balanced by an increasing quasi-elastic intensity and line broadening due to resolved molecular motions (equation 9).

(2) The elastic scattering curves on a logarithmic scale deviate from straight lines except at the lowest temperatures, revealing that the protein hydrogen displacements are not Gaussian distributed.

(3) Extrapolated to zero Q, the elastic intensity tends to decrease with increasing temperature, which is a clear sign of multiple scattering. These features are identical for nearly all hydrated and lyophilized proteins, irrespective of their structure.

The data in figure 2 vary smoothly with Q, the deviations from a straight line are subtle. Most workers are interested only in MSD values versus the temperature, which requires evaluating the slope at low Q [17, 18]. This can be done in many ways. Thus there is little consensus in the literature, how to properly analyze elastic scattering profiles. Since there is more information available than just MSD values, we approach the data of figure 2 from three different points of view.
The moment (Plazcek) expansion of elastic scattering data
The most general method is to perform a moment expansion without introducing specific assumptions about the nature of the molecular processes according equations 3&4. The moment expansion was previously applied to neutron scattering experiments in ref. 8, 28 and in the context of protein simulations and dynamical heterogeneity in reference 26.

The data of figure 2 are reasonably well approximated by a second order polynomial in Q2 ecording to equation (4):
While B (T) and C (T) involve the second and forth order spatial moment, the zero order term A (T) emerges in all experiments with a magnitude depending on the sample transmission. It thus reflects second and higher order scattering contributions, which are approximately angle independent [5, 29]. Figure 3 displays an example of how equation-(13) is adjusted to the data on a logarithmic scale applying different fitting intervals. Fits on a linear scale are not stable and increase at higher Q.

Figure 4 shows the resulting second moment (MSD) and the so-called Gauss deviation G (t), which is constructed from the second and the forth moment and can depend on time[8]:
Figure 3: Polynomial fit on a logarithmic scale according to equation 13 with different fitting intervals. Adjustments on a linear scale become unstable at higher Q. The very large Q range of IN13 is crucial to the success of this method.
Figure 4: Mean square displacements and Gauss deviation G (T) versus the temperature forIN13, triangles (τres= 140 ps) and IN6 (red circles, τres= 15 ps) from fits of the data in figure 2 to equation13.
For myoglobin, the Gauss deviation amounts to 0.75, which is significantly above 0.5, the values for a standard distribution. It is independent of the temperature and the fitting range. Protein displacements are thus either locally anisotropic or heterogeneous. The deviation becomes even larger at shorter observation times, suggesting anisotropy relaxation. The second moments <Δx2>= <Δr2>/3 are temperature dependent and display the well-known “dynamical transition” around 180 K at τres= 140 ps. Also shown are the short time displacements at τres ≅ 15 ps, measured with the time of flight spectrometer IN6 (ILL Grenoble). The displacements start to deviate from linearity above 200 K, but on a lower level. It is more difficult to identify a transition temperature. The full line denotes the vibrational MSD calculated from the density of states [5, 29].

Figure 5 shows that the zero order term of the expansion A(T) in equation-13 deviates from unity increasingly above 50K. Thus, the normalized elastic intensities do not extrapolate to unity at Q = 0 except at the lowest temperatures. This important result, which has been observed frequently, is not compatible with single scattering theory. Below we show for the first time, that A (T) can be reproduced temperature dependent by second scattering calculations. Multiple scattering effects are largely ignored in the bio-neutron scattering field. In 1990 we have published a protein vibrational density of states, which was corrected for multiple scattering [5]. Also the data in our analysis of displacement distributions, the MSD values were corrected for multiple scattering [8].
Figure 5: Zero-Q value A (T) derived from the polynomial fit to equation 13 (open circles) together with calculations of the multiple scattering intensities (full line). Also shown are the reconstructed elastic intensities at zero Q of dry GFP by Frauenfelder et al. (red circles) [19].
Direct numerical Fourier transform of elastic scattering data
By inverting the Fourier transform of equation (2) or its isotropic average of equation 7&8 numerically, one could in principle derive G(r, t ≅ τres) from experimental data. Figure 2 displays powder-averaged elastic scattering profiles from amorphous samples, thus only isotropic distributions can be obtained. After switching the powder average with the Fourier integral in equation (8), an isotropic displacement distribution is obtained according to:
with REISF (Q, τres) being symmetric in ± Q. Unfortunately, even the very large experimental Q-range of IN 13 is not sufficient to calculate meaningful Fourier transforms. The alternative is to approximate the experimental data by analytical functions, which can be easily transformed. The most appropriate choice, according to equation (7a), is a sum of Gaussian distributions. The elastic scattering profiles in figure 2 are well approximated by a sum of two Gaussians with temperature dependent parameters fi (T) and γi(T) [8]:
In Figure 6 the result of this procedure is displayed as 4π r2< G(r, tres)>power at various temperatures.

This quantity denotes the conditional probability that a particle, which started at the origin at t = 0, has moved to a shell between r and r+dr during an interval. t = τres.
Figure 6: Displacement distribution derived by analytical inversion of the data of figure 2 adjusted to equation 16, versus the temperature: 200, 220, 240, 260, 280 and 300 K. The classification as local diffusion and methyl torsion is only suggestive at this stage. The black curves reflect the hydrated sample at 0.38 g/g, while the red curve is derived from a lyophilized sample (< 0.05 g/g). Local diffusion is arrested, but rotational transitions are preserved as in the hydrated case.
The resulting displacement distribution exhibits two maxima, which evolve with the temperature: Below 180 K only a single maximum is recorded at r ≅ 0.2 Å, which broadens with increasing temperature. It reflects mostly vibrational motions. At about 240K, this maximum starts shifting to r ≅ 0.5 Å and its width increases. This effect is compatible with a Gaussian distribution of small scale displacements, which become resolved above this temperature. It is only observed with D2O-hydrated samples, dry samples do not show this feature [8]. Above 180 K a second maximum emerges around 1.5 Å, which can be found in all data sets independent of how the protein environment was prepared, hydrated, dehydrated or vitrified. Such large excursions point to rotational transitions.

This suggests that the bimodal distribution reflects two specific molecular processes. It should be stressed, that the method of inverting the Fourier transform using a sum of two Gaussians differs formally from the “dynamical heterogeneity” concept [20, 24, 26]. There a sum of Gaussians is used to explain the non-Gaussian nature of the scattering function by site heterogeneity. Since scattering is a local process, each atom “j” has its own dynamical structure factor. In this model all the local structure factors are assumed to be Gaussian with distributed second moments:
Intrinsic non-Gaussian scattering is thus excluded. This model is well adjusted to simulations, since the atomic scattering functions are available. To interpret experiments, one introduces groups of similar sites associated with MSD distribution functions g (α) [20, 24, 26]:
with α = <Δx2>. Exponential, Gamma and also bimodal distributions have been suggested in dynamical heterogeneity work of globular proteins [20]. The g (α) is a temperature independent distribution of weakly temperature dependent second moments <Δx2>. The well resolved bimodal distribution of figure. 6 suggests two types of processes, I and II, with different dynamic characteristics and dependence on the protein environment. This result is the main motivation to suggest a minimal model of protein dynamics based on just two processes.
A principle component model of protein dynamics
The third possibility to evaluate the dynamic displacement distribution is to adjust an analytical model to the data. Several attempts to derive dynamical models from a structural classification of hydrogen atoms and MD simulations have been published. Up to five components were proposed [24, 30-32]. It is fair to conclude, that the available data base and the many parameter fits were not sufficient to establish conclusive results. In the “three classes of motion” paper by Hong et al. a wide frequency range combining three spectrometers is presented, but the analysis is based a single averaged Q-value [32]. Dellerue et al. combine inelastic neutron scattering experiments with analytical models and simulations [31]. To represent local diffusion, they use the Dianoux-Volino model of free diffusion inside a sphere [33]. Because of the limited Q-range of the Mibemol spectrometer, they could not test the validity of their model. The authors display simulated intermediate scattering functions of main and side chain atoms, depending on the distance from the surface. Engler et al. derive a three-Gaussian model of displacements from the neutron structure of myoglobin [24]. By modelling the temperature dependence and accounting for the structural disorder, good fits of the elastic profiles in fig. 2 are reported. Unfortunately, we failed to reproduce these fits, in our view the rotational transitions are underestimated, which are not Gaussian distributed. It is thus not the lysine residue, which gives the strongest contribution to the displacements.

Motivated by the results of figure 6, we start with a bimodal distribution of sites associated with two kinds of motions: (I) internal rotational transitions and (II) local translational diffusive displacements. Thus a bimodal correlation function of principle components is assumed:
σI denotes the fractional cross section of the type I sites.

Equation-(19) has to be complemented by a term accounting for global protein diffusion, if the degree of hydration exceeds 0.4 g/g.

According to the neutron structure of myoglobin 25 % of the total number of hydrogen’s are organized in methyl groups, thus σI = σm ≅ 0.25 [24].

More specifically, type I motions are defined by a three site jump model of methyl groups reorienting by 120° jumps about their three-fold symmetry axis [8, 23]:
is the zero order Bessel function, with r = 1.03 Å is the length of the C-H bond. Exponential relaxation or a sharp barrier height is assumed for simplicity. Note that τrot = τMet/3 and that the EISFmet (Q) = Φrot (Q, t >> τrot) is not Gaussian.
By contrast, type II processes involve small local displacements of residues confined by a quasi- harmonic potential. Instead of the Dinanoux-Volino model of free diffusion inside a rigid sphere[33], a continuous confining potential and Brownian motion seems more physical. The intermediate scattering function of the Brownian oscillator associated with an Ornstein-Uhlenbeck process is derived in the Supplement. The result is [36]
δB2 = D / γ = KBT / (m ω0 2) is the translational mean square displacement. D is the local diffusion coefficient and γ is the damping constant. The rate is given by 1/τB= D /δB2. ΦB (Q, t >> τB) turns into a Gaussian elastic scattering function at long times:
The analytical model comprising two principal components is thus defined by equation-(19), combining ΦI (Q,t) = Φrot (Q,t) according to equation-(20), and ΦII (Q,t) = ΦB (Q,t) = Φtrans (Q,t), of equation-(21), which we call briefly the “TR model” of translation and rotation.
TR model of elastic scattering profiles
Elastic scattering is collected in the frequency domain using a fixed window method, ΔEres = ħ/τres at ω = 0: The normalized elastic intensity SNel (Q, τres, τc) is determined by equation-(11). Within the approximation of equation-12, assuming a deltacorrelated resolution function, this equals the intermediate scattering function at t = τres:
Figure 7a shows a decomposition of the elastic scattering profile at 270 K (figure 2) into a rotational and a Gaussian component according to the TR model. That the experimental data agree quite well with the predictions, is the first important test of the spatial properties of the TR model at a fixed temperature. The next step is to deduce the temperature dependence of the correlation times by fitting the data to the model equation as shown in figure 2. An Arrhenius plot of the resulting correlation times τrot (T) and tB = τtrans (T) are displayed in figure 7b. The third parameter, the translational MSD, δB2 ≅ 0.1 (±0.02) Å2, shown in figure 1, depends very little on the temperature. That the strong temperature dependent displacements can be reduced to a nearly temperature independent model displacement, is a striking success of the TR model. The correlation times thus carry the full temperature dependence of the model. Both correlation times, as determined from elastic scans, compare well with the spectral analysis of methyl rotational transitions (I) and hydration water (II) [8, 10, 37-42]. The average methyl barrier amounts to HI = 11 (±1) kJ/mol, with a pre-exponential of AI = 1.6 (± 0.5)1013s-1. Also, the translational correlation times, τtrans (T), superimpose within experimental error with those derived for protein hydration water by neutron spectroscopy [10, 37]. The type II barrier amounts to HII = 17 (±1) kJ/mol with a pre-exponential of AII = 5 (±1)1013 s-1.
Figure 7(a): Decomposition of the elastic scattering function of figure 2 at 270 K according to the TR model. The lines reflect the theoretical adjusted components of equ. 20 (t >> τrot) and equation (22).
7(b): Arrhenius plot of the correlation times τrot ( red triangles) and τtrans(black circles) determined from the TR fit in comparison with a spectral analysis of methyl group rotation(black diamonds and open circles) and hydration water (blue circles) [10, 37].
Time domain predictions of the TR model
The analysis of the elastic scattering data of figure 2 is now expanded to include the quasi-elastic spectra of IN13 and their associated time domain correlation functions, applying the previous considerations to the physiological temperature range. The relevant spectra were first published in 1989, together with a preliminary version of the minimal molecular model [4, 27, 43]. For the present purpose the spectral data were numerically Fourier transformed to the time domain.

The recorded dynamical structure factor, S (θ, ħω), versus scattering angle θ and energy exchange ħω, is first reorganized at a constant Q format, S (Q, ħω), which is subsequently symmetrized by the detailed-balance factor. The time domain correlation function Φs (Q, t) is determined numerically by turning the Fourier integral of S (Q, ω) into a sum of discrete points using the experimental spectrum S (Q, ωj):
To avoid aliasing effects the smooth spectra S (Q, ωj) are interpolated with a maximum number of n data points according to ${t}_{n}=n\cdot \frac{\pi }{{\omega }_{\mathrm{max}}}$ (FFT), where ωmax is the cut-off of frequency of the spectrum. As a consistency check the transformed time domain result is back-transformed to the frequency domain. To deconvolute the data from the resolution function, a time domain low temperature spectrum of Mb-D2O at 100 K was determined.

To exclude border line effects, the full time window of 140 ps was reduced to 50 ps. Beyond this limit, the noise is rapidly increasing due to low intensity spectral wings. We focus on a single temperature at 270 K, which is not affected by ice formation at h = 0.38 g/g. Apart from the 50 ps time window it is in particular the large Q-range of IN13, which imposes severe restrictions on the molecular model.

The property of time and frequency domain representation of spectra has been discussed in reference (10).

As Figure 8 shows, the agreement between experimental time domain data and the predictions of the TR molecular models is convincing. The derived time constants,

τrot ≅ 11 (±2) ps and τtrans ≅ 155 (±20) ps, agree with those determined from the elastic analysis within experimental error. They define two types of molecular motions well separated in time. The correlation times cannot be fixed with higher precision due to limitations of the time window. By contrast, the constraints, imposed on the fits by the model function and the wide Q-range are significant. For comparison, the dashed lines in figure 8 show the result for the Brownian oscillator model, equation (21), which predicts a much stronger Q-dependence.
Figure 8: Time domain density correlation function Φs (Q, t) of myoglobin- D2O (0.35 g/g), (IN13) at various Q values. The red lines are the predictions of the TR model with δ2= 0.11(± 0.02) A2, τrot ≅ 11 (±2)ps and τtrans ≅ 155 (±20) ps at 270 K. The dashed lines are the predictions of the Brownian oscillator model of equation 21
Figure 9 finally shows the intermediate scattering function of hydrated myoglobin determined by combining the transformed spectra of three spectrometers. The figure illustrates the advantage of the time domain analysis. Although the decay extends over at least three decades, it is straightforward to recognize two main processes. The initial decay below 1 ps reflects the vibrational dephasing of low frequency vibrations [4, 5]. The fit at Q = 1.95 Å-1 reproduces the data very well. Also shown are results at other Q-values with significant deviations. Again the time constants are compatible with the Arrhenius plot of figure 7b. The long time value of type I, ΦI (Q, t >> τrot ) ≅ 0.8 (dashed) approximates the predicted value assuming a partial cross section of σI = 0.28 and equation 23 at Q = 2Å-1. This indicates that type I reflects mostly methyl group rotational transitions. The time constants are τrot= 9 (±1) ps and τtrans = 135 (±10) ps at 300 K. The local diffusion coefficient of type II amounts to DB = τtran-1 δB2 ≅ 10 -7 cm2 /s, which
Figure 9: Correlation function of hydrated (0.35 g/g) myoglobin combining the transforms of spectra with three spectrometers at Q = 1.95 A-1, IN6 (blue crosses), IN13 (open blue squares) and IN10 (blue circles). Lines: fit to the TR model, τrot= 9(±2) ps, τtrans= 145(±10) ps, δ2= 0.11(±0.01) Å2. Dashed: predictions at alternative Q-values, short dashed line: methyl relaxation only (equation 20)
is by a factor of 10 less than the diffusion coefficient of hydration water, Dhyd = Γhyd/Q2, Dhyd ≅ 10-6 cm2/s. Bulk water is in the range of 2. 5×10-5 cm2/s [8]. The protein-water fluctuations occur on a similar time scale, but the squared average displacements of water and protein residues within this time interval differ by a factor of ten.
Multiple scattering corrections to single scattering theory [5, 23, 29]
The density correlation function of equation (1) is normalized due to particle conservation at all times, which applies also to its Fourier transform Φs (Q, t) at Q = 0:
Equation 24 shows, that elastic scattering functions, normalized to the lowest temperature, must extrapolate to unity at zero Q: Sel (0,T)/ Sel (0,Tlow) ≡ 1. Sel (Q, Tlow) defines a Q-independent constant in the absence of quasi-elastic scattering. It follows, that the observed temperature dependent zero-Q intensity of figure 5, A (T) ≤ 1, is incompatible with single scattering theory. Depending on the transmission, there is always a multiple scattering fraction, mainly of the type elastic-elastic, which creates a Q-independent background and thus additional intensity at Q = 0. Since the elastic single scattering intensity decreases with the temperature, the second scattering effect will become smaller as well (figure 5). We have explicitly calculated the effect of two sequential elastic scattering processes on the elastic scattering intensity at Q = 0 from an infinite plane slab sample of thickness d. As input, one needs the single scattering structure factor. For this purpose the experimental elastic scattering profiles of figure 2 were parametrized by two Gaussian functions at each temperature. The resulting second scattering intensities versus Q and temperature are displayed in figure 10.
Figure 10: Full line: the elastic scattering functions of figure 2 were parametrized by a sum of two Gauss functions, from which the second scattering was calculated versus the temperature (dashed). Double scattering is nearly Q-independent, yielding finite value at Q = 0, decreasing with increasing temperature.
Sel-el (Q) is approximately independent of Q, and decreases with the temperature. The associated Sel-el (Q = 0, T), displayed in figure 5, agrees rather well with the experimental A(T) values, determined from the moment expansion. The deviations at high temperature result from increasing quasi-elastic scattering. The reconstructed elastic intensity of the green fluorescent protein (GFP) extrapolated to Q = 0 by Frauenfelder et al. fit quite well into this picture [19]. The temperature dependence and the absolute values of AGFP (T) are compatible with multiple scattering, given the uncertainties of the extrapolation procedure and the unknown transmission.
Protein function
Inside biological cells proteins operate in concentrated solution. We have recently investigated, using neutron scattering, protein diffusion in concentrated systems including the diffusion of hemoglobin inside red blood cells, the latter facilitates the oxygen exchange in the lung [44]. Due to the dominant solvent scattering in this milieu, it is rather difficult to observe proteininternal motions on the time scale of bulk water. Hydrated samples at reduced solvent scattering and global diffusion are more suitable. We have compared hydrated samples with those in solution with proteins of different structure [45]. This raises the question, whether protein function is modified by increasing the concentration and by reducing the solvent content. Are hydrated proteins active? The motivation to study the dynamics of myoglobin was powered by the possibility to record its biological function within a wide range of temperatures and solvent conditions. My group combines functional studies with neutron scattering and Mössbauer spectroscopy [13, 46-49]. Here we focus on the effect of hydration on protein function. Myoglobin binds dioxygen reversibly, but also and more strongly other ligands like carbon monoxide. Photo-dissociation of myoglobin-
Figure 11: Flash photolysis experiments of hydrated myoglobin- CO films at different humidity and 300 K. The bound state (Fe-CO) is restored after photolysis employing a 20 ns laser flash at 532 nm. The ligand CO binds in two kinetic steps denoted by I (internal) and II (surface).
CO is achieved by a 20 ns NdYaG laser pulse at 532 nm (Figure 11). Using a home-made flash system, ligand recombination is recorded between 20 ns and 1 s on a logarithmic time scale [46]. The kinetic experiments can be explained by a simplified kinetic scheme:
After complete photolysis by a laser flash, the ligand CO initially resides inside the protein matrix. Fast recombination from internal states can occur in competition with ligand escape across the protein surface. Figure 11 shows the recombination kinetics of photo-dissociated myoglobin +CO in hydrated films, which were fabricated by spin coating. At full hydration (100 % humidity) a single process is observed, which resembles ligand recombination from the solvent in liquid samples [46, 47]. Decreasing the degree of hydration, reduces the amplitude of the solvent process, while fast internal recombination gains in importance. This implies a reduced escape rate across the surface at lower hydration. This trend is enhanced at 0% humidity, the escape rate to the solvent and the recombination rate is further diminished. Note the logarithmic scale of the dissociated fraction. But a limited escape occurs even in the lyophilized state. The internal binding process is thus little affected by further drying as expected for a type I processes. The kinetic steps involving the protein surface and the solvent are of type II. Most neutron scattering experiments were performed with samples exposed to 90-100 % of humidity. In addition 0% samples were investigated. In conclusion, the hydrated films show protein function, which approaches the native kinetic properties in solution at full hydration. Dehydration diminishes ligand escape across the protein surface, while internal recombination is only weakly affected. The rate constants of the elementary kinetic steps range from 100 ns to seconds, overall binding at full hydration occurs within milliseconds. This is much slower than the time scale of microscopic molecular dynamics observed with neutron scattering. For type II processes a direct correlation between microscopic dynamics and ligand escape was established. The escape rate depends on the viscosity ηs and the solvent relaxation time τB near the protein surface according to Kramers law of activated escape [46-48]:
The mechanism of internal ligand displacements and recombination with the heme iron is less well understood. It seems likely however, that ligand migration between internal cavities is assisted by rotational transitions of side chains. Molecular dynamic simulations indicate that the methyl groups that exhibit many rotational transitions are located near xenon cavities, which play a role in internal migration of ligands in myoglobin [42]. The authors also consider the rotational transitions of methyl groups and local diffusion as major excitations of protein dynamics.
Discussion
Energy-resolved neutron scattering allow to record spacetime density fluctuations of hydrogen atoms in biomolecules. The basic quantity is the hydrogen-weighted incoherent densitydensity correlation function Gs (r, t), describing single particle displacements of isotropic samples. In this article three methods were discussed, how to reconstruct dynamic displacement distributions in proteins from incoherent neutron scattering data. The most general approach is the moment expansion, which can provide the second and the forth moment of the distribution at fixed time. The second moment increases above a certain transition temperature, where the respective molecular process is resolved by the spectrometer. Consequently the second moment carries the time tag of the instrumental resolution time. Smaller displacements are recorded at 15 ps compared to 150 ps as shown in Figure 4. The non-Gaussian nature of the elastic scattering profile results in a Gauss-deviation of 0.75 well above the standard value of 0.5. This suggests asymmetric molecular displacements, which are not temperature dependent above 220 K. But the asymmetry decreases with time.

The second approach is focused on a numerical inversion of equation 2. Instead of using a direct numerical transform, we transform a sum Gaussian distributions, which was adjusted to the elastic scattering profiles of figure 2. Since already two well separated Gaussians can account for the data, we obtain a bimodal Gs(r, t), with displacements around 0.5 Å (type I) and 1.5 Å (type II). This result suggests that two specific molecular processes may be sufficient to reproduce the neutron scattering spectra of proteins. Thus a principal component analytical model was proposed, which accounts rather well for the data, both in the time domain but also versus momentum exchange. The latter is a unique property of neutron scattering. The two processes comprise the well-known methyl rotation at 10 ps and local diffusion of residues on a 150 ps time scale. The second momen δB2 ≅ 0.1 Å2 is approximately temperature- independent as Figure1 shows. The correlation times thus carry the full temperature dependence of the apparent displacements in figure 1: The onset temperatures disappear with the TR model. As consequence, the Zaccai-Bicout model, which suggests intrinsic discontinuities in the protein force constants at the transition temperatures, does not apply. Both molecular motions, methyl rotation and hydration water relaxation are well known processes determined by other methods [37, 38-42]. If quasi-elastic spectra of proteins were indeed inhomogenious, the conventional space-time approach of correlation functions would fail [19]. Moreover, in the Frauenfelder model an elastic peak does not exist. This would invalidate the dynamic analysis of the data in Figure 2. Moreover the elastic intensity extrapolated to zero momentum exchange must first be corrected for multiple scattering, before it can be used to disprove scattering theory. The “ELM” approach was not very successful in enhancing our understanding of molecular motions in proteins during the last 30 years. Not a single energy landscape of a protein could be established. We thus prefer the more efficient molecular approach of minimal complexity. The heterogeneous scattering model of Frauenfelder et al. thus provides “no case against scattering theory” [15, 19 50].

The dynamical heterogeneity, that we observe, concerns the existence of two specific molecular processes. The two kinds of motions were identified before in functional studies: Flash photolysis experiments of ligand binding to myoglobin show that internal ligand displacements are entirely decoupled from the solvent (type I), while ligand exchange processes across the protein solvent interface (type II) vary with the solvent viscosity[46-48]. Moreover, multiple flash experiments demonstrate that heterogeneous non-exponential kinetic turns into homogenous and exponential behavior above the glass temperature of the solvent [48]. According to Arieh Warshel, complex energy landscapes are not the reason of the catalytic power of proteins [1]. Our ligand binding experiments demonstrate however, that the exchange of substrate and product between the active site and the solvent requires structural flexibility and the assistance of type II fluctuations. Proteins are designed to provide a catalytic environment perfectly isolated from the solvent, which is supported by solvent independent type I processes.

With an improved data base one my discover further motional components: The arrow in figure 1 at 130 K points to such an unresolved transition. One likely possibility is the presence of a small amount of acetate in the buffer, since the fast acetyl methyl groups give rise to low temperature onsets [10, 39]. Time domain neutron scattering will probably play a more important role in the future analysis of high quality experiments. The unique potential of neutron scattering to derive time resolved mean square displacements was first demonstrated for protein hydration water [8, 28]. Here a direct comparison with simulations is possible, unexplored up to now.
Acknowledgements
The author appreciates the assistance of the instrumental scientists Winfried Petry and Bernhard Frick at FRM2 and the ILL in Grenoble. My former collaborators Marcus Settles, Frank Post and Martin Diehl contributed with ideas and participation in some experiments, which is gratefully acknowledged. Franz Fujara suggested the application of IN13 to biomolecules. The late Edgar Lüscher is responsible for my engagement in neutron scattering and his activity promoted the decision to build the FRM2 in Munich.
Supplement: derivation of equation (21) [35]
Starting with the Fokker-Planck equation of the Ornstein- Uhlenbeck process, Kneller has derived intermediate scattering functions for coherent and incoherent neutron scattering by macromolecules [34]. He demonstrated that the Langevin description on a coarse-grained time scale is identical with the Smoluchowski description, if friction dominates.

Over-damped motion in a simple harmonic potential V(x) = ½ mω02x2 is described by the one-dimensional Langevin- Smoluchowski equation [35]:
with a force constant KBO = mω02 and the friction coefficient γ. Γ(t) denotes the δ -correlated random force. The structural relaxation time is given by τB = γ/ω02. The local diffusion coefficient is defined by D = KBT / (mγ). This model was applied previously to neutron scattering spectra of proteins [36]. To derive the relevant distributions, we start with the conditional probability p(x, t| x0, t`) of equ. 1 determined by a one-dimensional Ornstein- Uhlenbeck process [35]:
complemented by the initial condition:
For scattering applications one deals with the Fourier transform ΦB (Q, t) of p(x, t|x0, t0):
which, if inserted into equation (28) yields:
The solution to equation (31) is the requested result of equation (21):
δB2 = D / γ = KBT / (m ω02) is the translational mean square displacement. The rate is given by 1/τB = D/δB 2. Inserting equation (32) into a powder averaged equation (30) yields:
Here the width of the equilibrium distribution is given by δB2 = KBT/ (mω02). The center of the distribution relaxes from r0 at t = 0 to r = 0 at long times, while the width of the distribution increases from zero to δB2 during this time interval. Inserting equation (33) into equation (1) yields the time dependent displacement distribution function of the powder-averaged three-dimensional Brownian oscillator:
This is a Gaussian displacement distribution with a time dependent width. The time evolution of 4πr2 GB(r, t/τB) is illustrated in figure 12 assuming δB2 = 0.1 Å2. The maximum shifts with time to larger displacements, reaching the stationary distribution at long times. This result can be compared with the Gaussian type II distribution of figure 6, which, as a function of temperature at fixed time, shows a similar behavior. Higher temperatures are associated with larger time ratios t/τB (T). Note that the distribution of vibrational displacements is convoluted with the type II distribution in figure 6, which is not present in figure 12.
Figure 12: Displacement distribution of the powder averaged Brownian oscillator 4πr2G (r, t/τB) according to equation (34) with δB2 = 0.1 Å2 at various time ratios, t/τB, the red curve represents the stationary distribution at t/τB >> 1.
ReferencesTop

Listing : ICMJE