[S2E2] Simulations
Abstract. We present an analysis of uncertainties in global measurements of the column averaged dry-air mole fraction of CO2 (XCO2) by the NASA Orbiting Carbon Observatory-2 (OCO-2). The analysis is based on our best estimates for uncertainties in the OCO-2 operational algorithm and its inputs, and uses simulated spectra calculated for the actual flight and sounding geometry, with measured atmospheric analyses. The simulations are calculated for land nadir and ocean glint observations. We include errors in measurement, smoothing, interference, and forward model parameters. All types of error are combined to estimate the uncertainty in XCO2 from single soundings, before any attempt at bias correction has been made. From these results we also estimate the "variable error" which differs between soundings, to infer the error in the difference of XCO2 between any two soundings. The most important error sources are aerosol interference, spectroscopy, and instrument calibration. Aerosol is the largest source of variable error. Spectroscopy and calibration, although they are themselves fixed error sources, also produce important variable errors in XCO2. Net variable errors are usually
SARS-CoV-2 has intricate mechanisms for initiating infection, immune evasion/suppression and replication that depend on the structure and dynamics of its constituent proteins. Many protein structures have been solved, but far less is known about their relevant conformational changes. To address this challenge, over a million citizen scientists banded together through the Folding@home distributed computing project to create the first exascale computer and simulate 0.1 seconds of the viral proteome. Our adaptive sampling simulations predict dramatic opening of the apo spike complex, far beyond that seen experimentally, explaining and predicting the existence of 'cryptic' epitopes. Different spike variants modulate the probabilities of open versus closed structures, balancing receptor binding and immune evasion. We also discover dramatic conformational changes across the proteome, which reveal over 50 'cryptic' pockets that expand targeting options for the design of antivirals. All data and models are freely available online, providing a quantitative structural atlas.
Molecular dynamics simulations can capture the full ensemble of structures a protein adopts, but they require substantial computational resources. Such simulations provide an all-atom representation of the range of motions a protein undergoes. Modern datasets often consist of a few microseconds of simulation for a single protein, with a few noteworthy examples reaching millisecond timescales [17,18]. However, many important processes occur on slower timescales. Moreover, simulating every protein that is relevant to SARS-CoV-2 for biologically relevant timescales would require computational resources on a massive scale.
As expected, the propensities of the three S complexes to adopt an open state and bind ACE2 are very different. Structures from each ensemble were classified as competent to bind ACE2 if superimposing an ACE2-RBD structure on S did not result in any steric clashes between ACE2 and the rest of the S complex. We found that SARS-CoV-1 has the highest population of conformations that can bind to ACE2 without steric clashes, followed by SARS-CoV-2, while opening of NL63 is sufficiently rare that we did not observe ACE2-binding-competent conformations in our simulations (Fig. 2b). Interestingly, S proteins that are more likely to adopt structures that are competent to bind ACE2 are also more likely to adopt highly open structures (Fig. 2c).
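The binding-competence test described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes the ACE2 coordinates have already been superimposed onto the RBD of each S conformation, and the 2.0 Å clash cutoff, the function names and the optional per-conformation weights are hypothetical choices that the text does not specify.

```python
import numpy as np

def has_steric_clash(ace2_xyz, spike_rest_xyz, cutoff=2.0):
    """True if any ACE2 atom lies closer than `cutoff` (in Å) to any atom of
    the rest of the S complex. The cutoff value is an assumption."""
    ace2_xyz = np.asarray(ace2_xyz, dtype=float)
    spike_rest_xyz = np.asarray(spike_rest_xyz, dtype=float)
    # All pairwise squared distances between the two atom sets.
    diff = ace2_xyz[:, None, :] - spike_rest_xyz[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)
    return bool(np.any(dist2 < cutoff ** 2))

def binding_competent_fraction(pairs, weights=None, cutoff=2.0):
    """Fraction of conformations with no ACE2 clash.

    `pairs` is a sequence of (ace2_xyz, spike_rest_xyz) tuples, one per
    conformation, with ACE2 already superimposed (assumed done upstream).
    `weights` are optional populations for each conformation."""
    competent = np.array(
        [not has_steric_clash(a, s, cutoff) for a, s in pairs], dtype=bool
    )
    if weights is None:
        weights = np.ones(len(competent))
    weights = np.asarray(weights, dtype=float)
    return float(weights[competent].sum() / weights.sum())
```

Passing state populations as `weights` would turn the raw count into a population-weighted estimate, which is closer in spirit to the ensemble populations compared in Fig. 2b.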
a,b, Conformational ensemble of Mpro (monomeric) predicts cryptic pockets near the active site (AS) and dimerization interface (DI). Conformational states (black circles) are projected onto the solvent-accessible surface areas (SASAs) of residues surrounding either the active site or the dimerization interface. The starting structure for simulations (6Y2E) is shown as a red dot. Representative structures are depicted by cartoons and transparent surfaces. Domains I and II are coloured cyan and domain III is coloured grey. The loop of domain III, which covers the active-site residues and is seen to be highly dynamic, is coloured red. c,d, The conformational ensemble from our simulations of nucleoproteins is similar to the distribution of structures seen experimentally. Conformational states are projected onto the distance and angle between the positive finger and a nearby loop. Angles θ were calculated between vectors that point along each red segment in d, and distances d were calculated between their centres of mass. Cluster centres are represented as black circles, the starting structure for simulations (6VYO) is shown as a red dot and NMR structures are shown as solid blue dots. Representative structures are shown as cartoons.
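The two order parameters used in panels c,d (an angle θ between vectors along two segments, and a distance d between their centres of mass) can be computed as in the following minimal sketch. Defining each segment vector as running from its first to its last atom is an assumption made here for illustration; the caption does not say how the vectors were constructed, and the function names are hypothetical.

```python
import numpy as np

def center_of_mass(xyz, masses):
    """Mass-weighted mean position of a set of atoms."""
    xyz = np.asarray(xyz, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (xyz * masses[:, None]).sum(axis=0) / masses.sum()

def segment_angle_and_distance(seg_a, seg_b, masses_a, masses_b):
    """Return (theta in degrees, distance) for two segments.

    Each segment vector points from the first to the last atom of the
    segment (an assumption); the distance is between centres of mass."""
    seg_a = np.asarray(seg_a, dtype=float)
    seg_b = np.asarray(seg_b, dtype=float)
    va = seg_a[-1] - seg_a[0]
    vb = seg_b[-1] - seg_b[0]
    cos_theta = np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb))
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    dist = np.linalg.norm(
        center_of_mass(seg_a, masses_a) - center_of_mass(seg_b, masses_b)
    )
    return float(theta), float(dist)
```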
While we have aggressively targeted research on SARS-CoV-2, Folding@home is a general platform for running molecular dynamics simulations at scale. Before the COVID-19 pandemic, Folding@home was already generating datasets that were orders of magnitude greater than some of those generated by conventional means. With our explosive growth, our compute power has increased by around 100-fold. Our work here highlights the incredible utility this compute power has to enable rapid understanding of health and disease, providing a rich source of structural data for accelerating the design of therapeutics. With the continued support of the citizen scientists that have made this work possible, we have the opportunity to make a profound impact on other global health crises such as cancer, neurodegenerative diseases and antibiotic resistance.
In Wetzel et al 2022, we describe this public data release (DR1) of the FIRE-2 simulations, available at flathub.flatironinstitute.org/fire. DR1 contains full snapshots from 46 different simulations, spanning massive to Milky Way-mass to ultra-faint galaxies, with snapshots across z = 0 to 6, and halo/galaxy catalogs as well as additional data products. We provide a comprehensive description of the FIRE-2 simulations and data products, and we describe various publicly available Python analysis packages that make reading and using these simulations easier.
This DR1 extends our initial data release (DR0) of a subset of FIRE-2 simulations. DR0 contained complete snapshots of 3 of our Latte simulations of Milky Way-like galaxies at z = 0, accompanied by the Ananke synthetic Gaia DR2-like surveys that we created from these simulations (Sanderson et al 2020); these are available via yt Hub at ananke.hub.yt.
The coronavirus that causes COVID-19 controls the production of key viral proteins through a process known as programmed ribosomal frameshifting. Frameshifting is triggered by a particular structure in the viral RNA, a pseudoknot, which is a promising drug target. Here we model the structure of this pseudoknot through atomistic molecular dynamics simulations. Surprisingly, we find that the pseudoknot can take on distinct fold topologies, two of which involve unusual threading of a single strand of RNA through helical junctions, something not seen before in frameshifting pseudoknots. All of the folds are generally consistent with previous experimental studies of the closely-related SARS coronavirus pseudoknot. These results should assist in the analysis and interpretation of future experimental studies of the pseudoknot structure, and support structure-based drug-discovery efforts.
Initial structures for input into MD simulations of the monomeric pseudoknot were obtained using multiple platforms for blind RNA structure prediction: SimRNA [19], Rosetta FARFAR2 [16,17], RNAComposer [20], RNAvista [21], MC-Sym [22], RNA2D3D [23], and Vfold [24]. For blind predictions, we assumed the secondary structure shown in Fig 1, based on previous characterization of the secondary structure of the SARS-CoV-1 pseudoknot [12]. Blind predictions of pseudoknot dimer structures were made using FARFAR2, which allows for dimer structure prediction, or constructed manually using the Molecular Operating Environment software based on monomeric models from other prediction platforms, which do not directly allow for dimer predictions.
Models from blind structure predictions were used as starting structures for all-atom MD simulations in explicit solvent using Amber 18 [25]. The models were protonated at pH 7 using Molecular Operating Environment. The pseudoknots were parameterized using the f99bsc0_chiOL3 force-field and were solvated in optimal point charge (OPC) water boxes with minimum margins of 12 Å using the tleap module of Amber. The solvated systems were first neutralized using sodium ions, then their salinities were adjusted to 0.15 M NaCl using Joung-Cheatham monovalent ion parameters [26]. Each pseudoknot model was simulated under two conditions: without Mg2+ ions, or with six Mg2+ ions placed initially at the junction between S1 and S3 as well as along the backbone of S2. The solvated systems were energy-minimized, then heated to 310 K with heavy restraints of 10 kcal/mol/Å² on the backbone phosphate atoms. These restraints were gradually removed and the unrestrained systems were then simulated on graphics processing units (GPUs) for 1 μs at constant pressure.
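As a rough illustration of the ion bookkeeping implied above, the number of NaCl pairs needed for a target molarity follows from N = c · V · N_A. The sketch below is an estimate under stated assumptions, not the procedure the authors used: it treats the whole box volume as solvent, and the 80 Å cubic box and the net charge of -67 in the usage note are hypothetical values chosen only to show the arithmetic.

```python
AVOGADRO = 6.02214076e23           # particles per mole (exact SI value)
LITRES_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L

def ion_pairs_for_molarity(molarity, box_volume_A3):
    """Estimate the number of salt ion pairs for a target molarity.

    Uses N = c * V * N_A with the full box volume (in Å^3) standing in for
    the solvent volume, which slightly overestimates the count."""
    return round(molarity * AVOGADRO * box_volume_A3 * LITRES_PER_CUBIC_ANGSTROM)

def neutralizing_sodium_count(net_charge):
    """Na+ ions needed to cancel a negative net solute charge (0 otherwise)."""
    return max(0, -net_charge)
```

For example, `ion_pairs_for_molarity(0.15, 80.0 ** 3)` gives 46 NaCl pairs for the hypothetical 80 Å cubic box, and `neutralizing_sodium_count(-67)` gives 67 Na+ ions for a hypothetical solute with net charge -67; those neutralizing ions are added before the salt pairs.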
To examine if the blind predictions were dynamically stable, we used them to initiate extended all-atom molecular dynamics simulations. Each structure in Fig 2 was simulated for at least 1 μs in explicit solvent under two conditions: with NaCl only, or with both NaCl and Mg2+ ions. Both conditions were used because not all pseudoknots require Mg2+ ions to fold [28,29], and it is unclear if Mg2+ ions are essential for the SARS-CoV-2 pseudoknot. The first part of the simulation was treated as an equilibration phase and only the last 500 ns of the simulation was examined in each case. Because the simulations were dynamic, we clustered the structures occupied in the simulations by RMSD and examined the centroid (representative) structures of the three most occupied clusters.
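The analysis just described (discard the equilibration phase, cluster the remaining frames by RMSD, report representatives of the three most occupied clusters) can be sketched as follows. The text does not name the clustering algorithm or the RMSD cutoff, so the simple leader clustering, the medoid definition of "centroid", the `cutoff` parameter and all function names here are assumptions made for illustration.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, S, Vt = np.linalg.svd(P.T @ Q)
    # Flip the smallest singular value if the best fit would be a reflection.
    S[-1] *= np.sign(np.linalg.det(U @ Vt))
    msd = (np.sum(P * P) + np.sum(Q * Q) - 2.0 * S.sum()) / len(P)
    return float(np.sqrt(max(msd, 0.0)))

def leader_cluster(frames, cutoff):
    """Assign each frame to the first cluster leader within `cutoff` RMSD."""
    leaders, labels = [], []
    for i, frame in enumerate(frames):
        for k, lead in enumerate(leaders):
            if kabsch_rmsd(frame, frames[lead]) <= cutoff:
                labels.append(k)
                break
        else:
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return np.array(labels)

def top_cluster_centroids(frames, times_ns, cutoff, discard_before_ns, n_top=3):
    """Return [(frame_index, cluster_size), ...] for the most occupied clusters.

    Frames earlier than `discard_before_ns` are treated as equilibration and
    ignored. Each representative is the medoid of its cluster, i.e. the member
    with the smallest summed RMSD to the other members."""
    keep = [i for i, t in enumerate(times_ns) if t >= discard_before_ns]
    kept = [frames[i] for i in keep]
    labels = leader_cluster(kept, cutoff)
    counts = np.bincount(labels)
    order = np.argsort(-counts, kind="stable")[:n_top]
    result = []
    for k in order:
        members = np.flatnonzero(labels == k)
        cost = [sum(kabsch_rmsd(kept[i], kept[j]) for j in members) for i in members]
        medoid = members[int(np.argmin(cost))]
        result.append((keep[medoid], int(counts[k])))
    return result
```

For a 1 μs trajectory, `discard_before_ns=500` would keep the last 500 ns, matching the window examined in the text. Leader clustering depends on frame order, so a production analysis would more likely use an established clustering tool.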