Cryo-EM data processing

Real-time cryo-electron microscopy data preprocessing with Warp


The acquisition of cryo-electron microscopy (cryo-EM) data from biological specimens must be tightly coupled to data preprocessing to ensure the best data quality and microscope usage. Here we describe Warp, software that automates all preprocessing steps of cryo-EM data acquisition and enables real-time evaluation. Warp corrects micrographs for global and local motion, estimates the local defocus and monitors key parameters for each recorded micrograph or tomographic tilt series in real time. The software further includes deep-learning-based models for accurate particle picking and image denoising. The output from Warp can be fed into established programs for particle classification and 3D-map refinement. Our benchmarks show an improvement in the nominal resolution of a published cryo-EM data set for influenza virus hemagglutinin, from 3.9 Å to 3.2 Å. Warp is easy to install and computationally inexpensive, and has an intuitive, streamlined user interface.

Data availability

Figure 1 and Supplementary Fig. 1 use exemplary data from EMPIAR-10078. Figure 2 uses a cryo-EM image of RNA Pol II complexes, available from the authors upon request. Figure 3 and the benchmark section use data from EMPIAR-10097 re-analyzed in this study. The refined maps shown in Fig. 3a are available in Supplementary Data 1–4. The ‘Full Warp pipeline’ map shown in Fig. 3a has been deposited in EMDB as EMD-0025. Figure 4 and the benchmark section use data from EMPIAR-10061 re-analyzed in this study, the 1.86 Å map shown in Fig. 4a is available as Supplementary Data 5. Figure 5a uses a tomogram reconstructed from data from EMPIAR-10045. Figure 6 and the benchmark section use data from EMPIAR-10045 and EMPIAR-10164 re-analyzed in this study, the maps shown in Fig. 6a,b are available in Supplementary Data 6 and 7, respectively. Supplementary Fig. 2 uses exemplary data from EMPIAR-10061. Supplementary Fig. 3 uses exemplary data from EMPIAR-10097. Supplementary Fig. 5 uses in-house data, available upon request. Supplementary Fig. 6 uses exemplary data from EMPIAR-10078. Supplementary Fig. 7 uses exemplary data from (left) EMPIAR-10078, (center) in-house data available upon request, and (right) EMPIAR-10153. Training data for BoxNet can be accessed through

Code availability

Warp binaries, source code and user guide are available as Supplementary Software and can be downloaded from BoxNet source code can be downloaded from


  1. 1.

    Saibil, H. R., Grünewald, K. & Stuart, D. I. A national facility for biological cryo-electron microscopy. Acta Crystallogr. D.71, 127–135 (2015).

    CASPubMed Google Scholar

  2. 2.

    Suloway, C. et al. Automated molecular microscopy: the new Leginon system. J. Struct. Biol.151, 41–60 (2005).

    CASPubMed Google Scholar

  3. 3.

    Brilot, A. F. et al. Beam-induced motion of vitrified specimen on holey carbon film. J. Struct. Biol.177, 630–637 (2012).

    CASPubMedPubMed Central Google Scholar

  4. 4.

    Huang, Z., Baldwin, P. R., Mullapudi, S. & Penczek, P. A. Automated determination of parameters describing power spectra of micrograph images in electron microscopy. J. Struct. Biol.144, 79–94 (2003).

    PubMed Google Scholar

  5. 5.

    van Heel, M. Detection of objects in quantum-noise-limited images. Ultramicroscopy7, 331–341 (1982).

    Google Scholar

  6. 6.

    Li, X. et al. Electron counting and beam-induced motion correction enable near-atomic-resolution single-particle cryo-EM. Nat. Methods10, 584–590 (2013).

    CASPubMedPubMed Central Google Scholar

  7. 7.

    Grant, T. & Grigorieff, N. Measuring the optimal exposure for single particle cryo-EM using a 2.6 Å reconstruction of rotavirus VP6. eLife4, e06980 (2015).

    PubMedPubMed Central Google Scholar

  8. 8.

    Mastronarde, D. N. Automated electron microscope tomography using robust prediction of specimen movements. J. Struct. Biol.152, 36–51 (2005).

    PubMed Google Scholar

  9. 9.

    Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods14, 331–332 (2017).

    CASPubMedPubMed Central Google Scholar

  10. 10.

    Rubinstein, J. L. & Brubaker, M. A. Alignment of cryo-EM movies of individual particles by optimization of image translations. J. Struct. Biol.192, 188–195 (2015).

    PubMed Google Scholar

  11. 11.

    McLeod, R. A., Kowal, J., Ringler, P. & Stahlberg, H. Robust image alignment for cryogenic transmission electron microscopy. J. Struct. Biol.197, 279–293 (2017).

    CASPubMed Google Scholar

  12. 12.

    Rohou, A. & Grigorieff, N. CTFFIND4: Fast and accurate defocus estimation from electron micrographs. J. Struct. Biol.192, 216–221 (2015).

    PubMedPubMed Central Google Scholar

  13. 13.

    Bell, J. M., Chen, M., Baldwin, P. R. & Ludtke, S. J. High resolution single particle refinement in EMAN2.1. Methods (San. Diego, Calif.)100, 25–34 (2016).

    CAS Google Scholar

  14. 14.

    Zhang, K. Gctf: Real-time CTF determination and correction. J. Struct. Biol.193, 1–12 (2016).

    CASPubMedPubMed Central Google Scholar

  15. 15.

    Scheres, S. H. Semi-automated selection of cryo-EM particles in RELION-1.3. J. Struct. Biol.189, 114–122 (2015).

    CASPubMedPubMed Central Google Scholar

  16. 16.

    Roseman, A. M. FindEM-a fast, efficient program for automatic selection of particles from electron micrographs. J. Struct. Biol.145, 91–99 (2004).

    CASPubMed Google Scholar

  17. 17.

    Chen, J. Z. & Grigorieff, N. SIGNATURE: a single-particle selection system for molecular electron microscopy. J. Struct. Biol.157, 168–173 (2007).

    CASPubMed Google Scholar

  18. 18.

    Sorzano, C. et al. Automatic particle selection from electron micrographs using machine learning techniques. J. Struct. Biol.167, 252–260 (2009).

    CASPubMedPubMed Central Google Scholar

  19. 19.

    Wang, F. et al. DeepPicker: A deep learning approach for fully automated particle picking in cryo-EM. J. Struct. Biol.195, 325–336 (2016).

    PubMed Google Scholar

  20. 20.

    Lander, G. C. et al. Appion: an integrated, database-driven pipeline to facilitate EM image processing. J. Struct. Biol.166, 95–102 (2009).

    CASPubMedPubMed Central Google Scholar

  21. 21.

    Biyani, N. et al. Focus: The interface between data collection and data processing in cryo-EM. J. Struct. Biol.198, 124–133 (2017).

    CASPubMed Google Scholar

  22. 22.

    de la Rosa-Trevin, J. M. et al. Scipion: A software framework toward integration, reproducibility and validation in 3D electron microscopy. J. Struct. Biol.195, 93–99 (2016).

    PubMed Google Scholar

  23. 23.

    Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol.180, 519–530 (2012).

    CASPubMedPubMed Central Google Scholar

  24. 24.

    Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods14, 290 (2017).

    CASPubMedPubMed Central Google Scholar

  25. 25.

    Tan, Y. Z. et al. Addressing preferred specimen orientation in single-particle cryo-EM through tilting. Nat. Methods14, 793–796 (2017).

    CASPubMedPubMed Central Google Scholar

  26. 26.

    Hagen, W. J. H., Wan, W. & Briggs, J. A. G. Implementation of a cryo-electron tomography tilt-scheme optimized for high resolution subtomogram averaging. J. Struct. Biol.197, 191–198 (2017).

    PubMedPubMed Central Google Scholar

  27. 27.

    Campbell, M. G. et al. Movies of ice-embedded particles enhance resolution in electron cryo-microscopy. Structure20, 1823–1828 (2012).

    CASPubMedPubMed Central Google Scholar

  28. 28.

    Noble, A. J. et al. Routine single particle cryoem sample and grid characterization by tomography. eLife7, e34257 (2018).

    PubMedPubMed Central Google Scholar

  29. 29.

    Lecun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE86, 2278–2324 (1998).

    Google Scholar

  30. 30.

    Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Proc. 25th Int. Conf. Neural Inf. Process. Syst.1, 1097–1105 (2012).

    Google Scholar

  31. 31.

    He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 770–778 (IEEE, 2016).

  32. 32.

    Abadi, M. et al. TensorFlow: a system for large-scale machine learning. Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation 265–283 (IEEE, 2016).

  33. 33.

    Iudin, A., Korir, P. K., Salavert-Torres, J., Kleywegt, G. J. & Patwardhan, A. EMPIAR: a public archive for raw electron microscopy image data. Nat. Methods13, 387–388 (2016).

    CASPubMed Google Scholar

  34. 34.

    Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res.28, 235–242 (2000).

    CASPubMedPubMed Central Google Scholar

  35. 35.

    Wagner, T. et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM. Commun. Biol.2, 218 (2019).

    PubMedPubMed Central Google Scholar

  36. 36.

    Zivanov, J. et al. RELION-3: new tools for automated high-resolution cryo-EM structure determination. eLife7, e42166 (2018).

    PubMedPubMed Central Google Scholar

  37. 37.

    Tagari, M., Newman, R., Chagoyen, M., Carazo, J. M. & Henrick, K. New electron microscopy database and deposition system. Trends Biochem. Sci.27, 589 (2002).

    CASPubMed Google Scholar

  38. 38.

    Henderson, R. Avoiding the pitfalls of single particle cryo-electron microscopy: Einstein from noise. Proc. Natl Acad. Sci. USA110, 18037–18041 (2013).

    CASPubMed Google Scholar

  39. 39.

    Bartesaghi, A. et al. 2.2 A resolution cryo-EM structure of beta-galactosidase in complex with a cell-permeant inhibitor. Science348, 1147–1151 (2015).

    CASPubMedPubMed Central Google Scholar

  40. 40.

    Bharat, T. A. & Scheres, S. H. Resolving macromolecular structures from electron cryo-tomography data using subtomogram averaging in RELION. Nat. Protoc.11, 2054–2065 (2016).

    CASPubMedPubMed Central Google Scholar

  41. 41.

    Turonova, B., Schur, F. K. M., Wan, W. & Briggs, J. A. G. Efficient 3D-CTF correction for cryo-electron tomography using NovaCTF improves subtomogram averaging resolution to 3.4A. J. Struct. Biol.199, 187–195 (2017).

    CASPubMedPubMed Central Google Scholar

  42. 42.

    Nocedal, J. Updating quasi-Newton matrices with limited storage. Math. Comput.35, 773–773 (1980).

    Google Scholar

  43. 43.

    Sorzano, C. O., Otero, A., Olmos, E. M. & Carazo, J. M. Error analysis in the determination of the electron microscopical contrast transfer function parameters from experimental power Spectra. BMC Struct. Biol.9, 18 (2009).

    PubMedPubMed Central Google Scholar

  44. 44.

    Penczek, P. A. et al. CTER—Rapid estimation of CTF parameters with error assessment. Ultramicroscopy140, 9–19 (2014).

    CASPubMedPubMed Central Google Scholar

  45. 45.

    Danev, R., Tegunov, D. & Baumeister, W. Using the Volta phase plate with defocus for cryo-EM single particle analysis. eLife6, e23006 (2017).

    PubMedPubMed Central Google Scholar

  46. 46.

    Voortman, L. M., Stallinga, S., Schoenmakers, R. H. M., Vliet, L. Jv & Rieger, B. A fast algorithm for computing and correcting the CTF for tilted, thick specimens in TEM. Ultramicroscopy111, 1029–1036 (2011).

    CASPubMed Google Scholar

  47. 47.

    Schur, F. K. et al. An atomic model of HIV-1 capsid-SP1 reveals structures regulating assembly and maturation. Science353, 506–508 (2016).

    CASPubMed Google Scholar

  48. 48.

    Xiong, Q., Morphew, M. K., Schwartz, C. L., Hoenger, A. H. & Mastronarde, D. N. CTF determination and correction for low dose tomographic tilt series. J. Struct. Biol.168, 378–387 (2009).

    PubMedPubMed Central Google Scholar

  49. 49.

    Bharat, T. A., Russo, C. J., Lowe, J., Passmore, L. A. & Scheres, S. H. Advances in Single-Particle Electron Cryomicroscopy Structure Determination applied to Sub-tomogram Averaging. Structure23, 1743–1753 (2015).

    CASPubMedPubMed Central Google Scholar

  50. 50.

    Hutchings, J., Stancheva, V., Miller, E. A. & Zanetti, G. Subtomogram averaging of COPII assemblies reveals how coat organization dictates membrane shape. Nat. Commun.9, 4154 (2018).

    PubMedPubMed Central Google Scholar

  51. 51.

    Russo, C. J. & Henderson, R. Ewald sphere correction using a single side-band image processing algorithm. Ultramicroscopy187, 26–33 (2018).

    CASPubMedPubMed Central Google Scholar

  52. 52.

    Grigorieff, N. FREALIGN: high-resolution refinement of single particle structures. J. Struct. Biol.157, 117–125 (2007).

    CASPubMed Google Scholar

  53. 53.

    Kunz, M. & Frangakis, A. S. Three-dimensional CTF correction improves the resolution of electron tomograms. J. Struct. Biol.197, 114–122 (2017).

    PubMed Google Scholar

  54. 54.

    Grant, T. & Grigorieff, N. Automatic estimation and correction of anisotropic magnification distortion in electron microscopes. J. Struct. Biol.192, 204–208 (2015).

    PubMedPubMed Central Google Scholar

  55. 55.

    Heymann, J. B., Chagoyen, M. & Belnap, D. M. Common conventions for interchange and archiving of three-dimensional electron microscopy information in structural biology. J. Struct. Biol.151, 196–207 (2005).

    PubMed Google Scholar

  56. 56.

    Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In MICCAI 2015 Lecture Notes in Computer Science (eds N., Navab et al.) Vol 9351, 234–241 (Springer, 2015).

  57. 57.

    Vulovic, M. et al. Image formation modeling in cryo-electron microscopy. J. Struct. Biol.183, 19–32 (2013).

    CASPubMed Google Scholar

  58. 58.

    Rickgauer, J. P., Grigorieff, N. & Denk, W. Single-protein detection in crowded molecular environments in cryo-EM images. eLife6, e25648 (2017).

    PubMedPubMed Central Google Scholar

  59. 59.

    Mao, X.-J., Shen, C. & Yang, Y.-B. Image restoration using convolutional auto-encoders with symmetric skip connections. Adv. Neural Inform. Proc. Syst.29, 2802–2810 (2016).

    Google Scholar

  60. 60.

    Iizuka, S., Simo-Serra, E. & Ishikawa, H. Globally and locally consistent image completion. ACM Trans. Graph. (TOG)36, 107 (2017).

    Google Scholar

  61. 61.

    Lehtinen, J. et al. Noise2Noise: learning image restoration without clean data. Preprint at (2018).

  62. 62.

    Kremer, J. R., Mastronarde, D. N. & McIntosh, J. R. Computer visualization of three-dimensional image data using IMOD. J. Struct. Biol.116, 71–76 (1996).

    CASPubMedPubMed Central Google Scholar

Download references


We thank members of the Cramer lab for beta-testing early versions of Warp and providing feedback on bugs in the software. We thank C. Bernecky, S. Dodonova, W. Hagen, D. Lyumkis, C. Plaschka, J. Söding and Y. Z. Tan for critical reading of the manuscript. P.C. was supported by ERC Advanced Grant TRANSREGULON (grant agreement no. 693023) of the European Research Council, the Deutsche Forschungsgemeinschaft (SFB 860) and the Volkswagen Foundation.

Author information


  1. Max Planck Institute for Biophysical Chemistry, Department of Molecular Biology, Göttingen, Germany

    Dimitry Tegunov & Patrick Cramer


D.T. designed Warp’s architecture and all algorithms, and carried out all implementation and application. P.C. provided scientific environment, funding and additional interpretations and implications. D.T. and P.C. wrote the manuscript.

Corresponding authors

Correspondence to Dimitry Tegunov or Patrick Cramer.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Allison Doerr was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Integrated supplementary information

Supplementary Figure 1 User interface of Warp.

a, The processing settings (left) specify all steps and parameters for online data evaluation, correction and processing. The ‘Overview’ tab (right) presents all important processing results and lets the user specify selection filters to remove low-quality data. b, View of a single micrograph. In Fourier space (left), the simulated 2D CTF (i), the 1D power spectrum (PS) and its fit (ii), and the 2D PS (iii) are presented. The real space view (right) shows the aligned movie average with particle positions (green dots), motion tracks (white curves) and the defocus variation (transparent magenta-cyan overlay), and applies a deconvolution filter as well as denoising. Individual display elements can be shown or hidden. The navigation bar (bottom) shows the processing status for all items and allows the user to switch quickly between them and to manually exclude single items from processing.

Supplementary Figure 2 Deconvolution and denoising of a low-defocus micrograph.

a, A raw micrograph from EMPIAR-10061 acquired at 0.8 μm defocus. b, Same micrograph after applying deconvolution. Low-resolution contrast is boosted and the defocused signal is more localized, making the particles easier to distinguish. c, Same micrograph after applying deconvolution and denoising with a noise2noise model retrained on this dataset. The shapes of individual 400-kDa proteins, nearly invisible in the raw image, can be distinguished clearly against the background. d, Shape and effect of the deconvolution filter. The filter largely reverses the effect of the first CTF peak, while also suppressing the lowest and higher frequencies.

Supplementary Figure 3 Motion and CTF model fitting by Warp.

The unaligned, defocused movie (i) is parametrized with a coarse grid (black dots), divided into patches for the alignment (ii), and power spectra of these patches are computed (iii) for CTF fitting. The motion model (iv) includes two components: global motion (cyan trajectory) with fine temporal and no spatial resolution, and local motion (magenta trajectories) with coarse temporal and fine spatial resolution. Both components are optimized to minimize the squared difference between the individual patch frames and their aligned average. The spatially resolved CTF model (v) is optimized to minimize the squared difference between the power spectra (iii, upper left part of each patch) and the simulated local 2D CTF (iii, bottom right part of each patch). Here, the defocus gradient follows the 40° tilt of the specimen, with the notable exception of the hole edge in the bottom left corner.
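The two-component motion model in this caption can be illustrated with a minimal sketch. It is not Warp's implementation: the linear interpolation, the grid layout and the names `interp_axis`, `local_shift` and `total_shift` are assumptions made for illustration, and only one shift component (for example x) is handled. The global track holds one value per frame (fine temporal, no spatial resolution), while the local grid holds a few time nodes over a coarse spatial grid.

```python
def interp_axis(nodes, u):
    """Linearly interpolate evenly spaced nodes at normalized coordinate u in [0, 1]."""
    if len(nodes) == 1:
        return nodes[0]
    pos = min(max(u, 0.0), 1.0) * (len(nodes) - 1)
    i0 = min(int(pos), len(nodes) - 2)
    w = pos - i0
    return nodes[i0] * (1.0 - w) + nodes[i0 + 1] * w


def local_shift(grid, u, v, s):
    """Interpolate grid[t][y][x] at normalized position (u, v) and normalized time s."""
    per_time = []
    for plane in grid:
        per_row = [interp_axis(row, u) for row in plane]  # along x
        per_time.append(interp_axis(per_row, v))          # along y
    return interp_axis(per_time, s)                       # along time


def total_shift(global_track, local_grid, u, v, frame):
    """Shift of one frame at one position: global part plus local part."""
    s = frame / max(len(global_track) - 1, 1)
    return global_track[frame] + local_shift(local_grid, u, v, s)


if __name__ == "__main__":
    # Hypothetical values: 4 frames, 2 time nodes, 2x2 spatial grid.
    global_track = [0.0, 1.0, 1.5, 1.8]
    local_grid = [
        [[0.0, 0.0], [0.0, 0.0]],  # first time node: no local motion
        [[0.0, 2.0], [0.0, 2.0]],  # last time node: motion grows along x
    ]
    for frame in range(len(global_track)):
        print(frame, total_shift(global_track, local_grid, 0.5, 0.5, frame))
```

In an actual fit, the values in both components would be the free parameters adjusted to minimize the squared difference between the patch frames and their aligned average, as the caption describes.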

Supplementary Figure 4 CTF fitting of flat, tilted and tilt series data.

Fitted spectra without (left column) and with (right column) a spatially resolved model. The samples are (a) flat (EMPIAR-10078), (b) tilted at 40° (EMPIAR-10097) and (c) a tilt series ranging from –60° to +60° (EMPIAR-10045). In all three cases, using a spatially resolved model allowed the sample geometry to be fitted more accurately, as evidenced by the clearer Thon rings in the rescaled, averaged 1D spectra. The fitting range (grey rectangle in the 1D spectra) was chosen well below the estimated resolution to avoid overfitting the higher number of parameters in the spatially resolved model.
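To see why a single defocus value fits a tilted specimen poorly, the following sketch computes the defocus expected at different distances from the tilt axis and the resulting position of the first CTF zero. It is a simplified geometric illustration, not Warp's CTF model: the function names, the sign convention, the approximate 300 kV electron wavelength, and the neglect of spherical aberration and amplitude contrast are all assumptions.

```python
import math

# Approximate electron wavelength at 300 kV, in angstroms (assumed value).
ELECTRON_WAVELENGTH_300KV_A = 0.0197


def local_defocus_um(center_defocus_um, distance_um, tilt_deg):
    """Defocus at a signed in-image distance from the tilt axis.

    For a planar specimen tilted by tilt_deg, the height changes by
    distance * tan(tilt) when moving perpendicular to the tilt axis.
    """
    return center_defocus_um + distance_um * math.tan(math.radians(tilt_deg))


def first_ctf_zero_per_A(defocus_um, wavelength_A=ELECTRON_WAVELENGTH_300KV_A):
    """Spatial frequency (1/angstrom) of the first CTF zero.

    Uses sin(pi * wavelength * defocus * k^2) = 0, i.e. ignores spherical
    aberration and amplitude contrast.
    """
    defocus_A = defocus_um * 1.0e4
    return 1.0 / math.sqrt(wavelength_A * defocus_A)


if __name__ == "__main__":
    # Hypothetical micrograph: 2.0 um defocus at the tilt axis, 40 degree tilt.
    for distance_um in (-0.5, 0.0, 0.5):
        dz = local_defocus_um(2.0, distance_um, 40.0)
        k0 = first_ctf_zero_per_A(dz)
        print(f"d = {distance_um:+.1f} um  defocus = {dz:.3f} um  first zero = {k0:.4f} 1/A")
```

Because the Thon ring positions differ across the field of view, averaging power spectra from all patches under one defocus value blurs the rings, which is consistent with the sharper rings reported for the spatially resolved fits.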

Supplementary Figure 5 Unbiased particle picking with Warp’s BoxNet.

Examples of automated particle picking on samples not seen by BoxNet in training. For comparison, the same micrographs were picked with crYOLO’s generic model, and RELION’s Laplacian of Gaussian (LoG) method. Micrographs were selected from in-house data to make sure they were absent in crYOLO’s knowledge base. BoxNet reliably recognizes almost all particles (yellow), and masks out all artifacts (purple). LoG is often confused by high-contrast edges and ethane impurities. crYOLO performs better than LoG, but is also routinely confused by ethane impurities and protein aggregates, and misses many of the small particles (bottom row).

Supplementary Figure 6 Neural network architecture of BoxNet.

Rectangles depict the intermediate tensor dimensions. Their width and height are proportional to the number of channels and the spatial extent, respectively. Thick arrows represent convolution operations. Their format is encoded as ‘(Kx R), LxMxN /O’, where K is the number of consecutive ResNet blocks, or absent in case of a single convolution operation; L and M are the dimensions of the convolution kernel; N is the number of kernels, resulting in N channels in the output; O is the stride length (1 = no change, 2 = downsampling by factor of 2, 0.5 = upsampling by factor of 2 through transposed convolution). The stride parameter is only applied to the first convolution in a chain of ResNet blocks, whereas all subsequent convolutions use stride = 1. The contractive part of the network is colored in cyan, the expanding part in magenta. The final image shows the result of applying a per-pixel ArgMax operator to the result of the last convolution to obtain the spatial distribution of the three labels the model is trained to predict: background (black), particle (yellow), artifact (purple).
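The two operations described in this caption, the effect of the stride on spatial extent and the final per-pixel ArgMax, can be sketched in a few lines. This is an illustration only: the class order (0 = background, 1 = particle, 2 = artifact), the function names and the assumption of evenly divisible extents are not taken from the BoxNet source.

```python
# Assumed class order; the caption names the labels but not their indices.
LABELS = ("background", "particle", "artifact")


def output_extent(extent, stride):
    """Spatial extent after a convolution with the caption's stride notation.

    stride 1 keeps the extent, 2 halves it, 0.5 doubles it (transposed
    convolution). Assumes the extent divides evenly.
    """
    return int(extent / stride)


def argmax_labels(scores):
    """Per-pixel ArgMax: scores[y][x] holds one score per class."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in scores
    ]


if __name__ == "__main__":
    # Hypothetical 2x2 output of the last convolution, three scores per pixel.
    scores = [
        [(0.9, 0.05, 0.05), (0.1, 0.8, 0.1)],
        [(0.2, 0.3, 0.5), (0.6, 0.3, 0.1)],
    ]
    for row in argmax_labels(scores):
        print([LABELS[i] for i in row])
    print(output_extent(256, 2), output_extent(128, 0.5))
```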

Supplementary Figure 7 Examples of data used to train BoxNet.

Examples of micrographs presented to BoxNet as input (top row), and the per-pixel labels used as the desired output during training (bottom row). The pixel classes predicted by BoxNet are background (black), particles (yellow), and artifacts (purple).

About this article

Cite this article

Tegunov, D., Cramer, P. Real-time cryo-electron microscopy data preprocessing with Warp. Nat Methods 16, 1146–1152 (2019).

Further reading

  • Multi-particle cryo-EM refinement with M visualizes ribosome-antibiotic complex at 3.5 Å in cells
    Dimitry Tegunov, Liang Xue, Christian Dienemann, Patrick Cramer & Julia Mahamid
    Nature Methods (2021)

  • Structure and inhibition mechanism of the human citrate transporter NaCT
    David B. Sauer, Jinmei Song, Bing Wang, Jacob K. Hilton, Nathan K. Karpowich, Joseph A. Mindell, William J. Rice & Da-Neng Wang
    Nature (2021)

  • Asymmetric opening of the homopentameric 5-HT3A serotonin receptor in lipid bilayers
    Yingyi Zhang, Patricia M. Dijkman, Rongfeng Zou, Martina Zandl-Lang, Ricardo M. Sanchez, Luise Eckhardt-Strelau, Harald Köfeler, Horst Vogel, Shuguang Yuan & Mikhail Kudryashev
    Nature Communications (2021)

  • Linker histone defines structure and self-association behaviour of the 177 bp human chromatosome
    Sai Wang, Vinod K. Vogirala, Aghil Soman, Nikolay V. Berezhnoy, Zhehui Barry Liu, Andrew S. W. Wong, Nikolay Korolev, Chun-Jen Su, Sara Sandin & Lars Nordenskiöld
    Scientific Reports (2021)

  • Structural basis for late maturation steps of the human mitoribosomal large subunit
    Miriam Cipullo, Genís Valentín Gesé, Anas Khawaja, B. Martin Hällberg & Joanna Rorbach
    Nature Communications (2021)


Cryo-EM Software

UNIX/Linux Software

“UNIX” refers to a family of operating systems that includes IRIX, Solaris, Tru64 and Linux. UNIX was designed to be a multi-user, shared, networked operating environment, with concepts such as Users, Groups, Permissions and Network-Shared Resources built into the core of its design.

The Introduction to UNIX manual is geared to beginners but is useful for novice and veteran users alike.

Here is a brief list of Linux commands that most users find helpful. Facility users and staff also recommend as a starting point for additional scripts.

Software tools for Image processing

EM analysis software is available to facility users via the SBGrid.

Full data processing requires using the ACCRE cluster. Users must register for a cluster account.

Learn more about Software Tools for Molecular Microscopy on WikiBooks.

EM Software Suite Options

RELION (for REgularised LIkelihood OptimisatioN, pronounced “rely-on”) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM). Find a brief RELION manual for data processing.

EMAN2 is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes.

Scipion is an image processing framework for obtaining 3D models of macromolecular complexes using Electron Microscopy (3DEM). It integrates several software packages and presents a unified interface for both biologists and developers.

The cross-software GUI wrapper works with Xmipp, Relion, EMAN2 and other software suites.

Methods to account for movement and flexibility in cryo-EM data processing

Cryo-EM data pre-processing at full Warp (Dmitry Tegunov)

Hands on Methods for High Resolution Cryo-Electron Microscopy Structures of Heterogeneous Macromolecular Complexes


The complete understanding of how macromolecular complexes fulfill their intricate roles in the cell is the central theme in molecular biology. At the molecular level, precise knowledge of the structure of macromolecules in the cell critically contributes to the elucidation of their functional mechanism. The aim of structural biology is thus, ideally, to determine the 3D arrangement of the atoms of macromolecular complexes. Among the many different structural biology techniques, electron cryomicroscopy (cryo-EM) combined with single-particle averaging has recently emerged as a preferred option for solving near-atomic-resolution structures of large macromolecular complexes and is particularly effective for heterogeneous samples that are less amenable to other techniques (Bai et al., 2015a).

In cryo-EM, flash-frozen protein samples are irradiated with coherent electron beams, and 2D projection images of the atomic density of many molecules in different orientations are recorded. Cryo-EM images of biological specimens typically display high noise levels due to the extremely low electron dose required to avoid radiation damage. To counter this effect and improve the signal-to-noise ratio, it becomes necessary to align and average hundreds to thousands of single particle images corresponding to the same molecule orientation (Henderson and McMullan, 2013).
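The benefit of averaging can be illustrated with a toy calculation. The sketch below assumes perfectly aligned images corrupted by independent Gaussian noise, which real micrographs only approximate; the image size, noise level, and function names are illustrative choices rather than values from any data set. Under these assumptions, the residual noise in the average falls roughly as the square root of the number of images.

```python
import numpy as np

def average_aligned_images(images):
    """Average a stack of already-aligned images (stack axis first)."""
    return np.mean(images, axis=0)

def noise_std(image, signal):
    """Standard deviation of what remains after removing the known signal."""
    return float(np.std(image - signal))

rng = np.random.default_rng(0)

# Toy "particle": a bright square on an empty background (hypothetical).
signal = np.zeros((32, 32))
signal[12:20, 12:20] = 1.0

# 400 noisy copies; the noise is much stronger than the signal.
sigma = 5.0
stack = signal + rng.normal(scale=sigma, size=(400, 32, 32))

average = average_aligned_images(stack)
print("noise in one image:", noise_std(stack[0], signal))
print("noise in the average:", noise_std(average, signal))
```

With 400 images the noise in the average should be close to sigma divided by 20, which is why accurate alignment of many particles matters more than the quality of any single image.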

The last few years have seen the development of a new generation of direct electron detectors of peerless sensitivity and speed (McMullan et al., 2014, 2016), which, coupled with other improvements in electron microscopes, allow for the acquisition of images of unprecedented quality. The discovery and correction of beam-induced particle motion (Brilot et al., 2012; Campbell et al., 2012; Li et al., 2013; Scheres, 2014; Grant and Grigorieff, 2015; Zheng et al., 2017), automatic data acquisition routines, new image processing strategies, and faster computers (including the use of GPUs) have made it possible to attain near-atomic resolution cryo-EM maps with ease (Scheres, 2012; Yang et al., 2012; Elmlund et al., 2013; Grigorieff, 2016; Kimanius et al., 2016; Ludtke, 2016; Merk et al., 2016; Reboul et al., 2016).

Macromolecular complex structures solved during the so-called cryo-EM resolution revolution (Kühlbrandt, 2014) display atomic-resolution details, making it possible to elucidate important features that help to reveal their molecular mechanisms. More importantly, these near-atomic resolution structures can be obtained from the most challenging specimens, including large macromolecular machines with dynamic composition and conformations (Nogales and Scheres, 2015; Fernandez-Leiro and Scheres, 2016).

Heterogeneity Causes Resolution Anisotropy

Sample heterogeneity is an intrinsic feature of most macromolecular complexes as part of their dynamic mechanisms of action. Many cellular processes are driven by macromolecular complexes, which, far from being static, undergo functionally relevant conformational and compositional changes during their functional cycles. Thus, the control of this variability source in single particle image processing becomes of key importance to extract the maximum amount of useful information about the protein complex of interest.

In principle, structural variability poses a difficult challenge for 3D structure determination by cryo-EM, severely limiting the attainable resolution (Scheres, 2016). Regions of the structure that display heterogeneous features are incorrectly aligned and averaged with other images, causing the loss of coherent signal. The ultimate consequence is that the regions showing the highest heterogeneity become blurred or even invisible in the reconstruction, leading to less detailed and even incomplete maps.

There are several sources of sample heterogeneity that must be taken into consideration in order to deal with them (Figures 1A–C). Macromolecular assemblies are dynamic machines, which undergo conformational rearrangement in one or more protein subunits in order to perform their function (Alberts, 1998). This protein flexibility might be discrete or, most frequently, a continuum of different conformations of a particular protein subunit or multiple subunits moving independently from each other. The protein conformational landscape is also complicated by the inevitable co-existence in the sample of partially assembled complexes, partial subunit occupancy and/or even different protein subunit composition due to complex assembly dynamics (Figure 1). A particularly important case of protein complex heterogeneity is represented by symmetry mismatches.

Figure 1. (A–C) Cryo-EM images are usually acquired on heterogeneous samples. Contaminants (light blue), denatured or partially disassembled complexes (light red), as well as different protein complex conformations (gold) are imaged together with the specimen (gray). (D) General workflow to handle sample heterogeneity during image processing. Processes (black boxes), iterative processes (green boxes), and specific input data (blue boxes) are indicated. The consensus reconstruction is obtained from an initially heterogeneous dataset and is used as the initial reference for more specific refinement protocols that deal with sample heterogeneity.

Moreover, non-physiological heterogeneity may be introduced during sample and cryo-EM specimen preparation. On the one hand, most if not all samples cannot be purified to absolute homogeneity, and contaminants are carried through the preparation process. On the other hand, cryo-EM sample preparation implies freezing the sample within a thin layer of amorphous water ice, introducing a stress factor that risks artifacts such as breakage of protein-protein interactions within multi-subunit complexes, dissociation of the complex, and/or partial or complete denaturation caused by the collision of proteins with the extended air-water interface (Taylor and Glaeser, 2008). Last but not least are the errors introduced during image processing, which also affect the resolution of the final reconstruction. Among these, perhaps the most significant is poor particle selection due to limitations in the particle picking algorithms coupled with a lack of user supervision (Henderson, 2013).

Identification of Sample Heterogeneity

In cryo-EM, the resolution value provides a general criterion to evaluate the quality of solved macromolecular complex structures, and it is commonly estimated by the Fourier Shell Correlation (FSC) method (Saxton and Baumeister, 1982; van Heel and Stöffler-Meilicke, 1985; Harauz and van Heel, 1986). The estimated resolution also helps to define which structural details of a given reconstruction (e.g., secondary structure vs. amino acid side chains) can readily be interpreted and discussed. The presence of sample heterogeneity commonly manifests itself as difficulty in achieving a high-resolution map. In general, global measurements of resolution do not report internal variation in map quality, as the global value remains unaffected by small changes in the map. A detailed inspection of the density map might therefore reveal some regions whose details agree with the global resolution value while other regions are blurred. Those areas of lower resolution are usually caused by unaccounted sample heterogeneity. Thus, close examination of the local resolution variation has become critical to recognize and localize heterogeneity. Local resolution estimation was pioneered by ResMap (Kucukelbir et al., 2014), although most modern image processing packages now include a dedicated local resolution routine [e.g., cisTEM (Grant et al., 2018), cryoSPARC (Punjani et al., 2017), and RELION (Scheres, 2012)].
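To make the global FSC estimate concrete, the following minimal sketch correlates two independently reconstructed half-maps shell by shell in Fourier space. It assumes cubic maps of identical size and no masking; the function names and the default threshold of 0.143 (a value commonly used with gold-standard half-maps) are choices made for this illustration and do not reproduce the implementation of any particular package.

```python
import numpy as np

def fourier_shell_correlation(half1, half2):
    """FSC per integer-radius Fourier shell for two cubic maps."""
    if half1.shape != half2.shape or len(set(half1.shape)) != 1:
        raise ValueError("expected two cubic maps of identical shape")
    n = half1.shape[0]
    # Centre the zero frequency so that shells are concentric spheres.
    f1 = np.fft.fftshift(np.fft.fftn(half1))
    f2 = np.fft.fftshift(np.fft.fftn(half2))
    grid = np.indices(half1.shape) - n // 2
    shell = np.rint(np.sqrt((grid ** 2).sum(axis=0))).astype(int)
    n_shells = n // 2
    inside = shell < n_shells          # ignore the corners beyond Nyquist
    idx = shell[inside]
    cross = np.bincount(idx, weights=(f1[inside] * np.conj(f2[inside])).real,
                        minlength=n_shells)
    power1 = np.bincount(idx, weights=np.abs(f1[inside]) ** 2, minlength=n_shells)
    power2 = np.bincount(idx, weights=np.abs(f2[inside]) ** 2, minlength=n_shells)
    return cross / np.sqrt(power1 * power2)

def resolution_at_threshold(fsc, box_size, pixel_size, threshold=0.143):
    """Resolution (same unit as pixel_size) of the first shell below threshold.

    Shell k corresponds to spatial frequency k / (box_size * pixel_size).
    Returns None if the curve never drops below the threshold.
    """
    below = np.flatnonzero(fsc[1:] < threshold)
    if below.size == 0:
        return None
    k = below[0] + 1                   # skip the DC shell
    return box_size * pixel_size / k
```

Because a single number is returned for the whole map, a well-resolved core can hide a blurred periphery, which is the limitation that local resolution estimates address.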

Upon detection of sample heterogeneity, this must in turn be characterized and, more importantly, classified in order to improve the complex reconstruction. In this sense, 2D and, in particular, 3D classification have been extensively employed to deal with partial occupancy and asymmetric reconstructions.

2D and 3D Classification to Identify and Classify Sample Heterogeneity

Sample heterogeneity can, in some cases, be recognized just by visual inspection of the obtained maps, where it appears as blurry regions in the density. However, a detailed and more accurate analysis requires dedicated software tools. Historically, many different approaches have been developed to classify heterogeneous cryo-EM data. Reference-based methods (Rossmann and Blow, 1962) were progressively replaced by statistics-based algorithms, including multivariate statistical analysis (MSA) and principal component analysis (PCA) (van Heel and Frank, 1981; Klaholz, 2015; Haselbach et al., 2017). Among the different 2D and 3D classification algorithms, those that rely on regularized likelihood optimization, such as RELION, have been proven to deal reliably with sample heterogeneity (Sigworth, 1998; Scheres et al., 2005; Scheres, 2012; Song et al., 2013; Chowdhury et al., 2015). Although successful, the maximum likelihood (ML) approach has some methodological limitations to be considered. First, these methods are prone to becoming trapped in local minima. This is particularly relevant when dealing with intrinsically noisy cryo-EM images, and local optimization algorithms have been implemented in several image processing packages in order to overcome this problem (Sorzano et al., 2010; Elmlund et al., 2013; Punjani et al., 2017). Second, in the particular case of macromolecular motions of a continuous nature, the discrete classes defined during ML-based image classification will not capture every conformation of the macromolecular complex. Alternative methods based on manifold embedding have also been proposed in order to map continuous heterogeneity in a data-specific coordinate system (Frank and Ourmazd, 2016).

For 2D classification, the RELION algorithm marginalizes only over in-plane orientations, which implies an inherent limitation in its capacity to identify structural heterogeneity in the sample. The user has to interpret the experimental projection images, and this might lead to a misinterpretation of the structural heterogeneity. Still, this algorithm is very useful for identifying bad particle alignment caused by partially unfolded protein, incorrectly selected particles, or particles selected too close to one another.

On the other hand, RELION 3D classification contributes both to the selection of a particle dataset suitable for high-resolution reconstruction and to further identification of sample heterogeneity, since the algorithm marginalizes over both the orientational and class assignments of the particle images (Scheres et al., 2007). Even if the number of classes is lower than the actual number of structures contained in the data set, an initial 3D classification helps to separate large differences in the structure. In order to find the proper number of classes for a given data set, as well as to assess the reproducibility and consistency of the classification, it is recommended to test several rounds of classification using different initial references and different numbers of classes (Scheres, 2012). Subsequent 3D classifications with increasingly exhaustive angular searches, as well as with the use of specific soft-edge masks, will shed light on smaller structural differences among the classes, contributing both to a more homogeneous particle dataset for the high-resolution reconstruction and to the identification of sample heterogeneity (Scheres, 2016). The FrealignX 3D classification included in the cisTEM software (Grant et al., 2018) represents a valuable alternative. Apart from using an ML approach (Lyumkis et al., 2013), it allows the classification to be restricted and focused on a masked-in sub-volume, as in RELION 3D classification. Another package, CryoSPARC (Punjani et al., 2017), includes a heterogeneous refinement routine that not only classifies but also refines heterogeneous 3D structures. Unlike RELION, CryoSPARC 3D classification and refinement requires a number of user-provided initial references to be refined simultaneously with the classified particle datasets. A more informative description of the conformational landscape of a macromolecular complex can be obtained by combining 3D classification procedures with principal component analysis. An energy landscape is then calculated in which the conformational variations are plotted in a coordinate system (Clare et al., 2012; Haselbach et al., 2017).
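As a rough illustration of the coordinate system mentioned above, the sketch below applies principal component analysis to a set of 3D class volumes by flattening each volume into a vector and taking a singular value decomposition. It assumes the volumes are already aligned and sampled on the same grid; the function name is hypothetical, and the conversion of class populations into energies used in the cited studies is not reproduced here.

```python
import numpy as np

def pca_coordinates(volumes, n_components=2):
    """Project aligned class volumes onto their principal components.

    Returns (coords, components): coords has one row per volume, and each
    row of components is a flattened eigenvolume.
    """
    data = np.stack([np.asarray(v, dtype=float).ravel() for v in volumes])
    data = data - data.mean(axis=0)        # remove the average structure
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    coords = u[:, :n_components] * s[:n_components]
    return coords, vt[:n_components]
```

Each class volume becomes a point in the plane spanned by the first two components, so classes that differ mainly by one concerted motion line up along a single axis.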

Focused Processing on Heterogeneous Structures

A general approach takes advantage of 3D classification to independently refine a particular class that could represent a homogeneous entity within the heterogeneous dataset (Figure 1D) (Bai et al., 2015b; Nguyen et al., 2016). Other, more focused approaches to treating heterogeneous samples are known as localized reconstruction methods and allow a more detailed analysis of small differences within the structures (Ilca et al., 2015; D'Imprima et al., 2017; Nakane et al., 2018).

As mentioned above, both masked refinement and masked classification are suitable tools to deal with the presence of multiple structures within the same data set. Both protocols start by applying a 3D mask to the reference at every iteration, including the density of interest while the rest of the complex is masked out. Thus, creating an adequate mask for the processing turns out to be essential. Moreover, validation of the reconstruction obtained should critically involve analysis of any masking effects.

Historically, masks have been applied during image processing to remove the solvent noise surrounding a molecular envelope, mainly at the alignment stage, and after 3D refinement when the gold-standard FSC calculation is used (Henderson et al., 2012; Scheres, 2012; Scheres and Chen, 2012). However, applying a mask is not only helpful to improve the resolution of the constant part of the complex but it is also the only effective way to isolate and extract heterogeneous regions to be treated as independent single particles for classification and high-resolution refinement purposes. This approach makes it important to closely test and monitor the created masks so they work as intended without introducing processing artifacts (Rawson et al., 2016).

Importantly, any generated masks should have soft edges rather than sharp edges in order to prevent artifacts in the Fourier domain during FSC estimations (Chen et al., 2013). Another important aspect to be considered in designing a mask for a 3D refinement is that including detailed features in the mask will lead to overfitting during the refinement, which causes an increase in the FSC curve that does not reflect the true signal-to-noise ratio and a consequent overestimation of the resolution (Rosenthal and Henderson, 2003; Chen et al., 2013). Thus, the reference volume that is used to create the mask is usually low-pass filtered.
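The difference between a sharp and a soft edge can be sketched as follows. For simplicity the example builds a spherical mask whose value falls from 1 to 0 over a raised-cosine edge; masks used in practice follow the low-pass filtered molecular envelope rather than a sphere, and the function name and parameters are assumptions made for this illustration.

```python
import numpy as np

def soft_spherical_mask(box_size, radius, edge_width):
    """Spherical mask: 1 inside radius, cosine falloff to 0 over edge_width."""
    grid = np.indices((box_size,) * 3) - box_size // 2
    r = np.sqrt((grid ** 2).sum(axis=0))
    mask = np.zeros((box_size,) * 3)
    mask[r <= radius] = 1.0
    edge = (r > radius) & (r < radius + edge_width)
    # Raised cosine: 1 at the inner boundary, 0 at the outer boundary.
    mask[edge] = 0.5 * (1.0 + np.cos(np.pi * (r[edge] - radius) / edge_width))
    return mask
```

Setting edge_width to zero recovers a binary mask, whose abrupt boundary is the kind of sharp edge the text above warns against.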

Focused Refinement Allows Reaching High Resolution Information in cryo-EM Reconstructions

Masked 3D refinement is a suitable tool to deal with datasets that contain flexible structures within the same protein complex. This method focuses the refinement protocol on a specific region within the subject structure (Louder et al., 2016; Nguyen et al., 2016). Through a masking operation that excludes everything but the region of interest throughout all the refinement iterations, only the signal of the target region is used for particle alignment, and the map that comes out of the refinement corresponds solely to the masked region (Nguyen et al., 2016; Nakane et al., 2018). The interface between the subvolumes might not be well resolved, so it is important to analyse them in parallel with the unmasked refinement map of the entire volume (Scheres, 2016).

It is worth mentioning that refinement in single particle analysis of cryo-EM data is understood as the refinement of particle orientations with respect to an initial reference. In the course of the masked 3D refinement, the designed mask is applied to the 3D structures, so that the reference projections that will be compared with the experimental images only contain information about the region of interest. However, in this original approach the subvolume mask is not applied to the experimental 2D images, so they contain information about the entire particle. Thus, an inconsistency arises when reference and experimental projections are compared, and the density in the experimental data that is not present in the reference projection acts as noise that might impair orientational and class assignments (Bai et al., 2015b; Ilca et al., 2015). In order to overcome this problem, and based on previous structural studies of symmetry mismatches in bacteriophage phi29 (Morais et al., 2003) and flaviviruses (Zhang et al., 2007), a density subtraction protocol was developed (Figure 2A). Subtraction of the experimental signal that is masked out in the 3D reference is carried out at the level of the 2D images, so that the comparison inconsistency mentioned above is minimized. It is highly recommended to validate that the density was correctly subtracted by calculating a 3D reconstruction of the subtracted particles. Finally, the new dataset of subtracted experimental images can be used for both masked 3D refinement and classification, where some parameters such as orientational searches can be fine-tuned to improve particle alignment (Nguyen et al., 2015; Scheres, 2016). Very recently, a new method based on the signal-subtraction and masked 3D refinement protocol has been specifically designed for simultaneous refinement of a number of regions of interest within the same protein complex.
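The core arithmetic of density subtraction can be sketched under strong simplifications. The example assumes that the particle orientation is already known and coincides with a grid axis, and it ignores the contrast transfer function, in-plane shifts, and noise, all of which a real protocol has to model. The function names are hypothetical.

```python
import numpy as np

def project(volume, axis=0):
    """Toy projection: sum the density along one grid axis."""
    return volume.sum(axis=axis)

def subtract_signal(image, reference, keep_mask, axis=0):
    """Remove from a 2D image the projected density lying outside keep_mask.

    reference : 3D consensus map in the same orientation as the image
    keep_mask : 3D mask (values in [0, 1]) around the region of interest
    """
    unwanted = reference * (1.0 - keep_mask)
    return image - project(unwanted, axis=axis)
```

Because projection is linear, the subtracted image equals the projection of the masked region alone in this idealized setting, which is the consistency between reference and experimental projections that the protocol aims for.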

Figure 2. (A) Schematic representation of the density subtraction protocol. Masks generated from the initial map are applied to obtain both projections of the map region that will be subtracted from the experimental images and projections of the region that will be subjected to masked 3D classification and refinement [based on (Bai et al., 2015b)]. (B) General workflow to extract as much structural information as possible from the pseudosymmetric features contained in a macromolecular complex.

Another consideration is that macromolecular complexes often display continuous motions rather than discrete conformational states, and understanding those motions might be critical to functionally characterize the protein complex. Normal mode analysis has been used to deduce macromolecular motions for low-resolution maps (Tama et al., 2002), and the previously discussed masked 3D classification and refinement protocols allow reconstruction of the bodies that move relative to one another within the macromolecular complex. Recent contributions in the field allow for the reconstruction of much smaller regions of a given map (Schilbach et al., 2017). Based on this, a new method called multibody refinement uses the entire dataset in the iterative 3D refinement of previously defined individual bodies (Nguyen et al., 2015; Nakane et al., 2018). Moreover, it also provides a principal component analysis to monitor relative motions between the reconstructed bodies. CryoSPARC's non-uniform refinement also aims to improve the resolution of disordered or flexible regions by reducing their tendency to over-fit (Punjani et al., 2017).

Pseudo-symmetry in Macromolecular Complexes as an Advantageous Feature for Single Particle Refinement

A fair number of macromolecular complexes present some kind of internal symmetry or pseudo-symmetry, the latter arising when more than one copy of the same structural element is not related by the normal symmetry of the complex. Symmetry is both a relevant feature in the biological function of the complex and a very useful feature that can be taken advantage of in single particle image processing. In practice, the application of symmetry during 3D refinement multiplies the number of useful particles or of the structural elements that are symmetrically related.

The signal might be boosted through the application of symmetry in the low-resolution range, while it would impair protein alignment and impede attaining high-resolution when incorrect symmetry operators are imposed. Some complexes can be considered symmetric at low resolution with asymmetric features becoming observable only at high-resolution. Therefore, it is crucial to determine and demonstrate the nature of the macromolecular complex symmetry so the final volume corresponds to the actual protein structure.

Beyond the mere examination of a model that has more than one copy of a chain, several methods to identify internal symmetry and pseudo-symmetry in a structure have been developed. In particular, Phenix has implemented a tool based on a direct search for patterns of density that are present in more than one place in the map (Terwilliger, 2013). The software refines the correlation values between the identified symmetric element and the rest of the map, so that the final orientations and translations that define the pseudo-symmetry are provided. Once the pseudo-symmetry is defined, and based on its nature, there are different approaches to apply it during structure refinement (Figure 2B).

Classical Approach to Pseudo-symmetric Maps

Prior to the development of the 3D masked classification and refinement to improve resolution for pseudo-symmetric complexes, parallel refinements were usually carried out with and without symmetry application. The aim was to impose symmetry during the refinement of the molecular complex, enforcing signal for its symmetric part. The density map improvement would be obtained at the cost of blurring the asymmetric regions. In the end a chimeric map would be built using the symmetric density map obtained from the 3D refinement where the symmetry was imposed, and the asymmetric density map where the symmetry was not considered. Several macromolecular complex structures have been solved thanks to this approach (Coloma et al., 2009).

An evident issue of the method described appears at the interface between both independently refined maps where the information between the connected areas will be lost. Moreover, due to the misalignment of the asymmetric part during the symmetry-imposed 3D refinement, the achieved resolution will be necessarily limited. Over time, new methods have been developed to overcome these limitations.

Density Maps of Identical Regions of a Macromolecular Complex can be Aligned and Averaged to Boost Structure Resolution

When a pseudo-symmetric protein complex contains more than one copy of the same object, it is possible to manually apply the symmetry directly to the symmetric area of the 3D reconstruction. The reasoning behind this method is the same as in the subtomogram averaging approach (Wan and Briggs, 2016). Subvolumes of each symmetric element of the macromolecular complex can be extracted from the overall structure map, aligned, and averaged. The density signal is thereby boosted and higher resolution maps are obtained. Tools for subvolume extraction, map alignment, and averaging can be found in many image processing packages, as well as in 3D structure visualization software such as Chimera (Pettersen et al., 2004).

The potential of this method is greater when the alignment and averaging of the identical regions of the macromolecular complex can be performed at the level of the 2D experimental images rather than on 3D reconstructed volumes. This idea has allowed the development of powerful new methods such as the symmetry-expansion approach (Ilca et al., 2015; Kimanius et al., 2016; Zivanov et al., 2018).

Symmetry-expansion Approach

In a symmetric molecular complex, individual and symmetric elements of the structure can be conformationally heterogeneous, making the complex asymmetric. In this case, each density element is subtracted from the overall structure and treated as single particles that are related by the overall symmetry of the complex. The expanded dataset is obtained by applying to the 2D experimental images the corresponding transformation calculated from the known overall symmetry of the complex coupled with the subtraction operation. The structure of the asymmetric feature is then obtained through a combination of 3D masked classification and refinement strictly with local angular searches (Zhou et al., 2015).
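The bookkeeping behind the expanded dataset can be sketched for a cyclic point group. The example represents each particle orientation as a 3×3 rotation matrix and generates one copy per symmetry operator; the multiplication order (orientation times operator) is a convention assumed for this illustration, and actual packages store orientations as Euler angles in particle metadata instead. The function names are hypothetical.

```python
import numpy as np

def cn_operators(n):
    """Rotation matrices of the cyclic point group C_n about the z axis."""
    ops = []
    for k in range(n):
        a = 2.0 * np.pi * k / n
        c, s = np.cos(a), np.sin(a)
        ops.append(np.array([[c, -s, 0.0],
                             [s,  c, 0.0],
                             [0.0, 0.0, 1.0]]))
    return ops

def symmetry_expand(orientations, operators):
    """Return (particle_index, expanded_orientation) for every operator.

    Each experimental image is reused once per symmetry-related position,
    so the expanded list has len(orientations) * len(operators) entries.
    """
    return [(i, rot @ op)
            for i, rot in enumerate(orientations)
            for op in operators]
```

After expansion, every symmetry-related copy of the element of interest can be brought to the same position in the reference frame, so a single mask suffices for the subsequent local classification and refinement.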

This approach was originally applied by Briggs et al. (2005) to refine the vertices of the Kelp fly virus capsids and was initially implemented in RELION for cyclic symmetries (Kimanius et al., 2016). More recently, a new version of this method also makes it possible to deal with molecular complexes whose asymmetrical feature is not related to the overall symmetry of the complex (Zivanov et al., 2018). Given a density map, this local symmetry method creates one mask per identical region and defines a unique operator that rotates a masked region to match another, symmetry-related identical region. Initial operators are estimated from global orientation searches over the three Euler angles and are then refined using progressively finer angular and translational searches. The final optimized operators are combined to apply the local symmetry in the subsequent 3D classification and reconstruction. This method can also be applied iteratively to improve the final structure.


Since the earliest applications of the technique, the compositional and conformational dynamics of protein complexes have represented a challenge for EM image processing and single particle analysis. Nowadays this paradigm has begun to change thanks to the development of a new generation of computational tools that take full advantage of the recent revolutionary advances in EM hardware and data acquisition protocols. As for what lies ahead for the field, there is still much room for further improvements on current strategies and the development of wholly new strategies for image data analysis.

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest Statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Alberts, B. (1998). The cell as a collection of protein machines: preparing the next generation of molecular biologists. Cell 92, 291–294.

Bai, X. C., Rajendra, E., Yang, G., Shi, Y., and Scheres, S. H. (2015b). Sampling the conformational space of the catalytic subunit of human γ-secretase. Elife 4:e11182. doi: 10.7554/eLife.11182

Briggs, J. A., Huiskonen, J. T., Fernando, K. V., Gilbert, R. J., Scotti, P., Butcher, S. J., et al. (2005). Classification and three-dimensional reconstruction of unevenly distributed or symmetry mismatched features of icosahedral particles. J. Struct. Biol. 150, 332–339. doi: 10.1016/j.jsb.2005.03.009

Brilot, A. F., Chen, J. Z., Cheng, A., Pan, J., Harrison, S. C., Potter, C. S., et al. (2012). Beam-induced motion of vitrified specimen on holey carbon film. J. Struct. Biol. 177, 630–637. doi: 10.1016/j.jsb.2012.02.003

Campbell, M. G., Cheng, A., Brilot, A. F., Moeller, A., Lyumkis, D., Veesler, D., et al. (2012). Movies of ice-embedded particles enhance resolution in electron cryo-microscopy. Structure 20, 1823–1828. doi: 10.1016/j.str.2012.08.026

Chen, S., McMullan, G., Faruqi, A. R., Murshudov, G. N., Short, J. M., Scheres, S. H., et al. (2013). High-resolution noise substitution to measure overfitting and validate resolution in 3D structure determination by single particle electron cryomicroscopy. Ultramicroscopy 135, 24–35. doi: 10.1016/j.ultramic.2013.06.004

Chowdhury, S., Ketcham, S. A., Schroer, T. A., and Lander, G. C. (2015). Structural organization of the dynein–dynactin complex bound to microtubules. Nat. Struct. Mol. Biol. 22, 345–347. doi: 10.1038/nsmb.2996

Clare, D. K., Vasishtan, D., Stagg, S., Quispe, J., Farr, G. W., Topf, M., et al. (2012). ATP-triggered conformational changes delineate substrate-binding and -folding mechanics of the GroEL chaperonin. Cell 149, 113–123. doi: 10.1016/j.cell.2012.02.047

Coloma, R., Valpuesta, J. M., Arranz, R., Carrascosa, J. L., Ortín, J., and Martín-Benito, J. (2009). The structure of a biologically active influenza virus ribonucleoprotein complex. PLoS Pathog. 5:e1000491. doi: 10.1371/journal.ppat.1000491

D'Imprima, E., Salzer, R., Bhaskara, R. M., Sánchez, R., Rose, I., Kirchner, L., et al. (2017). Cryo-EM structure of the bifunctional secretin complex of Thermus thermophilus. Elife 6:e30483. doi: 10.7554/eLife.30483

Elmlund, H., Elmlund, D., and Bengio, S. (2013). PRIME: probabilistic initial 3D model generation for single-particle cryo-electron microscopy. Structure 21, 1299–1306. doi: 10.1016/j.str.2013.07.002

Frank, J., and Ourmazd, A. (2016). Continuous changes in structure mapped by manifold embedding of single-particle data in cryo-EM. Methods 100, 61–67. doi: 10.1016/j.ymeth.2016.02.007

Grant, T., and Grigorieff, N. (2015). Measuring the optimal exposure for single particle cryo-EM using a 2.6 Å reconstruction of rotavirus VP6. Elife 4:e06980. doi: 10.7554/eLife.06980

Harauz, G., and van Heel, M. (1986). Exact filters for general geometry 3-dimensional reconstruction. Optik 73, 146–156.

Haselbach, D., Schrader, J., Lambrecht, F., Henneberg, F., Chari, A., and Stark, H. (2017). Long-range allosteric regulation of the human 26S proteasome by 20S proteasome-targeting cancer drugs. Nat. Commun. 8:15578. doi: 10.1038/ncomms15578

Henderson, R. (2013). Avoiding the pitfalls of single particle cryo-electron microscopy: Einstein from noise. Proc. Natl. Acad. Sci. U.S.A. 110, 18037–18041. doi: 10.1073/pnas.1314449110

Henderson, R., and McMullan, G. (2013). Problems in obtaining perfect images by single-particle electron cryomicroscopy of biological structures in amorphous ice. Microscopy 62, 43–50. doi: 10.1093/jmicro/dfs094

Henderson, R., Sali, A., Baker, M. L., Carragher, B., Devkota, B., Downing, K. H., et al. (2012). Outcome of the first electron microscopy validation task force meeting. Structure 20, 205–214. doi: 10.1016/j.str.2011.12.014

Ilca, S. L., Kotecha, A., Sun, X., Poranen, M. M., Stuart, D. I., and Huiskonen, J. T. (2015). Localized reconstruction of subunits from electron cryomicroscopy images of macromolecular complexes. Nat. Commun. 6:8843. doi: 10.1038/ncomms9843

Kimanius, D., Forsberg, B. O., Scheres, S. H., and Lindahl, E. (2016). Accelerated cryo-EM structure determination with parallelisation using GPUs in RELION-2. Elife 5:e18722. doi: 10.7554/eLife.18722

Klaholz, B. P. (2015). Structure sorting of multiple macromolecular states in heterogeneous Cryo-EM samples by 3D multivariate statistical analysis. Open J. Stat. 5, 820–836. doi: 10.4236/ojs.2015.57081

Li, X., Mooney, P., Zheng, S., Booth, C. R., Braunfeld, M. B., Gubbens, S., et al. (2013). Electron counting and beam-induced motion correction enable near-atomic-resolution single-particle cryo-EM. Nat. Methods 10:584. doi: 10.1038/nmeth.2472

Louder, R. K., He, Y., López-Blanco, J. R., Fang, J., Chacón, P., and Nogales, E. (2016). Structure of promoter-bound TFIID and model of human pre-initiation complex assembly. Nature 531, 604–609. doi: 10.1038/nature17394

Lyumkis, D., Brilot, A. F., Theobald, D. L., and Grigorieff, N. (2013). Likelihood-based classification of cryo-EM images using FREALIGN. J. Struct. Biol. 183, 377–388. doi: 10.1016/j.jsb.2013.07.005

McMullan, G., Faruqi, A. R., Clare, D., and Henderson, R. (2014). Comparison of optimal performance at 300keV of three direct electron detectors for use in low dose electron microscopy. Ultramicroscopy 147, 156–163. doi: 10.1016/j.ultramic.2014.08.002

Merk, A., Bartesaghi, A., Banerjee, S., Falconieri, V., Rao, P., Davis, M. I., et al. (2016). Breaking Cryo-EM resolution barriers to facilitate drug discovery. Cell 165, 1698–1707. doi: 10.1016/j.cell.2016.05.040

Morais, M. C., Kanamaru, S., Badasso, M. O., Koti, J. S., Owen, B. A., McMurray, C. T., et al. (2003). Bacteriophage phi29 scaffolding protein gp7 before and after prohead assembly. Nat. Struct. Biol. 10, 572–576. doi: 10.1038/nsb939

Nakane, T., Kimanius, D., Lindahl, E., and Scheres, S. H. (2018). Characterisation of molecular motions in cryo-EM single-particle data by multi-body refinement in RELION. eLife 7:e36861. doi: 10.7554/eLife.36861

Nguyen, T. H., Galej, W. P., Bai, X. C., Savva, C. G., Newman, A. J., Scheres, S. H., et al. (2015). The architecture of the spliceosomal U4/U6.U5 tri-snRNP. Nature 523, 47–52. doi: 10.1038/nature14548

Nguyen, T. H. D., Galej, W. P., Bai, X. C., Oubridge, C., Newman, A. J., Scheres, S. H. W., et al. (2016). Cryo-EM structure of the yeast U4/U6.U5 tri-snRNP at 3.7 Å resolution. Nature 530, 298–302. doi: 10.1038/nature16940

Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., et al. (2004). UCSF Chimera–a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612. doi: 10.1002/jcc.20084

Punjani, A., Rubinstein, J. L., Fleet, D. J., and Brubaker, M. A. (2017). cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296. doi: 10.1038/nmeth.4169

Rawson, S., Iadanza, M. G., Ranson, N. A., and Muench, S. P. (2016). Methods to account for movement and flexibility in cryo-EM data processing. Methods 100, 35–41. doi: 10.1016/j.ymeth.2016.03.011

Reboul, C. F., Bonnet, F., Elmlund, D., and Elmlund, H. (2016). A stochastic hill climbing approach for simultaneous 2D alignment and clustering of cryogenic electron microscopy images. Structure 24, 988–996. doi: 10.1016/j.str.2016.04.006

Rosenthal, P. B., and Henderson, R. (2003). Optimal determination of particle orientation, absolute hand, and contrast loss in single-particle electron cryomicroscopy. J. Mol. Biol. 333, 721–745. doi: 10.1016/j.jmb.2003.07.013

Rossmann, M. G., and Blow, D. M. (1962). The detection of sub-units within the crystallographic asymmetric unit. Acta Crystallogr. 15, 24–31. doi: 10.1107/S0365110X62000067

Saxton, W. O., and Baumeister, W. (1982). The correlation averaging of a regularly arranged bacterial cell envelope protein. J. Microsc. 127, 127–138. doi: 10.1111/j.1365-2818.1982.tb00405.x

Scheres, S. H., Gao, H., Valle, M., Herman, G. T., Eggermont, P. P., Frank, J., et al. (2007). Disentangling conformational states of macromolecules in 3D-EM through likelihood optimization. Nat. Methods 4, 27–29. doi: 10.1038/nmeth992

Scheres, S. H., Valle, M., and Carazo, J. M. (2005). Fast maximum-likelihood refinement of electron microscopy images. Bioinformatics 21, 243–244. doi: 10.1093/bioinformatics/bti1140

Schilbach, S., Hantsche, M., Tegunov, D., Dienemann, C., Wigge, C., Urlaub, H., et al. (2017). Structures of transcription pre-initiation complex with TFIIH and Mediator. Nature 551, 204–209. doi: 10.1038/nature24282

Song, C. F., Papachristos, K., Rawson, S., Huss, M., Wieczorek, H., Paci, E., et al. (2013). Flexibility within the rotor and stators of the vacuolar H+-ATPase. PLoS ONE 8:e82207. doi: 10.1371/journal.pone.0082207

Sorzano, C. O., Bilbao-Castro, J. R., Shkolnisky, Y., Alcorlo, M., Melero, R., Caffarena-Fernández, G., et al. (2010). A clustering approach to multireference alignment of single-particle projections in electron microscopy. J. Struct. Biol. 171, 197–206. doi: 10.1016/j.jsb.2010.03.011

Tama, F., Wriggers, W., and Brooks, C. L. (2002). Exploring global distortions of biological macromolecules and assemblies from low-resolution structural information and elastic network theory. J. Mol. Biol. 321, 297–305. doi: 10.1016/S0022-2836(02)00627-7

Taylor, K. A., and Glaeser, R. M. (2008). Retrospective on the early development of cryoelectron microscopy of macromolecules and a prospective on opportunities for the future. J. Struct. Biol. 163, 214–223. doi: 10.1016/j.jsb.2008.06.004

Terwilliger, T. C. (2013). Finding non-crystallographic symmetry in density maps of macromolecular structures. J. Struct. Funct. Genomics 14, 91–95. doi: 10.1007/s10969-013-9157-7

van Heel, M., and Frank, J. (1981). Use of multivariate statistics in analysing the images of biological macromolecules. Ultramicroscopy 6, 187–194.

van Heel, M., and Stöffler-Meilicke, M. (1985). Characteristic views of E. coli and B. stearothermophilus 30S ribosomal subunits in the electron microscope. EMBO J. 4, 2389–2395.

Yang, C., Ji, G., Liu, H., Zhang, K., Liu, G., Sun, F., et al. (2012). Cryo-EM structure of a transcribing cypovirus. Proc. Natl. Acad. Sci. U.S.A. 109, 6118–6123. doi: 10.1073/pnas.1200206109

Zhang, Y., Kaufmann, B., Chipman, P. R., Kuhn, R. J., and Rossmann, M. G. (2007). Structure of immature West Nile virus. J. Virol. 81, 6141–6145. doi: 10.1128/JVI.00037-07

Zheng, S. Q., Palovcak, E., Armache, J. P., Verba, K. A., Cheng, Y., and Agard, D. A. (2017). MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332. doi: 10.1038/nmeth.4193

Zhou, M., Li, Y., Hu, Q., Bai, X. C., Huang, W., Yan, C., et al. (2015). Atomic structure of the apoptosome: mechanism of cytochrome c- and dATP-mediated activation of Apaf-1. Genes Dev. 29, 2349–2361. doi: 10.1101/gad.272278.115

Zivanov, J., Nakane, T., Forsberg, B., Kimanius, D., Hagen, W. J., Lindahl, E., et al. (2018). RELION-3: new tools for automated high-resolution cryo-EM structure determination. eLife 7:e42166. doi: 10.7554/eLife.42166

Keywords: cryo-electron microscopy, single particle processing, macromolecular complexes, heterogeneity, refinement, resolution, pseudosymmetry

Citation: Serna M (2019) Hands on Methods for High Resolution Cryo-Electron Microscopy Structures of Heterogeneous Macromolecular Complexes. Front. Mol. Biosci. 6:33. doi: 10.3389/fmolb.2019.00033

Received: 10 October 2018; Accepted: 24 April 2019;
Published: 15 May 2019.

Edited by:

Marta Carroni, Science for Life Laboratory (SciLifeLab), Sweden

Copyright © 2019 Serna. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Marina Serna, [email protected]


Processing cryo em data

Unlock the potential of cryo-EM with cryoSPARC


Extract valuable insight from single particle cryo-EM data

CryoSPARC combines powerful innovations in 3D reconstruction algorithms with specially designed software to provide a streamlined end-to-end single particle cryo-EM workflow. Rapidly solve high-resolution structures of biologically important targets, with advanced tools for membrane proteins, heterogeneous samples, and flexible molecules. Process 3D refinements in minutes on GPU.

Read more about the scientific advances in cryoSPARC in Nature Methods, check out hundreds of studies that have used cryoSPARC, and subscribe to our newsletter to stay up to date.


EM Data Processing and Interpretation

Electron Microscopy (EM) is a powerful magnification tool utilizing focused beams of electrons to obtain topographical, morphological and compositional information for a large variety of biological samples, which makes it invaluable in many science and industry applications. Advances in sample preparation methods and development of EM instrumentation have allowed electron micrographs to be acquired with ever-increasing resolution. However, the processing and analysis of EM data remains a difficult step, which relies heavily on expertise and the use of proper software tools.

The EM platform at Creative Biostructure covers a wide spectrum of techniques for sample preparation, imaging, and image analysis in 2D and 3D. Equipped with high-end image analysis workstations, we focus mainly on the study of large macromolecules, but we can also perform ultrastructural analysis of microorganisms, nanoparticles, and cellular/tissue specimens. We have extensive experience in automated and manual image and data analysis methods and a strong command of software tools for statistical analysis and mathematical modelling.

Typical types of our services include:

  • TEM/SEM image enhancement, segmentation, signal to noise ratio optimization
  • Nanoparticle morphology, circularity, homogeneity and size distribution
  • Electron tomography of cellular structures with subnanometer resolution
  • Automatic particle picking and multi-class classification
  • 2D class averaging and 3D reconstruction
  • Localization and identification of specific macromolecules in tomographic volumes
  • Fitting of individual structures into cryo-EM maps of macromolecular complexes
  • Hybrid approaches combined with available structural information
  • Customizable analysis based on requests

Figure 1. EM data processing and analysis

Creative Biostructure uses various leading software packages for advanced EM data evaluation and analysis. Our staff can also help you define optimal experimental conditions and sample preparation methods for your biological samples, ensuring the collection of highest-quality data for interpretation. We promise to work closely with our customers to provide tailored EM solutions for a wide spectrum of biological samples.

Please feel free to contact us for a detailed quote.

Ordering Process


  1. Carazo JM, et al. (2015) “Three-dimensional reconstruction methods in Single Particle Analysis from transmission electron microscopy data”. Arch Biochem Biophys 581:39-48.

For Research Use Only. Not for use in diagnostic or therapeutic procedures.


Learn About CryoEM Data Storage & Processing

Shimon Ben David. May 8, 2020

What is Cryo-EM?

In order to further understand what Cryo-EM is, we need to discuss another topic first: drug discovery.

Pharmaceutical companies produce drugs that serve many purposes: painkillers, cures for specific diseases, and much more.

Another term we should discuss is “proteins”. You can think of proteins as little engines that have specific functions within our body, for example fighting viruses or transmitting messages.

In order to produce an effective drug, researchers need to see the structure of proteins inside our cells so that they can design a drug that binds effectively to a specific protein type, for example a drug that binds to a pain receptor protein to block the pain. Imagine puzzle pieces that fit each other perfectly, compared with mismatched pieces that only almost fit.

The better the researchers can see the protein, the better the drug can be: it is more effective or requires a smaller dosage, which is healthier for the human body and especially the liver.

With that background, we can go back to the main topic. Cryo-EM, short for Cryogenic Electron Microscopy, is a technique used within a larger field of research called Structure-Based Drug Design (SBDD). The process involves taking biological samples, freezing them, then imaging them with a beam of electrons. This generates multiple pictures of the proteins themselves. During this process the sample can move, even though it is frozen, which is why it can be compared to taking multiple pictures of an object from multiple angles while it is moving, with all of the challenges that come with it.

Scientists can then use these 2D pictures to generate 3D images of the proteins. Since the protein is moving during this time, the result is actually a movie-like output. Using that, researchers can design a drug that binds better to that protein.

Cryo-EM Pipeline

The Cryo-EM process is composed of multiple steps that vary according to the exact pipeline. Multiple frameworks are also in use, such as RELION, cryoSPARC, CTFFIND, and more. In general, the steps include motion correction, CTF estimation, particle picking, and particle extraction.
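To make the last of these steps more concrete, here is a toy sketch of particle extraction: cropping a fixed-size box around each picked coordinate. It is an illustration only, not how RELION or cryoSPARC implement it. The function name `extract_particles`, the list-of-lists micrograph, and the choice to skip picks too close to the edge are all assumptions made for the example.

```python
def extract_particles(micrograph, centers, box_size):
    """Crop square boxes of side `box_size` around each (row, col) pick.

    Assumption for this sketch: picks whose box would cross the
    micrograph edge are skipped rather than padded.
    """
    height = len(micrograph)
    width = len(micrograph[0])
    half = box_size // 2
    particles = []
    for row, col in centers:
        top, left = row - half, col - half
        inside = (
            top >= 0
            and left >= 0
            and top + box_size <= height
            and left + box_size <= width
        )
        if not inside:
            continue  # box would fall outside the image
        box = [line[left:left + box_size]
               for line in micrograph[top:top + box_size]]
        particles.append(box)
    return particles


# Hypothetical 8x8 "micrograph" whose pixel value encodes its position.
micrograph = [[r * 8 + c for c in range(8)] for r in range(8)]
picks = [(4, 4), (0, 0), (2, 5)]  # the (0, 0) pick is too close to the edge
boxes = extract_particles(micrograph, picks, box_size=4)
print(len(boxes), "particles extracted")
```

In a real pipeline the picks would come from the particle picking step and the micrograph from motion correction; here both are made up so the example runs on its own.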

Cryo-EM Data Processing

Cryo-EM microscopes are very expensive and in high demand, so they are usually run 24×7 and produce large amounts of data. The data then needs to pass through multiple steps to reach the final movie-like 3D result. The ability to process large amounts of data can also yield better images of the proteins, which in turn can improve the resulting drug.

Since this is a fairly new field, it has already adapted to newer technologies: Cryo-EM frameworks such as RELION and cryoSPARC can leverage GPUs to accelerate their workloads.

Data Storage Requirements for Cryo-EM

A single Cryo-EM run can capture thousands of images and generate anywhere between 1 TB and 10 TB of raw data. Each step in the pipeline usually halves that amount of data (removing bad images, low-resolution images, duplicates, etc.). The main storage challenges for this pipeline are the high variability in data sizes and in the access patterns of each step. The pipeline begins as a high-throughput, sequential-access I/O use case, and with each step it moves toward smaller, random I/O. Additionally, multiple GPU servers need to read the data in parallel efficiently in order to minimize processing time. The storage also needs to scale to petabytes so that researchers are not required to delete the original data.
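The halving rule of thumb can be turned into a quick back-of-the-envelope estimate. The sketch below is only an illustration: it assumes the reduction applies once at each of the four steps named earlier and that exactly half of the data is kept each time, neither of which is guaranteed for a real pipeline. The names `STEPS` and `sizes_after_each_step` are made up for the example.

```python
# Assumed step list, taken from the pipeline description in this post.
STEPS = [
    "motion correction",
    "CTF estimation",
    "particle picking",
    "particle extraction",
]


def sizes_after_each_step(raw_tb, steps=STEPS, keep_fraction=0.5):
    """Return (step, size in TB) pairs, keeping `keep_fraction` per step."""
    sizes = []
    current = raw_tb
    for step in steps:
        current *= keep_fraction  # rule of thumb: each step halves the data
        sizes.append((step, current))
    return sizes


# Low and high ends of the raw dataset range quoted above.
for raw_tb in (1.0, 10.0):
    print(f"raw dataset: {raw_tb} TB")
    for step, size_tb in sizes_after_each_step(raw_tb):
        print(f"  after {step}: {size_tb} TB")
```

Under these assumptions a 10 TB run shrinks to 0.625 TB after the fourth step, which illustrates the spread in data sizes that the storage has to handle within a single pipeline.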

Because this pipeline is composed of multiple steps with different access patterns, it takes a modern parallel file system such as WekaFS to accommodate the different access patterns and sizes, as well as the number of files that need to be analyzed to reach the end result. WekaFS’s ability to accelerate GPU workloads can further decrease the time required to complete the process, allowing researchers to run more pipelines and get more accurate results.

