
Tetraaqua(ethane-1,2-diamine-κ²N,N′)nickel(II) naphthalene-1,5-disulfonate dihydrate

The reaction of ethane-1,2-diamine (en, C2H8N2), the sodium salt of naphthalene-1,5-disulfonic acid (H2NDS, C10H8O6S2), and nickel sulfate in aqueous solution yielded the title salt, [Ni(C2H8N2)(H2O)4](C10H6O6S2)·2H2O or [Ni(en)(H2O)4](NDS)·2H2O. The asymmetric unit contains one half of an [Ni(en)(H2O)4]²⁺ cation, one half of an NDS²⁻ anion, and one water molecule of crystallization. The Ni²⁺ cation is positioned on a twofold rotation axis and exhibits a slight tetragonal distortion of the cis-NiO4N2 octahedron, with an Ni—N bond length of 2.0782 (16) Å and Ni—O bond lengths of 2.1170 (13) Å and 2.0648 (14) Å. The anion is completed by inversion symmetry. In the extended structure, the cations, anions, and non-coordinating water molecules are connected by intermolecular N—H⋯O and O—H⋯O hydrogen bonds, as well as C—H⋯π interactions, forming a three-dimensional network.





4-(1H-2,3-Dihydronaphtho[1,8-de][1,3,2]diazaborinin-2-yl)-1-ethylpyridin-1-ium iodide

The title compound, C17H17BN3I, is a type of diazaborinane featuring substitution at the 1, 2, and 3 positions of the nitrogen–boron six-membered heterocycle. The organic molecule has a planar structure, the dihedral angle between the pyridyl ring and the fused ring system being 3.46 (4)°. In the crystal, molecules are stacked in a head-to-tail manner. The iodide ion makes close contacts with three organic molecules and supports the alternating stack.





4-(1H-2,3-Dihydronaphtho[1,8-de][1,3,2]diazaborinin-2-yl)-1-ethylpyridin-1-ium iodide monohydrate

The cation of the title hydrated salt, C17H17BN3⁺·I⁻·H2O, is a diazaborinane featuring substitution at the 1, 2, and 3 positions in the nitrogen–boron six-membered heterocycle. The cation is approximately planar with a dihedral angle between the pyridyl ring and the diazaborinane ring system of 5.40 (5)°. In the crystal, the cations stack along [100] in an alternating head-to-tail manner, while the iodide ion and water molecule form one-dimensional hydrogen-bonded chains beside the cation stack. The cation stacks and I⁻–water chains are crosslinked by N—H⋯I and N—H⋯O hydrogen bonds.





Bis(8-hydroxyquinolinium) naphthalene-1,5-disulfonate tetrahydrate

The interaction between 8-hydroxyquinoline (8HQ, C9H7NO) and naphthalene-1,5-disulfonic acid (H2NDS, C10H8O6S2) in aqueous media results in the formation of the salt hydrate bis(8-hydroxyquinolinium) naphthalene-1,5-disulfonate tetrahydrate, 2C9H8NO⁺·C10H6O6S2²⁻·4H2O. The asymmetric unit comprises one protonated 8HQ⁺ cation, half of an NDS²⁻ dianion symmetrically disposed around a center of inversion, and two water molecules. Within the crystal structure, these components are organized into chains along the [010] and [101̄] directions through O—H⋯O and N—H⋯O hydrogen-bonding interactions, forming a di-periodic network parallel to (101). Additional stabilizing interactions such as C—H⋯O, C—H⋯π, and π–π interactions extend this arrangement into a tri-periodic network structure.





2-(Pyridin-4-yl)-2,3-dihydro-1H-naphtho[1,8-de][1,3,2]diazaborinine

The title compound, C15H12BN3, is a type of diazaborinane featuring substitution at the 1, 2, and 3 positions in the nitrogen–boron six-membered heterocycle. It comprises two almost planar units, the pyridyl ring and the Bdan (dan = 1,8-diaminonaphtho) group, which subtend a dihedral angle of 24.57 (5)°. In the crystal, the molecules are linked into R₄⁴(28) hydrogen-bonding networks around the fourfold inversion axis, giving cyclic tetramers. The molecules form columnar stacks along the c axis.





(1R,2S,4aR,6S,8R,8aS)-1-(3-Hydroxypropanoyl)-1,3,6,8-tetramethyl-1,2,4a,5,6,7,8,8a-octahydronaphthalene-2-carboxylic acid

The molecular structure of (+)-diplodiatoxin, C18H28O4, is described, and its absolute configuration has been confirmed by single-crystal X-ray diffraction. Diplodiatoxin crystallizes in the chiral space group P4₃2₁2 with one molecule in the asymmetric unit.





Applying 3D ED/MicroED workflows toward the next frontiers

We report on the latest advancements in Microcrystal Electron Diffraction (3D ED/MicroED), as discussed during a symposium at the National Center for CryoEM Access and Training housed at the New York Structural Biology Center. This snapshot describes cutting-edge developments in various facets of the field and identifies potential avenues for continued progress. Key sections discuss instrumentation access, research applications for small molecules and biomacromolecules, data collection hardware and software, data reduction software, and finally reporting and validation. 3D ED/MicroED is still early in its wide adoption by the structural science community with ample opportunities for expansion, growth, and innovation.





Synthesis, spectroscopic and crystallographic characterization of various cymantrenyl thioethers [Mn{C5HxBry(SMe)z}(PPh3)(CO)2]

Starting from [Mn(C5H4Br)(PPh3)(CO)2] (1a), the cymantrenyl thioethers [Mn(C5H4SMe)(PPh3)(CO)2] (1b) and [Mn{C5H4–nBr(SMe)n}(PPh3)(CO)2] (n = 1 for compound 2, n = 2 for 3 and n = 3 for 4) were obtained, using either n-butyllithium (n-BuLi), lithium diisopropylamide (LDA) or lithium tetramethylpiperidide (LiTMP) as base, followed by electrophilic quenching with MeSSMe. Stepwise consecutive reaction of [Mn(C5Br5)(PPh3)(CO)2] with n-BuLi and MeSSMe led finally to [Mn{C5(SMe)5}(PPh3)(CO)2] (11), only the fifth complex to be reported containing a perthiolated cyclopentadienyl ring. The molecular and crystal structures of 1b, 3, 4 and 11 were determined and were studied for the occurrence of S⋯S and S⋯Br interactions. It turned out that although some interactions of this type occurred, they were of minor importance for the arrangement of the molecules in the crystal.





How to grow crystals for X-ray crystallography

Growing high-quality crystals remains a necessary part of crystallography and many other techniques. This article tabulates and describes several techniques and variations that will help individuals grow high-quality crystals in preparation for crystallographic techniques and other endeavors, such as form screening. The discussion is organized to focus on low-tech approaches available in any laboratory.





Synthesis of organotin(IV) heterocycles containing a xanthenyl group by a Barbier approach via ultrasound activation: synthesis, crystal structure and Hirshfeld surface analysis

A series of organotin heterocycles of general formula [{Me2C(C6H3CH2)2O}SnR2] [R = methyl (Me, 4), n-butyl (n-Bu, 5), benzyl (Bn, 6) and phenyl (Ph, 7)] was easily synthesized by a Barbier-type reaction assisted by the sonochemical activation of metallic magnesium. The ¹¹⁹Sn{¹H} NMR data for all four compounds confirm the presence of a central Sn atom in a four-coordinated environment in solution. Single-crystal X-ray diffraction studies for 17,17-dimethyl-7,7-diphenyl-15-oxa-7-stannatetracyclo[11.3.1.0^{5,16}.0^{9,14}]heptadeca-1,3,5(16),9(14),10,12-hexaene, [Sn(C6H5)2(C17H16O)], 7, at 100 and 295 K confirmed the formation of a mononuclear eight-membered heterocycle, with a conformation depicted as boat–chair, resulting in a weak Sn⋯O interaction. The Sn and O atoms are surrounded by hydrophobic C—H bonds. A Hirshfeld surface analysis of 7 showed that the eight-membered heterocycles are linked by weak C—H⋯π, π–π and H⋯H noncovalent interactions. The pairwise interaction energies showed that the cohesion between the heterocycles is mainly due to dispersion forces.





Further evaluation of the shape of atomic Hirshfeld surfaces: M⋯H contacts and homoatomic bonds

It is well known that Hirshfeld surfaces provide an easy and straightforward way of analysing intermolecular interactions in the crystal environment. The use of atomic Hirshfeld surfaces has also demonstrated that such surfaces carry information related to chemical bonds, which allows a deeper evaluation of the structures. Here we briefly summarize the approach of atomic Hirshfeld surfaces while further evaluating the kind of information that can be retrieved from them. We show that the analysis of the metal-centre Hirshfeld surfaces from structures refined via Hirshfeld Atom Refinement (HAR) allows accurate evaluation of contacts of type M⋯H, and that such contacts can be related to the overall shape of the surfaces. The compounds analysed were tetraaquabis(3-carboxypropionato)metal(II), [M(C4H3O4)2(H2O)4], for metal(II)/M = manganese/Mn, cobalt/Co, nickel/Ni and zinc/Zn. We also evaluate the sensitivity of the surfaces by an investigation of seemingly flat surfaces through analysis of the curvature functions in the direction of C—C bonds. The obtained values not only demonstrate variations in curvature but also show a correlation with the hybridization of the C atoms involved in the bond.





2,4-Diarylpyrroles: synthesis, characterization and crystallographic insights

Three 2,4-diarylpyrroles were synthesized starting from 4-nitrobutanones and the crystal structures of two derivatives were analysed. These are 4-(4-methoxyphenyl)-2-(thiophen-2-yl)-1H-pyrrole, C15H13NOS, and 3-(4-bromophenyl)-2-nitroso-5-phenyl-1H-pyrrole, C16H11BrN2O. Although pyrroles without substituents at the α-position with respect to the N atom are very air sensitive and tend to polymerize, we succeeded in growing an adequate crystal for X-ray diffraction analysis. Further derivatization using sodium nitrite afforded a nitrosyl pyrrole derivative, which crystallized in the triclinic space group P1̄ with Z = 6. Thus, herein we report the first crystal structure of a nitrosyl pyrrole. Interestingly, the co-operative hydrogen bonds in this NO-substituted pyrrole lead to a trimeric structure with bifurcated halogen bonds at the ends, forming a two-dimensional (2D) layer with interstitial voids having a radius of 5 Å, similar to some reported macrocyclic porphyrins.





Methods in molecular photocrystallography

Over the last three decades, the technology that makes it possible to follow chemical processes in the solid state in real time has grown enormously. These studies have important implications for the design of new functional materials for applications in optoelectronics and sensors. Light–matter interactions are of particular importance, and photocrystallography has proved to be an important tool for studying these interactions. In this technique, the three-dimensional structures of light-activated molecules, in their excited states, are determined using single-crystal X-ray crystallography. With advances in the design of high-power lasers, pulsed LEDs and time-gated X-ray detectors, the increased availability of synchrotron facilities, and most recently, the development of XFELs, it is now possible to determine the structures of molecules with lifetimes ranging from minutes down to picoseconds, within a single crystal, using the photocrystallographic technique. This review discusses the procedures for conducting successful photocrystallographic studies and outlines the different methodologies that have been developed to study structures with specific lifetime ranges. The complexity of the methods required increases considerably as the lifetime of the excited state shortens. The discussion is supported by examples of successful photocrystallographic studies across a range of timescales and emphasises the importance of the use of complementary analytical techniques in order to understand the solid-state processes fully.





Introducing the Best practice in crystallography series

 





Photocrystallography – common or exclusive?

 





Deep residual networks for crystallography trained on synthetic data

The use of artificial intelligence to process diffraction images is challenged by the need to assemble large and precisely designed training data sets. To address this, a codebase called Resonet was developed for synthesizing diffraction data and training residual neural networks on these data. Here, two per-pattern capabilities of Resonet are demonstrated: (i) interpretation of crystal resolution and (ii) identification of overlapping lattices. Resonet was tested across a compilation of diffraction images from synchrotron experiments and X-ray free-electron laser experiments. Crucially, these models readily execute on graphics processing units and can thus significantly outperform conventional algorithms. While Resonet is currently utilized to provide real-time feedback for macromolecular crystallography users at the Stanford Synchrotron Radiation Lightsource, its simple Python-based interface makes it easy to embed in other processing frameworks. This work highlights the utility of physics-based simulation for training deep neural networks and lays the groundwork for the development of additional models to enhance diffraction collection and analysis.





The High-Pressure Freezing Laboratory for Macromolecular Crystallography (HPMX), an ancillary tool for the macromolecular crystallography beamlines at the ESRF

This article describes the High-Pressure Freezing Laboratory for Macromolecular Crystallography (HPMX) at the ESRF, and highlights new and complementary research opportunities that can be explored using this facility. The laboratory is dedicated to investigating interactions between macromolecules and gases in crystallo, and finds applications in many fields of research, including fundamental biology, biochemistry, and environmental and medical science. At present, the HPMX laboratory offers the use of different high-pressure cells adapted for helium, argon, krypton, xenon, nitrogen, oxygen, carbon dioxide and methane. Important scientific applications of high pressure to macromolecules at the HPMX include noble-gas derivatization of crystals to detect and map the internal architecture of proteins (pockets, tunnels and channels) that allows the storage and diffusion of ligands or substrates/products, the investigation of the catalytic mechanisms of gas-employing enzymes (using oxygen, carbon dioxide or methane as substrates) to possibly decipher intermediates, and studies of the conformational fluctuations or structure modifications that are necessary for proteins to function. Additionally, cryo-cooling protein crystals under high pressure (helium or argon at 2000 bar) enables the addition of cryo-protectant to be avoided and noble gases can be employed to produce derivatives for structure resolution. The high-pressure systems are designed to process crystals along a well defined pathway in the phase diagram (pressure–temperature) of the gas to cryo-cool the samples according to the three-step 'soak-and-freeze method'. Firstly, crystals are soaked in a pressurized pure gas atmosphere (at 294 K) to introduce the gas and facilitate its interactions within the macromolecules. Samples are then flash-cooled (at 100 K) while still under pressure to cryo-trap macromolecule–gas complexation states or pressure-induced protein modifications. Finally, the samples are recovered after depressurization at cryo-temperatures. The final section of this publication presents a selection of different typical high-pressure experiments carried out at the HPMX, showing that this technique has already answered a wide range of scientific questions. It is shown that different gases and pressure conditions can be used to probe various effects, such as mapping the functional internal architectures of enzymes (tunnels in the haloalkane dehalogenase DhaA) and allosteric sites on membrane-protein surfaces, the interaction of non-inert gases with proteins (oxygen in the hydrogenase ReMBH) and pressure-induced structural changes of proteins (tetramer dissociation in urate oxidase). The technique is versatile and the provision of pressure cells and their application at the HPMX is gradually being extended to address new scientific questions.





From femtoseconds to minutes: time-resolved macromolecular crystallography at XFELs and synchrotrons

Over the last decade, the development of time-resolved serial crystallography (TR-SX) at X-ray free-electron lasers (XFELs) and synchrotrons has allowed researchers to study phenomena occurring in proteins on the femtosecond-to-minute timescale, taking advantage of many technical and methodological breakthroughs. Protein crystals of various sizes are presented to the X-ray beam in either a static or a moving medium. Photoactive proteins were naturally the initial systems to be studied in TR-SX experiments using pump–probe schemes, where the pump is a pulse of visible light. Other reaction initiations through small-molecule diffusion are gaining momentum. Here, selected examples of XFEL and synchrotron time-resolved crystallography studies will be used to highlight the specificities of the various instruments and methods with respect to time resolution, and are compared with cryo-trapping studies.





AlphaFold-assisted structure determination of a bacterial protein of unknown function using X-ray and electron crystallography

Macromolecular crystallography generally requires the recovery of missing phase information from diffraction data to reconstruct an electron-density map of the crystallized molecule. Most recent structures have been solved using molecular replacement as a phasing method, which requires an a priori structure that is closely related to the target protein to serve as a search model; when no such search model exists, molecular replacement is not possible. Recent developments in computational machine-learning methods, however, have resulted in major advances in protein structure prediction from sequence information. Methods that generate predicted structural models of sufficient accuracy provide a powerful approach to molecular replacement. Taking advantage of these advances, AlphaFold predictions were applied to enable structure determination of a bacterial protein of unknown function (UniProtKB Q63NT7, NCBI locus BPSS0212) based on diffraction data that had evaded phasing attempts using MIR and anomalous scattering methods. Using both X-ray and micro-electron diffraction (microED) data, it was possible to solve the structure of the main fragment of the protein using a predicted model of that domain as a starting point. The use of predicted structural models importantly expands the promise of electron diffraction, where structure determination relies critically on molecular replacement.





A service-based approach to cryoEM facility processing pipelines at eBIC

Electron cryo-microscopy image-processing workflows are typically composed of elements that may, broadly speaking, be categorized as high-throughput workloads which transition to high-performance workloads as preprocessed data are aggregated. The high-throughput elements are of particular importance in the context of live processing, where an optimal response is highly coupled to the temporal profile of the data collection. In other words, each movie should be processed as quickly as possible at the earliest opportunity. The high level of disconnected parallelization in the high-throughput problem directly allows a completely scalable solution across a distributed computer system, with the only technical obstacle being an efficient and reliable implementation. The cloud computing frameworks primarily developed for the deployment of high-availability web applications provide an environment with a number of appealing features for such high-throughput processing tasks. Here, an implementation of an early-stage processing pipeline for electron cryotomography experiments using a service-based architecture deployed on a Kubernetes cluster is discussed in order to demonstrate the benefits of this approach and how it may be extended to scenarios of considerably increased complexity.





EMinsight: a tool to capture cryoEM microscope configuration and experimental outcomes for analysis and deposition

The widespread adoption of cryoEM technologies for structural biology has pushed the discipline to new frontiers. A significant worldwide effort has refined the single-particle analysis (SPA) workflow into a reasonably standardized procedure. Significant investments of development time have been made, particularly in sample preparation, microscope data-collection efficiency, pipeline analyses and data archiving. The widespread adoption of specific commercial microscopes, software for controlling them and best practices developed at facilities worldwide has also begun to establish a degree of standardization to data structures coming from the SPA workflow. There is opportunity to capitalize on this moment in the maturation of the field, to capture metadata from SPA experiments and correlate the metadata with experimental outcomes, which is presented here in a set of programs called EMinsight. This tool aims to prototype the framework and types of analyses that could lead to new insights into optimal microscope configurations as well as to define methods for metadata capture to assist with the archiving of cryoEM SPA data. It is also envisaged that this tool will be useful to microscope operators and facilities looking to rapidly generate reports on SPA data-collection and screening sessions.





Tomo Live: an on-the-fly reconstruction pipeline to judge data quality for cryo-electron tomography workflows

Data acquisition and processing for cryo-electron tomography can be a significant bottleneck for users. To simplify and streamline the cryo-ET workflow, Tomo Live, an on-the-fly solution that automates the alignment and reconstruction of tilt-series data, enabling real-time data-quality assessment, has been developed. Through the integration of Tomo Live into the data-acquisition workflow for cryo-ET, motion correction is performed directly after each of the acquired tilt angles. Immediately after the tilt-series acquisition has completed, an unattended tilt-series alignment and reconstruction into a 3D volume is performed. The results are displayed in real time in a dedicated remote web platform that runs on the microscope hardware. Through this web platform, users can review the acquired data (aligned stack and 3D volume) and several quality metrics that are obtained during the alignment and reconstruction process. These quality metrics can be used for fast feedback for subsequent acquisitions to save time. Parameters such as Alignment Accuracy, Deleted Tilts and Tilt Axis Correction Angle are visualized as graphs and can be used as filters to export only the best tomograms (raw data, reconstruction and intermediate data) for further processing. Here, the Tomo Live algorithms and workflow are described and representative results on several biological samples are presented. The Tomo Live workflow is accessible to both expert and non-expert users, making it a valuable tool for the continued advancement of structural biology, cell biology and histology.





STOPGAP: an open-source package for template matching, subtomogram alignment and classification

Cryo-electron tomography (cryo-ET) enables molecular-resolution 3D imaging of complex biological specimens such as viral particles, cellular sections and, in some cases, whole cells. This enables the structural characterization of molecules in their near-native environments, without the need for purification or separation, thereby preserving biological information such as conformational states and spatial relationships between different molecular species. Subtomogram averaging is an image-processing workflow that allows users to leverage cryo-ET data to identify and localize target molecules, determine high-resolution structures of repeating molecular species and classify different conformational states. Here, STOPGAP, an open-source package for subtomogram averaging that is designed to provide users with fine control over each of these steps, is described. In providing detailed descriptions of the image-processing algorithms that STOPGAP uses, this manuscript is also intended to serve as a technical resource to users as well as for further community-driven software development.





Identifying and avoiding radiation damage in macromolecular crystallography

Radiation damage remains one of the major impediments to accurate structure solution in macromolecular crystallography. The artefacts of radiation damage can manifest as structural changes that result in incorrect biological interpretations being drawn from a model, they can reduce the resolution to which data can be collected and they can even prevent structure solution entirely. In this article, we discuss how to identify and mitigate the effects of radiation damage at each stage in the macromolecular crystal structure-solution pipeline.





What shapes template-matching performance in cryogenic electron tomography in situ?

The detection of specific biological macromolecules in cryogenic electron tomography data is frequently approached by applying cross-correlation-based 3D template matching. To reduce computational cost and noise, high binning is used to aggregate voxels before template matching. This remains a prevalent practice in both practical applications and methods development. Here, the relation between template size, shape and angular sampling is systematically evaluated to identify ribosomes in a ground-truth annotated data set. It is shown that at the commonly used binning, a detailed subtomogram average, a sphere and a heart emoji result in near-identical performance. These findings indicate that with current template-matching practices macromolecules can only be detected with high precision if their shape and size are sufficiently different from the background. Using theoretical considerations, the experimental results are rationalized and it is discussed why primarily low-frequency information remains at high binning and that template matching fails to be accurate because similarly shaped and sized macromolecules have similar low-frequency spectra. These challenges are discussed and potential enhancements for future template-matching methodologies are proposed.
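The effect of binning on shape discrimination can be illustrated with a toy calculation. The sketch below is not the authors' pipeline: it assumes that binning is simple real-space block averaging and that the score is a normalized cross-correlation at a single, centred offset. The helper names (`bin_volume`, `ncc`, `make_sphere`, `make_cube`) and the synthetic sphere and cube are hypothetical stand-ins for real templates.

```python
import numpy as np

def bin_volume(vol, factor):
    """Average non-overlapping factor^3 voxel blocks (assumed binning scheme)."""
    nz, ny, nx = (s // factor for s in vol.shape)
    cropped = vol[:nz * factor, :ny * factor, :nx * factor]
    return cropped.reshape(nz, factor, ny, factor, nx, factor).mean(axis=(1, 3, 5))

def ncc(a, b):
    """Normalized cross-correlation of two equally sized volumes at zero shift."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    return float((a0 * b0).sum() / np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum()))

def make_sphere(size, radius):
    """Binary sphere centred in a size^3 grid (hypothetical test shape)."""
    grid = np.indices((size, size, size)) - (size - 1) / 2.0
    return ((grid ** 2).sum(axis=0) <= radius ** 2).astype(float)

def make_cube(size, side):
    """Binary cube centred in a size^3 grid (hypothetical test shape)."""
    vol = np.zeros((size, size, size))
    lo = (size - side) // 2
    vol[lo:lo + side, lo:lo + side, lo:lo + side] = 1.0
    return vol

# Two differently shaped objects of similar volume.
sphere = make_sphere(32, 10)
cube = make_cube(32, 16)

# Compare how similar the two shapes look at increasing binning factors.
for factor in (1, 2, 4):
    score = ncc(bin_volume(sphere, factor), bin_volume(cube, factor))
    print(f"binning x{factor}: NCC(sphere, cube) = {score:.3f}")
```

Printing the score at several binning factors gives a quick feel for how much shape detail survives the averaging; a real evaluation would additionally need noise, the missing wedge, CTF effects and a full translational and rotational search.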





Pillar data-acquisition strategies for cryo-electron tomography of beam-sensitive biological samples

For cryo-electron tomography (cryo-ET) of beam-sensitive biological specimens, a planar sample geometry is typically used. As the sample is tilted, the effective thickness of the sample along the direction of the electron beam increases and the signal-to-noise ratio concomitantly decreases, limiting the transfer of information at high tilt angles. In addition, the tilt range where data can be collected is limited by a combination of various sample-environment constraints, including the limited space in the objective lens pole piece and the possible use of fixed conductive braids to cool the specimen. Consequently, most tilt series are limited to a maximum of ±70°, leading to the presence of a missing wedge in Fourier space. The acquisition of cryo-ET data without a missing wedge, for example using a cylindrical sample geometry, is hence attractive for volumetric analysis of low-symmetry structures such as organelles or vesicles, lysis events, pore formation or filaments for which the missing information cannot be compensated by averaging techniques. Irrespective of the geometry, electron-beam damage to the specimen is an issue and the first images acquired will transfer more high-resolution information than those acquired last. There is also an inherent trade-off between higher sampling in Fourier space and avoiding beam damage to the sample. Finally, the necessity of using a sufficient electron fluence to align the tilt images means that this fluence needs to be fractionated across a small number of images; therefore, the order of data acquisition is also a factor to consider. Here, an n-helix tilt scheme is described and simulated which uses overlapping and interleaved tilt series to maximize the use of a pillar geometry, allowing the entire pillar volume to be reconstructed as a single unit. Three related tilt schemes are also evaluated that extend the continuous and classic dose-symmetric tilt schemes for cryo-ET to pillar samples to enable the collection of isotropic information across all spatial frequencies. A fourfold dose-symmetric scheme is proposed which provides a practical compromise between uniform information transfer and complexity of data acquisition.
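The two geometric limits of a planar sample mentioned in this abstract can be put into numbers with a short calculation. This is a minimal sketch under idealized assumptions that are not taken from the article: the sample is a flat slab of uniform thickness, and the tilt series is single-axis and symmetric. The function names and the 100 nm example thickness are hypothetical.

```python
import math

def effective_thickness(t0_nm, tilt_deg):
    """Path length through a flat slab of thickness t0_nm tilted by tilt_deg."""
    return t0_nm / math.cos(math.radians(tilt_deg))

def missing_wedge_fraction(max_tilt_deg):
    """Fraction of Fourier space left unsampled by a single-axis +/-max_tilt series."""
    return (90.0 - max_tilt_deg) / 90.0

# Example values for an assumed 100 nm thick lamella.
for tilt in (0, 30, 60, 70):
    print(f"tilt {tilt:2d} deg: path length {effective_thickness(100.0, tilt):6.1f} nm")

print(f"missing wedge at +/-70 deg: {missing_wedge_fraction(70.0):.1%}")
```

Under these assumptions the beam path doubles at 60° tilt, and a ±70° series leaves 40° of the 180° angular range unsampled. A cylindrical pillar avoids both effects in the ideal case, because its projected thickness does not depend on the rotation angle.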





Introduction of the Capsules environment to support further growth of the SBGrid structural biology software collection

The expansive scientific software ecosystem, characterized by millions of titles across various platforms and formats, poses significant challenges in maintaining reproducibility and provenance in scientific research. The diversity of independently developed applications, evolving versions and heterogeneous components highlights the need for rigorous methodologies to navigate these complexities. In response to these challenges, the SBGrid team builds, installs and configures over 530 specialized software applications for use in the on-premises and cloud-based computing environments of SBGrid Consortium members. To address the intricacies of supporting this diverse application collection, the team has developed the Capsule Software Execution Environment, generally referred to as Capsules. Capsules rely on a collection of programmatically generated bash scripts that work together to isolate the runtime environment of one application from all other applications, thereby providing a transparent cross-platform solution without requiring specialized tools or elevated account privileges for researchers. Capsules facilitate modular, secure software distribution while maintaining a centralized, conflict-free environment. The SBGrid platform, which combines Capsules with the SBGrid collection of structural biology applications, aligns with FAIR goals by enhancing the findability, accessibility, interoperability and reusability of scientific software, ensuring seamless functionality across diverse computing environments. Its adaptability enables application beyond structural biology into other scientific fields.





Deep-learning map segmentation for protein X-ray crystallographic structure determination

When solving the structure of a protein from single-wavelength anomalous diffraction X-ray data, the initial phases obtained by phasing from an anomalously scattering substructure usually need to be improved by iterative electron-density modification. In this manuscript, the use of convolutional neural networks (CNNs) for segmentation of the initial experimental phasing electron-density maps is proposed. The results reported demonstrate that a CNN with U-net architecture, trained in a supervised manner on several thousand electron-density maps generated mainly using X-ray data from the Protein Data Bank, can improve current density-modification methods.





Validation of electron-microscopy maps using solution small-angle X-ray scattering

The determination of the atomic resolution structure of biomacromolecules is essential for understanding details of their function. Traditionally, such a structure determination has been performed with crystallographic or nuclear resonance methods, but during the last decade, cryogenic transmission electron microscopy (cryo-TEM) has become an equally important tool. As the blotting and flash-freezing of the samples can induce conformational changes, external validation tools are required to ensure that the vitrified samples are representative of the solution. Although many validation tools have already been developed, most of them rely on fully resolved atomic models, which prevents early screening of the cryo-TEM maps. Here, a novel and automated method for performing such a validation utilizing small-angle X-ray scattering measurements, publicly available through the new software package AUSAXS, is introduced and implemented. The method has been tested on both simulated and experimental data, where it was shown to work remarkably well as a validation tool. The method provides a dummy atomic model derived from the EM map which best represents the solution structure.
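To make the idea of comparing a dummy-atom model with a solution scattering curve concrete, the sketch below evaluates the textbook Debye equation, I(q) = Σᵢ Σⱼ fᵢ fⱼ sin(q rᵢⱼ)/(q rᵢⱼ), for a set of identical point scatterers. It is a generic illustration and makes no claim about how AUSAXS computes its curves; the function name `debye_intensity`, the constant form factor and the two-atom example are assumptions made for this example.

```python
import numpy as np

def debye_intensity(coords, q_values, form_factor=1.0):
    """Debye-equation intensity for identical point scatterers.

    coords:      (N, 3) array of dummy-atom positions (units of length, e.g. Angstrom)
    q_values:    (M,) array of momentum transfers (inverse of the same length unit)
    form_factor: constant scattering factor assumed for every dummy atom
    """
    coords = np.asarray(coords, dtype=float)
    q = np.asarray(q_values, dtype=float)
    # Pairwise distance matrix r_ij, shape (N, N).
    r = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    qr = q[:, None, None] * r[None, :, :]
    # np.sinc(x) = sin(pi x)/(pi x), so sinc(qr/pi) = sin(qr)/(qr), with value 1 at qr = 0.
    return form_factor ** 2 * np.sinc(qr / np.pi).sum(axis=(1, 2))

# Hypothetical example: two dummy atoms 10 length units apart.
atoms = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]
q = np.linspace(0.0, 0.5, 6)
print(debye_intensity(atoms, q))
```

The double sum scales as N² in the number of dummy atoms, so practical implementations typically bin the distances into a histogram first; a realistic comparison with experimental data would also need atomic form factors, a hydration-layer model and a goodness-of-fit measure.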




ap

A snapshot love story: what serial crystallography has done and will do for us

Serial crystallography, born from groundbreaking experiments at the Linac Coherent Light Source in 2009, has evolved into a pivotal technique in structural biology. Initially pioneered at X-ray free-electron laser facilities, it has now expanded to synchrotron-radiation facilities globally, with dedicated experimental stations enhancing its accessibility. This review gives an overview of current developments in serial crystallography, emphasizing recent results in time-resolved crystallography, and discussing challenges and shortcomings.




ap

Managing macromolecular crystallographic data with a laboratory information management system

Protein crystallography is an established method to study the atomic structures of macromolecules and their complexes. A prerequisite for successful structure determination is diffraction-quality crystals, which may require extensive optimization of both the protein and the conditions, and hence projects can stretch over an extended period, with multiple users being involved. The workflow from crystallization and crystal treatment to deposition and publication is well defined, and therefore an electronic laboratory information management system (LIMS) is well suited to management of the data. Completion of the project requires key information on all the steps being available and this information should also be made available according to the FAIR principles. As crystallized samples are typically shipped between facilities, a key feature to be captured in the LIMS is the exchange of metadata between the crystallization facility of the home laboratory and, for example, synchrotron facilities. On completion, structures are deposited in the Protein Data Bank (PDB) and the LIMS can include the PDB code in its database, completing the chain of custody from crystallization to structure deposition and publication. A LIMS designed for macromolecular crystallography, IceBear, is available as a standalone installation and as a hosted service, and the implementation of key features for the capture of metadata in IceBear is discussed as an example.




ap

The crystal structure of Shethna protein II (FeSII) from Azotobacter vinelandii suggests a domain swap

The Azotobacter vinelandii FeSII protein forms an oxygen-resistant complex with the nitrogenase MoFe and Fe proteins. FeSII is an adrenodoxin-type ferredoxin that forms a dimer in solution. Previously, the crystal structure was solved [Schlesier et al. (2016), J. Am. Chem. Soc. 138, 239–247] with five copies in the asymmetric unit. One copy is a normal adrenodoxin domain that forms a dimer with its crystallographic symmetry mate. The other four copies are in an 'open' conformation with a loop flipped out exposing the 2Fe–2S cluster. The open and closed conformations were interpreted as oxidized and reduced, respectively, and the large conformational change in the open configuration allowed binding to nitrogenase. Here, the structure of FeSII was independently solved in the same crystal form. The positioning of the atoms in the unit cell is similar to the earlier report. However, the interpretation of the structure is different. The 'open' conformation is interpreted as the product of a crystallization-induced domain swap. The 2Fe–2S cluster is not exposed to solvent, but in the crystal its interacting helix is replaced by the same helix residues from a crystal symmetry mate. The domain swap is complicated, as it is unusual in being in the middle of the protein rather than at a terminus, and it creates arrangements of molecules that can be interpreted in multiple ways. It is also cautioned that crystal structures should be interpreted in terms of the contents of the entire crystal rather than of one asymmetric unit.




ap

Crystallographic fragment-binding studies of the Mycobacterium tuberculosis trifunctional enzyme suggest binding pockets for the tails of the acyl-CoA substrates at its active sites and a potential substrate-channeling path between them

The Mycobacterium tuberculosis trifunctional enzyme (MtTFE) is an α2β2 tetrameric enzyme in which the α-chain harbors the 2E-enoyl-CoA hydratase (ECH) and 3S-hydroxyacyl-CoA dehydrogenase (HAD) active sites, and the β-chain provides the 3-ketoacyl-CoA thiolase (KAT) active site. Linear, medium-chain and long-chain 2E-enoyl-CoA molecules are the preferred substrates of MtTFE. Previous crystallographic binding and modeling studies identified binding sites for the acyl-CoA substrates at the three active sites, as well as the NAD binding pocket at the HAD active site. These studies also identified three additional CoA binding sites on the surface of MtTFE that are different from the active sites. It has been proposed that one of these additional sites could be of functional relevance for the substrate channeling (by surface crawling) of reaction intermediates between the three active sites. Here, 226 fragments were screened in a crystallographic fragment-binding study of MtTFE crystals, resulting in the structures of 16 MtTFE–fragment complexes. Analysis of the 121 fragment-binding events shows that the ECH active site is the 'binding hotspot' for the tested fragments, with 41 binding events. The mode of binding of the fragments bound at the active sites provides additional insight into how the long-chain acyl moiety of the substrates can be accommodated at their proposed binding pockets. In addition, the 20 fragment-binding events between the active sites identify potential transient binding sites of reaction intermediates relevant to the possible channeling of substrates between these active sites. These results provide a basis for further studies to understand the functional relevance of the latter binding sites and to identify substrates for which channeling is crucial.




ap

Cryo2RT: a high-throughput method for room-temperature macromolecular crystallography from cryo-cooled crystals

Advances in structural biology have relied heavily on synchrotron cryo-crystallography and cryogenic electron microscopy to elucidate biological processes and for drug discovery. However, disparities between cryogenic and room-temperature (RT) crystal structures pose challenges. Here, Cryo2RT, a high-throughput RT data-collection method from cryo-cooled crystals that leverages the cryo-crystallography workflow, is introduced. Tested on endothiapepsin crystals with four soaked fragments, thaumatin and SARS-CoV-2 3CLpro, Cryo2RT reveals unique ligand-binding poses, offers a comparable throughput to cryo-crystallography and eases the exploration of structural dynamics at various temperatures.




ap

Likelihood-based interactive local docking into cryo-EM maps in ChimeraX

The interpretation of cryo-EM maps often includes the docking of known or predicted structures of the components, which is particularly useful when the map resolution is worse than 4 Å. Although it can be effective to search the entire map to find the best placement of a component, the process can be slow when the maps are large. However, frequently there is a well-founded hypothesis about where particular components are located. In such cases, a local search using a map subvolume will be much faster because the search volume is smaller, and more sensitive because optimizing the search volume for the rotation-search step enhances the signal-to-noise ratio. A Fourier-space likelihood-based local search approach, based on the previously published em_placement software, has been implemented in the new emplace_local program. Tests confirm that the local search approach enhances the speed and sensitivity of the computations. An interactive graphical interface in the ChimeraX molecular-graphics program provides a convenient way to set up and evaluate docking calculations, particularly in defining the part of the map into which the components should be placed.




ap

Comparison of two crystal polymorphs of NowGFP reveals a new conformational state trapped by crystal packing

Crystal polymorphism serves as a strategy to study the conformational flexibility of proteins. However, the relationship between protein crystal packing and protein conformation often remains elusive. In this study, two distinct crystal forms of a green fluorescent protein variant, NowGFP, are compared: a previously identified monoclinic form (space group C2) and a newly discovered orthorhombic form (space group P2₁2₁2₁). Comparative analysis reveals that both crystal forms exhibit nearly identical linear assemblies of NowGFP molecules interconnected through similar crystal contacts. However, a notable difference lies in the stacking of these assemblies: parallel in the monoclinic form and perpendicular in the orthorhombic form. This distinct mode of stacking leads to different crystal contacts and induces structural alteration in one of the two molecules within the asymmetric unit of the orthorhombic crystal form. This new conformational state captured by orthorhombic crystal packing exhibits two unique features: a conformational shift of the β-barrel scaffold and a restriction of pH-dependent shifts of the key residue Lys61, which is crucial for the pH-dependent spectral shift of this protein. These findings demonstrate a clear connection between crystal packing and alternative conformational states of proteins, providing insights into how structural variations influence the function of fluorescent proteins.




ap

Robust and automatic beamstop shadow outlier rejection: combining crystallographic statistics with modern clustering under a semi-supervised learning strategy

During the automatic processing of crystallographic diffraction experiments, beamstop shadows are often unaccounted for or only partially masked. As a result of this, outlier reflection intensities are integrated, which is a known issue. Traditional statistical diagnostics have only limited effectiveness in identifying these outliers, here termed Not-Excluded-unMasked-Outliers (NEMOs). The diagnostic tool AUSPEX allows visual inspection of NEMOs, where they form a typical pattern: clusters at the low-resolution end of the AUSPEX plots of intensities or amplitudes versus resolution. To automate NEMO detection, a new algorithm was developed by combining data statistics with a density-based clustering method. This approach demonstrates a promising performance in detecting NEMOs in merged data sets without disrupting existing data-reduction pipelines. Re-refinement results indicate that excluding the identified NEMOs can effectively enhance the quality of subsequent structure-determination steps. This method offers a prospective automated means to assess the efficacy of a beamstop mask, as well as highlighting the potential of modern pattern-recognition techniques for automating outlier exclusion during data processing, facilitating future adaptation to evolving experimental strategies.
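
To make the idea of density-based clustering concrete, the sketch below gives a minimal DBSCAN-style routine in plain Python and applies it to a handful of made-up candidate reflections. It is not the published algorithm: the choice of DBSCAN, the coordinates (inverse squared resolution and a scaled intensity), the values of `eps` and `min_pts`, and the assumption that candidates were pre-selected by some statistical filter are all hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for unclustered noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1                # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep expanding
    return labels

# Hypothetical candidates: (1/d^2, scaled intensity) of weak low-resolution reflections.
candidates = [(0.001, 0.10), (0.002, 0.12), (0.001, 0.15), (0.002, 0.08), (0.050, 0.90)]
labels = dbscan(candidates, eps=0.1, min_pts=3)
```

Under these assumptions, reflections that receive a cluster label would be treated as a suspected beamstop-shadow group, while isolated weak reflections (label -1) would be left in the data set.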




ap

Utilizing anomalous signals for element identification in macromolecular crystallography

AlphaFold2 has revolutionized structural biology by offering unparalleled accuracy in predicting protein structures. Traditional methods for determining protein structures, such as X-ray crystallography and cryo-electron microscopy, are often time-consuming and resource-intensive. AlphaFold2 provides models that are valuable for molecular replacement, aiding in model building and docking into electron density or potential maps. However, despite its capabilities, models from AlphaFold2 do not consistently match the accuracy of experimentally determined structures, need to be validated experimentally and currently miss some crucial information, such as post-translational modifications, ligands and bound ions. In this paper, the advantages of collecting X-ray anomalous data to identify chemical elements, such as metal ions, which are key to understanding certain structures and functions of proteins, are explored. This is achieved through methods such as calculating anomalous difference Fourier maps or refining the imaginary component of the anomalous scattering factor f''. Anomalous data can serve as a valuable complement to the information provided by AlphaFold2 models and this is particularly significant in elucidating the roles of metal ions.




ap

CHiMP: deep-learning tools trained on protein crystallization micrographs to enable automation of experiments

A group of three deep-learning tools, referred to collectively as CHiMP (Crystal Hits in My Plate), were created for analysis of micrographs of protein crystallization experiments at the Diamond Light Source (DLS) synchrotron, UK. The first tool, a classification network, assigns images into categories relating to experimental outcomes. The other two tools are networks that perform both object detection and instance segmentation, resulting in masks of individual crystals in the first case and masks of crystallization droplets in addition to crystals in the second case, allowing the positions and sizes of these entities to be recorded. The creation of these tools used transfer learning, where weights from a pre-trained deep-learning network were used as a starting point and repurposed by further training on a relatively small set of data. Two of the tools are now integrated at the VMXi macromolecular crystallography beamline at DLS, where they have the potential to obviate the need for any user input, both for monitoring crystallization experiments and for triggering in situ data collections. The third is being integrated into the XChem fragment-based drug-discovery screening platform, also at DLS, to allow the automatic targeting of acoustic compound dispensing into crystallization droplets.




ap

Structure and stability of an apo thermophilic esterase that hydrolyzes polyhydroxybutyrate

Pollution from plastics is a global problem that threatens the biosphere for a host of reasons, including the time that it takes for most plastics to degrade. Biodegradation is an ideal solution for remediating bioplastic waste as it does not require the high temperatures necessary for thermal degradation and does not introduce additional pollutants into the environment. Numerous organisms can scavenge for bioplastics, such as polylactic acid (PLA) or poly-(R)-hydroxybutyrate (PHB), which they can use as an energy source. Recently, a promiscuous PHBase from the thermophilic soil bacterium Lihuaxuella thermophila (LtPHBase) was identified. LtPHBase can accommodate many substrates, including PHB granules and films and PHB block copolymers, as well as the unrelated polymers PLA and polycaprolactone (PCL). LtPHBase uses the expected Ser–His–Asp catalytic triad for hydrolysis, with optimal enzyme activity near 70°C. Here, the 1.75 Å resolution crystal structure of apo LtPHBase is presented and its chemical stability is profiled. Knowledge of its substrate preferences was extended to different-sized PHB granules. It is shown that LtPHBase is highly resistant to unfolding, with barriers typical for thermophilic enzymes, and shows a preference for low-molecular-mass PHB granules. These insights have implications for the long-term potential of LtPHBase as an industrial PHB hydrolase and shed light on the evolutionary role that this enzyme plays in bacterial metabolism.




ap

Analysis of crystallographic phase retrieval using iterative projection algorithms

For protein crystals in which more than two thirds of the volume is occupied by solvent, the featureless nature of the solvent region often generates a constraint that is powerful enough to allow direct phasing of X-ray diffraction data. Practical implementation relies on the use of iterative projection algorithms with good global convergence properties to solve the difficult nonconvex phase-retrieval problem. In this paper, some aspects of phase retrieval using iterative projection algorithms are systematically explored, where the diffraction data and density-value distributions in the protein and solvent regions provide the sole constraints. The analysis is based on the addition of random error to the phases of previously determined protein crystal structures, followed by evaluation of the ability to recover the correct phase set as the distance from the solution increases. The properties of the difference-map (DM), relaxed–reflect–reflect (RRR) and relaxed averaged alternating reflectors (RAAR) algorithms are compared. All of these algorithms prove to be effective for crystallographic phase retrieval, and the useful ranges of the adjustable parameter which controls their behavior are established. When these algorithms converge to the solution, the algorithm trajectory becomes stationary; however, the density function continues to fluctuate significantly around its mean position. It is shown that averaging over the algorithm trajectory in the stationary region, following convergence, improves the density estimate, with this procedure outperforming previous approaches for phase or density refinement.
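
The update rules of two of these algorithms can be sketched on a toy problem. The example below finds a point common to a circle (a nonconvex set) and a line in two dimensions, using relaxed-reflect-reflect (RRR) and RAAR steps built from the two projections. The constraint sets, starting point, parameter values and function names are assumptions chosen for illustration; in crystallographic phasing the projections would instead enforce the measured Fourier magnitudes and the density-value constraints.

```python
import math

RADIUS, HEIGHT = 1.0, 0.5  # toy constraint sets, not crystallographic ones

def project_circle(p):
    """Nearest point on the circle |p| = RADIUS (set A, nonconvex)."""
    norm = math.hypot(p[0], p[1])
    if norm == 0.0:
        return (RADIUS, 0.0)  # the centre is equidistant from all points; pick one
    return (RADIUS * p[0] / norm, RADIUS * p[1] / norm)

def project_line(p):
    """Nearest point on the line y = HEIGHT (set B)."""
    return (p[0], HEIGHT)

def reflect(project, p):
    """Reflection through a constraint set: R = 2P - I."""
    q = project(p)
    return (2.0 * q[0] - p[0], 2.0 * q[1] - p[1])

def rrr_step(p, beta):
    """RRR update: p + beta * (P_B(R_A p) - P_A p)."""
    pa = project_circle(p)
    pb = project_line(reflect(project_circle, p))
    return tuple(p[k] + beta * (pb[k] - pa[k]) for k in range(2))

def raar_step(p, beta):
    """RAAR update: beta * (p + P_B(R_A p) - P_A p) + (1 - beta) * P_A p."""
    pa = project_circle(p)
    pb = project_line(reflect(project_circle, p))
    return tuple(beta * (p[k] + pb[k] - pa[k]) + (1.0 - beta) * pa[k] for k in range(2))

def iterate(step, start, beta, n_iter=200):
    """Apply the update n_iter times, then read the estimate off set A."""
    p = start
    for _ in range(n_iter):
        p = step(p, beta)
    return project_circle(p)

solution_rrr = iterate(rrr_step, (0.9, 0.4), beta=0.7)
solution_raar = iterate(raar_step, (0.9, 0.4), beta=0.8)
```

Both runs should settle on the intersection near (0.866, 0.5). The parameter `beta` plays the role of the adjustable parameter mentioned above; its useful range for real diffraction data is established in the paper and cannot be inferred from this toy case.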




ap

The interoperability of crystallographic data and databases

Interoperability of crystallographic data with other disciplines is essential for the smooth and rapid progress of structure-based science in the computer age. Within crystallography and closely related subject areas, there is already a high level of conformance to the generally accepted FAIR principles (that data be findable, accessible, interoperable and reusable) through the adoption of common information exchange protocols by databases, publishers, instrument vendors, experimental facilities and software authors. Driven by the success within these domains, the IUCr has worked closely with CODATA (the Committee on Data of the International Science Council) to help develop the latter's commitment to cross-domain integration of discipline-specific data. The IUCr has, in particular, emphasized the need for standards relating to data quality and completeness as an adjunct to the FAIR data landscape. This can ensure definitive reusable data, which in turn can aid interoperability across domains. A microsymposium at the IUCr 2023 Congress provided an up-to-date survey of data interoperability within and outside of crystallography, expounded using a broad range of examples.




ap

Data reduction in protein serial crystallography

Serial crystallography (SX) has become an established technique for protein structure determination, especially when dealing with small or radiation-sensitive crystals and investigating fast or irreversible protein dynamics. The advent of newly developed multi-megapixel X-ray area detectors, capable of capturing over 1000 images per second, has brought about substantial benefits. However, this advancement also entails a notable increase in the volume of collected data. Today, up to 2 PB of data per experiment could be easily obtained under efficient operating conditions. The combined costs associated with storing data from multiple experiments provide a compelling incentive to develop strategies that effectively reduce the amount of data stored on disk while maintaining the quality of scientific outcomes. Lossless data-compression methods are designed to preserve the information content of the data but often struggle to achieve a high compression ratio when applied to experimental data that contain noise. Conversely, lossy compression methods offer the potential to greatly reduce the data volume. Nonetheless, it is vital to thoroughly assess the impact on data quality and scientific outcomes when employing lossy compression, as it inherently involves discarding information. The evaluation of lossy compression effects on data requires proper data quality metrics. In our research, we assess various approaches for both lossless and lossy compression techniques applied to SX data, and equally importantly, we describe metrics suitable for evaluating SX data quality.
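
The trade-off between the two families of methods can be illustrated with a small sketch. It compresses a synthetic noisy frame losslessly with zlib, then again after a simple lossy quantization step, and reports the compression ratios together with the largest error introduced. The synthetic Gaussian background, the 16-bit pixel format, the quantization step of 4 counts and the function names are assumptions for illustration, not the schemes evaluated in the paper.

```python
import random
import zlib
from array import array

def compressed_size(counts):
    """Bytes needed to store the counts as unsigned 16-bit values after zlib compression."""
    return len(zlib.compress(array("H", counts).tobytes(), 9))

def quantize(counts, step):
    """Lossy step: round every count to the nearest multiple of `step`."""
    return [((c + step // 2) // step) * step for c in counts]

random.seed(0)
# Synthetic noisy background standing in for one detector frame.
frame = [max(0, round(random.gauss(20.0, 5.0))) for _ in range(50_000)]
coarse = quantize(frame, 4)

raw_bytes = array("H", frame).itemsize * len(frame)
lossless = compressed_size(frame)
lossy = compressed_size(coarse)
max_error = max(abs(a - b) for a, b in zip(frame, coarse))

print(f"lossless ratio {raw_bytes / lossless:.1f}, "
      f"lossy ratio {raw_bytes / lossy:.1f}, max error {max_error} counts")
```

The per-pixel error shown here is only the crudest possible quality measure; whether a given loss is acceptable would have to be judged with crystallographic data-quality metrics of the kind the abstract refers to.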




ap

Cocrystals of a coumarin derivative: an efficient approach towards anti-leishmanial cocrystals against MIL-resistant Leishmania tropica

Leishmaniasis is a neglected parasitic tropical disease with numerous clinical manifestations. One of the causative agents of cutaneous leishmaniasis (CL) is Leishmania tropica (L. tropica), known for causing ulcerative lesions on the skin. The adverse effects of the recommended available drugs, such as amphotericin B and pentavalent antimonials, and the emergence of drug resistance in parasites, mean the search for new safe and effective anti-leishmanial agents is crucial. Miltefosine (MIL) was the first recommended oral medication, but its use is now limited because of the rapid emergence of resistance. Pharmaceutical cocrystallization is an effective method to improve the physicochemical and biological properties of active pharmaceutical ingredients (APIs). Herein, we describe the cocrystallization of coumarin-3-carboxylic acid (CU, 1a; 2-oxobenzopyrane-3-carboxylic acid, C10H6O4) with five coformers [2-amino-3-bromopyridine (1b), 2-amino-5-(trifluoromethyl)-pyridine (1c), 2-amino-6-methylpyridine (1d), p-aminobenzoic acid (1e) and amitrole (1f)] in a 1:1 stoichiometric ratio via the neat grinding method. The cocrystals 2–6 obtained were characterized via single-crystal X-ray diffraction, powder X-ray diffraction, differential scanning calorimetry and thermogravimetric analysis, as well as Fourier transform infrared spectroscopy. Non-covalent interactions, such as van der Waals, hydrogen bonding, C—H⋯π and π⋯π interactions contribute significantly towards the packing of a crystal structure and alter the physicochemical and biological activity of CU. In this research, newly synthesized cocrystals were evaluated for their anti-leishmanial activity against the MIL-resistant L. tropica and cytotoxicity against the 3T3 (normal fibroblast) cell line. Among the non-cytotoxic cocrystals synthesized (2–6), CU:1b (2, IC50 = 61.83 ± 0.59 µM), CU:1c (3, 125.7 ± 1.15 µM) and CU:1d (4, 48.71 ± 0.75 µM) appeared to be potent anti-leishmanial agents and showed several-fold more anti-leishmanial potential than the tested standard drug (MIL, IC50 = 169.55 ± 0.078 µM). The results indicate that cocrystals 2–4 are promising anti-leishmanial agents which require further exploration.




ap

Transferable Hirshfeld atom model for rapid evaluation of aspherical atomic form factors

Form factors based on aspherical models of atomic electron density have brought great improvement in the accuracies of hydrogen atom parameters derived from X-ray crystal structure refinement. Today, two main groups of such models are available, the banks of transferable atomic densities parametrized using the Hansen–Coppens multipole model which allows for rapid evaluation of atomic form factors and Hirshfeld atom refinement (HAR)-related methods which are usually more accurate but also slower. In this work, a model that combines the ideas utilized in the two approaches is tested. It uses atomic electron densities based on Hirshfeld partitions of electron densities, which are precalculated and stored in a databank. This model was also applied during the refinement of the structures of five small molecules. A comparison of the resulting hydrogen atom parameters with those derived from neutron diffraction data indicates that they are more accurate than those obtained with the Hansen–Coppens based databank, and only slightly less accurate than those obtained with a version of HAR that neglects the crystal environment. The advantage of using HAR becomes more noticeable when the effects of the environment are included. To speed up calculations, atomic densities were represented by multipole expansion with spherical harmonics up to l = 7, which used numerical radial functions (a different approach to that applied in the Hansen–Coppens model). Calculations of atomic form factors for the small protein crambin (at 0.73 Å resolution) took only 68 s using 12 CPU cores.




ap

Droplet microfluidics for time-resolved serial crystallography

Serial crystallography requires large numbers of microcrystals and robust strategies to rapidly apply substrates to initiate reactions in time-resolved studies. Here, we report the use of droplet miniaturization for the controlled production of uniform crystals, providing an avenue for controlled substrate addition and synchronous reaction initiation. The approach was evaluated using two enzymatic systems, yielding 3 µm crystals of lysozyme and 2 µm crystals of Pdx1, an Arabidopsis enzyme involved in vitamin B6 biosynthesis. A seeding strategy was used to overcome the improbability of Pdx1 nucleation occurring with diminishing droplet volumes. Convection within droplets was exploited for rapid crystal mixing with ligands. Mixing times of <2 ms were achieved. Droplet microfluidics for crystal size engineering and rapid micromixing can be utilized to advance time-resolved serial crystallography.




ap

KINNTREX: a neural network to unveil protein mechanisms from time-resolved X-ray crystallography

Here, a machine-learning method based on a kinetically informed neural network (NN) is introduced. The proposed method is designed to analyze a time series of difference electron-density maps from a time-resolved X-ray crystallographic experiment. The method is named KINNTREX (kinetics-informed NN for time-resolved X-ray crystallography). To validate KINNTREX, multiple realistic scenarios were simulated with increasing levels of complexity. For the simulations, time-resolved X-ray data were generated that mimic data collected from the photocycle of the photoactive yellow protein. KINNTREX only requires the number of intermediates and approximate relaxation times (both obtained from a singular value decomposition) and does not require an assumption of a candidate mechanism. It successfully predicts a consistent chemical kinetic mechanism, together with difference electron-density maps of the intermediates that appear during the reaction. These features make KINNTREX attractive for tackling a wide range of biomolecular questions. In addition, the versatility of KINNTREX can inspire more NN-based applications to time-resolved data from biological macromolecules obtained by other methods.




ap

Chaperone-mediated MHC-I peptide exchange in antigen presentation

This work focuses on molecules that are encoded by the major histocompatibility complex (MHC) and that bind self-, foreign- or tumor-derived peptides and display these at the cell surface for recognition by receptors on T lymphocytes (T cell receptors, TCR) and natural killer (NK) cells. The past few decades have accumulated a vast knowledge base of the structures of MHC molecules and the complexes of MHC/TCR with specificity for many different peptides. In recent years, the structures of MHC-I molecules complexed with chaperones that assist in peptide loading have been revealed by X-ray crystallography and cryogenic electron microscopy. These structures have been further studied using mutagenesis, molecular dynamics and NMR approaches. This review summarizes the current structures and dynamic principles that govern peptide exchange as these relate to the process of antigen presentation.




ap

A step towards 6D WAXD tensor tomography

X-ray scattering/diffraction tensor tomography techniques are promising methods to acquire the 3D texture information of heterogeneous biological tissues at micrometre resolution. However, the methods suffer from a long overall acquisition time due to multi-dimensional scanning across real and reciprocal space. Here, a new approach is introduced to obtain 3D reciprocal information of each illuminated scanning volume using mathematical modeling, which is equivalent to a physical scanning procedure for collecting the full reciprocal information required for voxel reconstruction. The virtual reciprocal scanning scheme was validated by a simulated 6D wide-angle X-ray diffraction tomography experiment. The theoretical validation of the method represents an important technological advancement for 6D diffraction tensor tomography and a crucial step towards pervasive applications in the characterization of heterogeneous materials.




ap

The importance of definitions in crystallography

This paper was motivated by the articles 'Same or different – that is the question' in CrystEngComm (July 2020) and 'Change to the definition of a crystal' in the IUCr Newsletter (June 2021). Experimental approaches to crystal comparisons require rigorously defined classifications in crystallography and beyond. Since crystal structures are determined in a rigid form, their strongest equivalence in practice is rigid motion, which is a composition of translations and rotations in 3D space. Conventional representations based on reduced cells and standardizations theoretically distinguish all periodic crystals. However, all cell-based representations are inherently discontinuous under almost any atomic displacement that can arbitrarily scale up a reduced cell. Hence, comparison of millions of known structures in materials databases requires continuous distance metrics.
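
The two properties asked of such a metric, invariance under rigid motion and continuity under small atomic displacements, can be illustrated on a finite point set. The sketch below uses the sorted list of pairwise distances as a simple descriptor. This is only a toy stand-in: it is not the invariant proposed in the paper, it ignores periodicity, and sorted distances alone do not distinguish every pair of point sets. The function names and the example motif are assumptions.

```python
import math
from itertools import combinations

def sorted_distances(points):
    """Descriptor: all pairwise distances in ascending order."""
    return sorted(math.dist(a, b) for a, b in combinations(points, 2))

def rigid_motion(points, angle, shift):
    """Rotate about the z axis by `angle` (radians), then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]

def descriptor_gap(p, q):
    """Largest difference between corresponding sorted distances."""
    return max(abs(a - b) for a, b in zip(sorted_distances(p), sorted_distances(q)))

motif = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
moved = rigid_motion(motif, 0.7, (5.0, -2.0, 1.5))   # same structure, new pose
nudged = [(0.001, 0.0, 0.0)] + motif[1:]             # one atom displaced by 0.001

gap_moved = descriptor_gap(motif, moved)    # zero up to rounding error
gap_nudged = descriptor_gap(motif, nudged)  # small, bounded by the displacement
```

A reduced-cell representation, by contrast, can change abruptly under a displacement of the same size, which is the discontinuity the abstract refers to.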