Import CIF File

Import CIF File [File menu (Startup) or File menu (Graphics)]

See the Import File dialog for general aspects of importing atomic-structure data files.

ATOMS uses only a small number of the possible data items in a CIF file and recovers only the most basic information.

Cell setting. If a cell setting (crystal class) is not present or is undecipherable, ATOMS will try to recover the crystal class from the unit-cell parameters.
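As an illustration of how a crystal system can be guessed from unit-cell parameters alone, here is a minimal Python sketch. It is not the routine ATOMS uses; the function name guess_crystal_system and the tolerance tol are assumptions made for this example. Note that the cell metric only indicates the highest symmetry the lattice could have, so the true crystal system may be lower.

```python
def guess_crystal_system(a, b, c, alpha, beta, gamma, tol=1e-3):
    """Guess a crystal system from cell lengths and angles (degrees).

    Hypothetical helper: the comparison uses a relative tolerance because
    refined cell parameters are rarely exactly equal or exactly 90/120.
    """
    def eq(x, y):
        return abs(x - y) <= tol * max(abs(x), abs(y), 1.0)

    right = [eq(angle, 90.0) for angle in (alpha, beta, gamma)]
    if all(right):
        if eq(a, b) and eq(b, c):
            return "cubic"
        if eq(a, b) or eq(b, c) or eq(a, c):
            return "tetragonal"
        return "orthorhombic"
    if eq(a, b) and right[0] and right[1] and eq(gamma, 120.0):
        # Trigonal structures on hexagonal axes have the same cell metric.
        return "hexagonal"
    if eq(a, b) and eq(b, c) and eq(alpha, beta) and eq(beta, gamma):
        return "rhombohedral"
    if sum(right) == 2:
        return "monoclinic"
    return "triclinic"
```

For example, guess_crystal_system(4.0, 5.0, 6.0, 90.0, 100.0, 90.0) falls through to the monoclinic branch because exactly two angles are 90 degrees.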

Extensions to CIF files and special versions. Some database operators produce special versions of CIF files, with extensions that may or may not have been approved and that may not be readable without special knowledge or assumptions. Despite these extensions and several types of errors with respect to the original specifications (see below), ATOMS should be able to read most types of CIF file.

mmCIF files. These are files for macromolecules. Although they are designated CIF (and usually have the .cif filename extension), the atomic coordinates are Cartesian, and data are given for a complete molecule or molecules, not the asymmetric unit of a crystal. The Protein Data Bank (see PDB files) usually exports CIF files as mmCIF (molecular, Cartesian).

Importing mmCIF files as Crystals. ATOMS can use the information in the _atom_sites.fract_transf_matrix lines (the Cartesian-to-crystal transformation matrix) to recover the fractional coordinates and space-group symmetry of the original crystal. This information is not always present; if it is, it will be used when you have selected the Crystal boundary option in the Import File dialog.

The transformation matrix given in the file (normally derived from a PDB file) is limited to 6 places after the decimal point, which is sometimes not precise enough to recover the original unit-cell parameters exactly. The cell parameters used in ATOMS are those derived from the transformation matrix, and because of this imprecision the cell is always imported as triclinic. After importing, you should check the cell parameters in the Title/Axes dialog and correct the crystal system.

If this type of file is imported as a molecule, the PDB information on hetatoms, etc. is retained, and the PDB quick bonds option (Bonds dialog in the Input1 Menu) can be used to locate bonds, which drastically reduces the time required. Isolating molecules with the Molecules in Crystal boundary option, if you switch to it after importing, can also take a great deal of time. If you need the molecule(s) rather than crystal unit cells, it is better to import as a molecule.
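The relationship between the Cartesian-to-crystal transformation matrix and the unit-cell parameters can be sketched in Python as follows. This is an illustration under stated assumptions, not the ATOMS implementation: the function names are invented for the example, and the matrix is assumed to act on column vectors (fractional = matrix x Cartesian), so that the columns of its inverse are the cell vectors a, b and c in Cartesian coordinates.

```python
import math


def invert_3x3(m):
    """Invert a 3x3 matrix (nested lists) via the adjugate and determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def angle_deg(u, v):
    """Angle between two vectors in degrees, clamped against rounding."""
    cos_t = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def cell_from_fract_matrix(fract):
    """Return (a, b, c, alpha, beta, gamma) from a fractionalization matrix.

    Assumption: fract maps Cartesian column vectors to fractional ones,
    so the columns of its inverse are the cell vectors.
    """
    orth = invert_3x3(fract)
    va, vb, vc = ([row[k] for row in orth] for k in range(3))
    return (norm(va), norm(vb), norm(vc),
            angle_deg(vb, vc), angle_deg(va, vc), angle_deg(va, vb))
```

Rounding each matrix element to 6 decimal places before calling cell_from_fract_matrix perturbs the recovered lengths and angles slightly, which is why a cell that was exactly monoclinic or orthorhombic may no longer have angles of exactly 90 degrees after import.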

The Cambridge Crystallographic Data Centre generates CIF files (not mmCIF files) which may contain all the atoms in an asymmetric unit, plus other atoms related by symmetry to complete all molecules in the structure. The choice of Boundary Option in the Import File dialog determines how the structure is handled, as in the CCDC import option. If Molecules in Crystal is chosen, all the given atoms are accepted and no symmetry is applied. For this option, the crystal axes are used, not Cartesian axes. For the Default Unit Cell option, the list of valid atoms is cut off when an atom label is encountered with a non-numeric last character. If this results in an incorrect atom list, choose Molecules in Crystal, delete any unwanted atoms and reset the Symmetry and Boundary options.
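The cut-off rule described for the Default Unit Cell option can be sketched in Python as below. The function name and the list-of-labels input are assumptions made for illustration; only the rule itself (stop at the first label whose last character is not a digit) comes from the text above.

```python
def truncate_at_symmetry_atoms(labels):
    """Keep atom labels up to, but not including, the first label whose
    last character is not a digit (hypothetical helper for illustration)."""
    kept = []
    for label in labels:
        if not label or label[-1] not in "0123456789":
            break  # everything from here on is treated as symmetry-generated
        kept.append(label)
    return kept
```

With labels such as C1, C2, O1, C1A, C2A the list is cut after O1. A file in which an asymmetric-unit atom is labeled, say, H1A would be cut too early, which is the kind of incorrect atom list that calls for the Molecules in Crystal workaround.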

If possible, CCDC and PDB files should be downloaded and imported in their "native" formats rather than as CIF, since translation to CIF can only lose information for these file types. In other cases where there may be a choice of formats, such as ICSD, a CIF file may be more successfully imported than the native version.

Errors in CIF files. Many CIF files which are written by databases and other software violate one or more of the CIF syntax rules (Acta Cryst. 1991, A47, 655) and may in some cases be unreadable without modification. Some common errors:

1) Line length greater than 80 characters.
2) Failure to enclose character fields in single quotes.
3) Fields ("data names") too long - for example "data_" field longer than 32 characters. The "data_" line should be truncated to 36 characters.

Many of the lines in a typical file are not used by ATOMS and may simply be deleted if the file is not read correctly.
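Before editing a problem file by hand, it can help to locate the offending lines. The following Python sketch checks for the first and third errors in the list above (over-long lines and over-long names). It is a rough aid, not a CIF validator: the function name, the default limits of 80 and 32 characters, and the choice to measure the whole first token of a line (including the leading underscore or "data_" prefix) are assumptions for this example. Unquoted character fields are not detected.

```python
def find_cif_problems(text, max_line=80, max_name=32):
    """Return (line_number, message) pairs for over-long lines and names.

    Hypothetical helper: only the first token on each line is examined,
    and lines inside multi-line text fields are not treated specially.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_line:
            problems.append((number, f"line is {len(line)} characters long"))
        tokens = line.split()
        if not tokens:
            continue
        first = tokens[0]
        is_name = first.startswith("_") or first.lower().startswith("data_")
        if is_name and len(first) > max_name:
            problems.append(
                (number, f"name '{first}' is {len(first)} characters long")
            )
    return problems
```

A typical use is to read the file into a string, call find_cif_problems on it, and then shorten or delete the reported lines in a text editor before importing.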