Chemical structure searches are among the patent searches that can provide a great deal of information to innovators working with a novel compound. They are widely used across many industries, including biotechnology, biochemistry, textiles and many others. It is therefore important to understand the details of chemical structure searching, and the first step is to get acquainted with the terms involved. This article explains some important terms that come up while performing a comprehensive chemical structure search.
SMILES- SMILES stands for “Simplified Molecular-Input Line-Entry System”. It is a specification, in the form of a line notation, for describing the structure of chemical species using short ASCII strings. SMILES strings can be imported by most molecule editors for conversion back into two-dimensional drawings or three-dimensional models of the molecules. The term SMILES is commonly used to refer both to a single SMILES string and to a number of SMILES strings; the exact meaning is usually apparent from the context. For example, methyl isocyanate (CH3–N=C=O) can be represented as CN=C=O.
Representation of various chemical species in SMILES
- Atoms- Atoms are represented by the standard abbreviation of the chemical element, in square brackets, such as [Au] for gold. Brackets may be omitted in the common case of atoms in the “organic subset” of B, C, N, O, P, S, F, Cl, Br, or I.
- Bonds- A bond is represented using one of the symbols ‘.’ ‘-’ ‘=’ ‘#’ ‘$’ ‘:’ ‘/’ or ‘\’.
- Rings- Ring structures are written by breaking each ring at an arbitrary point (although some choices will lead to a more legible SMILES than others) to make an acyclic structure and adding numerical ring closure labels to show connectivity between non-adjacent atoms.
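The atom, bond and ring conventions above can be illustrated with a small tokenizer. This is only a minimal sketch in Python: the function name `tokenize_smiles` and the token labels are hypothetical, and the patterns for lowercase aromatic atoms and for branch parentheses go beyond what is described above and are included as assumptions so that common strings can be split. It does not check chemical validity.

```python
import re

# Token patterns for a simplified subset of SMILES (illustrative only).
TOKEN_SPEC = [
    ("bracket_atom", r"\[[^\]]+\]"),         # e.g. [Au]
    ("organic_atom", r"Cl|Br|[BCNOPSFI]"),   # organic subset, brackets omitted
    ("aromatic_atom", r"[bcnops]"),          # assumption: lowercase aromatic atoms
    ("bond", r"[.\-=#$:/\\]"),               # the bond symbols listed above
    ("ring_closure", r"%\d{2}|\d"),          # numerical ring closure labels
    ("branch", r"[()]"),                     # assumption: branch parentheses
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize_smiles(smiles):
    """Split a SMILES string into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(smiles):
        match = TOKEN_RE.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for example in ("CN=C=O", "C1CCCCC1", "[Au]"):
        print(example, tokenize_smiles(example))
```

For methyl isocyanate, CN=C=O, the sketch yields four atoms and two double bonds. For cyclohexane, C1CCCCC1, the two `1` tokens mark the point at which the ring was broken.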
CAS Registry- CAS REGISTRY is the substance registry system maintained by Chemical Abstracts Service (CAS), a division of the American Chemical Society; the name is a CAS trademark. It is the most authoritative collection of disclosed chemical substance information, containing, at the time of writing, more than 130 million organic and inorganic substances and 67 million sequences.
CAS Registry Number- A CAS Registry Number, also referred to as a CASRN or CAS Number, is a unique numerical identifier assigned by Chemical Abstracts Service (CAS) to every chemical substance described in the open scientific literature (currently including those described from at least 1957 through the present). This covers organic and inorganic compounds, minerals, isotopes, alloys and non-structurable materials (UVCBs: substances of unknown or variable composition, complex reaction products, or biological materials). For example, 64-17-5 is the CAS number allocated to ethanol.
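A CAS Registry Number has up to ten digits in three hyphen-separated parts, and the final digit is a check digit. The Python sketch below shows one way to validate that format. The function name `is_valid_cas_number` is hypothetical, and the check only confirms that a number is well formed, not that it has actually been assigned to a substance.

```python
import re

# Two to seven digits, then two digits, then a single check digit.
CAS_RE = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas_number(casrn):
    """Return True if the string has the CAS format and a matching check digit."""
    match = CAS_RE.match(casrn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost digit of the
    # body, sum the products, and compare the sum modulo 10 with the check digit.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


if __name__ == "__main__":
    print(is_valid_cas_number("64-17-5"))   # ethanol
    print(is_valid_cas_number("64-17-6"))   # same digits, wrong check digit
```

For ethanol, the weighted sum is 7×1 + 1×2 + 4×3 + 6×4 = 45, and 45 modulo 10 gives the check digit 5.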
InChI- The IUPAC International Chemical Identifier (InChI) is a textual identifier for chemical substances, designed to provide a standard way to encode molecular information and to facilitate the search for such information in databases and on the web. The identifiers describe chemical substances in terms of layers of information: the atoms and their bond connectivity, tautomeric information, isotope information, stereochemistry, and electronic charge information. Not all layers have to be provided; for instance, the tautomer layer can be omitted if that type of information is not relevant to the particular application. For example, the InChI string for alprazolam is InChI=1/C17H13ClN4/c1-11-20-21-16-10-19-17(12-5-3-2-4-6-12)14-9-13(18)7-8-15(14)22(11)16/h2-9H
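The layered structure can be seen by splitting an InChI string at its `/` separators. The Python sketch below is a simplified illustration, not a full parser: the function name `split_inchi_layers` is hypothetical, and it assumes that the first part is the version, the second is the chemical formula, and each later layer starts with a single-letter prefix (such as `c` for connectivity and `h` for hydrogens). Strings with no formula layer or with repeated prefixes are not handled.

```python
def split_inchi_layers(inchi):
    """Split an InChI string into a dict of layers (simplified sketch)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Assumption: the second part is the chemical formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # The first character names the layer; the rest is its content.
            layers[part[0]] = part[1:]
    return layers


if __name__ == "__main__":
    alprazolam = ("InChI=1/C17H13ClN4/c1-11-20-21-16-10-19-17"
                  "(12-5-3-2-4-6-12)14-9-13(18)7-8-15(14)22(11)16/h2-9H")
    for name, content in split_inchi_layers(alprazolam).items():
        print(name, content)
```

Applied to the alprazolam string, this separates the version (`1`), the formula (`C17H13ClN4`), the connectivity layer and the hydrogen layer (`2-9H`).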
Systematic names- A systematic name is a name given in a systematic way to one unique group, organism, object or chemical substance, out of a specific population or collection. Systematic names are usually part of a nomenclature. Even so, many common chemicals are still referred to by their common or trivial names, even by chemists. For example, acetone is a common name; the corresponding systematic name is 2-propanone.