Talk:Simplified molecular-input line-entry system

WikiProject Chemistry (Rated Start-class, Low-importance)
This article is within the scope of WikiProject Chemistry, a collaborative effort to improve the coverage of chemistry on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as Start-Class on the project's quality scale and as Low-importance on the project's importance scale.


To be honest, I'm a SMILES novice, though I've known of its existence for years.

An unambiguous depiction of the atomic structure of molecules is crucial to successfully depicting and communicating about chemistry. Images are often indispensable to such depiction. However, there are a number of issues surrounding the generation and inclusion of images in Wikipedia (transcluded links hosted offsite aren't part of the central encyclopedia corpus; easy software support for image loading isn't available or, if available, raises concerns of abuse; images seem to generate more copyright concern [whether justified or not] than contributed words; etc.).

It seems that SMILES strings can serve as a very compact adjunct for depicting some molecular structures.

I encourage my fellow Wikipedian science authors to consider the use of SMILES strings in their work.


Yes, it would certainly be a nice piece of information to add. Less for depiction perhaps, but it would certainly make information more searchable. Search by substructure would certainly be a nice thing to have.

Shyamal 12:56, 30 Jun 2004 (UTC)

Leaving atoms out

I noticed that the SMILES representations for Citric Acid and Calcium citrate do not show the hydrogen atoms consistently. What is the deal? Say you draw a structure without H's, do you show the SMILES for the structure you drew, or for the molecule as it exists in reality? --Slashme 09:45, 27 October 2005 (UTC)

I can't find the examples you mention. In general, however, there are multiple ways to represent almost any given structure using almost any desired notation system. For SMILES, it's okay to omit an explicit H when it is attached to an atom in the "organic subset" and the number of hydrogens is obvious (from valence rules). It's also okay to include them explicitly if you want, either in the square brackets or as explicit attached substituents. That's pretty close to the rule for most structural diagrams, too: you can omit hydrogen from carbon (but not from the rest of the SMILES organic subset!) if you want, or you can include some or all of them if it makes the structure more clearly illustrate whatever you're trying to illustrate with it. DMacks 04:52, 26 January 2006 (UTC)
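
As a rough illustration of the "obvious from valence rules" point, here is a minimal Python sketch. It assumes the normal valences listed for the organic subset in the OpenSMILES specification, covers only unbracketed aliphatic atoms (aromatic atoms need extra handling), and the function name `implicit_hydrogens` is made up for this note.

```python
# Normal valences of the SMILES "organic subset" (assumed here to follow the
# OpenSMILES specification). An unbracketed atom is filled with hydrogens up
# to the lowest listed valence that is >= the sum of its explicit bond orders.
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,), "P": (3, 5),
    "S": (2, 4, 6), "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def implicit_hydrogens(symbol, bond_order_sum):
    """Return the implied H count for an unbracketed organic-subset atom."""
    for valence in NORMAL_VALENCES[symbol]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0  # bond orders already exceed the highest normal valence


# Ethanol written as CCO: the explicit bond-order sums are 1, 2 and 1.
ethanol = [("C", 1), ("C", 2), ("O", 1)]
print([implicit_hydrogens(sym, bonds) for sym, bonds in ethanol])  # [3, 2, 1]
```

So CCO and the fully explicit [CH3][CH2][OH] describe the same molecule; the hydrogens in the first form are simply implied.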

Page title

Any reason why SMILES redirects to Simplified molecular input line entry specification instead of the other way around? I assume SMILES is the most common name. National Aeronautics and Space Administration redirects to NASA, American Standard Code for Information Interchange redirects to ASCII, etc. Not a major problem, but it looks inconsistent. 19:17, 14 June 2006 (UTC)

SLN Link

The link to SLN leads to a page about Dutch sign language. Should it not be changed to something like ‘SLN (Tripos)’? 11:57, 17 September 2006 (UTC)

According to be bold! you are encouraged to edit things on your own. Anyway, I have created a SYBYL Line Notation article which is now cross-linked in both directions. Kind regards, JKW 20:29, 17 September 2006 (UTC)

comparisons of molecules

Is there any software that, given the SMILES data for two different molecules, can output a number that tells how close the two compounds are to one another (graph distance?)?
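
One widely used measure of this kind is the Tanimoto (Jaccard) coefficient between molecular fingerprints. The sketch below only shows the arithmetic: the two fingerprints are invented sets of "on" bit positions, whereas real ones would be generated from the SMILES strings by a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit positions.

    Returns 1.0 for identical sets and 0.0 for sets with no bits in common.
    """
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(union)


# Hypothetical fingerprints (made-up bit positions, for illustration only).
molecule_1 = {1, 2, 3}
molecule_2 = {2, 3, 4}
print(tanimoto(molecule_1, molecule_2))  # 0.5 (2 shared bits out of 4 total)
```

This is a similarity score rather than a graph distance; graph-based measures such as maximum common substructure are a separate (and more expensive) approach.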

External Links

To my mind, we should include ones directly related to SMILES itself: SMILES and SMARTS, and the tutorial seems in keeping with WP:EL. The Parsing SMILES technical page seems on-topic as well. The links to various converters and products that happen to understand SMILES seem inappropriate. They move off the topic of SMILES as a syntax and system into just a collection of ext-links (specifically discouraged by WP:EL). The list is woefully incomplete both for products listed and for whole classes of things (SMILES is at least as important as a database tool as it is for converting to/from 2D structures) and draws focus towards a few specific implementations that may not even be the most featured/general/useful ones (inescapable POV or undue weight issues). DMacks 00:15, 24 January 2007 (UTC)

That's along the lines of what I was thinking, although I don't really know the topic. I just dropped by to clean up after a conflict of interest editor who seemed to be over-linking his own work to a number of articles. I noticed the other links seemed to be straying so added the cleanup tag. JonHarder talk 01:11, 24 January 2007 (UTC)

Inorganic molecules

I'm an inorganic chemist, and after reading the current form of the article I don't feel able to estimate if (or how) this works for transition metal complexes (not necessarily mononuclear... say big polyoxometalates). Inorganic mixed-valence systems would also pose an interesting problem. Probably the article should give an idea of to what extent this works for inorganic systems. -- 17:02, 15 February 2007 (UTC)

Better description

The description needs to be improved, so it's less vague / more complete. A diagram, or rather, a few diagrams would also be nice. Shinobu 13:12, 10 June 2007 (UTC)

Canonical SMILES and Isomeric SMILES

I was just passing through and noticed the word isotope here. Pardon my ignorance but shouldn't this be isomer? Pterre 23:32, 4 October 2007 (UTC)

Corrected. Shyamal 01:24, 5 October 2007 (UTC)
Re-corrected back to the original per Daylight's website descriptions. DMacks 02:04, 5 October 2007 (UTC)
As I said, pardon my ignorance. How about adding some words (perhaps in the Examples section) to make it clear to the casual reader that this really does mean isotope and is not a typo? Pterre 12:15, 6 October 2007 (UTC)
No worries, naming is pretty non-intuitive sometimes. I'm not sure what more we could say that would make it clear that we mean what we say (it's just a definition of a term), but would love to hear suggestions. DMacks 19:49, 6 October 2007 (UTC)
I have changed the wording as I also found it rather confusing. Please check for correctness. --Slashme (talk) 12:25, 16 January 2008 (UTC)

Where to start

Is there any standard on which atom to start on when describing a molecule? I assume this is specified in Canonical SMILES? --Slashme (talk) 12:25, 16 January 2008 (UTC)

I believe there is no unique way to "canonicalize". Think of it as a set of rules like those used for constructing IUPAC names. Shyamal (talk) 12:44, 22 July 2010 (UTC)


The section on isotopes says that C14 benzene is [14c]1ccccc1. Shouldn't that be [14c]1[14c][14c][14c][14c][14c]1? --Slashme (talk) 12:17, 16 January 2008 (UTC)
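
To make the difference between the two strings concrete: in SMILES an isotope label sits inside a bracket atom and applies to that atom only. The small Python sketch below counts the labelled atoms in each string with a regular expression. It is a simplification that only recognises bracket atoms starting with a mass number, and the helper name `isotope_labels` is invented for this note.

```python
import re

# A bracket atom that starts with a mass number, e.g. [14c] or [13CH4].
# Simplified pattern: mass number, element symbol, then anything up to "]".
BRACKET_ISOTOPE = re.compile(r"\[(\d+)([A-Za-z][a-z]?)[^\]]*\]")


def isotope_labels(smiles):
    """Return (mass number, symbol) for every isotopically labelled atom."""
    return [(int(mass), symbol)
            for mass, symbol in BRACKET_ISOTOPE.findall(smiles)]


print(len(isotope_labels("[14c]1ccccc1")))                      # 1
print(len(isotope_labels("[14c]1[14c][14c][14c][14c][14c]1")))  # 6
```

So the first string is benzene with a single carbon-14 atom, and the second is benzene in which all six carbons are carbon-14.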


Suggest copying diagram from the French version of the page. (talk) 14:47, 13 August 2008 (UTC)

1) Disambig; 2) History authorship

1) Which is preferred on the Smiles disambiguation page -- SMILES is a "chemistry notation"? or a "chemical notation"?

2) On the historical record of SMILES' authorship. I quoted referenced daylight~dave smiles acknowledgements. The now 2nd paragraph reads as follows:

> "The original SMILES specification was initiated by David Weininger in the 1980s. Acknowledged for their parts in the early development were "Gilman Veith and Rose Russo (USEPA) and Albert Leo and Corwin Hansch (Pomona College) for supporting the work, and Arthur Weininger (Pomona; Daylight CIS) and Jeremy Scofield (Cedar River Software, Renton, WA) for assistance in programming the system."[1] The Environmental Protection Agency funded the initial project to develop SMILES.[2] "<

Only Dave & Corwin have current wiki pages at the moment. Yohananw (talk) 00:24, 25 June 2013 (UTC)

Inappropriately included URLs

Under examples, several references to 'see depiction' pointing to stale external addresses, e.g. — Preceding unsigned comment added by (talk) 19:25, 1 June 2015 (UTC)

Yes, they are available with, but someone should substitute all of them with commons formula images. --Itu (talk) 12:31, 25 February 2016 (UTC)

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Simplified molecular-input line-entry system. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

As of February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{sourcecheck}} (last update: 15 July 2018).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot IITalk to my owner:Online 14:26, 27 February 2016 (UTC)

P-carborane in SMILES

I was trying to figure out how to make p-carborane (and more generally an icosahedral molecule), and I came across the barrier that when you use a % for a ring number, like %10, you can't use other ring numbers on that one atom (eg. C1%10 isn't allowed). Does anyone know any work arounds to making a figure this way? Thottenjäger (talk) 21:00, 19 December 2017 (UTC)
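
For reference, the OpenSMILES grammar lets an atom carry several ring-bond labels in a row, each either a single digit or % followed by two digits, so a suffix such as 1%10 reads as labels 1 and 10. Whether a particular program accepts that is a separate matter. The Python sketch below only splits such a suffix into labels; it ignores bond symbols in front of the digits, and the helper name `ring_labels` is made up here.

```python
import re

# One ring-bond label: "%" plus two digits, or a single digit
# (as in the OpenSMILES grammar; optional bond symbols are ignored here).
RING_BOND = re.compile(r"%(\d\d)|(\d)")


def ring_labels(suffix):
    """Split the ring-closure text that follows an atom into integer labels."""
    labels = []
    pos = 0
    while pos < len(suffix):
        match = RING_BOND.match(suffix, pos)
        if match is None:
            break  # stop at the first character that is not a ring label
        labels.append(int(match.group(1) or match.group(2)))
        pos = match.end()
    return labels


print(ring_labels("1%10"))    # [1, 10]
print(ring_labels("%10%11"))  # [10, 11]
print(ring_labels("12"))      # [1, 2]
```

Note that two plain digits such as 12 are always two separate labels, which is why labels above 9 need the % form.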

Molecule codes

I have discovered that, at least for JMOL, SMILES has short codes that can stand for molecules. These do not appear to combine (i.e. cannot be used as functional groups) (e.g. CCCyn and CynCC don't work for Propionitrile), and I think that they should be mentioned in the article. I have begun to compile a table of them:

Code Molecule
Cyn Cyanide
Pur Purosine
Py Pyridine
Pyr Pyruvic acid
Lac Lactic acid
Sal Salicylic acid
Gu Guanidine
ClF and Clf a glitchy combination of iron, sulfur, and hydrogen
Gua Glutaric acid
Gud Uridine diphosphate glucose
Suc Sucrose
Rop Propionamide
Ace Acetamide
For Formamide
Fom Fosmidomycin
Pyl Phenylethane
Cys Cysteine
Arg DL-arginine
His Histidine
Lys Lysine
Apr Adenosine diphosphate ribose
Asr Arsanilic acid
Arn 1-imino-5-pentanone
Asn Asparagine
Apn Apholate
Asp Aspartic acid
Asa Acetylsalicylic acid
Act Acetic acid
Prp Phosphoribosyl pyrophosphate
Glu and Glt Glutamic acid
Ser Serine
Ala Aminolevulinic acid hydrochloride
Thr Threonine
Gly Glycine
Pro Proline
Prl Proflavine
Ilo N5-ethanimidoylornithine
Ile Isoleucine
Leu Leucine
Et Ethidium
Eth Ethane
Pop Pyrophosphoric acid
But Butonate
Pnt Pentamidine
Pet 1,5-di(4-amidinophenoxy)-3-oxa-pentane
Hx Hypoxanthine
Hex Hexane
Hpt Hexamethylphosphoramide
Hp Hematoporphyrin
Heme Heme B (my best guess, as something seems to be wrong with the 20-carbon conjugated system)
Met Methionine
Phe Phenylalanine
Phn Phenanthroline
Trp Tryptophan
Tyr Tyrosine
Val Valine
Vl Vinyl laurate
Gud globotriose with one hydroxyl group replaced by a fluorine atom
Fru Fructose
Glc Glucose
Gal Galactose
Ct Cholestane-3,5,6-triol
Cyt Cytosine
Ad 2,5-dimethyl-2,5-di(tert-butylperoxy)hexane
Cy Cyclophosphamide
Ade Adenine
Adn Adenosine
Ctn Cytidine
Gun Guanine
Gns glucose with a hydroxyl group replaced with a sulfonamide group
Nic 2-Nitroisocitrate
Thy Thymine
Tha Tacrine
The a sterol
Thu 5,6-Dihydrodeoxyuridine
Urc Uric acid
Ura Uracil
Uri Uridine
Azt and Zdv Zidovudine
Dn and Dnoc Dinitro-ortho-cresol
Dnc 3,5-dinitrocatechol
Het 4-methyl-5-thiazoleethanol
Cha 2-Amino-3-cyclohexylpropan-1-ol
Sa Succinylacetone
Ste Stearic acid
Va Vanillic acid
Ve VE (nerve agent)
Vx VX (nerve agent)
Sar Sarcosinamide
Vg VG (nerve agent)
Vp Arginine Vasopressin and Lysine Vasopressin
Vr VR (nerve agent)
Mal Maltose
Ma 10,13,17-Trimethyl-17-oxidanyl-7,8,9,11,12,14,15,16-octahydro-6H-cyclopenta[a]phenanthren-3-one
Ml 5-Methyl-2-[3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]-1,2-oxazol-3-one
Gb Sarin
Tab 2-Benzyl[2-chloro-5-(2,4-diamino-6-ethyl-5-pyrimidinyl)phenylazoaminoethyl acetate]
Tbn Tubercidine
Som Methyl phosphinic acid
Smn Mandelic acid
Gf Cyclosarin
Xa Xylosyladenine
Tt and Ttc Triphenyltetrazolium chloride
Xan Xanthine
Ttt 1,3,5-trinitroso-1,3,5-triazinane
Tnt Trinitrotoluene
Tatb Triaminotrinitrobenzene
Ng Nitroglycerin
Cor 2,4-Diamino-1,5-diphenylpentan-3-ol
Crl 3-Nona-4,7-dienoyloxirane-2-carboxamide
Nap Nadide phosphate
Nad Nicotinamide adenine dinucleotide
Fad Flavin adenine dinucleotide
Rbf Riboflavin
Thm Thymidine
Ths 5'-(dithiophosphono)-thymidine
Tht Thioflavin T
Thf Tetrahydrofuran
Fur 5-Fluorouracil arabinoside
Frn what appears to be a protein with proline attached to phenylalanine attached to something that's probably lysine

Feel free to add to the table. Care to differ or discuss with me? The Nth User 02:25, 25 March 2019 (UTC)