Ecoinvent Chemical Structures

Date: Dec 2023 –

Why, what, how?

When building LCA inventories of chemical processes, it is sometimes necessary to use proxy chemicals to represent inputs and outputs that cannot be found in the database. When this is the case, it is important to ensure that the proxy chemicals are as similar as possible to the real chemicals that they are replacing.

Usually, one would need to search on the internet to see the structures of the chemicals, which are sometimes listed in the database under non-systematic names that give no hint of the structure. This gets rather annoying and makes it easier to miss the best match or choose the wrong proxy.

To make this process easier, I have processed the ecoinvent 3.10 database to extract the Chemical Abstracts Service (CAS) numbers and (where possible) match each one to its structure. This allows one to do a quick visual scan through the structures to find the most appropriate proxy. Additional chemical information is also collected, such as the SMILES string, the InChI code, the formula, the molecular weight and synonyms. An important caveat here is, of course, that while the structures of two chemicals may be similar, their production processes and environmental impacts may be quite different. The structure can nevertheless be a useful starting point for narrowing the search.
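As a rough illustration of the matching step, the sketch below validates a CAS number using its check digit and builds a lookup URL for the PubChem PUG REST service. It is not the code from the repository: the function names `is_valid_cas` and `pubchem_property_url` are hypothetical, the use of PubChem for this lookup is an assumption, and the default property names are assumptions that should be checked against the current PubChem documentation.

```python
import re
from urllib.parse import quote

# CAS numbers have the form NNNNNNN-NN-N (2 to 7 digits, 2 digits, 1 check digit).
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

# Assumed PUG REST endpoint prefix for name-based compound lookups.
PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name"


def is_valid_cas(cas: str) -> bool:
    """Return True if `cas` is well formed and its check digit is correct."""
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one,
    # then compare the sum modulo 10 with the check digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


def pubchem_property_url(
    cas: str,
    properties=("MolecularFormula", "MolecularWeight", "InChI", "IUPACName"),
) -> str:
    """Build a PUG REST URL that looks a compound up by its CAS number.

    The property names are assumptions; adjust them to the names PubChem
    currently supports (for example, to add a SMILES property).
    """
    if not is_valid_cas(cas):
        raise ValueError(f"Not a valid CAS number: {cas!r}")
    return f"{PUBCHEM_BASE}/{quote(cas.strip())}/property/{','.join(properties)}/JSON"
```

For example, `is_valid_cas("7732-18-5")` accepts the CAS number of water, while a mistyped check digit such as `"7732-18-4"` is rejected before any network request is made. Filtering out malformed entries first keeps the lookup step from silently matching the wrong compound.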

The source code, data and images related to processing the database and collecting the structures are available on GitHub. The changes needed to implement this with brightway2-io have now been merged into the main branch with pull request 237.

The chemical structure gallery

On this page you can find a gallery of the structure images and their information, sorted by molecular weight. Click an image to find out more, including an embedded external search function for ecoinvent ecoquery and PubChem.

The images are in SVG format and are named `mW_CAS.svg`, where `mW` is the rounded molecular weight and `CAS` is the CAS number. Each image is annotated with the CAS number, ecoinvent name, IUPAC name, SMILES string, formula and synonyms. Ideally, the SVG text should be searchable, so you can hit Ctrl+F, search for whatever you are looking for, and see the structure and other information instantly. (This is not working at the moment, due to some HTML mysteries, so you need to open the image in a separate tab to select text or search.)
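If you have downloaded the images, the naming scheme and the embedded text can also be used offline. The sketch below is an illustration rather than the repository's code: the helper names are hypothetical, rounding to the nearest integer without zero padding is an assumption about how `mW` is formed, and the search assumes the annotations are stored as ordinary text elements in the SVG.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def image_filename(molecular_weight: float, cas: str) -> str:
    """Build a name of the form mW_CAS.svg (assumes nearest-integer rounding)."""
    return f"{round(molecular_weight)}_{cas}.svg"


def parse_image_filename(filename: str):
    """Split 'mW_CAS.svg' into (rounded molecular weight, CAS number)."""
    weight, cas = Path(filename).stem.split("_", 1)
    return int(weight), cas


def sort_by_weight(filenames):
    """Order image names by the molecular weight encoded in each name."""
    return sorted(filenames, key=lambda name: parse_image_filename(name)[0])


def svg_contains(svg_source: str, query: str) -> bool:
    """Case-insensitive search of all text nodes in an SVG document."""
    root = ET.fromstring(svg_source)
    text = " ".join(root.itertext())
    return query.lower() in text.lower()
```

To search a local copy of the gallery, read each file with `Path(name).read_text()` and pass the content to `svg_contains`, then use `sort_by_weight` on the matching names to browse them in the same order as the gallery.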

Optimally, this presentation would be interactive somehow, with filtering and sorting options, but I don't have the time to do that right now. If you want to do it, feel free to fork the repo and make a pull request.


Example of a chemical structure image

Example chemical structure