Appendix I.   Procedure for Chemical Substance
Sorting and Structural Verification

PURPOSE:	1) 	to sort substances (e.g., items on various EPA/OPPTS
chemical lists) to distinguish among chemicals in the following
categories: 

			a) discrete chemical structures, 

			b) mixtures composed of identifiable discrete components, 

			c) polymers, 

			d) mixtures containing undefined substances

		2)	to verify that each discrete chemical has a unique name (e.g.,
Collective Index (CI) name), structure (e.g., SMILES representation),
and Chemical Abstracts Service Registry Number (CAS#)

			

NOTES:	This procedure was designed for sorting through chemical lists of
regulatory concern to identify discrete chemical structures and to
verify the accuracy of CAS#, structure and chemical name(s).
SMILES notation for structural depiction is also assigned and checked
for accuracy. A systematic chemical name is included, and available
synonyms are retrieved and stored. The information serves as a
necessary foundation for: linking together relational databases that
provide measured or predicted physical chemical properties,
toxicological endpoint data, metabolism simulations, etc.; and accurate
QSAR development and associated strategic chemical selection
procedures. These chemical verification procedures are based upon
procedures used in ECOTOX and ASTER for chemical verification, with the
addition of chemical sorting to distinguish CAS#s for discrete chemical
structures from those for undefined mixtures, etc.

	The procedure is written assuming that CAS#s are available. However,
many of the steps are equally applicable if only structure and/or
systematic name are available. Briefly, the procedure starts with
verifying that CAS#s are valid. If an invalid CAS# is found, no further
verification is possible unless the originator of the list can provide
additional information. A CAS# lookup is done to determine whether the
entry represents one discrete chemical, a mixture of defined discrete
chemicals, a mixture of undefined substances, or a polymer. While
the initial focus for structural lookup is CAS#, the philosophy
underlying the verification is that the unique descriptor for a chemical
substance is its structure. Therefore, CAS#s are used to find structures
(i.e., SMILES, 2-D pictures) for all discrete chemicals identified on a
list; 2-D projections from SMILES are matched to 2-D structures depicted
in one of several independent reference sources (see “Procedure”).
Eventually SMILES are obtained for all chemicals from reference sources,
or written from 2-D structure depictions if not otherwise available.
Each newly written SMILES is checked for accuracy of structural
depiction by a second person.
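
	As an illustration of the first step, one widely used validity test is
the CAS check digit: the final digit should equal the sum of the
preceding digits, weighted 1, 2, 3, ... from right to left, modulo 10.
The sketch below assumes this test is among those meant by "valid"; the
procedure does not name a specific test, and the function name
is_valid_cas is hypothetical.

```python
import re

# Assumed format: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas: str) -> bool:
    """Return True if the string is well formed and its check digit matches.

    Hypothetical helper, not part of the original procedure.
    """
    match = CAS_PATTERN.match(cas.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)   # all digits except the check digit
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(pos * int(d) for pos, d in enumerate(reversed(body), start=1))
    return total % 10 == check


print(is_valid_cas("7732-18-5"))   # water
print(is_valid_cas("7732-18-4"))   # same digits, wrong check digit
```

A number that passes this test is only well formed. It may still be
unassigned or linked to the wrong substance, which is why the lookups
described in the procedure are still needed.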

	Chemical names can vary, and common names may be especially
nondescript. Every attempt should be made to link the CAS# to a
systematic name for a given chemical. The Collective Index (CI) name
should be sought and used as the preferred name. Synonyms should also be
collected so that a chemical can also be searched by its common name(s).
Multiple CAS#s may exist for the same substance; active and/or retired
numbers may be encountered and should be recorded.
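
	One possible way to hold the information described above is a small
record per substance. This is a sketch only: the class name, field names,
and the "active"/"retired" labels are assumptions made for illustration,
not a layout prescribed by this procedure.

```python
from dataclasses import dataclass, field


@dataclass
class SubstanceRecord:
    """Hypothetical record linking one substance to its names and CAS#s."""
    cas: str                                  # CAS# as given on the list
    systematic_name: str = ""                 # preferred (CI) name, if found
    synonyms: list = field(default_factory=list)
    # Other CAS#s encountered for the same substance, mapped to an
    # assumed status label such as "active" or "retired".
    other_cas: dict = field(default_factory=dict)

    def matches_name(self, query: str) -> bool:
        """Case-insensitive match against the systematic name or any synonym."""
        q = query.strip().lower()
        names = [self.systematic_name, *self.synonyms]
        return any(q == n.lower() for n in names if n)


record = SubstanceRecord(
    cas="50-00-0",
    systematic_name="Formaldehyde",
    synonyms=["Methanal"],
)
print(record.matches_name("methanal"))
```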

	Operationally, the process of structure verification is conveniently
done within a spreadsheet format; therefore, if a list is presented in
text form it is first transferred to a spreadsheet where columns of new
information may be added and groups may be easily sorted.
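
	The transfer from text to spreadsheet could be sketched as follows.
The example assumes a tab-separated text list with one "CAS#, name" pair
per line and writes CSV that a spreadsheet program can open. The column
names and function names are hypothetical.

```python
import csv
import io

# Hypothetical column layout; extra columns are left blank for later entry.
COLUMNS = ["cas", "name", "category", "smiles",
           "verified_by", "verified_on", "source"]


def text_list_to_rows(text):
    """Turn lines of 'CAS#<TAB>name' into dictionaries with blank extra columns."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        cas, _, name = line.partition("\t")
        row = dict.fromkeys(COLUMNS, "")
        row["cas"] = cas.strip()
        row["name"] = name.strip()
        rows.append(row)
    return rows


def sort_by_category(rows):
    """Group rows by category so that each group can be reviewed together."""
    return sorted(rows, key=lambda r: r["category"])


def rows_to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = text_list_to_rows("50-00-0\tFormaldehyde\n7732-18-5\tWater\n")
print(rows_to_csv(sort_by_category(rows)))
```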

PROCEDURE:	Upon receipt of a new list of chemicals or chemical
categories with associated CAS#s:



	1)	Look for duplicate CAS# entries and invalid CAS#s and remove them
from the list

	2) 	Check whether substance (via CAS#) has been entered/verified
previously

	3)	Sort substance/chemical into one of the following four categories:

			a) discrete chemical (organic, inorganic, organometallic)

			b) mixture (defined)

			c) polymer

			d) undefined substance

based on information obtained from a CAS# search at various free
web sites:

Chemfinder
http://chemfinder.cambridgesoft.com/

NIST Chemistry Webbook
http://webbook.nist.gov/chemistry/cas-ser.html

NIH ChemIDplus
http://chem.sis.nlm.nih.gov/chemidplus/chemidlite.jsp

Dutch Dictionary on Organic Chemistry
http://www.woc.sci.kun.nl/index.en.html

Alan Wood’s Compendium of Pesticide Common Names
http://www.alanwood.net/pesticides/index.html

SIRI MSDS Index
http://hazard.com/msds/

Other available programs with chemical lists, names, structures and
CAS#s may also be used to confirm that a CAS# is correctly linked with
structure and systematic name. We routinely use the following locally
available resources: ASsessment Tools for Evaluation of Risk (ASTER),
ECOTOX, and Centralized Database (Oasis - Database Manager - LMC
Bourgas, Bulgaria - Dr. Ovanes Mekenyan). Other systems, such as the
Distributed Structure-Searchable TOXicity (DSSTOX) website, might also
provide additional information.

If chemical information cannot be found in the various free sources,
commercial sources that may be applicable include, e.g., Scifinder, STN
Easy, STN International, and the Chemical Abstracts Service (CAS)
pay-for-service.

During the sort process, the match between chemical names and reported
CAS#s is verified.



	4)	Search the CAS#s of all entries on the chemical list in either or
both of:

Centralized Database (Oasis - Database Manager - LMC Bourgas, Bulgaria -
Dr. Ovanes Mekenyan) http://www.oasis-lmc.org/

ASTER program - US EPA, Duluth, MN USA - Chris Russom

to obtain additional verification information (name, CAS#) and SMILES
representation for the discrete chemicals. The SMILES obtained are then
verified against structures obtained from the various electronic
sources listed above. SMILES strings are written for the discrete
chemicals not found in these databases.

	5)	Ideally, a second independent verification of CAS#, name, and
structure (SMILES) should be done, in keeping with good QA/QC practice.
Record initials/date/verification source in the spreadsheet as part of
the documentation:

initial entry

an independent verification, using a different information source than
the initial entry if possible

any discrepancies should be verified a third time
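
Two of the bookkeeping steps above, finding duplicate CAS# entries
(step 1) and flagging disagreements between the initial entry and the
independent verification (step 5), could be sketched as follows. The
function and field names are hypothetical. The comparison is purely
textual, so two different but equivalent SMILES strings for the same
structure will be flagged; that is intended here, since a flag only
sends the entry to a person for the third check.

```python
from collections import Counter

# Hypothetical field names for the items that are verified twice.
FIELDS = ("cas", "name", "smiles")


def find_duplicates(cas_numbers):
    """Return the CAS#s that appear more than once, ignoring stray whitespace."""
    counts = Counter(c.strip() for c in cas_numbers)
    return sorted(c for c, n in counts.items() if n > 1)


def discrepancies(initial, independent):
    """Return the fields on which the two verification passes disagree."""
    return [
        f for f in FIELDS
        if initial.get(f, "").strip() != independent.get(f, "").strip()
    ]


first = {"cas": "50-00-0", "name": "Formaldehyde", "smiles": "C=O"}
second = {"cas": "50-00-0", "name": "Formaldehyde", "smiles": "O=C"}

print(find_duplicates(["50-00-0", "7732-18-5", "50-00-0 "]))
print(discrepancies(first, second))   # fields needing a third verification
```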

											

