Description:

How to access the "Spectral Similarity Search" from your own
"Graphical User Interface (GUI)" or from another program




KEYWORD | Value | Description & Examples | Given as | POST-Command Keyword/Value-pairs
EMAIL:
  Value: Email address where results will be sent
  Example: John.Doe@examples.com
  Line with this keyword must occur exactly once
  Given as: EMAIL:John.Doe@examples.com
  POST-Command: emailadr=

PROJECT:
  Value: A descriptive project name consisting of up to 2 parts separated by ::
  Example: Structure elucidation::Problem #1
  Second part optional
  Line with this keyword must occur exactly once
  Given as: PROJECT:Structure elucidation::Problem #1
  POST-Command: project=

PEAKLIST:
  Value: A descriptive name of the peaklist consisting of up to 2 parts separated by ::
  Example: Compound-08/15::HSQC-400MHz
  Second part optional
  Line with this keyword must occur exactly once
  Given as: PEAKLIST:Compound-08/15::HSQC-400MHz
  POST-Command: spectrum=

LINE:
  Value: Shiftvalue Multiplicity Deviation

Shiftvalue:
  • Allowed range is -399 to +399 ppm
  • Shiftvalue must be given
  • Up to 99 shiftvalues = 99 lines
  • At least one line must occur

Multiplicity:
  • Optional
  • Only 1JCH taken into account
  • Other couplings (e.g. JCF or JCP) must be ignored
  • Lowercase or uppercase allowed
  • Allowed values are

    Value   Meaning
    ?       Unknown = C or CH or CH2 or CH3
    S/s     Singlet = C
    D/d     Doublet = CH
    T/t     Triplet = CH2
    Q/q     Quartet = CH3
    O/o     Odd = C or CH2 or CH3
    E/e     Even = CH or CH3
    P/p     Protonated = CH or CH2 or CH3

Deviation:
  • Optional
  • Allowed values from 1.0 to 5.0ppm
  • Recommended value: 3.0ppm
  • Default-value, when missing: 3.0ppm
  • Interval to be searched: Shiftvalue +/- Deviation
Examples:

Line at 125.3ppm, multiplicity unknown
LINE: 125.3

Doublet at 190.2ppm
LINE: 190.2 D

Protonated carbon at 70.8ppm
LINE: 70.8 p

Protonated carbon at 70.8ppm
Interval to be searched: 68.0 to 73.6ppm
LINE: 70.8 p 2.8

Search for lines between 15 and 20ppm
Any multiplicity
LINE: 17.5 2.5

Search for a triplet between 100 and 104ppm
LINE: 102 t 2

Given as:
LINE: 125.3
LINE: 190.2 D
LINE: 70.8 p
LINE: 70.8 p 2.8
LINE: 17.5 2.5
LINE: 102 t 2

POST-Command: peaklist=

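The LINE rules above can be checked in a GUI before the datafile is written. The following is a minimal sketch in Python, not part of the service itself: the helper names `format_line` and `search_interval` are hypothetical, and the checks only encode the limits stated above (shift range, allowed multiplicity codes, deviation range, default deviation of 3.0ppm).

```python
ALLOWED_MULTIPLICITIES = set("?SDTQOEP")  # codes from the table above (uppercase)
DEFAULT_DEVIATION = 3.0                   # ppm, applied when no deviation is given


def format_line(shift, multiplicity=None, deviation=None):
    """Return one 'LINE:' entry after checking the documented limits."""
    if not -399 <= shift <= 399:
        raise ValueError("shift value must be within -399 to +399 ppm")
    parts = [f"{shift:g}"]
    if multiplicity is not None:
        # Lowercase and uppercase are both allowed, so compare in uppercase.
        if multiplicity.upper() not in ALLOWED_MULTIPLICITIES:
            raise ValueError(f"unknown multiplicity: {multiplicity!r}")
        parts.append(multiplicity)
    if deviation is not None:
        if not 1.0 <= deviation <= 5.0:
            raise ValueError("deviation must be within 1.0 to 5.0 ppm")
        parts.append(f"{deviation:g}")
    return "LINE: " + " ".join(parts)


def search_interval(shift, deviation=DEFAULT_DEVIATION):
    """Interval to be searched: shift value +/- deviation (rounded for display)."""
    return (round(shift - deviation, 4), round(shift + deviation, 4))


print(format_line(70.8, "p", 2.8))   # LINE: 70.8 p 2.8
print(search_interval(70.8, 2.8))    # (68.0, 73.6)
```

Rejecting out-of-range values on the client side is a design choice of this sketch; how the service itself reacts to such values is not described here.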
ELEMENTS: Sequence: N, O, P, S, F, Cl, Br, I, other

Value   Meaning
0       Element must be absent
1       Element may be absent or may be present
2       Element must be present

  • Make use of this restriction
  • Presence/Absence of F/P can be seen in NMR
  • Presence/Absence of Cl/Br can be seen in MS
  • Default-setting is: 111111110 = N, O, P, S and the halogens may be present, all other elements are forbidden
  • Write the string without blanks
  • This line may be omitted, but this is not recommended!
Examples:

Compound contains oxygen and maybe nitrogen, but no other element (except C and H)
ELEMENTS:120000000

Halogenated hydrocarbon
ELEMENTS:000011110
N, O, P, S and all other elements excluded, any halogen possible
Result may also include non-halogenated hydrocarbons!

Given as:
ELEMENTS:120000000
ELEMENTS:000011110
ELEMENTS:111111110 (default)

POST-Command:
nelem=
oelem=
pelem=
selem=
felem=
clelem=
brelem=
ielem=
xelem=

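Because the ELEMENTS string is positional, it is easy to put a digit in the wrong place when writing it by hand. As a minimal sketch (the function name `elements_line` and its arguments are hypothetical, chosen for illustration), the string can be derived from lists of required and optional elements using the sequence N, O, P, S, F, Cl, Br, I, other:

```python
ELEMENT_ORDER = ["N", "O", "P", "S", "F", "Cl", "Br", "I", "other"]
DEFAULT_ELEMENTS_LINE = "ELEMENTS:111111110"  # default setting stated above


def elements_line(required=(), optional=()):
    """Build an ELEMENTS line; every element not listed is marked absent (0)."""
    required, optional = set(required), set(optional)
    unknown = (required | optional) - set(ELEMENT_ORDER)
    if unknown:
        raise ValueError(f"unknown element names: {sorted(unknown)}")
    if required & optional:
        raise ValueError("an element cannot be both required and optional")
    digits = []
    for element in ELEMENT_ORDER:
        if element in required:
            digits.append("2")   # element must be present
        elif element in optional:
            digits.append("1")   # element may be absent or present
        else:
            digits.append("0")   # element must be absent
    return "ELEMENTS:" + "".join(digits)  # no blanks inside the string


print(elements_line(required=["O"], optional=["N"]))        # ELEMENTS:120000000
print(elements_line(optional=["F", "Cl", "Br", "I"]))       # ELEMENTS:000011110
```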
MASS:
  • Limit display of reference compounds by range for molecular mass
  • Minimum mass range: 10 amu; if a smaller range is given, it will be expanded
  • Usual sequence: lower mass limit, then upper mass limit; if given in the wrong order, the limits will be exchanged
  • Recommendation: Allow e.g. +/-15 in order to get the next lower/higher homologue
  • Recommendation: Allow e.g. +/-45 in order to get acetylated/non-acetylated derivatives
  • Make use of this restriction
  • Default-setting: Mass range from 12 to 9999 amu
Example:

You are only interested in reference compounds between 110 and 150 amu
MASS: 110 150

Given as:
MASS: 110 150

POST-Command:
mwmin=
mwmax=

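A GUI can normalize the mass range itself so that the user sees the limits that will actually be used. The sketch below is only an illustration: `mass_line` is a hypothetical helper, and widening a too-narrow range symmetrically around its midpoint is an assumption, since the way the service expands the range is not specified above.

```python
MIN_MASS_RANGE = 10  # amu, minimum mass range stated above


def mass_line(limit_a, limit_b):
    """Return a MASS line with ordered limits spanning at least 10 amu."""
    low, high = sorted((limit_a, limit_b))  # exchange limits given in wrong order
    if high - low < MIN_MASS_RANGE:
        # Assumption: widen symmetrically around the midpoint of the range.
        middle = (low + high) / 2
        low = middle - MIN_MASS_RANGE / 2
        high = middle + MIN_MASS_RANGE / 2
    return f"MASS: {low:g} {high:g}"


print(mass_line(110, 150))  # MASS: 110 150
print(mass_line(150, 110))  # MASS: 110 150
print(mass_line(120, 124))  # MASS: 117 127
```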
SIGNALS:
  • Limit result by number of signals in the
    reference compounds (=compounds retrieved
    from the database)
  • Number of given lines in your query peaklist should be
    within the range selected
  • A wrong selection of this range will be automatically corrected
  • Make use of this restriction
  • Default-setting: Any reference compound having 1 to 99 lines
Examples:

Your query-peaklist holds 10 lines:
  • A useful restriction might be: Show all compounds having between 8 and 15 lines
  • SIGNALS:8 15
  • Invalid selection: Show all compounds having between 15 and 25 lines

Your query-peaklist holds 27 lines:
  • A useful restriction might be: Show all compounds having between 22 and 35 lines
  • SIGNALS: 22 35
  • Invalid selection: Show all compounds having between 30 and 32 lines

Given as:
SIGNALS:8 15
SIGNALS: 22 35

POST-Command:
sigmin=
sigmax=

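The rule that the number of lines in the query peaklist should lie within the selected range can be checked before submission. A minimal sketch, assuming a hypothetical helper `signals_line`; raising an error on the client side is a choice made here, whereas the service is described above as correcting a wrong selection automatically.

```python
def signals_line(query_line_count, minimum, maximum):
    """Return a SIGNALS line if the query's line count lies inside the range."""
    low, high = sorted((minimum, maximum))
    if not low <= query_line_count <= high:
        raise ValueError(
            f"query holds {query_line_count} lines, "
            f"which is outside the range {low} to {high}"
        )
    return f"SIGNALS:{low} {high}"


print(signals_line(10, 8, 15))    # SIGNALS:8 15
print(signals_line(27, 22, 35))   # SIGNALS:22 35
```

With the invalid selections from the examples above, `signals_line(10, 15, 25)` and `signals_line(27, 30, 32)` both raise a `ValueError`.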
PROCESS: Define 2 parameters named "free" and "exclude"

Parameter   Value         Description
free        0             Reference spectra are allowed to have additional lines in the areas where the query peaklist has no lines
free        1 (default)   Reference spectra have no lines in the line-free areas of the query spectrum; the overall pattern is more similar
exclude     0             Apply restrictions during display; entries violating the constraints are also displayed (shown in a different color)
exclude     1 (default)   Apply restrictions already during search; entries violating the constraints are not shown

PROCESS:"free""exclude"

PROCESS:01
  • Allow additional lines in reference spectra
  • Apply given constraints already during search

PROCESS:11
  • Default-setting

Given as:
PROCESS:11

POST-Command:
free=
exclude=

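In a GUI the two PROCESS parameters map naturally onto two checkboxes. As a small sketch (the function name `process_line` is hypothetical), the two digits are written in the order "free", then "exclude", following the parameter table above:

```python
def process_line(free=True, exclude=True):
    """Return the PROCESS line; both parameters default to 1 as stated above.

    free=True:    line-free areas of the query must also be line-free
                  in the reference spectra (free=1)
    exclude=True: restrictions are applied already during the search (exclude=1)
    """
    return f"PROCESS:{int(bool(free))}{int(bool(exclude))}"


print(process_line())             # PROCESS:11
print(process_line(free=False))   # PROCESS:01
```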
Additional parameters only necessary when using the POST-command directly
Parameters must be there exactly as given here
action=search
Submit=Search
Additional parameter only necessary when using the POST-command directly
Parameter must be there exactly as given here

  • Value must be 32 bytes long
  • Allowed characters are: 0 ... 9 and a ... f
  • Value must be unique
  • Create it using either "md5sum" or "uuidgen" (with "uuidgen", omit the "-" characters)
  • As starting value use e.g. the current time (microsecond/nanosecond resolution) combined e.g. with your IP-address/CPU-ID

POST-Command: result=
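The same kind of value that "uuidgen" or "md5sum" produce on the command line can be generated from within a program. The sketch below shows two variants using only the Python standard library; the function names are hypothetical, and the host name and hardware node number are used here merely as stand-ins for the IP-address/CPU-ID suggested above.

```python
import hashlib
import socket
import time
import uuid


def result_token_uuid():
    """32 lowercase hex characters from a random UUID, hyphens omitted."""
    return uuid.uuid4().hex


def result_token_md5():
    """32 lowercase hex characters: MD5 of nanosecond time plus machine data."""
    # Assumption: host name and node number stand in for IP-address/CPU-ID.
    seed = f"{time.time_ns()}-{socket.gethostname()}-{uuid.getnode()}"
    return hashlib.md5(seed.encode("utf-8")).hexdigest()


token = result_token_uuid()
print(len(token), token)
```

The random UUID variant does not depend on the clock resolution, so it is less likely to produce the same value twice when several requests are created in quick succession.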


Example:



Your email address is: test@test123.abc.info
Your project is:       Test#1::Dereplication-engine
Your peaklist:         Propionic_acid,ethylester::400MHz

Your spectral data:      174 S
                          60 T 2.5
                          27
                          14 q
                           9 p

Your restrictions:       Range for molecular weight: 85 to 120 amu

Elements:                Oxygen must be present, all other elements are forbidden

Number of signals in reference compounds:    3 to 7

Line-free areas in the query peaklist should also be line-free in the reference spectra

Restrictions should be applied already during search

Entering these data into your GUI should generate the following datafile:



EMAIL:test@test123.abc.info
PROJECT:Test#1::Dereplication-engine
PEAKLIST:Propionic_acid,ethylester::400MHz
LINE:174 S
LINE:60 T 2.5
LINE:27
LINE:14 q
LINE:9 p
ELEMENTS:020000000
MASS: 85 120
SIGNALS:3 7
PROCESS:11
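
As an illustration of how a GUI might produce this datafile, here is a minimal Python sketch. The function `build_datafile` and its argument names are hypothetical; the keyword order and the spacing after each colon simply follow the example above, and whether the service accepts other orders is not stated here.

```python
def build_datafile(email, project, peaklist, lines, elements, mass, signals,
                   process="11"):
    """Assemble the datafile text, one keyword per line, as in the example."""
    if not 1 <= len(lines) <= 99:
        raise ValueError("between 1 and 99 LINE entries are required")
    rows = [
        f"EMAIL:{email}",
        f"PROJECT:{project}",
        f"PEAKLIST:{peaklist}",
    ]
    rows += [f"LINE:{entry}" for entry in lines]
    rows += [
        f"ELEMENTS:{elements}",
        f"MASS: {mass[0]} {mass[1]}",
        f"SIGNALS:{signals[0]} {signals[1]}",
        f"PROCESS:{process}",
    ]
    return "\n".join(rows) + "\n"


datafile = build_datafile(
    email="test@test123.abc.info",
    project="Test#1::Dereplication-engine",
    peaklist="Propionic_acid,ethylester::400MHz",
    lines=["174 S", "60 T 2.5", "27", "14 q", "9 p"],
    elements="020000000",
    mass=(85, 120),
    signals=(3, 7),
)
print(datafile, end="")
```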

When you have this file ready, I will provide a shell-script to submit
the POST-request to the "Dereplication Engine"
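
For programs that want to use the POST-command directly, the keyword/value pairs listed in the table can be collected into a URL-encoded body. The following Python sketch only builds that body and does not send anything, because the address of the service is not given on this page. The function `post_body` is hypothetical, and the value formats are assumptions: the peaklist is assumed to hold one line per row (the text after "LINE:"), and each element field is assumed to hold its single digit 0, 1 or 2.

```python
from urllib.parse import urlencode

ELEMENT_FIELDS = ["nelem", "oelem", "pelem", "selem", "felem",
                  "clelem", "brelem", "ielem", "xelem"]


def post_body(email, project, spectrum, lines, elements, mass, signals,
              free, exclude, result):
    """Return a URL-encoded body built from the POST keywords in the table."""
    if len(elements) != len(ELEMENT_FIELDS):
        raise ValueError("elements must hold exactly 9 digits")
    fields = {
        "emailadr": email,
        "project": project,
        "spectrum": spectrum,
        # Assumption: one peak per row, same content as after "LINE:".
        "peaklist": "\n".join(lines),
    }
    # Assumption: each element field carries its own digit (0, 1 or 2).
    fields.update(zip(ELEMENT_FIELDS, elements))
    fields.update({
        "mwmin": mass[0], "mwmax": mass[1],
        "sigmin": signals[0], "sigmax": signals[1],
        "free": int(free), "exclude": int(exclude),
        "action": "search",   # must be given exactly like this
        "Submit": "Search",   # must be given exactly like this
        "result": result,     # unique 32-character hex value
    })
    return urlencode(fields)
```

The resulting string could then be handed to any HTTP client as the body of a form submission, once the target address and the exact value formats have been confirmed.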


The data given above were used on 19-May-2021 (523,795,172 reference spectra)
to generate the following results

For complete result - see here






Page written by WR on: May 19th, 2021
Last update: May 26th, 2021
Page online since: May 19th, 2021