                                mcsbenchmark.py

Important URLs:
  - fmcs is an MCS program from https://bitbucket.org/dalke/fmcs
  - It uses the RDKit toolkit. Get it from http://rdkit.org/
  - Indigo is a toolkit with built-in MCS support. Get it from
       http://ggasoftware.com/opensource/indigo
  - chemfp is a Python library for fingerprint-based similarity
     searches. Get it from http://code.google.com/p/chem-fingerprints/


Running a benchmark
===================

This is a command-line tool to generate and run MCS benchmarks. It's
based around an "MCSB" ("MCS Benchmark") file. Here are the contents of
"example.mcsb":

#MCS-Benchmark/1
#File chembl13_knearest_2.smi
16 CHEMBL173535 CHEMBL368519

The first line describes the format. A line starting with "#File"
states the filename containing the structures. Lines which do not
start with '#' describe the MCS test to carry out. In this case, line
3 says that test "16" (it's an arbitrary label) is a comparison
between CHEMBL173535 and CHEMBL368519.


To run this benchmark using fmcs (this assumes that fmcs and RDKit are
already installed):

   python mcsbench.py fmcs example.mcsb

The output should be similar to:

#MCS-Benchmark-Output/1
#software fmcs/1.0b1 RDKit/2011.12.1pre
#options atom-compare=elements bond-compare=bondtypes min-num-atoms=2
#timestamp 2012-05-09T00:24:25.790080
#File chembl13_knearest_2.smi
#    Loaded 2000 structures.
#  Using CHEMBL173535 CHEMBL368519
16 . 1 17 19 0.11 [#6]-[#6]-[#6]-1-[#6]-[#6]:2:[#6](:[#7]:[#6]:3:[#7]:[#6]:[#7]:[#7]:3:[#6]:2-[#7]-1-[#6]-[#6])-[#6]
#  
#           Summary
#  
#  Total 1 in 0.11 seconds (9.3/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Complete 1 in 0.11 seconds (9.3/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Incomplete 0 in 0.00 seconds (N/A/second)
#      0 atoms 0 bonds; average N/A atoms N/A bonds
#  Fail 0 in 0.00 seconds (average N/A sec)
#  Kill 0 in 0.00 seconds (average N/A sec)

The first line describes the format. The "#software" line describes
the software used to compute the MCS and the "#options" line describes
the specific MCS search parameters. The "#timestamp" line shows when
the benchmark was run. You can see I work late hours.

Lines which do not start with a '#' contain the MCS results. The only
result here is for test "16", which is the same label as the
input. The other fields are space separated, and broken down as:

16 . 1 17 19 0.11 [#6]-[#6]-[#6]-1-[#6]-[#6]:2:[#6](:[#7]:[#6]:3:[#7]:[#6]:[#7]:[#7]:3:[#6]:2-[#7]-1-[#6]-[#6])-[#6]
                  ^^^^^^^^^^^------ description; usually a SMILES or SMARTS pattern
             ^^^^----- MCS calculation time, in seconds
          ^^------ number of bonds in the MCS
       ^^------ number of atoms in the MCS
     ^------ number of fragments in the MCS
   ^----- status code; one of '.', 'I', 'F', 'X'
^----- the label (matches the input)

The status codes mean:
  .  = The algorithm completed and returned its best match
  I  = The algorithm gave an incomplete result, because it timed out
  F  = The algorithm failed to give a result, because it timed out
  X  = The algorithm failed to exit after the timeout was reached

If the status code is 'F' or 'X' then the number of fragments, atoms,
and bonds are -1 and the description should be '-'.

If the status code is '.' or 'I' but no MCS is found then the number
of fragments, atoms, and bonds is 0 and the description should be '-'.

Neither fmcs nor Indigo supports disconnected MCSes so the number of
fragments they report will never be larger than 1. This field is there
to support MCS algorithms which find disconnected MCSes.
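
The field breakdown above can be sketched as a small parser. This is
only an illustration and is not code from mcsbench.py; the names
"MCSResult" and "parse_result" are made up for this example. It
assumes the first six fields contain no spaces and treats everything
after the sixth space as the description.

```python
from collections import namedtuple

# Hypothetical container for one result line; not part of mcsbench.py.
MCSResult = namedtuple(
    "MCSResult",
    "label status num_fragments num_atoms num_bonds time description")

VALID_STATUS_CODES = (".", "I", "F", "X")

def parse_result(line):
    """Parse one MCS result line into an MCSResult."""
    fields = line.rstrip("\r\n").split(" ", 6)
    if len(fields) != 7:
        raise ValueError("expected 7 fields: %r" % (line,))
    label, status, num_fragments, num_atoms, num_bonds, seconds, description = fields
    if status not in VALID_STATUS_CODES:
        raise ValueError("unknown status code: %r" % (status,))
    return MCSResult(label, status, int(num_fragments), int(num_atoms),
                     int(num_bonds), float(seconds), description)
```

For an 'F' or 'X' result the three counts come back as -1 and the
description as '-', exactly as they appear in the line.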



Similarly, the benchmark supports Indigo's 'exact' and 'approx'
extractCommonScaffold implementations. Here's the basic command-line
for 'exact':

  python mcsbench.py indigo-exact example.mcsb

The output should be similar to:

#MCS-Benchmark-Output/1
#software Indigo/1.1-rc extractCommonScaffold
#options method=exact maximize=atoms atom-compare=elements bond-compare=bondtypes aromatize=True fold_hydrogens=True
#timestamp 2012-05-09T00:31:10.073172
#File chembl13_knearest_2.smi
#    Loaded 2000 structures.
#  Using CHEMBL173535 CHEMBL368519
16 . 1 17 19 0.02 CCN1C2:C(:C(:N:C3:N:2:N:C:N:3)C)CC1CC
#  
#           Summary
#  
#  Total 1 in 0.02 seconds (63.8/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Complete 1 in 0.02 seconds (63.8/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Incomplete 0 in 0.00 seconds (N/A/second)
#      0 atoms 0 bonds; average N/A atoms N/A bonds
#  Fail 0 in 0.00 seconds (average N/A sec)
#  Kill 0 in 0.00 seconds (average N/A sec)


Here's an example of the "approx" method

  python mcsbench.py indigo-approx example.mcsb

and its corresponding output should look similar to:

#MCS-Benchmark-Output/1
#software Indigo/1.1-rc extractCommonScaffold
#options method=approx iterations=1000 maximize=atoms atom-compare=elements bond-compare=bondtypes aromatize=True fold_hydrogens=True
#timestamp 2012-05-09T00:31:14.550301
#File chembl13_knearest_2.smi
#    Loaded 2000 structures.
#  Using CHEMBL173535 CHEMBL368519
16 . 1 17 19 0.01 CCC1N(CC)C2:C(:C(:N:C3:N:2:N:C:N:3)C)C1
#  
#           Summary
#  
#  Total 1 in 0.01 seconds (100.0/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Complete 1 in 0.01 seconds (100.0/second)
#      17 atoms 19 bonds; average 17.0 atoms 19.0 bonds
#  Incomplete 0 in 0.00 seconds (N/A/second)
#      0 atoms 0 bonds; average N/A atoms N/A bonds
#  Fail 0 in 0.00 seconds (average N/A sec)
#  Kill 0 in 0.00 seconds (average N/A sec)



Command-line options
====================

The fmcs, indigo-exact, and indigo-approx subcommands support many
command-line options. Use "--help" to see them. Here are the fmcs
options:

% python mcsbench.py fmcs --help
usage: mcsbench.py fmcs [-h] [--maximize {atoms,bonds}]
                        [--atom-compare {any,elements,isotopes}]
                        [--bond-compare {any,ignore-aromaticity,bondtypes}]
                        [--min-num-atoms INT] [--ring-matches-ring-only]
                        [--complete-rings-only] [--timeout SECONDS]
                        [--min-time SECONDS] [--max-time SECONDS] [--lazy]
                        [--output-format {mcs-output,mcs-benchmark}]
                        [--client] [--progress] [--verbose]
                        [mcsb_filename]

positional arguments:
  mcsb_filename

optional arguments:
  -h, --help            show this help message and exit
  --maximize {atoms,bonds}
                        Should the MCS maximize the number of atoms or the
                        number of bonds?
  --atom-compare {any,elements,isotopes}
                        Specify the atom comparison method. With 'any', every
                        atom matches every other atom. With 'elements', atoms
                        match only if they contain the same element. With
                        'isotopes', atoms match only if they have the same
                        isotope number; element information is ignored so [5C]
                        and [5P] are identical. (Default: elements)
  --bond-compare {any,ignore-aromaticity,bondtypes}
                        Specify the bond comparison method. With 'any', every
                        bond matches every other bond. With 'ignore-
                        aromaticity', aromatic bonds match single, aromatic,
                        and double bonds, but no other types match each other.
                        With 'bondtypes', bonds are the same only if their
                        bond type is the same. (Default: bondtypes)
  --min-num-atoms INT   Minimum number of atoms in the MCS. Must be at least
                        2. (Default: 2)
  --ring-matches-ring-only
                        Modify the bond comparison so that ring bonds only
                        match ring bonds and chain bonds only match chain
                        bonds. (Ring atoms can still match non-ring atoms.)
  --complete-rings-only
                        If a bond is a ring bond in the input structures and a
                        bond is in the MCS then the bond must also be in a
                        ring in the MCS. Selecting this option also enables
                        --ring-matches-ring-only.
  --timeout SECONDS     Quit the MCS calculation after SECONDS seconds
  --min-time SECONDS    Do not report searches taking less than SECONDS
                        seconds
  --max-time SECONDS    Do not report searches taking more than SECONDS
                        seconds
  --lazy                Do not parse the structure records until needed
  --output-format {mcs-output,mcs-benchmark}
  --client              Enable experimental client protocol
  --progress            Write partial progress information to the benchmark
                        output file
  --verbose             Write status and summary information to stderr


Most of these should be self-explanatory. Many of them are shared
between the fmcs and Indigo MCS methods. The "Example: find slow
MCSes" section near the end of this README uses --min-time to show
those MCSes which take a long time to compute.

Do note that the mcsb_filename is optional. If not specified then the
MCS benchmark will be read from stdin.
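
Going by the help text above, --min-time and --max-time act as a
filter on which searches get reported. A minimal sketch of that
filter, assuming both limits are inclusive (the help text says "less
than" and "more than"); the function name "should_report" is made up
for this example and is not taken from mcsbench.py:

```python
def should_report(seconds, min_time=None, max_time=None):
    """Return True if a search which took `seconds` should be reported."""
    if min_time is not None and seconds < min_time:
        return False  # faster than --min-time
    if max_time is not None and seconds > max_time:
        return False  # slower than --max-time
    return True
```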


The --atom-compare, --bond-compare, --min-num-atoms,
--ring-matches-ring-only, and --complete-rings-only options are
fmcs-specific.


The Indigo subcommands implement these two special options.

  --no-aromatize        Don't reperceive input aromaticity
  --no-fold-hydrogens   Don't remove hydrogens from the input structure

(By default, the Indigo implementation reperceives aromaticity and
turns explicit hydrogens into implicit ones. These options disable
those transformations.)


The indigo-approx subcommand also implements:

  --iterations N        Stop the search after N iterations (Default: 1000)



MCSB Format specification
=========================

The format is line oriented. Both "\r\n" and "\n" newline conventions
are supported.

The first line must be "#MCS-Benchmark/1".

MCS requests
------------

An MCS request line starts with a character which isn't a '#' and
isn't whitespace. It contains non-whitespace fields separated by a
single space character. The first field is a unique label. If there
are only two fields and the second field is "all" then the MCS should
be done using all of the structures in the most recently specified
file. Otherwise there must be at least three fields, and fields 2, 3,
etc are identifiers for the structures to use in computing the MCS.


Examples:

1 all
8 CHEMBL348206 CHEMBL151444
22 CHEMBL1541444 CHEMBL1471382 CHEMBL1403443 CHEMBL1462010
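
The rules for request lines can be sketched in a few lines of Python.
This is an illustration of the specification above, not the parser
used by mcsbench.py; the name "parse_request" and the choice to return
None for "all" are assumptions made for this example.

```python
def parse_request(line):
    """Split an MCS request line into (label, ids).

    `ids` is None when the request is for "all" structures in the
    most recently specified file.
    """
    line = line.rstrip("\r\n")
    if not line or line[0] == "#" or line[0].isspace():
        raise ValueError("not an MCS request line: %r" % (line,))
    fields = line.split(" ")
    if "" in fields:
        raise ValueError("fields must be separated by a single space")
    label = fields[0]
    if len(fields) == 2 and fields[1] == "all":
        return label, None
    if len(fields) < 3:
        raise ValueError("need at least two structure identifiers")
    return label, fields[1:]
```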


Other lines
-----------

Lines starting with '#' come in one of several forms:

  - if the character in the second column is [A-Z] then it is a
     required line, and implementations must understand it or exit
     with a failure. See below for this list.

  - if the character in the second column is [a-z] then it is an
     optional line, and implementations may ignore it. (For example,
     this may be used to pass implementation-specific options to a
     program, or specify an alternative location for the structure
     data.) There are no current examples.

  - if the character in the second column is '#' then it is a
     progress line and should be ignored. Example:
     ## Total: 25/51.2s (0.5/s) Complete: 24/21.1s (1.1/s) 

  - if the character in the second column is a space then it is
     a comment line and should be forwarded to the output. Example:
     # Test cases found using 10-nearest Tanimoto search with threshold 0.9
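
The four cases above depend only on the first two characters of the
line, so a reader can dispatch on them. A minimal sketch; the function
name "classify_line" and the category strings it returns are made up
for this example:

```python
def classify_line(line):
    """Classify a line of an MCSB file by its first two characters."""
    first, second = line[:1], line[1:2]
    if first != "#":
        # Request lines start with something other than '#' or whitespace.
        if first and not first.isspace():
            return "request"
        return "unknown"
    if "A" <= second <= "Z":
        return "required"   # must be understood, or exit with a failure
    if "a" <= second <= "z":
        return "optional"   # may be ignored
    if second == "#":
        return "progress"   # should be ignored
    if second == " ":
        return "comment"    # should be forwarded to the output
    return "unknown"
```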


In the following, phrases of the form 'starting with "#File"' mean
'starting with "#File "'. That is, there is a single space character
after the specified word.

#File
-----

Lines starting with "#File" contain the filename for the structures
used in the MCS requests which follow the "#File" line. The filename
starts in column 6 and goes to the end of the line. It is utf-8
encoded. SMILES filenames should end in ".smi" or ".smi.gz" and SD
filenames must end with ".sdf" or ".sdf.gz". (Other suffixes are
supported, but should not be used.)

There can be multiple #File lines. In general an implementation can
assume that all structures will be used, so may read and parse all the
structures when the #File line is reached. An implementation may
implement a lazy reader, which does not parse the structure record
until needed. The fmcs and indigo SMILES readers currently support a
lazy reader if the --lazy option is enabled.

Examples:

#File smiles_dataset.smi
#File input.sdf.gz
#File contains a space.sdf
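
A reader needs to pull the filename out of the "#File" line and decide
which kind of structure file it names. Here is a sketch based on the
suffix rules above. The function names are made up for this example,
the suffix check is case-sensitive (an assumption), and the other
suffixes which the text says are supported are not handled.

```python
def parse_file_line(line):
    """Return the filename from a '#File ' line, spaces included."""
    line = line.rstrip("\r\n")
    prefix = "#File "
    if not line.startswith(prefix):
        raise ValueError("not a #File line: %r" % (line,))
    return line[len(prefix):]

def guess_format(filename):
    """Return 'smi', 'sdf', or None, based on the filename suffix."""
    if filename.endswith(".gz"):
        filename = filename[:-len(".gz")]
    if filename.endswith(".smi"):
        return "smi"
    if filename.endswith(".sdf"):
        return "sdf"
    return None
```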


#Id-tag
-------

By default the structure identifier comes from the first line of an SD
record or from the second column of a SMILES file. Some SD files (like
ChEMBL and ChEBI releases) contain the identifier in an SD tag and not
in the title line. In that case, use a line starting with "#Id-tag" to
specify the tag which contains the identifier. This must occur before
the "#File" line.

Example:

#Id-tag chembl_id


MCS Benchmark output format
===========================

The MCS search programs generate output using a similar line-oriented
format. Again, lines starting with "#" + uppercase letter must be
understood by a reader, lines starting with "#" + lowercase letter are
optional, lines starting "##" are progress lines and should be
ignored, and lines starting with "# " are comment-lines meant for
humans to read.

The only required lines are "#File" and "#Id-tag", with the same
meanings as before.

#MCS-Benchmark-Output/1
-----------------------

This indicates the format type and version number for the output
format. It must be the first line of the output.

#software
---------

The line starting with "#software" describes the software components
and versions which went into the MCS calculation and which may have an
effect on the result. It must be the second line of the output.

The components are space separated. In general, each component term
should be the name of the component, a '/', and the version. But
notice how I use "extractCommonScaffold" since I wasn't sure that just
saying "Indigo" was clue enough.

Examples:

#software fmcs/1.0b1 RDKit/2011.12.1pre
#software Indigo/1.1-rc extractCommonScaffold


#options
--------

The line starting with "#options" describes how the MCS searches are
done. It should use a space-separated list of the following terms,
where possible:

 One of these is required:
  atom-compare=any        Any atom matches any other atom
  atom-compare=elements   Atoms match based on element number

 One of these is required:
  bond-compare=any        Any bond matches any other bond
  bond-compare=bondtypes  Bonds match based on bond type
  bond-compare=ignore-aromaticity  Aromatic bonds match single and double bonds

 At most one of these is allowed; "False" is the default case:
  ring-matches-ring-only=True  Ring bonds are only allowed to match ring
                          bonds and chain bonds may only match chain bonds.
  ring-matches-ring-only=False  Ring bonds may match chain bonds.


 At most one of these is allowed; "False" is the default case:
  complete-rings-only=True  If a bond in the MCS corresponds to a ring
                          bond in an input structure then the bond in
                          the MCS must also be in a ring in the MCS.
                  Note: this always implies ring-matches-ring-only=True

  complete-rings-only=False  Partial ring MCSes are allowed


 At most one of these is allowed; "True" is the default case:
  aromatize=True   Aromaticity has been reperceived
  aromatize=False  Aromaticity is as defined in the input record


 At most one of these is allowed; "True" is the default case:
  fold-hydrogens=True   Explicit hydrogens have been made implicit
  fold-hydrogens=False  Explicit hydrogens are left in the structure

    Note: for backwards compatibility reasons, old benchmarks may
    have "fold_hydrogens" instead of "fold-hydrogens"

 If a timeout is specified then this is required:
  timeout=SECONDS   The length of the timeout, in seconds, as a float.
                       At least two decimal places should be supported.

 If there is an outer maximum iteration counter:
  iterations=NUM   The maximum number of iterations to process.

 If there is an ambiguity on the MCS algorithm used:
  method=NAME    For example, "method=exact" and "method=approx" help
                 distinguish between the two extractCommonScaffold
                 implementations.
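
Since every term has the form name=value, the "#options" line is easy
to turn into a dictionary. A minimal sketch, which also maps the old
"fold_hydrogens" spelling to "fold-hydrogens" as described in the note
above. The function name "parse_options" is made up for this example,
and values are left as strings rather than converted.

```python
def parse_options(line):
    """Parse an '#options ' line into a {name: value} dictionary."""
    line = line.rstrip("\r\n")
    prefix = "#options "
    if not line.startswith(prefix):
        raise ValueError("not an #options line: %r" % (line,))
    options = {}
    for term in line[len(prefix):].split():
        name, sep, value = term.partition("=")
        if not sep:
            raise ValueError("term is not of the form name=value: %r" % (term,))
        if name == "fold_hydrogens":
            name = "fold-hydrogens"  # spelling used by old benchmarks
        options[name] = value
    return options
```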


#timestamp
----------

This is an ISO 8601 timestamp. It provides some metadata about when
the program was run. It should resolve down to seconds and may include
subsecond values.

Example:
#timestamp 2012-05-08T15:35:06.197802

Note: The timezone is undefined and presumed to be local time, not
GMT/UTC.
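
A timestamp of this form can be written and read back with Python's
standard datetime module. This sketch assumes Python 3.7 or later for
datetime.fromisoformat(); the function names are made up for this
example. The parsed value is a "naive" datetime, which matches the
undefined-timezone note above.

```python
import datetime

def format_timestamp(when=None):
    """Return a '#timestamp ' line for `when` (default: current local time)."""
    if when is None:
        when = datetime.datetime.now()
    return "#timestamp " + when.isoformat()

def parse_timestamp(line):
    """Return the naive datetime stored in a '#timestamp ' line."""
    prefix = "#timestamp "
    if not line.startswith(prefix):
        raise ValueError("not a #timestamp line: %r" % (line,))
    return datetime.datetime.fromisoformat(line[len(prefix):].strip())
```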

MCS result
----------

The MCS result line was described in full earlier. It looks like:

16 . 1 17 19 0.11 [#6]-[#6]-[#6]-1-[#6]-[#6]:2:[#6](:[#7]:[#6]:3:[#7]:[#6]:[#7]:[#7]:3:[#6]:2-[#7]-1-[#6]-[#6])-[#6]
                  ^^^^^^^^^^^------ description; usually a SMILES or SMARTS pattern
             ^^^^----- MCS calculation time, in seconds
          ^^------ number of bonds in the MCS
       ^^------ number of bonds in the MCS
     ^------ number of fragments in the MCS
   ^----- status code; one of '.', 'I', 'F', 'X'
^----- the label (matches the input)


See above for details.


##Ready - experimental client protocol
--------------------------------------


Some MCS algorithms do not implement a timeout. In that case it's best
to run the program as a separate process, and communicate with it via
two pipes. I want the communications to use the same MCS-Benchmark and
MCS-Benchmark-Output format, but in my experimental implementation I
found that I needed something where the client could tell the server
that it was ready for a new input line.

That is the "##Ready" message. The client sends (and flushes) it to
stdout when it's ready to read a new line.
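
As a rough illustration, here is what the client side of such a loop
might look like. This is a guess at the shape of the protocol and not
the code in mcsbench.py: the text above only states that "##Ready" is
sent and flushed before each read. How '#' lines are handled, the
function names, and the fake MCS function are all assumptions made for
this example.

```python
import io
import sys

def run_client(compute_result, infile=None, outfile=None):
    """Announce '##Ready', read one line, answer it, and repeat until EOF."""
    infile = sys.stdin if infile is None else infile
    outfile = sys.stdout if outfile is None else outfile
    while True:
        outfile.write("##Ready\n")
        outfile.flush()  # the server is waiting for this message
        line = infile.readline()
        if not line:
            break  # the server closed the pipe
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # assumption: '#' lines need no reply in this sketch
        outfile.write(compute_result(line) + "\n")
        outfile.flush()

def fake_mcs(request):
    """Stand-in for a real MCS search: reports that no MCS was found."""
    label = request.split(" ")[0]
    return label + " . 0 0 0 0.00 -"

# Demonstration using in-memory files instead of real pipes.
demo_in = io.StringIO("#MCS-Benchmark/1\n16 CHEMBL173535 CHEMBL368519\n")
demo_out = io.StringIO()
run_client(fake_mcs, demo_in, demo_out)
```

The controlling process would then wait for each "##Ready" line before
writing its next line to the client's stdin.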



Generating new benchmarks
=========================

mcsbench.py includes two methods for making new benchmarks given a
structure file. These are available through the "random" and
"neighbors" commands:


    random              Generate an MCS benchmark file using randomly selected
                        records from a structure file
    neighbors           Generate an MCS benchmark file using nearest-neighbor
                        searches of a fingerprint file


Random benchmarks
-----------------

This selects 'k' records at random, without replacement, from a
structure file. By default k=2, which generates MCS tests for random
pairs of structures.
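
The idea can be sketched with Python's random module, where
random.sample() selects without replacement. This is not the code in
mcsbench.py: the function name, the default number of tests, and the
use of 1, 2, 3, ... as labels are assumptions made for this example.

```python
import random

def make_random_benchmark(ids, filename, k=2, num_tests=10, seed=None):
    """Return the text of an MCSB file with `num_tests` random tests.

    `ids` is a list of structure identifiers found in `filename`.
    """
    rng = random.Random(seed)  # a fixed seed gives a reproducible benchmark
    lines = ["#MCS-Benchmark/1", "#File " + filename]
    for label in range(1, num_tests + 1):
        chosen = rng.sample(ids, k)  # k distinct identifiers
        lines.append(" ".join([str(label)] + chosen))
    return "\n".join(lines) + "\n"
```

Because each test is drawn independently, the same identifier may
appear in more than one test, but never twice within a test.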


Here's what you get from "mcsbench.py random --help"

usage: mcsbench.py random [-h] [--seed SEED] [-k K] [--num-tests N]
                          [--id-tag TAG] [--subset-filename FILENAME]
                          [--verbose]
                          structure_filename

positional arguments:
  structure_filename    input SD or SMILES file

optional arguments:
  -h, --help            show this help message and exit
  --seed SEED           initial random number seed
  -k K                  select k elements for each test
  --num-tests N, -n N   number of test pairs to generate
  --id-tag TAG          SD tag name containing the primary identifier
  --subset-filename FILENAME
                        Save the subset of the structures used for the tests
                        into FILENAME
  --verbose             Write status information to stderr


The structure filename can be a SMILES file or an SD file.

The generated benchmark file is written to stdout.

The "--subset-filename" exists because sometimes the generated tests
are only a small fraction of the input structures. For example, ChEMBL
contains over 1 million structures, but a benchmark might have only
2,000 structures in it. There's a load-time cost to read 1 million
structures, and it's hard to ship people only the structures which are
needed for the test.

For this case, use "--subset-filename" to have the structure records
saved to the named file. This is done before the new MCS benchmark
data is sent to stdout, and the new benchmark will have a "#File" line
using the subset filename.

If you don't specify the subset filename then the generated benchmark
will use the given structure_filename.




Neighbor benchmarks
-------------------

This selects neighbors based on a similarity search to a randomly
selected fingerprint. For example, you can select all structures with
at least 0.95 similarity to the randomly selected fingerprint, or the
nearest k=10 fingerprints, or a combination of threshold and k-nearest
parameters.

The fingerprints must be in FPS format and mcsbench.py requires the
chemfp Python library to use this functionality.
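
To make the combination of threshold and k-nearest parameters
concrete, here is a pure-Python sketch of the selection step. It does
not use chemfp and is not how mcsbench.py does the search.
Fingerprints are stored as Python integers used as bit sets, and the
sketch assumes Tanimoto similarity, that the query counts as one of
its own neighbors, and that ties are broken by identifier; the
function names are made up.

```python
def tanimoto(fp1, fp2):
    """Tanimoto similarity between two fingerprints stored as integers."""
    union = bin(fp1 | fp2).count("1")
    if union == 0:
        return 0.0
    return bin(fp1 & fp2).count("1") / union

def select_neighbors(query_id, fingerprints, k=10, threshold=0.0, k_min=2):
    """Return up to k ids with similarity >= threshold to the query.

    Returns None if fewer than k_min structures qualify.
    """
    query = fingerprints[query_id]
    hits = []
    for name, fp in fingerprints.items():
        score = tanimoto(query, fp)
        if score >= threshold:
            hits.append((score, name))
    # Most similar first; ties are broken by identifier.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    names = [name for score, name in hits[:k]]
    if len(names) < k_min:
        return None
    return names
```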

Here's what you get from "mcsbench.py neighbors --help"

usage: mcsbench.py neighbors [-h] [--seed SEED] [--num-tests N]
                             [--k K] [--k-min K_MIN] [--threshold THRESHOLD]
                             [--prefix PREFIX] [--structures FILENAME]
                             [--id-tag TAG] [--subset-filename FILENAME]
                             [--verbose]
                             fps_filename

positional arguments:
  fps_filename          structure fingerprints for the similarity search

optional arguments:
  -h, --help            show this help message and exit
  --seed SEED           initial random number seed
  --num-tests N, -n N   number of test cases to generate
  --k K, -k K           maximum number of neighgors to use
  --k-min K_MIN         minimum number of neighbors to use
  --threshold THRESHOLD
                        minimum threshold
  --prefix PREFIX       output prefix for the MCS and structure filenames
  --structures FILENAME
                        input SD or SMILES file (Default: use the FPS source
                        field)
  --id-tag TAG          SD tag name containing the primary identifier
  --subset-filename FILENAME
                        Save the subset of the structures used for the tests
                        into FILENAME
  --verbose             Write progress information to stderr




The "--k-min" specifies the minimum number of structures which must be
in an MCS calculation. By default it's 2, which is the smallest
minimum allowed. If you want a benchmark which always has exactly 5
neighbors to the randomly chosen fingerprint then do:

  mcsbench.py neighbors --k-min 5 -k 5 chembl_13-normalized.fps


Like the "random" option, you can use --subset-filename to generate a
new structure file containing only the records needed from the
original structure file. By default it looks at the FPS file "source"
metadata to get the structure filename, but you can override it with
the "--structures" option.


Example: find slow MCSes
------------------------

The "random" and "neighbors" options write the benchmark data to
stdout, and the "fmcs", "indigo-exact" and "indigo-approx" methods
read benchmark data from stdin. 

This means you can hook them together, like this:

  mcsbench.py random chembl13_knearest_100.smi | mcsbench.py indigo-exact

However, if you try that out you'll find that it takes a while for
Indigo to read and parse all of the structures. (Use the "--verbose"
option to "indigo-exact" so you can see the progress of its SMILES
reader.)

Instead, ask the MCS search program to use the "lazy" reader, which
only parses the record when it's needed.

  mcsbench.py random chembl13_knearest_100.smi | mcsbench.py indigo-exact --lazy


You should quickly see a lot of match data sent to the screen. That's
because the Indigo code is fast and because the MCS between randomly
chosen pairs is usually small.

Still, some matches take a lot of time. I want to see only those
searches which take at least 5 seconds to process. I'll use the
following:

  mcsbench.py random --seed 54321 --num-tests 10000 chembl13_knearest_100.smi | 
     mcsbench.py indigo-exact --lazy --timeout 6 --min-time 5 

This generates a benchmark with 10,000 random pairs. I specified the
initial random seed so you could reproduce my results. I then passed
the benchmark over to an "indigo-exact" search. I'll use a timeout of
6 seconds, and have it print anything which takes at least 5 seconds
to run.

Here's the initial part of the output:


#MCS-Benchmark-Output/1
#software Indigo/1.1-rc extractCommonScaffold
#options method=exact maximize=atoms atom-compare=elements bond-compare=bondtypes aromatize=True fold-hydrogens=True timeout=6.00
#timestamp 2012-05-09T03:40:39.031636
#  Displaying searches which took at least 5.0 seconds
#File chembl13_knearest_100.smi
##   Loaded 82880 structures.
#   10000 randomly generated tests, each with 2 ids. Seed=54321
#  Using CHEMBL552154 CHEMBL524541
459 F -1 -1 -1 6.00 -
#  Using CHEMBL1097670 CHEMBL437389
860 F -1 -1 -1 6.00 -
#  Using CHEMBL1229206 CHEMBL267480
1642 F -1 -1 -1 15.49 -
#  Using CHEMBL266412 CHEMBL440258
2407 . 1 13 12 5.35 C(C)NC(=O)C(C(O)C)NC(=O)C
#  Using CHEMBL252512 CHEMBL268084
3422 F -1 -1 -1 6.00 -
#  Using CHEMBL510621 CHEMBL412168
4391 F -1 -1 -1 6.00 -


You can see that CHEMBL266412 vs. CHEMBL440258 took 5.35 seconds to
run, so did not reach the timeout, but the other comparisons took at
least 6 seconds and failed to produce an MCS.

This generated the output in "MCS-Benchmark-Output" format. Suppose
you want to see if the hard cases for Indigo are also hard cases for
fmcs, or you want to improve the Indigo performance for hard cases.
In those scenarios you would likely rather have the output in
"MCS-Benchmark" format.

That's easy. Use "--output-format mcs-benchmark", like this:


    mcsbench.py random --seed 54321 --num-tests 10000 chembl13_knearest_100.smi |
       mcsbench.py indigo-exact --lazy --min-time 5 --timeout 6 --output-format mcs-benchmark

to get output which looks like:

#MCS-Benchmark/1
#software Indigo/1.1-rc extractCommonScaffold
#options method=exact maximize=atoms atom-compare=elements bond-compare=bondtypes aromatize=True fold-hydrogens=True timeout=6.00
#timestamp 2012-05-09T03:46:51.690220
#  Displaying searches which took at least 5.0 seconds
#File chembl13_knearest_100.smi
##   Loaded 82880 structures.
#   10000 randomly generated tests, each with 2 ids. Seed=54321
459 CHEMBL552154 CHEMBL524541
#    Took 6.00 seconds
860 CHEMBL1097670 CHEMBL437389
#    Took 6.00 seconds
1642 CHEMBL1229206 CHEMBL267480
#    Took 15.31 seconds
2407 CHEMBL266412 CHEMBL440258
#    Took 5.27 seconds
3422 CHEMBL252512 CHEMBL268084
#    Took 6.00 seconds
4391 CHEMBL510621 CHEMBL412168
#    Took 6.00 seconds
5894 CHEMBL410815 CHEMBL413071
#    Took 6.00 seconds
7677 CHEMBL356387 CHEMBL519416
#    Took 6.00 seconds
7940 CHEMBL593680 CHEMBL409983
#    Took 6.22 seconds
8056 CHEMBL1163436 CHEMBL260330
#    Took 6.00 seconds
9210 CHEMBL559226 CHEMBL411957
#    Took 6.01 seconds
9391 CHEMBL227456 CHEMBL1255689
#    Took 6.01 seconds


Interesting. There's still a case where the Indigo code doesn't time
out when expected. In any case, the answer to whether the hard cases
for Indigo are also hard cases for fmcs is "sometimes". In two cases,
fmcs finds a solution in under 6 seconds where Indigo couldn't.
