Programmatically interacting with the new compounds API
At Strateos we're passionate about frictionless transfers between digital representations and the physical world. With a programmable robotic lab, we want to keep enabling our users to blend their computational workflows with physical experimentation. With the coming launch of multiple chemistry capabilities on the Strateos Robotic Cloud Lab, I wanted to show what is possible with the new Compounds API, which is currently in preview (as of Jan 2020). We'll start with a few examples of using the API, then use it in a workflow alongside a cheminformatics pipeline.
Individual compound records
Let's start by fetching an individual compound record. For this we'll need to know the compound_id. You can get this from the record via the Compounds section of the web application, or by fetching the whole compound list, which we'll do in a moment. Below is a Ruby snippet for fetching compound cmp123456.
require 'uri'
require 'net/http'
require 'openssl'
# Set the compound_id of interest
compound_id = "cmp123456"
# Set your organization_id
querystring = "filter[organization_id]=org1235"
url = URI("https://secure.transcriptic.com/api/compounds/#{compound_id}?#{querystring}")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
# Verify the server certificate so your API token is not sent to an unverified host
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
request = Net::HTTP::Get.new(url)
# Set your API credentials
request["x-user-email"] = 'chell@aperturescience.com'
request["x-user-token"] = 'apk12345678'
response = http.request(request)
puts response.read_body
Let's inspect the response. You can see that the compound record returned has a number of attributes, including some calculated molecular properties and several identifiers.
Below is the schema for a Compound record. At the top level it's contained in an object called data, but most of the details of the compound are located in the attributes object. You can see more record schemas in the API documentation.
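As a rough illustration of that nesting, here is a minimal Python sketch that pulls one field out of a parsed response. The sample record is a hypothetical, heavily trimmed stand-in (only the data/attributes nesting and the inchi field are taken from this post), and compound_attribute is a helper name invented for this example.

```python
def compound_attribute(payload, name, default=None):
    """Return one field from the attributes object of a compound response."""
    # The record sits under "data"; most details sit under "attributes".
    attributes = payload.get("data", {}).get("attributes", {})
    return attributes.get(name, default)


# Hypothetical, heavily trimmed record used only to show the nesting.
sample = {
    "data": {
        "id": "cmp123456",
        "attributes": {"inchi": "InChI=1S/CH4/h1H4"},
    }
}

print(sample["data"]["id"])
print(compound_attribute(sample, "inchi"))
print(compound_attribute(sample, "not_a_field", default="n/a"))
```

Using .get() at each level means a missing key yields the default instead of raising a KeyError, which is handy while the API is still in preview.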
Fetching your compound list
To fetch your entire compound list, use a URL of the form https://secure.transcriptic.com/api/compounds/?filter[organization_id]=org123
The organization_id filter is required here. Here it is in use in Ruby:
require 'uri'
require 'net/http'
require 'openssl'
querystring = "filter[organization_id]=org123"
url = URI("https://secure.transcriptic.com/api/compounds/?#{querystring}")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
# Verify the server certificate so your API token is not sent to an unverified host
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
request = Net::HTTP::Get.new(url)
request["x-user-email"] = 'chell@aperturescience.com'
request["x-user-token"] = 'apk12345678'
response = http.request(request)
puts response.read_body
This returns a JSON object whose data key holds an array of compound objects:
{"data": [{"id": "1", ...},
          {"id": "2", ...},
          {"id": "3", ...},
          ...
         ]
}
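To show how that list might be consumed, here is a small Python sketch that collects the ids from a parsed list response. The body string reproduces the response shape above with the elided fields left out, and compound_ids is a helper name made up for this example.

```python
import json

# The list response shape shown above, with the elided fields left out.
body = '{"data": [{"id": "1"}, {"id": "2"}, {"id": "3"}]}'


def compound_ids(payload):
    """Collect the id of every compound object in a list response."""
    return [record["id"] for record in payload.get("data", [])]


ids = compound_ids(json.loads(body))
print(ids)  # ['1', '2', '3']
```

Each of these ids can then be used as the compound_id when fetching an individual record.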
Creating a new compound
Let's now move on to creating compounds in Strateos. Take the molecule below:
Here we create a POST request to create this new compound. Compounds can be created from SDF, SMILES and InChI representations. You only need to provide one identifier in the body of the request. In the example below we're using the SMILES representation.
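A minimal sketch of such a request in Python, using only the standard library, might look like the following. The payload shape is an assumption copied from the MolToStrateos function later in this post, "CCO" (ethanol) is a placeholder rather than the molecule pictured, and build_create_request is a helper name invented here. The request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://secure.transcriptic.com/api/compounds/"


def build_create_request(smiles, organization_id, email, token):
    """Build (but do not send) a POST request that creates a compound."""
    # Assumed payload shape, copied from the MolToStrateos example in this post.
    payload = {
        "data": {
            "attributes": {
                "compound": {"smiles": smiles},
                "organization_id": organization_id,
            }
        }
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-user-email": email,
            "x-user-token": token,
        },
        method="POST",
    )


# "CCO" (ethanol) is a placeholder, not the molecule pictured in the post.
request = build_create_request(
    "CCO", "org123", "chell@aperturescience.com", "apk12345678"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the body before anything is created in your organization.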
Let's now inspect the 201 Success response from Strateos:
Using Python and the RDKit with the Strateos API
We're going to switch over from Ruby to Python now so we can make use of the RDKit. In this example we write a small function that fetches a compound from your Strateos collection by its Strateos compound_id, then uses the RDKit to create an RDKit Mol object. From this point we can do any downstream manipulations or analyses that we would use the RDKit for. In this example we just return the number of atoms in the compound.
from rdkit import Chem
import requests

def MolFromStrateos(compound_id):
    url = "https://secure.transcriptic.com/api/compounds/" + compound_id
    # Set your organization ID in the filter
    querystring = "filter[organization_id]=org1235"
    # Set your API credentials in the headers
    headers = {
        'x-user-email': "chell@aperturescience.com",
        'x-user-token': "apk1234667"
    }
    response = requests.get(url, headers=headers, params=querystring)
    # Fail early on HTTP errors instead of parsing an error body
    response.raise_for_status()
    attributes = response.json()["data"]["attributes"]
    # RDKit parsers return None (rather than raising) when parsing fails,
    # so check the result instead of catching ValueError
    mol = Chem.MolFromInchi(attributes["inchi"]) if attributes.get("inchi") else None
    if mol is None and attributes.get("smiles"):
        # Fall back to SMILES, assuming it sits next to the InChI in attributes
        mol = Chem.MolFromSmiles(attributes["smiles"])
    if mol is None:
        print("Error parsing compound from Strateos")
    return mol

test_mol = MolFromStrateos("cmpl1d352345fwzv4")
print(test_mol.GetNumAtoms())
# => 8
Let's combine this function with a cool code example from the RDKit blog, written by Greg Landrum, that was designed to show off some of the new drawing features. Below we fetch a molecule from Strateos, create an RDKit Mol object, generate some conformers and look at partial charge variation across those conformers.
After generating the partial charge distributions for 10 conformers, we generate a map of the mean partial charges across the molecule of interest. RDKit can generate these awesome visualizations.
Next, rather than looking at the mean partial charge distribution for the conformers, we want to look at the standard deviation of partial charges across all conformers.
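The two summaries behind those maps, the per-atom mean and the per-atom standard deviation, can be sketched in plain Python as below. The charge values are made up for illustration; in the real workflow they would come from a per-conformer charge calculation in the RDKit. summarize_charges is a hypothetical helper name, not part of the RDKit or of the original blog code.

```python
from statistics import fmean, pstdev


def summarize_charges(charges_by_conformer):
    """Return per-atom (means, standard deviations) across conformers.

    charges_by_conformer holds one list of partial charges per conformer,
    all in the same atom order.
    """
    # zip(*rows) regroups the values so each tuple holds one atom's charges.
    per_atom = list(zip(*charges_by_conformer))
    means = [fmean(values) for values in per_atom]
    # Population standard deviation, matching numpy.std's default (ddof=0).
    stds = [pstdev(values) for values in per_atom]
    return means, stds


# Made-up charges for a 3-atom molecule in 2 conformers, for illustration only.
charges = [
    [0.10, -0.20, 0.05],
    [0.30, -0.40, 0.05],
]
means, stds = summarize_charges(charges)
print([round(m, 3) for m in means])
print([round(s, 3) for s in stds])
```

Atoms with a large standard deviation are the ones whose partial charge is most sensitive to conformation, which is what the second map highlights.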
Creating Strateos compounds from the RDKit
Let's take one final step and create a function that makes it easier to go from RDKit Mol objects to creating them on Strateos. In this function we wrap a Requests POST call and construct the payload of the request from the function arguments, with a Mol object as the primary argument. We use Chem.MolToSmiles() to get the SMILES representation of the molecule and send this in the payload. We also pass a labels argument, sent in the payload's groups field, to tag the molecule as a statin, so it will be grouped with all other compounds tagged statin on Strateos.
import requests
from rdkit import Chem

organization_id = "org123"

def MolToStrateos(mol, labels=None):
    url = "https://secure.transcriptic.com/api/compounds/"
    # Set your API credentials in the headers
    headers = {
        'x-user-email': "chell@aperturescience.com",
        'x-user-token': "user-token"
    }
    data = {"data":
               {"attributes":
                   {"compound":
                       {"smiles": Chem.MolToSmiles(mol)},
                    "organization_id": organization_id,
                    # Labels are sent in the "groups" field; default to no labels
                    "groups": labels or []
                   }
               }
           }
    response = requests.request("POST", url, json=data, headers=headers)
    return response

# rosuvastatin is an RDKit Mol object created beforehand
response = MolToStrateos(rosuvastatin, ["statin"])
#> <Response [200]>
Then, if we log in to our Strateos account, we can see the rosuvastatin record we just created.
One could use this in an enumeration workflow, where you generate tens of variations of a molecule and then create them all on Strateos with the same tag so they can be identified as a group.
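A hedged sketch of that enumeration idea follows: it builds one payload per SMILES, all sharing the same labels, without sending anything. The payload shape mirrors the MolToStrateos function above, while the SMILES strings, the "my-series" label and the build_compound_payloads name are placeholders invented for this example.

```python
import json


def build_compound_payloads(smiles_list, organization_id, labels):
    """Build one create-compound payload per SMILES, all sharing the same labels."""
    return [
        {
            "data": {
                "attributes": {
                    "compound": {"smiles": smiles},
                    "organization_id": organization_id,
                    # Copy the list so the payloads do not share one object.
                    "groups": list(labels),
                }
            }
        }
        for smiles in smiles_list
    ]


# Placeholder SMILES standing in for enumerated variants of a molecule.
variants = ["CCO", "CCCO", "CCCCO"]
payloads = build_compound_payloads(variants, "org123", ["my-series"])
print(len(payloads))
print(json.dumps(payloads[0]))
```

Each payload could then be posted in turn, for example with requests.post(url, json=payload, headers=headers).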
Recap
In this post we went through fetching both individual compound records and lists of them from the Strateos API, using both Ruby and Python. We also walked through the structure of the Compound object. Finally, we went through a couple of examples combining the awesome cheminformatics package the RDKit with the Strateos compounds endpoint, to integrate Strateos into a cheminformatics pipeline. This article should give an indication of how one will be able to move seamlessly from cheminformatics pipelines through the synthesis and characterization of molecules using the Strateos Robotic Cloud Lab.