skchem.core package

Submodules

skchem.core.atom module

## skchem.core.atom

Defining atoms in scikit-chem.

class skchem.core.atom.Atom[source]

Bases: rdkit.Chem.rdchem.Atom, skchem.core.base.ChemicalObject

Object representing an Atom in scikit-chem.

atomic_number

int – the atomic number of the atom.

element

str – the element symbol of the atom.

mass

float – the mass of the atom.

Usually relative atomic mass unless explicitly set.

props

PropertyView – rdkit properties of the atom.

class skchem.core.atom.AtomView(owner)[source]

Bases: skchem.core.base.ChemicalObjectView

atomic_mass

A pd.Series of the atomic mass of the atoms in the AtomView.

atomic_number

A pd.Series of the atomic number of the atoms in the AtomView.

element

A pd.Series of the element of the atoms in the AtomView.

index

A pd.Index of the atoms in the AtomView.

skchem.core.base module

## skchem.core.base

Define base classes for scikit-chem objects.

class skchem.core.base.ChemicalObject[source]

Bases: object

A mixin for each chemical object in scikit-chem.

classmethod from_super(obj)[source]

Convert an object of the parent class into an instance of the child class.
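
As a rough illustration of this kind of conversion, the sketch below reassigns `__class__` on a plain Python object. Whether scikit-chem does exactly this internally is an assumption, and `Parent`, `Child` and `ChemicalObjectSketch` are hypothetical names that are not part of the library.

```python
class Parent:
    """Hypothetical parent class standing in for a toolkit object."""

    def __init__(self, value):
        self.value = value


class ChemicalObjectSketch:
    """Hypothetical mixin providing a from_super-style converter."""

    @classmethod
    def from_super(cls, obj):
        # Assumption: the conversion is done in place by swapping the class.
        obj.__class__ = cls
        return obj


class Child(Parent, ChemicalObjectSketch):
    """Child class adding behaviour on top of the parent."""

    def doubled(self):
        return self.value * 2


parent_obj = Parent(21)
child_obj = Child.from_super(parent_obj)
```

In this sketch the same object is returned with a new class, so no data is copied and existing references to it see the child behaviour too.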

class skchem.core.base.ChemicalObjectIterator(view)[source]

Bases: object

Iterator for chemical object views.

next()
class skchem.core.base.ChemicalObjectView(owner)[source]

Bases: object

Abstract iterable view of chemical objects.

Concrete classes inheriting from it should implement __getitem__ and __len__.

props

Return a property view of the objects in the view.

to_list()[source]

Return a list of objects in the view.
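
The contract described above (a concrete view supplies `__getitem__` and `__len__`, and iteration and `to_list` follow from them) can be sketched in plain Python. This is only an illustration under that assumption: `ObjectViewSketch`, `ViewIteratorSketch` and `LetterView` are hypothetical names, and the real classes may be implemented differently.

```python
class ViewIteratorSketch:
    """Hypothetical iterator that walks a view by integer index."""

    def __init__(self, view):
        self.view = view
        self._current = 0
        self._high = len(view)

    def __iter__(self):
        return self

    def __next__(self):
        if self._current >= self._high:
            raise StopIteration
        item = self.view[self._current]
        self._current += 1
        return item

    # Python 2 style alias, matching the next() method listed above.
    next = __next__


class ObjectViewSketch:
    """Hypothetical abstract view; subclasses add __getitem__ and __len__."""

    def __init__(self, owner):
        self.owner = owner

    def __getitem__(self, index):
        raise NotImplementedError

    def __len__(self):
        raise NotImplementedError

    def __iter__(self):
        return ViewIteratorSketch(self)

    def to_list(self):
        return list(self)


class LetterView(ObjectViewSketch):
    """Concrete view over the characters of a string owner."""

    def __getitem__(self, index):
        return self.owner[index]

    def __len__(self):
        return len(self.owner)


letters = LetterView("CCO")
```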

class skchem.core.base.MolPropertyView(obj_view)[source]

Bases: skchem.core.base.View

Mol property wrapper.

This provides properties for the atom and bond views.

get(key, default=None)[source]
keys()[source]

The available property keys on the object.

to_dict()[source]

Return a dict of the properties of the objects of the molecular view.

to_frame()[source]

Return a DataFrame of the properties of the objects of the molecular view.
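
To make the shape of the result concrete, the sketch below gathers per-object property dicts into one dict of lists, padding missing values with `None`. It assumes that layout from the `m.atoms.props` example in the `Mol` documentation; `collect_props` is a hypothetical helper, not a library function.

```python
def collect_props(prop_dicts):
    """Merge a list of per-object property dicts into a dict of lists."""
    # Union of all keys seen on any object, sorted for a stable order.
    keys = sorted({key for props in prop_dicts for key in props})
    # One list per key, with None where an object lacks that property.
    return {key: [props.get(key) for props in prop_dicts] for key in keys}


# Four atoms, only the first of which has a property set.
atom_props = [{"atom_key": "atom_value"}, {}, {}, {}]
collected = collect_props(atom_props)
```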

class skchem.core.base.PropertyView(owner)[source]

Bases: skchem.core.base.View

Property object wrapper.

This provides properties for rdkit objects.

keys()[source]

The available property keys on the object.

class skchem.core.base.View[source]

Bases: object

View wrapper interface. Conforms to the dictionary interface.

Objects inheriting from this should implement the keys, __getitem__, __setitem__ and __delitem__ methods.

clear()[source]

Remove all properties from the object.

get(index, default=None)[source]
items()[source]

Return an iterable of key, value pairs.

keys()[source]
pop(index, default=None)[source]
remove(key)[source]

Remove a property from the object.

to_dict()[source]

Return a dict of the properties on the object.

to_series()[source]

Return a pd.Series of the properties on the object.
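
The sketch below shows one way the remaining dictionary-style methods could be derived once a subclass supplies the four primitives. It is a minimal illustration, not the library's implementation: `ViewSketch` and `DictBackedView` are hypothetical names, and `to_series` is left out so that the example needs only the standard library.

```python
class ViewSketch:
    """Hypothetical base class: helpers built on four primitive methods."""

    def keys(self):
        raise NotImplementedError

    def __getitem__(self, key):
        raise NotImplementedError

    def __setitem__(self, key, value):
        raise NotImplementedError

    def __delitem__(self, key):
        raise NotImplementedError

    def get(self, index, default=None):
        return self[index] if index in self.keys() else default

    def pop(self, index, default=None):
        if index not in self.keys():
            return default
        value = self[index]
        del self[index]
        return value

    def remove(self, key):
        del self[key]

    def clear(self):
        # Copy the keys first, because deleting changes the key list.
        for key in list(self.keys()):
            del self[key]

    def items(self):
        return [(key, self[key]) for key in self.keys()]

    def to_dict(self):
        return dict(self.items())


class DictBackedView(ViewSketch):
    """Concrete view that stores its properties in a plain dict."""

    def __init__(self):
        self._store = {}

    def keys(self):
        return list(self._store)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __delitem__(self, key):
        del self._store[key]


props = DictBackedView()
props["name"] = "acetyl chloride"
props["charge"] = 0
```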

skchem.core.bond module

## skchem.core.bond

Defining chemical bonds in scikit-chem.

class skchem.core.bond.Bond[source]

Bases: rdkit.Chem.rdchem.Bond, skchem.core.base.ChemicalObject

Class representing a chemical bond in scikit-chem.

atoms

list[Atom] – list of atoms involved in the bond.

draw()[source]

str: Draw the bond in ascii.

order

int – the order of the bond.

props

PropertyView – rdkit properties of the bond.

to_dict()[source]

dict: Convert to a dictionary representation.
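
As an illustration of what an ASCII drawing of a bond might look like, the sketch below joins two element symbols with a symbol chosen by bond order. The `-` and `=` symbols match the `BondView` output shown in the `Mol` example; using `#` for a triple bond is an assumption borrowed from SMILES notation, and `draw_bond` is a hypothetical helper.

```python
# Assumed mapping from integer bond order to an ASCII symbol.
BOND_SYMBOLS = {1: "-", 2: "=", 3: "#"}


def draw_bond(first_element, second_element, order):
    """Return a string such as 'C=O' for the two atoms and bond order."""
    return first_element + BOND_SYMBOLS[order] + second_element


# The three bonds of acetyl chloride, CC(=O)Cl.
drawn = [draw_bond("C", "C", 1), draw_bond("C", "O", 2), draw_bond("C", "Cl", 1)]
```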

class skchem.core.bond.BondView(owner)[source]

Bases: skchem.core.base.ChemicalObjectView

Bond interface wrapper

index

A pd.Index of the bonds in the BondView.

order

A pd.Series of the bond orders of the bonds in the BondView.

skchem.core.conformer module

## skchem.core.conformer

Defining conformers in scikit-chem.

class skchem.core.conformer.Conformer[source]

Bases: rdkit.Chem.rdchem.Conformer, skchem.core.base.ChemicalObject

Class representing a Conformer in scikit-chem.

atom_positions

Return the atom positions in the conformer for the atoms in the molecule.

is_three_d

Return whether the conformer is three dimensional.

skchem.core.mol module

## skchem.core.mol

Defining molecules in scikit-chem.

class skchem.core.mol.Mol(*args, **kwargs)[source]

Bases: rdkit.Chem.rdchem.Mol, skchem.core.base.ChemicalObject

Class representing a Molecule in scikit-chem.

Mol objects inherit directly from rdkit Mol objects. Therefore, they contain atom and bond information, and may also include properties and atom bookmarks.

Example

Constructors are implemented as class methods with the from_ prefix.

>>> import skchem
>>> m = skchem.Mol.from_smiles('CC(=O)Cl'); m 
<Mol name="None" formula="C2H3ClO" at ...>

This is an rdkit Mol:

>>> from rdkit.Chem import Mol as RDKMol
>>> isinstance(m, RDKMol)
True

A name can be given at initialization:

>>> m = skchem.Mol.from_smiles('CC(=O)Cl', name='acetyl chloride'); m
<Mol name="acetyl chloride" formula="C2H3ClO" at ...>

>>> m.name
'acetyl chloride'

Serializers are implemented as instance methods with the to_ prefix.

>>> m.to_smiles()
'CC(=O)Cl'
>>> m.to_inchi()
'InChI=1S/C2H3ClO/c1-2(3)4/h1H3'
>>> m.to_inchi_key()
'WETWJCDKMRHUPV-UHFFFAOYSA-N'

RDKit properties are accessible through the props property:

>>> m.SetProp('example_key', 'example_value') # set prop with rdkit directly
>>> m.props['example_key']
'example_value'
>>> m.SetIntProp('float_key', 42) # set int prop with rdkit directly
>>> m.props['float_key']
42

They can be set too:

>>> m.props['example_set'] = 'set_value'
>>> m.GetProp('example_set') # getting with rdkit directly
'set_value'

We can export the properties into a dict or a pandas series:

>>> m.props.to_series()
example_key    example_value
example_set        set_value
float_key                 42
dtype: object

Atoms and bonds are provided in views:

>>> m.atoms 
<AtomView values="['C', 'C', 'O', 'Cl']" at ...>
>>> m.bonds 
<BondView values="['C-C', 'C=O', 'C-Cl']" at ...>

These are iterable:

>>> [a.element for a in m.atoms]
['C', 'C', 'O', 'Cl']

The view provides shorthands for some attributes to get these as pandas objects:

>>> m.atoms.element
atom_idx
0     C
1     C
2     O
3    Cl
dtype: object

Atom and bond props can also be set:

>>> m.atoms[0].props['atom_key'] = 'atom_value'
>>> m.atoms[0].props['atom_key']
'atom_value'

The properties for atoms on the whole molecule can be accessed like so:

>>> m.atoms.props 
<MolPropertyView values="{'atom_key': ['atom_value', None, None, None]}" at ...>

The properties can be exported as a pandas dataframe:

>>> m.atoms.props.to_frame()
            atom_key
atom_idx
0         atom_value
1               None
2               None
3               None

add_hs(inplace=False, add_coords=True, explicit_only=False, only_on_atoms=False)[source]
Parameters:
  • inplace (bool) – Whether to add Hs to the Mol in place, or return a new Mol. Default is False, return a new Mol.
  • add_coords (bool) – Whether to set 3D coordinates for added Hs. Default is True.
  • explicit_only (bool) – Whether to add only explicit Hs, or also implicit ones. Default is False.
  • only_on_atoms (iterable<bool>) – An iterable specifying the atoms to add Hs to.
Returns:Mol with Hs added.
Return type:skchem.Mol

atoms

List[skchem.Atom] – An iterable over the atoms of the molecule.

bonds

List[skchem.Bond] – An iterable over the bonds of the molecule.

conformers

List[Conformer] – conformers of the molecule.

classmethod from_binary(binary)[source]

Decode a molecule from a binary serialization.

Parameters:binary – The bytes string to decode.
Returns:The molecule encoded in the binary.
Return type:skchem.Mol
classmethod from_inchi(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2block(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2file(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smarts(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smiles(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

mass

float – the mass of the molecule.

name

str – The name of the molecule.

Raises:KeyError
props

PropertyView – A dictionary of the properties of the molecule.

remove_hs(inplace=False, sanitize=True, update_explicit=False, implicit_only=False)[source]
Parameters:
  • inplace (bool) – Whether to remove Hs from the Mol in place, or return a new Mol. Default is False, return a new Mol.
  • sanitize (bool) – Whether to sanitize after Hs are removed. Default is True.
  • update_explicit (bool) – Whether to update the explicit H count after the removal. Default is False.
  • implicit_only (bool) – Whether to remove explicit and implicit Hs, or implicit Hs only. Default is False.
Returns:Mol with Hs removed.
Return type:skchem.Mol

to_binary()[source]

Serialize the molecule to binary encoding.

Parameters:None
Returns:the molecule in bytes.
Return type:bytes

Notes

Due to limitations in RDKit, not all data is serialized. Notably, properties are not, so e.g. compound names are not saved.

to_dict(kind='chemdoodle')[source]

A dictionary representation of the molecule.

Parameters:kind (str) – The type of representation to use. Only chemdoodle is currently supported. Defaults to 'chemdoodle'.
Returns:dictionary representation of the molecule.
Return type:dict
to_formula()[source]

str: the chemical formula of the molecule.

Raises:RuntimeError
to_inchi(*args, **kwargs)

The serializer to be bound.

to_inchi_key()[source]

The InChI key of the molecule.

Returns:the InChI key.
Return type:str
Raises:RuntimeError
to_json(kind='chemdoodle')[source]

Serialize a molecule using JSON.

Parameters:kind (str) – The type of serialization to use. Only chemdoodle is currently supported.
Returns:the json string.
Return type:str
to_molblock(*args, **kwargs)

The serializer to be bound.

to_molfile(*args, **kwargs)

The serializer to be bound.

to_pdbblock(*args, **kwargs)

The serializer to be bound.

to_smarts(*args, **kwargs)

The serializer to be bound.

to_smiles(*args, **kwargs)

The serializer to be bound.

to_tplblock(*args, **kwargs)

The serializer to be bound.

to_tplfile(*args, **kwargs)

The serializer to be bound.

skchem.core.mol.bind_constructor(constructor_name, name_to_bind=None)[source]

Bind an (rdkit) constructor to the class

skchem.core.mol.bind_serializer(serializer_name, name_to_bind=None)[source]

Bind an (rdkit) serializer to the class
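
The binding pattern behind the `from_*` and `to_*` methods can be sketched as factories that wrap a toolkit function by name. This is a simplified illustration under stated assumptions: `FakeToolkit` is a hypothetical stand-in for RDKit, the `_sketch` factories take only one argument, and how the real functions use `name_to_bind` is not shown here.

```python
class FakeToolkit:
    """Hypothetical stand-in for the toolkit functions being wrapped."""

    @staticmethod
    def MolFromSmiles(text):
        return {"smiles": text}

    @staticmethod
    def MolToSmiles(raw):
        return raw["smiles"]


def bind_constructor_sketch(constructor_name):
    """Build a classmethod that wraps FakeToolkit.MolFrom<constructor_name>."""

    @classmethod
    def constructor(cls, in_arg, name=None, *args, **kwargs):
        backend = getattr(FakeToolkit, "MolFrom" + constructor_name)
        mol = cls(backend(in_arg, *args, **kwargs))
        mol.name = name
        return mol

    return constructor


def bind_serializer_sketch(serializer_name):
    """Build an instance method that wraps FakeToolkit.MolTo<serializer_name>."""

    def serializer(self, *args, **kwargs):
        backend = getattr(FakeToolkit, "MolTo" + serializer_name)
        return backend(self.raw, *args, **kwargs)

    return serializer


class MolSketch:
    """Hypothetical molecule class with one bound constructor and serializer."""

    def __init__(self, raw):
        self.raw = raw
        self.name = None

    from_smiles = bind_constructor_sketch("Smiles")
    to_smiles = bind_serializer_sketch("Smiles")


ethanol = MolSketch.from_smiles("CCO", name="ethanol")
```

Generating the methods from a name keeps each format down to one line in the class body, which may be why every bound method above carries the same generic docstring.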

skchem.core.point module

## skchem.core.point

Defining points in scikit-chem.

class skchem.core.point.Point3D[source]

Bases: rdkit.Geometry.rdGeometry.Point3D, skchem.core.base.ChemicalObject

Class representing a point in scikit-chem

to_dict(two_d=True)[source]

Dictionary representation of the point.

Parameters:two_d (bool) – Whether the point is in two dimensions or three.
Returns:dictionary of coordinates to values.
Return type:dict[str, float]
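
A minimal sketch of the `two_d` switch follows. The key names `x`, `y` and `z` are an assumption, as the documentation only states that coordinates map to float values, and `PointSketch` is a hypothetical class that does not inherit from the RDKit point type.

```python
class PointSketch:
    """Hypothetical 3D point with a to_dict method taking a two_d flag."""

    def __init__(self, x, y, z=0.0):
        self.x = float(x)
        self.y = float(y)
        self.z = float(z)

    def to_dict(self, two_d=True):
        # Always include x and y; include z only for the 3D representation.
        coords = {"x": self.x, "y": self.y}
        if not two_d:
            coords["z"] = self.z
        return coords


point = PointSketch(1, 2, 3)
```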

Module contents

## skchem.core

Module defining chemical types used in scikit-chem.

class skchem.core.Atom[source]

Bases: rdkit.Chem.rdchem.Atom, skchem.core.base.ChemicalObject

Object representing an Atom in scikit-chem.

class skchem.core.Bond[source]

Bases: rdkit.Chem.rdchem.Bond, skchem.core.base.ChemicalObject

Class representing a chemical bond in scikit-chem.

class skchem.core.Conformer[source]

Bases: rdkit.Chem.rdchem.Conformer, skchem.core.base.ChemicalObject

Class representing a Conformer in scikit-chem.

class skchem.core.Mol(*args, **kwargs)[source]

Bases: rdkit.Chem.rdchem.Mol, skchem.core.base.ChemicalObject

Class representing a Molecule in scikit-chem.

Mol objects inherit directly from rdkit Mol objects. Therefore, they contain atom and bond information, and may also include properties and atom bookmarks.
