skchem.core package

Submodules

skchem.core.atom module

## skchem.core.atom

Defining atoms in scikit-chem.

class skchem.core.atom.Atom[source]

Bases: rdkit.Chem.rdchem.Atom, skchem.core.base.ChemicalObject

Object representing an Atom in scikit-chem.

atomic_mass

float – the atomic mass of the atom in u.

atomic_number

int – the atomic number of the atom.

bonds

tuple<skchem.Bonds> – the bonds to this Atom.

cahn_ingold_prelog

The Cahn Ingold Prelog chirality indicator.

chiral_tag

int – the chiral tag.

covalent_radius

float – the covalent radius in angstroms.

degree

int – the degree of the atom.

depleted_degree

int – the degree of the atom in the h depleted molecular graph.

electron_affinity

float – the first electron affinity in eV.

explicit_valence

int – the explicit valence.

formal_charge

int – the formal charge.

full_degree

int – the full degree of the atom in the h full molecular graph.

hexcode

The hexcode to use as a color for the atom.

hybridization_state

str – the hybridization state.

implicit_valence

int – the implicit valence.

index

int – the index of the atom.

intrinsic_state

float – the intrinsic state of the atom.

ionisation_energy

float – the first ionisation energy in eV.

is_aromatic

bool – whether the atom is aromatic.

is_in_ring

bool – whether the atom is in a ring.

is_terminal

bool – whether the atom is terminal.

kier_hall_alpha_contrib

float – the contribution of the atom to the Kier-Hall alpha.

kier_hall_electronegativity

float – the Kier-Hall electronegativity.

mcgowan_parameter

float – the mcgowan volume parameter

n_explicit_hs

int – the number of explicit hydrogens.

n_hs

int – the instanced, implicit and explicit number of hydrogens

n_implicit_hs

int – the number of implicit hydrogens.

n_instanced_hs

int – The number of instanced hydrogens.

n_lone_pairs

int – the number of lone pairs.

n_pi_electrons

int – the number of pi electrons.

n_total_hs

int – the total number of hydrogens (according to rdkit).

n_val_electrons

int – the number of valence electrons.

neighbours()[source]

tuple<Atom>: the neighbours of the atom.

owner

skchem.Mol – the owning molecule.

Warning

This will seg fault if the atom is created manually.

pauling_electronegativity

float – the electronegativity on the Pauling scale.

polarisability

float – the atomic polarisability in 10^{-20} m^3.

principal_quantum_number

int – the principal quantum number.

props

PropertyView – rdkit properties of the atom.

sanderson_electronegativity

float – the sanderson electronegativity on Pauling scale.

symbol

str – the element symbol of the atom.

valence

int – the valence.

valence_degree

int – the valence degree.

$$ \delta_i^v = Z_i^v - h_i $$

Where $ Z_i^v $ is the number of valence electrons and $ h_i $ is the number of hydrogens.
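As a worked illustration of the formula above, the sketch below evaluates the valence degree for the heavy atoms of acetyl chloride, CC(=O)Cl. The helper `valence_degree` and the hard-coded electron and hydrogen counts are assumptions made for this example; they are not part of scikit-chem.

```python
def valence_degree(n_val_electrons, n_hs):
    """Valence degree: valence electrons minus attached hydrogens."""
    return n_val_electrons - n_hs


# (symbol, valence electrons, attached hydrogens) for CH3-C(=O)-Cl
heavy_atoms = [("C", 4, 3), ("C", 4, 0), ("O", 6, 0), ("Cl", 7, 0)]

valence_degrees = [valence_degree(z, h) for _, z, h in heavy_atoms]
print(valence_degrees)  # [1, 4, 6, 7]
```

The methyl carbon loses three from its four valence electrons because of its three hydrogens; the other atoms carry no hydrogens, so their valence degree equals their valence electron count.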

van_der_waals_radius

float – the Van der Waals radius in angstroms.

van_der_waals_volume

float – the van der Waals volume in cubic angstroms.

$$ \frac{4}{3} \pi r_v^3 $$
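The volume is that of a sphere whose radius is the van der Waals radius. A minimal sketch of the arithmetic follows; the function name is hypothetical, and the 1.70 angstrom radius is the commonly quoted Bondi value for carbon, which may differ from the value in scikit-chem's own tables.

```python
import math


def van_der_waals_volume(radius):
    """Volume (cubic angstroms) of a sphere with the given radius (angstroms)."""
    return 4.0 / 3.0 * math.pi * radius ** 3


# Assumed radius for carbon; not taken from scikit-chem.
carbon_volume = van_der_waals_volume(1.70)
print(round(carbon_volume, 2))
```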

class skchem.core.atom.AtomView(owner)[source]

Bases: skchem.core.base.ChemicalObjectView

adjacency_matrix(bond_orders=False, force=True)[source]

The vertex adjacency matrix.

Parameters:
  • bond_orders (bool) – Whether to use bond orders.
  • force (bool) – Whether to recalculate or use the rdkit cached value.
Returns:

np.array[int]
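To make the effect of the bond_orders flag concrete, here is a small pure-Python sketch that builds an adjacency matrix from a bond list. It is an illustration only: the function below is not the scikit-chem implementation, it uses nested lists rather than a numpy array, and the assumption that bond_orders=True stores the bond order in each cell is mine.

```python
def adjacency_matrix(n_atoms, bonds, bond_orders=False):
    """Build a symmetric vertex adjacency matrix as nested lists."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for begin, end, order in bonds:
        # Either mark the bond with 1 or store its order.
        weight = order if bond_orders else 1
        matrix[begin][end] = weight
        matrix[end][begin] = weight
    return matrix


# Acetyl chloride heavy atoms: 0=C, 1=C, 2=O, 3=Cl
# Bonds as (begin index, end index, order): C-C, C=O, C-Cl
bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1)]

plain = adjacency_matrix(4, bonds)
weighted = adjacency_matrix(4, bonds, bond_orders=True)
```

In this sketch only the carbonyl entries differ between the two matrices: `plain[1][2]` is 1 and `weighted[1][2]` is 2.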

atomic_mass

np.array<float> – the atomic mass of the atoms in view

atomic_number

np.array<int> – the atomic number of the atoms in view

cahn_ingold_prelog

np.array<str> – the CIP string representation of atoms in view.

chiral_tag

np.array<str> – the chiral tag of the atoms in view.

covalent_radius

np.array<float> – the covalent radius of the atoms in the view.

degree

np.array<int> – the degree of the atoms in view, according to rdkit.

depleted_degree

np.array<int> – the degree of the atoms in the view in the h-depleted molecular graph.

distance_matrix(bond_orders=False, force=True)[source]

The vertex distance matrix.

Parameters:
  • bond_orders (bool) – Whether to use bond orders.
  • force (bool) – Whether to recalculate or use the rdkit cached value.
Returns:

np.array[int]
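A topological distance matrix counts the bonds on the shortest path between each pair of atoms. The sketch below computes it with a breadth-first search over a bond list; it is a hypothetical stand-alone helper that ignores bond orders, not the routine scikit-chem or RDKit uses.

```python
from collections import deque


def distance_matrix(n_atoms, bonds):
    """Shortest path length, in bonds, between every pair of atoms."""
    neighbours = [[] for _ in range(n_atoms)]
    for begin, end in bonds:
        neighbours[begin].append(end)
        neighbours[end].append(begin)

    matrix = []
    for start in range(n_atoms):
        dist = [-1] * n_atoms  # -1 marks atoms not reachable from start
        dist[start] = 0
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if dist[nxt] == -1:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        matrix.append(dist)
    return matrix


# Acetyl chloride heavy atoms: 0=C, 1=C, 2=O, 3=Cl
distances = distance_matrix(4, [(0, 1), (1, 2), (1, 3)])
```

Every atom here is at most two bonds from any other, because all paths run through the central carbon at index 1.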

electron_affinity

np.array<float> – the electron affinity of the atoms in the view.

explicit_valence

np.array<int> – the explicit valence of the atoms in view.

formal_charge

np.array<int> – the formal charge on the atoms in view

full_degree

np.array<int> – the degree of the atoms in the view in the h-filled molecular graph.

hexcode

The hexcode to use as a color for the atoms in the view.

hybridization_state

np.array<str> – the hybridization state of the atoms in view.

One of ‘SP’, ‘SP2’, ‘SP3’, ‘SP3D’, ‘SP3D2’, ‘UNSPECIFIED’, ‘OTHER’

implicit_valence

np.array<int> – the implicit valence of the atoms in view.

index

pd.Index – an index for the atoms in the view.

intrinsic_state

np.ndarray<float> – the intrinsic state of the atoms in the view.

ionisation_energy

np.array<float> – the first ionisation energy of the atoms in the view.

is_aromatic

np.array<bool> – whether the atoms in the view are aromatic.

is_in_ring

np.array<bool> – whether the atoms in the view are in a ring.

is_terminal

np.array<bool> – whether the atoms in the view are terminal.

kier_hall_alpha_contrib

np.array<float> – the contribution to the kier hall alpha for each atom in the view.

kier_hall_electronegativity

np.array<float> – the hall kier electronegativity of the atoms in the view.

mcgowan_parameter

np.array<float> – the mcgowan parameter of the atoms in the view.

n_explicit_hs

np.array<int> – the number of explicit hydrogens bonded to atoms in view, according to rdkit.

n_hs

np.array<int> – the number of hydrogens bonded to atoms in view.

n_implicit_hs

np.array<int> – the number of implicit hydrogens bonded to atoms in view, according to rdkit.

n_instanced_hs

np.array<int> – the number of instanced hydrogens bonded to atoms in view.

In this case, instanced means the number of Hs explicitly initialized as atoms.

n_lone_pairs

np.array<int> – the number of lone pairs on atoms in view.

n_pi_electrons

np.array<int> – the number of pi electrons on atoms in view.

n_total_hs

np.array<int> – the number of total hydrogens bonded to atoms in view, according to rdkit.

n_val_electrons

np.array<int> – the number of valence electrons of atoms in view.

pauling_electronegativity

np.array<float> – the pauling electronegativity of the atoms in the view.

polarisability

np.array<float> – the atomic polarisability of the atoms in the view.

principal_quantum_number

np.array<float> – the principal quantum number of the atoms in the view.

sanderson_electronegativity

np.array<float> – the sanderson electronegativity of the atoms in the view.

symbol

np.array<str> – the symbols of the atoms in view

valence

np.array<int> – the valence of the atoms in view.

valence_degree

np.array<int> – the valence degree of the atoms in the view.

van_der_waals_radius

np.array<float> – the Van der Waals radius of the atoms in the view.

van_der_waals_volume

np.array<float> – the Van der Waals volume of the atoms in the view.

skchem.core.base module

## skchem.core.base

Define base classes for scikit chem objects

class skchem.core.base.ChemicalObject[source]

Bases: object

A mixin for each chemical object in scikit-chem.

classmethod from_super(obj)[source]

Converts the class of an object to this class.
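One plausible way to convert the class of an existing object is to reassign its `__class__` attribute, so that no data is copied. The sketch below shows that technique with made-up classes; whether scikit-chem implements from_super this way is an assumption, since the reference only states what the method does.

```python
class ChemicalObjectSketch:
    """Hypothetical stand-in for a mixin offering from_super."""

    @classmethod
    def from_super(cls, obj):
        # Reassign the instance's class in place; no data is copied.
        obj.__class__ = cls
        return obj


class BaseAtom:
    def __init__(self, symbol):
        self.symbol = symbol


class RichAtom(BaseAtom, ChemicalObjectSketch):
    def describe(self):
        return "atom " + self.symbol


base = BaseAtom("C")
rich = RichAtom.from_super(base)
```

After the call, `rich` and `base` are the same object, which now exposes the extra methods of the subclass.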

class skchem.core.base.ChemicalObjectIterator(view)[source]

Bases: object

Iterator for chemical object views.

next()

class skchem.core.base.ChemicalObjectView(owner)[source]

Bases: object

Abstract iterable view of chemical objects.

Concrete classes inheriting from it should implement __getitem__ and __len__.
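The contract described above can be sketched as follows: an abstract view supplies iteration and to_list, and a concrete subclass only provides `__getitem__` and `__len__`. All class names here are hypothetical, and the owner is a plain list rather than a molecule.

```python
class ViewIteratorSketch:
    """Hypothetical iterator that walks a view by index."""

    def __init__(self, view):
        self.view = view
        self._current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._current >= len(self.view):
            raise StopIteration
        item = self.view[self._current]
        self._current += 1
        return item


class ObjectViewSketch:
    """Hypothetical abstract view; subclasses supply __getitem__ and __len__."""

    def __init__(self, owner):
        self.owner = owner

    def __iter__(self):
        return ViewIteratorSketch(self)

    def to_list(self):
        return list(self)


class SymbolView(ObjectViewSketch):
    def __getitem__(self, index):
        return self.owner[index]

    def __len__(self):
        return len(self.owner)


view = SymbolView(["C", "C", "O", "Cl"])
symbols = view.to_list()
```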

props

Return a property view of the objects in the view.

to_list()[source]

Return a list of objects in the view.

class skchem.core.base.MolPropertyView(obj_view)[source]

Bases: skchem.core.base.View

Mol property wrapper.

This provides properties for the atom and bond views.

get(key, default=None)[source]

keys()[source]

The available property keys on the object.

to_dict()[source]

Return a dict of the properties of the view’s objects.

to_frame()[source]

Return a DataFrame of the properties of the view’s objects.

class skchem.core.base.PropertyView(owner)[source]

Bases: skchem.core.base.View

Property object wrapper.

This provides properties for rdkit objects.

keys()[source]

The available property keys on the object.

class skchem.core.base.View[source]

Bases: object

View wrapper interface. Conforms to the dictionary interface.

Objects inheriting from this should implement the keys, __getitem__, __setitem__ and __delitem__ methods.
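A minimal sketch of this interface, assuming the remaining dictionary-style methods are derived from the four that subclasses implement. The class names and the dict-backed storage are hypothetical; a real view would read from and write to an rdkit object instead.

```python
class ViewSketch:
    """Hypothetical base: subclasses implement keys, __getitem__,
    __setitem__ and __delitem__; the rest is derived from those."""

    def keys(self):
        raise NotImplementedError

    def get(self, key, default=None):
        return self[key] if key in self.keys() else default

    def items(self):
        return [(key, self[key]) for key in self.keys()]

    def pop(self, key, default=None):
        value = self.get(key, default)
        if key in self.keys():
            del self[key]
        return value

    def remove(self, key):
        del self[key]

    def clear(self):
        # Copy the keys first, since removal changes the underlying storage.
        for key in list(self.keys()):
            self.remove(key)

    def to_dict(self):
        return dict(self.items())


class DictBackedView(ViewSketch):
    def __init__(self):
        self._data = {}

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]


props = DictBackedView()
props["example_key"] = "example_value"
props["float_key"] = 42
snapshot = props.to_dict()
popped = props.pop("float_key")
missing = props.get("absent", "fallback")
```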

clear()[source]

Remove all properties from the object.

get(index, default=None)[source]

items()[source]

Return an iterable of key, value pairs.

keys()[source]

pop(index, default=None)[source]

remove(key)[source]

Remove a property from the object.

to_dict()[source]

Return a dict of the properties on the object.

to_series()[source]

Return a pd.Series of the properties on the object.

skchem.core.bond module

## skchem.core.bond

Defining chemical bonds in scikit-chem.

class skchem.core.bond.Atom[source]

Bases: rdkit.Chem.rdchem.Atom, skchem.core.base.ChemicalObject

Object representing an Atom in scikit-chem.

atomic_mass

float – the atomic mass of the atom in u.

atomic_number

int – the atomic number of the atom.

bonds

tuple<skchem.Bonds> – the bonds to this Atom.

cahn_ingold_prelog

The Cahn Ingold Prelog chirality indicator.

chiral_tag

int – the chiral tag.

covalent_radius

float – the covalent radius in angstroms.

degree

int – the degree of the atom.

depleted_degree

int – the degree of the atom in the h depleted molecular graph.

electron_affinity

float – the first electron affinity in eV.

explicit_valence

int – the explicit valence.

formal_charge

int – the formal charge.

full_degree

int – the full degree of the atom in the h full molecular graph.

hexcode

The hexcode to use as a color for the atom.

hybridization_state

str – the hybridization state.

implicit_valence

int – the implicit valence.

index

int – the index of the atom.

intrinsic_state

float – the intrinsic state of the atom.

ionisation_energy

float – the first ionisation energy in eV.

is_aromatic

bool – whether the atom is aromatic.

is_in_ring

bool – whether the atom is in a ring.

is_terminal

bool – whether the atom is terminal.

kier_hall_alpha_contrib

float – the contribution of the atom to the Kier-Hall alpha.

kier_hall_electronegativity

float – the Kier-Hall electronegativity.

mcgowan_parameter

float – the mcgowan volume parameter

n_explicit_hs

int – the number of explicit hydrogens.

n_hs

int – the instanced, implicit and explicit number of hydrogens

n_implicit_hs

int – the number of implicit hydrogens.

n_instanced_hs

int – The number of instanced hydrogens.

n_lone_pairs

int – the number of lone pairs.

n_pi_electrons

int – the number of pi electrons.

n_total_hs

int – the total number of hydrogens (according to rdkit).

n_val_electrons

int – the number of valence electrons.

neighbours()[source]

tuple<Atom>: the neighbours of the atom.

owner

skchem.Mol – the owning molecule.

Warning

This will seg fault if the atom is created manually.

pauling_electronegativity

float – the electronegativity on the Pauling scale.

polarisability

float – the atomic polarisability in 10^{-20} m^3.

principal_quantum_number

int – the principal quantum number.

props

PropertyView – rdkit properties of the atom.

sanderson_electronegativity

float – the sanderson electronegativity on Pauling scale.

symbol

str – the element symbol of the atom.

valence

int – the valence.

valence_degree

int – the valence degree.

$$ \delta_i^v = Z_i^v - h_i $$

Where $ Z_i^v $ is the number of valence electrons and $ h_i $ is the number of hydrogens.

van_der_waals_radius

float – the Van der Waals radius in angstroms.

van_der_waals_volume

float – the van der Waals volume in cubic angstroms.

$$ \frac{4}{3} \pi r_v^3 $$

skchem.core.conformer module

## skchem.core.conformer

Defining conformers in scikit-chem.

class skchem.core.conformer.Conformer[source]

Bases: rdkit.Chem.rdchem.Conformer, skchem.core.base.ChemicalObject

Class representing a Conformer in scikit-chem.

align_with_principal_axes()[source]

Align the reference frame with the principal axes of inertia.

canonicalize()[source]

Center the reference frame at the centre of mass and

centre_of_mass

np.array – the centre of mass of the conformer.

centre_representation(centre_of_mass=True)[source]

Centre the representation on the centre of mass.

Parameters: centre_of_mass (bool) – Whether to use the masses of atoms to calculate the centre of mass, or just use the mean position coordinate.
Returns: Conformer

geometric_centre

np.array – the geometric centre of the conformer.
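The difference between the two centres can be shown with a few lines of arithmetic: the geometric centre is the plain mean of the coordinates, while the centre of mass weights each position by the atom's mass. The functions and the two-atom system below are hypothetical and use tuples rather than numpy arrays.

```python
def geometric_centre(positions):
    """Unweighted mean of the atom coordinates."""
    n = len(positions)
    return tuple(sum(p[axis] for p in positions) / n for axis in range(3))


def centre_of_mass(positions, masses):
    """Mass-weighted mean of the atom coordinates."""
    total = sum(masses)
    return tuple(
        sum(m * p[axis] for m, p in zip(masses, positions)) / total
        for axis in range(3)
    )


# Made-up diatomic along x: a light atom at the origin, a heavy one at x = 2.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
masses = [1.0, 3.0]

centroid = geometric_centre(positions)
com = centre_of_mass(positions, masses)
```

Here the geometric centre sits midway at x = 1.0, while the centre of mass is pulled towards the heavier atom at x = 1.5.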

id

The ID of the conformer.

is_3d

bool – whether the conformer is three dimensional.

owner

skchem.Mol – the owning molecule.

positions

np.ndarray – the atom positions in the conformer.

Note

This is a copy of the data, not the data itself. You cannot assign to a slice of this.
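The consequence of returning a copy can be sketched with a stand-in class: writing into the returned value changes only a temporary, so an edit has to be made on a local copy and assigned back as a whole. The class below is hypothetical and uses nested lists; in particular, the setter is an assumption, since this reference does not state whether positions accepts assignment.

```python
import copy


class ConformerSketch:
    """Hypothetical stand-in whose positions property returns a copy."""

    def __init__(self, positions):
        self._positions = positions

    @property
    def positions(self):
        # A fresh copy on every access, so callers cannot mutate internal state.
        return copy.deepcopy(self._positions)

    @positions.setter
    def positions(self, value):
        self._positions = copy.deepcopy(value)


conf = ConformerSketch([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])

conf.positions[0][0] = 9.9        # modifies a temporary copy only
unchanged = conf.positions[0][0]

updated = conf.positions          # take a copy, edit it, assign it back whole
updated[0][0] = 9.9
conf.positions = updated
changed = conf.positions[0][0]
```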

class skchem.core.conformer.ConformerIterator(view)[source]

Bases: object

Iterator for chemical object views.

next()

class skchem.core.conformer.ConformerView(owner)[source]

Bases: skchem.core.base.ChemicalObjectView

append(value)[source]

append_2d(**kwargs)[source]

Append a 2D conformer.

append_3d(n_conformers=1, **kwargs)[source]

Append (a) 3D conformer(s), roughly embedded but not optimized.

Parameters:
  • n_conformers (int) – The number of conformers to append.
  • kwargs – Further kwargs are passed to EmbedMultipleConfs.

id

is_3d

positions

skchem.core.mol module

## skchem.core.mol

Defining molecules in scikit-chem.

class skchem.core.mol.Mol(*args, **kwargs)[source]

Bases: rdkit.Chem.rdchem.Mol, skchem.core.base.ChemicalObject

Class representing a Molecule in scikit-chem.

Mol objects inherit directly from rdkit Mol objects. Therefore, they contain atom and bond information, and may also include properties and atom bookmarks.

Example

Constructors are implemented as class methods with the from_ prefix.

>>> import skchem
>>> m = skchem.Mol.from_smiles('CC(=O)Cl'); m 
<Mol name="None" formula="C2H3ClO" at ...>

This is an rdkit Mol:

>>> from rdkit.Chem import Mol as RDKMol
>>> isinstance(m, RDKMol)
True

A name can be given at initialization:

>>> m = skchem.Mol.from_smiles('CC(=O)Cl', name='acetyl chloride'); m # doctest: +ELLIPSIS
<Mol name="acetyl chloride" formula="C2H3ClO" at ...>

>>> m.name
'acetyl chloride'

Serializers are implemented as instance methods with the to_ prefix.

>>> m.to_smiles()
'CC(=O)Cl'
>>> m.to_inchi()
'InChI=1S/C2H3ClO/c1-2(3)4/h1H3'
>>> m.to_inchi_key()
'WETWJCDKMRHUPV-UHFFFAOYSA-N'

RDKit properties are accessible through the props property:

>>> m.SetProp('example_key', 'example_value') # set prop with rdkit directly
>>> m.props['example_key']
'example_value'
>>> m.SetIntProp('float_key', 42) # set int prop with rdkit directly
>>> m.props['float_key']
42

They can be set too:

>>> m.props['example_set'] = 'set_value'
>>> m.GetProp('example_set') # getting with rdkit directly
'set_value'

We can export the properties into a dict or a pandas series:

>>> m.props.to_series()
example_key    example_value
example_set        set_value
float_key                 42
dtype: object

Atoms and bonds are provided in views:

>>> m.atoms 
<AtomView values="['C', 'C', 'O', 'Cl']" at ...>
>>> m.bonds 
<BondView values="['C-C', 'C=O', 'C-Cl']" at ...>

These are iterable:

>>> [a.symbol for a in m.atoms]
['C', 'C', 'O', 'Cl']

The view provides shorthands for some attributes to get these:

>>> m.atoms.symbol  
array(['C', 'C', 'O', 'Cl'], dtype=...)

Atom and bond props can also be set:

>>> m.atoms[0].props['atom_key'] = 'atom_value'
>>> m.atoms[0].props['atom_key']
'atom_value'

The properties for atoms on the whole molecule can be accessed like so:

>>> m.atoms.props 
<MolPropertyView values="{'atom_key': ['atom_value', None, None, None]}" at ...>

The properties can be exported as a pandas dataframe:

>>> m.atoms.props.to_frame()
            atom_key
atom_idx
0         atom_value
1               None
2               None
3               None

add_hs(inplace=False, add_coords=True, explicit_only=False, only_on_atoms=False)[source]

Add hydrogens to self.

Parameters:
  • inplace (bool) – Whether to add Hs to Mol, or return a new Mol.
  • add_coords (bool) – Whether to set 3D coordinate for added Hs.
  • explicit_only (bool) – Whether to add only explicit Hs, or also implicit ones.
  • only_on_atoms (iterable<bool>) – An iterable specifying the atoms to add Hs.
Returns:

Mol with Hs added.

Return type:

skchem.Mol

atoms

List[skchem.Atom] – An iterable over the atoms of the molecule.

bonds

List[skchem.Bond] – An iterable over the bonds of the molecule.

conformers

List[Conformer] – conformers of the molecule.

copy()[source]

Return a copy of the molecule.

classmethod from_binary(binary)[source]

Decode a molecule from a binary serialization.

Parameters: binary – The bytes string to decode.
Returns: The molecule encoded in the binary.
Return type: skchem.Mol

classmethod from_inchi(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2block(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2file(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smarts(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smiles(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

mass

float – the mass of the molecule.

name

str – The name of the molecule.

Raises: KeyError

props

PropertyView – A dictionary of the properties of the molecule.

remove_hs(inplace=False, sanitize=True, update_explicit=False, implicit_only=False)[source]

Remove hydrogens from self.

Parameters:
  • inplace (bool) – Whether to remove Hs from Mol in place, or return a new Mol.
  • sanitize (bool) – Whether to sanitize after Hs are removed.
  • update_explicit (bool) – Whether to update explicit count after the removal.
  • implicit_only (bool) – Whether to remove explicit and implicit Hs, or implicit Hs only.
Returns:

Mol with Hs removed.

Return type:

skchem.Mol

to_binary()[source]

Serialize the molecule to binary encoding.

Returns: the molecule in bytes.
Return type: bytes

Notes

Due to limitations in RDKit, not all data is serialized. Notably, properties are not, so e.g. compound names are not saved.

to_dict(kind='chemdoodle', conformer_id=-1)[source]

A dictionary representation of the molecule.

Parameters: kind (str) – The type of representation to use. Only chemdoodle is currently supported.
Returns: dictionary representation of the molecule.
Return type: dict

to_formula()[source]

str: the chemical formula of the molecule.

Raises: RuntimeError

to_inchi(*args, **kwargs)

The serializer to be bound.

to_inchi_key()[source]

The InChI key of the molecule.

Returns: the InChI key.
Return type: str
Raises: RuntimeError

to_json(kind='chemdoodle')[source]

Serialize a molecule using JSON.

Parameters: kind (str) – The type of serialization to use. Only chemdoodle is currently supported.
Returns: the json string.
Return type: str

to_molblock(*args, **kwargs)

The serializer to be bound.

to_molfile(*args, **kwargs)

The serializer to be bound.

to_pdbblock(*args, **kwargs)

The serializer to be bound.

to_smarts(*args, **kwargs)

The serializer to be bound.

to_smiles(*args, **kwargs)

The serializer to be bound.

to_tplblock(*args, **kwargs)

The serializer to be bound.

to_tplfile(*args, **kwargs)

The serializer to be bound.

skchem.core.mol.bind_constructor(constructor_name, name_to_bind=None)[source]

Bind an (rdkit) constructor to the class

skchem.core.mol.bind_serializer(serializer_name, name_to_bind=None)[source]

Bind an (rdkit) serializer to the class
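The binding pattern behind the from_ constructors and to_ serializers can be sketched without rdkit: wrap an external function and attach the wrapper to the class under a chosen name. Everything below is hypothetical, including the helper names, their signatures (which differ from bind_constructor and bind_serializer above) and the toy parser and writer; it only illustrates the general idea.

```python
class MolSketch:
    """Hypothetical minimal molecule holding a raw string and a name."""

    def __init__(self, raw, name=None):
        self.raw = raw
        self.name = name


def parse_upper(text):
    """Stand-in for an external parser function."""
    return text.upper()


def render_lower(mol):
    """Stand-in for an external writer function."""
    return mol.raw.lower()


def bind_constructor_sketch(cls, func, name_to_bind):
    # Wrap the parser so its result is built into an instance of the class.
    def constructor(klass, in_arg, name=None, *args, **kwargs):
        return klass(func(in_arg, *args, **kwargs), name=name)
    setattr(cls, name_to_bind, classmethod(constructor))


def bind_serializer_sketch(cls, func, name_to_bind):
    # Wrap the writer so it is callable as an instance method.
    def serializer(self, *args, **kwargs):
        return func(self, *args, **kwargs)
    setattr(cls, name_to_bind, serializer)


bind_constructor_sketch(MolSketch, parse_upper, "from_text")
bind_serializer_sketch(MolSketch, render_lower, "to_text")

mol = MolSketch.from_text("cc(=o)cl", name="acetyl chloride")
round_trip = mol.to_text()
```

The wrapper signature `(klass, in_arg, name=None, *args, **kwargs)` mirrors the one shown for the bound constructors in this reference.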

skchem.core.point module

Module contents

## skchem.core

Module defining chemical types used in scikit-chem.

class skchem.core.Atom[source]

Bases: rdkit.Chem.rdchem.Atom, skchem.core.base.ChemicalObject

Object representing an Atom in scikit-chem.

atomic_mass

float – the atomic mass of the atom in u.

atomic_number

int – the atomic number of the atom.

bonds

tuple<skchem.Bonds> – the bonds to this Atom.

cahn_ingold_prelog

The Cahn Ingold Prelog chirality indicator.

chiral_tag

int – the chiral tag.

covalent_radius

float – the covalent radius in angstroms.

degree

int – the degree of the atom.

depleted_degree

int – the degree of the atom in the h depleted molecular graph.

electron_affinity

float – the first electron affinity in eV.

explicit_valence

int – the explicit valence.

formal_charge

int – the formal charge.

full_degree

int – the full degree of the atom in the h full molecular graph.

hexcode

The hexcode to use as a color for the atom.

hybridization_state

str – the hybridization state.

implicit_valence

int – the implicit valence.

index

int – the index of the atom.

intrinsic_state

float – the intrinsic state of the atom.

ionisation_energy

float – the first ionisation energy in eV.

is_aromatic

bool – whether the atom is aromatic.

is_in_ring

bool – whether the atom is in a ring.

is_terminal

bool – whether the atom is terminal.

kier_hall_alpha_contrib

float – the contribution of the atom to the Kier-Hall alpha.

kier_hall_electronegativity

float – the Kier-Hall electronegativity.

mcgowan_parameter

float – the mcgowan volume parameter

n_explicit_hs

int – the number of explicit hydrogens.

n_hs

int – the instanced, implicit and explicit number of hydrogens

n_implicit_hs

int – the number of implicit hydrogens.

n_instanced_hs

int – The number of instanced hydrogens.

n_lone_pairs

int – the number of lone pairs.

n_pi_electrons

int – the number of pi electrons.

n_total_hs

int – the total number of hydrogens (according to rdkit).

n_val_electrons

int – the number of valence electrons.

neighbours()[source]

tuple<Atom>: the neighbours of the atom.

owner

skchem.Mol – the owning molecule.

Warning

This will seg fault if the atom is created manually.

pauling_electronegativity

float – the electronegativity on the Pauling scale.

polarisability

float – the atomic polarisability in 10^{-20} m^3.

principal_quantum_number

int – the principal quantum number.

props

PropertyView – rdkit properties of the atom.

sanderson_electronegativity

float – the sanderson electronegativity on Pauling scale.

symbol

str – the element symbol of the atom.

valence

int – the valence.

valence_degree

int – the valence degree.

$$ \delta_i^v = Z_i^v - h_i $$

Where $ Z_i^v $ is the number of valence electrons and $ h_i $ is the number of hydrogens.

van_der_waals_radius

float – the Van der Waals radius in angstroms.

van_der_waals_volume

float – the van der Waals volume in cubic angstroms.

$$ \frac{4}{3} \pi r_v^3 $$

class skchem.core.Bond[source]

Bases: rdkit.Chem.rdchem.Bond, skchem.core.base.ChemicalObject

Class representing a chemical bond in scikit-chem.

atom_idxs

tuple[int] – list of atom indexes involved in the bond.

atoms

tuple[Atom] – list of atoms involved in the bond.

draw()[source]

str: Draw the bond in ascii.

index

int – the index of the bond in the molecule.

is_aromatic

bool – whether the bond is aromatic.

is_conjugated

bool – whether the bond is conjugated.

is_in_ring

bool – whether the bond is in a ring.

order

int – the order of the bond.

owner

skchem.Mol – the molecule this bond is a part of.

props

PropertyView – rdkit properties of the bond.

stereo_symbol

str – the stereo label of the bond (‘Z’, ‘E’, ‘ANY’, ‘NONE’)

to_dict()[source]

dict: Convert to a dictionary representation.

class skchem.core.Conformer[source]

Bases: rdkit.Chem.rdchem.Conformer, skchem.core.base.ChemicalObject

Class representing a Conformer in scikit-chem.

align_with_principal_axes()[source]

Align the reference frame with the principal axes of inertia.

canonicalize()[source]

Center the reference frame at the centre of mass and

centre_of_mass

np.array – the centre of mass of the conformer.

centre_representation(centre_of_mass=True)[source]

Centre the representation on the centre of mass.

Parameters: centre_of_mass (bool) – Whether to use the masses of atoms to calculate the centre of mass, or just use the mean position coordinate.
Returns: Conformer

geometric_centre

np.array – the geometric centre of the conformer.

id

The ID of the conformer.

is_3d

bool – whether the conformer is three dimensional.

owner

skchem.Mol – the owning molecule.

positions

np.ndarray – the atom positions in the conformer.

Note

This is a copy of the data, not the data itself. You cannot assign to a slice of this.

class skchem.core.Mol(*args, **kwargs)[source]

Bases: rdkit.Chem.rdchem.Mol, skchem.core.base.ChemicalObject

Class representing a Molecule in scikit-chem.

Mol objects inherit directly from rdkit Mol objects. Therefore, they contain atom and bond information, and may also include properties and atom bookmarks.

Example

Constructors are implemented as class methods with the from_ prefix.

>>> import skchem
>>> m = skchem.Mol.from_smiles('CC(=O)Cl'); m 
<Mol name="None" formula="C2H3ClO" at ...>

This is an rdkit Mol:

>>> from rdkit.Chem import Mol as RDKMol
>>> isinstance(m, RDKMol)
True

A name can be given at initialization:

>>> m = skchem.Mol.from_smiles('CC(=O)Cl', name='acetyl chloride'); m # doctest: +ELLIPSIS
<Mol name="acetyl chloride" formula="C2H3ClO" at ...>

>>> m.name
'acetyl chloride'

Serializers are implemented as instance methods with the to_ prefix.

>>> m.to_smiles()
'CC(=O)Cl'
>>> m.to_inchi()
'InChI=1S/C2H3ClO/c1-2(3)4/h1H3'
>>> m.to_inchi_key()
'WETWJCDKMRHUPV-UHFFFAOYSA-N'

RDKit properties are accessible through the props property:

>>> m.SetProp('example_key', 'example_value') # set prop with rdkit directly
>>> m.props['example_key']
'example_value'
>>> m.SetIntProp('float_key', 42) # set int prop with rdkit directly
>>> m.props['float_key']
42

They can be set too:

>>> m.props['example_set'] = 'set_value'
>>> m.GetProp('example_set') # getting with rdkit directly
'set_value'

We can export the properties into a dict or a pandas series:

>>> m.props.to_series()
example_key    example_value
example_set        set_value
float_key                 42
dtype: object

Atoms and bonds are provided in views:

>>> m.atoms 
<AtomView values="['C', 'C', 'O', 'Cl']" at ...>
>>> m.bonds 
<BondView values="['C-C', 'C=O', 'C-Cl']" at ...>

These are iterable:

>>> [a.symbol for a in m.atoms]
['C', 'C', 'O', 'Cl']

The view provides shorthands for some attributes to get these:

>>> m.atoms.symbol  
array(['C', 'C', 'O', 'Cl'], dtype=...)

Atom and bond props can also be set:

>>> m.atoms[0].props['atom_key'] = 'atom_value'
>>> m.atoms[0].props['atom_key']
'atom_value'

The properties for atoms on the whole molecule can be accessed like so:

>>> m.atoms.props 
<MolPropertyView values="{'atom_key': ['atom_value', None, None, None]}" at ...>

The properties can be exported as a pandas dataframe:

>>> m.atoms.props.to_frame()
            atom_key
atom_idx
0         atom_value
1               None
2               None
3               None

add_hs(inplace=False, add_coords=True, explicit_only=False, only_on_atoms=False)[source]

Add hydrogens to self.

Parameters:
  • inplace (bool) – Whether to add Hs to Mol, or return a new Mol.
  • add_coords (bool) – Whether to set 3D coordinate for added Hs.
  • explicit_only (bool) – Whether to add only explicit Hs, or also implicit ones.
  • only_on_atoms (iterable<bool>) – An iterable specifying the atoms to add Hs.
Returns:

Mol with Hs added.

Return type:

skchem.Mol

atoms

List[skchem.Atom] – An iterable over the atoms of the molecule.

bonds

List[skchem.Bond] – An iterable over the bonds of the molecule.

conformers

List[Conformer] – conformers of the molecule.

copy()[source]

Return a copy of the molecule.

classmethod from_binary(binary)[source]

Decode a molecule from a binary serialization.

Parameters: binary – The bytes string to decode.
Returns: The molecule encoded in the binary.
Return type: skchem.Mol

classmethod from_inchi(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2block(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_mol2file(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_molfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_pdbfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smarts(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_smiles(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplblock(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

classmethod from_tplfile(_, in_arg, name=None, *args, **kwargs)

The constructor to be bound.

mass

float – the mass of the molecule.

name

str – The name of the molecule.

Raises: KeyError

props

PropertyView – A dictionary of the properties of the molecule.

remove_hs(inplace=False, sanitize=True, update_explicit=False, implicit_only=False)[source]

Remove hydrogens from self.

Parameters:
  • inplace (bool) – Whether to remove Hs from Mol in place, or return a new Mol.
  • sanitize (bool) – Whether to sanitize after Hs are removed.
  • update_explicit (bool) – Whether to update explicit count after the removal.
  • implicit_only (bool) – Whether to remove explicit and implicit Hs, or implicit Hs only.
Returns:

Mol with Hs removed.

Return type:

skchem.Mol

to_binary()[source]

Serialize the molecule to binary encoding.

Returns: the molecule in bytes.
Return type: bytes

Notes

Due to limitations in RDKit, not all data is serialized. Notably, properties are not, so e.g. compound names are not saved.

to_dict(kind='chemdoodle', conformer_id=-1)[source]

A dictionary representation of the molecule.

Parameters: kind (str) – The type of representation to use. Only chemdoodle is currently supported.
Returns: dictionary representation of the molecule.
Return type: dict

to_formula()[source]

str: the chemical formula of the molecule.

Raises: RuntimeError

to_inchi(*args, **kwargs)

The serializer to be bound.

to_inchi_key()[source]

The InChI key of the molecule.

Returns: the InChI key.
Return type: str
Raises: RuntimeError

to_json(kind='chemdoodle')[source]

Serialize a molecule using JSON.

Parameters: kind (str) – The type of serialization to use. Only chemdoodle is currently supported.
Returns: the json string.
Return type: str

to_molblock(*args, **kwargs)

The serializer to be bound.

to_molfile(*args, **kwargs)

The serializer to be bound.

to_pdbblock(*args, **kwargs)

The serializer to be bound.

to_smarts(*args, **kwargs)

The serializer to be bound.

to_smiles(*args, **kwargs)

The serializer to be bound.

to_tplblock(*args, **kwargs)

The serializer to be bound.

to_tplfile(*args, **kwargs)

The serializer to be bound.