Skip to content

Specie

mdinterface.core.specie.Specie

Bases: object

A single molecular species with geometry and force-field parameters.

Combines an ASE :class:ase.Atoms object with the topology (atom types, bonds, angles, dihedrals, impropers) and Lennard-Jones parameters needed to write a LAMMPS data file. All database entries (Water, Ion, Metal111, …) and Polymer inherit from this class.

.. note:: All topology parameters (bonds, angles, dihedrals, impropers, lj, atom_types, charges) are optional. A bare Specie(atoms) containing only positions is perfectly valid and sufficient for geometry tasks, AIMD, or ML-MD workflows. Force-field parameters are only needed when writing LAMMPS data files.

Parameters:

Name Type Description Default
atoms Atoms or str

Molecular geometry. A chemical formula string (e.g. "H2O") is accepted and converted to an ase.Atoms object.

None
charges float, list, or array-like

Partial charges in elementary charge units. A scalar broadcasts to all atoms.

None
atom_types list of Atom

Force-field atom-type objects. If None and lj is provided, types are inferred from element symbols.

None
bonds Bond or list of Bond

Bond interaction definitions.

None
angles Angle or list of Angle

Angle interaction definitions.

None
dihedrals Dihedral or list of Dihedral

Dihedral interaction definitions.

None
impropers Improper or list of Improper

Improper dihedral definitions.

None
lj dict

Lennard-Jones parameters keyed by element symbol: {element: [epsilon (kcal/mol), sigma (Å)]}.

None
cutoff float

Scale factor applied to covalent radii when building the bond graph. Increase slightly for flexible molecules with long bonds.

1.0
name str

Residue name (truncated to 4 characters for PACKMOL compatibility). Defaults to the chemical formula.

None
lammps_data str

Path to a LAMMPS data file. When provided, geometry and topology are read from this file and all other structural arguments are ignored.

None
fix_missing bool

If True, attempt to auto-generate missing bond/angle/dihedral entries based on the bond graph.

False
chg_scaling float

Multiplicative scale factor applied to charges.

1.0
pbc bool

Whether to treat the atomic cell as periodic.

False
ligpargen bool

If True, call LigParGen to generate OPLS-AA force-field parameters. Requires a working LigParGen installation and BOSSdir in config.

False
tot_charge int

Total molecular charge (integer). Inferred from nominal_charge arrays if not supplied.

None
prune_z bool

Remove the Z-coordinate component from atom positions.

False
calc Calculator

ASE calculator to attach to the ase.Atoms object.

None
keep_ids bool

Preserve atom IDs as read from the input rather than re-indexing.

False

Examples:

Load from the built-in database::

from mdinterface.database import Water, Ion
water = Water(model="ewald")
na    = Ion("Na", ffield="Cheatham")

Define a custom molecule::

from ase import Atoms
from mdinterface import Specie
mol = Atoms("CO2", positions=[[0,0,0],[1.16,0,0],[-1.16,0,0]])
co2 = Specie(mol, charges=[-0.3298, 0.6596, -0.3298])

Generate OPLS-AA parameters with LigParGen::

methanol = Specie("CH3OH", ligpargen=True)
Source code in mdinterface/core/specie.py
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
class Specie(object):
    """
    A single molecular species with geometry and force-field parameters.

    Combines an ASE :class:`ase.Atoms` object with the topology (atom types,
    bonds, angles, dihedrals, impropers) and Lennard-Jones parameters needed
    to write a LAMMPS data file.  All database entries (Water, Ion, Metal111,
    …) and Polymer inherit from this class.

    .. note::
        All topology parameters (``bonds``, ``angles``, ``dihedrals``,
        ``impropers``, ``lj``, ``atom_types``, ``charges``) are **optional**.
        A bare ``Specie(atoms)`` containing only positions is perfectly valid
        and sufficient for geometry tasks, AIMD, or ML-MD workflows.
        Force-field parameters are only needed when writing LAMMPS data files.

    Parameters
    ----------
    atoms : ase.Atoms or str, optional
        Molecular geometry.  A chemical formula string (e.g. ``"H2O"``) is
        accepted and converted to an ``ase.Atoms`` object.
    charges : float, list, or array-like, optional
        Partial charges in elementary charge units.  A scalar broadcasts to
        all atoms.
    atom_types : list of Atom, optional
        Force-field atom-type objects.  If *None* and *lj* is provided, types
        are inferred from element symbols.
    bonds : Bond or list of Bond, optional
        Bond interaction definitions.
    angles : Angle or list of Angle, optional
        Angle interaction definitions.
    dihedrals : Dihedral or list of Dihedral, optional
        Dihedral interaction definitions.
    impropers : Improper or list of Improper, optional
        Improper dihedral definitions.
    lj : dict, optional
        Lennard-Jones parameters keyed by element symbol:
        ``{element: [epsilon (kcal/mol), sigma (Å)]}``.
    cutoff : float, default 1.0
        Scale factor applied to covalent radii when building the bond graph.
        Increase slightly for flexible molecules with long bonds.
    name : str, optional
        Residue name (truncated to 4 characters for PACKMOL compatibility).
        Defaults to the chemical formula.
    lammps_data : str, optional
        Path to a LAMMPS data file.  When provided, geometry and topology are
        read from this file and all other structural arguments are ignored.
    fix_missing : bool, default False
        If True, attempt to auto-generate missing bond/angle/dihedral entries
        based on the bond graph.
    chg_scaling : float, default 1.0
        Multiplicative scale factor applied to *charges*.
    pbc : bool, default False
        Whether to treat the atomic cell as periodic.
    ligpargen : bool, default False
        If True, call LigParGen to generate OPLS-AA force-field parameters.
        Requires a working LigParGen installation and ``BOSSdir`` in config.
    tot_charge : int, optional
        Total molecular charge (integer).  Inferred from ``nominal_charge``
        arrays if not supplied.
    prune_z : bool, default False
        Remove the Z-coordinate component from atom positions.
    calc : ase.Calculator, optional
        ASE calculator to attach to the ``ase.Atoms`` object.
    keep_ids : bool, default False
        Preserve atom IDs as read from the input rather than re-indexing.

    Examples
    --------
    Load from the built-in database::

        from mdinterface.database import Water, Ion
        water = Water(model="ewald")
        na    = Ion("Na", ffield="Cheatham")

    Define a custom molecule::

        from ase import Atoms
        from mdinterface import Specie
        mol = Atoms("CO2", positions=[[0,0,0],[1.16,0,0],[-1.16,0,0]])
        co2 = Specie(mol, charges=[-0.3298, 0.6596, -0.3298])

    Generate OPLS-AA parameters with LigParGen::

        methanol = Specie("CH3OH", ligpargen=True)
    """

    def __init__(self, atoms=None, charges=None, atom_types=None, bonds=None,
                 angles=None, dihedrals=None, impropers=None, lj=None, cutoff=1.0,
                 name=None, lammps_data=None, fix_missing=False, chg_scaling=1.0,
                 pbc=False, ligpargen=False, tot_charge=None, prune_z=False,
                 calc=None, keep_ids=False):

        # store int. variables
        self.cutoff = cutoff

        # if file provided, read it
        if lammps_data is not None:
            atoms, atom_types, bonds, angles, dihedrals, impropers = read_lammps_data_file(lammps_data)
        else:
            # read/update atoms atoms
            atoms, atom_types_tmp = self._read_atoms(atoms, charges,
                                                 chg_scaling=chg_scaling, pbc=pbc)
            if atom_types is None:
                atom_types = atom_types_tmp

        # setup nominal charge
        if tot_charge is None:
            if not "nominal_charge" in atoms.arrays:
                atoms.set_array("nominal_charge", np.array(len(atoms)*[0]))
            tot_charge = int(np.sum(atoms.arrays["nominal_charge"]))

        # assign name
        if name is None:
            name = atoms.get_chemical_formula()
        if len(name) > 4:
            logger.warning(
                "Resname '%s' is longer than 4 characters and will be "
                "truncated to '%s'. PACKMOL requires resnames <= 4 chars.",
                name, name[:4],
            )
        self.resname = name[:4]

        # set up atoms and generate graph of specie
        self.set_atoms(atoms, cutoff, prune_z=prune_z)

        # read atom_types from LJ
        if atom_types is None:
            atom_types = self._atom_types_from_lj(lj if lj is not None else {})

        # set up internal topology attributes
        self._setup_topology(atom_types, bonds, angles, dihedrals, impropers,
                             keep_ids=keep_ids)

        if fix_missing:
            self._fix_missing_interactions()

        # initialize topology indexing
        self._update_topology_indexing()

        # run ligpargen to calculate parameters; for large molecules use the
        # segment-and-junction strategy automatically
        if ligpargen:
            if len(self.atoms) <= 200:
                _, atom_types, bonds, angles, dihedrals, impropers = run_ligpargen(
                    self.atoms, charge=tot_charge)
                self._setup_topology(atom_types, bonds, angles, dihedrals, impropers)
                self._update_topology_indexing()
            else:
                from mdinterface.externals.ligpargen import refine_large_specie_topology
                refine_large_specie_topology(self)

        # set tot charge
        self._tot_charge = tot_charge

        # add calculator if provided
        self.assign_calculator(calc)

        return

    @property
    def atoms(self):
        return self._atoms

    @property
    def graph(self):
        return self._graph

    @property
    def bonds(self):
        if not self._bmap: return [[], []]
        return self._find_interactions(1, self._bmap)

    @property
    def angles(self):
        if not self._amap: return [[], []]
        return self._find_interactions(2, self._amap)

    @property
    def dihedrals(self):
        if not self._dmap: return [[], []]
        return self._find_interactions(3, self._dmap)

    @property
    def impropers(self):
        if not self._imap: return [[], []]
        return self._find_interactions(3, self._imap, impropers=True)

    @property
    def _sids(self):
        return self.atoms.arrays["sids"]

    @_sids.setter
    def _sids(self, atom_ids):
        self.atoms.arrays["sids"] = atom_ids

    def copy(self):
        return copy.deepcopy(self)

    @property
    def charges(self):
        return self.atoms.get_initial_charges()

    @property
    def calc(self):
        return self.atoms.calc

    # read atoms to return ase.Atoms
    @staticmethod
    def _read_atoms(atoms, charges, chg_scaling=1.0, pbc=False):

        # initialize atoms obj
        if isinstance(atoms, str):
            try:
                atoms = ase.io.read(atoms)
            except Exception:
                try:
                    atoms = ase.build.molecule(atoms)
                except Exception:
                    atoms = ase.Atoms(atoms)
        elif isinstance(atoms, ase.Atoms):
            atoms = atoms.copy()

        # assign charges
        if charges is not None:
            charges = as_list(charges)
            if len(charges) == 1:
                charges = len(atoms)*charges
        else:
            charges = atoms.get_initial_charges()

        # rescale charges if needed
        charges = chg_scaling*np.asarray(charges)
        atoms.set_initial_charges(charges)

        # see if stype is already present and contains Atom objects (not raw
        # strings from a serialized xyz -- those can't be used without LJ data)
        if "stype" in atoms.arrays:
            stype = atoms.arrays["stype"]
            if len(stype) == 0 or not hasattr(stype[0], "label"):
                stype = None
        else:
            stype = None

        return atoms, stype

    def _setup_topology(self, atoms, bonds, angles, dihedrals, impropers,
                        keep_ids=False):
        """
        Setup topology from scratch - this is run in the beginning
        """

        # define atoms
        atoms_list, atom_map, atom_ids = pmap.map_atoms(as_list(atoms), keep_ids=keep_ids)

        self._stype = atoms_list
        self._smap = atom_map
        self._sids = atom_ids

        # initialize topology values
        self._old_bonds = copy.deepcopy(as_list(bonds))
        self._old_angles = copy.deepcopy(as_list(angles))
        self._old_dihedrals = copy.deepcopy(as_list(dihedrals))
        self._old_impropers = copy.deepcopy(as_list(impropers))

        # map list of inputs
        self._update_topology_mappings()

        return

    def _update_topology_mappings(self):
        """
        Update topology mappings.
        """
        # Remap topology lists
        bonds_list, bond_map, bond_ids = pmap.map_bonds(self._old_bonds)
        angles_list, angle_map, angle_ids = pmap.map_angles(self._old_angles)
        dihedrals_list, dihedral_map, dihedral_ids = pmap.map_dihedrals(self._old_dihedrals)
        impropers_list, improper_map, improper_ids = pmap.map_impropers(self._old_impropers)

        self._btype = bonds_list
        self._atype = angles_list
        self._dtype = dihedrals_list
        self._itype = impropers_list

        self._bmap = bond_map
        self._amap = angle_map
        self._dmap = dihedral_map
        self._imap = improper_map

        self._bids = bond_ids
        self._aids = angle_ids
        self._dids = dihedral_ids
        self._iids = improper_ids

        # Update topology to ensure consistency
        self._update_topology_indexing()

        return

    def _update_topology_indexing(self):
        formula = self.atoms.get_chemical_formula()

        for attributes in ["_btype", "_atype", "_dtype", "_itype", "_stype"]:
            attr_type = []
            for attr in self.__getattribute__(attributes):
                attr.set_formula(formula)
                attr.set_resname(self.resname)
                if attr.id is None or attr.id in attr_type:
                    idx = find_smallest_missing(attr_type, start=1)
                    attr.set_id(idx)
                else:
                    idx = attr.id
                attr_type.append(idx)

        return

    def _update_junction_lj_types(self, cut_idxs, snippet_idxs, sn_atypes):
        """Update LJ atom types for junction atoms using a snippet LigParGen result.

        Called after a junction snippet run to correct the OPLS types of atoms
        that had H caps substituting their real bonded partner during the
        segment/monomer LigParGen run.  Only the atoms in *cut_idxs* are
        updated; atoms further from the cut already have correct context.

        Parameters
        ----------
        cut_idxs : iterable of int
            Original atom indices whose types need correcting (typically the
            two atoms on either side of the cut bond).
        snippet_idxs : array-like of int
            Mapping from snippet positions to original atom indices.
        sn_atypes : list of Atom
            Atom-type objects returned by ``run_ligpargen`` for the snippet.
        """
        for cut_idx in cut_idxs:
            sn_pos     = int(np.argwhere(snippet_idxs == cut_idx)[0][0])
            new_atom   = sn_atypes[sn_pos]
            atom_label = str(self._sids[cut_idx])
            existing_idx = next(
                (idx for idx, a in enumerate(self._stype) if a == new_atom),
                None)
            if existing_idx is None:
                added = copy.deepcopy(new_atom)
                added.set_label(atom_label)
                self._stype.append(added)
                existing_idx = len(self._stype) - 1
            self._smap[atom_label] = existing_idx

    def _add_to_topology(self, bonds=None, angles=None, dihedrals=None, impropers=None):
        if bonds is None:
            bonds = []
        if angles is None:
            angles = []
        if dihedrals is None:
            dihedrals = []
        if impropers is None:
            impropers = []

        # check for uniqueness
        new_bonds = []
        new_angles = []
        new_dihedrals = []
        new_impropers = []
        for bond in copy.deepcopy(bonds):
            if not any(bond.__eq_strict__(bb) for bb in self._old_bonds):
                new_bonds.append(bond)

        for angle in copy.deepcopy(angles):
            if not any(angle.__eq_strict__(aa) for aa in self._old_angles):
                new_angles.append(angle)

        for dihedral in copy.deepcopy(dihedrals):
            if not any(dihedral.__eq_strict__(dd) for dd in self._old_dihedrals):
                new_dihedrals.append(dihedral)

        for improper in copy.deepcopy(impropers):
            if not any(improper.__eq_strict__(ii) for ii in self._old_impropers):
                new_impropers.append(improper)

        # add to topology
        self._old_bonds     += new_bonds
        self._old_angles    += new_angles
        self._old_dihedrals += new_dihedrals
        self._old_impropers += new_impropers

        # update topology mapping
        self._update_topology_mappings()

        return

    def _cleanup_topology(self):
        """
        Remove topology interactions that reference deleted atoms.

        This method should be called after atoms are deleted from the system
        to ensure topology consistency.

        Parameters:
        deleted_atom_ids (list): List of atom IDs (from self._sids) that were deleted
        """

        atom_ids = self._sids

        # Clean bonds
        valid_bonds = []
        for bond in self._old_bonds:
            if all(sim in atom_ids for sim in bond.symbols):
                valid_bonds.append(bond)
        self._old_bonds = valid_bonds

        # Clean angles
        valid_angles = []
        for angle in self._old_angles:
            if all(sim in atom_ids for sim in angle.symbols):
                valid_angles.append(angle)
        self._old_angles = valid_angles

        # Clean dihedrals
        valid_dihedrals = []
        for dihedral in self._old_dihedrals:
            if all(sim in atom_ids for sim in dihedral.symbols):
                valid_dihedrals.append(dihedral)
        self._old_dihedrals = valid_dihedrals

        # Clean impropers
        valid_impropers = []
        for improper in self._old_impropers:
            if all(sim in atom_ids for sim in improper.symbols):
                valid_impropers.append(improper)
        self._old_impropers = valid_impropers

        # Update topology mappings
        self._update_topology_mappings()

        return

    def _fix_missing_interactions(self):

        mss_bnd = pmap.generate_missing_interactions(self, "bonds")
        mss_ang = pmap.generate_missing_interactions(self, "angles")
        mss_dih = pmap.generate_missing_interactions(self, "dihedrals")
        mss_imp = pmap.generate_missing_interactions(self, "impropers")

        self._add_to_topology(mss_bnd, mss_ang, mss_dih, mss_imp)

        return

    # function to setup atom types
    def _atom_types_from_lj(self, lj):

        # use function to retrieve IDs
        atom_type_ids, types_map = find_atom_types(self.graph, max_depth=1)

        atom_types = []
        for atom_id in atom_type_ids:

            atom_symbol = types_map[atom_id][0]
            atom_neighs = "".join(types_map[atom_id][1])

            label = "{}_{}".format(atom_symbol, atom_neighs)
            # types_map[atom_type] = label

            if label in lj:
                eps, sig = lj[label]
            elif atom_symbol in lj:
                eps, sig = lj[atom_symbol]
            else:
                eps, sig = None,None

            atom = Atom(atom_symbol, label=label, eps=eps, sig=sig)
            atom_types.append(atom)

        return atom_types

    def round_charges(self, nround=7):

        rcharges = round_list_to_sum(self.charges, round(sum(self.charges), nround), nround)
        self.atoms.set_initial_charges(rcharges)

        return

    def set_atoms(self, atoms, cutoff=1.0, prune_z=False):

        # make sure no mess
        atoms = atoms.copy()

        if prune_z:
            zmin = atoms.get_positions(wrap=True)[:,2].min()
            zmax = atoms.get_positions(wrap=True)[:,2].max()
            atoms.translate([0,0,-zmin])
            atoms.cell[2][2] = zmax-zmin
            atoms.pbc[2] = False

        if not any(atoms.pbc):
            atoms.set_center_of_mass([0,0,0])

        self._atoms = atoms
        self._graph = molecule_to_graph(atoms, cutoff_scale=cutoff)

        return

    def assign_calculator(self, calc=None):
        self.atoms.calc = calc
        return

    # covnert to mda.Universe
    def to_universe(self, charges=True, layered=False, match_cell=False, xydim=None):

        # empty top object
        top = mda.core.topology.Topology(n_atoms=len(self.atoms))

        # empty universe
        uni = mda.Universe(top, self.atoms.get_positions())

        # add some stuff
        uni.add_TopologyAttr("masses", self.atoms.get_masses())
        uni.add_TopologyAttr("resnames", [self.resname])
        # add atom symbols
        atom_symbols = self.atoms.get_chemical_symbols()  # Assuming this method exists
        uni.add_TopologyAttr("names", atom_symbols)

        # generate type
        types, indexes = self.get_atom_types(return_index=True)
        uni.add_TopologyAttr("types", types)
        uni.add_TopologyAttr("type_index", indexes)

        # populate with bonds and angles
        for att in ["bonds", "angles", "dihedrals", "impropers"]:
            attribute, types = self.__getattribute__(att)
            att_types = self._type2id(att, types)
            uni._add_topology_objects(att, attribute, types=att_types)

        # add charges
        if charges:
            uni.add_TopologyAttr("charges", self.atoms.get_initial_charges())

        # layer it nicely
        if layered:
            layer_idxs, __ = ase.geometry.get_layers(self.atoms, [0,0,1],
                                                      tolerance=0.01)
            groups = []
            for idx in np.unique(layer_idxs):
                idxs = np.where(layer_idxs == idx)[0]

                groups.append(uni.atoms[idxs])

            uni = mda.Merge(*groups)

            # uni.residues[0].resids = layer_idxs

        # match the cell!
        if match_cell:
            assert xydim is not None
            zz = self.atoms.get_cell()[2][2]
            tatoms = self.atoms.copy()

            tatoms.set_cell([xydim[0], xydim[1], zz], scale_atoms=True)

            uni.dimensions = tatoms.cell.cellpar()
            uni.atoms.positions = tatoms.get_positions()

        # if has cell info, pass them along
        elif self.atoms.get_cell():
            uni.dimensions = self.atoms.cell.cellpar()

        return uni

    def repeat(self, rep, make_cubic=False):

        atoms = self.atoms.repeat(rep)

        if make_cubic:  # Note: assumes orthogonal cell; off-diagonal components are discarded

            xsize = [1,0,0]@atoms.cell@[1,0,0]
            ysize = [0,1,0]@atoms.cell@[0,1,0]
            zsize = [0,0,1]@atoms.cell@[0,0,1]

            atoms.set_cell([xsize, ysize, zsize, 90, 90, 90])
            atoms.wrap()

        self.set_atoms(atoms)

        return

    def _find_interactions(self, path_length, tag_map, impropers=False):

        if not impropers:
            paths = find_unique_paths_of_length(self.graph, path_length)
        else:
            paths = find_improper_idxs(self.graph)

        interaction_list = []
        interaction_type = []

        for indices in paths:
            atoms = [self._sids[idx] for idx in indices]
            symbols = [atom.split("_")[0] for atom in atoms]

            if tuple(atoms) in tag_map:
                interaction_list.append(indices)
                interaction_type.append(tag_map[tuple(atoms)])
            elif tuple(atoms[::-1]) in tag_map:
                interaction_list.append(indices[::-1])
                interaction_type.append(tag_map[tuple(atoms[::-1])])
            elif tuple(symbols) in tag_map:
                interaction_list.append(indices)
                interaction_type.append(tag_map[tuple(symbols)])
            elif tuple(symbols[::-1]) in tag_map:
                interaction_list.append(indices[::-1])
                interaction_type.append(tag_map[tuple(symbols[::-1])])

        interaction_list = np.array(interaction_list, dtype=int)
        interaction_type = np.array(interaction_type, dtype=int)

        return [interaction_list.tolist(), interaction_type.tolist()]

    def get_atom_types(self, return_index=False):

        type_indexes = np.array([self._smap[ii] for ii in self._sids])
        atom_types   = np.array([self._stype[ii].extended_label for ii in type_indexes])

        if return_index:
            return atom_types, type_indexes

        return atom_types

    # method to estimate the volume of the specie
    def estimate_specie_volume(self, probe_radius=0):

        try:
            from libarvo import molecular_vs
        except ImportError:
            raise ImportError(
                "libarvo is required for volume estimation but is not installed. "
                "Install it with: pip install libarvo"
            )

        centers = self.atoms.get_positions()
        radii = [vdw_radii[ii] for ii in self.atoms.get_atomic_numbers()]

        volume, surface = molecular_vs(centers, radii, probe_radius)

        return volume

    # method to estimate the radius of the specie if it were a sphere
    def estimate_specie_radius(self, probe_radius=0):
        volume = self.estimate_specie_volume(probe_radius=probe_radius)
        return (3 * volume / (4 * np.pi)) ** (1/3)

    def view(self):
        ase.visualize.view(self.atoms)
        return

    def _type2id(self, attribute, types):

        # Get the corresponding attribute list
        if attribute == "bonds":
            attribute_list = self._btype
        elif attribute == "angles":
            attribute_list = self._atype
        elif attribute == "dihedrals":
            attribute_list = self._dtype
        elif attribute == "impropers":
            attribute_list = self._itype
        elif attribute == "atoms":
            attribute_list = self._stype
        else:
            raise ValueError("Invalid topology attribute type.")

        return [attribute_list[idx].id for idx in types]

    def suggest_missing_interactions(self, stype="all"):

        # Check for missing interactions
        missing_bonds = pmap.find_missing_bonds(self)
        missing_angles = pmap.find_missing_angles(self)
        missing_dihedrals = pmap.find_missing_dihedrals(self)
        missing_impropers = pmap.find_missing_impropers(self)

        suggestions = {
            "bonds": missing_bonds,
            "angles": missing_angles,
            "dihedrals": missing_dihedrals,
            "impropers": missing_impropers
        }

        if stype == "all":
            return suggestions
        return suggestions[stype]

    def __repr__(self):
        return f"{self.__class__.__name__}({self.resname})"

    def find_relevant_distances(self, Nmax, Nmin=0, centers=None, Ninv=0):

        unique_pairs_list = find_relevant_distances(self.graph, Nmax, Nmin=Nmin,
                                                    centers=centers, Ninv=Ninv)
        return unique_pairs_list

    def plot_graph(self, show_bonds=False):

        colors = [jmol_colors[a.number] for a in self.atoms]

        # node_pos = nx.spring_layout(self.graph, k=1.5/np.sqrt(self.graph.order()))
        node_pos = nx.kamada_kawai_layout(self.graph)

        fig, ax = plt.subplots()
        nx.draw(self.graph, pos=node_pos, with_labels=True, node_color=colors,
                node_size=1000, edge_color='black', linewidths=2, font_size=15,
                edgecolors="black", ax=ax, width=2)

        if show_bonds:
            draw_bond_markers(ax, self, node_pos, jmol_colors)

        plt.show()

        return

    def update_positions(self, positions=None, cellpar=None, atoms=None, prune_z=False):

        if atoms is not None:
            positions = atoms.get_positions()
            cellpar   = atoms.get_cell()

        atoms = self._atoms.copy()

        if positions is not None:
            atoms.set_positions(positions)
        if cellpar is not None:
            atoms.set_cell(cellpar)

        self.set_atoms(atoms, cutoff=self.cutoff, prune_z=prune_z)

        return

    def write_gromacs_itp(self, filename=None):
        """
        Write a GROMACS include topology (.itp) file for this species.

        See :func:`~mdinterface.io.gromacswriter.write_gromacs_itp` for full
        documentation.

        Parameters
        ----------
        filename : str, optional
            Output filename. Defaults to ``{resname}.itp``.
        """
        logger.warning(
            "write_gromacs_itp is experimental -- verify output before production use."
        )
        from mdinterface.io.gromacswriter import write_gromacs_itp
        write_gromacs_itp(self, filename=filename)

    def refine_large_topology(self, Nmax=12, ending="H", offset=False,
                              segment_size=200):
        """
        Assign OPLS-AA parameters to this species via LigParGen using a
        segment-and-junction strategy.

        The molecule is split into segments of at most *segment_size* atoms
        along clean backbone bonds (avoiding rings, heteroatoms, and their
        neighbours). Each segment is capped and passed to LigParGen
        individually; a local snippet centred on each cut bond then corrects
        the junction region.

        Requires ``nominal_charge`` to be set on ``self.atoms`` (integer
        formal charge per atom, same convention as
        :class:`~mdinterface.core.polymer.Polymer`).

        Parameters
        ----------
        Nmax : int
            Neighbourhood radius (in bonds) for junction snippet creation.
        ending : str
            Capping element for dangling bonds. Default ``"H"``.
        offset : bool
            Redistribute residual charge error uniformly to match
            ``nominal_charge.sum()``.
        segment_size : int
            Maximum atoms per segment. Default 200. Lower values force more
            splits and are useful for testing.
        """
        from mdinterface.externals.ligpargen import refine_large_specie_topology
        refine_large_specie_topology(self, Nmax=Nmax, ending=ending,
                                     offset=offset, segment_size=segment_size)

    def write_gro(self, filename=None):
        """
        Write a GROMACS structure (.gro) file for this species.

        Parameters
        ----------
        filename : str, optional
            Output filename. Defaults to ``{resname}.gro``.
        """
        if filename is None:
            filename = f"{self.resname}.gro"
        universe = self.to_universe()
        universe.atoms.write(filename)

    def estimate_charges(self, method="obabel", charge=None, assign=False, **respargs):

        if charge is None:
            charge = 0
            logger.warning("No charge specified for estimate_charges; defaulting to 0. "
                           "Verify this is correct for your system.")

        if method == "obabel":
            charges = run_OBChargeModel(self.atoms)
        elif method == "ligpargen":
            system, _, _, _, _, _ = run_ligpargen(self.atoms, charge=charge)
            charges = system.get_initial_charges()
        elif method == "resp":
            charges, atoms = calculate_RESP_charges(self, **respargs)

        if assign:
            self.atoms.set_positions(atoms.get_positions())
            self.atoms.set_initial_charges(charges)

        return charges

    def estimate_OPLSAA_parameters(self, charge=None):

        if charge is None:
            charge = 0

        system, atoms, bonds, angles, dihedrals, impropers = run_ligpargen(
            self.atoms, charge=charge)

        return system, atoms, bonds, angles, dihedrals, impropers

    def relax_structure(self, optimizer='FIRE', fmax=0.05,
                       steps=200, update_positions=True, trajectory=None,
                       logfile=None, charge=None, spin=0, **kwargs):
        """
        Relax the Specie structure using ASE or UMA optimization.

        Parameters
        ----------
        optimizer : str, default 'FIRE'
            Optimizer for ASE method ('BFGS', 'LBFGS', 'FIRE')
        fmax : float, default 0.05
            Maximum force threshold for convergence (eV/Å)
        steps : int, default 200
            Maximum number of optimization steps
        update_positions : bool, default True
            Whether to update the Specie's atomic positions after relaxation
        trajectory : str, optional
            Path to save optimization trajectory
        logfile : str, optional
            Path to save optimization log
        charge : int or None, default None
            Total charge of the system. Defaults to the stored tot_charge.
        spin : int, default 0
            Total spin multiplicity (2S). Use >0 for open-shell systems.
        **kwargs
            Additional arguments passed to the relaxation function

        Returns
        -------
        relaxed_atoms : ase.Atoms
            The relaxed atomic structure
        converged : bool
            Whether the optimization converged
        """

        if charge is None:
            charge = self._tot_charge

        atoms_to_relax = self.atoms
        atoms_to_relax.info["charge"] = charge
        atoms_to_relax.info["spin"]   = spin

        relaxed_atoms = relax_structure(
            atoms_to_relax, optimizer=optimizer, fmax=fmax, steps=steps,
            trajectory=trajectory, logfile=logfile, **kwargs)

        # Update positions if requested
        if update_positions:
            self.update_positions(atoms=relaxed_atoms, prune_z=False)

        return

    def run_aimd(self, timestep=0.5, temperature_K=300, friction=0.001,
              steps=1000, update_positions=True, trajectory=None,
              logfile=None, **kwargs):
        """
        Run AIMD for the Specie structure using ASE and FAIRChem.

        Parameters
        ----------
        timestep : float, default 0.1
            Time step for the simulation in fs.
        temperature_K : float, default 300
            Target temperature for the Langevin dynamics in Kelvin.
        friction : float, default 0.001
            Frictional damping coefficient in 1/fs.
        steps : int, default 1000
            Number of time steps to run the AIMD.
        update_positions : bool, default True
            Whether to update the Specie's atomic positions after AIMD.
        trajectory : str, optional
            Path to save the MD trajectory.
        logfile : str, optional
            Path to save the MD log.
        **kwargs
            Additional arguments passed to the Langevin integrator.

        Returns
        -------
        None
        """

        atoms_to_simulate = self.atoms
        atoms_to_simulate.info["charge"] = self._tot_charge
        atoms_to_simulate.info["spin"] = 0  # Update this if needed

        # Call the previously defined function to perform AIMD
        run_aimd(atoms_to_simulate, timestep=timestep, temperature_K=temperature_K,
                 friction=friction, steps=steps, trajectory=trajectory,
                 logfile=logfile, **kwargs)

        # Update positions if requested
        if update_positions:
            self.update_positions(atoms=atoms_to_simulate, prune_z=False)

        return

    def _find_rings(self, max_ring_size=8):
        return find_rings(self.graph, max_ring_size=max_ring_size)

    def _get_rings_containing_atoms(self, atom_indices, max_ring_size=8):
        return get_rings_containing_atoms(self.graph, atom_indices, max_ring_size=max_ring_size)

write_gromacs_itp(filename=None)

Write a GROMACS include topology (.itp) file for this species.

See :func:~mdinterface.io.gromacswriter.write_gromacs_itp for full documentation.

Parameters:

Name Type Description Default
filename str

Output filename. Defaults to {resname}.itp.

None
Source code in mdinterface/core/specie.py
def write_gromacs_itp(self, filename=None):
    """
    Write a GROMACS include topology (.itp) file for this species.

    See :func:`~mdinterface.io.gromacswriter.write_gromacs_itp` for full
    documentation.

    Parameters
    ----------
    filename : str, optional
        Output filename. Defaults to ``{resname}.itp``.
    """
    logger.warning(
        "write_gromacs_itp is experimental -- verify output before production use."
    )
    from mdinterface.io.gromacswriter import write_gromacs_itp
    write_gromacs_itp(self, filename=filename)

refine_large_topology(Nmax=12, ending='H', offset=False, segment_size=200)

Assign OPLS-AA parameters to this species via LigParGen using a segment-and-junction strategy.

The molecule is split into segments of at most segment_size atoms along clean backbone bonds (avoiding rings, heteroatoms, and their neighbours). Each segment is capped and passed to LigParGen individually; a local snippet centred on each cut bond then corrects the junction region.

Requires nominal_charge to be set on self.atoms (integer formal charge per atom, same convention as :class:~mdinterface.core.polymer.Polymer).

Parameters:

Name Type Description Default
Nmax int

Neighbourhood radius (in bonds) for junction snippet creation.

12
ending str

Capping element for dangling bonds. Default "H".

'H'
offset bool

Redistribute residual charge error uniformly to match nominal_charge.sum().

False
segment_size int

Maximum atoms per segment. Default 200. Lower values force more splits and are useful for testing.

200
Source code in mdinterface/core/specie.py
def refine_large_topology(self, Nmax=12, ending="H", offset=False,
                          segment_size=200):
    """
    Assign OPLS-AA parameters to this species via LigParGen using a
    segment-and-junction strategy.

    The molecule is split into segments of at most *segment_size* atoms
    along clean backbone bonds (avoiding rings, heteroatoms, and their
    neighbours). Each segment is capped and passed to LigParGen
    individually; a local snippet centred on each cut bond then corrects
    the junction region.

    Requires ``nominal_charge`` to be set on ``self.atoms`` (integer
    formal charge per atom, same convention as
    :class:`~mdinterface.core.polymer.Polymer`).

    Parameters
    ----------
    Nmax : int
        Neighbourhood radius (in bonds) for junction snippet creation.
    ending : str
        Capping element for dangling bonds. Default ``"H"``.
    offset : bool
        Redistribute residual charge error uniformly to match
        ``nominal_charge.sum()``.
    segment_size : int
        Maximum atoms per segment. Default 200. Lower values force more
        splits and are useful for testing.
    """
    from mdinterface.externals.ligpargen import refine_large_specie_topology
    refine_large_specie_topology(self, Nmax=Nmax, ending=ending,
                                 offset=offset, segment_size=segment_size)

write_gro(filename=None)

Write a GROMACS structure (.gro) file for this species.

Parameters:

Name Type Description Default
filename str

Output filename. Defaults to {resname}.gro.

None
Source code in mdinterface/core/specie.py
def write_gro(self, filename=None):
    """
    Write a GROMACS structure (.gro) file for this species.

    Parameters
    ----------
    filename : str, optional
        Output filename. Defaults to ``{resname}.gro``.
    """
    if filename is None:
        filename = f"{self.resname}.gro"
    universe = self.to_universe()
    universe.atoms.write(filename)

relax_structure(optimizer='FIRE', fmax=0.05, steps=200, update_positions=True, trajectory=None, logfile=None, charge=None, spin=0, **kwargs)

Relax the Specie structure using ASE or UMA optimization.

Parameters:

Name Type Description Default
optimizer str

Optimizer for ASE method ('BFGS', 'LBFGS', 'FIRE')

'FIRE'
fmax float

Maximum force threshold for convergence (eV/Å)

0.05
steps int

Maximum number of optimization steps

200
update_positions bool

Whether to update the Specie's atomic positions after relaxation

True
trajectory str

Path to save optimization trajectory

None
logfile str

Path to save optimization log

None
charge int or None

Total charge of the system. Defaults to the stored tot_charge.

None
spin int

Total spin multiplicity (2S). Use >0 for open-shell systems.

0
**kwargs

Additional arguments passed to the relaxation function

{}

Returns:

Name Type Description
relaxed_atoms Atoms

The relaxed atomic structure

converged bool

Whether the optimization converged

Source code in mdinterface/core/specie.py
def relax_structure(self, optimizer='FIRE', fmax=0.05,
                   steps=200, update_positions=True, trajectory=None,
                   logfile=None, charge=None, spin=0, **kwargs):
    """
    Relax the Specie structure using ASE or UMA optimization.

    Parameters
    ----------
    optimizer : str, default 'FIRE'
        Optimizer for ASE method ('BFGS', 'LBFGS', 'FIRE')
    fmax : float, default 0.05
        Maximum force threshold for convergence (eV/Å)
    steps : int, default 200
        Maximum number of optimization steps
    update_positions : bool, default True
        Whether to update the Specie's atomic positions after relaxation
    trajectory : str, optional
        Path to save optimization trajectory
    logfile : str, optional
        Path to save optimization log
    charge : int or None, default None
        Total charge of the system. Defaults to the stored tot_charge.
    spin : int, default 0
        Total spin multiplicity (2S). Use >0 for open-shell systems.
    **kwargs
        Additional arguments passed to the relaxation function

    Returns
    -------
    relaxed_atoms : ase.Atoms
        The relaxed atomic structure
    converged : bool
        Whether the optimization converged
    """

    if charge is None:
        charge = self._tot_charge

    atoms_to_relax = self.atoms
    atoms_to_relax.info["charge"] = charge
    atoms_to_relax.info["spin"]   = spin

    relaxed_atoms = relax_structure(
        atoms_to_relax, optimizer=optimizer, fmax=fmax, steps=steps,
        trajectory=trajectory, logfile=logfile, **kwargs)

    # Update positions if requested
    if update_positions:
        self.update_positions(atoms=relaxed_atoms, prune_z=False)

    return

run_aimd(timestep=0.5, temperature_K=300, friction=0.001, steps=1000, update_positions=True, trajectory=None, logfile=None, **kwargs)

Run AIMD for the Specie structure using ASE and FAIRChem.

Parameters:

Name Type Description Default
timestep float

Time step for the simulation in fs.

0.1
temperature_K float

Target temperature for the Langevin dynamics in Kelvin.

300
friction float

Frictional damping coefficient in 1/fs.

0.001
steps int

Number of time steps to run the AIMD.

1000
update_positions bool

Whether to update the Specie's atomic positions after AIMD.

True
trajectory str

Path to save the MD trajectory.

None
logfile str

Path to save the MD log.

None
**kwargs

Additional arguments passed to the Langevin integrator.

{}

Returns:

Type Description
None
Source code in mdinterface/core/specie.py
def run_aimd(self, timestep=0.5, temperature_K=300, friction=0.001,
          steps=1000, update_positions=True, trajectory=None,
          logfile=None, **kwargs):
    """
    Run AIMD for the Specie structure using ASE and FAIRChem.

    Parameters
    ----------
    timestep : float, default 0.1
        Time step for the simulation in fs.
    temperature_K : float, default 300
        Target temperature for the Langevin dynamics in Kelvin.
    friction : float, default 0.001
        Frictional damping coefficient in 1/fs.
    steps : int, default 1000
        Number of time steps to run the AIMD.
    update_positions : bool, default True
        Whether to update the Specie's atomic positions after AIMD.
    trajectory : str, optional
        Path to save the MD trajectory.
    logfile : str, optional
        Path to save the MD log.
    **kwargs
        Additional arguments passed to the Langevin integrator.

    Returns
    -------
    None
    """

    atoms_to_simulate = self.atoms
    atoms_to_simulate.info["charge"] = self._tot_charge
    atoms_to_simulate.info["spin"] = 0  # Update this if needed

    # Call the previously defined function to perform AIMD
    run_aimd(atoms_to_simulate, timestep=timestep, temperature_K=temperature_K,
             friction=friction, steps=steps, trajectory=trajectory,
             logfile=logfile, **kwargs)

    # Update positions if requested
    if update_positions:
        self.update_positions(atoms=atoms_to_simulate, prune_z=False)

    return