API Reference

Peptacular: A ProForma peptide sequence parser and annotation library

Core

Peptacular provides both a functional and an object-oriented API for working with peptides and proteins. Everything can be accessed through the peptacular namespace (`import peptacular as pt`), but for clarity the API is broken down into sections below.

Sequence

Processing

peptacular.sequence.digestion.cleavage_sites(sequence, enzyme_regex, n_workers=None, chunksize=None, method=None)[source]

Returns the positions at which cleavage occurs in the input sequence, based on the provided enzyme regex.

Return type:

list[int] | list[list[int]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • enzyme_regex (str)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
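To illustrate what regex-defined cleavage positions look like, here is a minimal standard-library sketch on a plain, unmodified string. The helper name `regex_cleavage_sites` and the convention that a site is the index immediately after each match are assumptions made for this example; they are not taken from peptacular's implementation.

```python
import re


def regex_cleavage_sites(sequence: str, enzyme_regex: str) -> list[int]:
    """Hypothetical helper: return the index just after every regex match."""
    # Assumption: the pattern matches the residue(s) that are cleaved after.
    return [match.end() for match in re.finditer(enzyme_regex, sequence)]


# Trypsin-like rule: cleave after K or R (proline restriction omitted).
print(regex_cleavage_sites("PEPKTIDERAA", r"[KR]"))
```

With this convention, slicing the string at the returned indices yields the fragments on either side of each cut.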

peptacular.sequence.digestion.digest(sequence, enzyme_regex, missed_cleavages=0, semi=False, min_len=None, max_len=None, *, n_workers=None, chunksize=None, method=None)[source]

Returns digested sequences using a regular expression to define cleavage sites.

Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • enzyme_regex (str)

  • missed_cleavages (int)

  • semi (bool)

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
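The interaction between cleavage sites, `missed_cleavages`, and the length filters can be sketched on a plain string as follows. This is an illustrative approximation, not the library's code: `digest_plain` is a hypothetical helper, it takes precomputed site indices instead of a regex, and it returns simple `(start, end)` pairs, not the library's `Span` type.

```python
def digest_plain(sequence, sites, missed_cleavages=0, min_len=None, max_len=None):
    """Hypothetical helper: cut a plain string at the given site indices."""
    # Fragment boundaries: both termini plus every interior cleavage site.
    interior = {s for s in sites if 0 < s < len(sequence)}
    bounds = sorted({0, len(sequence)} | interior)
    fragments = []
    for i in range(len(bounds) - 1):
        # Skipping `missed` boundaries joins adjacent fragments together.
        for missed in range(missed_cleavages + 1):
            j = i + 1 + missed
            if j >= len(bounds):
                break
            start, end = bounds[i], bounds[j]
            length = end - start
            if min_len is not None and length < min_len:
                continue
            if max_len is not None and length > max_len:
                continue
            fragments.append((sequence[start:end], (start, end)))
    return fragments


for fragment, span in digest_plain("PEPKTIDERAA", [4, 9], missed_cleavages=1):
    print(fragment, span)
```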

peptacular.sequence.digestion.left_semi_digest(sequence, min_len=None, max_len=None, n_workers=None, chunksize=None, method=None)[source]
Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.digestion.nonspecific_digest(sequence, min_len=None, max_len=None, n_workers=None, chunksize=None, method=None)[source]

Builds all non-enzymatic sequences from the given input sequence.

Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
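For a plain string, "all non-enzymatic sequences" can be read as every contiguous substring within the length limits. The sketch below enumerates them under that reading. `all_subsequences` is a hypothetical name, and whether the library includes the full-length sequence in its output is not stated here, so treat that detail as an assumption.

```python
def all_subsequences(sequence, min_len=1, max_len=None):
    """Hypothetical helper: every contiguous substring with its (start, end)."""
    n = len(sequence)
    max_len = n if max_len is None else min(max_len, n)
    return [
        (sequence[start:start + length], (start, start + length))
        for start in range(n)
        for length in range(min_len, max_len + 1)
        if start + length <= n
    ]


print([sub for sub, _ in all_subsequences("PEP")])
```

The number of substrings grows quadratically with sequence length, which is why `min_len` and `max_len` are useful bounds in practice.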

peptacular.sequence.digestion.right_semi_digest(sequence, min_len=None, max_len=None, n_workers=None, chunksize=None, method=None)[source]
Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.digestion.semi_digest(sequence, min_len=None, max_len=None, n_workers=None, chunksize=None, method=None)[source]

Builds all semi-enzymatic sequences from the given input sequence. Equivalent to combining left and right semi-enzymatic sequences.

Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
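The idea of combining two one-sided fragment sets can be sketched on a plain string as below. The helper names are hypothetical, and the sketch deliberately avoids saying which set peptacular calls "left" and which "right", since this reference does not spell that out; excluding the full-length sequence is likewise an assumption.

```python
def prefix_fragments(sequence):
    """Fragments that keep the original start and drop residues from the end."""
    return [sequence[:end] for end in range(len(sequence) - 1, 0, -1)]


def suffix_fragments(sequence):
    """Fragments that keep the original end and drop residues from the start."""
    return [sequence[start:] for start in range(1, len(sequence))]


def semi_fragments(sequence):
    """Union of both one-sided fragment sets, in a fixed order."""
    return prefix_fragments(sequence) + suffix_fragments(sequence)


print(semi_fragments("PEPT"))
```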

peptacular.sequence.digestion.simple_cleavage_sites(sequence, cleave_on, restrict_before='', restrict_after='', cterminal=True, n_workers=None, chunksize=None, method=None)[source]

Get cleavage sites using simple amino acid rules.

Return type:

list[int] | list[list[int]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • cleave_on (str)

  • restrict_before (str)

  • restrict_after (str)

  • cterminal (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
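One plausible reading of the amino acid rules is sketched below for the C-terminal case only: cleave after any residue in `cleave_on` unless the next residue is listed in `restrict_after`. The helper `simple_sites` is hypothetical, and this interpretation of the parameters is an assumption, not behavior confirmed by this reference; `restrict_before` and `cterminal=False` are not modeled.

```python
def simple_sites(sequence, cleave_on, restrict_after=""):
    """Hypothetical helper: C-terminal cleavage sites with one restriction."""
    sites = []
    for i, residue in enumerate(sequence):
        if residue not in cleave_on:
            continue
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        # Assumed rule: a restricted residue right after the site blocks it.
        if following and following in restrict_after:
            continue
        sites.append(i + 1)
    return sites


# Trypsin-like rule: cleave after K or R, but not before P.
print(simple_sites("PEPKPTIDERAA", cleave_on="KR", restrict_after="P"))
```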

peptacular.sequence.digestion.simple_digest(sequence, cleave_on, restrict_before='', restrict_after='', cterminal=True, missed_cleavages=0, semi=False, min_len=None, max_len=None, *, n_workers=None, chunksize=None, method=None)[source]

Returns digested sequences using amino acid specifications with optional restrictions.

Return type:

list[tuple[str, Span]] | list[list[tuple[str, Span]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • cleave_on (str)

  • restrict_before (str)

  • restrict_after (str)

  • cterminal (bool)

  • missed_cleavages (int)

  • semi (bool)

  • min_len (int | None)

  • max_len (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.transformations.join(annotations, n_workers=None, chunksize=None, method=None)[source]

Joins a list of annotations into a single annotation.

# Single list of annotations
>>> join(['PEPTIDE', 'MODIFIED'])
'PEPTIDEMODIFIED'
Return type:

str | list[str]

Parameters:
  • annotations (Sequence[ProFormaAnnotation | str] | Sequence[Sequence[ProFormaAnnotation | str]])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.transformations.reverse(sequence, keep_nterm=0, keep_cterm=0, n_workers=None, chunksize=None, method=None)[source]

Reverses the sequence, while preserving the position of any modifications.

keep_nterm: Number of N-terminal residues to keep unchanged. Default is 0.
keep_cterm: Number of C-terminal residues to keep unchanged. Default is 0.

# Single sequence
>>> reverse('PEPTIDE')
'EDITPEP'

# Keep first 2 residues unchanged
>>> reverse('PEPTIDE', keep_nterm=2)
'PEEDITP'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • keep_nterm (int)

  • keep_cterm (int)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
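The effect of `keep_nterm` and `keep_cterm` can be reproduced on a plain string with slicing. This is a sketch of the documented behavior for unmodified sequences only; `reverse_plain` is a hypothetical helper and does not handle modifications.

```python
def reverse_plain(sequence, keep_nterm=0, keep_cterm=0):
    """Hypothetical helper: reverse only the residues between the kept ends."""
    end = len(sequence) - keep_cterm
    middle = sequence[keep_nterm:end]
    return sequence[:keep_nterm] + middle[::-1] + sequence[end:]


print(reverse_plain("PEPTIDE"))
print(reverse_plain("PEPTIDE", keep_nterm=2))
```

Keeping a terminal residue fixed is a common choice when building decoy sequences, because it preserves the enzymatic terminus of the peptide.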

peptacular.sequence.transformations.shift(sequence, n, keep_nterm=0, keep_cterm=0, n_workers=None, chunksize=None, method=None)[source]

Shifts the sequence to the left by a given number of positions, while preserving the position of any modifications.

keep_nterm: Number of N-terminal residues to keep unchanged. Default is 0.
keep_cterm: Number of C-terminal residues to keep unchanged. Default is 0.

# Single sequence
>>> shift('PEPTIDE', 2)
'PTIDEPE'

# Keep first 2 residues unchanged
>>> shift('PEPTIDE', 2, keep_nterm=2)
'PEIDEPT'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n (int)

  • keep_nterm (int)

  • keep_cterm (int)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
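A left shift is a rotation of the residues between the kept termini. The sketch below reproduces the examples above for plain strings; `shift_plain` is a hypothetical helper, and treating a negative `n` as a right shift is an assumption of this sketch.

```python
def shift_plain(sequence, n, keep_nterm=0, keep_cterm=0):
    """Hypothetical helper: rotate the middle of the string left by n."""
    end = len(sequence) - keep_cterm
    middle = sequence[keep_nterm:end]
    if not middle:
        return sequence
    k = n % len(middle)  # wrap shifts larger than the rotated region
    return sequence[:keep_nterm] + middle[k:] + middle[:k] + sequence[end:]


print(shift_plain("PEPTIDE", 2))
print(shift_plain("PEPTIDE", 2, keep_nterm=2))
```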

peptacular.sequence.transformations.shuffle(sequence, seed=None, keep_nterm=0, keep_cterm=0, n_workers=None, chunksize=None, method=None)[source]

Shuffles the sequence, while preserving the position of any modifications.

keep_nterm: Number of N-terminal residues to keep unchanged. Default is 0.
keep_cterm: Number of C-terminal residues to keep unchanged. Default is 0.

# Single sequence
>>> shuffle('PEPTIDE', seed=0)
'IPEPDTE'

# Keep first 2 residues unchanged
>>> shuffle('PEPTIDE', seed=0, keep_nterm=2)
'PEITPED'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • seed (int | None)

  • keep_nterm (int)

  • keep_cterm (int)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
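A seeded shuffle with fixed termini can be sketched with the standard library as follows. `shuffle_plain` is a hypothetical helper; its output for a given seed is not expected to match the library's output shown above, because the underlying random number usage may differ.

```python
import random


def shuffle_plain(sequence, seed=None, keep_nterm=0, keep_cterm=0):
    """Hypothetical helper: shuffle only the residues between the kept ends."""
    end = len(sequence) - keep_cterm
    middle = list(sequence[keep_nterm:end])
    # A private Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(middle)
    return sequence[:keep_nterm] + "".join(middle) + sequence[end:]


decoy = shuffle_plain("PEPTIDE", seed=0, keep_nterm=2)
print(decoy)
```

Passing the same seed always yields the same result, which makes shuffled decoys reproducible across runs.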

peptacular.sequence.transformations.sort(sequence, key=None, reverse=False, n_workers=None, chunksize=None, method=None)[source]

Sorts the residues of the input sequence using the provided key function. Terminal modifications are kept in place.

key: A function that serves as a key for the sort comparison. Default is None.
reverse: If True, the sorted sequence is reversed (descending order). Default is False.

# Single sequence
>>> sort('PEPTIDE')
'DEEIPPT'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • key (Callable[[str], Any] | None)

  • reverse (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.transformations.span_to_sequence(sequence, span, n_workers=None, chunksize=None, method=None)[source]

Extracts a subsequence from the input sequence based on the provided span.

The span is defined as a tuple of three integers: (start, end, step).

# Single sequence
>>> span_to_sequence('PEPTIDE', (0, 4, 0))
'PEPT'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • span (tuple[int, int, int] | Span)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.transformations.split(sequence, n_workers=None, chunksize=None, method=None)[source]

Splits sequence into a list of amino acids, preserving modifications.

# Single sequence
>>> split('PEPTIDE')
['P', 'E', 'P', 'T', 'I', 'D', 'E']
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
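Splitting while keeping a modification attached to its residue can be approximated with a regular expression. The sketch below only covers internal bracketed modifications without nested brackets; `split_plain` is a hypothetical helper, and the exact strings the library returns for modified residues are not shown in this reference, so the output format is an assumption.

```python
import re

# One residue letter followed by zero or more bracketed modifications.
RESIDUE_TOKEN = re.compile(r"[A-Z](?:\[[^\]]*\])*")


def split_plain(sequence):
    """Hypothetical helper: tokenize residues together with their mods."""
    return RESIDUE_TOKEN.findall(sequence)


print(split_plain("PEP[Phospho]TIDE"))
```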

peptacular.sequence.mod_builder.append_mods(sequence, mods)[source]
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (Mapping[ModType | Literal['nterm', 'cterm', 'isotope', 'static', 'labile', 'unknown', 'interval', 'internal', 'charge'] | int, Any])

peptacular.sequence.mod_builder.condense_static_mods(sequence)[source]

Condenses static modifications into internal modifications.

# Condenses static modifications to specified internal modifications
>>> condense_static_mods('<13C><[100]@P>PEPTIDE')
'<13C>P[100]EP[100]TIDE'

# If residue is already modified, the static modification will be appended
>>> condense_static_mods('<13C><[100]@P>P[10]EPTIDE')
'<13C>P[10][100]EP[100]TIDE'

# Example for unmodified sequences
>>> condense_static_mods('PEPTIDE')
'PEPTIDE'

# Example for N-Term static modifications
>>> condense_static_mods('<[Oxidation]@N-term>PEPTIDE')
'[Oxidation]-PEPTIDE'

# Example for C-Term static modifications
>>> condense_static_mods('<[Oxidation]@C-term>PEPTIDE')
'PEPTIDE-[Oxidation]'
Return type:

str

Parameters:

sequence (str | ProFormaAnnotation)

peptacular.sequence.mod_builder.condense_to_peptidoform(sequence)[source]

Condenses all modifications into a peptidoform representation for a sequence or list of sequences.

Return type:

str | list[str]

Parameters:

sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

peptacular.sequence.mod_builder.extend_mods(sequence, mods)[source]
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (Mapping[ModType | Literal['nterm', 'cterm', 'isotope', 'static', 'labile', 'unknown', 'interval', 'internal', 'charge'] | int, Any])

peptacular.sequence.mod_builder.filter_mods(sequence, mods=None)[source]

Keeps only the specified modifications in the sequence, removing all others.

# Keeps only internal modifications:
>>> filter_mods('PEP[phospho]TIDE', mods='internal')
'PEP[phospho]TIDE'
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (ModType | Iterable[ModType] | None)

peptacular.sequence.mod_builder.from_ms2_pip(sequence, static_mods=None, n_workers=None, chunksize=None, method=None)[source]

Convert MS2PIP format to ProForma string(s).

# Single sequence
>>> from_ms2_pip(('PEPTIDE', '3|Phospho'))
'PEP[Phospho]TIDE'

# Batch processing
>>> items = [('PEPTIDE', '3|Phospho'), ('PROTEIN', '4|Oxidation')]
>>> from_ms2_pip(items)
['PEP[Phospho]TIDE', 'PROT[Oxidation]EIN']

# Empty modifications
>>> from_ms2_pip(('PEPTIDE', ''))
'PEPTIDE'
Return type:

str | list[str]

Parameters:
  • sequence (tuple[str, str] | Sequence[tuple[str, str]])

  • static_mods (Mapping[str, float] | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
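The mapping shown in the examples above, where a `position|name` pair places a modification on the 1-based residue position, can be sketched for plain sequences as follows. `ms2pip_to_proforma` is a hypothetical helper; terminal positions and `static_mods` are not handled in this sketch.

```python
def ms2pip_to_proforma(sequence, modifications):
    """Hypothetical helper: place 'pos|name|pos|name' mods on a sequence."""
    if not modifications:
        return sequence
    fields = modifications.split("|")
    mods = {}
    # Fields alternate between a 1-based position and a modification name.
    for position, name in zip(fields[0::2], fields[1::2]):
        mods.setdefault(int(position), []).append(name)
    parts = []
    for index, residue in enumerate(sequence, start=1):
        parts.append(residue + "".join(f"[{m}]" for m in mods.get(index, [])))
    return "".join(parts)


print(ms2pip_to_proforma("PEPTIDE", "3|Phospho"))
```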

peptacular.sequence.mod_builder.get_mods(sequence, mods=None)[source]

Parses a sequence with modifications and returns a dictionary where keys represent the position/type of the modifications.

Return type:

dict[Union[ModType, Literal['nterm', 'cterm', 'isotope', 'static', 'labile', 'unknown', 'interval', 'internal', 'charge']], Any]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (ModType | Iterable[ModType] | Literal['nterm', 'cterm', 'isotope', 'static', 'labile', 'unknown', 'interval', 'internal', 'charge'] | None)

peptacular.sequence.mod_builder.modify(sequence, *, nterm_static=None, cterm_static=None, internal_static=None, labile_static=None, nterm_variable=None, cterm_variable=None, internal_variable=None, labile_variable=None, max_variable_mods=2, use_regex=False, n_workers=None, chunksize=None, method=None)[source]

Build modified sequences by applying static and variable modifications to a sequence or list of sequences.

Modifications are specified as dictionaries where keys represent the residue or terminus type and values are iterables of modifications.

# Single sequence
>>> results = modify('PEPTIDE', internal_variable={'P': [79.966]}, max_variable_mods=1)
>>> len(results) > 0
True

# Multiple sequences (automatic parallel processing)
>>> sequences = ['PEPTIDE', 'PROTEIN', 'SEQUENCE']
>>> results = modify(sequences, internal_variable={'P': [79.966]}, max_variable_mods=1)
>>> len(results)
3
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (ProFormaAnnotation | str | Sequence[ProFormaAnnotation | str])

  • nterm_static (Mapping[str | None, Iterable[Any]] | None)

  • cterm_static (Mapping[str | None, Iterable[Any]] | None)

  • internal_static (Mapping[str | None, Iterable[Any]] | None)

  • labile_static (Mapping[str | None, Iterable[Any]] | None)

  • nterm_variable (Mapping[str | None, Iterable[Any]] | None)

  • cterm_variable (Mapping[str | None, Iterable[Any]] | None)

  • internal_variable (Mapping[str | None, Iterable[Any]] | None)

  • labile_variable (Mapping[str | None, Iterable[Any]] | None)

  • max_variable_mods (int)

  • use_regex (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
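How `max_variable_mods` bounds the number of generated forms can be illustrated with `itertools.combinations` on a plain sequence. This is a simplified sketch, not the library's algorithm: `variable_forms` is a hypothetical helper, it covers only internal variable modifications, it allows at most one modification per residue, and including the unmodified form in the output is an assumption.

```python
from itertools import combinations


def variable_forms(sequence, mods_by_residue, max_variable_mods=2):
    """Hypothetical helper: enumerate forms with up to N variable mods."""
    # Every (position, modification) placement that the rules allow.
    candidates = [
        (i, mod)
        for i, residue in enumerate(sequence)
        for mod in mods_by_residue.get(residue, [])
    ]
    forms = []
    for k in range(max_variable_mods + 1):
        for combo in combinations(candidates, k):
            positions = [i for i, _ in combo]
            if len(set(positions)) != len(positions):
                continue  # this sketch allows one mod per residue
            placed = dict(combo)
            forms.append("".join(
                residue + (f"[{placed[i]}]" if i in placed else "")
                for i, residue in enumerate(sequence)
            ))
    return forms


print(variable_forms("PEPTIDE", {"P": [79.966]}, max_variable_mods=1))
```

The number of forms grows combinatorially with the number of candidate sites, so a small `max_variable_mods` keeps the output manageable.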

peptacular.sequence.mod_builder.pop_mods(sequence, mods=None)[source]

Removes all modifications from the given sequence, returning the unmodified sequence and a dictionary of the removed modifications.

# Combines the functionality of strip_mods and get_mods
>>> seq, mod_dict = pop_mods('PEP[phospho]TIDE')
>>> seq
'PEPTIDE'
Return type:

tuple[str, dict[ModType, Any]]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (ModType | Iterable[ModType] | None)

peptacular.sequence.mod_builder.remove_mods(sequence, mods=None)[source]
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (ModType | Iterable[ModType] | None)

peptacular.sequence.mod_builder.set_mods(sequence, mods)[source]
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • mods (Mapping[ModType | Literal['nterm', 'cterm', 'isotope', 'static', 'labile', 'unknown', 'interval', 'internal', 'charge'] | int, Any] | None)

peptacular.sequence.mod_builder.strip_mods(sequence, mods=None)[source]

Strips all modifications from the given sequence or list of sequences, returning the unmodified sequence(s).

# Removes internal modifications:
>>> strip_mods('PEP[phospho]TIDE')
'PEPTIDE'
Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • mods (ModType | Iterable[ModType] | None)

peptacular.sequence.mod_builder.to_ms2_pip(sequence, n_workers=None, chunksize=None, method=None)[source]

Convert a peptide sequence to MS2PIP format by condensing modifications.

# Single sequence
>>> to_ms2_pip('PEP[Phospho]TIDE')
('PEPTIDE', '3|Phospho')

# Batch processing
>>> sequences = ['PEP[Phospho]TIDE', 'PROT[Oxidation]EIN']
>>> to_ms2_pip(sequences)
[('PEPTIDE', '3|Phospho'), ('PROTEIN', '4|Oxidation')]
Return type:

tuple[str, str] | list[tuple[str, str]]

Parameters:
  • sequence (ProFormaAnnotation | str | Sequence[ProFormaAnnotation | str])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.subseqs.coverage(sequence, subsequences, accumulate=False, ignore_mods=False, ignore_ambiguity=False)[source]

Calculate the sequence coverage given a list of subsequences.

The coverage is represented as a binary list where each position in the protein sequence is marked as 1 if it is covered by at least one peptide and 0 otherwise.

>>> coverage("PEPTIDE", ["PEP"])
[1, 1, 1, 0, 0, 0, 0]

>>> coverage("PEPTIDE", ["PEP", "EPT"])
[1, 1, 1, 1, 0, 0, 0]

# If accumulate is True, overlapping indices will be accumulated
>>> coverage("PEPTIDE", ["PEP", "EPT"], accumulate=True)
[1, 2, 2, 1, 0, 0, 0]

# By default ambiguity does not add to coverage
>>> coverage("PEPTIDE", ["P(?EP)"])
[1, 0, 0, 0, 0, 0, 0]
Return type:

list[int]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • subsequences (Iterable[str | ProFormaAnnotation])

  • accumulate (bool)

  • ignore_mods (bool)

  • ignore_ambiguity (bool)
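For plain, unmodified strings the coverage vector can be computed with repeated `str.find` calls. The sketch below reproduces the first three examples above; `coverage_plain` is a hypothetical helper and does not model modifications or ambiguity.

```python
def coverage_plain(sequence, subsequences, accumulate=False):
    """Hypothetical helper: per-position coverage for plain strings."""
    covered = [0] * len(sequence)
    for sub in subsequences:
        if not sub:
            continue  # an empty string would match at every position
        start = sequence.find(sub)
        while start != -1:
            for i in range(start, start + len(sub)):
                covered[i] = covered[i] + 1 if accumulate else 1
            # Step by one so overlapping occurrences are also found.
            start = sequence.find(sub, start + 1)
    return covered


print(coverage_plain("PEPTIDE", ["PEP", "EPT"], accumulate=True))
```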

peptacular.sequence.subseqs.find_subsequence_indices(sequence, subsequence, ignore_mods=False)[source]

Retrieves all starting indexes of a given subsequence within a sequence.

# Find the starting indexes of a subsequence
>>> find_subsequence_indices("PEPTIDE", "PEP")
[0]
>>> find_subsequence_indices("PEPTIDE", "EPT")
[1]
>>> find_subsequence_indices("PEPTIDE", "E")
[1, 6]

# By default the function will not ignore modifications
>>> find_subsequence_indices("[Acetyl]-PEPTIDE", "PEP")
[]
>>> find_subsequence_indices("<13C>PEPTIDE", "PEP")
[]
>>> find_subsequence_indices("<13C>PEP[1][Phospho]TIDE", "<13C>PEP[Phospho][1]")
[0]
>>> find_subsequence_indices("[Acetyl]-PEPTIDE", "[Acetyl]-PEP")
[0]
>>> find_subsequence_indices("PEPTIDE", "[Acetyl]-PEP")
[]
>>> find_subsequence_indices("PEPTIDE[1.0]", "IDE")
[]
>>> find_subsequence_indices("[Acetyl]-PEPTIDE-[Amide]", "[Acetyl]-PEPTIDE-[Amide]")
[0]

# If ignore_mods is set to True, the function will ignore modifications
>>> find_subsequence_indices("[Acetyl]-PEPTIDE", "PEP", ignore_mods=True)
[0]
>>> find_subsequence_indices("PEPTIDE", "[Acetyl]-PEP", ignore_mods=True)
[0]
Return type:

list[int]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • subsequence (str | ProFormaAnnotation)

  • ignore_mods (bool)
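For plain strings, finding every starting index (including overlapping matches) is a short loop around `str.find`. `find_all` is a hypothetical helper for illustration; it compares raw characters and therefore knows nothing about ProForma modifications.

```python
def find_all(sequence, subsequence):
    """Hypothetical helper: all start indices of a substring, with overlaps."""
    indices = []
    start = sequence.find(subsequence)
    while start != -1:
        indices.append(start)
        # Advance by one position so overlapping matches are not skipped.
        start = sequence.find(subsequence, start + 1)
    return indices


print(find_all("PEPTIDE", "E"))
```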

peptacular.sequence.subseqs.is_subsequence(subsequence, sequence, order=True, ignore_mods=False)[source]

Checks if the input subsequence is a subsequence of the input sequence. If order is True, the subsequence must be in the same order as in the sequence. If order is False, the subsequence can be in any order.

>>> is_subsequence('PEP', 'PEPTIDE')
True

>>> is_subsequence('PET', 'PEPTIDE', order=False)
True

>>> is_subsequence('PET', 'PEPTIDE', order=True)
False

>>> is_subsequence('<13C>PEP', '<13C>PEPTIDE', order=True)
True

>>> is_subsequence('<13C>PEP[1.0]', '<13C>PEP[1.0]TIDE', order=True)
True

>>> is_subsequence('<13C>PEP', '<13C>PEP[1.0]TIDE', order=True)
False
Return type:

bool

Parameters:
  • subsequence (str | ProFormaAnnotation)

  • sequence (str | ProFormaAnnotation)

  • order (bool)

  • ignore_mods (bool)
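The examples above suggest that `order=True` behaves like a contiguous substring test, while `order=False` only compares residue counts. A plain-string sketch of that reading follows; `is_subsequence_plain` is a hypothetical helper, and this interpretation is inferred from the examples, not from the library source.

```python
from collections import Counter


def is_subsequence_plain(subsequence, sequence, order=True):
    """Hypothetical helper: substring test or multiset containment."""
    if order:
        return subsequence in sequence
    needed, available = Counter(subsequence), Counter(sequence)
    # Every residue must occur at least as often in the full sequence.
    return all(available[residue] >= n for residue, n in needed.items())


print(is_subsequence_plain("PET", "PEPTIDE", order=False))
print(is_subsequence_plain("PET", "PEPTIDE", order=True))
```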

peptacular.sequence.subseqs.modification_coverage(sequence, subsequences, accumulate=False)[source]

Calculate the modification coverage given a list of subsequences.

This function identifies which modifications in the main sequence are covered by subsequences. It returns a dictionary where each key is a modification position/type (matching the format from get_mods) and each value is the number of subsequences that cover that modification.

>>> modification_coverage("PEPTIDE[Phospho]", ["TIDE"])
{6: 0}

>>> modification_coverage("PEPTIDE[Phospho]", ["TIDE[Phospho]"])
{6: 1}

>>> modification_coverage("PEP[Phospho]TIDE[Methyl]", ["PEP[Phospho]", "TIDE[Methyl]"])
{2: 1, 6: 1}

>>> modification_coverage("PEP[Phospho]TIDE[Methyl]", ["PEP[Phospho]", "TIDE[Methyl]", "PEP[Phospho]"], accumulate=True)
{2: 2, 6: 1}
Return type:

dict[int, int]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • subsequences (list[str | ProFormaAnnotation])

  • accumulate (bool)

peptacular.sequence.subseqs.percent_coverage(sequence, subsequences, ignore_mods=False, accumulate=False, ignore_ambiguity=False)[source]

Calculates the coverage of the sequence by a list of subsequences, expressed as a fraction of the sequence length.

If accumulate is True, overlapping indices are accumulated, so the result can exceed 1.0. If ignore_mods is True, modifications are ignored when calculating coverage (only the amino acid sequence is considered). If ignore_ambiguity is True, ambiguous regions will not be counted towards coverage.

>>> round(percent_coverage("PEPTIDE", ["PEP"]), 3)
0.429

# ambiguity does not add to coverage by default
>>> round(percent_coverage("PEPTIDE", ["P(?EP)"]), 3)
0.143

>>> round(percent_coverage("PEPTIDE", ["PEP", "EPT"]), 3)
0.571

>>> round(percent_coverage("PEPTIDE", ["PEPTIDE", "PEPTIDE"], accumulate=True), 3)
2.0
Return type:

float

Parameters:
  • sequence (str | ProFormaAnnotation)

  • subsequences (Iterable[str | ProFormaAnnotation])

  • ignore_mods (bool)

  • accumulate (bool)

  • ignore_ambiguity (bool)
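The values in the examples above are consistent with dividing the summed coverage vector by the sequence length (for instance 3/7 ≈ 0.429, and 14/7 = 2.0 when accumulating two full-length matches). A minimal sketch of that arithmetic, using a hypothetical helper `fraction_covered` that takes an already computed coverage vector:

```python
def fraction_covered(coverage_vector):
    """Hypothetical helper: summed coverage divided by sequence length."""
    if not coverage_vector:
        return 0.0
    return sum(coverage_vector) / len(coverage_vector)


print(round(fraction_covered([1, 1, 1, 0, 0, 0, 0]), 3))
```

Note that the result is a fraction, not a value scaled to 100.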

peptacular.sequence.basic.annotate_ambiguity(sequence, forward_coverage, reverse_coverage, mass_shift=None, add_mods_to_intervals=False, sort_mods=True, condense_to_xnotation=False)[source]

Identifies regions in the sequence where fragment ion coverage is insufficient and marks them as ambiguous using ProForma parenthesis notation. If a mass shift is provided, it is added at the appropriate location.

forward_coverage: Binary list indicating which positions have forward ion coverage (1) or not (0).
reverse_coverage: Binary list indicating which positions have reverse ion coverage (1) or not (0).
mass_shift: An optional mass shift to be added to the sequence at the appropriate position.
add_mods_to_intervals: Whether to add modifications to interval annotations.
sort_mods: Whether to sort modifications.
condense_to_xnotation: Whether to condense ambiguity to X notation.

# Add ambiguity intervals based on fragment ion coverage
>>> annotate_ambiguity('PEPTIDE', [0,1,1,1,0,0,0], [0,0,0,0,0,1,0])
'(?PE)PTI(?DE)'

# With a phosphorylation mass shift (note the '+' sign)
>>> annotate_ambiguity('PEPTIDE', [1,1,1,0,0,0,0], [0,0,0,0,1,1,1], 79.966)
'PEPT[+79.966]IDE'

# Handling existing modifications
>>> annotate_ambiguity('P[+10]EPTIDE', [1,1,1,0,0,0,0], [0,0,0,0,0,1,1])
'P[+10]EP(?TI)DE'

# When mass shift can't be localized to a specific residue
>>> annotate_ambiguity('PEPTIDE', [0,1,1,0,0,0,0], [0,0,0,0,0,1,0], 120)
'(?PE)P(?TI)[+120](?DE)'

# When mass shift is completely unlocalized, it becomes a labile modification
>>> annotate_ambiguity('PEPTIDE', [0,1,1,1,1,0,0], [0,0,1,1,1,1,0], 120)
'{+120}(?PE)PTI(?DE)'

# Complex example with multiple intervals
>>> for_ions = list(map(int, '00011101001000000000000000000000000000'))
>>> rev_ions = list(map(int, '00000000000110000000101111111111010100'))
>>> annotate_ambiguity('SSGSIASSYVQWYQQRPGSAPTTVIYEDDERPSGVPDR', for_ions, rev_ions, 120)
'(?SSGS)IA(?SS)(?YVQ)W[+120](?YQQRPGSA)(?PT)TVIYEDDER(?PS)(?GV)(?PDR)'
Return type:

str

Parameters:
  • sequence (str | ProFormaAnnotation)

  • forward_coverage (list[int])

  • reverse_coverage (list[int])

  • mass_shift (Any | None)

  • add_mods_to_intervals (bool)

  • sort_mods (bool)

  • condense_to_xnotation (bool)

peptacular.sequence.basic.count_residues(sequence, include_mods=True, n_workers=None, chunksize=None, method=None)[source]

Counts the occurrences of each amino acid in the input sequence.

Return type:

dict[str, int] | list[dict[str, int]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • include_mods (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

include_mods: If True, modified residues are counted as distinct entities. If False, only unmodified residues are counted.

# Single sequence
>>> count_residues('PEPTIDE')
{'P': 2, 'E': 2, 'T': 1, 'I': 1, 'D': 1}

# Single sequence with modifications
>>> count_residues('PEP[Oxidation]TIDE-[+30]')
{'P': 1, 'E': 1, 'P[Oxidation]': 1, 'T': 1, 'I': 1, 'D': 1, 'E-[+30]': 1}
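For an unmodified sequence, the same counts can be obtained with `collections.Counter`, which is a convenient way to sanity-check results. `count_plain` is a hypothetical helper and does not handle modified residues.

```python
from collections import Counter


def count_plain(sequence):
    """Hypothetical helper: residue counts for an unmodified string."""
    return dict(Counter(sequence))


print(count_plain("PEPTIDE"))
```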

peptacular.sequence.basic.generate_random(count=None, min_length=6, max_length=20, mod_probability=0.05, include_internal_mods=True, include_nterm_mods=True, include_cterm_mods=True, include_labile_mods=True, include_unknown_mods=True, include_isotopic_mods=True, include_static_mods=True, include_intervals=True, include_charge=True, require_composition=True, n_workers=None, chunksize=None, method=None)[source]

Generate random ProForma annotation(s) with configurable features.

Parameters:
  • count (int | None) – Number of random sequences to generate. If None, generates a single sequence.

  • min_length (int) – Minimum sequence length

  • max_length (int) – Maximum sequence length

  • mod_probability (float) – Probability of adding modifications (0.0 to 1.0)

  • include_internal_mods (bool) – Whether to generate internal modifications

  • include_nterm_mods (bool) – Whether to generate N-terminal modifications

  • include_cterm_mods (bool) – Whether to generate C-terminal modifications

  • include_labile_mods (bool) – Whether to generate labile modifications

  • include_unknown_mods (bool) – Whether to generate unknown position modifications

  • include_isotopic_mods (bool) – Whether to generate isotopic modifications

  • include_static_mods (bool) – Whether to generate static modifications

  • include_intervals (bool) – Whether to generate intervals

  • include_charge (bool) – Whether to generate charge state or adduct

  • require_composition (bool) – If True, only modifications with composition are allowed (no mass-only)

  • n_workers (int | None) – Number of parallel workers (only used when count > 1)

  • chunksize (int | None) – Size of chunks for parallel processing

  • method (Union[parallelMethod, Literal['process', 'thread', 'sequential'], None]) – Parallel processing method (‘process’, ‘thread’, or ‘sequential’)

Return type:

ProFormaAnnotation | list[ProFormaAnnotation]

Returns:

A single ProFormaAnnotation if count is None, otherwise a list of ProFormaAnnotations

# Generate a single random sequence
>>> seq = generate_random()
>>> isinstance(seq, ProFormaAnnotation)
True

# Generate multiple random sequences
>>> seqs = generate_random(count=10)
>>> len(seqs)
10

# Generate without modifications
>>> seq = generate_random(mod_probability=0.0)

# Generate with only internal modifications
>>> seq = generate_random(
...     include_nterm_mods=False,
...     include_cterm_mods=False,
...     include_labile_mods=False
... )

peptacular.sequence.basic.is_ambiguous(sequence, n_workers=None, chunksize=None, method=None)[source]

Check if the sequence contains ambiguous amino acids.

Return type:

bool | list[bool]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.basic.is_modified(sequence, n_workers=None, chunksize=None, method=None)[source]

Check if the sequence contains any modifications.

Return type:

bool | list[bool]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.basic.parse(s, validate=False, n_workers=None, chunksize=None, method=None, reuse_pool=True)[source]

Parse a ProForma string or list of strings into ProFormaAnnotation object(s).

Return type:

ProFormaAnnotation | list[ProFormaAnnotation]

Parameters:
  • s (str | Sequence[str])

  • validate (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

  • reuse_pool (bool)

peptacular.sequence.basic.parse_chimeric(s, validate=False, n_workers=None, chunksize=None, method=None, reuse_pool=True)[source]

Parse a chimeric ProForma string or list of strings into lists of ProFormaAnnotation objects.

Return type:

list[ProFormaAnnotation] | list[list[ProFormaAnnotation]]

Parameters:
  • s (str | Sequence[str])

  • validate (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

  • reuse_pool (bool)

peptacular.sequence.basic.percent_residues(sequence, include_mods=True, n_workers=None, chunksize=None, method=None)[source]

Calculates the percentage of each amino acid in the input sequence.

Return type:

dict[str, float] | list[dict[str, float]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • include_mods (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

include_mods: If True, modified residues are counted as distinct entities. If False, only unmodified residues are counted.

# Single sequence
>>> d = percent_residues('PEPTIDE')
>>> dict(map(lambda item: (item[0], round(item[1], 2)), d.items()))
{'P': 28.57, 'E': 28.57, 'T': 14.29, 'I': 14.29, 'D': 14.29}

# Single sequence with modification
>>> d = percent_residues('PEP[Oxidation]TIDE-[+30]')
>>> dict(map(lambda item: (item[0], round(item[1], 2)), d.items()))
{'P': 14.29, 'E': 14.29, 'P[Oxidation]': 14.29, 'T': 14.29, 'I': 14.29, 'D': 14.29, 'E-[+30]': 14.29}
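The percentages above are each residue's count divided by the total number of residues, scaled to 100. A plain-string sketch of that calculation, using a hypothetical helper `percent_plain` that ignores modifications:

```python
from collections import Counter


def percent_plain(sequence):
    """Hypothetical helper: residue percentages for an unmodified string."""
    counts = Counter(sequence)
    total = sum(counts.values())
    return {residue: 100.0 * n / total for residue, n in counts.items()}


rounded = {aa: round(p, 2) for aa, p in percent_plain("PEPTIDE").items()}
print(rounded)
```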

peptacular.sequence.basic.sequence_length(sequence, n_workers=None, chunksize=None, method=None)[source]

Compute the length of the peptide sequence based on the unmodified sequence.

Return type:

int | list[int]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.basic.serialize(sequence, n_workers=None, chunksize=None, method=None)[source]

Serialize a peptide sequence or list of sequences to ProForma string format.

Return type:

str | list[str]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.basic.serialize_chimeric(sequence, n_workers=None, chunksize=None, method=None)[source]

Serialize a chimeric peptide sequence or list of sequences to ProForma string format.

Return type:

str | list[str]

Parameters:
  • sequence (Sequence[ProFormaAnnotation | str] | Sequence[Sequence[ProFormaAnnotation | str]])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.basic.validate(sequence, n_workers=None, chunksize=None, method=None)[source]

Checks if the input sequence is a valid ProForma sequence.

Return type:

bool | list[bool]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.combinatoric.combinations(sequence, size, n_workers=None, chunksize=None, method=None)[source]

Generates all combinations of the input sequence of a given size. Terminal modifications are kept in place.

size: The size of the combinations to be generated. If None, uses the length of the sequence.

>>> combinations('PET', 2)
['PE', 'PT', 'ET']

>>> combinations('[3]-PET-[1]', 2)
['[3]-PE-[1]', '[3]-PT-[1]', '[3]-ET-[1]']

>>> combinations('PE[3.14]T', 2)
['PE[3.14]', 'PT', 'E[3.14]T']

>>> combinations('<13C>PET', 2)
['<13C>PE', '<13C>PT', '<13C>ET']
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • size (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.combinatoric.combinations_with_replacement(sequence, size, n_workers=None, chunksize=None, method=None)[source]

Generates all combinations with replacement of the input sequence of a given size. Terminal modifications are kept in place.

size: The size of the combinations to be generated. If None, uses the length of the sequence.

>>> combinations_with_replacement('PET', 2)
['PP', 'PE', 'PT', 'EE', 'ET', 'TT']

>>> combinations_with_replacement('[3]-PET-[1]', 2)
['[3]-PP-[1]', '[3]-PE-[1]', '[3]-PT-[1]', '[3]-EE-[1]', '[3]-ET-[1]', '[3]-TT-[1]']

>>> combinations_with_replacement('PE[3.14]T', 2)
['PP', 'PE[3.14]', 'PT', 'E[3.14]E[3.14]', 'E[3.14]T', 'TT']

>>> combinations_with_replacement('<13C>PET', 2)
['<13C>PP', '<13C>PE', '<13C>PT', '<13C>EE', '<13C>ET', '<13C>TT']
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • size (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.combinatoric.permutations(sequence, size=None, n_workers=None, chunksize=None, method=None)[source]

Generates all permutations of the input sequence. Terminal modifications are kept in place.

size: The size of the permutations. If None, uses the length of the sequence.

>>> permutations('PET')
['PET', 'PTE', 'EPT', 'ETP', 'TPE', 'TEP']

>>> permutations('[3]-PET-[1]')
['[3]-PET-[1]', '[3]-PTE-[1]', '[3]-EPT-[1]', '[3]-ETP-[1]', '[3]-TPE-[1]', '[3]-TEP-[1]']

>>> permutations('PE[3.14]T')
['PE[3.14]T', 'PTE[3.14]', 'E[3.14]PT', 'E[3.14]TP', 'TPE[3.14]', 'TE[3.14]P']

>>> permutations('<13C>PET')
['<13C>PET', '<13C>PTE', '<13C>EPT', '<13C>ETP', '<13C>TPE', '<13C>TEP']
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • size (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.combinatoric.product(sequence, repeat, n_workers=None, chunksize=None, method=None)[source]

Generates all Cartesian products of the input sequence of a given size. Terminal modifications are kept in place.

repeat: The size of the products to be generated. If None, uses the length of the sequence.

>>> product('PET', 2)
['PP', 'PE', 'PT', 'EP', 'EE', 'ET', 'TP', 'TE', 'TT']

>>> product('[3]-PET-[1]', 2)[:5]
['[3]-PP-[1]', '[3]-PE-[1]', '[3]-PT-[1]', '[3]-EP-[1]', '[3]-EE-[1]']

>>> product('PE[3.14]T', 2)
['PP', 'PE[3.14]', 'PT', 'E[3.14]P', 'E[3.14]E[3.14]', 'E[3.14]T', 'TP', 'TE[3.14]', 'TT']

>>> product('<13C>PET', 2)
['<13C>PP', '<13C>PE', '<13C>PT', '<13C>EP', '<13C>EE', '<13C>ET', '<13C>TP', '<13C>TE', '<13C>TT']
Return type:

list[str] | list[list[str]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • repeat (int | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
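
For unmodified residues, the four combinatoric functions above return results in the same order as their `itertools` namesakes. The sketch below reproduces the 'PET' examples with the standard library; whether peptacular uses `itertools` internally is not stated here, and the `join_all` helper is hypothetical.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)


def join_all(tuples):
    """Join each tuple of residues back into a sequence string."""
    return ["".join(t) for t in tuples]


residues = "PET"

combos = join_all(combinations(residues, 2))
combos_wr = join_all(combinations_with_replacement(residues, 2))
perms = join_all(permutations(residues))
prods = join_all(product(residues, repeat=2))

# Terminal modifications stay in place: only the residues are rearranged,
# then the same termini are wrapped around every result.
with_termini = [f"[3]-{seq}-[1]" for seq in combos]

print(combos)        # ['PE', 'PT', 'ET']
print(with_termini)  # ['[3]-PE-[1]', '[3]-PT-[1]', '[3]-ET-[1]']
```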

peptacular.sequence.converters.convert_casanovo_sequence(sequence, n_workers=None, chunksize=None, method=None)[source]

Converts a Casanovo sequence with modifications to a ProForma 2.0-compatible sequence.

Returns:

ProForma 2.0-compatible sequence or list of sequences.

Return type:

str | list[str]

Parameters:
  • sequence (str | list[str])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

>>> convert_casanovo_sequence('+43.006P+100EPTIDE')
'[+43.006]-P[+100]EPTIDE'
peptacular.sequence.converters.convert_diann_sequence(sequence, n_workers=None, chunksize=None, method=None)[source]

Converts a DIANN-like sequence to a ProForma 2.0-compatible sequence.

>>> convert_diann_sequence('_[Acytel]YMGTLRGC[Carbamidomethyl]LLRLYHD[1.0]_[Methyl]')
'[Acytel]-YMGTLRGC[Carbamidomethyl]LLRLYHD[1.0]-[Methyl]'
Return type:

str | list[str]

Parameters:
  • sequence (str | list[str])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.converters.convert_ip2_sequence(sequence, n_workers=None, chunksize=None, method=None)[source]

Converts an IP2-like sequence to a ProForma 2.0-compatible sequence.

>>> convert_ip2_sequence('K.PEP(phospho)TIDE.K')
'PEP[phospho]TIDE'
Return type:

str | list[str]

Parameters:
  • sequence (str | list[str])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
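
The IP2 example above implies two transformations: the flanking residues around the dots are dropped, and parenthesized modifications become bracketed. A minimal regex sketch of that idea follows. The function name `ip2_to_proforma_sketch` is hypothetical, and the real converter may handle more cases (for example terminal modifications).

```python
import re

# One flanking residue (or '-') on each side, separated from the peptide by dots.
_FLANKED = re.compile(r"[A-Z\-]\.(.+)\.[A-Z\-]")
# A parenthesized modification without nested parentheses.
_PAREN_MOD = re.compile(r"\(([^()]*)\)")


def ip2_to_proforma_sketch(sequence: str) -> str:
    """Convert 'K.PEP(phospho)TIDE.K' style strings to bracketed modifications."""
    match = _FLANKED.fullmatch(sequence)
    core = match.group(1) if match else sequence
    return _PAREN_MOD.sub(r"[\1]", core)


print(ip2_to_proforma_sketch("K.PEP(phospho)TIDE.K"))  # PEP[phospho]TIDE
```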

Mass/Comp/Isotope/Fragment

peptacular.sequence.isotope.estimate_isotopic_distribution(annotations, ion_type=IonType.PRECURSOR, charge=None, isotopes=None, deltas=None, max_isotopes=10, min_abundance_threshold=0.001, distribution_resolution=5, use_neutron_count=False, conv_min_abundance_threshold=1e-14, n_workers=None, chunksize=None, method=None)[source]
Return type:

list[IsotopicData] | list[list[IsotopicData]]

Parameters:
  • annotations (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • max_isotopes (int | None)

  • min_abundance_threshold (float)

  • distribution_resolution (int | None)

  • use_neutron_count (bool)

  • conv_min_abundance_threshold (float)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.isotope.isotopic_distribution(annotations, ion_type=IonType.PRECURSOR, charge=None, isotopes=None, deltas=None, max_isotopes=10, min_abundance_threshold=0.001, distribution_resolution=5, use_neutron_count=False, conv_min_abundance_threshold=1e-14, n_workers=None, chunksize=None, method=None)[source]
Return type:

list[IsotopicData] | list[list[IsotopicData]]

Parameters:
  • annotations (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • max_isotopes (int | None)

  • min_abundance_threshold (float)

  • distribution_resolution (int | None)

  • use_neutron_count (bool)

  • conv_min_abundance_threshold (float)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.mass_funcs.comp(sequence, ion_type=IonType.PRECURSOR, charge=None, isotopes=None, deltas=None, n_workers=None, chunksize=None, method=None)[source]

Calculates the elemental composition of a peptide sequence, including modifications.

Return type:

Counter[ElementInfo] | list[Counter[ElementInfo]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.mass_funcs.mass(sequence, ion_type=IonType.PRECURSOR, charge=None, monoisotopic=True, isotopes=None, deltas=None, calculate_with_composition=False, n_workers=None, chunksize=None, method=None)[source]

Calculate the mass of an amino acid sequence.

Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • monoisotopic (bool)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • calculate_with_composition (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.mass_funcs.mz(sequence, ion_type=IonType.PRECURSOR, charge=None, monoisotopic=True, isotopes=None, deltas=None, calculate_with_composition=False, n_workers=None, chunksize=None, method=None)[source]

Calculate the m/z (mass-to-charge ratio) of an amino acid sequence.

Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • monoisotopic (bool)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • calculate_with_composition (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
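
The relationship between `mass` and `mz` can be illustrated for the simplest case: an unmodified precursor charged by protonation. The sketch below uses standard monoisotopic residue masses and assumes m/z = (M + z × proton mass) / z. It is not peptacular's implementation, which also supports other ion types, charge carriers, isotopes, and deltas; the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # assumed charge carrier


def neutral_mass_sketch(sequence: str) -> float:
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz_sketch(sequence: str, charge: int) -> float:
    """m/z of the protonated precursor at a positive integer charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer in this sketch")
    return (neutral_mass_sketch(sequence) + charge * PROTON) / charge


print(round(neutral_mass_sketch("PEPTIDE"), 4))
print(round(mz_sketch("PEPTIDE", 2), 4))
```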

peptacular.sequence.fragmentation.fast_fragment(sequence, ion_types=(IonType.B, IonType.Y), charges=None, monoisotopic=True, n_workers=None, chunksize=None, method=None)[source]

Compute fragment ion m/z values for a sequence or list of sequences.

Uses a fast prefix/suffix-sum approach. Returns a dict mapping (IonType, charge) to a list of m/z values of length equal to the sequence length, ordered from fragment position 1 to N. Neutral losses, isotope shifts, and custom deltas are not supported.

Return type:

dict[tuple[IonType, int], list[float]] | list[dict[tuple[IonType, int], list[float]]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_types (Sequence[Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType])

  • charges (Sequence[int] | None)

  • monoisotopic (bool)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
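
The prefix/suffix-sum idea described for `fast_fragment` can be sketched as follows for b and y ions of an unmodified peptide. This is an illustration only: the mass table is a subset covering the example, plain strings stand in for `IonType` keys, protonation is assumed, and ordering y ions by fragment length is an assumption about the "position 1 to N" convention.

```python
from itertools import accumulate

# Monoisotopic residue masses (Da); subset sufficient for 'PEPTIDE'.
MONO = {"P": 97.05276, "E": 129.04259, "T": 101.04768,
        "I": 113.08406, "D": 115.02694}
WATER = 18.010565
PROTON = 1.007276


def b_y_sketch(sequence: str, charge: int = 1) -> dict[tuple[str, int], list[float]]:
    """b and y ion m/z values computed from running sums of residue masses."""
    masses = [MONO[aa] for aa in sequence]
    # prefix[i - 1] is the summed mass of the first i residues (b_i).
    prefix = list(accumulate(masses))
    # suffix[i - 1] is the summed mass of the last i residues (y_i).
    suffix = list(accumulate(reversed(masses)))
    b = [(m + charge * PROTON) / charge for m in prefix]
    y = [(m + WATER + charge * PROTON) / charge for m in suffix]
    return {("b", charge): b, ("y", charge): y}


frags = b_y_sketch("PEPTIDE")
print([round(x, 3) for x in frags[("b", 1)][:2]])
print([round(x, 3) for x in frags[("y", 1)][:2]])
```

Each running sum is computed once, so all N fragments of a series cost O(N) additions rather than O(N²).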

peptacular.sequence.fragmentation.frag(sequence, ion_type=IonType.PRECURSOR, charge=None, monoisotopic=True, isotopes=None, deltas=None, calculate_composition=False, position=None, n_workers=None, chunksize=None, method=None)[source]

Calculate a single fragment from a sequence or multiple sequences.

Return type:

Fragment | list[Fragment]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_type (Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType)

  • charge (int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier] | None)

  • monoisotopic (bool)

  • isotopes (int | dict[str | ElementInfo, int] | None)

  • deltas (str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None)

  • calculate_composition (bool)

  • position (int | tuple[int, int] | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.fragmentation.fragment(sequence, ion_types=(IonType.B, IonType.Y), charges=(1,), monoisotopic=True, isotopes=(0,), deltas=(None,), neutral_deltas=(), calculate_composition=False, max_ndeltas=1, n_workers=None, chunksize=None, method=None)[source]

Builds fragment ions from a given input sequence or list of sequences.

Return type:

list[Fragment] | list[list[Fragment]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • ion_types (Sequence[Literal['p', 'n', 'a', 'b', 'c', 'x', 'y', 'z', 'z.', 'z+H', 'c-H', 'i', 'd', 'd-valine', 'da', 'db', 'da-threonine', 'da-isoleucine', 'db-threonine', 'db-isoleucine', 'v', 'w', 'w-valine', 'wa-threonine', 'wa', 'wb', 'wa-isoleucine', 'wb-threonine', 'wb-isoleucine', 'by', 'ax', 'cz', 'ay', 'az', 'bx', 'bz', 'cx', 'cy'] | ~tacular.ion_types.data.IonType])

  • charges (Sequence[int | str | list[str] | Mods[GlobalChargeCarrier] | GlobalChargeCarrier | Mod[GlobalChargeCarrier]])

  • monoisotopic (bool)

  • isotopes (Sequence[int | dict[str | ElementInfo, int] | None])

  • deltas (Sequence[str | ChargedFormula | float | dict[str | ChargedFormula | float, int] | None])

  • neutral_deltas (Sequence[NeutralDelta | Literal['H', 'NH3', 'H2O', 'CO', 'CO2', 'HCONH2', 'HCOOH', 'CH4OS', 'SO3', 'HPO3', 'C2H5NOS', 'C2H4O2S', 'H3PO4'] | ~tacular.neutral_deltas.dclass.NeutralDeltaInfo | str])

  • calculate_composition (bool)

  • max_ndeltas (int)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

Property

peptacular.sequence.properties.aa_property_percentage(sequence, residues, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • residues (list[str])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.alpha_helix_percent(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.aromaticity(sequence, aromatic_residues=None, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • aromatic_residues (list[str] | None)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.average_buried_area(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.beta_sheet_percent(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.beta_turn_percent(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.bulkiness(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.calc_property(sequence, scale, missing_aa_handling=MissingAAHandling.ERROR, aggregation_method=AggregationMethod.AVG, normalize=False, weighting_scheme=WeightingMethods.UNIFORM, min_weight=0.1, max_weight=1.0, n_workers=None, chunksize=None, method=None)[source]

Calculate a physicochemical property for a sequence or list of sequences.

Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • scale (str | dict[str, float])

  • missing_aa_handling (Literal['zero', 'avg', 'min', 'max', 'median', 'error', 'skip'] | ~peptacular.property.types.MissingAAHandling)

  • aggregation_method (Literal['sum', 'avg'] | ~peptacular.property.types.AggregationMethod)

  • normalize (bool)

  • weighting_scheme (Literal['uniform', 'linear', 'exponential', 'gaussian', 'sigmoid', 'cosine', 'sinusoidal'] | ~peptacular.property.types.WeightingMethods)

  • min_weight (float)

  • max_weight (float)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)
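
How `scale`, `aggregation_method`, and `missing_aa_handling` interact can be sketched with a plain dictionary scale. The sketch covers only the 'sum'/'avg' aggregations and the 'zero', 'skip', and 'error' handling modes, ignores weighting and normalization, and uses a hypothetical function name; the scale holds a few Kyte-Doolittle hydropathy values for illustration.

```python
from statistics import mean

# Illustrative custom scale (a few Kyte-Doolittle hydropathy values).
SCALE = {"A": 1.8, "G": -0.4, "K": -3.9, "L": 3.8}


def aggregate_property(sequence, scale, aggregation="avg", missing="error"):
    """Aggregate per-residue scale values over a plain residue string."""
    values = []
    for aa in sequence:
        if aa in scale:
            values.append(scale[aa])
        elif missing == "zero":
            values.append(0.0)   # treat unknown residues as contributing 0
        elif missing == "skip":
            continue             # leave unknown residues out entirely
        else:
            raise KeyError(f"residue {aa!r} is not in the scale")
    if not values:
        return 0.0
    return sum(values) if aggregation == "sum" else mean(values)


print(aggregate_property("AGKL", SCALE, aggregation="sum"))
print(aggregate_property("AGX", SCALE, missing="skip"))
```

Note that 'zero' and 'skip' give different averages: 'zero' keeps the unknown residue in the denominator, while 'skip' removes it.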

peptacular.sequence.properties.calc_window_property(sequence, scale, window_size=9, missing_aa_handling=MissingAAHandling.ERROR, aggregation_method=AggregationMethod.AVG, normalize=False, weighting_scheme=WeightingMethods.UNIFORM, min_weight=0.1, max_weight=1.0)[source]
Return type:

list[float]

Parameters:
  • sequence (str | ProFormaAnnotation)

  • scale (str | dict[str, float])

  • window_size (int)

  • missing_aa_handling (Literal['zero', 'avg', 'min', 'max', 'median', 'error', 'skip'] | ~peptacular.property.types.MissingAAHandling)

  • aggregation_method (Literal['sum', 'avg'] | ~peptacular.property.types.AggregationMethod)

  • normalize (bool)

  • weighting_scheme (Literal['uniform', 'linear', 'exponential', 'gaussian', 'sigmoid', 'cosine', 'sinusoidal'] | ~peptacular.property.types.WeightingMethods)

  • min_weight (float)

  • max_weight (float)

peptacular.sequence.properties.charge_at_ph(sequence, pH=7.0, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • pH (float)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.codons(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.coil_percent(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.flexibility(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.hplc(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.hydrophilicity(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.hydrophobicity(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.mutability(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.pi(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.polarity(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.property_partitions(sequence, scale, num_windows=5, aa_overlap=0, missing_aa_handling=MissingAAHandling.AVG, aggregation_method=AggregationMethod.AVG, normalize=False, weighting_scheme=WeightingMethods.UNIFORM, min_weight=0.1, max_weight=1.0, n_workers=None, chunksize=None, method=None)[source]

Generate property values for N sliding windows across the sequence.

Divides the sequence into N overlapping windows and calculates property values for each window. Useful for analyzing local variations in peptide properties.

Return type:

list[float] | list[list[float]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • scale (str | dict[str, float])

  • num_windows (int)

  • aa_overlap (int)

  • missing_aa_handling (Literal['zero', 'avg', 'min', 'max', 'median', 'error', 'skip'] | ~peptacular.property.types.MissingAAHandling)

  • aggregation_method (Literal['sum', 'avg'] | ~peptacular.property.types.AggregationMethod)

  • normalize (bool)

  • weighting_scheme (Literal['uniform', 'linear', 'exponential', 'gaussian', 'sigmoid', 'cosine', 'sinusoidal'] | ~peptacular.property.types.WeightingMethods)

  • min_weight (float)

  • max_weight (float)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.recognition_factors(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.refractivity(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.secondary_structure(sequence, scale=SecondaryStructureMethod.DELEAGE_ROUX, n_workers=None, chunksize=None, method=None)[source]
Return type:

dict[str, float] | list[dict[str, float]]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • scale (str)

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.surface_accessibility(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

peptacular.sequence.properties.transmembrane_tendency(sequence, n_workers=None, chunksize=None, method=None)[source]
Return type:

float | list[float]

Parameters:
  • sequence (str | ProFormaAnnotation | Sequence[str | ProFormaAnnotation])

  • n_workers (int | None)

  • chunksize (int | None)

  • method (parallelMethod | Literal['process', 'thread', 'sequential'] | None)

Data Classes

Some of the core data classes used throughout Peptacular.

class peptacular.annotation.mod.Interval(start, end, ambiguous=False, mods=None, validate=False)[source]

Bases: object

Parameters:
  • start (int)

  • end (int)

  • ambiguous (bool)

  • mods (Any | None)

  • validate (bool)

class peptacular.annotation.mod.Mod(value, count)[source]

Bases: Generic

A modification with its occurrence count.

Parameters:
  • value (T)

  • count (int)

as_tuple()[source]

Return the modification as a (value, count) tuple.

Return type:

tuple[TypeVar(T, bound= ModificationProtocol), int]

get_charge()[source]

Get total charge for this modification occurrence.

Return type:

int

get_composition()[source]

Get total composition for this modification occurrence.

Return type:

Counter[ElementInfo]

get_mass(monoisotopic=True)[source]

Get total mass for this modification occurrence.

Return type:

float

Parameters:

monoisotopic (bool)

class peptacular.annotation.mod.Mods(mod_type, _mods)[source]

Bases: MassPropertyMixin, Generic

Collection of modifications of a specific type.

Parameters:
  • mod_type (ModType)

  • _mods (dict[str, int] | None)

get_charge()[source]

Get total charge for all modifications.

Return type:

int

get_composition()[source]

Get total composition for all modifications.

Return type:

Counter[ElementInfo]

get_composition_with_delta_mass_charge(monoisotopic=True)[source]

Get the total composition, falling back to delta mass for MassTags when a composition is not available.

Return type:

tuple[Counter[ElementInfo], float, int]

Parameters:

monoisotopic (bool)

get_mass(monoisotopic=True)[source]

Get total mass for all modifications.

Return type:

float

Parameters:

monoisotopic (bool)

get_mass_charge(monoisotopic=True)[source]

Get total mass and charge for all modifications.

Return type:

tuple[float, int]

Parameters:

monoisotopic (bool)

property mods: tuple[Mod, ...]

Parse stored modifications into typed Mod objects.

parse_items()[source]

Get raw modification items as (mod, count) tuples.

Return type:

Iterable[tuple[TypeVar(T, bound= ModificationProtocol), int]]

parse_tuples()[source]

Get raw modification items as (mod, count) tuples.

Return type:

Iterable[tuple[TypeVar(T, bound= ModificationProtocol), int]]

serialize()[source]

Serialize modifications to string format.

Return type:

str

class peptacular.spans.Span(start, end, missed_cleavages)[source]

Bases: NamedTuple

Parameters:
  • start (int)

  • end (int)

  • missed_cleavages (int)

end: int

Alias for field number 1

missed_cleavages: int

Alias for field number 2

start: int

Alias for field number 0
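
Because `Span` is a `NamedTuple`, its fields can be read by name, by index, or by unpacking. The sketch below defines a local stand-in with the same three fields instead of importing the real class. Whether `end` is inclusive or exclusive is not stated here, so the example makes no claim about slicing.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Local stand-in mirroring the documented fields of peptacular's Span."""
    start: int
    end: int
    missed_cleavages: int


span = Span(start=0, end=7, missed_cleavages=1)

# Field names are aliases for tuple positions 0, 1, and 2.
start, end, missed = span
print(span.start, span[1], missed)
```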

class peptacular.isotope.IsotopicData(mass, neutron_count, abundance)[source]

Bases: object

Parameters:
  • mass (float)

  • neutron_count (int)

  • abundance (float)

class peptacular.isotope.IsotopeLookup(mass_step=50, max_isotopes=25, min_abundance_threshold=0.005, use_neutron_count=True, is_abundance_sum=True)[source]

Bases: object

Parameters:
  • mass_step (int)

  • max_isotopes (int)

  • min_abundance_threshold (float)

  • use_neutron_count (bool)

  • is_abundance_sum (bool)

clear_cache()[source]

Clear the cached isotope patterns.

get_cache_size()[source]

Return the number of cached isotope patterns.

Return type:

int

get_isotope_pattern(mass)[source]

Get the isotope pattern for a given mass.

Generates and caches the pattern if not already present.

Return type:

list[IsotopicData]

Parameters:

mass (float)

Data Modules

Peptacular relies on the Python tacular package for data on amino acids, modifications, elements, and other chemical entities. These data modules are also accessible from the peptacular namespace.

See https://tacular.readthedocs.io