Field-Based Querying Tutorial
This tutorial demonstrates how to use UniProtMapper’s field-based querying functionality.
Basic Field Queries
A simple example on querying UniProtKB through field search:
from UniProtMapper import ProtKB
from UniProtMapper.uniprotkb_fields import reviewed, organism_name
protkb = ProtKB()
# Find reviewed human proteins
query = reviewed(True) & organism_name("human")
result, failed = protkb.get(query)
Note
Running this code will take some time as it retrieves all reviewed human proteins! Each iteration of the displayed progress bar represents 500 entries fetched from UniProtKB.
Complex Queries
You can combine multiple fields with boolean operators, illustrated by the following examples:
Example 1:
from UniProtMapper import ProtKB
from UniProtMapper.uniprotkb_fields import (
organism_name,
length,
mass,
date_modified,
)
protkb = ProtKB()
# Find human proteins:
# - NOT modified after 2023 (in UniProtKB)
# - between 200-300 amino acids
# - mass < 50kDa
query = (
organism_name("human") &
length(200, 300) &
mass("*", 50000) &
(~ date_modified("2023-01-01", "*"))
)
result = protkb.get(query)
Example 2:
from UniProtMapper import ProtKB
from UniProtMapper.uniprotkb_fields import (
xref_count,
organism_id,
reviewed,
fragment,
length,
)
protkb = ProtKB()
# Find human proteins:
# - with 2 or more deposited pdb strctures
# - not fragments fragments
# - reviewed
# - length < 750 amino acids
query = (
xref_count("pdb", 2, "*")
& organism_id(9606)
& reviewed(True)
& fragment(False)
& length("*", 750)
)
result = protkb.get(query)
Note
The fields
parameter is also supported by the ProtKB
API. For a full list of the supported fields, check the Supported fields section of the docs.
Field Types
UniProtMapper supports several types of fields. For full documentation on the fields implemented in the package, check Field Querying.
See below for examples of different field types implemented in UniProtMapper.
Boolean Fields
from UniProtMapper.uniprotkb_fields import reviewed, fragment, is_isoform
# Example: Get reviewed entries that are not fragments
query = reviewed(True) & ~fragment(True)
Range Fields
from UniProtMapper.uniprotkb_fields import length, mass
# Example: Proteins between 200-300 amino acids
query = length(200, 300)
Date Range Fields
from UniProtMapper.uniprotkb_fields import date_created, date_modified
# Example: Entries created in 2023
query = date_created("2023-01-01", "2023-12-31")
Text-Based Fields
from UniProtMapper.uniprotkb_fields import gene_exact, keyword, family
# Example: Proteins in kinase family with ATP-binding
query = family("Kinase*") & keyword("ATP-binding")