Query

Examples

Most string parameters accept % as a wildcard (see the reference below for details). Every method also takes a limit parameter that caps the number of results returned.
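The wildcard and limit behaviour can be tried out without a CTD database. The sketch below assumes that a % in a string parameter is handled like a SQL LIKE pattern and that limit acts like a SQL LIMIT clause. The in-memory table, its made-up rows and the helper find_chemicals are illustrative only and are not part of PyCTD.

```python
import sqlite3

# Hypothetical stand-in table; PyCTD's real schema is not reproduced here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chemical (chemical_name TEXT)")
conn.executemany(
    "INSERT INTO chemical VALUES (?)",
    [("Alzane",), ("Alzene",), ("Benzal",), ("Alcohol",)],  # made-up names
)


def find_chemicals(pattern, limit=None):
    """Return names matching a LIKE pattern, optionally capped by limit."""
    sql = (
        "SELECT chemical_name FROM chemical "
        "WHERE chemical_name LIKE ? ORDER BY chemical_name"
    )
    params = [pattern]
    if limit is not None:
        sql += " LIMIT ?"
        params.append(limit)
    return [row[0] for row in conn.execute(sql, params)]


# 'Alz%' matches names starting with 'Alz'; '%en%' matches names containing 'en'.
prefix_matches = find_chemicals("Alz%")
capped_matches = find_chemicals("%en%", limit=1)
```

A leading % matches any prefix and a trailing % any suffix, so '%en%' is a substring search, while limit=1 keeps only the first row of the ordered result.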

Methods

>>> import pyctd
>>> q = pyctd.query()
>>> q.get_disease(disease_id='MESH:D000544', definition='%degenerative%')
>>> q.get_gene(gene_symbol='TSP_15922', uniprot_id='E5T972')
>>> q.get_pathway(pathway_name='%bla')
>>> q.get_chemical(chemical_name='Alz%')
>>> q.get_chem_gene_interaction_actions(organism_id=9606, gene_symbol='APP')
>>> q.get_gene_disease(limit=10)

Properties

>>> import pyctd
>>> q = pyctd.query()
>>> q.gene_forms
>>> q.interaction_actions
>>> q.actions
>>> q.pathways

Query Manager Reference

class pyctd.manager.query.QueryManager(connection=None, echo=False)[source]

Query interface to database.

Parameters:
  • connection (str) – SQLAlchemy connection string
  • echo (bool) – if True, the SQLAlchemy engine prints the SQL statements it issues
actions

Gets the list of allowed actions

Return type: list[str]
direct_evidences
Returns: All available direct evidences for gene disease correlations
Return type: list
gene_forms
Returns: List of strings for all available gene forms
Return type: list[str]
get_action(limit=None, as_df=False)[source]
Parameters:
  • limit
  • as_df
Returns:

get_chem_gene_interaction_actions(gene_name=None, gene_symbol=None, gene_id=None, limit=None, cas_rn=None, chemical_id=None, chemical_name=None, organism_id=None, interaction_sentence=None, chemical_definition=None, gene_form=None, interaction_action=None, as_df=False)[source]

Get all interactions of chemicals with a gene or with a biological entity linked to that gene.

Chemicals can interact with different types of biological entities linked to a gene. The list of allowed entity types can be retrieved via the attribute gene_forms.

Interactions are classified by a combination of an interaction (‘affects’, ‘decreases’, ‘increases’) and an action (‘activity’, ‘expression’, ...). The complete list of allowed combinations can be retrieved via the attribute interaction_actions.

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • interaction_sentence (str) – sentence describing the interactions
  • organism_id (int) – NCBI Taxonomy identifier. Example: 9606 for human.
  • chemical_name (str) – chemical name
  • chemical_id (str) – chemical identifier
  • cas_rn (str) – CAS registry number
  • chemical_definition (str) – definition of the chemical
  • gene_symbol (str) – HGNC gene symbol
  • gene_name (str) – gene name
  • gene_id (int) – NCBI Entrez Gene identifier
  • gene_form (str) – gene form
  • interaction_action (str) – combination of interaction and actions
  • limit (int) – maximum number of results
Return type:

list[models.ChemGeneIxn]

See also

pyctd.manager.models.ChemGeneIxn

which is linked to: pyctd.manager.models.Chemical, pyctd.manager.models.Gene, pyctd.manager.models.ChemGeneIxnPubmed

Available interaction_actions and gene_forms: pyctd.manager.database.Query.interaction_actions(), pyctd.manager.database.Query.gene_forms()
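
As a sketch of how the gene_form and interaction_action arguments of get_chem_gene_interaction_actions might be prepared, the helpers below check a value against a list of allowed values (such as the lists returned by gene_forms and interaction_actions) and combine an interaction with an action. Both helper names are hypothetical. The ^ separator is an assumption based on the notation used in CTD's download files (for example increases^expression), so inspect interaction_actions for the values your database actually holds.

```python
def require_allowed(value, allowed, parameter):
    """Return value unchanged if it is in allowed, otherwise raise ValueError."""
    if value not in allowed:
        raise ValueError(
            "{!r} is not a valid {}; expected one of {}".format(
                value, parameter, sorted(allowed)
            )
        )
    return value


def combine_interaction_action(interaction, action, separator="^"):
    """Join an interaction and an action (the separator is an assumption)."""
    return "{}{}{}".format(interaction, separator, action)


# Hand-written lists standing in for q.gene_forms and q.interaction_actions
# (hypothetical values, not read from a database).
example_gene_forms = ["gene", "mRNA", "protein"]
example_interaction_actions = ["affects^activity", "increases^expression"]

gene_form = require_allowed("protein", example_gene_forms, "gene_form")
interaction_action = require_allowed(
    combine_interaction_action("increases", "expression"),
    example_interaction_actions,
    "interaction_action",
)
```

With a populated database, the checked values could then be passed on, for instance as q.get_chem_gene_interaction_actions(gene_form=gene_form, interaction_action=interaction_action, limit=10).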

get_chemical(chemical_name=None, chemical_id=None, cas_rn=None, drugbank_id=None, parent_id=None, parent_tree_number=None, tree_number=None, synonym=None, limit=None, as_df=False)[source]

Get chemical

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • chemical_name (str) – chemical name
  • chemical_id (str) – chemical identifier
  • cas_rn (str) – CAS registry number
  • drugbank_id (str) – DrugBank identifier
  • parent_id (str) – identifiers of the parent terms
  • parent_tree_number (str) – identifiers of the parent nodes
  • tree_number (str) – identifiers of the chemical’s nodes
  • synonym (str) – chemical synonym
  • limit (int) – maximum number of results
Returns:

list of pyctd.manager.models.Chemical objects

get_chemical__by__disease(disease_name, limit=None, as_df=False)[source]
Parameters:
  • disease_name
  • limit
  • as_df
Returns:

get_chemical_diseases(direct_evidence=None, inference_gene_symbol=None, inference_score=None, inference_score_operator=None, cas_rn=None, chemical_name=None, chemical_id=None, chemical_definition=None, disease_definition=None, disease_id=None, disease_name=None, limit=None, as_df=False)[source]

Get chemical–disease associations with inference gene

Parameters:
  • direct_evidence – direct evidence
  • inference_gene_symbol – inference gene symbol
  • inference_score – inference score
  • inference_score_operator – inference score operator
  • cas_rn
  • chemical_name – chemical name
  • chemical_id
  • chemical_definition
  • disease_definition
  • disease_id
  • disease_name – disease name
  • limit (int) – maximum number of results
  • as_df (bool) – if set to True result returns as pandas.DataFrame
Returns:

list of pyctd.manager.database.models.ChemicalDisease objects

get_disease(disease_name=None, disease_id=None, definition=None, parent_ids=None, tree_numbers=None, parent_tree_numbers=None, slim_mapping=None, synonym=None, alt_disease_id=None, limit=None, as_df=False)[source]

Get diseases

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • limit (int) – maximum number of results
  • disease_name (str) – disease name
  • disease_id (str) – disease identifier
  • definition (str) – definition of disease
  • parent_ids (str) – parent identifiers, delimiter |
  • tree_numbers (str) – tree numbers, delimiter |
  • parent_tree_numbers (str) – parent tree numbers, delimiter |
  • slim_mapping (str) – term derived from the MeSH tree structure for the “Diseases” [C] branch, that classifies MEDIC diseases into high-level categories
  • synonym (str) – disease synonyms
  • alt_disease_id (str) – alternative disease identifiers
Returns:

list of pyctd.manager.models.Disease objects

Todo

normalize parent_ids, tree_numbers and parent_tree_numbers in pyctd.manager.models.Disease
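
Because parent_ids, tree_numbers and parent_tree_numbers are documented as |-delimited strings, client code may want to split them into lists. A minimal sketch follows; the helper name and the sample value are illustrative and not taken from PyCTD or from CTD data.

```python
def split_pipe_delimited(value):
    """Split a '|'-delimited string into a list of non-empty, stripped items."""
    if not value:
        return []
    return [item.strip() for item in value.split("|") if item.strip()]


sample_tree_numbers = "C10.228|C10.574|F03.615"  # made-up example value
tree_numbers = split_pipe_delimited(sample_tree_numbers)
```

Empty or missing values yield an empty list, so the result can be iterated over without a separate None check.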

get_disease_pathways(disease_id=None, disease_name=None, pathway_id=None, pathway_name=None, disease_definition=None, limit=None, as_df=False)[source]

Get disease pathway link

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • disease_id
  • disease_name
  • pathway_id
  • pathway_name
  • disease_definition
  • limit (int) – maximum number of results
Returns:

list of pyctd.manager.database.models.DiseasePathway objects

get_gene(gene_name=None, gene_symbol=None, gene_id=None, synonym=None, uniprot_id=None, pharmgkb_id=None, biogrid_id=None, alt_gene_id=None, limit=None, as_df=False)[source]

Get genes

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • alt_gene_id
  • gene_name (str) – gene name
  • gene_symbol (str) – HGNC gene symbol
  • gene_id (int) – NCBI Entrez Gene identifier
  • synonym (str) – gene synonym
  • uniprot_id (str) – UniProt primary accession number
  • pharmgkb_id (str) – PharmGKB identifier
  • biogrid_id (int) – BioGRID identifier
  • limit (int) – maximum number of results
Return type:

list[models.Gene]

get_gene_disease(direct_evidence=None, inference_chemical_name=None, inference_score=None, gene_name=None, gene_symbol=None, gene_id=None, disease_name=None, disease_id=None, disease_definition=None, limit=None, as_df=False)[source]

Get gene–disease associations

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • gene_id (int) – gene identifier
  • gene_symbol (str) – gene symbol
  • gene_name (str) – gene name
  • direct_evidence (str) – direct evidence
  • inference_chemical_name (str) – name of the inference chemical
  • inference_score (float) – inference score
  • disease_name – disease name
  • disease_id – disease identifier
  • disease_definition – disease definition
  • limit (int) – maximum number of results
Returns:

list of pyctd.manager.database.models.GeneDisease objects

get_gene_pathways(gene_name=None, gene_symbol=None, gene_id=None, pathway_id=None, pathway_name=None, limit=None, as_df=False)[source]

Get gene pathway link

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • gene_name (str) – gene name
  • gene_symbol (str) – gene symbol
  • gene_id (int) – NCBI Gene identifier
  • pathway_id
  • pathway_name (str) – pathway name
  • limit (int) – maximum number of results
Returns:

list of pyctd.manager.database.models.GenePathway objects

get_go_enriched__by__chemical_name(chemical_name, limit=None, as_df=False)[source]
Parameters:
  • chemical_name
  • limit
  • as_df
Returns:

get_marker_chemical__by__disease_name(disease_name, limit=None, as_df=False)[source]
Parameters:
  • disease_name
  • limit
  • as_df
Returns:

get_pathway(pathway_name=None, pathway_id=None, limit=None, as_df=False)[source]

Get pathway

Note

Format of pathway_id is KEGG:X* or REACTOME:X*, where X* stands for a sequence of digits.

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • pathway_name (str) – pathway name
  • pathway_id (str) – KEGG or REACTOME identifier
  • limit (int) – maximum number of results
Returns:

list of pyctd.manager.models.Pathway objects
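
The pathway_id format given in the note can be checked with a regular expression before querying. The sketch below follows the note's wording literally and assumes at least one digit after the prefix; the helper name is hypothetical, and identifiers in an actual CTD release may deviate from this pattern, so treat it as a sanity check only. Patterns containing the % wildcard will not pass it.

```python
import re

# Format stated in the note: a KEGG or REACTOME prefix, a colon,
# then one or more digits (the "one or more" part is an assumption).
PATHWAY_ID_PATTERN = re.compile(r"(KEGG|REACTOME):\d+")


def looks_like_pathway_id(pathway_id):
    """Return True if the whole string matches the documented format."""
    return PATHWAY_ID_PATTERN.fullmatch(pathway_id) is not None
```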

get_pathway_enriched__by__chemical_name(chemical_name, limit=None, as_df=False)[source]
Parameters:
  • chemical_name
  • limit
  • as_df
Returns:

get_therapeutic_chemical__by__disease_name(disease_name, limit=None, as_df=False)[source]

Get therapeutic chemical by disease name

Parameters:
  • as_df (bool) – if set to True result returns as pandas.DataFrame
  • limit (int) – maximum number of results
  • disease_name (str) – disease name
Returns:

therapeutic chemical

Return type:

list[models.ChemicalDisease]
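
To look up therapeutic chemicals for several diseases at once, the call can be wrapped in a small helper. This is a sketch only: the helper name is hypothetical, and it assumes that query is an object exposing get_therapeutic_chemical__by__disease_name with the signature shown above, such as the one returned by pyctd.query().

```python
def therapeutic_chemicals_for(query, disease_names, limit=None):
    """Map each disease name to the result of the therapeutic-chemical query."""
    return {
        name: query.get_therapeutic_chemical__by__disease_name(name, limit=limit)
        for name in disease_names
    }
```

The helper issues one query per disease name, so limit applies to each disease separately rather than to the combined result.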

interaction_actions
Returns: List of strings for allowed interaction/actions combinations
Return type: list[str]
pathways

Get all pathways

Return type: list[models.Pathway]