Post

How to prioritize disease-gene pairs by calculating the similarity between a patient's clinical features and all known diseases using the Human Phenotype Ontology

The Human Phenotype Ontology (HPO) [ https://hpo.jax.org/ ] is a set of hierarchically structured terms widely used to describe both the standard symptoms of human diseases and the clinical phenotypes of individual patients. It is a standard vocabulary for annotating disease phenotypes and has been adopted by many public disease databases. One such resource is Orphanet [ https://www.orpha.net/ ]. Its aggregate data, such as rare-disease-associated genes and clinical symptoms, are constantly updated and can be accessed from the Orphadata website [http://www.orphadata.org/cgi-bin/index.php]. These datasets are available in nine languages. Several bioinformatics methods use HPO in clinical diagnostics; they generally take a description of a patient's clinical features encoded as HPO terms and return a diagnostic prediction based on the ontological similarity between the patient's symptoms and the HPO codes assig
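
To make the idea of ontological similarity concrete, here is a minimal sketch in R. It is not the method used in the post: it assumes a tiny toy ontology with made-up term IDs (not real HPO codes) and uses a simple Jaccard index over ancestor sets as a stand-in for whichever similarity measure a real tool applies. All names (parents, ancestors, closure, similarity, disease_A, disease_B) are hypothetical.

```r
# Toy ontology: each term maps to its direct parents.
# The term IDs are invented for illustration; they are NOT real HPO codes.
parents <- list(
  "T:root"       = character(0),
  "T:neuro"      = "T:root",
  "T:seizure"    = "T:neuro",
  "T:ataxia"     = "T:neuro",
  "T:heart"      = "T:root",
  "T:arrhythmia" = "T:heart"
)

# Return a term together with all of its ancestors (breadth-first walk).
ancestors <- function(term) {
  seen <- character(0)
  queue <- term
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    if (!(current %in% seen)) {
      seen <- c(seen, current)
      queue <- c(queue, parents[[current]])
    }
  }
  seen
}

# Expand a set of terms to the union of their ancestor sets.
closure <- function(terms) unique(unlist(lapply(terms, ancestors)))

# Jaccard similarity between the ancestor closures of two term sets
# (an assumed, deliberately simple measure).
similarity <- function(a, b) {
  ca <- closure(a)
  cb <- closure(b)
  length(intersect(ca, cb)) / length(union(ca, cb))
}

# Hypothetical patient and disease annotations.
patient <- c("T:seizure")
diseases <- list(
  disease_A = c("T:seizure", "T:ataxia"),
  disease_B = c("T:arrhythmia")
)

# Score every disease against the patient and rank from most to least similar.
scores <- sort(sapply(diseases, similarity, a = patient), decreasing = TRUE)
print(scores)
```

Expanding each term to its ancestors is what lets two related but non-identical terms contribute to the score; with real data, the parent relationships would come from the HPO release files rather than a hand-written list.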

Assigning Variants to Genes (V2G) with OpenTargets API and ghql

Looking for a quick way to annotate a set of variants identified, for example, by a GWAS? OpenTargets [https://www.opentargets.org/] has a very useful feature that solves this problem quickly. Quoting the OpenTargets documentation: All variants in the variant index are annotated using our Variant-to-Gene (V2G) pipeline. The pipeline integrates V2G evidence that fall into four main data types: (1) Molecular phenotype quantitative trait loci experiments (QTLs); (2) Chromatin interaction experiments, e.g. Promoter Capture Hi-C (PCHi-C); (3) In silico functional predictions, e.g. Variant Effect Predictor (VEP) from Ensembl; (4) Distance between the variant and each gene's canonical transcription start site (TSS). Within each data type there are multiple sources of information produced by different experimental methods. Some of these sources can further be broken down into separate tissues or cell types (features). A full list of data sources used in the V2G pipeline can be seen on

How to retrieve the strength and direction of association of a group of genetic variants with 700+ immunophenotypes

The very powerful ieugwasr library lets you query, very quickly and from R, the tens of thousands of GWAS summary statistics available in the IEU GWAS collection. Below is a nice example of what this library can do. Suppose we want to retrieve the strength and direction of association of a group of variants with the 700+ immunophenotypes published by Orrù et al. in 2020. First, we need to authenticate with ieugwasr:

    ieugwasr::get_access_token()

Next, we need to extract the identification codes of the summary statistics from the article by Orrù et al., "Complex genetic signatures in immune cells underlie autoimmunity and inform therapy":

    ieugwasr::get_access_token()
    gwi <- gwasinfo()
    immune <- subset(gwi, gwi$pmid=='32929287')

At this point, with a few lines of code, we can retrieve the association data for a group of variants (in this case two) and save the results in a file for each variant:

    i <- associations(variants = c('rs11651270&