Assigning Variants to Genes (V2G) with OpenTargets API and ghql



Looking for a quick way to annotate a set of variants, for example those identified by a GWAS?

OpenTargets [https://www.opentargets.org/] has a very useful feature that solves this problem quickly.

Citing the OpenTargets documentation:
All variants in the variant index are annotated using our Variant-to-Gene (V2G) pipeline. The pipeline integrates V2G evidence that fall into four main data types:

- Molecular phenotype quantitative trait loci experiments (QTLs)
- Chromatin interaction experiments, e.g. Promoter Capture Hi-C (PCHi-C)
- In silico functional predictions, e.g. Variant Effect Predictor (VEP) from Ensembl
- Distance between the variant and each gene's canonical transcription start site (TSS)

Within each data type there are multiple sources of information produced by different experimental methods. Some of these sources can further be broken down into separate tissues or cell types (features). A full list of data sources used in the V2G pipeline can be seen on the Data Sources page.

Given a variant, OpenTargets returns the set of genes to which that variant can be assigned, with an overall score for each gene.

An example can be seen on this page.

The same results can also be obtained from an R script, using the ghql library.
This library makes it possible to query the Open Targets Genetics GraphQL API.
The API documentation is not exhaustive, but it is sufficient to build very useful queries.

The following example is a query that returns the Variant-to-Gene annotations precomputed by OpenTargets for a set of variants.

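library(ghql)
library(jsonlite)

# Create the GraphQL client for the Open Targets Genetics API
cli <- GraphqlClient$new(
  url = "https://genetics-api.opentargets.io/graphql"
)

# Define a query that returns the genes assigned to one variant,
# together with the overall score for each gene
qry <- Query$new()
qry$query('my_query', 'query getGenesForVariant($varId: String!){
  genesForVariant(variantId: $varId) {
    variant
    overallScore
    gene {
      id
      symbol
    }
  }
}')

# Variant IDs to annotate
variants <- c("1_154445939_T_C", "1_154453788_C_T")

# Run the query once per variant and flatten each JSON response
res_list <- lapply(variants, function(var_id) {
  response <- cli$exec(qry$queries$my_query, list(varId = var_id))
  fromJSON(response, flatten = TRUE)$data$genesForVariant
})

# Combine the per-variant results into a single data frame
res <- do.call(rbind, res_list)
print(res)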
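
A common next step is to keep only the best-scoring gene for each variant. The following is a minimal sketch of that idea in base R, not part of the OpenTargets API. It assumes that the flattened result has the columns variant, overallScore, gene.id and gene.symbol, which is what fromJSON(..., flatten = TRUE) should produce for the query above. To keep the example runnable offline, it uses a mock data frame: the gene identifiers and scores are made-up placeholders, not real OpenTargets output.

```r
# Mock result with the columns expected from the flattened query output.
# All gene names and scores below are made-up placeholders.
res <- data.frame(
  variant = c("1_154445939_T_C", "1_154445939_T_C",
              "1_154453788_C_T", "1_154453788_C_T"),
  overallScore = c(0.10, 0.35, 0.22, 0.05),
  gene.id = c("GENE_ID_A", "GENE_ID_B", "GENE_ID_A", "GENE_ID_C"),
  gene.symbol = c("GENE_A", "GENE_B", "GENE_A", "GENE_C"),
  stringsAsFactors = FALSE
)

# Sort by variant, then by descending score
ordered <- res[order(res$variant, -res$overallScore), ]

# Keep the first (highest-scoring) row of each variant
top_genes <- ordered[!duplicated(ordered$variant), ]
rownames(top_genes) <- NULL

print(top_genes)
```

With real data, the same lines can be applied directly to the res data frame returned by the query. Taking only the top gene is a simplification: when several genes have similar scores for a variant, it may be worth keeping them all.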

