Methods & Resources
Search Approach for GWAS
An experienced reviewer periodically searched the literature and reviewed each abstract and publication to determine if criteria for inclusion were met. The search terms in the Table below were applied. Annual checks were also made against other existing GWAS databases to ensure publications were not missed.
Occasional searches of Google Scholar were conducted using the same search terms, which identified a few additional studies, including some in journals not indexed by PubMed. If articles or portions of their supplementary material were missing or unavailable, authors were contacted with reprint requests in order to determine if the study met inclusion criteria. All GWAS included in GRASP Build 22.214.171.124 were first published on or before July 8, 2013.
Study Inclusion and Exclusion Criteria
We define GWAS as studies that reported testing ≥25,000 human genetic markers for one or more traits. We exclude the following: CNV-only studies; replication/follow-up studies testing <25K markers; non-human-only studies; articles not available in English; gene-environment or gene-gene GWAS where single-SNP main effects are not given; linkage-only studies; aCGH/LOH-only studies; studies presenting only gene-based or pathway-based results; heterozygosity/homozygosity (genome-wide or long-run) studies; simulation-only studies; and studies that we judge redundant with prior studies because they neither include a significant number of new samples nor present new results (e.g., many methodological papers on the WTCCC and FHS GWAS).
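The marker-count threshold and a few of the exclusion reasons above can be illustrated with a small screening function. This is only a minimal Python sketch: the actual review was done manually by curators, and the `Study` record, its field names, and the subset of criteria shown are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MIN_MARKERS = 25_000  # threshold stated in the inclusion criteria


@dataclass
class Study:
    # Hypothetical record; these field names are illustrative only.
    pmid: int
    markers_tested: int
    human: bool = True
    in_english: bool = True
    cnv_only: bool = False
    reports_single_snp_effects: bool = True


def exclusion_reasons(study: Study) -> List[str]:
    """Return the reasons a study would be excluded (empty list means include)."""
    reasons = []
    if study.markers_tested < MIN_MARKERS:
        reasons.append("fewer than 25,000 markers tested")
    if not study.human:
        reasons.append("non-human only")
    if not study.in_english:
        reasons.append("article not available in English")
    if study.cnv_only:
        reasons.append("CNV-only study")
    if not study.reports_single_snp_effects:
        reasons.append("single-SNP main effects not given")
    return reasons
```

A study testing exactly 25,000 markers passes the threshold, since the criterion is "≥25,000". The remaining criteria (for example, redundancy with prior studies) require curator judgment and are not modeled here.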
GWAS Data Extraction
GWAS data extraction was performed by experienced researchers. QUOSA (Waltham, MA)
was used to automate the download of article PDFs where possible. Supplementary
text, tables and figures were all examined to determine if they contained any genetic
marker results. All manuscript materials were manually scanned to determine if there
was indication of additional GWAS results available at an external site or database,
in which case attempts were made to download all available results. If obtaining
GWAS results required institutional approvals or other extensive applications, these
results were not pursued.
A set of information was collected at the level of each individual article. Basic
information about the article was collected (PubMed identifier, date of first publication,
journal, article title). On 9/15/14, studies in GRASP were compared
by PubMed ID against the NHGRI GWAS catalog to record whether studies from GRASP were
included there. One or more overall phenotype descriptions were assigned to each article
based on the GWAS conducted. Genotyping platforms (Affymetrix, Illumina, Perlegen, Sequenom
or other combinations) were recorded. The number of SNP markers included in post-QC
analyses was recorded if clearly given; otherwise it was approximated based
on the SNP array(s) or imputation.
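The comparison against the NHGRI GWAS catalog by PubMed ID amounts to a set-membership check. A minimal Python sketch follows; the function name and the way identifiers are normalized are assumptions, and the identifiers in the usage note are made-up placeholders rather than real studies.

```python
from typing import Dict, Iterable


def flag_catalog_membership(
    grasp_pmids: Iterable, catalog_pmids: Iterable
) -> Dict[str, bool]:
    """Map each GRASP PubMed ID to True if it also appears in the catalog.

    IDs are compared as stripped strings so that 123 and " 123 " match.
    """
    catalog = {str(p).strip() for p in catalog_pmids}
    return {str(p).strip(): str(p).strip() in catalog for p in grasp_pmids}
```

For example, `flag_catalog_membership([101, 102], ["102", "103"])` marks `"101"` as absent from the catalog and `"102"` as present.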
All SNP associations in the Full Download version are mapped to the genome [hg19] build and reference SNP database [dbSNP build 141]. All SNPs and genes in the Query Search are based on current NCBI builds and may differ from the Download version. Individual SNP-phenotype entries in GRASP have been annotated for assigned SNP function [dbSNP functional classifications], location within or near protein-coding genes [based on RefSeq genes], non-coding RNAs [based on Cabili PMID 21890647], microRNAs [based on miRBase version 18, Kozomara PMID 21037258], microRNA target binding sites [based on PolymiRTS, Ziebarth PMID 22080514], validated human enhancer regions [based on VISTA Enhancers, Visel PMID 17130149] and other known regulatory elements [based on ORegAnno, Griffith PMID 18006570], amino acid changes and their predicted consequences [PolyPhen2, Adzhubei PMID 20354512; SIFT, Kumar PMID 19561590; LRT, Chun PMID 19602639] and post-translational modifications and other protein functional features [based on mapping of UniProt features to amino acid positions in the current genome build, UniProt PMID 21051339].
An NHLBIkey unique to each individual result was generated in the final database by concatenating the PubMed ID and the row number. A “CreationDate” and a “LastCurationDate” were assigned to each result upon its creation in GRASP. If an entry is edited in a later Build, its LastCurationDate will no longer match its CreationDate, but its NHLBIkey will remain the same.
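The key and date behavior described above can be sketched in Python as follows. The exact key format is an assumption: the text says only that the PubMed ID and row number are concatenated, so plain string concatenation with no separator or padding is used here. The `ResultEntry` class and its method are hypothetical and are not part of GRASP.

```python
from dataclasses import dataclass
from datetime import date


def make_nhlbi_key(pmid: int, row_number: int) -> str:
    # Assumption: plain concatenation, no separator or zero-padding.
    return f"{pmid}{row_number}"


@dataclass
class ResultEntry:
    # Hypothetical record for one SNP-phenotype result.
    nhlbi_key: str
    creation_date: date
    last_curation_date: date

    def curate(self, when: date) -> None:
        """Record an edit; the key and creation date are left unchanged."""
        self.last_curation_date = when


# A new entry starts with both dates equal; a later edit moves only one.
entry = ResultEntry(make_nhlbi_key(12345678, 42), date(2013, 7, 8), date(2013, 7, 8))
entry.curate(date(2014, 9, 15))
```

Under this assumed format, keys built without a separator are unambiguous only if PubMed IDs have a fixed length or the row number is padded; a real implementation would need to settle that detail.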
Phenotype Terminology Assignment
We have assigned our own phenotype categories to facilitate searching and categorization.
Studies with No Available Results
Some studies conducted GWAS but did not report any SNP-specific results in their manuscripts or in readily accessible supplemental materials. These studies are included in the overall GWAS study list, but no results will be found for them. A separate list of these studies is maintained here.