
select is the primary data retrieval method for the SomaScan.db database. It returns a data frame of SomaScan annotations determined by the keys, columns, and keytype arguments. The default keytype is "PROBEID", i.e. the SomaLogic SeqId; this value ties all annotations back to a SomaScan-specific identifier.

Usage

# S4 method for SomaDb
select(x, keys, columns, keytype, menu = NULL, match = FALSE, ...)

Arguments

x

the AnnotationDb object. In practice, this is an object derived from AnnotationDb, such as an OrgDb or ChipDb object.

keys

the keys for which records are selected from the database. All possible keys are returned by the keys method.

columns

the columns or kinds of things that can be retrieved from the database. As with keys, all possible columns are returned by using the columns method.

keytype

the keytype that matches the keys used. For the select method, this indicates the kind of ID supplied in the keys argument. For the keys method, this indicates which kind of keys to return.

menu

a character string identifying a SomaScan menu version (optional). Possible options include "5k", "7k", or "11k", as well as the version numbers for those menus ("v4.0", "v4.1", or "v5.0", respectively). May only be used when keytype = "PROBEID". This argument filters the keys to the specified menu and returns only data associated with analytes present in that menu. By default, annotations for all analytes are available.

match

a logical (TRUE/FALSE). May only be used with the "SYMBOL", "ALIAS", or "GENENAME" keytypes. If TRUE, the character string provided for keys is used as a search term and matches any symbol that starts with that string (e.g. a key of "CASP1" will return annotations for both the CASP10 and CASP14 genes).

...

Arguments passed on to AnnotationDbi::select
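
The menu and match arguments described above can be sketched as follows. This is a minimal illustration, not output-verified: it assumes SomaScan.db is installed and that the example SeqIds (taken from the Examples section of this page) are present on the 7k menu.

```r
library(SomaScan.db)

# Example SeqIds copied from the Examples section of this page
ids <- c("10000-28", "10001-7", "10003-15")

# Restrict the lookup to analytes on the 7k menu ("v4.1" should be equivalent).
# menu may only be combined with keytype = "PROBEID".
# Assumption: these SeqIds exist on the 7k menu.
on_7k <- select(SomaScan.db, keys = ids, columns = "SYMBOL",
                keytype = "PROBEID", menu = "7k")

# Treat the key as a search term: with match = TRUE, symbols that
# start with "CASP1" (such as CASP10 and CASP14) are matched as well.
casp <- select(SomaScan.db, keys = "CASP1", keytype = "SYMBOL",
               columns = "PROBEID", match = TRUE)
```

Both calls return a data.frame, so the results can be inspected with head() or nrow() before further use.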

Value

A data.frame containing the retrieved annotations.

Details

Users should be aware that if they call select and request columns that have multiple matches for the provided keys (e.g. GO terms), select will return a data.frame with one row for each possible match. This can have a multiplicative effect and result in a large number of returned values. In general, if a user needs to retrieve a column that has a many-to-one relationship to the original keys, it is best to extract data from that column in its own query.
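
One way to follow this advice is to split the lookup into two queries, as in the sketch below. It assumes SomaScan.db is installed and that "GO" is among the available columns (this can be checked with columns(SomaScan.db)); the variable names are illustrative only.

```r
library(SomaScan.db)

# A small set of example keys
ids <- head(keys(SomaScan.db))

# Query 1: columns that usually map one-to-one to each key
genes <- select(SomaScan.db, keys = ids, columns = c("SYMBOL", "GENETYPE"))

# Query 2: a column with many matches per key, retrieved on its own.
# Assumption: "GO" is an available column in this database.
go_terms <- select(SomaScan.db, keys = ids, columns = "GO")

# Comparing row counts shows how much the multi-match column expands the result
nrow(genes)
nrow(go_terms)
```

Keeping the two results separate avoids multiplying the rows of one multi-match column by those of another; they can still be joined on PROBEID later if a combined table is needed.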

Author

Amanda Hiser

Examples

# Retrieve a set of example keys
keys <- head(keys(SomaScan.db))
keys
#> [1] "10000-28" "10001-7"  "10003-15" "10006-25" "10008-43" "10010-10"

# Look up the gene symbol and gene type for all example keys
select(SomaScan.db, keys = keys, columns = c("SYMBOL", "GENETYPE"))
#> 'select()' returned 1:1 mapping between keys and columns
#>    PROBEID SYMBOL       GENETYPE
#> 1 10000-28 CRYBB2 protein-coding
#> 2  10001-7   RAF1 protein-coding
#> 3 10003-15  ZNF41 protein-coding
#> 4 10006-25   ELK1 protein-coding
#> 5 10008-43 GUCA1A protein-coding
#> 6 10010-10  BECN1 protein-coding

# Look up SomaScan SeqIds & proteins associated with a gene of interest
select(SomaScan.db, keys = "NOTCH3", keytype = "SYMBOL", 
      columns = c("PROBEID", "UNIPROT"))
#> 'select()' returned 1:many mapping between keys and columns
#>   SYMBOL PROBEID UNIPROT
#> 1 NOTCH3 5108-72  Q9UEB3
#> 2 NOTCH3 5108-72  Q9UM47
#> 3 NOTCH3 5108-72  Q9UPL3
#> 4 NOTCH3 5108-72  Q9Y6L8
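
As noted in the Arguments section, the valid values for keys, columns, and keytype can be listed from the database itself. The sketch below assumes SomaScan.db is installed and uses the standard keytypes, columns, and keys methods from AnnotationDbi; no output is shown because it depends on the installed database version.

```r
library(SomaScan.db)

# Kinds of identifiers that may be passed as keytype
keytypes(SomaScan.db)

# Kinds of annotations that may be requested via columns
columns(SomaScan.db)

# Keys of a specific type, here gene symbols rather than the default SeqIds
symbol_keys <- head(keys(SomaScan.db, keytype = "SYMBOL"))
symbol_keys
```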