Get Analyte Annotation Information
Source:R/getAnalyteInfo.R
, R/getTargetNames.R
, R/z-deprecated.R
getAnalyteInfo.Rd
Uses the Col.Meta
attribute (analyte annotation data that appears above
the protein measurements in the *.adat
text file) of a soma_adat
object,
adds the AptName
column key, conducts a few sanity checks, and
generates a "lookup table" of analyte data that can be used for simple
manipulation and indexing of analyte annotation information.
Most importantly, the analyte column names of the soma_adat
(e.g. seq.XXXX.XX
) become the AptName
column of the lookup table and
represents the key index between the table and soma_adat
from which it comes.
Arguments
- adat
A
soma_adat
object (with intact attributes), typically created usingread_adat()
.- tbl
A
tibble
object containing analyte target annotation information. This is usually the result of a call togetAnalyteInfo()
.
Value
A tibble
object with columns corresponding
to the column meta data entries in the soma_adat
. One row per analyte.
Functions
getTargetNames()
: creates a lookup table (or dictionary) as a named list object ofAptNames
and Target names in key-value pairs. This is a convenient tool to quickly access aTargetName
given theAptName
in which the key-value pairs map theseq.XXXX.XX
to its correspondingTargetName
intbl
. This structure which provides a convenient auto-completion mechanism at the command line or for generating plot titles.
Examples
# Get Aptamer table
anno_tbl <- getAnalyteInfo(example_data)
anno_tbl
#> # A tibble: 5,284 × 22
#> AptName SeqId SeqIdVersion SomaId TargetFullName Target UniProt
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
#> 1 seq.10000.28 10000-… 3 SL019… Beta-crystall… CRBB2 P43320
#> 2 seq.10001.7 10001-7 3 SL002… RAF proto-onc… c-Raf P04049
#> 3 seq.10003.15 10003-… 3 SL019… Zinc finger p… ZNF41 P51814
#> 4 seq.10006.25 10006-… 3 SL019… ETS domain-co… ELK1 P19419
#> 5 seq.10008.43 10008-… 3 SL019… Guanylyl cycl… GUC1A P43080
#> 6 seq.10011.65 10011-… 3 SL019… Inositol poly… OCRL Q01968
#> 7 seq.10012.5 10012-5 3 SL014… SAM pointed d… SPDEF O95238
#> 8 seq.10013.34 10013-… 3 SL025… Fc_MOUSE Fc_MO… Q99LC4
#> 9 seq.10014.31 10014-… 3 SL007… Zinc finger p… SLUG O43623
#> 10 seq.10015.119 10015-… 3 SL014… Voltage-gated… KCAB2 Q13303
#> # ℹ 5,274 more rows
#> # ℹ 15 more variables: EntrezGeneID <chr>, EntrezGeneSymbol <chr>,
#> # Organism <chr>, Units <chr>, Type <chr>, Dilution <chr>,
#> # PlateScale_Reference <dbl>, CalReference <dbl>,
#> # Cal_Example_Adat_Set001 <dbl>, ColCheck <chr>,
#> # CalQcRatio_Example_Adat_Set001_170255 <dbl>,
#> # QcReference_170255 <dbl>, Cal_Example_Adat_Set002 <dbl>, …
# Use `dplyr::group_by()`
dplyr::tally(dplyr::group_by(anno_tbl, Dilution)) # print summary by dilution
#> # A tibble: 4 × 2
#> Dilution n
#> <chr> <int>
#> 1 0 12
#> 2 0.005 173
#> 3 0.5 828
#> 4 20 4271
# Columns containing "Target"
anno_tbl |>
dplyr::select(dplyr::contains("Target"))
#> # A tibble: 5,284 × 2
#> TargetFullName Target
#> <chr> <chr>
#> 1 Beta-crystallin B2 CRBB2
#> 2 RAF proto-oncogene serine/threonine-protein kinase c-Raf
#> 3 Zinc finger protein 41 ZNF41
#> 4 ETS domain-containing protein Elk-1 ELK1
#> 5 Guanylyl cyclase-activating protein 1 GUC1A
#> 6 Inositol polyphosphate 5-phosphatase OCRL-1 OCRL
#> 7 SAM pointed domain-containing Ets transcription factor SPDEF
#> 8 Fc_MOUSE Fc_MOUSE
#> 9 Zinc finger protein SNAI2 SLUG
#> 10 Voltage-gated potassium channel subunit beta-2 KCAB2
#> # ℹ 5,274 more rows
# Rows of "Target" starting with MMP
anno_tbl |>
dplyr::filter(grepl("^MMP", Target))
#> # A tibble: 15 × 22
#> AptName SeqId SeqIdVersion SomaId TargetFullName Target UniProt
#> <chr> <chr> <dbl> <chr> <chr> <chr> <chr>
#> 1 seq.15419.15 15419-15 3 SL012… Matrix metall… MMP20 O60882
#> 2 seq.2579.17 2579-17 5 SL000… Matrix metall… MMP-9 P14780
#> 3 seq.2788.55 2788-55 1 SL000… Stromelysin-1 MMP-3 P08254
#> 4 seq.2789.26 2789-26 2 SL000… Matrilysin MMP-7 P09237
#> 5 seq.2838.53 2838-53 1 SL003… Matrix metall… MMP-17 Q9ULZ9
#> 6 seq.4160.49 4160-49 1 SL000… 72 kDa type I… MMP-2 P08253
#> 7 seq.4496.60 4496-60 2 SL000… Macrophage me… MMP-12 P39900
#> 8 seq.4924.32 4924-32 1 SL000… Interstitial … MMP-1 P03956
#> 9 seq.4925.54 4925-54 2 SL000… Collagenase 3 MMP-13 P45452
#> 10 seq.5002.76 5002-76 1 SL002… Matrix metall… MMP-14 P50281
#> 11 seq.5268.49 5268-49 3 SL003… Matrix metall… MMP-16 P51512
#> 12 seq.6425.87 6425-87 3 SL007… Matrix metall… MMP19 Q99542
#> 13 seq.8479.4 8479-4 3 SL000… Stromelysin-2 MMP-10 P09238
#> 14 seq.9172.69 9172-69 3 SL000… Neutrophil co… MMP-8 P22894
#> 15 seq.9719.145 9719-145 3 SL003… Matrix metall… MMP-16 P51512
#> # ℹ 15 more variables: EntrezGeneID <chr>, EntrezGeneSymbol <chr>,
#> # Organism <chr>, Units <chr>, Type <chr>, Dilution <chr>,
#> # PlateScale_Reference <dbl>, CalReference <dbl>,
#> # Cal_Example_Adat_Set001 <dbl>, ColCheck <chr>,
#> # CalQcRatio_Example_Adat_Set001_170255 <dbl>,
#> # QcReference_170255 <dbl>, Cal_Example_Adat_Set002 <dbl>,
#> # CalQcRatio_Example_Adat_Set002_170255 <dbl>, Dilution2 <dbl>
# Target names
tg <- getTargetNames(anno_tbl)
# how to use for plotting
feats <- sample(anno_tbl$AptName, 6)
op <- par(mfrow = c(2, 3))
sapply(feats, function(.x) plot(1:10, main = tg[[.x]]))
#> $seq.16882.27
#> NULL
#>
#> $seq.7145.1
#> NULL
#>
#> $seq.12720.71
#> NULL
#>
#> $seq.2190.55
#> NULL
#>
#> $seq.5090.49
#> NULL
#>
#> $seq.8633.18
#> NULL
#>
par(op)