Occasionally, additional clinical data is obtained after samples have been submitted to SomaLogic, or even after 'SomaScan' results have been delivered. The new clinical (i.e. non-proteomic) variables must then be merged with the 'SomaScan' data into a "new" ADAT prior to analysis. merge_clin() merges such clinical variables into an existing soma_adat object and is a simple wrapper around dplyr::left_join().
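Because merge_clin() wraps dplyr::left_join(), it helps to recall what a left join does: every row of the left-hand table is kept, and rows without a match in the right-hand table receive NA in the new columns. The following is a minimal sketch of that behavior using plain data frames. The objects `x` and `clin` and the column `seq.1234.56` are hypothetical stand-ins, not SomaDataIO objects.

```r
library(dplyr)

# Hypothetical stand-ins for an ADAT and a clinical table
x <- data.frame(SampleId    = c("1", "2", "3", "4"),
                seq.1234.56 = c(2933, 1995, 3424, 2989))
clin <- data.frame(SampleId = c("1", "3"),
                   group    = c("a", "b"))

# Left join: keep all rows of `x`, add the columns of `clin` where keys match
joined <- left_join(x, clin, by = "SampleId")

nrow(joined) == nrow(x)   # TRUE: no samples are dropped
is.na(joined$group)       # samples "2" and "4" have no clinical data
```

Checking for NA in the newly merged columns is a quick way to see which samples had no matching clinical record.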

Usage

merge_clin(x, clin_data, by = NULL, by_class = NULL, ...)

Arguments

x

A soma_adat object (with intact attributes), typically created using read_adat().

clin_data

One of two options:

  • a data frame containing clinical variables to merge into x, or

  • a path to a file, typically a *.csv, containing clinical variables to merge into x.

by

A character vector of variables to join by. See dplyr::left_join() for more details.

by_class

If clin_data is a file path, a named character vector giving the join variable in the file and the class it should be read as, for example c(SampleId = "character"). This ensures the "by-key" in the file is compatible with the key in x for the join. See read.table() for details about its colClasses argument, and also the examples below.

...

Additional parameters passed to dplyr::left_join().
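The purpose of by_class can be illustrated with base R alone. Numeric-looking identifiers in a file are read as integers by default, whereas the key in the examples below (SampleId) is a character column, and dplyr joins generally refuse to match keys of incompatible types. This sketch writes a small hypothetical clinical file to a temporary location and compares the two ways of reading it.

```r
# Write a small hypothetical clinical file to a temporary location
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(SampleId = c(1L, 3L, 5L),
                     group    = c("a", "b", "a")),
          tmp, row.names = FALSE)

# Default behavior: the key column is read as integer
default_read <- read.csv(tmp)
class(default_read$SampleId)
#> [1] "integer"

# With a named `colClasses` vector the key is read as character
typed_read <- read.csv(tmp, colClasses = c(SampleId = "character"))
class(typed_read$SampleId)
#> [1] "character"

unlink(tmp)  # clean up the temporary file
```

The named vector passed to colClasses here has the same shape as the value expected by by_class.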

Value

A soma_adat with new clinical variables merged.

Details

This functionality also exists as a command-line tool (R script) contained in merge_clin.R that lives in the cli/merge system file directory. Please see:
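As a sketch, the directory can be located on an installed system with base R's system.file(), using the same "cli/merge" sub-directory that the examples below pass to system.file(). The variable name `cli_dir` is only illustrative.

```r
# Locate the CLI directory inside an installed SomaDataIO;
# system.file() returns "" if the package or directory is not found
cli_dir <- system.file("cli/merge", package = "SomaDataIO")

# List its contents; merge_clin.R is expected to be among them
dir(cli_dir, full.names = TRUE)
```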

Author

Stu Field

Examples

# retrieve clinical data
clin_file <- system.file("cli/merge", "meta.csv",
                         package = "SomaDataIO",
                         mustWork = TRUE)
clin_file
#> [1] "/Users/runner/work/_temp/Library/SomaDataIO/cli/merge/meta.csv"

# view clinical data to be merged:
# 1) `group`
# 2) `newvar`
clin_df <- read.csv(clin_file, colClasses = c(SampleId = "character"))
clin_df
#>   SampleId group    newvar
#> 1        1     a -0.757960
#> 2        3     b -0.363479
#> 3        5     a  1.010235
#> 4        7     b  1.342776
#> 5        9     a -3.010827

# create mini-adat
apts <- withr::with_seed(123, sample(getAnalytes(example_data), 2L))
adat <- head(example_data, 9L) |>   # 9 x 2
  dplyr::select(SampleId, all_of(apts))

# merge clinical variables
merged <- merge_clin(adat, clin_df, by = "SampleId")
merged
#> ══ SomaScan Data ═════════════════════════════════════════════════════════
#>      SomaScan version     V4 (5k)
#>      Signal Space         5k
#>      Attributes intact    
#>      Rows                 9
#>      Columns              5
#>      Clinical Data        3
#>      Features             2
#> ── Column Meta ───────────────────────────────────────────────────────────
#>  SeqId, SeqIdVersion, SomaId, TargetFullName, Target, UniProt,
#>  EntrezGeneID, EntrezGeneSymbol, Organism, Units, Type,
#>  Dilution, PlateScale_Reference, CalReference,
#>  Cal_Example_Adat_Set001, ColCheck,
#>  CalQcRatio_Example_Adat_Set001_170255, QcReference_170255,
#>  Cal_Example_Adat_Set002, CalQcRatio_Example_Adat_Set002_170255,
#>  Dilution2
#> ── Tibble ────────────────────────────────────────────────────────────────
#> # A tibble: 9 × 6
#>   row_names      SampleId seq.19251.56 seq.19328.51 group newvar
#>   <chr>          <chr>           <dbl>        <dbl> <chr>  <dbl>
#> 1 258495800012_3 1               2933.         504  a     -0.758
#> 2 258495800004_7 2               1995.         439. NA    NA    
#> 3 258495800010_8 3               3424.         421. b     -0.363
#> 4 258495800003_4 4               2989.         468. NA    NA    
#> 5 258495800009_4 5               5078.         474. a      1.01 
#> 6 258495800012_8 6               6131.         546. NA    NA    
#> 7 258495800001_3 7               3865          468. b      1.34 
#> 8 258495800004_8 8               6865.         469. NA    NA    
#> 9 258495800001_8 9               9204.         494. a     -3.01 
#> ══════════════════════════════════════════════════════════════════════════

# Alternative syntax:
#   1) pass file path
#   2) merge on different variable names
#   3) convert the join key's class on-the-fly
clin_file2 <- system.file("cli/merge", "meta2.csv",
                          package = "SomaDataIO",
                          mustWork = TRUE)

id_type <- typeof(adat$SampleId)
merged2 <- merge_clin(adat, clin_file2,                # file path
                      by = c(SampleId = "ClinKey"),    # keys named differently
                      by_class = c(ClinKey = id_type)) # match types
merged2
#> ══ SomaScan Data ═════════════════════════════════════════════════════════
#>      SomaScan version     V4 (5k)
#>      Signal Space         5k
#>      Attributes intact    
#>      Rows                 9
#>      Columns              5
#>      Clinical Data        3
#>      Features             2
#> ── Column Meta ───────────────────────────────────────────────────────────
#>  SeqId, SeqIdVersion, SomaId, TargetFullName, Target, UniProt,
#>  EntrezGeneID, EntrezGeneSymbol, Organism, Units, Type,
#>  Dilution, PlateScale_Reference, CalReference,
#>  Cal_Example_Adat_Set001, ColCheck,
#>  CalQcRatio_Example_Adat_Set001_170255, QcReference_170255,
#>  Cal_Example_Adat_Set002, CalQcRatio_Example_Adat_Set002_170255,
#>  Dilution2
#> ── Tibble ────────────────────────────────────────────────────────────────
#> # A tibble: 9 × 6
#>   row_names      SampleId seq.19251.56 seq.19328.51 group newvar
#>   <chr>          <chr>           <dbl>        <dbl> <chr>  <dbl>
#> 1 258495800012_3 1               2933.         504  a     -0.758
#> 2 258495800004_7 2               1995.         439. NA    NA    
#> 3 258495800010_8 3               3424.         421. b     -0.363
#> 4 258495800003_4 4               2989.         468. NA    NA    
#> 5 258495800009_4 5               5078.         474. a      1.01 
#> 6 258495800012_8 6               6131.         546. NA    NA    
#> 7 258495800001_3 7               3865          468. b      1.34 
#> 8 258495800004_8 8               6865.         469. NA    NA    
#> 9 258495800001_8 9               9204.         494. a     -3.01 
#> ══════════════════════════════════════════════════════════════════════════