Chapter 3 Merge Data

Data pertaining to individual samples and stored in external files can be merged with the proteomic data in an ADAT file using the Merge Data panel.

The Merge Data panel

3.1 Selecting a Data File to Merge

External data must be stored in a comma-delimited or tab-delimited file whose first row contains the column names. To upload the file to ProViz, select the Browse button and navigate to the file. Once the file is uploaded, a preview of its contents is displayed below the ADAT Preview.
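The file layout described above can be checked outside ProViz before uploading. The following is a minimal Python sketch, not ProViz's own reader: the function name `read_delimited` and the sample columns (`Age`, `Sex`) are hypothetical, and the delimiter is guessed from the header row on the assumption that the file is either comma- or tab-delimited.

```python
import csv
import io

def read_delimited(text):
    """Parse comma- or tab-delimited text whose first row holds column names.

    Assumption: the delimiter is whichever of tab or comma appears more
    often in the header row. The text must contain at least a header row.
    """
    first_line = text.splitlines()[0]
    delimiter = "\t" if first_line.count("\t") > first_line.count(",") else ","
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)  # each row becomes a dict keyed by column name
    return reader.fieldnames, rows

# Hypothetical tab-delimited external data with a header row.
sample = "SampleId\tAge\tSex\nS1\t54\tF\nS2\t61\tM\n"
columns, rows = read_delimited(sample)
print(columns)   # column names taken from the first row
print(rows[0])   # first data row
```

Printing the column names and the first row gives a quick view similar in spirit to the preview shown in the panel.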

3.2 Selecting Columns in the ADAT and the Data File

To merge external data with the data in the ADAT file, each file must have one column that can be used to match rows between the two files. Ideally, the values in each column are unique and match one-to-one between the ADAT file and the external data file. The two columns do not need to have the same name. If the selected matching column contains duplicate IDs in either file, every row with a given ID is matched to every row with that ID in the other file, so rows may be duplicated and the resulting data file may be confusing.
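The effect of duplicate IDs can be illustrated with a short Python sketch. This is an illustration of all-to-all matching in general, not ProViz's implementation; the helper names `duplicated_ids` and `merged_row_count` and the column name `SubjectId` are hypothetical.

```python
from collections import Counter

def duplicated_ids(rows, key):
    """Return the IDs that occur more than once in the given column."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)

def merged_row_count(adat_rows, ext_rows, adat_key, ext_key):
    """Count the rows produced when every ADAT row is paired with every
    external row that has the same ID (unmatched rows are not counted)."""
    ext_counts = Counter(row[ext_key] for row in ext_rows)
    return sum(ext_counts[row[adat_key]] for row in adat_rows)

# Hypothetical data: S2 appears twice in both files.
adat_rows = [{"SampleId": "S1"}, {"SampleId": "S2"}, {"SampleId": "S2"}]
ext_rows = [{"SubjectId": "S1"}, {"SubjectId": "S2"}, {"SubjectId": "S2"}]

print(duplicated_ids(adat_rows, "SampleId"))
print(merged_row_count(adat_rows, ext_rows, "SampleId", "SubjectId"))
```

Here the three ADAT rows become five matched rows, because each of the two S2 rows in the ADAT pairs with both S2 rows in the external data. Checking for duplicates in both matching columns before merging avoids this.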

The target column in the ADAT can be selected with the ADAT Merge Column selection box, and the column in the external file can be selected with the Data Merge Column selection box.

In this example, each file has a column titled SampleId. All entries in this column in the external file are unique and correspond to the clinical samples’ SampleId values in the ADAT.

3.3 Specifying the Type of Merge

The ADAT file typically contains control samples for which external data is unlikely to be available, and the external data file may not have data for all clinical samples. Choosing Keep All ADAT Rows under Type of Merge retains all rows in the ADAT file and fills the merged columns with NA for rows not found in the external file. Alternatively, selecting Keep Only Intersection retains only those rows found in both files.
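The difference between the two merge types can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the R code behind ProViz: the function `merge_rows`, its `keep_all_adat` flag, and the sample values are hypothetical; IDs in the external data are assumed to be unique; and `None` stands in for NA.

```python
def merge_rows(adat_rows, ext_rows, adat_key, ext_key, keep_all_adat=True):
    """Attach external columns to ADAT rows by matching ID columns.

    keep_all_adat=True  mimics "Keep All ADAT Rows": unmatched ADAT rows
                        are kept and their external columns are set to None.
    keep_all_adat=False mimics "Keep Only Intersection": unmatched ADAT
                        rows are dropped.
    """
    ext_by_id = {row[ext_key]: row for row in ext_rows}  # assumes unique IDs
    ext_cols = [c for c in ext_rows[0] if c != ext_key]
    merged = []
    for row in adat_rows:
        match = ext_by_id.get(row[adat_key])
        if match is None:
            if not keep_all_adat:
                continue  # row is missing from the external file: drop it
            extra = {c: None for c in ext_cols}
        else:
            extra = {c: match[c] for c in ext_cols}
        merged.append({**row, **extra})
    return merged

# Hypothetical ADAT rows: two clinical samples and one control.
adat_rows = [
    {"SampleId": "S1", "SampleType": "Sample"},
    {"SampleId": "S2", "SampleType": "Sample"},
    {"SampleId": "C1", "SampleType": "Control"},
]
# Hypothetical external data covering only one clinical sample.
ext_rows = [{"SampleId": "S1", "Age": "54"}]

keep_all = merge_rows(adat_rows, ext_rows, "SampleId", "SampleId")
intersection = merge_rows(adat_rows, ext_rows, "SampleId", "SampleId",
                          keep_all_adat=False)
print(len(keep_all), len(intersection))
```

With these inputs, keeping all ADAT rows preserves the control and the unmatched clinical sample with an empty Age, while the intersection keeps only S1.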

3.4 Merge

Once all selections are made, initiate the merge by pressing the Merge button. If an error occurs, a message is displayed in the message box at the bottom of the panel; otherwise, the message box shows the updated data dimensions. The error message is passed through unchanged from the underlying R code and may not be easy to interpret. Typically, an error results from selecting columns in the ADAT and the external file that are not compatible with each other. Careful construction of the external data file and careful selection of the matching columns are critical.
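Because most merge errors trace back to incompatible matching columns, it can help to compare the two ID columns before merging. The Python sketch below is one possible check and is not part of ProViz; the function `key_overlap` and the sample IDs are hypothetical, and stripping surrounding whitespace from IDs is an assumption about a common source of mismatches.

```python
def key_overlap(adat_rows, ext_rows, adat_key, ext_key):
    """Report which IDs match between the two selected columns."""
    for name, rows, key in (("ADAT", adat_rows, adat_key),
                            ("external", ext_rows, ext_key)):
        if rows and key not in rows[0]:
            raise KeyError(f"column {key!r} not found in {name} data")
    # Assumption: IDs are compared as text with surrounding whitespace removed.
    adat_ids = {str(row[adat_key]).strip() for row in adat_rows}
    ext_ids = {str(row[ext_key]).strip() for row in ext_rows}
    return {
        "matched": sorted(adat_ids & ext_ids),
        "adat_only": sorted(adat_ids - ext_ids),
        "external_only": sorted(ext_ids - adat_ids),
    }

# Hypothetical IDs: one external ID has a trailing space, one has no match.
adat_rows = [{"SampleId": "S1"}, {"SampleId": "S2"}, {"SampleId": "C1"}]
ext_rows = [{"SampleId": "S1 "}, {"SampleId": "S9"}]

report = key_overlap(adat_rows, ext_rows, "SampleId", "SampleId")
print(report)
```

An empty "matched" list suggests that the wrong columns were selected or that the IDs are formatted differently in the two files.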

3.5 Download ADAT

Once merging has been performed, the merged version of the ADAT can be downloaded by clicking the Download ADAT button. Because this new ADAT file may differ significantly from the original, specify a new, unique, descriptive file name so that the original ADAT is not overwritten. Keep good notes on the operations applied, such as filtering and merging, so that you can recall how the ADAT was modified when you return to it at a later date.