New R installation with your old library collection: a scripting solution

A few days ago, I installed a 'fresh' Windows and used the opportunity to migrate to the newest versions of R (4.0.3) and RStudio (1.3.1093). Like most users, I wanted to keep my library collection from the previous installation. There are many ways to achieve that: for example, you may copy your old library folder to the 'fresh' one and update the packages in R. But because the internals change a bit with each version of base R, you can be almost sure that the update will fail for quite a number of them.
 
This time I took it as a little challenge and set out to develop a simple tool which 'migrates' my libraries to the new R version. The first step was to get a list of the packages installed in the old R. Provided it is still up, you may simply retrieve a matrix with one row per installed package by calling the installed.packages() function. The output is a bit verbose; in fact, you just need the 'Package' column to proceed: it contains the package names, which are sufficient for the installation.
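
As a minimal sketch of this first step (to be run in the old R, base R only): the object name old_pckgs is my own choice, and the file name simply mirrors the one used later in this post.

```r
# run this in the OLD R installation
# the 'Package' column of the matrix holds the package names
old_pckgs <- unname(installed.packages()[, 'Package'])

# one package name per line
write(old_pckgs,
      file = 'installed_pcgs.txt')
```

Note that the list also contains the base packages shipped with R itself; they are harmless here, because the installer further below skips everything that is already present.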
 
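The case is a bit more complicated when your old R is already down but you still have the old library folder. For this case I wrote a tiny R script (important: it should work with base R, no extra packages!). It reads the path to the old library folder from the first line of the text file lib_path.txt and saves the folder names found there: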
 
# reading the path data -----

  lib_path_file <- file(description = 'lib_path.txt',
                        open = 'r')
    
  lib_path <- readLines(con = lib_path_file,
                        n = 1,
                        warn = FALSE)
 
  close(lib_path_file)
 
# obtaining the library folder names ----
 
  inst_packgs <- list.files(lib_path)
 
# saving the library list as a table on the disc ----
 
  write(inst_packgs,
        file = 'installed_pcgs.txt') 
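
One caveat of the script above: list.files() returns everything sitting in the library folder, including stray files or leftover folders which are not packages. A possible refinement, still base R only, is to keep just the sub-folders that contain a DESCRIPTION file. This is only a sketch; the function name list_lib_packages is hypothetical.

```r
# keeps only those sub-folders of a library which look like installed packages,
# i.e. which contain a DESCRIPTION file
list_lib_packages <- function(lib_path) {

  pckg_dirs <- list.dirs(lib_path,
                         full.names = TRUE,
                         recursive = FALSE)

  has_description <- file.exists(file.path(pckg_dirs, 'DESCRIPTION'))

  basename(pckg_dirs[has_description])

}

# usage, with lib_path read from 'lib_path.txt' as in the script above:
# inst_packgs <- list_lib_packages(lib_path)
```

If a working R is at hand, installed.packages(lib.loc = lib_path) should give a similar answer, since it reads the package descriptions from the given folder.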

 
Either way, I got the names of the installed packages. Next, I wanted to write another simple tool which installs the packages given their names. The tool should avoid re-installing the packages which are already present (note that some will be installed along with RStudio!), the prime reason being time. A simple check whether the package name is correct would help as well. Another issue is that, in my case, not all needed packages are provided by CRAN: a significant number of them come from Bioconductor. For this reason, I decided to use the Bioconductor installer function, BiocManager::install(), which by default checks for the packages at CRAN as well.
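
The 'skip what is already there' idea boils down to two set operations. Here is a small sketch of it in base R; the function name split_by_status and its arguments are made up for illustration and are not part of the installer presented below.

```r
# splits the requested names into those already present and those still missing
# 'installed' defaults to the packages visible to the running R session
split_by_status <- function(requested,
                            installed = rownames(installed.packages())) {

  requested <- unique(trimws(requested)) ## dropping duplicates and stray white space
  requested <- requested[nzchar(requested)] ## dropping empty lines

  list(present = intersect(requested, installed),
       missing = setdiff(requested, installed))

}

# 'stats' ships with R, the second name is deliberately bogus
split_by_status(c('stats', 'stats ', '', 'aPackageThatDoesNotExist'))
```

Only the names in the 'missing' element would need to be passed on to the installer.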
 
The installing script uses two functions: 
 
inst_message <- function(msg_text, sep = TRUE) {
    
    if(sep) {
      
      separator <- paste(rep('>', 40), collapse = '')
      
    } else {
      
      separator <- NULL
      
    }
    
    message(paste(msg_text, separator))
    
  } 

 
for printing user-friendly messages in the R console, and:
 

package_installer <- function(package_name_vec) {
    
    ## installs the packages from the list, if not already installed
    ## relies on BiocManager, which covers both Bioconductor and CRAN

    
    base_pcks <- unname(installed.packages()[, 'Package']) ## packages installed before starting the function
    curr_installed <- base_pcks ## a list of the installed packages updated during the installation process
    
    if(!'BiocManager' %in% base_pcks) {
      
      inst_message('Installing BiocManager')
      
      install.packages('BiocManager') ## installing the BiocManager, if not done before
    
    }
    
    bioc_available <- BiocManager::available() ## packages that can be installed by BiocManager
    inst_result <- data.frame(Package = package_name_vec,
                              Installed = rep(NA, length(package_name_vec))) ## a data table holding the installation results
    
    
    for(pckg_name in package_name_vec) {
      
      inst_message(paste('Installing:', pckg_name))
      
      if(!pckg_name %in% bioc_available) {
        
        inst_result[inst_result[, 'Package'] == pckg_name, 'Installed'] <- 'Not available'
        
        inst_message('Package unavailable', sep = FALSE)
        
        next
        
      } else if(pckg_name %in% curr_installed){
        
        inst_result[inst_result[, 'Package'] == pckg_name, 'Installed'] <- 'Already installed'
        
        inst_message('Package already installed', sep = FALSE)
        
        next
        
      } else {
        
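        BiocManager::install(pckg_name)
        
        curr_installed <- unname(installed.packages()[, 'Package']) ## refreshing the list: it now includes the dependencies installed along
        
        if(pckg_name %in% curr_installed) {
          
          inst_result[inst_result[, 'Package'] == pckg_name, 'Installed'] <- 'Newly installed'
          
          inst_message('Package successfully installed', sep = FALSE)
          
        } else {
          
          inst_result[inst_result[, 'Package'] == pckg_name, 'Installed'] <- 'Installation failed'
          
          inst_message('Installation failed', sep = FALSE)
          
        }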
        
      }
      
    }
    
    return(inst_result)
    
  }


The second function has one minor drawback: the dependencies installed along with a given package are listed in the result table as 'Already installed'. One may consider improving it, but for my purposes the function was more than enough. The rest of the script is straightforward; in this case, the list of old packages is provided in a text file:
 
# reading the content of the package list file -----

  pckg_lst_file <- file('installed_pcgs.txt',
                        open = 'r')

  pckg_lst <- readLines(pckg_lst_file)
 
  close(pckg_lst_file)
 
# installing the packages -----
 
  inst_results <- package_installer(pckg_lst)
 
  write.table(inst_results,
              file = 'inst_results.txt',
              sep = '\t')
 
# END ---
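
As for the drawback mentioned above (dependencies labelled as 'Already installed'), one possible way around it is to compare the list of installed packages before and after each single installation. The sketch below shows only the comparison step; the function name find_dependencies and the example snapshots are hypothetical.

```r
# compares two snapshots of the installed package names, taken before and after
# installing one requested package, and returns what came along with it
find_dependencies <- function(before, after, requested) {

  newly_added <- setdiff(after, before) ## everything that appeared in the library

  setdiff(newly_added, requested) ## minus the package asked for: its dependencies

}

# made-up snapshots for illustration
before <- c('stats', 'utils')
after <- c('stats', 'utils', 'ggplot2', 'rlang', 'scales')

find_dependencies(before, after, requested = 'ggplot2')
```

For these snapshots the function returns 'rlang' and 'scales'. Inside the installer loop, such names could then get their own label, for example 'Installed as dependency', instead of 'Already installed'.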

 
At the end, the script saves a handy table with the installation results. Have fun!



 

 

 

 




 
 

