14  Appendix: BJR Benchmarks

When restricted to one CPU thread, my R implementation of the BJR estimator is about 240 times faster than Jochmans’ original Octave/MATLAB implementation.

When processing 10,000 rings and using 10 CPU threads, my multi-threaded R implementation is 7 times faster than my R implementation in single-threaded mode.

The benchmarks were performed on an AMD Ryzen Threadripper 3970X CPU.

The Octave implementation is a modification of the original implementation included with the BJR paper. Code that was not involved in estimating the CDF, such as mean estimation code, was removed. The hardcoded limit of 4 repeated measurements was raised to 16.

14.1 Comparison of R and original Octave implementation

Generate the test data and estimate the model in an R session:

library(decoyanalysis)

set.seed(314)

generated.data <- gen.bjr.normal.data(N = 100, T = 16, K = 2,
  theta1 = c(0, 3), theta2 = c(1, 1), omega = c(0.3, 0.7))

write.table(generated.data, file = "octave/bjr-benchmark-data.csv",
  sep = ",", col.names = FALSE, row.names = FALSE)

system.time({
  generated.data.results <- bjr(generated.data,
    II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
    estimate.mean.sd = FALSE)
})

The system.time() function will print the number of seconds it took to complete the estimation. “elapsed” is the wall-clock time. On the benchmark machine, the elapsed time was 9.6 seconds.

Next, install Octave, which is an open-source implementation of the proprietary MATLAB programming language. On Linux, it is probably easiest to install the Flatpak version:

sudo flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo
sudo flatpak install flathub org.octave.Octave

Go to the octave directory of OSPEAD-docs and initiate an Octave session:

cd OSPEAD-docs/octave
flatpak run org.octave.Octave

Then import the test data generated by R and estimate the model:

y = importdata("bjr-benchmark-data.csv", ",");
K = 2;
II = 10;
tic; [OUT XI L] = bjrCDFonly16T(y, K, II); toc

Once done, Octave should print the elapsed time. On the benchmark machine, the Octave implementation took 2307.5 seconds to run.

14.2 Comparison of R single- and multi-threaded implementation

library(decoyanalysis)

threads <- 10

set.seed(314)

generated.data.large <- gen.bjr.normal.data(N = 10000, T = 16, K = 2,
  theta1 = c(0, 3), theta2 = c(1, 1), omega = c(0.3, 0.7))

future::plan(future::sequential)

system.time({
  generated.data.results <- bjr(generated.data.large,
    II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
    estimate.mean.sd = FALSE)
})

future::plan(future::multisession(workers = threads))

system.time({
  generated.data.results <- bjr(generated.data.large,
    II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
    estimate.mean.sd = FALSE)
})

future::plan(future::sequential)

This time, the number of rings is set to 10,000 instead of 100. The first printed elapsed time will be the single-threaded execution time. On the benchmark machine, this time was 434.8 seconds. The second printed time will be the execution time with 10 threads. On the benchmark machine, this time was 62.9 seconds.