library(decoyanalysis)
set.seed(314)
<- gen.bjr.normal.data(N = 100, T = 16, K = 2,
generated.data theta1 = c(0, 3), theta2 = c(1, 1), omega = c(0.3, 0.7))
write.table(generated.data, file = "octave/bjr-benchmark-data.csv",
sep = ",", col.names = FALSE, row.names = FALSE)
system.time({
<- bjr(generated.data,
generated.data.results II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
estimate.mean.sd = FALSE)
})
14 Appendix: BJR Benchmarks
When restricted to one CPU thread, my R implementation of the BJR estimator is about 240 times faster than Jochmans’ original Octave/MATLAB implementation.
When processing 10,000 rings and using 10 CPU threads, my multi-threaded R implementation is 7 times faster than my R implementation in single-threaded mode.
The benchmarks were performed on an AMD Ryzen Threadripper 3970X CPU.
The Octave implementation is a modification of the original implementation included with the BJR paper. Code that was not involved in estimating the CDF, such as mean estimation code, was removed. The hardcoded limit of 4 repeated measurements was raised to 16.
14.1 Comparison of R and original Octave implementation
Generate the test data and estimate the model in an R session:
The system.time()
function will print the number of seconds it took to complete the estimation. “elapsed” is the wall-clock time. On the benchmark machine, the elapsed time was 9.6 seconds.
Next, install Octave, which is an open-source implementation of the proprietary MATLAB programming language. On Linux, it is probably easiest to install the Flatpak version:
sudo flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo
sudo flatpak install flathub org.octave.Octave
Go to the octave
directory of OSPEAD-docs and initiate an Octave session:
cd OSPEAD-docs/octave
flatpak run org.octave.Octave
Then import the test data generated by R and estimate the model:
y = importdata("bjr-benchmark-data.csv", ",");
K = 2;
II = 10;
tic; [OUT XI L] = bjrCDFonly16T(y, K, II); toc
Once done, Octave should print the elapsed time. On the benchmark machine, the Octave implementation took 2307.5 seconds to run.
14.2 Comparison of R single- and multi-threaded implementation
library(decoyanalysis)
<- 10
threads
set.seed(314)
<- gen.bjr.normal.data(N = 10000, T = 16, K = 2,
generated.data.large theta1 = c(0, 3), theta2 = c(1, 1), omega = c(0.3, 0.7))
::plan(future::sequential)
future
system.time({
<- bjr(generated.data.large,
generated.data.results II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
estimate.mean.sd = FALSE)
})
::plan(future::multisession(workers = threads))
future
system.time({
<- bjr(generated.data.large,
generated.data.results II = 10, K = 2, cdf.points = seq(-4, 7, by = 0.25),
estimate.mean.sd = FALSE)
})
::plan(future::sequential) future
This time, the number of rings is set to 10,000 instead of 100. The first printed elapsed time will be the single-threaded execution time. On the benchmark machine, this time was 434.8 seconds. The second printed time will be the execution time with 10 threads. On the benchmark machine, this time was 62.9 seconds.