12 Parametric Fit
Having estimated the nonparametric PMF, we now need to fit a parametric distribution.
First, the weekly PMFs must be aggregated. The value of the weekly PMFs at each support point can simply be averaged. Some weeks are excluded because of data availability problems or exogenous shocks:
- No txpool data for weeks 2023-13 and 2023-14
- Volatility caused by Binance Monitoring Tag during weeks 2023-51, 2023-52, 2024-01, and 2024-02
- Binance delisting during week 2024-06
- Suspected black marble spam during weeks 2024-09, 2024-10, 2024-11, 2024-12, 2024-13, 2024-14, 2024-15, and 2024-16
- Failure to estimate for unknown reasons for week 2024-19
As stated in Section 11.2, the mixing proportion of the second-largest component distribution was estimated to be much larger during some of the exogenous events, suggesting inappropriate splitting of the wallet2 ring distribution. In theory, the first and second estimated components could be re-combined by computing their weighted sum, with the estimated mixing proportions as their weights. The simpler solution is to exclude these weeks as anomalous, which is what I have done.
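A minimal sketch of the aggregation step, with hypothetical names and toy values (the code in Section 12.4 instead sums the weekly PMFs and renormalizes, which is equivalent):

# Hypothetical example: average weekly PMFs on a common support,
# excluding anomalous weeks
pmf.matrix <- cbind(
  "2024-20" = c(0.5, 0.3, 0.2),
  "2024-21" = c(0.6, 0.3, 0.1),
  "2024-09" = c(0.1, 0.1, 0.8)) # anomalous week
excluded <- "2024-09"
aggregate.pmf <- rowMeans(pmf.matrix[, ! colnames(pmf.matrix) %in% excluded, drop = FALSE])
stopifnot(all.equal(1, sum(aggregate.pmf)))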
12.1 Objective function
Usually, fitting a parametric distribution to nonparametric data involves minimization or maximization of an objective function by adjusting the parametric distribution's parameters until the best fit is achieved. In OSPEAD, the new decoy parametric probability distribution will be chosen by minimizing the probability that the Maximum A Posteriori (MAP) Decoder Attack described by Aeeneh et al. (2021) is successful. Let $N$ be the total number of outputs eligible to be spent in a RingCT ring. Let $f_S$ be the real spend probability mass function (PMF). Let $f_D$ be a proposed decoy PMF. Let $M$ be the number of decoys in each ring ($M = 15$ in the code below). Let $\mathbb{1}\{\cdot\}$ be the indicator function that takes value 1 when the statement within the braces is true and 0 otherwise. The average success probability of the MAP Decoder attack is

$$\Pr(\text{success}) = \sum_{i=1}^{N} f_S(i) \left( \sum_{j=1}^{N} f_D(j)\, \mathbb{1}\!\left\{ \frac{f_S(j)}{f_D(j)} < \frac{f_S(i)}{f_D(i)} \right\} \right)^{M} \tag{12.1}$$
The $f_S(i)$ term weights the attack success probability by the probability mass at the $i$th output. Very old outputs are rarely spent, so they are given low weight. Equation 12.1 can be modified to give more weight to old outputs. The weight can be changed into a convex combination of $f_S(i)$ and $1/N$, which gives each output an equal weight, parameterized by $\lambda$: $w_i = \lambda f_S(i) + (1 - \lambda)\frac{1}{N}$. The more general objective function is

$$L_{\mathrm{Attack}}(\lambda) = \sum_{i=1}^{N} w_i \left( \sum_{j=1}^{N} f_D(j)\, \mathbb{1}\!\left\{ \frac{f_S(j)}{f_D(j)} < \frac{f_S(i)}{f_D(i)} \right\} \right)^{M} \tag{12.2}$$

We shall use $\lambda = 1$ and $\lambda = 0.5$.
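As a minimal illustration, Equation 12.2 can be transcribed directly into R for a toy-sized $N$. This naive version is only a sketch for checking understanding; Section 12.3.1 explains why it is infeasible at the full $N$:

# Naive O(N^2) transcription of Equation 12.2 (illustration only)
# f_S: real spend PMF; f_D: candidate decoy PMF; M: decoys per ring
naive.objective <- function(f_S, f_D, M = 15, lambda = 1) {
  N <- length(f_S)
  ratio <- f_S/f_D
  # For each output i, the probability that a single decoy drawn from
  # f_D has a strictly smaller f_S/f_D ratio than output i
  per.output <- vapply(seq_len(N), function(i) sum(f_D[ratio < ratio[i]]), numeric(1))
  weight <- lambda * f_S + (1 - lambda)/N
  sum(weight * per.output^M)
}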
12.2 Parametric distributions
The following parametric distributions will be fit:
- Gamma
- Non-central F
- Right-Pareto Lognormal (RPLN) (Reed and Jorgensen 2004)
- Generalized Extreme Value (GEV)
- Birnbaum-Saunders (BS)
- Generalized Beta of the Second Kind (GB2)
Each distribution will be fitted to the raw nonparametric distribution and to a log transformation of it, as illustrated below.
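The support is discrete (the age of each output), so each continuous candidate CDF is converted to a PMF by differencing, and the log-transformed variant evaluates the CDF at log(1 + x) instead of x. A minimal sketch with the gamma CDF and arbitrary illustrative parameters (the scaling and renormalization performed by get.decoy.pmf() in Section 12.4 are omitted):

# Discretize a continuous CDF into a PMF on support 1..z (sketch)
z <- 1000
pmf.raw <- diff(stats::pgamma(0:z, shape = 2, rate = 0.01))        # raw fit
pmf.log <- diff(stats::pgamma(log1p(0:z), shape = 2, rate = 0.25)) # log-transformed fit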
12.3 Implementation details
12.3.1 Computational expense
Notice the double summation over $N$ in Equation 12.1. $N$ is over 100 million, so a naive implementation of Equation 12.1 would require over 10 quadrillion comparisons. Furthermore, Equation 12.1 needs to be evaluated several hundred times during the numerical optimization algorithm. The computational issues were handled by:
- Developing a fast implementation of Equation 12.1 that leverages sorting logic. The implementation is in the code section as map.decoder.success.prob() below (a correctness check against the naive double loop is sketched after this list).
- Selecting a sample of only 10 percent of all outputs instead of the entire population. The probability of the $i$th output being selected is proportional to its probability mass $f_S(i)$. About 96 percent of the total probability mass is included because high-probability outputs are more likely to be selected.
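One way to gain confidence in the sorting-based implementation is to compare it with the naive double loop on a small random instance. This sketch assumes map.decoder.success.prob() from Section 12.4 has already been defined and that all f_S/f_D ratios are distinct:

# Check the fast implementation against a naive O(N^2) scan (sketch)
set.seed(42)
f_S.test <- prop.table(runif(100))
f_D.test <- prop.table(runif(100))
ratio <- f_S.test/f_D.test
naive <- vapply(seq_along(ratio), function(i) sum(f_D.test[ratio < ratio[i]]), numeric(1))
fast <- map.decoder.success.prob(f_S = f_S.test, f_D = f_D.test)
stopifnot(isTRUE(all.equal(naive, fast)))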
12.3.2 Excess tail penalty
During fitting, the probability mass that falls beyond the oldest RingCT output is added back to the rest of the probability distribution. This procedure reflects the fact that a wallet would re-draw any random output that is older than the oldest RingCT output. During fitting, I sometimes observed excessive "flattening" of the probability distribution, where the portion of the distribution extending past the oldest output exceeded 70 percent. The optimizer needed to be discouraged from flattening the distribution.

Therefore, I added an excess tail penalty: when more than 10 percent of the distribution's mass extends past the oldest output, the excess above 10 percent is added to the optimizer's objective function, as sketched below.
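The penalty logic, as it appears inside L_Attack() in Section 12.4, amounts to the following sketch:

# Excess tail penalty: zero until more than 10 percent of the fitted
# distribution's mass lies beyond the oldest RingCT output
excess.tail.penalty <- function(tail.mass, threshold = 0.10) {
  max(0, tail.mass - threshold)
}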
12.3.3 Parameter initial values
I use the Nelder-Mead algorithm to minimize the objective functions. Nelder-Mead, like most optimization algorithms, needs initial parameter values to start the optimization procedure. The starting values are obtained by drawing 10,000 observations from the real spend distribution and computing maximum likelihood estimates of each distribution's parameters. Of course, the maximum likelihood methods themselves need starting values; see the documentation of the maximum likelihood R functions in the code below for details. There was no automatic method to select starting values for the non-central F distribution, so I manually specified reasonable starting values. The 10,000 observations are drawn by inverse-transform sampling, as sketched below.
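In inverse-transform sampling, a step function serves as the discrete quantile function of the real spend distribution, and uniform draws are passed through it. A self-contained sketch with a hypothetical three-point PMF (the code in Section 12.4 builds the analogous gen.data() step function on the sampled sub-support):

# Inverse-transform sampling from a toy discrete PMF (hypothetical values)
pmf <- c(0.2, 0.5, 0.3)                      # PMF on support 1, 2, 3
quantile.fn <- stepfun(cumsum(pmf)[-3], 1:3) # discrete quantile function
set.seed(314)
draws <- quantile.fn(runif(10000))
prop.table(table(draws))                     # approximately 0.2, 0.5, 0.3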
12.4 Code
This code can be run in a new R session. It may take a day or two to run on a powerful machine.
library(data.table)

results.dir <- "results"

excl.weeks <- c("2023-13.qs", "2023-14.qs", "2023-51.qs", "2023-52.qs", "2024-01.qs", "2024-02.qs",
  "2024-06.qs", "2024-09.qs", "2024-10.qs", "2024-11.qs", "2024-12.qs", "2024-13.qs", "2024-14.qs",
  "2024-15.qs", "2024-16.qs", "2024-17.qs", "2024-19.qs")

lambda.params <- c(1, 0.5)

results.dir.run <- paste0(results.dir, "/results-01/")

nonparametric.real.spend <- qs::qread(file = paste0(results.dir.run, "nonparametric-real-spend.qs"))
nonparametric.real.spend <- nonparametric.real.spend$rucknium
weekly.real.spend.cdfs <- nonparametric.real.spend$weekly.real.spend.cdfs
all.weeks.weighted.v.mean <- nonparametric.real.spend$all.weeks.weighted.v.mean
weighted.v.mean <- mean(all.weeks.weighted.v.mean)
support.max <- nonparametric.real.spend$support.max

# Drop the anomalous weeks listed above
all.weeks.weighted.v.mean <- all.weeks.weighted.v.mean[! names(all.weeks.weighted.v.mean) %in% excl.weeks]

# Convert each weekly CDF to a PMF by differencing, then sum and normalize
# (equivalent to averaging the weekly PMFs at each support point)
aggregate.real.spend.pmf <- diff(weekly.real.spend.cdfs[[1]](as.numeric(0:(support.max + 1))))

for (week in setdiff(names(weekly.real.spend.cdfs)[-1], excl.weeks)) {
  cat(week, "\n")
  aggregate.real.spend.pmf <- aggregate.real.spend.pmf +
    diff(weekly.real.spend.cdfs[[week]](as.numeric(0:(support.max + 1))))
}

aggregate.real.spend.pmf <- aggregate.real.spend.pmf/sum(aggregate.real.spend.pmf)

stopifnot(all.equal(1, sum(aggregate.real.spend.pmf)))
get.decoy.pmf <- function(cdf, v, z, sub.supp, log.trans = FALSE, ...) {
  G <- function(x) {
    cdf(x, ...)
  }
  if (log.trans) {
    G_star <- function(x) {
      G(log1p(x*v))/G(log1p(z*v))
    }
  } else {
    G_star <- function(x) {
      G(x*v)/G(z*v)
    }
  }
  G_star(sub.supp) - G_star(sub.supp - 1)
}

param.trans <- list()

f_D.lgamma <- function(param, v, z, sub.supp, get.decoy.pmf) {
  list(decoy.pmf = get.decoy.pmf(actuar::plgamma, v, z, sub.supp = sub.supp, shapelog = exp(param[1]), ratelog = exp(param[2])),
    tail.beyond.support = 1 - actuar::plgamma(z*v, shapelog = exp(param[1]), ratelog = exp(param[2])))
}

param.trans$lgamma <- c(exp, exp)

f_D.gamma <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans) {
  list(decoy.pmf = get.decoy.pmf(stats::pgamma, v, z, sub.supp = sub.supp, log.trans = log.trans, shape = exp(param[1]), rate = exp(param[2])),
    tail.beyond.support = 1 - stats::pgamma(ifelse(log.trans, log1p(z*v), z*v), shape = exp(param[1]), rate = exp(param[2])))
}

param.trans$gamma <- c(exp, exp)

f_D.f <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans) {
  list(decoy.pmf = get.decoy.pmf(cdf = stats::pf, v, z, sub.supp, log.trans = log.trans, df1 = exp(param[1]), df2 = exp(param[2]), ncp = exp(param[3])),
    tail.beyond.support = 1 - stats::pf(ifelse(log.trans, log1p(z*v), z*v), df1 = exp(param[1]), df2 = exp(param[2]), ncp = exp(param[3])))
}

param.trans$f <- c(exp, exp, exp)

f_D.rpln <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans, return.log = FALSE, ...) {
  list(decoy.pmf = get.decoy.pmf(distributionsrd::prightparetolognormal, v, z, sub.supp, log.trans = log.trans, shape2 = exp(param[1]),
    meanlog = param[2], sdlog = exp(param[3]), lower.tail = TRUE, log.p = FALSE),
    tail.beyond.support = 1 - distributionsrd::prightparetolognormal(ifelse(log.trans, log1p(z*v), z*v), shape2 = exp(param[1]), meanlog = param[2], sdlog = exp(param[3]), lower.tail = TRUE, log.p = FALSE))
}

param.trans$rpln <- c(exp, I, exp)

f_D.gev <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans) {
  list(decoy.pmf = get.decoy.pmf(VGAM::pgev, v, z, sub.supp = sub.supp, log.trans = log.trans, location = param[1], scale = exp(param[2]), shape = param[3]),
    tail.beyond.support = 1 - VGAM::pgev(ifelse(log.trans, log1p(z*v), z*v), location = param[1], scale = exp(param[2]), shape = param[3]))
}

param.trans$gev <- c(I, exp, I)

f_D.bs <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans) {
  list(decoy.pmf = get.decoy.pmf(cdf = bsgof::pbs, v, z, sub.supp, log.trans = log.trans, alpha = exp(param[1]), beta = exp(param[2])),
    tail.beyond.support = 1 - bsgof::pbs(ifelse(log.trans, log1p(z*v), z*v), alpha = exp(param[1]), beta = exp(param[2])))
}

param.trans$bs <- c(exp, exp)

f_D.gb2 <- function(param, v, z, sub.supp, get.decoy.pmf, log.trans) {
  list(decoy.pmf = get.decoy.pmf(cdf = GB2::pgb2, v, z, sub.supp, log.trans = log.trans, shape1 = exp(param[1]), scale = exp(param[2]), shape2 = exp(param[3]), shape3 = exp(param[4])),
    tail.beyond.support = 1 - GB2::pgb2(ifelse(log.trans, log1p(z*v), z*v), shape1 = exp(param[1]), scale = exp(param[2]), shape2 = exp(param[3]), shape3 = exp(param[4])))
}

param.trans$gb2 <- c(exp, exp, exp, exp)
map.decoder.success.prob <- function(f_S, f_D) {
  # For each output i, compute the probability that a single decoy drawn
  # from f_D has a strictly smaller f_S/f_D likelihood ratio than output i,
  # i.e. that the MAP decoder prefers output i over that decoy.
  # This implementation assumes that every output selected in a ring
  # is unique (i.e. does not handle multi-output aggregates like
  # pseudo-blocks). The inequality comparison is strict.
  cut.vector <- f_S/f_D
  rm(f_S)
  names(cut.vector) <- as.character(1:length(cut.vector))
  y <- data.table(f_D = f_D, cut.vector.var = cut.vector)
  setorder(y, cut.vector.var)
  cut.vector <- sort(cut.vector)
  cut.vector.unique <- cut.vector[!duplicated(cut.vector)]
  # Using duplicated() keeps names. unique() does not keep names.
  cut.vector.name.unique <- as.integer(names(cut.vector.unique))
  y[, cut.vector.cut := cut.vector.name.unique[ .bincode(cut.vector.var, c(-1, cut.vector.unique), right = FALSE)] ]
  setorder(y, cut.vector.var)
  y[, cut.vector.var := NULL]
  y[, f_D := cumsum(f_D)]
  rm(f_D)
  y <- y[, .(success.prob = f_D[.N]), by = cut.vector.cut]
  y <- merge(y, data.table(cut.vector.cut = cut.vector.name.unique, cut.vector = cut.vector.unique))
  rm(cut.vector.name.unique)
  y[, cut.vector.cut := NULL]
  y <- merge(y, data.table(cut.vector.cut.names = as.integer(names(cut.vector)), cut.vector = unname(cut.vector)),
    all = TRUE, by = "cut.vector")
  y[, cut.vector := NULL]
  setorder(y, cut.vector.cut.names)
  y[, cut.vector.cut.names := NULL]
  setDF(y)
  y$success.prob[is.na(y$success.prob)] <- 0
  # At the point(s) where f_S/f_D is at a minimum, the attack would always
  # choose another block height, so the attack success probability is zero
  y <- y$success.prob
  # gc()
  y
}
L_Attack <- function(param, f_D, v, z, sub.supp, theta_i, get.decoy.pmf, map.decoder.success.prob, lambda = 1, log.trans) {

  f_D.return <- f_D(param, v, z, sub.supp, get.decoy.pmf, log.trans)

  a_i <- f_D.return$decoy.pmf
  tail.penalty <- f_D.return$tail.beyond.support
  rm(f_D.return)
  if (any(! is.finite(a_i))) { return(1) }
  a_i[a_i <= 0] <- .Machine$double.eps

  values.map.decoder.success.prob <- map.decoder.success.prob(f_S = theta_i/sum(theta_i), f_D = a_i/sum(a_i))

  n.decoys <- 15

  cat(round(tail.penalty, 5), "<>")
  # Excess tail penalty: only the mass beyond the oldest output in
  # excess of 10 percent is penalized (Section 12.3.2)
  tail.penalty <- tail.penalty - 0.10 # 0.05
  if (tail.penalty <= 0) {tail.penalty <- 0}

  stopifnot(length(values.map.decoder.success.prob) == length(theta_i))

  # Convex combination of real-spend weighting and equal weighting (Equation 12.2)
  weight <- lambda * (theta_i/sum(theta_i)) + (1 - lambda) * 1/length(theta_i)

  sum(weight * (values.map.decoder.success.prob)^n.decoys) + tail.penalty
}
theta_i <- aggregate.real.spend.pmf
theta_i[theta_i == 0] <- .Machine$double.eps

z <- length(theta_i)
v <- mean(all.weeks.weighted.v.mean)

set.seed(314)

sub.supp <- wrswoR::sample_int_expj(length(theta_i), ceiling(length(theta_i)/10), prob = theta_i)
# wrswoR::sample_int_expj is much faster than sample() when the prob vector is long
# start at 2 so G(x - 1) is not 0
sub.supp <- sort(sub.supp)
theta_i <- theta_i[sub.supp]

# Inverse-transform sampler for the real spend distribution, used to
# generate data for maximum likelihood starting values (Section 12.3.3)
gen.data <- stepfun( cumsum(theta_i/sum(theta_i)), c(0, as.numeric(sub.supp)) )

mle.generated.data <- gen.data(runif(10000))
mle.generated.data <- mle.generated.data[mle.generated.data > 0]

gamma.fit <- Rfast::gammamle(mle.generated.data)
gamma.fit.coef <- log(gamma.fit$param)

gamma.fit <- Rfast::gammamle(log1p(mle.generated.data))
gamma.fit.coef.log <- log(gamma.fit$param)

f.fit <- fitdistrplus::fitdist(mle.generated.data, "f", method = "mle", start = list(df1 = exp(-1), df2 = exp(0), ncp = exp(2)))
f.fit.coef <- log(coef(f.fit))

f.fit <- fitdistrplus::fitdist(log1p(mle.generated.data), "f", method = "mle", start = list(df1 = exp(-1), df2 = exp(0), ncp = exp(2)))
f.fit.coef.log <- log(coef(f.fit))

rpln.fit <- distributionsrd::rightparetolognormal.mle(mle.generated.data)
rpln.fit.coef <- c(log(rpln.fit$coefficients[["shape2"]]), rpln.fit$coefficients[["meanlog"]],
  log(rpln.fit$coefficients[["sdlog"]]) )

rpln.fit <- distributionsrd::rightparetolognormal.mle(log1p(mle.generated.data))
rpln.fit.coef.log <- c(log(rpln.fit$coefficients[["shape2"]]), rpln.fit$coefficients[["meanlog"]],
  log(rpln.fit$coefficients[["sdlog"]]) )

gev.fit <- VGAM::vglm(mle.generated.data ~ 1, VGAM::gevff, control = VGAM::vglm.control(maxit = 1000, trace = TRUE))
gev.fit.coef <- VGAM::Coef(gev.fit)
gev.fit.coef[2] <- log(gev.fit.coef[2])

gev.fit <- VGAM::vglm(log1p(mle.generated.data) ~ 1, VGAM::gevff, control = VGAM::vglm.control(maxit = 1000, trace = TRUE))
gev.fit.coef.log <- VGAM::Coef(gev.fit)
gev.fit.coef.log[2] <- log(gev.fit.coef.log[2])

bs.fit <- bsgof::bs.mle(mle.generated.data)
bs.fit.coef <- log(c(bs.fit$alpha, bs.fit$beta))

bs.fit <- bsgof::bs.mle(log1p(mle.generated.data))
bs.fit.coef.log <- log(c(bs.fit$alpha, bs.fit$beta))
gb2.fit <- GB2::mlfit.gb2(mle.generated.data)
gb2.fit.coef <- log(gb2.fit[[2]]$par)

gb2.fit <- GB2::mlfit.gb2(log1p(mle.generated.data))
gb2.fit.coef.log <- log(gb2.fit[[2]]$par)
run.iters.simple <- expand.grid(
  f_D = c(
    f_D.gamma = f_D.gamma,
    f_D.f = f_D.f,
    f_D.rpln = f_D.rpln,
    f_D.gev = f_D.gev,
    f_D.bs = f_D.bs,
    f_D.gb2 = f_D.gb2
  ),
  flavor = 1,
  week = 0,
  est.method = 0,
  L = c(L_Attack = L_Attack),
  lambda = lambda.params,
  log.trans = c(FALSE, TRUE)
)

start.params <- list(
  f_D.gamma = list(no.log = gamma.fit.coef, log = gamma.fit.coef.log),
  f_D.f = list(no.log = f.fit.coef, log = f.fit.coef.log),
  f_D.rpln = list(no.log = rpln.fit.coef, log = rpln.fit.coef.log),
  f_D.gev = list(no.log = gev.fit.coef, log = gev.fit.coef.log),
  f_D.bs = list(no.log = bs.fit.coef, log = bs.fit.coef.log),
  f_D.gb2 = list(no.log = gb2.fit.coef, log = gb2.fit.coef.log))

threads <- floor(nrow(run.iters.simple)/2)
future::plan(future::multisession, workers = threads)
keep.time <- base::date()

fit.results <- future.apply::future_lapply(1:nrow(run.iters.simple), function(iter) {

  f_D.fun <- run.iters.simple$f_D[[iter]]
  L.fun <- run.iters.simple$L[[iter]]
  flavor <- run.iters.simple$flavor[[iter]]
  lambda <- run.iters.simple$lambda[[iter]]
  log.trans <- run.iters.simple$log.trans[[iter]]

  if (log.trans) {
    start.params.optim <- start.params[[names(run.iters.simple$f_D[iter])]]$log
  } else {
    start.params.optim <- start.params[[names(run.iters.simple$f_D[iter])]]$no.log
  }

  optim.results <- optim(
    start.params.optim,
    L.fun,
    f_D = f_D.fun,
    v = v,
    z = z,
    sub.supp = sub.supp,
    get.decoy.pmf = get.decoy.pmf,
    map.decoder.success.prob = map.decoder.success.prob,
    log.trans = log.trans,
    lambda = lambda,
    method = "Nelder-Mead",
    control = list(trace = 6, maxit = 10000),
    theta_i = theta_i)

  optim.results

}, future.globals = c("run.iters.simple", "start.params", "v", "z", "sub.supp", "get.decoy.pmf",
  "map.decoder.success.prob", "theta_i"),
  future.packages = "data.table", future.seed = TRUE)

print(keep.time)
base::date()