05:35:25 <Rucknium[m]> I find a lot of the statistical work in Moser et al. (2018) to be extremely hand-wavy. What I read is "we plotted the distributions and they looked similar to the naked eye, so they are probably pretty similar" 🧐
12:25:06 <coinstudent2048[> Rucknium: I have very poor understanding of stats right now, hence I have stupid questions: Is there process/es of 'fitting' 1) discrete and 2) continuous prob distribution to let's say a histogram? I just need topic name and some resources to study. I kind of need it not cryptocurrency related. Thanks!
14:03:16 <Rucknium[m]> coinstudent2048: Yes. This is called estimation of a probability density function (PDF) if the values are continuous and probability mass function (PMF) if the values are discrete. The most widely-used way to do this is through Maximum Likelihood Estimation (MLE). You may run into MLE as applied to "regression", but that is different. The idea is to guess the type of probability distribution, e.g. normal, uniform, exponential,
14:03:16 <Rucknium[m]> gamma, Poisson, etc., and then determine the parameters for that type of distribution that would best fit the data. So for normal distribution, MLE would find the best mean and variance of a normal distribution that best fits the data. The advantage of MLE is that if the distribution of your data is truly the chosen family, e.g. if you choose to use a Poisson distribution for MLE and the actual underlying data is actually
14:03:16 <Rucknium[m]> Poisson, then your estimate will have minimum variance, i.e. your approximation error will typically reach the Cramér-Rao lower bound. The bad news is if you choose incorrectly, e.g. choose normal distribution when the data is actually uniform, then your MLE estimates could be very inaccurate. You won't know you've chosen wrongly unless you do further tests. MLE is in a class of estimators known as parametric estimators since
14:03:16 <Rucknium[m]> at the end of the estimation procedure you get a small set of parameters, i.e. numbers, to look at and work with. For a normal distribution, again, the parameters would be mean and variance....
14:10:22 <Rucknium[m]> There is another class of PDF estimators called nonparametric estimators. Kernel density estimation is one of the most popular nonparametric estimators for PDFs. It is sort of like a more sophisticated histogram, but rather than having discrete bins as in a histogram, instead you have a rolling smoothing function. The advantage of kernel density estimation is that you don't have to guess a particular probability family, so you
14:10:22 <Rucknium[m]> cannot accidentally guess wrong and be way off the mark as in MLE. Some disadvantages: Since it is nonparametric, you do not get a few numbers to work with at the end of the estimation. Basically, you get a graph to look at. This makes the result of the estimation not very interpretable at times. Also, you have a choice of k, which is basically the bin width. Most software has sensible defaults. However, changing k has a
14:10:22 <Rucknium[m]> bias-variance tradeoff. Finally, to get good estimates for kernel density estimation, you need a lot more observations than for MLE.
14:10:40 <Rucknium[m]> Does that get you started?
14:11:07 <Rucknium[m]> By observations I mean sample size
15:15:53 <coinstudent2048[> Yes! I already heard of some of those keywords, and there are some that are new to me. I copied your answers in a text file. Many thanks!
15:19:37 <merope> Rucknium: if I may pick your brain as well: is there a variant/analogue of Principal Component Analysis that works on binary data? Basically I'm looking at block nonces, and I'm looking for any patterns/correlations between each individual bit
15:22:23 <Rucknium[m]> endor00: PCA is almost never used in economics, so I am not familiar with a PCA method that works on binary data. There could be another class of techniques that I am familiar with that could accomplish what you are trying to do.
15:25:46 <Rucknium[m]> Maybe any of the various clustering techniques. Or just calculate a correlation matrix to start?
15:33:49 <nioc> fyi, I cannot see any comment from endor00 here on IRC, may need to be voiced 
15:39:51 <coinstudent2048[> endor00 doesn't see Rucknium's answers yet...
15:43:16 <nioc> yes matrix person :)
15:44:02 <nioc> here on IRC endor00 does not exist 
15:44:21 <nioc> I assume that he would need to be voiced
15:52:14 <selsta> nioc: no more voice required here
15:52:21 <nioc> thx
15:53:31 <nioc> haven't seen endor for weeks
15:53:49 <selsta> weird
15:57:52 <merope> Maybe it's because I'm merope on irc
15:58:29 <merope> Not sure how it shows up with the different nickname
15:58:56 <nioc> oh, I should have looked closer
15:58:58 <merope> <Rucknium[m]> "Maybe any of the various cluster" <- I'll look into it, thanks!
15:58:59 <nioc> my bad
15:59:46 <nioc> merope> @endor00:matrix.org 
15:59:51 <nioc> ^^ what I see 
16:00:51 * merope uploaded an image: (24KiB) < https://libera.ems.host/_matrix/media/r0/download/matrix.org/LUYkGgWUUhWbpCQnYHvlVnqY/Screenshot_20210807-180011_Element.jpg >
16:01:34 <Rucknium[m]> endor00: I would be happy to take a look at results to aid interpretation.
16:02:26 <merope> Sorry for the confusion - I registered the irc account before I knew about the possibility to link them, so I ended up with two different usernames
16:03:53 <merope> Rucknium I'm out atm, I'll send some pics here (or in your DM if it's too much spam here) when I'm in front of the computer again
16:05:19 <merope> Oh, actually - a slightly outdated pic https://imgur.com/a/ZzveBeY
16:06:28 <merope> I wrote some c++ to (attempt to) do some apriori analysis and try to extract some nonce patterns
16:07:15 <merope> I'm getting some results, but I'm kinda struggling with their interpretation (or wether they're even correct and meaningful at all)
16:10:25 <Rucknium[m]> So the 20th-24th bits are much more likely to be zero?
16:11:29 <merope> So it would seem
16:12:05 <merope> But the question is: are they zero because there's an actual bias in the algo, or just because nobody ever picks them?
16:12:31 <selsta> sech1: ^^
16:12:42 <merope> But figuring out the causality is quite the complicated mess
16:13:24 <merope> There are a whole bunch of things that I wanna try throwing at the data to see what sticks, it's just a matter of finding the time to learn how they work
16:13:57 <Rucknium[m]> Wouldn't naive mining software just count the nonce upward, incrementally +1, so that it is less likely that the software will ever need to reach those later bits to increment upward, since they will have already found a block by then?
16:14:33 <merope> That's the basic assumption
16:14:44 <merope> But then there are bits 25-32
16:15:09 <merope> Which are typically due to mining proxies and the way they split the work among their clients
16:16:01 <merope> But again it brings the question of intrinsic biases and causality
16:18:45 <Rucknium[m]> I guess one use of the data could be to try to sniff out an unknown miners who are using custom nonce-choosing algorithms.
16:19:16 <merope> Also, there's the weird dip around bits 11-15 which I have no idea how to explain, and does not fit the model of naively incrementing from 0
16:20:23 <merope> Yeah, that would be one application
16:30:48 <merope> What I would really like to have is a metric ton of data about the valid shares submitted to the pools (block template, target diff, nonce, resulting hash)
16:32:17 <merope> Because atm I can only look at the winning block nonces on the main chain, but it would be nice to use the target diff as a parameter and see if it has any influence
16:39:05 <Rucknium[m]> What is the ultimate research question? Or is it just mostly exploratory for now?
16:41:52 <merope> Partly exploratory fun, partly looking for any hidden biases in RandomX's input
16:42:57 <merope> (And using the broad research question as a starting point to learn data analysis)
16:45:05 <merope> Either I find something, which would have an actual impact on RandomX and the mining landscape, or I find nothing, and RandomX passes another test and can be considered that much "safer" - and in both cases I've learned stuff along the way
16:45:42 <merope> Plus, the same methodology could be applied to any other coin with any other mining algo
16:51:52 <sech1> merope bits 24-31 are used by Nicehash and xmrig-proxy to split work
16:52:02 <sech1> 25-32 in your notation
16:53:01 <sech1> xmrig checks numbers in increasing order until it finds a solution, so bits 21-24 are almost always 0: solution is found before it gets to such large numbers
16:54:01 <sech1> bump at bit 16 (actually bit 15 if count them properly) on your graph is because xmrig splits search space in chunks of the size 2^15 (32768).
16:54:34 <sech1> each thread increments nonce by 1 until it finds a solution, and each thread starts from a multiple of 32768
16:55:25 <sech1> from what I can see, per-bit frequencies match what xmrig should produce
16:55:53 <merope> Yep, I've tried playing with xmrig's nonce selection too (though it's hard to get any statistically significant results when all you have is 1.6 kH/s)
16:56:43 <merope> Bit 15 isn't actually a bump though, but rather it goes (almost) back to 50%
16:57:17 <merope> (Look at the overall bit frequencies in the third plot in my link)
16:57:24 <sech1> 50% means it's changed a lot, so it counts as a bump
16:57:40 <sech1> anything above 50% shouldn't happen normally
16:58:05 <merope> Shouldn't be below 50% either though - should it?
16:58:21 <sech1> it should
16:58:38 <sech1> https://en.wikipedia.org/wiki/Benford%27s_law
16:59:19 <sech1> as long as numbers run through several orders of magnitude (2^N in this case), it applies
17:00:18 <merope> Hmmm
17:00:39 <sech1> why would you expect bit 23, for example, to be at 50% if it's almost always 0?
17:00:59 <sech1> because solution is found before it gets to 2^23
17:01:31 <merope> As I mentioned above, it does explain the massive decrease in frequency of the higher bits
17:01:48 <merope> But it doesn't explain why bits 15-16 are way higher than 13-14
17:02:05 <sech1> I explained. "each thread starts from a multiple of 32768"
17:02:31 <sech1> so they're changed a lot more often on multi-core CPUs
17:02:33 <merope> Yeah, but the search is still incremental after that
17:02:50 <sech1> incremental, but bit 15 is already set in each thread
17:03:18 <sech1> 8-core CPU starts search from values 0, 32768, ..., 229376
17:03:44 <sech1> and it can find a solution randomly on each core, so bit 15 will be random
17:04:34 <merope> But that would mean that the individual threads are way to slow to "fully finish" the search in each chunk before a new template shows up
17:04:47 <sech1> when thread exhausts 32768 nonce values, it takes a new multiple of 32768
17:05:06 <merope> ...which at ~600 H/s/thread would actually take 55 seconds per chunk, so it kinda makes sense actually
17:05:08 <sech1> each runs typically runs at 500-900 h/s depending on CPU, so it has enough time to exhaust it
17:05:10 <merope> Huh
17:05:39 <sech1> plus some blocks are really slow, much slower than 2 minutes
17:06:16 <merope> Hadn't considered the effect of the chunk size on the nonce distribution, interesting
17:07:01 <merope> When I tried messing with the nonce selection in xmrig, I reduced the chunk size to 2048 to get more granular control of the nonce pattern
17:13:19 <merope> Is there an actual performance benefit to splitting the search in chunks of 2^15, vs any other number?
17:42:56 <sech1> every new chunk requires synchronization between threads, so there is a benefit