Under Review Since : 2020-08-10

Whether the goal is to estimate the number of people who live in a congressional district, to estimate the number of individuals who have died in an armed conflict, or to disambiguate individual authors using bibliographic data, all these applications have a common theme: integrating information from multiple sources. Before such questions can be answered, databases must be cleaned and integrated in a systematic and accurate way, a task commonly known as record linkage, de-duplication, or entity resolution. In this article, we review motivational applications and seminal papers that have led to the growth of this area. Specifically, we review the foundational work that began in the 1940s and 1950s and has led to modern probabilistic record linkage. We review clustering approaches to entity resolution, semi- and fully supervised methods, and canonicalization, which are being used throughout industry and academia in applications such as human rights, official statistics, medicine, and citation networks, among others. Finally, we discuss current research topics of practical importance.

Under Review Since : 2020-08-03

This paper presents a model for the consumption of a cultural good where consumers can either purchase or pirate the good (or not consume it). Because of the specificity of the cultural good, active consumers (users), both buyers and pirates, derive a network utility that depends on the number of users of the good with whom they can share their experience of it. It is shown that the monopoly firm selling the cultural good may obtain a higher profit when piracy is possible than when it is not. Consequently, it is shown that increasing the cost of piracy has a non-monotonic effect on the firm's profit and on welfare.

Under Review Since : 2020-08-12

The *maximization of entropy* S within a closed system is accepted as an inevitability (as the second law of thermodynamics) by statistical inference alone.

The *Maximum Entropy Production Principle* (MEPP) states that such a system will maximize its entropy as fast as possible.

There is still no consensus on the general validity of this MEPP, even though it shows remarkable explanatory power (both qualitatively and quantitatively), and has been empirically demonstrated for many domains.

In this theoretical paper I provide a generalization of state-spaces to show, at a fundamental level, that the MEPP follows from the same statistical inference as the second law of thermodynamics.

For this generalization I introduce the concepts of the *poly-dimensional statespace* and *microstate-density*.

These concepts also allow for the abstraction of 'Self Organizing Criticality' to a bifurcating local difference in this density.

Ultimately, the inevitability of the maximization of entropy production has significant implications for the models we use in developing and monitoring socio-economic and financial policies, explaining organic life at any scale, and in curbing the growth of our technological progress, to name a few areas.

Under Review Since : 2020-06-28

Predicting the response at an unobserved location is a fundamental problem in spatial statistics. Given the difficulty in modeling spatial dependence, especially in non-stationary cases, model-based prediction intervals are at risk of misspecification bias that can negatively affect their validity. Here we present a new approach for model-free spatial prediction based on the *conformal prediction* machinery. Our key observation is that spatial data can be treated as exactly or approximately exchangeable in a wide range of settings. For example, when the spatial locations are deterministic, we prove that the response values are, in a certain sense, locally approximately exchangeable for a broad class of spatial processes, and we develop a local spatial conformal prediction algorithm that yields valid prediction intervals without model assumptions. Numerical examples with both real and simulated data confirm that the proposed conformal prediction intervals are valid and generally more efficient than existing model-based procedures across a range of non-stationary and non-Gaussian settings.
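
As a rough illustration of the conformal prediction machinery mentioned in this abstract, the sketch below implements generic split conformal prediction with absolute-residual scores. It is not the paper's local spatial algorithm. The function name `split_conformal_interval`, the calibration residuals, and the point prediction are hypothetical and used only for illustration.

```python
import math

def split_conformal_interval(residuals, prediction, alpha=0.1):
    """Return a (1 - alpha) prediction interval around `prediction`.

    `residuals` are absolute errors |y_i - f(x_i)| from a held-out
    calibration set, assumed exchangeable with the new observation.
    """
    n = len(residuals)
    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return (-math.inf, math.inf)
    q = sorted(residuals)[k - 1]
    return (prediction - q, prediction + q)

# Hypothetical calibration residuals and point prediction.
calibration_residuals = [0.2, 0.5, 0.1, 0.9, 0.4, 0.3, 0.7, 0.6, 0.8]
interval = split_conformal_interval(calibration_residuals, prediction=10.0, alpha=0.2)
print(interval)
```

A local variant would presumably restrict or weight the calibration residuals by spatial proximity to the prediction location, but the details of that step are specific to the paper and are not reproduced here.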

Under Review Since : 2020-06-13

Comment on the proposal to rename the R.A. Fisher Lecture.

Under Review Since : 2020-06-11

Peters [2011a] defined an optimal leverage which maximizes the time-average growth rate of an investment held at constant leverage. It was hypothesized that this optimal leverage is attracted to 1, such that, e.g., leveraging an investment in the market portfolio cannot yield long-term outperformance. This places a strong constraint on the stochastic properties of prices of traded assets, which we call "leverage efficiency." Market conditions that deviate from leverage efficiency are unstable and may create leverage-driven bubbles. Here we expand on the hypothesis and its implications. These include a theory of noise that explains how systemic stability rules out smooth price changes at any pricing frequency; a resolution of the so-called equity premium puzzle; a protocol for central bank interest rate setting to avoid leverage-driven price instabilities; and a method for detecting fraudulent investment schemes by exploiting differences between the stochastic properties of their prices and those of legitimately-traded assets. To submit the hypothesis to a rigorous test we choose price data from different assets: the S&P500 index, Bitcoin, Berkshire Hathaway Inc., and Bernard L. Madoff Investment Securities LLC. Analysis of these data supports the hypothesis.
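
To make the notion of an optimal leverage concrete, the sketch below assumes the price follows geometric Brownian motion with drift `mu` and volatility `sigma`, alongside a riskless rate `r`. Under that assumption, which the abstract does not state, the time-average growth rate at constant leverage `l` is `r + l*(mu - r) - (l*sigma)**2 / 2`, and it is maximized at `(mu - r) / sigma**2`. The parameter values are hypothetical.

```python
def time_average_growth(leverage, mu, r, sigma):
    """Time-average growth rate of a constant-leverage position (GBM assumed)."""
    return r + leverage * (mu - r) - 0.5 * (leverage * sigma) ** 2

def optimal_leverage(mu, r, sigma):
    """Leverage that maximizes the time-average growth rate (GBM assumed)."""
    return (mu - r) / sigma ** 2

# Hypothetical annualized parameters, chosen so the optimum is close to 1.
mu, r, sigma = 0.08, 0.04, 0.2
l_opt = optimal_leverage(mu, r, sigma)

for leverage in (0.5, l_opt, 1.5):
    print(leverage, time_average_growth(leverage, mu, r, sigma))
```

In this toy setting, leverage above or below the optimum lowers long-run growth, which is the sense in which leveraging the market portfolio cannot yield outperformance when the optimum sits at 1.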

Published Date : 2020-06-08

This version of my PhD thesis has been produced for the open access open peer review platform researchers.one. I am interested in reviewer feedback. Please feel free to upload your reviews, (dis)agreements, typos, errors, *etc*. directly to researchers.one or email me. Compared to the original submission this version contains only minor corrections with regard to *e.g.* typos, misplaced citations and some resolved ordering issues in the bibliography.

Under Review Since : 2020-05-14

The spread of infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied for many popular models for randomly growing trees such as uniform attachment, linear preferential attachment and uniform attachment on a D-regular tree. For inference of the root under shape-exchangeability, we propose computationally scalable algorithms for constructing confidence sets with valid frequentist coverage as well as bounds on the expected size of the confidence sets. We also provide efficient sampling algorithms which extend our methods to a wide class of inference problems.

Under Review Since : 2020-05-03

In this paper we review the current personal protective equipment (PPE) recommendations for healthcare workers in the setting of the COVID-19 pandemic and analyze the framework upon which authorities currently make these recommendations. We examine multiple uncertainties within the model assumptions and conclude that precaution dictates that we should adopt a more stringent PPE policy for our healthcare workforce, even in more routine healthcare settings.

Under Review Since : 2020-05-01

Biochemical mechanisms are complex and consist of many interacting proteins, genes, and metabolites. Predicting the future states of components in biochemical processes is widely applicable to biomedical research. Here we introduce a minimal model of biochemical networks using a system of coupled linear differential equations and a corresponding numerical model. To capture biological reality, the model includes parameters for stochastic noise, constant time delay, and basal interactions from an external environment. The model is sufficient to produce key biochemical patterns including accumulation, oscillation, negative feedback, and homeostasis. Applying the model to the well-studied *lac* operon regulatory network reproduces key experimental observations under different metabolic conditions. By component subtraction, the model predicts the effect of genetic or chemical inhibition in the same *lac* regulatory network. Thus, the minimal model may lead to methods for motivating therapeutic targets and predicting the effects of experimental perturbations in biochemical networks.

Under Review Since : 2020-04-29

Behavioural economics provides labels for patterns in human economic behaviour. Probability weighting is one such label. It expresses a mismatch between probabilities used in a formal model of a decision (*i.e.* model parameters) and probabilities inferred from real people's decisions (the same parameters estimated empirically). The inferred probabilities are called "decision weights." It is considered a robust experimental finding that decision weights are higher than probabilities for rare events, and (necessarily, through normalisation) lower than probabilities for common events. Typically this is presented as a cognitive bias, *i.e.* an error of judgement by the person. Here we point out that the same observation can be described differently: broadly speaking, probability weighting means that a decision maker has greater uncertainty about the world than the observer. We offer a plausible mechanism whereby such differences in uncertainty arise naturally: when a decision maker must estimate probabilities as frequencies in a time series while the observer knows them *a priori*. This suggests an alternative presentation of probability weighting as a principled response by a decision maker to uncertainties unaccounted for in an observer's model.

Under Review Since : 2020-04-15

**In order to more effectively combat the coronavirus pandemic, the authors propose a system for daily analysis of residential wastewater at points of discharge from buildings. Results of testing should be used for the implementation of local quarantines as well as informed administration of tests for individuals.**

Published Date : 2020-04-07

A response to John Ioannidis's article *A fiasco in the making? As the coronavirus pandemic takes hold, we are making decisions without reliable data* (March 17, 2020) in which he downplays the severe risks posed by the coronavirus pandemic.

Under Review Since : 2020-04-06

Sparse PCA is one of the most popular tools for the dimensional reduction of high-dimensional data. Although many computational methods have been proposed for sparse PCA, Bayesian methods are still very few. In particular, there is a lack of fast and efficient algorithms for Bayesian sparse PCA. To fill this gap, we propose two efficient algorithms based on the expectation–maximization (EM) algorithm and the coordinate ascent variational inference (CAVI) algorithm: the double parameter expansion-EM (dPX-EM) and the PX-coordinate ascent variational inference (PX-CAVI) algorithms. By using a new spike-and-slab prior and applying the parameter expansion approach, we are able to avoid directly dealing with the orthogonal constraint between eigenvectors, thus making it easier to compute the posterior. Simulation studies showed that the PX-CAVI outperforms the dPX-EM algorithm as well as two other existing methods. The corresponding R code is available on the website https://github.com/Bo-Ning/Bayesian-sparse-PCA.

Under Review Since : 2020-04-06

Published Date : 2020-04-04

The lack of an explicit brain model has been holding back AI improvements, leading to applications that don’t model language in theory. This paper explains Patom theory (PT), a theoretical brain model, and its interaction with human language emulation.

Patom theory explains what a brain does, rather than how it does it. If brains just store, match and use patterns comprised of hierarchical bidirectional linked-sets (sets and lists of linked elements), memory becomes distributed and matched both top-down and bottom-up using a single algorithm. Linguistics shows the top-down nature because meaning, not word sounds or characters, drives language. For example, the pattern-atom (Patom) “object level” that represents the multisensory interaction of things, is uniquely stored and then associated as many times as needed with sensory memories to recognize the object accurately in each modality. This is a little like a template theory, but with multiple templates connected to a single representation and resolved by layered agreement.

In combination with Role and Reference Grammar (RRG), a linguistics framework modeling the world’s diverse languages in syntax, semantics and discourse pragmatics, many human-like language capabilities become *demonstrable*. Today’s natural language understanding (NLU) systems built on intent classification cannot deal with natural language in theory beyond simplistic sentences because the science they are built on is too simplistic. Adoption of the principles in this paper provides a theoretical way forward for NLU practitioners based on existing, tested capabilities.

Under Review Since : 2020-04-04

According to a proof in Euclidean geometry of the "Cardinality of the Continuum", that is attributed to Georg Cantor, a line has as many points as any line segment (not inclusive of the two end points). However, this proof uses parallel lines, and therefore assumes Euclid's Parallel Postulate as an axiom. But Non-Euclidean geometries have alternative axioms. In Hyperbolic geometry, at any point off of a given line, there are a plurality of lines parallel to the given line. In Elliptic geometry (which includes Spherical geometry), no lines are parallel, so two lines always intersect. In Absolute geometry, neither Euclid's parallel postulate nor its alternatives are axioms. We provide an example in Spherical geometry and an example in Hyperbolic geometry wherein the "Cardinality of the Continuum" is false. Therefore the "Cardinality of the Continuum" is also false in Absolute geometry. So the "Continuum Hypothesis" is false too, because it assumes that the "Cardinality of the Continuum" is true.

Under Review Since : 2020-04-04

Quantum Electrodynamics (QED) Renormalization is a logical paradox, and thus is mathematically invalid. It converts divergent series into finite values by use of the Euler-Mascheroni constant. The definition of this constant is a conditionally convergent series. But the Riemann Series Theorem proves that any conditionally convergent series can be rearranged to be divergent. This contradiction (a series that is both convergent and divergent) violates the Law of Non-Contradiction (LNC) in "classical" and intuitionistic logics, and thus is a paradox in these logics. This result also violates the commutative and associative properties of addition, and the one-to-two mapping from domain to range violates the definition of a function in Zermelo-Fraenkel set theory.

In addition, Zeta Function Regularization is logically and mathematically invalid. It equates two definitions of the Zeta function: the Dirichlet series definition, and Riemann's definition. For domain values in the half-plane of "analytic continuation", the two definitions contradict: the former is divergent and the latter is convergent. Equating these contradictory definitions there creates a paradox (if both are true), or is logically invalid (if one is true and the other false). We show that Riemann's definition is false, because its derivation includes a contradiction: the use of both the Hankel contour and Cauchy's integral theorem. Also, a third definition of the Zeta function is proven to be false. The Zeta function is exclusively defined by the Dirichlet series, which has no zeros (and therefore the Riemann hypothesis is a paradox).

Under Review Since : 2020-03-25

Using methods from extreme value theory, we examine the major pandemics in history, trying to understand their tail properties.

Applying the shadow distribution approach developed by the authors for violent conflicts [5], we provide rough estimates for quantities not immediately observable in the data.

Epidemics and pandemics are extremely heavy-tailed, with a potential existential risk for humanity. This property should override conclusions derived from local epidemiological models in what relates to tail events.

Under Review Since : 2020-03-25

Despite the importance of having robust estimates of the time-asymptotic total number of infections, early estimates of COVID-19 show enormous fluctuations. Using COVID-19 data for different countries, we show that predictions are extremely sensitive to the reporting protocol and crucially depend on the last available data-point, before the maximum number of daily infections is reached. We propose a physical explanation for this sensitivity, using a Susceptible-Exposed-Infected-Recovered (SEIR) model where the parameters are stochastically perturbed to simulate the difficulty in detecting asymptomatic patients, different confinement measures taken by different countries, as well as changes in the virus characteristics. Our results suggest that there are physical and statistical reasons to assign low confidence to statistical and dynamical fits, despite their apparently good statistical scores. These considerations are general and can be applied to other epidemics.
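
The sensitivity described in this abstract can be explored with a toy SEIR simulation. The sketch below is a minimal forward-Euler integration in which the transmission rate is optionally perturbed once per day by log-normal noise. The noise model, the function name `simulate_seir`, and all parameter values are assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def simulate_seir(beta, sigma, gamma, population, initial_infected,
                  days, dt=0.1, noise=0.0, seed=0):
    """Integrate an SEIR model with forward Euler steps.

    beta: transmission rate, sigma: incubation rate, gamma: recovery rate.
    noise: standard deviation of a daily log-normal perturbation of beta
    (an assumed noise model). Returns one (S, E, I, R) tuple per day.
    """
    rng = random.Random(seed)
    s = float(population - initial_infected)
    e, i, r = 0.0, float(initial_infected), 0.0
    steps_per_day = round(1 / dt)
    history = []
    for _ in range(days):
        # Redraw the transmission rate once per simulated day.
        beta_t = beta * math.exp(rng.gauss(0.0, noise)) if noise > 0 else beta
        for _ in range(steps_per_day):
            new_exposed = beta_t * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history

# Compare total infections (population minus remaining susceptibles)
# for an unperturbed run and a perturbed run with hypothetical parameters.
for noise in (0.0, 0.3):
    final_s = simulate_seir(0.5, 0.2, 0.1, 1000.0, 1.0, days=120, noise=noise)[-1][0]
    print(noise, 1000.0 - final_s)
```

Truncating `history` before the peak of the `I` compartment and fitting a curve to the truncated series is one way to see how strongly the extrapolated final size depends on the last available point.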

Under Review Since : 2020-03-21

I will try to analyze Harry Crane's article "*Naïve probabilism*" and formalise what is, for me, "*the problem of medicine*" (being blind to scale and its consequences). I will introduce Post-Normal Science, criticise Geoffrey Rose's approach, praise Marc Jamoulle's work and conclude that not all precautions (at different scales) are the same "*thing*". A short list at the end presents the argument in summary style. This is meant to start a debate about the epistemology of medicine and its (for me) lack of skin in the game and second-order thinking. Not a closure. Massive review is welcome (and necessary).

Under Review Since : 2020-03-19

Empirical distributions have their in-sample maxima as natural censoring. We look at the "hidden tail", that is, the part of the distribution in excess of the maximum for a sample size of n. Using extreme value theory, we examine the properties of the hidden tail and calculate its moments of order p.

The method is useful in showing how large a bias one can expect, for a given n, between the visible in-sample mean and the true statistical mean (or higher moments), which is considerable for α close to 1.

Among other properties, we note that the "hidden" moment of order 0, that is, the exceedance probability for power law distributions, follows an exponential distribution and has expectation 1/n regardless of the parametrization of the scale and tail index.

Under Review Since : 2020-03-13

When gambling, think probability.

When hedging, think plausibility.

When preparing, think possibility.

When this fails, stop thinking. Just survive.

Naive probabilism is the (naive) view, held by many technocrats and academics, that all rational thought boils down to probability calculations. This viewpoint is behind the obsession with 'data-driven methods' that has overtaken the hard sciences, soft sciences, pseudosciences and non-sciences. It has infiltrated politics, society and business. It's the workhorse of formal epistemology, decision theory and behavioral economics. Because it is mostly applied in low or no-stakes academic investigations and philosophical meandering, few have noticed its many flaws. Real world applications of naive probabilism, however, pose disproportionate risks which scale exponentially with the stakes, ranging from harmless (and also helpless) in many academic contexts to destructive in the most extreme events (war, pandemic). The 2019-2020 coronavirus outbreak (COVID-19) is a living example of the dire consequences of such probabilistic naiveté. As I write this on March 13, 2020, we are in the midst of a six-continent pandemic, the world economy is collapsing and our future is bound to look very different from the recent past. The major damage caused by the spread of COVID-19 is attributable to a failure to act and a refusal to acknowledge what was in plain sight. This shared negligence stems from a blind reliance on naive probabilism and the denial of basic common sense by global and local leaders, and many in the general public.

Under Review Since : 2020-03-09

This introductory chapter of *Probabilistic Foundations of Statistical Network Analysis* explains the major shortcomings of prevailing efforts in statistical analysis of networks and other kinds of complex data, and why there is a need for a new way to conceive of and understand data arising from complex systems.

Under Review Since : 2020-03-07

This paper presents an overview of Ergodicity Economics (EE) in plain English.

Ergodicity Economics (EE) applies a modern mathematical formalization to familiar financial concepts to reveal implications and consequences that were previously unseen.

EE provides a clear distinction between:

- methods of averaging (arithmetic means vs. geometric means),
- meanings of averaging (ensemble expectation vs. time average), and
- reasons for the different meanings (additive vs. multiplicative growth dynamics).

These are distinctions with a difference because the average experience of an ensemble over many trajectories may not be the average experience of an individual over a single life history. Using ensemble expectations inappropriately, i.e. for non-ergodic observables, misleads individuals because it implies a physical system of counterfactuals that cannot exist in a single life trajectory.
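
The gap between the two kinds of average can be seen in a simple multiplicative coin toss. The sketch below is an illustration under assumed numbers (wealth is multiplied by 1.5 on heads and 0.6 on tails, each with probability 1/2). The gamble and the function names are not taken from the paper.

```python
import math

def ensemble_growth_factor(up, down, p_up=0.5):
    """Expected per-round wealth multiplier (arithmetic mean over outcomes)."""
    return p_up * up + (1 - p_up) * down

def time_average_growth_factor(up, down, p_up=0.5):
    """Long-run per-round multiplier along one trajectory (geometric mean)."""
    return math.exp(p_up * math.log(up) + (1 - p_up) * math.log(down))

# Hypothetical gamble: +50% on heads, -40% on tails.
up, down = 1.5, 0.6
ensemble = ensemble_growth_factor(up, down)
time_avg = time_average_growth_factor(up, down)
print(ensemble, time_avg)
```

With these numbers the ensemble expectation grows each round while the time-average multiplier is below 1, so a typical individual playing repeatedly loses wealth even though the expected value rises.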

EE quantifies the differences and the trade-offs between the collective meaning and the individual meaning of financial methods. EE’s perspective opens up previously unseen distinctions for evidence-based recommendations. These distinctions enable the creation of previously unavailable recommendations for the explicit benefit of individual clients. This differentiating impact on economic theory, asset valuation, product development, and advisory best practices is developing rapidly.

Published Date : 2020-02-27

We study the problem of non-parametric Bayesian estimation of the intensity function of a Poisson point process. The observations are $n$ independent realisations of a Poisson point process on the interval $[0,T]$. We propose two related approaches. In both approaches we model the intensity function as piecewise constant on $N$ bins forming a partition of the interval $[0,T]$. In the first approach the coefficients of the intensity function are assigned independent gamma priors, leading to a closed form posterior distribution. On the theoretical side, we prove that as $n\rightarrow\infty$, the posterior asymptotically concentrates around the "true", data-generating intensity function at an optimal rate for $h$-Hölder regular intensity functions ($0 < h\leq 1$).
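
The closed-form posterior of the first approach follows from standard gamma-Poisson conjugacy. The sketch below assumes equal-width bins and a common Gamma(`alpha`, `beta`) prior in the shape/rate parametrization. The paper's actual hyperparameter choices are not given in the abstract, so these defaults, the function names, and the toy data are hypothetical.

```python
def posterior_gamma_params(realisations, T, n_bins, alpha=1.0, beta=1.0):
    """Posterior (shape, rate) of each bin's intensity coefficient.

    realisations: list of point patterns, each a list of event times in [0, T].
    With a Gamma(alpha, beta) prior and a Poisson count in each bin, the
    posterior is Gamma(alpha + count, beta + n * bin_width).
    """
    width = T / n_bins
    counts = [0] * n_bins
    for points in realisations:
        for t in points:
            # Events at t == T are assigned to the last bin.
            k = min(int(t / width), n_bins - 1)
            counts[k] += 1
    n = len(realisations)
    return [(alpha + c, beta + n * width) for c in counts]

def posterior_means(params):
    """Posterior mean intensity in each bin (shape divided by rate)."""
    return [shape / rate for shape, rate in params]

# Two hypothetical realisations on [0, 4], with two bins of width 2.
realisations = [[0.5, 1.2, 3.7], [0.9, 2.5]]
params = posterior_gamma_params(realisations, T=4.0, n_bins=2)
print(params, posterior_means(params))
```

Because each bin is updated independently, the cost is linear in the total number of events, which is consistent with the abstract's remark that the approach scales well with sample size.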

In the second approach we employ a gamma Markov chain prior on the coefficients of the intensity function. The posterior distribution is no longer available in closed form, but inference can be performed using a straightforward version of the Gibbs sampler. Both approaches scale well with sample size, but the second is much less sensitive to the choice of $N$.

Practical performance of our methods is first demonstrated via synthetic data examples. We compare our second method with other existing approaches on the UK coal mining disasters data. Furthermore, we apply it to the US mass shootings data and Donald Trump's Twitter data.

Under Review Since : 2020-02-24

The third moment skewness ratio Skew is a standard measure to understand and categorize distributions. However, its usual estimator based on sample second and third moments is biased very low and sensitive to outliers. Thus, we study two alternative measures, the Triples parameter of Randles, et al. (1980) and the third L-moment ratio of Hosking (1990). We show by simulation that their associated estimators have excellent small sample properties and can be rescaled to be practical replacements for the third moment estimator of Skew.
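
For readers unfamiliar with L-moments, the sketch below computes the sample third L-moment ratio from unbiased probability-weighted moments, following the standard definitions in Hosking (1990). It does not implement the rescaling proposed in the paper or the Triples statistic, and the function name `l_skewness` and the toy samples are hypothetical.

```python
def l_skewness(sample):
    """Sample third L-moment ratio (L-skewness) via probability-weighted moments."""
    x = sorted(sample)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    # Unbiased probability-weighted moments b0, b1, b2 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * xi for i, xi in enumerate(x)) / (n * (n - 1) * (n - 2))
    # Second and third sample L-moments.
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l3 / l2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 10]
print(l_skewness(symmetric), l_skewness(right_skewed))
```

The ratio lies between -1 and 1 and is linear in the order statistics, which is one reason it is less sensitive to outliers than the estimator based on sample second and third moments.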

Under Review Since : 2020-02-20

In unreplicated two-way factorial designs, it is typical to assume no interaction between two factors. However, violations of this additivity assumption have often been found in applications, and tests for non-additivity have been a recurring topic since Tukey's one-degree of freedom test (Tukey, 1949). In the context of randomized complete block designs, recent work by Franck et al. (2013) is based on an intuitive model with "hidden additivity," a type of non-additivity where unobserved groups of blocks exist such that treatment and block effects are additive within groups, but treatment effects may be different across groups. Their proposed test statistic for detecting hidden additivity is called the "all-configuration maximum interaction F-statistic" (ACMIF). The computations of the ACMIF also result in a clustering method for blocks related to the k-means procedure. When hidden additivity is detected, a new method is proposed here for confidence intervals of contrasts within groups that takes into account the error due to clustering by forming the union of standard intervals over a subset of likely configurations.

Under Review Since : 2020-02-14

An important question in economics is how people choose when facing uncertainty in the timing of rewards. In this paper we study preferences over time lotteries, in which the payment amount is certain but the payment time is uncertain. In expected discounted utility (EDU) theory decision makers must be risk-seeking over time lotteries. Here we explore growth-optimality, a normative model consistent with standard axioms of choice, in which decision makers maximise the growth rate of their wealth. Growth-optimality is consistent with both risk-seeking and risk-neutral behaviour in time lotteries, depending on how growth rates are computed. We discuss two approaches to compute a growth rate: the ensemble approach and the time approach. Revisiting existing experimental evidence on risk preferences in time lotteries, we find that the time approach accords better with the evidence than the ensemble approach. Surprisingly, in contrast to the EDU prediction, the higher the ensemble-average growth rate of a time lottery is, the less attractive it becomes compared to a sure alternative. Decision makers thus may not consider the ensemble-average growth rate as a relevant criterion for their choices. Instead, the time-average growth rate may be a better criterion for decision-making.

Under Review Since : 2020-02-14

We collected marathon performance data from a systematic sample of elite and sub-elite athletes over the period 2015 to 2019, then searched the internet for publicly-available photographs of these performances, identifying whether the Nike Vaporfly shoes were worn or not in each performance. Controlling for athlete ability and race difficulty, we estimated the effect on marathon times of wearing the Vaporfly shoes. Assuming that the effect of Vaporfly shoes is additive, we estimate that the Vaporfly shoes improve men's times by between 2.1 and 4.1 minutes, while they improve women's times by between 1.2 and 4.0 minutes. Assuming that the effect of Vaporfly shoes is multiplicative, we estimate that they improve men's times by between 1.5 and 2.9 percent and women's times by between 0.8 and 2.4 percent. The improvements are in comparison to the shoe the athlete was wearing before switching to Vaporfly shoes, and represent an expected improvement rather than a guaranteed improvement.

Under Review Since : 2020-02-14

The outbreak of a novel Coronavirus we are facing is poised to become a global pandemic if current approaches to stemming its spread prove to be insufficient. While we can't yet say what the ultimate impact of this event will be, this crisis and governments' responses to it reveal vulnerabilities and fragilities in the structure of our global socioeconomic milieux that will continue to produce cascading crises regardless of whether or not we are successful in preventing devastation from this particular pathogen. Here we discuss the implications and some strategic considerations.

Published Date : 2020-02-06

Contrary to Ole Peters' claims in a recent Nature Physics article [1], the "Copenhagen Experiment" did not falsify Expected Utility Theory (EUT) and corroborate Ergodicity Economics (EE). The dynamic version of EUT, multi-period EUT, predicts the same change in risk aversion that EE predicts when the dynamics are changed from multiplicative to additive.

Published Date : 2020-02-03

The book investigates the misapplication of conventional statistical techniques to fat tailed distributions and looks for remedies, when possible.

Switching from thin tailed to fat tailed distributions requires more than "changing the color of the dress". Traditional asymptotics deal mainly with either n=1 or n=∞, and the real world is in between, under the "laws of the medium numbers", which vary widely across specific distributions. Both the law of large numbers and the generalized central limit mechanisms operate in highly idiosyncratic ways outside the standard Gaussian or Levy-Stable basins of convergence.

A few examples:

+ The sample mean is rarely in line with the population mean, with effect on "naive empiricism", but can sometimes be estimated via parametric methods.

+ The "empirical distribution" is rarely empirical.

+ Parameter uncertainty has compounding effects on statistical metrics.

+ Dimension reduction (principal components) fails.

+ Inequality estimators (GINI or quantile contributions) are not additive and produce wrong results.

+ Many "biases" found in psychology become entirely rational under more sophisticated probability distributions.

+ Most of the failures of financial economics, econometrics, and behavioral economics can be attributed to using the wrong distributions.

This book, the first volume of the Technical Incerto, weaves a narrative around published journal articles.

Under Review Since : 2020-01-23

A fundamental problem in statistics and machine learning is that of using observed data to predict future observations. This is particularly challenging for model-based approaches because often the goal is to carry out this prediction with no or minimal model assumptions. For example, the inferential model (IM) approach is attractive because it has certain validity guarantees, but requires specification of a parametric model. Here we show that a new perspective on a recently developed generalized IM approach can be applied to construct an IM for prediction that satisfies the desirable validity guarantees without specification of a model. One important special case of this approach corresponds to the powerful conformal prediction framework and, consequently, the desirable properties of conformal prediction follow immediately from the general IM validity theory. Several numerical examples are presented to illustrate the theory and highlight the method's performance and flexibility.

Published Date : 2020-01-03

In this note, I analyze the code and the data generated by M. Fodje's (2013) simulation programs "epr-simple" and "epr-clocked". They are written in Python and were published on GitHub only, initially without any documentation at all of how they worked. Inspection of the code showed that they make use of the detection loophole and the coincidence loophole respectively. I evaluate them with appropriate modified Bell-CHSH type inequalities: the Larsson detection-loophole adjusted CHSH, and the Larsson-Gill coincidence-loophole adjusted CHSH (NB: its correctness is a conjecture; we do not have a proof). The experimental efficiencies turn out to be approximately eta = 81% (close to optimal) and gamma = 55% (far from optimal). The observed values of CHSH are, as they must be, within the appropriately adjusted bounds. Fodje's detection-loophole model turns out to be very, very close to Pearle's famous 1970 model, so the efficiency is very close to optimal. The model has the same defect as Pearle's: the joint detection rates exhibit signaling. The coincidence-loophole model is actually an elegant modification of the detection-loophole model. Because of this, however, it cannot lead to optimal efficiency. Later versions of the programs included an explanation of how they worked, including formulas, though still no reference whatever to the literature on the two loopholes which Fodje exploits, not even to the concept of an experimental (i.e., in principle, avoidable) loophole. The documentation available now does make a lot of the "reverse engineering" in this paper superfluous. I plan to rewrite it as a very, very short note. I will also use the few jewels in the work in a more ambitious paper, still to be written, about the results of the bigger research project of which these experiments were a small part.

The two authors listed by Researchers.one are both myself, in my two capacities as emeritus professor and as independent consultant. Actually, I was just attempting to add my middle name "David" or middle initial "D." to my name on my own profile, but I only succeeded in cloning myself.

Under Review Since : 2019-11-29

Inferential challenges that arise when data are censored have been extensively studied under the classical frameworks. In this paper, we provide an alternative generalized inferential model approach whose output is a data-dependent plausibility function. This construction is driven by an association between the distribution of the relative likelihood function at the interest parameter and an unobserved auxiliary variable. The plausibility function emerges from the distribution of a suitably calibrated random set designed to predict that unobserved auxiliary variable. The evaluation of this plausibility function requires a novel use of the classical Kaplan--Meier estimator to estimate the censoring rather than the event distribution. We prove that the proposed method provides valid inference, at least approximately, and our real- and simulated-data examples demonstrate its superior performance compared to existing methods.

Under Review Since : 2019-11-27

One of the classic problems in complex systems is the existence and ubiquity of criticality, characterized by scale-invariance in frequency space and a balance between emergence (randomness) and self-organization (order). Another universal characteristic of complex systems is their antifragility, or the capacity to take advantage of environmental randomness. Here we propose a preliminary hypothesis that both concepts are related and may be understood under an Information Theory framework using Fisher Information as a unifying concept. We make some comments about a possible connection with Autopoiesis and Contextuality.

Under Review Since : 2019-11-15

Published Date : 2019-11-08

Under Review Since : 2019-11-18

Under Review Since : 2019-10-25

We present a Gibbs sampler to implement the Dempster-Shafer (DS) theory of statistical inference for Categorical distributions with arbitrary numbers of categories and observations. The DS framework is trademarked by its three-valued uncertainty assessment (p, q, r), probabilities "for", "against", and "don't know", associated with formal assertions of interest. The proposed algorithm targets the invariant distribution of a class of random convex polytopes which encapsulate the inference, via establishing an equivalence between the iterative constraints of the vertex configuration and the non-negativity of cycles in a fully connected directed graph. The computational cost increases with the size of the input, linearly with the number of observations and polynomially in the number of non-empty categories. Illustrations of numerical examples include the testing of independence in 2 by 2 contingency tables and parameter estimation of the linkage model. Results are compared to alternative methods of Categorical inference.

Under Review Since : 2019-10-17

Under Review Since : 2019-10-15

Bias resulting from model misspecification is a concern when predicting insurance claims. Indeed, this bias puts the insurer at risk of making invalid or unreliable predictions. A method that could provide provably valid predictions uniformly across a large class of possible distributions would effectively eliminate the risk of model misspecification bias. Conformal prediction is one such method that can meet this need, and here we tailor that approach to the typical insurance application and show that the predictions are not only valid but also efficient across a wide range of settings.
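To make the idea of a distribution-free prediction interval concrete, here is a minimal sketch of split conformal prediction, a common textbook variant and not necessarily the construction tailored in the paper. The function name, the constant point prediction, and the sample claim amounts are all illustrative assumptions.

```python
import math

def split_conformal_interval(calibration_y, point_prediction, alpha=0.1):
    """Return a (lower, upper) prediction interval with target coverage 1 - alpha.

    Assumes the calibration values and the future observation are exchangeable.
    `point_prediction` is any fixed prediction (hypothetical here); the interval
    is validated by the calibration residuals, not by a model assumption.
    """
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - point_prediction) for y in calibration_y)
    n = len(scores)
    # Rank of the conformal quantile, with the usual (n + 1) correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    radius = scores[k - 1]
    return (point_prediction - radius, point_prediction + radius)

# Illustrative (made-up) claim amounts and a made-up point prediction of 0.
interval = split_conformal_interval([1, 2, 4, 8], point_prediction=0, alpha=0.5)
print(interval)
```

With only four calibration points, a 90% interval cannot be certified and the sketch returns the whole real line, which is the price of validity without a model.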

Under Review Since : 2019-10-14

**Purpose**

To investigate transcutaneous core-needle biopsy of the supraclavicular fat as a minimally invasive and scar-free method of obtaining brown adipose tissue (BAT) samples.

**Material and Methods**

In a prospective clinical trial, 16 volunteers underwent biopsy on two separate occasions after FDG-PET had shown active BAT in the supraclavicular fossa with an FDG uptake (SUV_{max}) > 3 mg/dl. After identifying the ideal location for biopsy on FDG-PET/MRI, ultrasound-guided core-needle biopsy of supraclavicular fat with a 16G needle was performed under local anesthesia and aseptic conditions. Tissue samples were immediately shock-frozen in liquid nitrogen and processed for gene expression analysis of adipose tissue markers. Wounds were checked two weeks after the biopsy.

**Results**

Tissue sampling was successful in 15 volunteers in both scans and in one very lean volunteer (BMI=19.9 kg/m^{2}) in only one visit, without any reported adverse events. Therefore 31 tissue samples were available for further analysis. Gene expression could be analyzed with high success rate in 30 out of 31 tissue biopsies. The intervention was well tolerated with local anesthetics. None of the volunteers showed any scarring.

**Conclusion**

Ultrasound-guided core-needle biopsy of FDG-positive supraclavicular fat yields sufficient BAT samples for quantification of molecular markers. It may, however, be limited in extremely lean individuals with very little supraclavicular fat.

Under Review Since : 2019-10-11

Under Review Since : 2019-10-04

An important question in economics is how people choose between different payments in the future. The classical normative model predicts that a decision maker discounts a later payment relative to an earlier one by an exponential function of the time between them. Descriptive models use non-exponential functions to fit observed behavioral phenomena, such as preference reversal. Here we propose a model of discounting, consistent with standard axioms of choice, in which decision makers maximize the growth rate of their wealth. Four specifications of the model produce four forms of discounting - no discounting, exponential, hyperbolic, and a hybrid of exponential and hyperbolic - two of which predict preference reversal. Our model requires no assumption of behavioral bias or payment risk.
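Preference reversal can be illustrated with the standard textbook forms of the two discount functions. The sketch below does not implement the growth-rate model proposed in the paper; the payment amounts, the rate, and the parameter `k` are illustrative assumptions.

```python
import math

def exponential_value(amount, delay, rate=0.5):
    """Present value under exponential discounting (textbook form)."""
    return amount * math.exp(-rate * delay)

def hyperbolic_value(amount, delay, k=1.0):
    """Present value under hyperbolic discounting (textbook form)."""
    return amount / (1.0 + k * delay)

def prefers_later(value_fn, wait):
    """Compare 100 paid after `wait` with 150 paid one period after that."""
    return value_fn(150, wait + 1) > value_fn(100, wait)

for wait in (0, 10):
    print(
        f"wait={wait:2d}  exponential prefers later: "
        f"{prefers_later(exponential_value, wait)}  "
        f"hyperbolic prefers later: {prefers_later(hyperbolic_value, wait)}"
    )
```

Under the exponential form the ranking of the two payments depends only on the gap between them, so it is the same for every `wait`. Under the hyperbolic form the ranking flips as both payments move into the future, which is the reversal the abstract refers to.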

Published Date : 2019-09-30

Whether the predictions put forth prior to the 2016 U.S. presidential election were right or wrong is a question that led to much debate. But rather than focusing on right or wrong, we analyze the 2016 predictions with respect to a core set of *effectiveness principles*, and conclude that they were ineffective in conveying the uncertainty behind their assessments. Along the way, we extract key insights that will help to avoid, in future elections, the systematic errors that led to overly precise and overconfident predictions in 2016. Specifically, we highlight shortcomings of the classical interpretations of probability and its communication in the form of predictions, and present an alternative approach with two important features. First, our recommended predictions are safer in that they come with certain guarantees on the probability of an erroneous prediction; second, our approach easily and naturally reflects the (possibly substantial) uncertainty about the model by outputting *plausibilities* instead of *probabilities*.

Under Review Since : 2019-09-29

This paper examines the development of Laplacean practical certainty from 1810, when Laplace proved his central limit theorem, to 1925, when Ronald A. Fisher published his *Statistical Methods for Research Workers*.

Although Laplace's explanations of the applications of his theorem were accessible to only a few mathematicians, expositions published by Joseph Fourier in 1826 and 1829 made the simplest applications accessible to many statisticians. Fourier suggested an error probability of 1 in 20,000, but statisticians soon used less exigent standards. Abuses, including p-hacking, helped discredit Laplace's theory in France to the extent that it was practically forgotten there by the end of the 19th century, yet it survived elsewhere and served as the starting point for Karl Pearson's biometry.

The probability that a normally distributed random variable is more than three probable errors from its mean is approximately 5%. When Fisher published his *Statistical Methods*, three probable errors was a common standard for likely significance. Because he wanted to enable research workers to use distributions other than the normal (the *t* distributions, for example), Fisher replaced three probable errors with 5%.
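The arithmetic behind the "three probable errors" standard can be checked directly. This sketch uses only the Python standard library and takes the usual definition of the probable error as the half-width of the central interval holding 50% of the probability; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One probable error: half of the probability lies within this distance of the mean.
probable_error = std_normal.inv_cdf(0.75)  # in units of standard deviations

# Two-sided probability of falling more than three probable errors from the mean.
threshold = 3 * probable_error
two_sided_tail = 2 * (1 - std_normal.cdf(threshold))

print(f"probable error      = {probable_error:.4f} sd")
print(f"three prob. errors  = {threshold:.4f} sd")
print(f"P(|Z| > threshold)  = {two_sided_tail:.4f}")
```

The tail probability comes out a little above 4%, which is the sense in which three probable errors corresponds approximately to the 5% level.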

The use of *significant* after Fisher differs from its use by Pearson before 1920. In Pearson's *Biometrika*, a significant difference was an observed difference that *signified* a real difference. *Biometrika*'s authors sometimes said that an observed difference is likely or very likely to be significant, but they never said that it is very significant, and they did not have levels of significance. Significance itself was not a matter of degree.

What might this history teach us about proposals to curtail abuses of statistical testing by changing its current vocabulary (p-value, significance, etc.)? The fact that similar abuses arose before this vocabulary was introduced suggests that more substantive changes are needed.

Under Review Since : 2019-09-30

Meta-analysis based on only a few studies remains a challenging problem, as an accurate estimate of the between-study variance is apparently needed, but hard to attain, within this setting. Here we offer a new approach, based on the *generalized inferential model* framework, whose success lies in marginalizing out the between-study variance, so that an accurate estimate is not essential. We show theoretically that the proposed solution is at least approximately valid, with numerical results suggesting it is, in fact, nearly exact. We also demonstrate that the proposed solution outperforms existing methods across a wide range of scenarios.

Under Review Since : 2019-09-23

*Big Bubble theory is a cosmological model where the universe is an expanding bubble in four-dimensional space. Expansion is driven by starlight and gravity acts like surface tension to form a minimal surface. This model is used to derive Minkowski’s spacetime geometrically from four-dimensional Euclidean space. Big Bubble cosmology is consistent with type 1a supernova redshifts without dark energy or expanding spacetime. A different origin for the cosmic microwave background is proposed. The size of the universe is estimated using Hubble’s constant and a Doppler shift of the cosmic microwave background. A mechanism for Mach’s principle is described. Big Bubble theory is similar to Einstein’s 1917 cosmological model, which is shown to be a snapshot of a rapidly expanding universe in dynamic equilibrium, rather than a static universe. The orbital speed of stars in spiral galaxies can be reproduced with Newtonian dynamics and without dark matter. A quadratic equation is derived that predicts both faster and slower rotation than purely Keplerian orbits, consistent with the behaviour of spiral and elliptical galaxies, and suggesting that spiral galaxies evolve into elliptical galaxies as they age. The Big Bubble physical concept provides a basis for some quantum physics phenomena.*

Published Date : 2019-09-08

**The work argues that Nassim Taleb's precautionary principle should not apply to the domain of ‘GMOs’ any more than to other monopolizing economic domains, because the probability of systemic ruin stemming from the GM technology itself is dwarfed by other systemic risks of the Deductive-Optimization Economy of today.**

Published Date : 2019-09-08

**A philosophical version**

In this work of foresight, I communicated my perception of Taleb's policy paper and the Black Swan problem discussed in it. To this effect, I:

- Re-conceptualized the concept of “foresight,” non-teleologically, and its “method”;
- Revived Empedocles’ non-teleological philosophy of evolution with modern scientific data;
- Located the real GMO safety problem in (you guessed it) teleology: in the suppression of dissent within institutions under a seeming assumption of knowing what waste is.

Under Review Since : 2019-08-24

Under Review Since : 2019-08-04

The idea of the paper is to think about the result presented in a Numberphile (http://www.numberphile.com/) talk (https://www.youtube.com/watch?v=w-I6XTVZXww) where they claim that 1 + 2 + 3 + ..., the Gauss sum, converges to −1/12. In the video they make two strong statements: first, that Grandi’s series 1 − 1 + 1 − 1 + 1 − 1 + ... tends to 1/2; and second, that as bizarre as the −1/12 result for the Gauss sum might appear, it is plausible because it is connected to physics (the result is related to the number of dimensions in String Theory). In this work we argue that these two statements reflect adhesion to a particular probability narrative and to a particular scientific philosophical posture. We argue that, by doing so, these results (for the Gauss and Grandi series), and ultimately String Theory, might be mathematically correct but are (at the very least) scientifically inconsistent in the Galileo-Newton-Einstein tradition. The philosophical implications of this problem are also discussed, focusing on the role of evidence and scientific demarcation.
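For readers who want to see where the value 1/2 for Grandi's series comes from, the sketch below computes its ordinary partial sums and their running averages (Cesàro means). This is the standard summability construction, shown here only as background; it is not the argument made in the paper, and the function names are illustrative.

```python
from itertools import accumulate, cycle, islice

def grandi_partial_sums(n_terms):
    """Partial sums of 1 - 1 + 1 - 1 + ... for the first n_terms terms."""
    terms = islice(cycle([1, -1]), n_terms)
    return list(accumulate(terms))

def cesaro_means(partial_sums):
    """Running averages of the partial sums (Cesàro means)."""
    running_totals = accumulate(partial_sums)
    return [total / count for count, total in enumerate(running_totals, start=1)]

sums = grandi_partial_sums(1000)
means = cesaro_means(sums)

print("first partial sums:", sums[:6])
print("distinct partial sums:", sorted(set(sums)))
print("Cesàro mean after 1000 terms:", means[-1])
```

The partial sums alternate between 1 and 0 forever, so the series does not converge in the ordinary sense. Only the averaged sequence settles at 1/2, which is why the choice of summation method matters for how the claim is interpreted.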

Under Review Since : 2019-08-01

A covering problem posed by Henri Lebesgue in 1914 seeks to find the convex shape of smallest area that contains a subset congruent to any point set of unit diameter in the Euclidean plane. Methods used previously to construct such a covering can be refined and extended to provide an improved upper bound for the optimal area. An upper bound of 0.8440935944 is found.

Under Review Since : 2019-07-19

In the context of predicting future claims, a fully Bayesian analysis---one that specifies a statistical model, prior distribution, and updates using Bayes's formula---is often viewed as the gold standard, while Buhlmann's credibility estimator serves as a simple approximation. But those desirable properties that give the Bayesian solution its elevated status depend critically on the posited model being correctly specified. Here we investigate the asymptotic behavior of Bayesian posterior distributions under a misspecified model, and our conclusion is that misspecification bias generally has damaging effects that can lead to inaccurate inference and prediction. The credibility estimator, on the other hand, is not sensitive at all to model misspecification, giving it an advantage over the Bayesian solution in those practically relevant cases where the model is uncertain. This raises the question: does robustness to model misspecification require that we abandon uncertainty quantification based on a posterior distribution? Our answer to this question is *No*, and we offer an alternative *Gibbs posterior* construction. Furthermore, we argue that this Gibbs perspective provides a new characterization of Buhlmann's credibility estimator.

Under Review Since : 2019-05-27

The Wilcoxon rank-sum test is a very competitive robust alternative to the two-sample t-test when the underlying data have tails longer than the normal distribution. Extending to the one-way model with k independent samples, the Kruskal-Wallis rank test is a competitive alternative to the usual F-test for testing if there are any location differences. However, these positives for rank methods do not extend as readily to methods for making all pairwise comparisons used to reveal where the differences in location may exist. We demonstrate via examples and simulation that rank methods can have a dramatic loss in power compared to the standard Tukey-Kramer method of normal linear models even for non-normal data. We also show that a well-established robust rank-like method can recover the power but does not fully control the familywise error rate in small samples.

Under Review Since : 2019-05-25

Under Review Since : 2019-03-09

An inferential model encodes the data analyst's degrees of belief about an unknown quantity of interest based on the observed data, posited statistical model, etc. Inferences drawn based on these degrees of belief should be reliable in a certain sense, so we require the inferential model to be *valid*. The construction of valid inferential models based on individual pieces of data is relatively straightforward, but how should these be combined so that the validity property is preserved? In this paper we analyze some common combination rules with respect to this question, and we conclude that the best strategy currently available is one that combines via a certain dimension reduction step before the inferential model construction.

Under Review Since : 2019-03-05

Cooperation is a persistent behavioral pattern of entities pooling and sharing resources. Its ubiquity in nature poses a conundrum: whenever two entities cooperate, one must willingly relinquish something of value to the other. Why is this apparent altruism favored in evolution? Classical treatments assume *a priori* a net fitness gain in a cooperative transaction which, through reciprocity or relatedness, finds its way back from recipient to donor. Our analysis makes no such assumption. It rests on the insight that evolutionary processes are typically multiplicative and noisy. Fluctuations have a net negative effect on the long-time growth rate of resources but no effect on the growth rate of their expectation value. This is a consequence of non-ergodicity. Pooling and sharing reduces the amplitude of fluctuations and, therefore, increases the long-time growth rate for cooperators. Put simply, cooperators' resources outgrow those of similar non-cooperators. This constitutes a fundamental and widely applicable mechanism for the evolution of cooperation. Furthermore, its minimal assumptions make it a candidate explanation in simple settings, where other explanations, such as emergent function and specialization, are implausible. An example of this is the transition from single cells to early multicellular life.
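The mechanism can be illustrated with a toy multiplicative process. In the sketch below, each agent's resources are multiplied by 1.5 or 0.6 with equal probability in every round; these factors, the equal-sharing rule, and the function names are illustrative assumptions, not the model analyzed in the paper. The time-average growth rate is computed exactly by enumerating all outcomes of one round.

```python
import math
from itertools import product

# Hypothetical per-round growth factors, each occurring with probability 1/2.
FACTORS = (1.5, 0.6)

def time_average_growth_rate(n_cooperators):
    """Expected log growth per round when n agents pool and share equally."""
    outcomes = list(product(FACTORS, repeat=n_cooperators))
    # After pooling and equal sharing, each agent's factor is the group mean.
    log_growth = [math.log(sum(outcome) / n_cooperators) for outcome in outcomes]
    return sum(log_growth) / len(outcomes)

def expected_factor(n_cooperators):
    """Growth factor of the expectation value; pooling leaves it unchanged."""
    outcomes = list(product(FACTORS, repeat=n_cooperators))
    return sum(sum(outcome) / n_cooperators for outcome in outcomes) / len(outcomes)

for n in (1, 2, 4):
    print(
        f"n={n}: expected factor = {expected_factor(n):.3f}, "
        f"time-average growth rate = {time_average_growth_rate(n):+.4f}"
    )
```

In this toy setting the expected factor is the same for every group size, while the time-average growth rate rises with the number of cooperators: a lone agent's resources shrink in the long run, whereas a group of four grows.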

Published Date : 2019-03-04

Abstract

These papers - one proposition paper and ten responses - comprise a debate on shaken baby syndrome. This is the hypothesis that a Triad of indicators in the head of a dead baby reveal that it has been shaken to death, and that the killer was the person last in charge of the baby. The debate was scheduled to have appeared in *Prometheus*, a journal concerned with innovation rather than matters medical. It struck the editors of *Prometheus* that a hypothesis that had survived nearly half a century and was still resistant to challenge and change was well within the tradition of *Prometheus* debate. The debate focuses on the role of the expert witness in court, and especially the experiences of Waney Squier, a prominent paediatric pathologist, struck from the medical register in the UK for offering opinions beyond her core expertise and showing insufficient respect for established thinking and its adherents. The debate’s responses reveal much about innovation, and most about the importance of context, in this case the incompatibility of medicine and the law, particularly when constrained by the procedures of the court. Context was also important in the reluctance of Taylor & Francis, the publisher of *Prometheus*, to publish the debate on the grounds that its authors strayed from their areas of expertise and showed insufficient respect for established thinking.

***Prometheus* shaken baby debate**

**Contents**

**Introduction**

The shaken baby debate - Stuart Macdonald

**Proposition paper**

Shaken baby syndrome: causes and consequences of conformity - Waney Squier

**Response papers**

Shaken baby syndrome: a fraud on the courts - Heather Kirkwood

Shaken baby: an evolving diagnosis deformed by the pressures of the courtroom - Susan Luttner

Waney Squier’s ordeal and the crisis of the shaken baby paradigm - Niels Lynøe

Another perspective - simply my brief thoughts - Dave Marshall

Has Squier been treated fairly? - Brian Martin

Commentary on the paper by Waney Squier: ‘Shaken baby syndrome: causes and consequences of conformity’ - Michael J Powers

Waney Squier and the shaken baby syndrome case: a clarion call to science, medicine and justice - Toni C Saad

The role of the General Medical Council - Terence Stephenson

When experts disagree - Stephen J. Watkins

The General Medical Council’s handling of complaints: the Waney Squier case - Peter Wilmshurst

Under Review Since : 2019-03-03

In this paper we adopt the familiar sparse, high-dimensional linear regression model and focus on the important but often overlooked task of prediction. In particular, we consider a new empirical Bayes framework that incorporates data in the prior in two ways: one is to center the prior for the non-zero regression coefficients and the other is to provide some additional regularization. We show that, in certain settings, the asymptotic concentration of the proposed empirical Bayes posterior predictive distribution is very fast, and we establish a Bernstein--von Mises theorem which ensures that the derived empirical Bayes prediction intervals achieve the targeted frequentist coverage probability. The empirical prior has a convenient conjugate form, so posterior computations are relatively simple and fast. Finally, our numerical results demonstrate the proposed method's strong finite-sample performance in terms of prediction accuracy, uncertainty quantification, and computation time compared to existing Bayesian methods.

Under Review Since : 2019-02-28

This communication outlines the potential for a novel, alternative model to rationalise the quantitative difficulties with the Hubble constant. This model, the Proto-Quantum Field (PQF) model, is an alternative to the singularity-big-bang (SBB) model and the two are mutually incompatible. A justification is that the theoretical developments required to validate the PQF hypothesis are closely derived from the standard model for particles and forces, more so than those required to modify the SBB hypothesis.

Published Date : 2019-02-15

The universe is formed from proto-quantum field(s) (PQFs). The initiating event is the formation of a thermal gradient which establishes synchronous oscillations which describe time. Time is not quantised. Concomitantly PQFs, either directly or indirectly, differentiate into all the quantum fields required for the standard model of particles and forces, and three-dimensional space. The transition of PQFs to functional quantum fields is a continuous process at the boundary of a spherical universe, a “ring of fire”, necessary to maintain time.

Under Review Since : 2019-02-11

This analysis shows that a special relativity interpretation matches observed type 1a supernova redshifts. Davis & Lineweaver reported in 2003 that a special relativity match to supernova redshift observations can be ruled out at more than 23σ, but MacLeod’s 2004 conclusion that this finding was incorrect and due to a mathematical error is confirmed. MacLeod’s plot of special relativity against observation has been further improved by using celerity (aka proper velocity) instead of peculiar velocity. A Hubble plot of type 1a supernova celerity against retarded distance has a straight line of 70 km s^{-1} Mpc^{-1} for as far back in time as we can observe, indicating that, with a special relativity interpretation of cosmological redshift, expansion of the universe is neither accelerating nor decelerating, and it is not necessary to invoke the existence of dark energy.

Under Review Since : 2019-02-05

*It’s been more than a century since Einstein’s special theory of relativity showed that Newton’s concept of time is incorrect, but society and science continue to use predominantly Newtonian language and thought. The words normally used to describe time don’t distinguish when time is a dimension, used for locating objects and events in spacetime, and when it’s a property of objects that can age at different rates. It is proposed to bring relativity’s terminology of coordinate time and proper time into everyday language, and thereby distinguish between ‘cotime’ (a dimensional property) and ‘protime’ (a property of objects related to energy). The differences between cotime and protime are significant and cotime might be a spatial dimension with units of length.*

Under Review Since : 2019-02-03

Statistics has made tremendous advances since the times of Fisher, Neyman, Jeffreys, and others, but the fundamental and practically relevant questions about probability and inference that puzzled our founding fathers remain unanswered. To bridge this gap, I propose to look beyond the two dominating schools of thought and ask the following three questions: what do scientists need out of statistics, do the existing frameworks meet these needs, and, if not, how to fill the void? To the first question, I contend that scientists seek to convert their data, posited statistical model, etc., into calibrated degrees of belief about quantities of interest. To the second question, I argue that any framework that returns additive beliefs, i.e., probabilities, necessarily suffers from *false confidence*---certain false hypotheses tend to be assigned high probability---and, therefore, risks systematic bias. This reveals the fundamental importance of *non-additive beliefs* in the context of statistical inference. But non-additivity alone is not enough so, to the third question, I offer a sufficient condition, called *validity*, for avoiding false confidence, and present a framework, based on random sets and belief functions, that provably meets this condition. Finally, I discuss characterizations of p-values and confidence intervals in terms of valid non-additive beliefs, which imply that users of these classical procedures are already following the proposed framework without knowing it.

Under Review Since : 2019-01-30

A perfect economic storm emerged in México in what was called (mistakenly, under our analysis) The December Error (1994), in which Mexico's economy collapsed. In this paper, we show how theoretical physics may help us to understand the underlying processes of this kind of economic crisis and eventually, perhaps, to develop an early warning. We specifically analyze monthly historical time series for inflation from January 1969 to November 2018. We found that Fisher information is insensitive to inflation growth in the 1980s but captures The December Error (TDE) quite well. Our results show that under the Salinas administration the Mexican economy was characterized by unstable stability, most probably due to hidden-risk policies in the form of macroeconomic controls that artificially suppressed randomness in the system, making it fragile. And so, we conclude that it was not at all a December error but a sustained, six-year (sexenio) error of fragilization.

Under Review Since : 2019-01-24

Here we present a proposal of how to teach complexity using a Problem Based Learning approach under a set of philosophical principles inspired by pedagogical experience in the sustainability sciences. We describe the context in which we put these ideas into practice, which was a graduate course on Complexity and Data Science applied to Ecology. In part two we present the final work presented by the students as they wrote it, which we believe could be submitted to a journal on its own merits.

Under Review Since : 2019-01-21

Here we expand the concept of Holobiont to incorporate niche construction theory in order to increase our understanding of the current planetary crisis. By this, we propose a new ontology, the Ecobiont, as the basic evolutionary unit of analysis. We make the case for *Homo sapiens* organized around modern cities (technobionts) as a different Ecobiont from classical *Homo sapiens* (i.e., hunter-gatherer *Homo sapiens*). We consider that the Ecobiont ontology helps to make visible the coupling of *Homo sapiens* with other biological entities under processes of natural and cultural evolution. Failing to see this coupling hides systemic risks and increases the probability of catastrophic events. So the Ecobiont ontology is necessary to understand and respond to the current planetary crisis.

Under Review Since : 2018-12-31

Headwater streams are essential to downstream water quality; therefore, it is important that they are properly represented on maps used for stream regulation. Current maps used for stream regulation, such as the United States Geological Survey (USGS) topographic maps and Natural Resources Conservation Service (NRCS) soil survey maps, are outdated and depict headwater streams neither accurately nor consistently. In order for new stream maps to be used for regulatory purposes, their accuracy must be known and the maps must show streams with a consistent level of accuracy. This study assessed the valley presence/absence and stream length accuracy of the new stream maps created by the North Carolina Center for Geographic Analysis (CGIA) for western North Carolina. The CGIA stream map does not depict headwater streams with a consistent level of accuracy. This study also compared the accuracy of stream networks modeled using the computer software program Terrain Analysis using Digital Elevation Models (TauDEM) to the CGIA stream map. The stream networks modeled in TauDEM also do not consistently predict the location of headwater streams across the mountain region of the state. The location of headwater streams could not be accurately or consistently predicted by using aerial photography or elevation data alone. Other factors such as climate, soils, geology, land use, and vegetation cover should be considered to accurately and consistently model headwater stream networks.

Under Review Since : 2018-12-28

Metric Temporal Logic (MTL) is a popular formalism to specify patterns with timing constraints over the behavior of cyber-physical systems. In this paper, I propose sequential networks for online monitoring applications and construct network-based monitors from the past fragment of MTL over discrete and dense time behaviors. This class of monitors is more compositional, extensible, and easily implementable than other monitors based on rewriting and automata. I first explain the sequential network construction over discrete time behaviors and then extend it towards dense time by adopting a point-free approach. The formulation for dense time behaviors and MTL radically differs from the traditional pointy definitions and, in return, avoids some longstanding complications. I argue that the point-free approach is more natural and practical and therefore should be preferred for dense time. Finally, I present my implementation together with some experimental results that show the performance of the network-based monitors compared to similar existing tools.

Under Review Since : 2018-12-17

Analogue gravity models are attempts to model general relativity by using such things as acoustic waves propagating through an ideal fluid. In this work, we take inspiration from these models to re-interpret general relativity in terms of an ether continuum moving and changing against a background of absolute space and time. We reformulate the metric, the Ricci tensor, the Einstein equation, and continuous matter dynamics in terms of the ether. We also reformulate general relativistic electrodynamics in terms of the ether, which takes the form of electrodynamics in an anisotropic moving medium. Some degree of simplification is achieved by assuming that the speed of light is uniform and isotropic with respect to the ether coordinates. Finally, we speculate on the nature of under-determination in general relativity.

Under Review Since : 2018-12-05

Nonparametric estimation of a mixing density based on observations from the corresponding mixture is a challenging statistical problem. This paper surveys the literature on a fast, recursive estimator based on the *predictive recursion* algorithm. After introducing the algorithm and giving a few examples, I summarize the available asymptotic convergence theory, describe an important semiparametric extension, and highlight two interesting applications. I conclude with a discussion of several recent developments in this area and some open problems.

Under Review Since : 2018-12-05

Bayesian methods provide a natural means for uncertainty quantification, that is, credible sets can be easily obtained from the posterior distribution. But is this uncertainty quantification valid in the sense that the posterior credible sets attain the nominal frequentist coverage probability? This paper investigates the frequentist validity of posterior uncertainty quantification based on a class of empirical priors in the sparse normal mean model. In particular, we show that our marginal posterior credible intervals achieve the nominal frequentist coverage probability under conditions slightly weaker than needed for selection consistency and a Bernstein--von Mises theorem for the full posterior, and numerical investigations suggest that our empirical Bayes method has superior frequentist coverage probability properties compared to other fully Bayes methods.

Under Review Since : 2018-12-04

Reality has two different dimensions: information-communication, and matter-energy. They relate to each other in a figure-ground gestalt that gives two different perspectives on the one reality.

1. We learn from modern telecommunications that matter and energy are the **media** of information and communication; here information is the figure and matter/energy is the ground.

2. We learn from cybernetics that information and communication **control** matter and energy; here matter/energy is the figure and information/communication is the ground.

Under Review Since : 2018-12-04

The cybernetic control loop can be understood as a decision making process, a command and control process. Information controls matter and energy. Decision making has four necessary and sufficient communication processes: act, sense, evaluate, choose. These processes are programmed by the environment, the model, values and alternatives.

Under Review Since : 2018-12-04

The idea that information is entropy is an error. The proposal is neither mathematically nor logically valid. There are numerous definitions of entropy, but there is no definition of entropy for which this equation is valid. Information is information. It is neither matter nor energy.

Under Review Since : 2018-12-03

The internet is increasingly important for our economies and societies. This is the reason for a growing interest in internet regulation. The stakes in network neutrality - that all traffic on the internet should be treated equally - are particularly high. This paper argues that technological, civil-libertarian, legal and economic arguments exist both for and against net neutrality and that the decision is ultimately political. We therefore frame the issue of net neutrality as an issue of political economy. The main political economy arguments for net neutrality are that a net-neutral internet contributes to the reduction of inequality, preserves its openness and prevents artificial scarcity. With these arguments Slovenia, after Chile and the Netherlands, adopted net neutrality legislation. We present it as a case study for examining how political forces are affecting the choice of economic and technological policies. After a few years we are finding that proper enforcement is just as important as legislation.

Under Review Since : 2018-11-22

Physicalism, which provides the philosophical basis of modern science, holds that consciousness is solely a product of brain activity, and more generally, that mind is an epiphenomenon of matter, that is, derivable from and reducible to matter. If mind is reducible to matter, then it follows that identical states of matter must correspond to identical states of mind.

In this discourse, I provide a cogent refutation of physicalism by showing examples of physically identical states which, by definition, cannot be distinguished by any method available to science but can nevertheless be distinguished by a conscious observer. I conclude by giving an example of information that is potentially knowable by an individual but is beyond the ken of science.

Under Review Since : 2018-11-22

Philosophers have long pondered the Problem of Universals. One response is Metaphysical Realism, such as Plato's Doctrine of the Forms and Aristotle's Hylomorphism. We postulate that Measurement in Quantum Mechanics forms the basis of Metaphysical Realism. It is the process that gives rise to the instantiation of Universals as Properties, a process we refer to as Hylomorphic Functions. This combines substance metaphysics and process metaphysics by identifying the instantiation of Universals as causally active processes along with physical substance, forming a dualism of both substance and information. Measurements of fundamental properties of matter are the Atomic Universals of metaphysics, which combine to form the whole taxonomy of Universals. We look at this hypothesis in relation to various different interpretations of Quantum Mechanics grouped under two exemplars: the Copenhagen Interpretation, a version of Platonic Realism based on wave function collapse, and the Pilot Wave Theory of Bohm and de Broglie, where particle--particle interactions lead to an Aristotelian metaphysics. This view of Universals explains the distinction between pure information and the medium that transmits it and establishes the arrow of time. It also distinguishes between universally true Atomic Facts and the more conditional Inferences based on them. Hylomorphic Functions also provide a distinction between Universals and Tropes based on whether a given Property is a physical process or is based on the qualia of an individual organism. Since the Hylomorphic Functions are causally active, it is possible to suggest experimental tests that can verify this viewpoint of metaphysics.

Under Review Since : 2018-12-05

The nature of consciousness has been one of the longest-standing open questions in philosophy. Advancements in physics, neuroscience, and information theory have informed and constrained this topic, but have not produced any consensus. What would it mean to ‘solve’ or ‘dissolve’ the mystery of consciousness?

Part I begins with grounding this topic by considering a concrete question: what makes some conscious experiences more pleasant than others? We first review what’s known about the neuroscience of pain & pleasure, find the current state of knowledge narrow, inconsistent, and often circular, and conclude we must look elsewhere for a systematic framework (Sections I & II). We then review the Integrated Information Theory (IIT) of consciousness and several variants of IIT, and find each of them promising, yet also underdeveloped and flawed (Sections III-V).

We then take a step back and distill what kind of problem consciousness *is*. Importantly, we offer eight sub-problems whose solutions would, in aggregate, constitute a *complete theory of consciousness* (Section VI).

Armed with this framework, in Part II we return to the subject of pain & pleasure (valence) and offer some assumptions, distinctions, and heuristics to clarify and constrain the problem (Sections VII-IX). Of particular interest, we then offer a specific hypothesis on what valence *is* (Section X) and several novel empirical predictions which follow from this (Section XI). Part III finishes with discussion of how this general approach may inform open problems in neuroscience, and the prospects for building a new science of qualia (Sections XII & XIII). Lastly, we identify further research threads within this framework (Appendices A-F).

Published Date : 2018-11-13

This paper argues that experimental evidence, quantum theory, and relativity theory, taken together, suggest that reality is relational: Properties and behaviors of phenomena do not have a priori, intrinsic values; instead, these properties and behaviors emerge through interactions with other systems.

Under Review Since : 2018-11-05

Published Date : 2018-11-01

Francis Perey, of the Engineering Physics Division of Oak Ridge National Lab, left a number of unpublished papers upon his death in 2017. They circulate around the idea of probabilities arising naturally from basic physical laws. One of his papers, Application of Group Theory to Data Reduction, was published as an ORNL white paper in 1982. This collection includes two earlier works and two that came later, as well as a relevant presentation. They are being published now so that the ideas in them will be available to interested parties.

Under Review Since : 2018-10-22

The potential for an infectious disease outbreak that is much worse than those which have been observed in human history, whether engineered or natural, has been the focus of significant concern in biosecurity. Fundamental dynamics of disease spread make such outbreaks much less likely than they first appear. Here we present a slightly modified formulation of the typical SEIR model that illustrates these dynamics more clearly, and shows the unlikely cases where concern may still be warranted. This is then applied to an extreme version of proposed pandemic risk, multi-disease syndemics, to show that (absent much clearer reasons for concern) the suggested dangers are overstated.
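For orientation, the following is a sketch of the standard SEIR compartmental model integrated with a forward-Euler step. It is the textbook formulation, not the modified formulation the paper proposes, and the parameter values are illustrative assumptions only.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the standard SEIR equations.

    beta: transmission rate, sigma: 1 / incubation period,
    gamma: 1 / infectious period. Compartments are population fractions.
    """
    s, e, i, r = s0, e0, i0, r0
    history = [(s, e, i, r)]
    for _ in range(int(round(days / dt))):
        new_exposed = beta * s * i * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


# Illustrative values only; the basic reproduction number is beta / gamma = 2.5.
history = simulate_seir(beta=0.5, sigma=0.2, gamma=0.2,
                        s0=0.999, e0=0.0, i0=0.001, r0=0.0, days=300)
final_s, final_e, final_i, final_r = history[-1]
```

Even in this plain version, a fraction of the population remains susceptible when the outbreak burns out, because transmission slows as the susceptible pool shrinks.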

Under Review Since : 2018-10-03

This note generalizes the notion of conditional probability to Riesz spaces using the order-theoretic approach. With the aid of this concept, we establish the law of total probability and Bayes' theorem in Riesz spaces; we also prove an inclusion-exclusion formula in Riesz spaces. Several examples are provided to show that the law of total probability, Bayes' theorem and inclusion-exclusion formula in probability theory are special cases of our results.
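For reference, the classical statements that the note describes as special cases can be written as follows, for events $A, A_1, \dots, A_n$ and a partition $B_1, \dots, B_n$ of the sample space with $P(B_i) > 0$ and $P(A) > 0$. These are the standard textbook forms, not the Riesz-space versions proved in the note.

```latex
% Law of total probability
P(A) = \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)

% Bayes' theorem
P(B_j \mid A) = \frac{P(A \mid B_j)\, P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)}

% Inclusion-exclusion formula
P\Bigl(\bigcup_{i=1}^{n} A_i\Bigr)
  = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \dots < i_k \le n}
    P\bigl(A_{i_1} \cap \dots \cap A_{i_k}\bigr)
```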

Under Review Since : 2018-09-18

To justify the effort of developing a theoretical construct, a theoretician needs empirical data that support a non-random effect of sufficiently high replication-probability. To establish these effects statistically, researchers (rightly) rely on a *t*-test. But many pursue questionable strategies that lower the cost of data-collection. Our paper reconstructs two such strategies. Both reduce the minimum sample-size (N_{MIN}) sufficing under conventional errors (*α*, *β*) to register a given effect-size (*d*) as a statistically significant non-random data signature. The first strategy increases the *β*-error; the second treats the control-group as a constant, thereby collapsing a two-sample *t*-test into its one-sample version. (A two-sample *t*-test for *d*=0.50 under *α*=*β*=0.05 with N_{MIN}=176, for instance, becomes a one-sample *t*-test under *α*=0.05, *β*=0.20 with N_{MIN}=27.) Not only does this decrease the replication-probability of data from (1-*β*)=0.95 to (1-*β*)=0.80, but the second strategy in particular cannot corroborate hypotheses meaningfully. The ubiquity of both strategies arguably makes them partial causes of the confidence-crisis. But as resource-pooling would allow research groups to reach N_{MIN} jointly, a group’s individually limited resources justify neither strategy.
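The effect of the two strategies on the required sample size can be approximated with the usual normal-approximation formula. The sketch below assumes one-sided tests (an assumption of this example, chosen because it is consistent with the figures quoted above) and uses hypothetical function names.

```python
import math
from statistics import NormalDist


def n_two_sample(d, alpha, beta):
    """Total N (both groups) for a one-sided two-sample test, normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(1 - beta)
    per_group = math.ceil(2 * (z / d) ** 2)
    return 2 * per_group


def n_one_sample(d, alpha, beta):
    """N for a one-sided one-sample test, normal approximation."""
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(1 - beta)
    return math.ceil((z / d) ** 2)


n_two = n_two_sample(d=0.5, alpha=0.05, beta=0.05)
n_one = n_one_sample(d=0.5, alpha=0.05, beta=0.20)
```

Under these assumptions the sketch gives 174 and 25, slightly below the exact *t*-based values of 176 and 27, because the normal approximation ignores the uncertainty from estimating the variance. The roughly sevenfold reduction is the same in both calculations.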

Published Date : 2018-09-15

This article describes our motivation behind the development of RESEARCHERS.ONE, our mission, and how the new platform will fulfill this mission. We also compare our approach with other recent reform initiatives such as post-publication peer review and open access publications.

Under Review Since : 2018-09-15

Under Review Since : 2018-09-14

This paper is a synthesis of the deposition in front of the Financial Crisis Inquiry Commission by the Obama Administration in 2010. Note that none of its ideas made it to the report.

Under Review Since : 2018-09-14

This article describes how the filtering role played by peer review may actually be harmful rather than helpful to the quality of the scientific literature. We argue that, instead of trying to filter out the low-quality research, as is done by traditional journals, a better strategy is to let everything through but with an acknowledgment of the uncertain quality of what is published, as is done on the RESEARCHERS.ONE platform. We refer to this as "scholarly mithridatism." When researchers approach what they read with doubt rather than blind trust, they are more likely to identify errors, which protects the scientific community from the dangerous effects of error propagation, making the literature stronger rather than more fragile.

Published Date : 2018-09-14

Penalized maximum likelihood methods that perform automatic variable selection are now ubiquitous in statistical research. It is well-known, however, that these estimators are nonregular and consequently have limiting distributions that can be highly sensitive to small perturbations of the underlying generative model. This is the case even for the fixed “p” framework. Hence, the usual asymptotic methods for inference, like the bootstrap and series approximations, often perform poorly in small samples and require modification. Here, we develop locally asymptotically consistent confidence intervals for regression coefficients when estimation is done using the Adaptive LASSO (Zou, 2006) in the fixed “p” framework. We construct the confidence intervals by sandwiching the nonregular functional of interest between two smooth, data-driven, upper and lower bounds and then approximating the distribution of the bounds using the bootstrap. We leverage the smoothness of the bounds to obtain consistent inference for the nonregular functional under both fixed and local alternatives. The bounds are adaptive to the amount of underlying nonregularity in the sense that they deliver asymptotically exact coverage whenever the underlying generative model is such that the Adaptive LASSO estimators are consistent and asymptotically normal, and conservative otherwise. The resultant confidence intervals possess a certain tightness property among all regular bounds. Although we focus on the Adaptive LASSO, our approach generalizes to other penalized methods. (Originally published as a technical report in 2014.)

Under Review Since : 2018-09-09

Prediction markets are currently used in three areas:

1. Economic, political and sporting event outcomes (IEW, PredictIt, PredictWise).

2. Risk evaluation, product development and marketing (Cultivate Labs/Consensus Point).

3. Research replication (Replication Prediction Project, Experimental Economics Prediction Project, and Brian Nosek’s latest replicability study).

The last of these applications has remained closed and/or proprietary despite the promising results of the methods. In this paper, I construct an open research prediction market framework to incentivize replication studies and align the motivations of research stakeholders.

Under Review Since : 2018-09-06

Extreme values are by definition rare, and therefore a spatial analysis of extremes is attractive because a spatial analysis makes full use of the data by pooling information across nearby locations. In many cases, there are several dependent processes with similar spatial patterns. In this paper, we propose the first multivariate spatial models to simultaneously analyze several processes. Using a multivariate model, we are able to estimate joint exceedance probabilities for several processes, improve spatial interpolation by exploiting dependence between processes, and improve estimation of extreme quantiles by borrowing strength across processes. We propose models for separable and non-separable, and spatially continuous and discontinuous processes. The method is applied to French temperature data, where we find an increase in the extreme temperatures over time for much of the country.

Under Review Since : 2018-09-04

In a Bayesian context, prior specification for inference on monotone densities is conceptually straightforward, but proving posterior convergence theorems is complicated by the fact that desirable prior concentration properties often are not satisfied. In this paper, I first develop a new prior designed specifically to satisfy an empirical version of the prior concentration property, and then I give sufficient conditions on the prior inputs such that the corresponding empirical Bayes posterior concentrates around the true monotone density at nearly the optimal minimax rate. Numerical illustrations also reveal the practical benefits of the proposed empirical Bayes approach compared to Dirichlet process mixtures.

Under Review Since : 2018-09-04

Accurate estimation of value-at-risk (VaR) and assessment of associated uncertainty is crucial for both insurers and regulators, particularly in Europe. Existing approaches link data and VaR indirectly by first linking data to the parameter of a probability model, and then expressing VaR as a function of that parameter. This indirect approach exposes the insurer to model misspecification bias or estimation inefficiency, depending on whether the parameter is finite- or infinite-dimensional. In this paper, we link data and VaR directly via what we call a discrepancy function, and this leads naturally to a Gibbs posterior distribution for VaR that does not suffer from the aforementioned biases and inefficiencies. Asymptotic consistency and root-*n* concentration rate of the Gibbs posterior are established, and simulations highlight its superior finite-sample performance compared to other approaches.
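The general shape of a Gibbs posterior for a quantile such as VaR can be sketched on a grid. This example takes the check (pinball) loss as the discrepancy function, a flat prior, and a learning rate `omega` of 1; all three are assumptions of the sketch rather than details confirmed by the abstract, and the data are synthetic.

```python
import math


def check_loss(x, theta, level):
    """Check (pinball) loss; its expected value is minimized by the `level` quantile."""
    u = x - theta
    return u * (level - (1.0 if u < 0 else 0.0))


def gibbs_posterior(data, grid, level, omega=1.0):
    """Gibbs posterior weights on `grid` under a flat prior.

    The weight at theta is proportional to exp(-omega * sum_i loss(x_i, theta)).
    """
    risks = [sum(check_loss(x, t, level) for x in data) for t in grid]
    best = min(risks)  # subtract the minimum for numerical stability
    weights = [math.exp(-omega * (r - best)) for r in risks]
    total = sum(weights)
    return [w / total for w in weights]


losses = [float(k) for k in range(1, 101)]  # synthetic loss data
grid = [float(k) for k in range(1, 101)]    # candidate VaR values
posterior = gibbs_posterior(losses, grid, level=0.95)
mode = grid[max(range(len(grid)), key=posterior.__getitem__)]
```

No probability model for the losses is specified anywhere in the sketch: the data enter only through the loss, which is the sense in which the link between data and VaR is direct.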

Under Review Since : 2018-09-02

We prove an observation of Makkai that FOLDS equivalence coincides with homotopy equivalence in the case of semi-simplicial sets.

Under Review Since : 2018-09-01

The Art of The Election: A Social Media History of the 2016 Presidential Race

Abstract

The book is 700 pages long. It comprises Donald Trump’s tweets from June 2015 to November 2016 and footnotes, covering 70-80% of the tweets, which explain the context of each tweet. The book has a 100-page bibliography.

It is highly likely that Trump would not have been elected President were it not for social media. This is an unprecedented statement. This is the first time a presidential candidate utilized a social network to get his message out directly to voters, but moreover, to shape the media feedback loop. His tweets became news. This is primary source material on the 2016 election. No need for narratives, outside "experts" or political "science".

The file is too large to post on this website. But you can download the book under this link:

https://www.dropbox.com/s/bxvsh7eqh2ueq6j/Trump%20Book.docx?dl=0

Keywords and phrases: 2016, book, Trump, election, social media.

Under Review Since : 2018-09-01

We propose the first economic theory of value that actually works. We explain the evolutionary causes of trade, and demonstrate how goods have value from the evolutionary perspective and how this value is increased by trade. This "Darwinian" value of goods exists before humans assign monetary value (or any other value estimate) to traded goods. We propose an objective value estimate expressed in energy units.

Under Review Since : 2018-09-14

Inference on parameters within a given model is familiar, as is ranking different models for the purpose of selection. Less familiar, however, is the quantification of uncertainty about the models themselves. A Bayesian approach provides a posterior distribution for the model but it comes with no validity guarantees, and, therefore, is only suited for ranking and selection. In this paper, I will present an alternative way to view this model uncertainty problem, through the lens of a valid inferential model based on random sets and non-additive beliefs. Specifically, I will show that valid uncertainty quantification about a model is attainable within this framework in general, and highlight the benefits in a classical signal detection problem.

Under Review Since : 2018-09-17

I make the distinction between *academic* probabilities, which are not rooted in reality and thus have no tangible real-world meaning, and *real* probabilities, which attain a real-world meaning as the odds that the subject asserting the probabilities is forced to accept for a bet against the stated outcome. With this I discuss how the replication crisis can be resolved easily by requiring that probabilities published in the scientific literature are real, instead of academic. At present, all probabilities and derivatives that appear in published work, such as P-values, Bayes factors, confidence intervals, etc., are the result of academic probabilities, which are not useful for making meaningful assertions about the real world.

Under Review Since : 2018-08-30

Under Review Since : 2018-08-28

Under Review Since : 2019-03-26

I prove a connection between the logical framework for intuitive probabilistic reasoning (IPR) introduced by Crane (2017) and sets of imprecise probabilities. More specifically, this connection provides a straightforward interpretation to sets of imprecise probabilities as subjective credal states, giving a formal semantics for Crane's formal system of IPR. The main theorem establishes the IPR framework as a potential logical foundation for imprecise probability that is independent of the traditional probability calculus.

Under Review Since : 2018-09-29

Irune Orinuela's Spanish translation of https://www.researchers.one/article/2020-03-10