# Posts

### October 05, 2015

+

When you start doing more advanced sports analytics, you'll eventually start working with what are known as hierarchical, nested, or mixed effects models. These are models that contain both fixed and random effects. There are multiple ways of defining fixed vs random effects, but one useful way is that members of a random effect are pooled together to better estimate the group variance, assuming that it's reasonable to pool them. In sports one example might be the […]
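
As a rough illustration of the pooling idea in the excerpt above, here is a minimal sketch of partial pooling under a normal-normal model. It is not the original post's method: the function name `partial_pool`, the player averages, and the two variance values are all hypothetical, and the variances are assumed known, whereas a real mixed effects fit would estimate them from the data.

```python
def partial_pool(group_means, group_sizes, within_var, between_var):
    """Shrink each group's mean toward the grand mean (normal-normal model).

    within_var and between_var are assumed known here (an assumption for
    illustration); a mixed effects model would estimate them from the data.
    """
    total = sum(group_sizes)
    grand_mean = sum(m * n for m, n in zip(group_means, group_sizes)) / total
    pooled = []
    for m, n in zip(group_means, group_sizes):
        # The weight on a group's own data grows with its sample size.
        weight = (n / within_var) / (n / within_var + 1.0 / between_var)
        pooled.append(weight * m + (1.0 - weight) * grand_mean)
    return grand_mean, pooled


# Hypothetical per-player scoring averages and games played.
means = [30.0, 20.0, 10.0]
games = [2, 20, 200]
grand, estimates = partial_pool(means, games, within_var=25.0, between_var=4.0)
```

The player with only two games is pulled most of the way toward the grand mean, while the player with 200 games barely moves. That is the sense in which members of a random effect borrow strength from each other.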

+

2:00 AM | Our favourite bogus poll

It’s time for Forest & Bird’s Bird of the Year competition. As with any bogus poll, we won’t learn what the true popularity of the various NZ birds actually is. As long as it’s clear that bogus polls are being used for entertainment and advertising, not to collect information, there isn’t a statistical problem with them.

### October 04, 2015

+

Principal-components regression (PCR) is routine in applied time-series econometrics. Why so much PCR, and so little ridge regression? Ridge and PCR are both shrinkage procedures involving PCs. The difference is that ridge effectively includes all PCs and shrinks according to the sizes of the eigenvalues associated with the PCs, whereas PCR effectively shrinks some PCs completely to zero (those not included) and doesn't shrink others at all (those included). Ridge seems to resonate as more
[…]
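
The contrast between the two shrinkage patterns can be sketched in a few lines. This is an illustration under stated assumptions, not code from the original post: the eigenvalues are made up, and the names `ridge_factors`, `pcr_factors`, `penalty`, and `n_keep` are hypothetical. Each factor is the multiplier applied to the least-squares coefficient on a given principal component, with ridge using the standard form eigenvalue / (eigenvalue + penalty).

```python
def ridge_factors(eigenvalues, penalty):
    """Ridge multiplies the coefficient on each PC by ev / (ev + penalty)."""
    return [ev / (ev + penalty) for ev in eigenvalues]


def pcr_factors(eigenvalues, n_keep):
    """PCR keeps the n_keep largest-eigenvalue PCs unshrunk and zeroes the rest."""
    ranked = sorted(range(len(eigenvalues)),
                    key=lambda i: eigenvalues[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [1.0 if i in kept else 0.0 for i in range(len(eigenvalues))]


# Hypothetical eigenvalues of X'X, largest first.
eigenvalues = [9.0, 4.0, 1.0, 0.25]
ridge = ridge_factors(eigenvalues, penalty=1.0)
pcr = pcr_factors(eigenvalues, n_keep=2)
```

Ridge produces a smooth sequence of factors strictly between 0 and 1 that decreases with the eigenvalue, whereas PCR produces a hard 1-or-0 cutoff.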

+

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher. Here’s how it works: Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday […]

+

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

+

Political scientist Brian Silver points me to this post by economist Paul Romer, who writes: The style that I [Romer] am calling mathiness lets academic politics masquerade as science. Like mathematical theory, mathiness uses a mixture of words and symbols, but instead of making tight links, it leaves ample room for slippage between statements in […]
The post Flamebait: “Mathiness” in economics and political science appeared first on Statistical Modeling, Causal Inference, and
[…]

+

This week, I was making a presentation for Company X and needed to keep abreast of the recent cost per genome figure computed at genome.gov. Well, thanks to a retweet by Michael, it looks like October 1 saw the Second Inflection point we've been waiting for. Time to change the slides and time to think about The Important Things after Commodity Sequencing. Let us note that this cost figure still doesn't seem to include either Pacific BioSciences RSII or Oxford Nanopore technology costs. Related:
[…]

### October 03, 2015

+

11:16 PM | Why shallow minors matter for graph drawing

An ongoing concern in graph drawing research has been curve complexity. If you draw a graph using a certain style, how complicated are you going to have to make the edges? More complicated curves are harder for readers to follow, and therefore they make the graph less readable. But simpler curves (such as line segments) may have their own problems: not fitting the style (which may constrain the edges to certain directions), running through vertices, forming sharp angles with each other, etc. To
[…]

+

Team Ratings at 04 October

The basic method is described on my Department home page. Here are the team ratings prior to 04 October along with the ratings at the start of the Rugby World Cup.

| Team | Rating at 04 October | Rating at RWC Start | Difference |
| --- | --- | --- | --- |
| New Zealand | 26.72 | 29.01 | -2.30 |
| South Africa | 22.16 | 22.73 | -0.60 |

[…]

+

10:06 PM | Psychic meerkats and organic antioxidants

From the Independent, which used to be the sort of paper that knew better: With a 100 per cent record up to this point – they predicted England would beat Fiji but lose to Wales – it seems that the meerkats might have some genuine psychic abilities. Even if they do, that doesn’t explain why they […]

+

7:59 PM | Field Statistics

Yesterday I learned something interesting from a talk given by Professor Bikas K Sinha. The following is an excerpt from the reference [1], which exactly shows the interesting point of the problem. “A population consisting of an unknown number of distinct species is searched by selecting one member at a time. No a priori information is available concerning […]

+

5:00 PM | Data analysis vs statistics

John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why. My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does not speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe has commented on my use of […]

+

Reference: Hobson, M.P., Efstathiou, G. P. & Lasenby, A. N. (2006), General Relativity: An Introduction for Physicists; Cambridge University Press. Problem 2.4. As another example of finding the metric tensor in a new coordinate system, consider the stereographic projection of a sphere of radius (problem 2.4 in Hobson’s book should begin “Consider the surface of […]

+

Ed Green writes: I have fitted 5 models in Stan and computed WAIC and its standard error for each. The standard errors are all roughly the same (all between 209 and 213). If WAIC_1 is within one standard error (of WAIC_1) of WAIC_2, is it fair to say that WAIC is inconclusive? My reply: No, […]
The post Comparing Waic (or loo, or any other predictive error measure) appeared first on Statistical Modeling, Causal Inference, and Social Science.

+

The attractions of the Nuits Blanches in Paris can be found here, those of Brussels here, and those of Toronto here. If, on the other hand, you are interested in the deluge of data that exceeds the number of stars in the universe, and in how to use it to better understand the world, welcome: you have come to the right place. For your information, there have been Machine Learning meetups in Paris for more than two years. One of the members of the Machine Learning community, Samim
[…]

+

12:01 PM | Aunt Pythia’s advice

Readers, so glad to be back with you this week, and many apologies for missing last week, but I was arranging my yarn collection. I’m back now, though, and reading interesting articles about the real life of a sex worker (not arousing, as it turns out) and recording my weekly Slate Money podcast (I’m particularly […]

+

Here is a video on Gaussian Processes by Neil Lawrence from the excellent CVTalks site.
[…]

+

4:55 AM | Occam's Razor

This is Joseph. I know Mark is often lukewarm about Kevin Drum. But he has been asking some pretty good questions lately, and I think this one about Hillary Clinton is quite good. I find, in practice, that people tend to assume that others know about the things that they do. Many of my most geeky friends who are good with coding also tend to not be the biggest fans of the Clintons. I wonder if much of the email and server issues are about what they could know that one […]

### October 02, 2015

+

Bill Gillespie, of Metrum, is giving a tutorial next week at ACoP: Getting Started with Bayesian PK/PD Modeling Using Stan: Practical use of Stan and R for PK/PD applications. Thursday 8 October 2015, 8 AM to 5 PM, Crystal City, VA. This is super cool for us, because Bill’s not one of our core developers […]
The post Stan PK/PD Tutorial at the American Conference on Pharmacometrics, 8 Oct 2015 appeared first on Statistical Modeling, Causal Inference, and Social Science.

+

Earlier articles:
Introduction
Common features
Page 1 (numerals)
Page 2 (arithmetic)
Page 3 (exponents)
Page 4 (algebra)
Page 5 (geometry)
Page 6 (chemistry)
Page 7 (mass)
Page 8 (time and space)
Page 9 (physical units)
Page 10 (temperature)
Page 11 (solar system)
Page 12 (Earth-Moon system)
Page 13 (days, months, and years)
Page 14 (terrain)
Page 15 (human anatomy)
Page 16 (vital statistics)
This is page 17 of the Cosmic Call message. An explanation follows.
The 10 digits are:
[…]

+

If you missed it the first time around, here’s a link to: Stan Puzzle 1: Inferring Ability from Streaks. First, a hat-tip to Mike, who posted the correct answer as a comment. So as not to spoil the surprise for everyone else, Michael Betancourt (different Mike) emailed me the answer right away (as he always […]
The post Solution to Stan Puzzle 1: Inferring Ability from Streaks appeared first on Statistical Modeling, Causal Inference, and Social Science.

+

4:22 PM | Punk Rock OR goes to Oberlin College

This week I visited Oberlin College to deliver the Fuzzy Vance Lecture in Mathematics. I was honored to be the 20th Fuzzy Vance lecturer. Each year, Oberlin invites one mathematician (or an operations researcher/fake mathematician in my case!) to visit campus, participate in classes, and give a lecture (the “Fuzzy Vance Lecture”) to the general public. My evening […]

+

(Photo: Western Washington University, in Bellingham, WA.) Devlin’s Angle for July 2006 was titled Letter to a calculus student. In it, I tried to describe, as briefly but as effectively as I could, the deep beauty there is in calculus, a beauty that arises from the depth of human brilliance that it took for the human mind to find a way to tame the infinite, and bend it to our use, a beauty made the more so by the enormous impact calculus has had on life on Earth. In my essay, I acknowledged […]

Editor's Pick

+

From the paper: "...In this paper, we tackle these scalability bottlenecks by focusing on what embeddings are actually used for: computing ℓ2-based pairwise similarity metrics typically used for supervised or unsupervised learning. For example, K-means clustering uses pairwise Euclidean distances, and SVM-based classification uses pairwise inner products. We therefore ask the following question: “Is it possible to compute an embedding which captures the pairwise euclidean distances between
[…]

+

2:13 PM | MPM2D - Day 18: Quiz #2

There was a little bit of anxiety in the room at the beginning of class today as it is quiz day. I hate that my students get stressed about a quiz, but I understand it. After returning homework set 14, which I had corrected last night, I gave them 15 minutes to talk to each other about any lingering issues around solving by substitution or elimination. I also answered questions one-on-one. Before starting the quiz I warned them that the last question was unlike any they had done, but that they
[…]

+

Actually the course is called Statistical Communication and Graphics, but I was griping about how few students were taking the class, and someone suggested the title Communicating Data and Statistics as being a bit more appealing. So I’ll go with that for now. I love love love this class and everything that’s come from it […]
The post Syllabus for my course on Communicating Data and Statistics appeared first on Statistical Modeling, Causal Inference, and Social Science.

+

A rocket scientist I know once observed the fundamental similarity between LA's two defining industries: entertainment and aerospace. Both exist from deal to deal, lining up risky, hugely expensive projects. Each of these projects requires much of the work to be done from scratch, so much so that it's often like setting up a new business every time a deal goes through. There are, of course, limits to the analogy. In business terms, perhaps the biggest is the nature of the customer. For all […]

+

12:49 PM | The tricky thing about disparate impact

Today I’m fascinated by the story described in this three-part American Banker series on the Consumer Financial Protection Bureau’s (CFPB’s) use of disparate impact, written by Rachel Witkowski. Disparate impact, according to the article, is a legal theory that says lenders can be penalized if they have a neutral policy that creates an adverse impact against […]

+

11:32 AM | M27 (Dumbbell Nebula)

M27 (also known as the Dumbbell Nebula because of its shape) is a planetary nebula in the constellation Vulpecula, just north of the bright star Altair. My photo: Photo location: Monifieth (near Dundee), Scotland, UK. Date: 1 Oct 2015; 21:00 UTC. Telescope: 11-inch Celestron SCT. Camera: Pentax K3. Exposure: ISO 800, 40 30-second exposures stacked […]

+

7:54 AM | An algorithm isn't "just code"

I've been talking to many people about algorithmic fairness of late, and I've realized that at the core of pushback against algorithmic bias ("algorithms are just math! If the code is biased, just look at it and you can fix it!") is a deep misunderstanding of the nature of learning algorithms, and how they differ fundamentally from the traditional idea of an algorithm as "a finite set of well-defined elementary instructions that take an input and produce an output". This misunderstanding is […]