September 17, 2014

2:15 AM | Colon cancer, mathematical time travel, and questioning the sequential mutation model.
On Saturday, I arrived in Columbus, Ohio for the MBI Workshop on the Ecology and Evolution of Cancer. Today, our second day started. The meeting is an exciting combination of biology-minded mathematicians and computer scientists, and math-friendly biologists and clinicians. As is typical of workshops, the speakers of the first day had an agenda […]

Baker AM, Cereser B, Melton S, Fletcher AG, Rodriguez-Justo M, Tadrous PJ, Humphries A, Elia G, McDonald SA, Wright NA & Simons BD (2014). Quantification of crypt and stem cell evolution in the normal and neoplastic human colon. Cell Reports, 8 (4), 940-7. PMID:


September 16, 2014

11:20 PM | Yes, it's bear-in-the-pool hot
Recently, the hottest times of the year in LA are spring and fall. The September Southern California heat wave has sent at least one bear into a backyard swimming pool. Sunday afternoon, some Sierra Madre homeowners spied a sizable black bear lounging on the steps of their in-ground pool. The bear swam and rested for about 15 minutes before leaving like an unwanted party guest. It's hard to blame the wildlife. Temperatures in Sierra Madre hit 103 on Sunday and 100 on Monday, according to […]
5:05 PM | Random walk, Dirichlet problem, and Gaussian free field
This post is devoted to the discrete Dirichlet problem and Gaussian free field, involving the symmetric nearest neighbors random walk on \( {\mathbb{Z}^d} \). The discrete setting lets us get to the concepts without superfluous abstraction. Symmetric Nearest Neighbors random walk. The symmetric nearest neighbors random walk on \( {\mathbb{Z}^d} \) is the sequence of […]
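As a rough illustration of the walk named in this excerpt, here is a minimal Python sketch. It assumes the usual definition, in which each step moves to one of the 2d nearest neighbors of the current site with equal probability; the function name `nearest_neighbor_walk` and its parameters are hypothetical and do not come from the original post.

```python
import random


def nearest_neighbor_walk(d, n_steps, seed=None):
    """Simulate a symmetric nearest-neighbor random walk on Z^d from the origin.

    Assumption (not from the post): each of the 2d neighbors is chosen
    with probability 1/(2d) at every step.
    """
    rng = random.Random(seed)
    position = [0] * d
    path = [tuple(position)]
    for _ in range(n_steps):
        axis = rng.randrange(d)                 # pick one of the d coordinates
        position[axis] += rng.choice((-1, 1))   # move one unit along it
        path.append(tuple(position))
    return path


if __name__ == "__main__":
    # Print the first few sites visited by a walk on Z^2.
    print(nearest_neighbor_walk(2, 5, seed=0))
```

Choosing the axis uniformly and then the sign uniformly gives each neighbor probability 1/(2d), which is what "symmetric" is taken to mean here.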
2:01 PM | Fermat’s principle of least time and Snell’s law
References: Tom Lancaster and Stephen J. Blundell, Quantum Field Theory for the Gifted Amateur, (Oxford University Press, 2014) – Problem 1.1 One of the guiding principles of quantum field theory is that a particle travelling between two points actually traverses all possible paths between these two points, although with varying probabilities for different paths. Although […]
1:34 PM | They know my email but they don’t know me
This came (unsolicited) in the inbox today (actually, two months ago; we’re on a delay, as you’re probably aware), subject line “From PWC – animations of CEO opinions for 2014”: Good afternoon, I wanted to see if the data my colleague David sent to you was of any interest. I have attached here additional animated […] The post They know my email but they don’t know me appeared first on Statistical Modeling, Causal Inference, and Social Science.
1:00 PM | Wondering where the numbers come from -- Rotten Tomatoes
A while back I was taking one of my random walks through Wikipedia and I came across the movie Postal. For some forgotten reason (possibly to see what the critics had to say about Dave Foley, J.K. Simmons or Zack Ward, all interesting actors), I clicked on the link for Rotten Tomatoes. The movie had a perfect 0% among top critics, but I noticed Peter Hartlaub of the San Francisco Chronicle had a rather kind blurb. If this movie had been made by an unknown young director, a lot of critics would […]
12:22 PM | Christian Rudder’s Dataclysm
Here’s what I’ve spent the last couple of days doing: alternately reading Christian Rudder’s new book Dataclysm and proofreading a report by AAPOR which discusses the benefits, dangers, and ethics of using big data, which is mostly “found” data originally meant for some other purpose, as a replacement for public surveys, with their carefully constructed data […]
12:15 PM | Don't data puke, says Avinash Kaushik
Here are five amazing recommendations by Avinash Kaushik from a post about how to make Web analytics dashboards better by simplifying. Dashboards are not reports. Don't data puke. Include insights. Include recommendations for actions. Include business impact. NEVER leave data interpretation to the executives (let them opine on your recommendations for actions with benefit of their wisdom and awareness of business strategy). When it comes to key performance indicators, segments and your […]
11:11 AM | Hardware Based Stochastic Gradient Descent
Starting at 52 minutes and 10 seconds of this video, Ali Rahimi explains how one can use a noisy evaluation of the gradient in a gradient descent algorithm in order to minimize a convex function (most relaxations featured in current Advanced Matrix Factorization techniques and some instances of Compressive Sensing fall in that category). The interesting part of this is that the noise comes from the errors made by the, now stochastic, Floating Point Unit. From the paper […]
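The idea of descending along a noisy gradient can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the noise is simulated with Gaussian draws rather than coming from a stochastic floating point unit, the objective f(x) = (x - 3)^2 is a made-up convex example, and the name `noisy_gradient_descent` and the 0.5 / (k + 1) step schedule are hypothetical choices, not taken from the talk or the paper.

```python
import random


def noisy_gradient_descent(grad, x0, n_steps=2000, noise_std=0.1, seed=0):
    """Gradient descent where every gradient evaluation is corrupted by noise.

    Assumption: zero-mean Gaussian noise stands in for hardware errors.
    """
    rng = random.Random(seed)
    x = x0
    for k in range(n_steps):
        noisy_grad = grad(x) + rng.gauss(0.0, noise_std)
        # A decreasing step size averages the noise out over time.
        x -= 0.5 / (k + 1) * noisy_grad
    return x


# Hypothetical convex objective f(x) = (x - 3)^2 with gradient 2 (x - 3).
x_min = noisy_gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)

if __name__ == "__main__":
    print(x_min)  # should land close to the true minimizer x = 3
```

The step schedule is tuned to this particular quadratic; for other convex functions a different decreasing schedule would likely be needed.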
6:20 AM | Linkage
Unexpected shapes in smoke plumes, as photographed by Thomas Herbrich (G+); ISAAC 2014 and COCOA 2014 accepted paper lists (G+); FOCS 2014 program and best paper winners (G+); Kinetic sculpture made of wooden balls on threads, with some extensive software simulation behind its design (G+); How a 19th century math genius taught us the best way to hold a pizza slice, or, a practical application of the theorem that when a flat surface is embedded in 3d, it remains flat in at least one direction […]
2:40 AM | Exploring Climate Data (Part 2)
guest post by Blake Pollard I have been learning to make animations using R. This is an animation of the profile of the surface air temperature at the equator. So, the x axis here is the longitude, approximately from 120° E to 280° E. I pulled the data from the region that Graham Jones specified […]

September 15, 2014

6:22 PM | Disquisitiones Arithmeticae
It's time to read Gauss's original work. More later.
6:18 PM | An analytic formula for the median.
Three observations get you there: (1) min {a,b,c} = −max {−a, −b, −c}; (2) second from top {a,b,c,d,e} = max ( {a,b,c,d,e} without max({a,b,c,d,e}) ); (3) max {a,b,c} ~ log_t (t^a + t^b + t^c), t→∞. Putting these three together you can make a continuous formula approximating the median. Just subtract off the ends until you get to the middle. It’s an ugly expression. But, now you also have a way to view the sort operation, which is discontinuous, in a “continuous” […]
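For three numbers, "subtracting off the ends" can be tried numerically. The sketch below is a simplification under stated assumptions, not the post's full construction: it approximates max with log_t of a sum of powers, gets min from the negation identity, and takes the median of three values as their sum minus the smooth max and the smooth min. The function names and the default t = 1e6 are hypothetical.

```python
import math


def smooth_max(values, t=1e6):
    """Approximate max(values) by log_t(sum of t**v); tighter as t grows."""
    return math.log(sum(t ** v for v in values), t)


def smooth_min(values, t=1e6):
    """min{a, b, c} = -max{-a, -b, -c}, using the smooth max above."""
    return -smooth_max([-v for v in values], t)


def smooth_median3(a, b, c, t=1e6):
    """Median of three values: subtract both ends from the total."""
    values = [a, b, c]
    return a + b + c - smooth_max(values, t) - smooth_min(values, t)


if __name__ == "__main__":
    print(smooth_median3(1.0, 3.0, 2.0))  # close to 2
```

Large inputs combined with a large t can overflow a float in `t ** v`, so this version is only meant for small values.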
3:30 PM | Accelerating Random Forests in Scikit-Learn
Accelerating Random Forests in Scikit-Learn by Gilles Louppe
3:22 PM | More bad news for the buggy-whip manufacturers
In a news article regarding difficulties in using panel surveys to measure the unemployment rate, David Leonhardt writes: The main factor is technology. It’s a major cause of today’s response-rate problems – but it’s also the solution. For decades, survey research has revolved around the telephone, and it’s worked very well. But Americans’ relationship with […] The post More bad news for the buggy-whip manufacturers appeared first on Statistical Modeling, Causal Inference, and […]
3:00 PM | CSjob: Associate professor in signal processing with special emphasis on compressive sensing, Aalborg, Denmark
Found on the interweb: Associate professor in signal processing with special emphasis on compressive sensing (42102) At the Faculty of Engineering and Science, Department of Electronic Systems, Signal and Informations Processing section (SIP), a position as associate professor in signal processing with special emphasis on compressive sensing is open for appointment from 1 October 2014 or soon hereafter. The position is for a period of four years. The Department of Electronic Systems is […]
1:00 PM | On deck this week
Mon: More bad news for the buggy-whip manufacturers Tues: They know my email but they don’t know me Wed: What do you do to visualize uncertainty? Thurs: Sokal: “science is not merely a bag of clever tricks . . . Rather, the natural sciences are nothing more or less than one particular application — albeit […] The post On deck this week appeared first on Statistical Modeling, Causal Inference, and Social Science.
1:00 PM | Shifting alliances
I'm not sure what the general lessons of the Zephyr Teachout campaign are. I'll leave it to the real political scientists to debate whether her performance should be judged in relative or absolute terms. One area I will weigh in on, however, (or at least point out) is how much the alignment of the education reform movement has changed recently. 2010 was something of an inflection point in the education reform movement (Here's a Kindle single of posts from that year -- Things I saw at […]
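12:36 PM | [Okinawa Trip 04] Hotel Review – Terrace Garden Mihama Resort
A review of Terrace Garden Mihama Resort, the place where I spent my first night in Okinawa. Tourist spots near the hotel include Sunset Beach and American Village. To start with, the exterior looks rather shabby, as shown below: it is the white building visible on the right. The interior, however, was considerably more than I had expected … Continue reading →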
12:11 PM | My dataviz workshop starts in October
Course announcement is on my sister blog. This course adopts the creative writing workshop structure to teach data visualization. Think of chart making as a craft. We emphasize the importance of sketches, revisions, feedback from readers, and a critical eye. It's limited enrollment.
11:14 AM | 1976 NBER-Census Time Series Conference
What a great blast from the past -- check out the program of the 1976 NBER-Census Time-Series Conference. (Thanks to Bill Wei for forwarding, via Hang Kim.) The 1976 conference was a pioneer in bridging time-series econometrics and statistics. Econometricians at the table included Zellner, Engle, Granger, Klein, Sims, Howrey, Wallis, Nelson, Sargent, Geweke, and Chow. Statisticians included Tukey, Durbin, Bloomfield, Cleveland, Watts, and Parzen. Wow! The 1976 conference also clearly […]
10:31 AM | Plotting Fun with ILNumerics and IronPython
Since the early days of IronPython, I have kept shifting one bullet point down on my ToDo list: * Evaluate options to use ILNumerics from IronPython. Several years ago, there were some attempts by ILNumerics users who successfully used ILNumerics from within IronPython. But despite our fascination for these attempts, we were not able to […]
7:09 AM | √(x²−1)(x²−k²).      x,k∈ℂ (actually just going over the unit...
√(x²−1)(x²−k²), k∈ℂ x² √(x²−1)(x²−k²).      x,k∈ℂ (actually just going over the unit circle, not all of ℂ) edit: hey, are these showing up as moving gif’s for you? code: require(animation) source(wegert.R) #where I define "plat" and "Z", standard for all Wegert plots i = complex(imaginary=1) hyperbola
5:00 AM | Thesis: Randomized Algorithms For Large-Scale Strongly Overdetermined Linear Regression Problems - Xiangrui Meng
RANDOMIZED ALGORITHMS FOR LARGE-SCALE STRONGLY OVER-DETERMINED LINEAR REGRESSION PROBLEMS. Xiangrui Meng. In the era of big data, distributed systems built on top of clusters of commodity hardware provide cheap and reliable storage and scalable data processing. With cheap storage, instead of storing only currently relevant data, most people choose to store as much data as possible, expecting that its value can be extracted later. In this way, exabytes (10^18) of data are being created on a daily […]

September 14, 2014

8:06 PM | Regression with Python, pandas and StatsModels
I was at Boston Data-Con 2014 this morning, which was a great event. The organizer, John Verostek, seems to have organized this three-day event single-handedly, so I am hugely impressed. Imran Malek started the day with a very nice iPython tutorial. The description is here, and his slides are here. He grabbed passenger data from the MBTA and generated heat maps showing the number of passengers at each stop in the system during each hour. The tutorial covered a good range […]
1:30 PM | Six quotes from Kaiser Fung
You may think you have all of the data. You don’t. One of the biggest myths of Big Data is that data alone produce complete answers. Their “data” have done no arguing; it is the humans who are making this claim. Before getting into the methodological issues, one needs to ask the most basic question. […] The post Six quotes from Kaiser Fung appeared first on Statistical Modeling, Causal Inference, and Social Science.
3:30 AM | Defining empathy, sympathy, and compassion
When discussing the evolution of cooperation, questions about empathy, sympathy, and compassion are often close to mind. In my computational work, I used to operationalize-away these emotive concepts and replace them with a simple number like the proportion of cooperative interactions. This is all well and good if I want to confine myself to a […]
1:02 AM | Measuring Countersink Diameter Using Gage Balls
Introduction I am still working through some examples of using gage balls for machine shop work. The following reference on Google Books has great information on using gage balls (Figure 1) in measuring the characteristics of a countersink and I … Continue reading →
12:40 AM | Prevent errors or fix errors
The other day I was driving by our veterinarian’s office and saw that the marquee said something like “Prevention is less expensive than treatment.” That’s often true, but certainly not always. This evening I ran across a couple lines from Ed Catmull that are more accurate than the vet’s quote. Do not fall for the […]