I installed the “roasted marshmallows” version of R (2.15.1) from www.r-project.org, which went smoothly on my MacBook Pro running Lion. I was happy to find that its REPL ran easily on the command line in Terminal.

R is a free implementation of a dialect of the S language, the statistics and graphics environment for which John Chambers won the ACM Software System Award. S was consciously designed to blur the distinction between users and programmers. It is a high-level programming language, with similarities to Scheme and Python, and a good system for rapid development of statistical applications.

R fundamentals

R has amazing built-in primitives and libraries for what scientists like to do with data and incredible graphing options. I struggled to find good, simple resources to approach learning R as a programmer. I loved this tutorial. Here are my cliff notes:

The most fundamental objects are vectors, which are basically arrays whose indexes start at 1.
Names of objects are case sensitive.
Comments start with ‘#’.


n <- 25
v1 <- c(2, 5, 1, 9)          # create a vector by combining values
v2 <- numeric(4)             # initialize a vector of a specific length (with zeros)
v3 <- rep(3, 10)             # initialize a vector with a number: repeat 3, 10x
v4 <- 1:4                    # specify a vector using the range 1-4
v4 <- 4:1                    # you can count down: 4,3,2,1
v4 <- 3:-1                   # even to negative numbers: 3,2,1,0,-1
v4 <- seq(1, 1.5, by=.1)     # easy to generate a sequence of equally spaced non-integers
v5 <- c(v3, v4, 7)           # c will combine or concatenate vectors
v2[1] <- 4                   # use bracket notation to access (or set) an element
v2[1:3] <- 4:6               # you can even access (or set) a range

For those of us who already know it from math class (or computer graphics), vector math in R works the way you would expect:

  • when an operation involves a vector and a number, the number is used to modify each element of the vector as specified by the operation
  • when arithmetic involves two vectors of the same length, then the operation applies to elements with the same index.
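
A minimal sketch of the two rules above. The vectors a and b and their values are made up for illustration; they are not from the tutorial.

```r
# Hypothetical example values
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)

scaled  <- a * 2   # vector and number: each element is doubled -> 2 4 6 8
summed  <- a + b   # same-length vectors: element-by-element sum -> 11 22 33 44
product <- a * b   # also element by element (not a dot product) -> 10 40 90 160
```

Note that `*` between two vectors multiplies matching elements, which may differ from what you expect if you are thinking of dot products.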

Adding vectors of different lengths doesn’t really make sense in real life (although maybe there’s an application for that I don’t know about), but R conveniently defines that the shorter vector is repeated as often as needed to match the length of the longer vector.
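
Here is a small sketch of that recycling behavior, using made-up vectors. The second case shows what happens when the lengths don't divide evenly: R still recycles, but it also emits a warning, which is suppressed here to keep the output quiet.

```r
# Hypothetical example values
short <- c(1, 2)
long  <- c(10, 20, 30, 40)

recycled <- long + short   # short is reused as 1, 2, 1, 2 -> 11 22 31 42

# Length 3 is not a multiple of length 2: R recycles anyway and warns
uneven <- suppressWarnings(c(10, 20, 30) + c(1, 2))   # -> 11 22 31
```

Seen this way, the "vector and a number" rule is just recycling of a vector of length one.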

As in many languages and, more importantly, as in math, a function call is a name followed by its arguments enclosed in parentheses. Here are some common ones:
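
A quick sketch of a few built-in functions that take a vector as their argument. The selection is my own pick of base R functions, not necessarily the tutorial's list, and x reuses the values of v1 from earlier.

```r
x <- c(2, 5, 1, 9)   # same values as v1

length(x)   # number of elements -> 4
sum(x)      # -> 17
mean(x)     # -> 4.25
min(x)      # -> 1
max(x)      # -> 9
sort(x)     # -> 1 2 5 9
sqrt(x)     # applied to each element in turn
```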


By using parentheses for grouping, one can combine several function calls into a single expression. This one computes the sample variance of v2:

(sum(v2^2) - (sum(v2)^2)/length(v2)) / (length(v2) - 1)

A simpler way to get the same result would be to use the var function.


The standard deviation is computed by using the expression sqrt(var(v2))
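
As a sanity check, here is a sketch comparing the hand-written expression with the built-ins. It assumes v2 holds 4, 5, 6, 0, which is what the assignments earlier in the post leave in it. The sd function is not mentioned above, but it is part of base R and gives the standard deviation directly.

```r
v2 <- c(4, 5, 6, 0)   # the contents of v2 after v2[1:3] <- 4:6

by_hand  <- (sum(v2^2) - (sum(v2)^2) / length(v2)) / (length(v2) - 1)
built_in <- var(v2)

same_variance <- isTRUE(all.equal(by_hand, built_in))       # TRUE
same_sd       <- isTRUE(all.equal(sqrt(var(v2)), sd(v2)))   # TRUE
```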

Comparisons are cool.

> v6[v6 < 3]            # find all elements in a vector which are < 3
> length(v6[v6 < 3])    # count how many are < 3
[1] 2
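
To make the comparison idea concrete, here is a self-contained sketch. The values in v6 are hypothetical, chosen only so that two of them fall below 3.

```r
v6 <- c(1, 5, 2, 8)   # hypothetical values; two of them are below 3

is_small <- v6 < 3             # element-by-element -> TRUE FALSE TRUE FALSE
small    <- v6[is_small]       # logical indexing keeps the TRUE positions -> 1 2
count    <- length(small)      # -> 2
count2   <- sum(v6 < 3)        # TRUE counts as 1, so this is also 2
```

The comparison returns a logical vector of the same length as v6, and using that vector inside the brackets is what filters the elements.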

I’ve long observed that lean startup methodologies are a lot like agile development. When I explain it that way to developers, they get it pretty quickly, but the comparison doesn’t resonate with business folk. I’ve been searching for ways to explain why I believe market testing is so important before starting to build a software product.

While there are plenty of folk drinking the Lean Startup Kool-Aid, the vast majority still believe that the best software comes exclusively from entrepreneurial insight. It is like a divine inspiration, where seeking validation would somehow indicate that we are doubting our belief in God.

I’ve advocated validating market fit with clients who react with a seemingly genuine concern that I lack confidence in my own ability to define the feature set for a new product, or who tell me that if I’m afraid to move forward with my own opinions, they are happy to just tell me what to do. Given my experience and past success, they want to place a bet on a winning jockey, and are puzzled why this jockey won’t just get on the horse and ride fast.

I have confidence in my ability to turn a great vision into the reality of a software product. I understand that small details can make or break a product, and the magic of great software design and development is found in selecting the right details that reinforce a cohesive product approach, where every feature enhances and supports every other feature. A high-level feature set or set of use cases could be implemented in a hundred different ways, each of which would be good. I don’t assume that I (or anyone) could possibly know which of those will resonate best with a specific audience. I also know that every company, whether it’s a new startup or a Fortune 500, has a different opportunity for its go-to-market strategy, a different set of people that it could possibly reach. While perhaps the product could be used by “everyone,” it will be used first by the people who hear about it first, and those people can be discovered (and should be discovered) before you ever write a single line of code.

We have such a great opportunity these days to envision in a practical way what the product will be like after it is completed. We can so easily put that in front of people to see how they respond. When I first heard about the idea of an MVP landing page test, it felt so delightfully obvious that this was a great approach that I’ve struggled to explain it to people.

I think the problem is with the word “testing,” and even “validation” doesn’t hit the mark. I think it is just like test-driven development, where it isn’t obvious until you do it (a lot) that the goal of testing is not quality assurance. When we write tests first while developing, it gives us space and opportunity to focus on the design of the particular part of the code we are working on. These so-called market tests allow us to see how our product features hang together because we need to express them in words and pictures. Even if we never did an internet advertising campaign to determine the conversion rate and customer acquisition costs, we would still benefit from seeing the product as a whole.

We make so many choices at the beginning of a product. Why not let our prospective customers influence them? The key is to use our insights, our creativity and experience, to pick a couple of great options and then, instead of arguing with each other in a room, let the target customers be the tie-breaker. As a happy side effect, if we’re good, we have a group of people eagerly awaiting the results of our hard work.

Also, if you haven’t yet read Eric Ries’ book The Lean Startup, you should buy it now. (I get a referral fee when you buy a book from this post, but that’s not why you should buy it. I keep a stack in the office that I routinely give away.) Unlike Eric, I don’t think people are afraid of negative results. I think it’s just very, very hard to imagine, in a concrete way, what a product will be before it exists. It feels easier to create it first, then try to explain it, but it just feels hard because this methodology is new and different for most of us. And it feels hard because design is hard. I’m not talking about colors and fonts; I’m talking about form and function and getting to the elusive question of why someone would want to use your software at all.