Monday, February 13, 2012

Funnel plot, bar plot and R

I finished my Omniture SiteCatalyst training in McLean, VA a few days ago. It was OK (somewhat boring): we only went through how to click buttons inside SiteCatalyst to generate reports, not how to implement it so that it tracks the information we want to track.

I got two impressions out of the class: one is that Omniture is a great and powerful web analytics tool; the other is that its funnel plots can be misleading from a data visualization perspective. For example, regardless of why the second event, 'Reg Form Viewed', has a higher frequency than the first event, 'Web Landing Viewed', the funnel bar for the second event is still narrower than the one for the first event, just because it is designed to be the second stage in the funnel report.

This is a typical example of visualization components that do not match the numbers. Other types of funnel plots can be misleading as well, as Jon Peltier pointed out in his blog article. I totally agree with him on using a simple bar plot as an alternative to the funnel plot. I also like his idea of adding another plot to visualize a small yet important metric, like purchases in his example.

Then I turned to R to do some quick poking around on how to turn the misleading funnel I have here into something meaningful and hopefully beautiful. Since I always feel like I don't have a good grasp of how to do bar plots in R, this is going to be a good exercise for me.

As always, figuring out the three-letter parameters for the base package plot functions is painful. I also had to set up margins of an appropriate size so that my category names would not be cut off.
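For the margin part, here is a minimal sketch of one way to do it with `par(mar = ...)`. The category names, the counts, and the margin values are made up for illustration; they are not the settings I used for the plot in this post.

```r
# Hypothetical data: long category names with made-up counts.
counts <- c(first_long_category_name = 10,
            second_long_category_name = 25,
            third_long_category_name = 15)

# mar = c(bottom, left, top, right), measured in lines of text.
# A wide left margin (12 here, an arbitrary guess) leaves room for the labels.
old_par <- par(mar = c(2, 12, 4, 2))

# las = 1 keeps the category labels horizontal; barplot returns the bar midpoints.
bar_mids <- barplot(counts, horiz = TRUE, las = 1, col = 'gray70')

# Restore the previous margins so later plots are not affected.
par(old_par)
```

The left margin usually needs a bit of trial and error, since the right value depends on the label length and the text size.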

Then I drew the same plot using ggplot2. All the command names make sense, and the plot is built up layer by layer. However, I did not manage to get the x-axis to the top of the plot, which, as far as I can tell, would involve creating a new customized geom.
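For what it is worth, newer ggplot2 releases have a `position` argument on the continuous scales that may make the top axis possible without a custom geom. The sketch below assumes ggplot2 2.2.0 or newer (for `geom_col` and `position`); it is an illustration under that assumption, not what I ran for this post.

```r
# Assumption: ggplot2 2.2.0 or newer is installed.
library(ggplot2)

# Same counts as in this post, in funnel order.
funnel <- data.frame(
  metric = c('web_landing_viewed', 'reg_form_viewed', 'registration_complete',
             'download_viewed', 'download_clicked'),
  value = c(234, 334, 82, 208, 68)
)
# Reverse the factor levels so the first funnel stage ends up on top after coord_flip().
funnel$metric <- factor(funnel$metric, levels = rev(funnel$metric))

p <- ggplot(funnel, aes(metric, value)) +
  geom_col(fill = 'grey50') +
  coord_flip() +
  # "right" refers to the unflipped y axis; after coord_flip() it is drawn on top.
  scale_y_continuous(position = 'right') +
  xlab('') + ylab('')
print(p)
```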

There are some nice R bar chart tips on the web, for example on learning_r, stackoverflow, and the ggplot2 site. Anyway, this is what I used:

##### barchart

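# event counts for each funnel stage, in funnel order
dd = data.frame(cbind(234, 334, 82, 208, 68))
colnames(dd) = c('web_landing_viewed', 'reg_form_viewed', 'registration_complete', 'download_viewed', 'download_clicked')
# each stage as a percentage of the previous stage (the first stage is 100%)
dd_pct = round(unlist(c(1, dd[,2:5]/dd[,1:4]))*100, digits=0)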

# plain horizontal barchart
# set the margins first (via par) so the category labels fit in the plot
# las: orientation of the tick labels relative to the axes, values 0-3 (1 = always horizontal)
mp <- barplot(as.matrix(rev(dd)), horiz=TRUE, col='gray70', las=1, xaxt='n')
tot = paste(rev(dd_pct), '%')
# add percentage numbers just past the end of each bar
# unlist() turns the one-row data frame into a plain numeric vector of x positions
text(unlist(rev(dd))+17, mp, format(tot), xpd=TRUE, col='blue', cex=.65)
# axis on top (side=3); 'at' gives the tick locations; las: labels parallel or perpendicular to the axis
axis(side=3, at=seq(from=0, to=30, by=5)*10, las=0)

# with ggplot2
library(ggplot2)
dd2 = data.frame(metric=c('web_landing_viewed', 'reg_form_viewed', 'registration_complete', 'download_viewed', 'download_clicked'), value=c(234, 334, 82, 208, 68))
# keep the funnel order (first stage on top after coord_flip) instead of the default alphabetical order
dd2$metric = factor(dd2$metric, levels=rev(dd2$metric))

# dd_pct comes from the barchart section above
ggplot(dd2, aes(metric, value)) +
  geom_bar(stat='identity', fill='grey50') +
  coord_flip() + ylab('') + xlab('') +
  geom_errorbar(aes(ymin = value+10, ymax = value+10), size = 1) +
  geom_text(aes(y = value+20, label = paste(dd_pct, '%', sep=' ')), vjust = 0.5, size = 3.5)
