Netstat -vat by Sean Michael Kerner

A command line view of IT



Mozilla Data Project Is Not a Good Idea

I'm a fan of Michael Arrington and his work at TechCrunch, though I disagree with his assessment of Mozilla's new secret 'Data' effort.

The plan is basically to collect data from Firefox users (who opt in) in order to provide a data set on site popularity and user trends. It's an interesting idea and one that might help Mozilla, but IMHO it's not a good one for the broader marketplace for a few reasons.

1) The data will always be biased because it will cover only Firefox users.
2) 'Hackers' will try to do 'bad things' with the data, which could well expose personally identifiable information (sure, Mozilla would do its best to protect users, but the point is that it would be creating a new potential attack vector).
3) More data isn't always better. Every web server in existence has some form of log system that accurately measures real traffic. Adding yet another statistics system only confuses an already confused marketplace.
4) A user's 'History' file already tracks the data (though it doesn't publish it publicly...).

I personally like what Red Hat's Fedora project is doing with user statistics. Fedora (by way of its Smolt technology) tracks how many IP addresses actually connect to Fedora update servers. With that data, Fedora knows how many 'active' Fedora installations it has.
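
As a rough sketch of that idea (and not Fedora's actual implementation), counting 'active' installations from update-server traffic could amount to counting the distinct client IP addresses in an access log. The log format, the sample lines, and the function name `count_active_installs` are all assumptions made up for illustration.

```python
# Hypothetical sketch: estimate "active" installations by counting the
# distinct client IPs that contacted an update server.
# Assumes each access-log line begins with the client IP (common log format).

def count_active_installs(log_lines):
    """Return the number of distinct client IPs seen in the log lines."""
    seen = set()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            seen.add(fields[0])  # first field is the client IP in this assumed format
    return len(seen)


# Made-up sample data using documentation-reserved IP ranges
sample_log = [
    '192.0.2.10 - - [01/Jul/2008:10:00:00 -0400] "GET /updates/repodata HTTP/1.1" 200 512',
    '192.0.2.10 - - [02/Jul/2008:10:00:00 -0400] "GET /updates/repodata HTTP/1.1" 200 512',
    '198.51.100.7 - - [02/Jul/2008:11:30:00 -0400] "GET /updates/repodata HTTP/1.1" 200 512',
    '',
]

print(count_active_installs(sample_log))  # 2
```

A count like this is only an approximation: several machines behind one NAT gateway show up as a single address, and one machine on a dynamic address can show up as several.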

How many active Firefox installations are there? Sure, we know how many downloads there have been, but wouldn't it be great to have a real number for users too?
**UPDATE 5:41 PM EDT - I'm wrong on the Firefox installations issue. Mozilla's Asa Dotzler commented below (thanks Asa!!) that Mozilla does have stats on this now and that current users number about 170 million.**

So YES, getting stats is a good thing. And YES, Mozilla Data will be a solid effort at understanding what Firefox users may be doing. But NO, I will not participate myself, and while I'll comment on their Data (when it's available), I'll always take it with a grain of salt.

2 Comments

Asa Dotzler said:

Sean, we do know about how many Firefox users there are.

We've already done some modeling around our adoption funnel (acquisition, trial and adoption) with the Funnel Cake project http://blog.mozilla.com/metrics/2007/11/02/firefox%E2%80%99s-funnel-factor/ and we monitor the daily update pings that every active Firefox client sends to find out whether or not it needs to fetch a security update.

Looking at those data sets and comparing with some of the data we get from third parties, we've been able to pretty accurately estimate our total usage.

Right now that number is about 170 million users worldwide, or 3x our daily update pings.

- A

Mike Shaver said:

Sean,

I'm not sure quite how to reconcile your concern that the data will only be from Firefox users (and therefore too narrow) with the belief that a single site's data is sufficient -- if 170M users of Firefox is too narrow a data set for your tastes, then I would expect that very few sites would be able to get interesting-to-you data from their own logs.

Any data collection system that is even the slightest bit respectful of user privacy decisions is going to be skewed; certainly the statistics that people use to make decisions today, purchased from the handful of companies who have usefully-large sample sizes, are skewed as well. Taking these statistics with a grain of salt is very wise, and you should definitely do so with whatever data Mozilla gathers and releases as well. What distinguishes Mozilla's nascent plans from the current state of the art, though, is that we will be much much more transparent about how the data was collected, and likely provide much better access to the raw data sets. You'll know better how much salt to apply, and to which portions of the meal, so to speak.

We're just starting to think and talk about this, and you can read more on John's latest post: http://john.jubjubs.net/2008/05/13/mozilla-firefox-data/ .

What would make the data useful to you? What are the strictures you'd want in place to participate? That would be very interesting to know, at least for me. :)
