
Pere Villega

If at first you don’t succeed; call it version 1.0





Mining Massive Datasets

As part of my efforts to understand more of the data science world, I’ve joined another Coursera class: Mining Massive Datasets. First week done, how was it?

I have to say that, unfortunately, I found it discouraging for a developer. Don’t get me wrong, the course is fine and plenty of developers will love it. But my maths is very rusty; it’s been a long time since I had to do things like multiplying matrices. So the first week has been harder than expected: I can work out the MapReduce solutions with little effort, but iterating over PageRank…
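To give a flavour of what “iterating over PageRank” means, here is a minimal sketch in Python. It is my own illustration, not material from the course: the three-page graph, the function name `pagerank`, the damping factor of 0.85 and the fixed 50 iterations are all assumptions picked for the example. It also assumes every page has at least one outgoing link.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Power iteration for PageRank on a small graph.

    links maps each page to the list of pages it links to.
    Assumption: every page has at least one outgoing link (no dead ends).
    """
    pages = sorted(links)
    n = len(pages)
    # Start with the rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page first receives the "teleport" share.
        new_rank = {page: (1.0 - beta) / n for page in pages}
        # Then each page splits beta * its rank evenly among its out-links.
        for page, targets in links.items():
            share = beta * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank

    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
print(ranks)
```

Each pass of the loop is the same step as multiplying the rank vector by the link matrix, which is exactly the kind of exercise the quizzes ask you to do by hand.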

I guess that, personally, I prefer a hands-on approach, and the fact that this course is so theoretical puts me off a bit. One gets used to a quick feedback loop during development, and moving to the realm of pure theory is hard.

From what I understand, the whole course will follow the same pattern (videos and a quiz), with some coding exercises as optional deliverables that have no impact on your grade. I doubt many people take a MOOC for the grade (at least not the free ones), so most of the effort in the course will go into understanding purely theoretical approaches. I’m not denying their usefulness, but given my limited free time I may have to drop this course if things stay at the same level, since at my current level of data science knowledge I won’t benefit much from it.