Michael Gracie

Coursera Data Science Specialization: A Student’s Review

Coursera is a pure-play online education provider distributing classes in a wide variety of subjects, from The Music of the Beatles to Analyzing Global Trends for Business and Society. Many courses are offered in the native languages of those who developed them, such as Peking University’s Methodologies in Social Research, while others have been translated for more widespread use – see Yale’s Financial Markets instructed by Prof. Robert Shiller (I’m a fan).

Part of the Massive Open Online Course (or “MOOC”) site’s push is linking courses developed by accredited higher-learning institutions into specializations, series of classes designed to develop skills in a particular field. There are seven specializations as of this writing, and the one I dove into is called Data Science.

Made up of nine segments created by Johns Hopkins University’s Bloomberg School of Public Health – taught by Professor Brian Caffo and Assistant Professors Roger Peng and Jeff Leek – it’s a half-year commitment for Energizer bunnies with a math/programming bent, and probably twelve to eighteen months if right this moment you’re distracted by your Instagram feed. Ok, maybe twenty-four to thirty-six.

Yours truly took the fast track, doubling and tripling up on classes at the outset, leaving the purportedly hard stuff for the windup to winter solstice. What follows is a summary of each class, including comparison to what was “sold”, tips for getting the most from them i.e. scoring well and then some, as well as supplementary materials discovered that turned out worthwhile. It’s the truth within from a dedicated student’s point of view, and a long road. So feel free to skip to the conclusions; just don’t make fun of my grades.

The Series

1) The Data Scientist’s Toolbox (link)

The Pitch

Estimated Workload: 3-4 hours/week

Upon completion of this course you will be able to identify and classify data science problems. You will also have created your Github account, created your first repository, and pushed your first markdown file to your account.

The Reality

If you know your way around a laptop and the internet (and I don’t mean Facebook), you can complete this class in half a quiet weekend day. It took me about six total hours to rip through, and that was while also taking R Programming. It may seem easy to sign up for a cloud service and install some software, but in later courses the forums were littered with Github, RStudio, and other minor problems whose fixes were taught here. I am glad I paid attention, regardless of the difficulty level.

2) R Programming (link)

The Pitch

Estimated Workload: 3-5 hours/week

The course will cover the following material each week: i) Overview of R, R data types and objects, reading and writing data, ii) Control structures, functions, scoping rules, dates and times, iii) Loop functions, debugging tools, and iv) Simulation, code profiling

The Reality

If you know most any modern programming language, you’ll both enjoy the concepts presented within and not have too hard a time. However, the instructors are already throwing tricky problems at you (read: adjust your thinking cap), so the true investment is more like 5-7 hours per week. Caveat: that’s coming from someone who previously did all their data analysis in either Excel or relational databases. Further, I used for-each loops instead of the ‘apply’ family of functions for my project work; that voodoo kinda spooked me, and old school got the job done without having to reinvent the mental wheel in haste. Finally, run with swirl, the embedded tutorial package the course relies on – it’s a winner for working through R programming concepts, and you might need the extra credit.
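For anyone equally spooked, here is a minimal sketch of the two styles side by side. It uses R’s built-in mtcars dataset purely for illustration (my choice, not something from the course projects) and computes the mean of every column, first with a for loop and then with sapply, one member of the apply family.

```r
# Illustration only: mtcars ships with base R; the course projects used other data.
data(mtcars)

# Old school: preallocate a named result vector and fill it with a for loop.
loop_means <- numeric(ncol(mtcars))
names(loop_means) <- names(mtcars)
for (col in names(mtcars)) {
  loop_means[col] <- mean(mtcars[[col]])
}

# The apply-family version: sapply() calls mean() on each column
# and simplifies the result to a named numeric vector.
apply_means <- sapply(mtcars, mean)

# Check whether the two approaches agree.
print(all.equal(loop_means, apply_means))
```

Both routes give the same numbers; the apply version just hides the bookkeeping, which is why either one will get a project done.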

3) Getting and Cleaning Data (link)

The Pitch

Estimated Workload: 3-5 hours/week

Upon completion of this course you will be able to obtain data from a variety of sources. You will know the principles of tidy data and data sharing. Finally, you will understand and be able to apply the basic tools for data cleaning and manipulation.

The Reality

This course felt harder than the sell, but I was also taking Exploratory Data Analysis and Reproducible Research along with it (btw, not the best idea). Fully wrapped around the R language, if you are a rookie you’ll be very glad you took R Programming beforehand as this segment dove into data manipulation with nary a spreadsheet crutch to lean on. There were few zingers in the quizzes, but the project rubrics were cupcakes sprinkled with obfuscation; patient due diligence was required to discern what was being asked. But once you did, it was pretty easy going. This was also the point where I realized peers were grading the projects, and that it paid to keep presentations simple and easy to follow.

All in, I figure I invested 8-10 hours a week on the class, but as a result have since dumped Excel for most raw data manipulation. If you do any amount of number tumbling in your day job, this course will really sell you on adopting R.

4) Exploratory Data Analysis (link)

The Pitch

Estimated Workload: 3-5 hours/week

After successfully completing this course you will be able to make visual representations of data using the base, lattice, and ggplot2 plotting systems in R, apply basic principles of data graphics to create rich analytic graphics from different types of datasets, construct exploratory summaries of data in support of a specific question, and create visualizations of multidimensional data using exploratory multivariate statistical techniques.

The Reality

Graphs, graphs and more graphs; the upper range of the workload estimate is in step. Learned how to create lots of fancy graphics to illustrate analysis, most of which I forgot soon thereafter. Lesson learned … hold onto that code! This student wound up using chunks for Reproducible Research (which I took simultaneously with this class) and even integrated some of it into a few work-related reports. R’s graphics functions are nifty, and certainly more flexible than the canned stuff you might be used to. Moderately worth the effort, and while the quizzes were cake I did find peers morphed exceptionally picky during project grading; this might be the only course where it is worth getting more creative with a project rather than less. In addition, you should be pretty handy with Github by this point; if not you’re in trouble.

5) Reproducible Research (link)

The Pitch

Estimated Workload: 3-5 hours/week

In this course you will learn to write a document using R markdown, integrate live R code into a literate statistical program, compile R markdown documents using knitr and related tools, and organize a data analysis so that it is reproducible and accessible to others.

The Reality

I am 100% sold on the concepts presented within; if you can’t hand me the dataset and your code, and I can’t easily replicate your study with it, I now consider your work not worth the paper (or PDF) it is printed on. Some of the real-life examples exposed, where renowned academics completely screwed the pooch on otherwise world-changing studies, put even more gasoline on the belief fire. I probably spent ten plus hours a week on this one, but was enthralled after lecture #1 and have since used R markdown and related technologies everywhere I can. The quizzes aren’t too bad, nor was the first project. The final project was not difficult, but it was time consuming, taking nearly two full days on its own even while struggling NOT to overdo it. Probably the most utilitarian course in the series.

6) Statistical Inference (link)

The Pitch

Estimated Workload: 3-5 hours/week

In this class students will learn the fundamentals of statistical inference. Students will receive a broad overview of the goals, assumptions and modes of performing statistical inference. Students will be able to perform inferential tasks in highly targeted settings and will be able to use the skills developed as a roadmap for more complex inferential challenges.

The Reality

If you don’t have a solid background in linear algebra, are loath to watch mathematicians draw Greek letters on chalkboards, and/or generally fall asleep anytime someone brings up the topic of particle physics, you might want to watch the lectures once, twice … at least three times over. Then watch them again. The only truly positive note here is the instructors warned ahead of time how difficult it is. The estimates are a joke; I spent more time “getting” this material than I did on the previous five classes. Combined. Further, I had to reference quite a few outside resources; the lectures were steeped in theory so far over my practical nature that I debated dropping the rest of the series most every time I looked at one.

Needed every attempt on quizzes, and the projects left a lot to be desired as well. In particular, one project was entirely not what it seemed at first; the forums were quickly littered with unbridled speculation about requirements, and I took all postings with a grain of salt thereafter (more on that later). Got through it all after putting on blinders. Finally, I took this course in conjunction with Developing Data Products, which was also a mistake.

7) Regression Models (link)

The Pitch

Estimated Workload: 3-5 hours/week

In this course students will learn how to fit regression models, how to interpret coefficients, how to investigate residuals and variability. Students will further learn special cases of regression models including use of dummy variables and multivariable adjustment. Extensions to generalized linear models, especially considering Poisson and logistic regression will be reviewed.

The Reality

Much like Statistical Inference, course developers flagged this one as pain-in-the-ass. But after the former, it’ll seem like welcome relief (for about ten lousy seconds). The instructor starts off by noting that the majority of students likely come from a computing background, then proceeds by tearing into alphas and sigmas and gammas without reproach. But at least there are frequent notices saying you can “skip this bit” if you’re not interested in proofs.

I immediately applied my “rip through the material as fast as you can and get through the quizzes by hook or by crook” approach (noted below), then went searching for supplementary material that might provide some insight into completing the project. I found it; meanwhile said project had strict limits on length which precluded overthinking. Having the skills taught in Reproducible Research is a must here, as are general research, writing and referencing capabilities. If you’ve got those, can type lm(blah ~ blah, data=blah) into R, remember what a confidence interval is, and actually try (versus cry), you’ll pass. Additionally, my external research into regression methodologies supplanted most everything taken in via the primary material, but it was worth the time invested. Had the course essentially completed just as Week 2 was ending, then fiddled with my project presentation until the peer evaluation window opened up.
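To make that less abstract, here is a rough sketch of what the lm(blah ~ blah, data=blah) call and a confidence interval look like in practice. The dataset (R’s built-in mtcars) and the variables (mpg regressed on wt) are stand-ins I picked for illustration; they are not taken from the course project.

```r
# Illustration only: mtcars, mpg and wt are stand-ins for your own data.
data(mtcars)

# Fit a simple linear regression of fuel economy on vehicle weight.
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficient estimates, standard errors, t statistics and p-values.
print(summary(fit)$coefficients)

# 95% confidence intervals for the intercept and the slope.
ci <- confint(fit, level = 0.95)
print(ci)
```

If the interval for the slope excludes zero, that is the same as saying the slope differs from zero at the 5% significance level, which is about as much theory as the project writeup needs.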

8) Practical Machine Learning (link)

The Pitch

Estimated Workload: 4-6 hours/week

Upon completion of this course you will understand the components of a machine learning algorithm. You will also know how to apply multiple basic machine learning tools. You will also learn to apply these tools to build and evaluate predictors on real data.

The Reality

This is the pièce de résistance of the Data Science Specialization, the chance to learn how to tell someone something they don’t already know using data and R. I drank like a fish (strike that: worked like a dog) non-stop for nearly two weeks straight, just so I could carve out some alone time with Practical Machine Learning. Broad strokes: the title is fitting, with the instructor cooking useful code examples while the introduction lecture is still warm; the quizzes are NOT to be trifled with – you are expected to know how to make all those functions you’ve just been shown work – and the project was no cake walk either.

Like Statistical Inference and Regression Models it’s a lot to take in, but at least there were fewer than a handful of mathematical equations presented throughout. It isn’t just button pushing though; if you don’t know how to use Github, can’t code R at minimum like an enthusiastic rookie, don’t understand basic statistical concepts such as resampling and variance, or think knitting is something done with a needle and yarn, you’ll be lost in the blink of a keystroke. If, however, you make it through the lectures on random forest and gradient boosting without being completely befuddled, consider yourself a prize-winner; those methodologies are the laser beams of the supervised learning world. Disappointing there aren’t any sharks involved though.

9) Developing Data Products (link)

The Pitch

Estimated Workload: 3-5 hours/week

Students will learn how to communicate using statistics and statistical products. Emphasis will be paid to communicating uncertainty in statistical results. Students will learn how to create simple Shiny web applications and R packages for their data products.

The Reality

While this is officially the last class in the specialization, I took it earlier in conjunction with Statistical Inference. There is almost nothing related to math here i.e. it’s easy, particularly if you know anything about HTML, CSS, client and server-side coding, or just loathe PowerPoint. If you fit that bill, and have already taken Reproducible Research (which introduces you to markdown), you can knock off all the lectures and quizzes in a few sittings then complete the projects while sipping whiskey one evening during a cold snap. Like this guy did.

Tips for Surviving the Data Science Specialization

1) Notes: Download all the lecture notes at the beginning of the class – you’ll find them useful reference both for quizzes and for having example code handy when it’s project time.

2) Quizzes: Print out all the quizzes, review each, and take notes on them as you follow the lectures. Read the quiz questions very carefully, as many of them are designed to trick. And take advantage of all your quiz attempts – don’t leave an 80% score sitting on the first try when you have two more open windows available for perfection. Use your quiz printouts to mark right/wrong answers on attempts, then scour your lecture notes for information related to that question. When in doubt, make an educated guess!

3) Projects: Print out the project rubric(s) and grading criteria, early. Keep it handy throughout the lectures, and by the time the project comes around you should know exactly what to do. When working on the projects think elegant, not encyclopedic. I realized early on that because many projects would be peer graded, going overboard on them would serve to confuse and subsequently hurt scores. You will be better off with succinct, well-written explanations, proper spelling and ordered labeling than you will dumping a pile of code into a PDF then praying your fellow students can figure it all out. The projects are the culmination of all previous efforts, where you’ll realize you learned more than you thought you did.

4) Timing: When this student got to Statistical Inference, he realized he had his hands full, and that didn’t change for Regression Models or Practical Machine Learning either. So a new course strategy was employed, whereby I cranked through the lectures with vigor, knocking off all the quiz questions I could (easily, on paper) as quickly as possible; in most cases I wound up with between 60% and 80% of the quiz questions down stone cold. Then I ran back through the lectures, week by week, filling in the mental blanks and completing the quizzes. By late in week two I could concentrate on the project requirements. Reasoning: some of the course material in the last week was actually needed to produce a salient project, particularly when the rubric seemed vague. Don’t try whipping together a project at the last minute – you will wind up getting caught short.

5) Statistical Inference: By far the most difficult course in the series, it is really an entire semester of statistical theory crammed into four weeks. My suggestion … take notes, lots of notes. Write down every goddamn equation, figure, and truism that comes out of the lectures. I used sticky notes for this, and plastered them across the wall over my desk. Then I rearranged them according to subject matter. After that I sorted them again, on paper, according to what I perceived the instructor was really trying to convey, while scribbling additional thoughts on the back of each sheet. By the time the projects came around, I was thinking big picture instead of minutiae.

And finally …

6) The Forums: I am all for joint interaction and open communication, but sometimes the forums were just too much. I am not suggesting ignoring them altogether, instead offering up a pointer to be diligent regarding who and what you pay attention to within. Alarm bells should ring when any of the following occur: A) a TA states that one solution is determinately better than another, referencing only their own project results from last term; B) a select few participants hijack every forum thread with long-winded explanations to seemingly straightforward problems; C) someone starts a thread with a question, then two posts later is conveying their spontaneous expertise to anyone attempting to assist; and D) a student loudly criticizes course requirements, exclaiming changes should be made to suit their need, particularly when said need is based on their admittedly ignoring prerequisites and/or published instructions.

The forums would be much more useful if the staff was more aggressive with moderation, including quashing duplicate inquiries, designating posts resolved, and otherwise managing the venue versus just doing the Q&A dance.

Supplementary Online Material

This student made use of the following resources; the first was suggested by course developers at the outset, while the latter two I sought out on my own:

StackOverflow – a solid resource, but be forewarned that a lot of folks post complex inquiries there, so you often have to parse, test and tweak solutions to “dumb down” for your particular issue.

DataCamp – for getting handy with R upfront, as well as learning some exploratory analysis tricks and stepwise regression, the site is fantastic; I’ve mentioned it before too.

Rdocumentation – a comprehensive resource for R functionality, easier to read than R’s internal help files.

Solid Reference Material

The materials below are (legally) free, although they may require a little effort to find. I’m making it easy on you, and they are now all permanent residents of my digital library:

OpenIntro Statistics 2nd Edition; Diez, Barr, Cetinkaya-Rundel – https://www.openintro.org (PDF)

R function reference cards – “cheat sheets” from a variety of authors – let me Google that for you

And some more advanced stuff …

An Introduction to Statistical Learning, with Applications in R; James, Witten, Hastie, Tibshirani – http://www-bcf.usc.edu/~gareth/ISL/ (PDF)

Elements of Statistical Learning, 2nd Edition; Hastie, Tibshirani, Friedman – http://statweb.stanford.edu/~tibs/ElemStatLearn/ (PDF)

Mining Massive Datasets, 2nd Edition; Leskovec, Rajaraman, Ullman – http://www.mmds.org (PDF)

Expectations and Conclusions

The Data Science Specialization is not for the undisciplined, the intellectually lazy or those with an entitlement mentality. If you do not have significant evening time to spare or are not willing to forego spending 20 hours a day on “social media”, you should not bother; the classes will, in several cases, entail significantly more time than the course developers estimate. If you spent your formal education expecting your teachers to hand you answers – like they did in grade school because your mother was captain of the PTA – not only are these courses not for you but the subject matter probably isn’t either. Go take a class in The Social Structure of Pre-Cambrian Basket Weavers if you seek validation; data analysis, statistics, and machine learning are for those who not only enjoy getting their asses kicked but will beg for more after every beatdown.

Pay for the courses, particularly if that’s what it takes to keep you motivated. I went the free route, figuring a “verified certificate” from a non-accredited institution was no better than an unverified one, and certainly a pittance compared to the knowledge gained. If your end-game is pasting “Johns Hopkins University” in the education section of your LinkedIn profile, Idiocracy is a must watch; after that review the disclaimers, paying particular attention to the part about these courses being entirely invalid as degree requirements go. Not paying also meant forgoing the opportunity to participate in the capstone project, but after the sometimes hellacious course-load I was in no mood to do any free work. If you apply what you’ve learned along the way, you may very well feel the same.

Will you be a bonafide benjamin-printing mofo after the Data Science Specialization? I don’t think so. Consider it instead an introduction to data analysis and machine learning, more of a primer if hoping to get into the field. If you are already involved somehow, whether it be as a financial analyst or lab biologist, taking the series on has significant merits; it might even be worth a pay raise. However, if you are a CIO, CMO, or otherwise tasked with managing those that do the prescribed work, you should not only consider the education suitable for consumption but also required training. For your own self.

Quasi-Epilogue

About midway through Reproducible Research I started using R for various work-related projects, including website traffic and online ad spending analyses, as well as sorting through the historical asset and liability positions of an irrevocable trust. Practice makes perfect, and money buys extravagant fly-fishing trips.

Intriguing subject matter, and I’m somewhat convinced the big data movement’s supposed failure to find real answers stems not from shortage of knowhow, but from a dearth of perspective. Too many heads twirling too much code over too much data, and not enough stepping back to take in the forest. I witnessed many course participants, seemingly bright and articulate, get completely wound up in “proper” R usage when perusing a two-paragraph dataset description, reviewing some original survey methodology, or even doing a web search on “the primary parts of an automobile” would have made their lives so much easier.

Next up: Stanford’s Statistical Learning. Selfie was recently cancelled, so what the heck else is there to do until the bugs start hatching and the greenskeepers are back to work?

MG signing off (to close with scores from this insanity – click here)

Editor’s note: A very special thanks goes out to Johns Hopkins University and the brilliant geniuses within – Brian Caffo, Roger Peng, and Jeff Leek. The sheer volume of effort required to produce the course materials must have been astounding, let alone what it took to keep the quality top notch as it was. Additional gratuity goes to Coursera and their staff – nice work folks.

Comments

Pierre says:

Nice review. In particular the part about the forums. Although, they can sometimes be useful to get a sense of what the ultimate limit is of what someone is planning to do for a project (there are some really hardcore data science people hanging out there).

So you didn’t do the Capstone project to complete the Specialization? If you managed the first 9, then you will even survive that, although in itself it feels as if it is something completely different all over again. We had to do next word prediction in R and I have to admit not much of the previous modelling actually prepared us for that. But, fingers crossed, certificates should be handed out by next week, so then we’ll know how many of us managed to complete it in this first run.

Thanks Pierre – and hope the scores turn out well for you. Yep … opted out of the capstone; already using most everything I learned in real-time anyway.

Again, best wishes.

Alex says:

I just registered for the Capstone class. I do not know what the final assignment is or how to approach it, but I hope to figure it out. Overall, the previous nine classes were interesting. However, reading the review, I am not sure I am prepared for the Capstone. I guess there is only one way to find out…

SandX says:

I’m in the final week of the introductory course (The Data Scientist’s Toolbox). I’ll use your elaborate review both as an encouragement that it is really worthwhile to follow the whole path and as good advice while planning along the way.

Thanks very much for sharing your experiences, thoughts and -no doubt- useful links!

For anyone working to better themselves … my pleasure.

Assuming you’ll start R Programming soon, I suggest doing Datacamp’s Introduction to R now. It’s free and it’s fun, and will help if you’ve never been exposed to R before. Also, just before Regression Models (or at least pre-project), do their Data Analysis and Statistical Inference course with Mine Çetinkaya-Rundel; while it uses some different linear modeling scripts within that you won’t have access to, running through the process is definitely helpful come project time.

All definitely worth the effort – best wishes.

Upasana Sharma says:

Hi Michael

Thank you for the wonderful review. I have done the first course in the data science track. I have tried completing the R programming course however was not able to. I will take up next month. Thumbs up for the motivation to learn, not to go after certificates.

Look forward to your updates.

Regards

Upasana

You are welcome. Sounds like you are still keen on the subject matter, so I’ll make an additional resource suggestion …

Jeff Leek, one of the Johns Hopkins fellas that developed the specialization, just published a book on data analysis “style”. You don’t meet Jeff until Practical Machine Learning (although I think he does a few cameos), but you can get his book here -> https://leanpub.com/datastyle. Well worth the ten bucks.

Upasana Sharma says:

Hi Michael

Thanks. I got a copy of the kindle version from amazon.
Yes I am interested in learning Data Science. I am not sure where to start for self learning.
I guess JHU courses will give me some idea of the field.

Regards

Upasana

Jayasimha G P says:

Hello Michael Gracie,

Very informative review about the course. First thank you so much for that. I will be completing the R programming this month end. Having said that I am from a Computer Science stream with good programming background and have good statistical knowledge, could you please suggest me which all courses I can take in parallel in subsequent months.
P.S – I am dedicated and I have very free time for next 2 months 🙂

Thanks

JGP,

Considering your background and availability, why not follow the same track I did; take Getting Data, Exploratory Data and Reproducible Research together. However, the following month you might want to tackle Statistical Inference by itself, as getting that material down stone cold will make the remaining courses a cakewalk for you.

PS: Glad you enjoyed the review. Best wishes on the work.

Jayasimha G P says:

Thank you so much for your reply. Surely will try to follow you. 🙂

Tammy H says:

Hi Michael,
Thanks for publishing such a detailed review. I found it very informative and decided to register for this series and follow your recommendations. The first two courses just launched today and, thus far, I’m not finding any way to print out the quizzes at the onset (in order to take notes, etc) as you’ve suggested. Do you know if printing out the quizzes is an option only if you follow the “free route”, or has the course structure changed so that this is no longer possible?

Tammy,

Not sure what OS/browser you are using, but I just opened the quizzes in Safari and selected File, Print from the Safari menu.

Best wishes on the course.

Russ says:

Thank you for this comprehensive review. Definitely helps in my decision to pursue this.
Russ

Sujata says:

Hi Michael,
I am currently working as a Business Analyst for a medical malpractice insurance company and wanted to switch careers and move to data analytics. Will this course help. I have a previous programming background as well.

Thanks ,
Sujata

If you plan on staying in the insurance sector i.e. make a horizontal move to where there are better opportunities to advance, I suspect it will be quite valuable for you. Insurance -> incident data + application of statistical analysis, for the purpose of estimating cash outflows, applied against budgeted cash inflows – plus cash on hand – targeting a desired return on the net.

Best wishes.

Gideon says:

Thanks very much for your review!!

I’m aiming to do it starting December 7th. I’m a web/windows developer for several years now (8+), I dabble into a lot of things and did Andrew Ng’s course on Machine learning fairly well.

Reading your post makes me think I should be able to do it fine, except for the heavy math. I have done some advanced stats, now kinda forgotten, and have been trying to learn calculus at khanacademy.org.

Any other thoughts as to what I can do in 3 weeks before this course begins? Learn some stats? Or cram linear algebra ? I can definitely start learning R.

Any thoughts?

Thoughts? Yea … relax.

Assuming you plan on doing all the courses, you are about to “cram” for at least six-months, so no reason to get wound up ahead of time.

Steve says:

Great write up on the courses but I couldn’t understand the part on whether getting the “verified certificate” is worthwhile – especially this part, I didn’t get the reference:

“If your end-game is pasting “Johns Hopkins University” in the education section of your LinkedIn profile, Idiocracy is a must watch; after that review the disclaimers, paying particular attention to the part about these courses being entirely invalid as degree requirements go. ”

Are you saying it is not worthwhile to put the verified certificate on a CV because it is an online course? Or it is only worthwhile to pay to do the capstone project?

Neither. I was targeting intention i.e. paying for a bunch of courses for the sole purpose of tagging an online resume.

Shazaib says:

Hi Michael,

Nice review. I have been asked by my employer to complete the following courses in a month, as they are considering me for a data science related position in a newly-created team:

Data Scientist tool box
R programming
Getting and Cleaning Data
Regression Models
Statistical Inference
Practical Machine Learning

I have studied Statistics before and know quite a bit of the Statistical Inference and Regression material, but I haven’t used R and don’t have much programming background. I have a solid mathematical background though, and I’m a persistent learner who can commit 20 hours per week for these courses.

Although I know I can do these courses, the “time of just a month” that I have been given is the crucial factor for me. Do you think it is doable? Need your suggestions.

I am concerned more about the Machine Learning course, since I presume that it assumes significant background knowledge. Would greatly appreciate any views. Thanks.

Sounds like Statistical Inference should come easy for you, nevertheless I do not think this is doable for someone with 1) zero R experience + little programming experience and 2) that is gainfully employed.

While the track does cover broad data science / analytics, it is extremely heavy on R. Further, R is widely regarded as having a fairly steep learning curve, even for those with a programming bent. This might be a different story if it was Python; I found picking it up quite easy by comparison.

If I were you I’d try re-negotiating; you might be able to pull it off in two months. But you will definitely be sleep-deprived when it’s over.

Best wishes.

Paula S. says:

Hi Michael,
Random question: did you end up doing Stanford’s Statistical Thinking that you mentioned at the end of this post? If so, how was it?
Thanks

Yes. The material is akin to Statistical Inference i.e. heavy on the theory, but with more graphical representations than linear algebra. I got behind around week seven and never recovered, although I was passing when I dropped it.

Guzzyman says:

Hi Michael,

Nice article you’ve got here about the data science specialization. I’m currently enrolled in Executive Data Science, and after reading your review of the data science specialization, I’m thinking of taking those courses as well. But I found out most of the courses have the same start/finish dates. How do I cope with taking several courses in the same specialization concurrently? That is where I have a challenge. Could I just take the courses one after the other, not minding the start/finish dates? Will the courses be available after the supposed start/finish dates? I look forward to hearing from you.

Once again, thanks for the review. I will apply the techniques to all the specializations I’m doing.

Check Coursera; they ran the courses roughly every five weeks when I did them. Best.

Guzzyman says:

I already checked Coursera before I commented on your post. A few of the courses have the same start/finish dates.

Might want to contact Coursera. They used to offer the courses with the same start/finish dates, and then offer them all again five weeks later. When I went through the specialization I planned around that, taking a few courses at a time, then a few more courses in the track when they were next offered; took me from August 2014 to December 2014 to finish them all (capstone excluded). But maybe that’s changed now.

dissipate says:

Thank you very much for the detailed review and tips! I have just signed up for Data Scientist’s Toolbox and am excited to begin, although feeling a little apprehensive at the same time as I have not really done any coding for 10+ years and have no background in statistics save for reading a textbook the past two weeks.

You are welcome. Best wishes on success.

Giridhar says:

Hi Michael,
I read through your comments and feedback about the course and its intended importance. However, I have a couple of queries to ask you through this thread. I have been employed as a Quality Assurance Analyst and Business Analyst in the banking domain for 5 years and wish to make a jump into analytics or data science. I have very little or no programming experience. Is it feasible for a candidate like me to pull off the Data Science course? Kindly share your thoughts on the same.

Answer via a question, for you to ask yourself …

Are you the type that confronts a problem, particularly one involving technology, business processes and/or mathematics, head on because you have an aching desire to kick that problem’s butt with your own two fists, or one that immediately cries for help (usually designated by the “HELP!!! ME!!! PLEASE!!!” tag, roughly 10 seconds after the problem arises)?

The former will learn the material like the back of their hands (bloody as those hands may wind up), prove the skills when it really matters, and write their own ticket. The latter will get a better grade and a great resume line; then, when time and/or money is of the essence, they will royally screw something up, blame their colleagues, and (if their manager is anything but a moron) get sent packing just as they deserve.

Which type are you?

SOLUTION HINT: If the former, go for it! If the latter, take up stamp collecting.

Chris Horvath says:

Nice review… I am doing the capstone right now, and like you I was going to skip the final class as I learned what I wanted to, but since I paid for the series I started it.

I am not finished yet, but the project is something that would help a lot of people, and I’ll get some value as it’s in NLP, which I find interesting. Overall I learned a lot more than I expected, and the content was a lot harder than expected.

Val says:

Thanks for the review!

I am going in for knee surgery and will be away from work for 6 weeks, and I would like to use that time to build some technical skills.
Wanted to get your advice on whether it would be doable to finish this course within that time.

I have some stats and math background (though probably need to refresh that before the course), no programming background. I will be able to do this almost full time since I’m on medical leave.

Do you think it is feasible for me to finish the course in 6 weeks? If so, in what order should I ‘cram’ these modules? Thanks!!

You are welcome.

I don’t think it is feasible, nor is it advisable.

Ignoring any nagging faults mentioned in the review, this is very valuable i.e. robust material assembled by some extremely intelligent and diligent people. It’s a hoard of work even at the individual class level, and as mentioned in the review many of the classes build on – and assume proficiency in – material covered in previous classes. Taking the lack of programming experience into account, I suspect you will find yourself frustrated, and quickly.

If you must jump in hard, try taking the first three classes together. Toolbox is cake, so getting through Cleaning Data at the same time you are doing R Programming will be your test.

Val says:

Thank you for the quick response!
Agreed that it seems like it may not be feasible. Will try to work out an alternative timeline for it.

Vishnu says:

May I know if there is any timeframe to finish all the courses if paid in full, or for individual courses?

Best to ask Coursera about that.

dataguy says:

Hi, I have a quick question. I am wanting to take this course and get into data science. The only thing holding me back from starting to invest the time in this area is my insecurity in applying mathematical theory. I can do basic algebra, but I just could not really get through physics and calc in school. I did pass both physics and calc 1 and 2, which were prereqs for engineers, but I passed with D’s and think it was just because the prof felt sorry for me, ha.

When it comes to any theory, I have a hard time. I kind of need a template or algebraic model, if this makes sense. With this thinking, do you feel I can excel in this field, or am I wasting time? I do have an MS in Science and am good in Science.

Thanks for any insight.

There is little to no need for calculus here. Most of the theoretical material is linear algebra, presented for statistical proofs. Further, I don’t think you need to have a solid handle on the proofs to do well here. Really missing them might make Statistical Inference a pain, but it kind of already is regardless.

Data says:

Thanks for your insight, Michael. I’m gonna jump into it and hope it leads me somewhere. Thanks again.

Francis says:

Hello colleagues,
I would like to know: if you enroll now for the course and can master the materials faster, can you enroll for the capstone earlier?
In short, if you are really bright and work it out, can Coursera courses (capstone included) be finished earlier than what is scheduled?
Thanks in advance.
Francis

Best to ask Coursera about that.

Jeff H. says:

Michael,
Thank you for this detailed review! You really sold me on the value of at least considering these classes, which is exactly what I was looking for. I like that you were honest in your assessment … nothing is ever as perfect as advertised, and some classes are going to be worth it, some aren’t! This review bears out that this is true for Coursera just as much as anywhere else.

Thanks again,
Jeff

You are welcome. Good luck with the specialization.
