Ultimate Guide to Data Science Courses (65+ courses covered)

KarthiKeyan Shanmugam
60 min read, Nov 28, 2018

This story uses affiliate links. When you click an affiliate link, we get a small compensation at no cost to you. Thanks for your support!

Data Science has been ranked as one of the hottest professions and the demand for data practitioners is booming. Data Scientists perform sophisticated empirical analysis to understand and make predictions about complex systems. They draw on methods and tooling from probability and statistics, mathematics, and computer science and primarily focus on extracting insights from data. They communicate results through statistical models, visualizations, and data products.

Per an IBM study, by 2020 the number of Data Science and Analytics job listings is projected to grow by nearly 364,000 listings to approximately 2,720,000. The following summary graphic from the study highlights how in-demand data science and analytics skill sets are today and are projected to be through 2020.

Image: Data Science / Analytics Landscape (Source: IBM)

In this post, we take a look at courses that will help boost your career and expand your knowledge. Check them out, and start enrolling today!

Coursera
#1.Data Science Specialization

This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students will have a portfolio demonstrating their mastery of the material.

Created By : Johns Hopkins University

#2.Survey Data Collection and Analytics Specialization

This specialization covers the fundamentals of surveys as used in market research, evaluation research, social science and political research, official government statistics, and many other topic domains. In six courses, you will learn the basics of questionnaire design, data collection methods, sampling design, dealing with missing values, making estimates, combining data from different sources, and the analysis of survey data. In the final Capstone Project, you’ll apply the skills learned throughout the specialization by analyzing and comparing multiple data sources.

Created By : Michigan Program in Survey Methodology and the Joint Program in Survey Methodology, a collaboration between the University of Maryland, the University of Michigan, and the data collection firm Westat, founded by the National Science Foundation and the Interagency Consortium of Statistical Policy in the U.S. to educate the next generation of survey researchers, survey statisticians, and survey methodologists.

#3.Data Structures and Algorithms Specialization

This specialization is a mix of theory and practice: you will learn algorithmic techniques for solving various computational problems and will implement about 100 algorithmic coding problems in a programming language of your choice. The specialization contains two real-world projects: Big Networks and Genome Assembly. You will analyze both road networks and social networks and will learn how to compute the shortest route between New York and San Francisco (1000 times faster than the standard shortest path algorithms!). Afterwards, you will learn how to assemble genomes from millions of short fragments of DNA and how assembly algorithms fuel recent developments in personalized medicine.
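For orientation, the kind of "standard shortest path algorithm" that the description uses as a baseline can be sketched with Dijkstra's algorithm. This is a minimal illustration, not material from the course: the function name `shortest_path_length` and the toy `roads` network (with made-up distances) are assumptions for the example.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Return the shortest distance from source to target, or None if unreachable.

    graph maps each node to a list of (neighbor, non-negative weight) pairs.
    """
    best = {source: 0}
    queue = [(0, source)]  # min-heap of (distance so far, node)
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None


# Hypothetical toy road network; the distances are invented for illustration.
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}

print(shortest_path_length(roads, "A", "D"))  # route A -> C -> B -> D
```

On continent-sized road networks, a plain search like this explores far more of the graph than necessary, which is presumably the gap the faster techniques in the specialization address.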

Created By : UC San Diego, Higher School of Economics

#4.Introduction to Discrete Mathematics for Computer Science Specialization

Discrete Math is needed to see mathematical structures in the objects you work with, and understand their properties. This ability is important for software engineers, data scientists, security and financial analysts (it is not a coincidence that math puzzles are often used for interviews). We cover the basic notions and results (combinatorics, graphs, probability, number theory) that are universally needed. To deliver techniques and ideas in discrete mathematics to the learner we extensively use interactive puzzles specially created for this specialization. To bring the learner’s experience closer to IT applications we incorporate programming examples, problems and projects in our courses.

Created By : UC San Diego, Higher School of Economics

#5.Genomic Data Science Specialization

This specialization covers the concepts and tools to understand, analyze, and interpret data from next generation sequencing experiments. It teaches the most common tools used in genomic data science including how to use the command line, Python, R, Bioconductor, and Galaxy. The sequence is a standalone introduction to genomic data science or a perfect complement to a primary degree or postdoc in biology, molecular biology, or genetics.

Created By : Johns Hopkins University

#6.Applied Data Science with Python Specialization

The 5 courses in this University of Michigan specialization introduce learners to data science through the Python programming language. This skills-based specialization is intended for learners who have a basic Python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data.

Introduction to Data Science in Python (course 1), Applied Plotting, Charting & Data Representation in Python (course 2), and Applied Machine Learning in Python (course 3) should be taken in order and prior to any other course in the specialization. After completing those, courses 4 and 5 can be taken in any order. All 5 are required to earn a certificate.

Created By : University of Michigan

#7.Data Analysis and Interpretation Specialization

The Data Analysis and Interpretation Specialization takes you from data novice to data expert in just four project-based courses. You will apply basic data science tools, including data management and visualization, modeling, and machine learning using your choice of either SAS or Python, including pandas and Scikit-learn. Throughout the Specialization, you will analyze a research question of your choice and summarize your insights. In the Capstone Project, you will use real data to address an important issue in society, and report your findings in a professional-quality report.

This Specialization is designed to help you whether you are considering a career in data, work in a context where supervisors are looking to you for data insights, or you just have some burning questions you want to explore. No prior experience is required. By the end you will have mastered statistical methods to conduct original research to inform complex decisions.

Created By : Wesleyan University

#8.IBM Data Science Professional Certificate

This Professional Certificate from IBM is intended for anyone interested in developing skills and experience to pursue a career in Data Science or Machine Learning.

This program consists of 9 courses providing you with the latest job-ready skills and techniques covering a wide array of data science topics including: open source tools and libraries, methodologies, Python, databases, SQL, data visualization, data analysis, and machine learning. You will practice hands-on in the IBM Cloud using real data science tools and real-world data sets.

Upon successfully completing these courses you will have done several hands-on assignments and built a portfolio of data science projects to provide you with the confidence to plunge into an exciting profession in Data Science. In addition to earning a Professional Certificate from Coursera, you will also receive a digital Badge from IBM recognizing your proficiency in Data Science.

This professional certificate has a strong emphasis on applied learning. Except for the first course, all other courses include a series of hands-on labs and are performed in the IBM Cloud (without any cost to you). Throughout this Professional Certificate you are exposed to a series of tools, libraries, cloud services, datasets, algorithms, assignments and projects that will provide you with practical skills with applicability to real jobs that employers value, including:

Tools: Jupyter / JupyterLab, Zeppelin notebooks, RStudio, and Watson Studio

Libraries: Pandas, NumPy, Matplotlib, Seaborn, Folium, ipython-sql, Scikit-learn, SciPy, etc.

Projects: random album generator, predict housing prices, best classifier model, battle of neighborhoods

Created By : IBM

#9.Advanced Data Science with IBM Specialization

By completing this specialization, you will have a proven deep understanding of massively parallel data processing, data exploration and visualization, and advanced machine learning & deep learning. You’ll understand the mathematical foundations behind all machine learning & deep learning algorithms. You can apply knowledge in practical use cases, justify architectural decisions, understand the characteristics of different algorithms, frameworks & technologies & how they impact model performance & scalability.

You will build fully scalable end to end data integration, machine learning and deep learning pipelines using the most prominent and widely used frameworks and technologies like Apache Spark, scikit-learn, SparkML, SystemML, TensorFlow, Keras, PyTorch, DeepLearning4J, Apache CouchDB and MQTT.

Created By : IBM

#10.Data Science at Scale Specialization

This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project.

Created By : University of Washington

Products from Amazon.in

Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
Data Science from Scratch
Data Structures and Algorithms Made Easy: Data Structures and Algorithmic Puzzles
Practical Statistics for Data Scientists: 50 Essential Concepts
Business Analytics: The Science of Data - Driven Decision Making
What Is Data Science?
Python Data Science Handbook: Essential Tools for Working with Data
Python:  3 Manuscripts in 1 book: - Python Programming For Beginners - Python Programming For Intermediates - Python Programming for Advanced
Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools

#11.Executive Data Science

In four intensive courses, you will learn what you need to know to begin assembling and leading a data science enterprise, even if you have never worked in data science before. You’ll get a crash course in data science so that you’ll be conversant in the field and understand your role as a leader. You’ll also learn how to recruit, assemble, evaluate, and develop a team with complementary skill sets and roles. You’ll learn the structure of the data science pipeline, the goals of each stage, and how to keep your team on target throughout. Finally, you’ll learn some down-to-earth practical skills that will help you overcome the common challenges that frequently derail data science projects.

Created By : Johns Hopkins University

#12.Methods and Statistics in Social Sciences

This Specialization covers research methods, design and statistical analysis for social science research questions. In the final Capstone Project, you’ll apply the skills you learned by developing your own research question, gathering data, and analyzing and reporting on the results using statistical methods.

Created By : University of Amsterdam

#13.Applied Data Science Specialization

The 5 courses in this University of Michigan specialization introduce learners to data science through the Python programming language. This skills-based specialization is intended for learners who have a basic Python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data.

Introduction to Data Science in Python (course 1), Applied Plotting, Charting & Data Representation in Python (course 2), and Applied Machine Learning in Python (course 3) should be taken in order and prior to any other course in the specialization. After completing those, courses 4 and 5 can be taken in any order. All 5 are required to earn a certificate.

Created By : University of Michigan

#14.Introduction to Data Science Specialization

In this Specialization learners will develop foundational Data Science skills to prepare them for a career or further learning that involves more advanced topics in Data Science. The specialization entails understanding what Data Science is and the various kinds of activities that a Data Scientist performs. It will familiarize learners with various open source tools, like Jupyter notebooks, used by Data Scientists. It will teach you about the methodology involved in tackling data science problems. The specialization also provides knowledge of relational database concepts and the use of SQL to query databases. Learners will complete hands-on labs and projects to apply their newly acquired skills and knowledge.

Upon receiving the certificate for completion of the specialization, you will also receive an IBM Badge as a Specialist in Data Science Foundations.

Created By : IBM

#15.Big Data Integration and Processing

This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.

At the end of the course, you will be able to:

  • Retrieve data from example database and big data management systems
  • Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
  • Identify when a big data problem needs data integration
  • Execute simple big data integration and processing on Hadoop and Spark platforms

Syllabus:

  1. Welcome to Big Data Integration and Processing
  2. Retrieving Big Data (Part 1)
  3. Retrieving Big Data (Part 2)
  4. Big Data Integration
  5. Processing Big Data
  6. Big Data Analytics using Spark
  7. Learn By Doing: Putting MongoDB and Spark to Work

Created By : University of California San Diego

Level : Beginner

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#16.Big Data Modeling and Management Systems

In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include: AsterixDB, HP Vertica, Impala, Neo4j, Redis, SparkSQL. This course provides techniques to extract value from existing untapped data sources and discover new data sources. At the end of this course, you will be able to:

  • Recognize different data elements in your own work and in everyday life problems
  • Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
  • Identify the frequent data operations required for various types of data
  • Select a data model to suit the characteristics of your data
  • Apply techniques to handle streaming data
  • Differentiate between a traditional Database Management System and a Big Data Management System
  • Appreciate why there are so many data management systems
  • Design a big data information system for an online game company

Syllabus:

  1. Introduction to Big Data Modeling and Management
  2. Big Data Modeling
  3. Big Data Modeling (Part 2)
  4. Working With Data Models
  5. Big Data Management: The “M” in DBMS
  6. Designing a Big Data Management System for an Online Game

Created By : University of California San Diego

Commitment : 6 weeks of study, 2–3 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.3

#17.Fundamentals of Scalable Data Science

The value of IoT can be found within the analysis of data gathered from the system under observation, where insights gained can have direct impact on business and operational transformation. Through analysis, data correlations, patterns, trends, and other insights are discovered. Insight leads to better communication between stakeholders, or to actionable insights, which can be used to raise alerts or send commands back to IoT devices.

With a focus on the topic of Exploratory Data Analysis, the course provides an in-depth look at mathematical foundations of basic statistical measures, and how they can be used in conjunction with advanced charting libraries to make use of the world’s best pattern recognition system: the human brain. Learn how to work with the data, depict it in ways that support visual inspection, and derive inferences about the data. Identify interesting characteristics, patterns, trends, deviations or inconsistencies, and potential outliers. The goal is that you are able to implement end-to-end analytic workflows at scale, from data acquisition to actionable insights.

Through a series of lectures and exercises students get the needed skills to perform such analysis on any data, although we clearly focus on IoT Sensor Event Data.

Syllabus:

  1. Introduction to exploratory analysis
  2. Tools that support IoT solutions
  3. Mathematical Foundations on Exploratory Data Analysis
  4. Data Visualization

Created By : IBM

Level : Beginner

Language : English, Subtitles: Vietnamese

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#18.Genomic Data Science and Clustering (Bioinformatics V)

In this class, we will see that these two seemingly different questions can be addressed using similar algorithmic and machine learning techniques arising from the general problem of dividing data points into distinct clusters. In the first half of the course, we will introduce algorithms for clustering a group of objects into a collection of clusters based on their similarity, a classic problem in data science, and see how these algorithms can be applied to gene expression data.

In the second half of the course, we will introduce another classic tool in data science called principal components analysis that can be used to preprocess multidimensional data before clustering in an effort to greatly reduce the number of dimensions without losing much of the “signal” in the data.

Finally, you will learn how to apply popular bioinformatics software tools to solve a real problem in clustering.
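To make the clustering idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm), one classic way of dividing data points into clusters by similarity. It is an illustration only, not the course's own code: the function name `kmeans`, the fixed starting centroids, and the six toy points are assumptions chosen so the result is deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of the points assigned to it.
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical 2-D measurements forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Real gene expression data has many more dimensions than this toy example, which is where a preprocessing step such as principal components analysis becomes useful before clustering.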

Syllabus:

  1. Week 1: Introduction to Clustering Algorithms
  2. Week 2: Advanced Clustering Techniques
  3. Week 3: Introductory Algorithms in Population Genetics

Created By : University of California San Diego

Level : Beginner

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.2

#19.Communicating Data Science Results

Making predictions is not enough! Effective data scientists know how to explain and interpret their results, and communicate findings accurately to stakeholders to inform business decisions. Visualization is the field of research in computer science that studies effective communication of quantitative results by linking perception, cognition, and algorithms to exploit the enormous bandwidth of the human visual cortex. In this course you will learn to recognize, design, and use effective visualizations.

Just because you can make a prediction and convince others to act on it doesn’t mean you should. In this course you will explore the ethical considerations around big data and how these considerations are beginning to influence policy and practice. You will learn the foundational limitations of using technology to protect privacy and the codes of conduct emerging to guide the behavior of data scientists. You will also learn the importance of reproducibility in data science and how the commercial cloud can help support reproducible research even for experiments involving massive datasets, complex computational infrastructures, or both.

Learning Goals: After completing this course, you will be able to:

1. Design and critique visualizations

2. Explain the state-of-the-art in privacy, ethics, governance around big data and data science

3. Use cloud computing to analyze large datasets in a reproducible way.

Syllabus:

  1. Visualization
  2. Privacy and Ethics
  3. Reproducibility and Cloud Computing

Created By : University of Washington

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 3.6

#20.Data Manipulation at Scale: Systems and Algorithms

Data analysis has replaced data acquisition as the bottleneck to evidence-based decision making: we are drowning in data. Extracting knowledge from large, heterogeneous, and noisy datasets requires not only powerful computing resources, but the programming abstractions to use them effectively. The abstractions that emerged in the last decade blend ideas from parallel databases, distributed systems, and programming languages to create a new class of scalable data analytics platforms that form the foundation for data science at realistic scales.

In this course, you will learn the landscape of relevant systems, the principles on which they rely, their tradeoffs, and how to evaluate their utility against your requirements. You will learn how practical systems were derived from the frontier of research in computer science and what systems are coming on the horizon. Cloud computing, SQL and NoSQL databases, MapReduce and the ecosystem it spawned, Spark and its contemporaries, and specialized systems for graphs and arrays will be covered.

You will also learn the history and context of data science, the skills, challenges, and methodologies the term implies, and how to structure a data science project.
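As a rough illustration of one abstraction named above, the MapReduce programming model can be sketched in a few lines of single-machine Python. This is a toy word count, not a distributed implementation and not taken from the course; the function names `map_phase`, `shuffle_phase`, and `reduce_phase` and the two sample documents are assumptions for the example.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit one (key, value) pair per word.
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    # Shuffle: group all values that share a key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    # Reduce: combine each key's values into a single result.
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input documents.
documents = ["big data big insights", "data science at scale"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts)
```

In a real system the map and reduce calls run on many machines and the shuffle moves data across the network, but the shape of the computation is the same.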

Syllabus:

  1. Data Science Context and Concepts
  2. Relational Databases and the Relational Algebra
  3. MapReduce and Parallel Dataflow Programming
  4. NoSQL: Systems and Concepts
  5. Graph Analytics

Created By : University of Washington

Rated 4.3 out of 5 (661 ratings)

#21.Applied Plotting, Charting & Data Representation in Python

This course will introduce the learner to information visualization basics, with a focus on reporting and charting using the matplotlib library. The course will start with a design and information literacy perspective, touching on what makes a good and bad visualization, and what statistical measures translate into in terms of visualizations. The second week will focus on the technology used to make visualizations in Python, matplotlib, and introduce users to best practices when creating basic charts and how to realize design decisions in the framework. The third week will be a tutorial of functionality available in matplotlib, and demonstrate a variety of basic statistical charts helping learners to identify when a particular method is good for a particular problem. The course will end with a discussion of other forms of structuring and visualizing data.

This course should be taken after Introduction to Data Science in Python and before the remainder of the Applied Data Science with Python courses: Applied Machine Learning in Python, Applied Text Mining in Python, and Applied Social Network Analysis in Python.

This course is part of “Applied Data Science with Python” and is intended for learners who have a basic Python or programming background, and want to apply statistics, machine learning, information visualization, social network analysis, and text analysis techniques to gain new insight into data and/or a tutorial of the matplotlib system. Only minimal statistics background is expected, and the first course contains a refresher of these basic concepts. There are no geographic restrictions. Learners with formal training in Computer Science but without formal training in data science will still find the skills they acquire in these courses valuable in their studies and careers.

Syllabus:

  1. Module 1: Principles of Information Visualization
  2. Module 2: Basic Charting
  3. Module 3: Charting Fundamentals
  4. Module 4: Applied Visualizations

Created By : University of Michigan

Level : Intermediate

Language : English, Subtitles: Korean

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#22.Exploring and Producing Data for Business Decision Making

This course provides an analytical framework to help you evaluate key problems in a structured fashion and will equip you with tools to better manage the uncertainties that pervade and complicate business processes. Specifically, you will be introduced to statistics and how to summarize data and learn concepts of frequency, normal distribution, statistical studies, sampling, and confidence intervals.

While you will be introduced to some of the science of what is being taught, the focus will be on applying the methodologies. This will be accomplished through the use of Excel and data sets from many different disciplines, allowing you to see the use of statistics in very diverse settings. The course will focus not only on explaining these concepts, but also understanding the meaning of the results obtained.

Upon successful completion of this course, you will be able to:

  • Summarize large data sets in graphical, tabular, and numerical forms.
  • Understand the significance of proper sampling and why you can rely on sample information.
  • Understand why normal distribution can be used in so many settings.
  • Use sample information to infer about the population with a certain level of confidence about the accuracy of the estimations.
  • Use Excel for statistical analysis.
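The course works in Excel, but the idea of using sample information to infer about the population can be sketched in Python as well. The example below is a minimal illustration under stated assumptions: the function name `confidence_interval` and the `daily_orders` figures are invented, and it uses a normal approximation, whereas for a sample this small a t-distribution would usually be the more careful choice.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / n ** 0.5  # sample std dev over sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return center - z * standard_error, center + z * standard_error


# Hypothetical sample: orders received on ten different days.
daily_orders = [52, 48, 55, 50, 47, 53, 49, 51, 54, 46]
low, high = confidence_interval(daily_orders)
print(f"sample mean: {mean(daily_orders)}")
print(f"95% interval: ({low:.2f}, {high:.2f})")
```

The interval narrows as the sample grows, because the standard error shrinks with the square root of the sample size.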

Syllabus:

  1. Introduction and Summarizing Data
  2. Descriptive Statistics and Probability Distributions
  3. Sampling and Central Limit Theorem
  4. Inference

Created By : University of Illinois

#23.Mathematical Thinking in Computer Science

Mathematical thinking is crucial in all areas of computer science: algorithms, bioinformatics, computer graphics, data science, machine learning, etc. In this course, we will learn the most important tools used in discrete mathematics: induction, recursion, logic, invariants, examples, optimality. We will use these tools to answer typical programming questions like: How can we be certain a solution exists? Am I sure my program computes the optimal answer? Do each of these objects meet the given requirements?

In the course, we use a try-this-before-we-explain-everything approach: you will be solving many interactive (and mobile friendly) puzzles that were carefully designed to allow you to invent many of the important ideas and concepts yourself.

Our intended audience is everyone who works or plans to work in IT, starting from motivated high school students.

Prerequisites:

1. We assume only basic math (e.g., we expect you to know what a square is or how to add fractions), common sense and curiosity.

2. Basic programming knowledge is necessary as some quizzes require programming in Python.

Syllabus:

  1. Making Convincing Arguments
  2. How to Find an Example?
  3. Recursion and Induction
  4. Logic
  5. Invariants
  6. Solving a 15-Puzzle

Created By : University of California San Diego, National Research University Higher School of Economics

Level : Beginner

Commitment : 6 weeks, 2–5 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#24.Advanced Machine Learning and Signal Processing

This course, Advanced Machine Learning and Signal Processing, is part of the IBM Advanced Data Science Specialization, which IBM is currently creating, and gives you easy access to invaluable insights into the supervised and unsupervised machine learning models used by experts in many relevant disciplines. We’ll learn about the fundamentals of linear algebra to understand how machine learning models work. Then we introduce the most popular machine learning frameworks for Python: Scikit-Learn and SparkML. SparkML makes up the greatest portion of this course since scalability is key to addressing performance bottlenecks. We learn how to tune models by evaluating hundreds of different parameter combinations in parallel. We’ll continuously use a real-life example from IoT (Internet of Things) to exemplify the different algorithms. To pass the course, you are even required to create your own vibration sensor data using the accelerometer sensors in your smartphone, so you are working on a self-created, real dataset throughout the course.

Syllabus:

  1. Setting the stage
  2. Supervised Machine Learning
  3. Unsupervised Machine Learning
  4. Digital Signal Processing in Machine Learning

Created By : IBM

Level : Advanced

Language : English

Hardware Required : You should have access to a smartphone with an internet connection in the second week of the course.

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.8

#25.Art and Science of Machine Learning

In this data science course you will learn the essential skills of ML intuition, good judgment and experimentation to finely tune and optimize your ML models for the best performance.

In this course you will learn the many knobs and levers involved in training a model. You will first manually adjust them to see their effects on model performance. Once familiar with the knobs and levers, otherwise known as hyperparameters, you will learn how to tune them in an automatic way using Cloud Machine Learning Engine on Google Cloud Platform.
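To illustrate what tuning a "knob" means, the sketch below tries several learning rates on a tiny optimization problem and keeps the one with the lowest final loss. It is a plain-Python grid search, not the Cloud Machine Learning Engine workflow the course teaches; the function name `final_loss`, the quadratic loss, and the candidate values are all assumptions for the example.

```python
def final_loss(learning_rate, steps=20):
    """Run gradient descent on loss(w) = (w - 3)^2 and return the final loss."""
    w = 0.0
    for _ in range(steps):
        gradient = 2 * (w - 3)  # derivative of (w - 3)^2 with respect to w
        w -= learning_rate * gradient
    return (w - 3) ** 2


# Hypothetical grid of learning rates to compare.
candidates = [0.001, 0.01, 0.1, 0.5, 1.1]
results = {lr: final_loss(lr) for lr in candidates}
best_learning_rate = min(results, key=results.get)

for lr, loss in results.items():
    print(f"learning_rate={lr}: final loss {loss:.4g}")
print("best:", best_learning_rate)
```

Values that are too small barely move the model in 20 steps, while the largest value overshoots and diverges, which is the trade-off that automated tuning services search over at larger scale.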

Syllabus:

  1. Introduction
  2. The Art of ML
  3. Hyperparameter Tuning
  4. A pinch of science
  5. The science of neural networks
  6. Embeddings
  7. Custom Estimator
  8. Summary

Created By : Google

Level : Intermediate

Commitment : 3 weeks of study, 5–7 hours per week

Language : English, Subtitles: Portuguese (Brazilian), German, Spanish, Japanese

Hardware Required : You’ll need a desktop web browser to run this course’s interactive labs via Qwiklabs.

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#26.Launching into Machine Learning

This course is aimed at data scientists, data engineers and programmers interested in learning how to apply machine learning in practice. Starting from a history of machine learning, we discuss why neural networks today perform so well in a variety of data science problems. We then discuss how to set up a supervised learning problem and find a good solution using gradient descent. This involves creating datasets that permit generalization; we talk about methods of doing so in a repeatable way that supports experimentation.

Course Objectives:

  • Identify why deep learning is currently popular
  • Optimize and evaluate models using loss functions and performance metrics
  • Mitigate common problems that arise in machine learning
  • Create repeatable and scalable training, evaluation, and test datasets
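One common way to create training, evaluation, and test datasets repeatably is to hash a stable record key instead of drawing random numbers, so the same record always lands in the same split. The sketch below assumes that approach; the function name `assign_split`, the 80/10/10 proportions, and the `flight-…` keys are hypothetical and not taken from the course.

```python
import hashlib


def assign_split(key, train_pct=80, valid_pct=10):
    """Map a stable record key to 'train', 'valid', or 'test'."""
    # MD5 is used only as a deterministic hash here, not for security.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < train_pct:
        return "train"
    if bucket < train_pct + valid_pct:
        return "valid"
    return "test"


# Hypothetical record keys; in practice use an ID or date column.
records = [f"flight-{i}" for i in range(1000)]
splits = {key: assign_split(key) for key in records}
counts = {
    name: sum(1 for s in splits.values() if s == name)
    for name in ("train", "valid", "test")
}
print(counts)
```

Because the assignment depends only on the key, rerunning the experiment or adding new records never moves an existing record from one split to another.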

Syllabus:

  1. Introduction
  2. Practical ML
  3. Optimization
  4. Generalization and Sampling
  5. Summary

Created By : Google

Level : Intermediate

Commitment : 5–7 hours per week

Language : English, Subtitles: French, Portuguese (Brazilian), German, Spanish, Japanese

Hardware Required : You’ll need to use a desktop web browser to access labs.

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#27.Feature Engineering

Want to know how you can improve the accuracy of your machine learning models? What about how to find which data columns make the most useful features? Welcome to Feature Engineering on Google Cloud Platform where we will discuss the elements of good vs bad features and how you can preprocess and transform them for optimal use in your machine learning models.

In this course you will get hands-on practice choosing features and preprocessing them inside of Google Cloud Platform with interactive labs. Our instructors will walk you through the code solutions which will also be made public for your reference as you work on your own future data science projects.
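As a rough illustration of the kind of preprocessing the syllabus below refers to, the sketch here buckets a numeric column, crosses it with a categorical column, and one-hot encodes the result. It is plain Python rather than the Google Cloud Platform tooling used in the labs, and the column names, bucket boundaries, and helper functions are hypothetical.

```python
def bucketize(value, boundaries):
    """Return the index of the bucket that value falls into."""
    for i, bound in enumerate(boundaries):
        if value < bound:
            return i
    return len(boundaries)


def feature_cross(row, hour_boundaries):
    """Combine day of week and hour bucket into a single categorical feature."""
    hour_bucket = bucketize(row["hour"], hour_boundaries)
    return f'{row["day"]}_x_{hour_bucket}'


def one_hot(values):
    """One-hot encode a list of categorical values against a sorted vocabulary."""
    vocab = sorted(set(values))
    index = {v: i for i, v in enumerate(vocab)}
    encoded = [[1 if index[v] == i else 0 for i in range(len(vocab))] for v in values]
    return vocab, encoded


# Hypothetical raw rows, e.g. taxi pickups.
rows = [
    {"day": "Mon", "hour": 8},
    {"day": "Mon", "hour": 18},
    {"day": "Sat", "hour": 8},
]
crossed = [feature_cross(r, hour_boundaries=[6, 12, 18]) for r in rows]
vocab, encoded = one_hot(crossed)
print(crossed)  # ['Mon_x_1', 'Mon_x_3', 'Sat_x_1']
```

The crossed feature lets even a linear model learn that a Monday morning behaves differently from a Saturday morning, which neither column can express on its own.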

Syllabus:

  1. Introduction
  2. Raw Data to Features
  3. Preprocessing and Feature Creation
  4. Feature Crosses
  5. TF Transform

Created By : Google

Level : Intermediate

Commitment : 2 weeks of study, 5–7 hours per week

Language : English, Subtitles: French, Portuguese (Brazilian), German, Spanish, Japanese

Hardware Required : You’ll need a desktop web browser to run this course’s interactive labs via Qwiklabs.

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#28.Data Science in Stratified Healthcare and Precision Medicine

An increasing volume of data is becoming available in biomedicine and healthcare, from genomic data, to electronic patient records and data collected by wearable devices. Recent advances in data science are transforming the life sciences, leading to precision medicine and stratified healthcare.

In this course, you will learn about some of the different types of data and computational methods involved in stratified healthcare and precision medicine. You will gain hands-on experience of working with such data, and you will learn from leaders in the field about successful case studies.

Topics include: (i) Sequence Processing, (ii) Image Analysis, (iii) Network Modelling, (iv) Probabilistic Modelling, (v) Machine Learning, (vi) Natural Language Processing, (vii) Process Modelling and (viii) Graph Data.

This course is aimed at learners who are familiar with the basic principles of biomedical and healthcare practice, and who have an interest in data science. Prior knowledge in computer programming or data science is not needed.

Syllabus:

  1. Welcome to the Course
  2. WEEK 2 — This week you will be introduced to Sequence Processing and Medical Image Analysis.
  3. WEEK 3 — This week you will learn about Probabilistic and Network Modelling, and how they are applied to biomedicine.
  4. WEEK 4 — This week you will discover how clinical notes and other free-form text can be analysed with the use of Natural Language Processing techniques.
  5. WEEK 5 — In this final week of the course you will learn how the Graph Data model allows for effective linkage of different data in the life sciences.

Created By : The University of Edinburgh

Level : Intermediate

Commitment : 5 weeks of study, 3–4 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.7

#29.Spatial Data Science and Applications

Spatial (map) data is considered a core infrastructure of the modern IT world, as substantiated by the business transactions of major IT companies such as Apple, Google, Microsoft, Amazon, Intel, and Uber, and even motor companies such as Audi, BMW, and Mercedes. Consequently, these companies are bound to hire more and more spatial data scientists. Based on this business trend, the course is designed to give learners who have a basic knowledge of data science and data analysis a firm understanding of spatial data science, and eventually to differentiate their expertise from that of ordinary data scientists and data analysts. Additionally, the course can help learners realize the value of spatial big data and the power of open source software for dealing with spatial data science problems.

This course starts in the first week by defining spatial data science and answering why spatial is special from three different perspectives: business, technology, and data. In the second week, four disciplines related to spatial data science (GIS, DBMS, Data Analytics, and Big Data Systems) are introduced together with the related open source software: QGIS, PostgreSQL, PostGIS, R, and Hadoop tools. During the third, fourth, and fifth weeks, you will learn the four disciplines one by one, from principles to applications. In the final week, five real-world problems and their solutions are presented with step-by-step procedures in an open source software environment.

Syllabus:

  1. Understanding Spatial Data Science
  2. Solution Structures of Spatial Data Science Problems
  3. Geographic Information System (GIS)
  4. Spatial DBMS and Big Data Systems
  5. Spatial Data Analytics
  6. Practical Applications of Spatial Data Science

Created By : Yonsei University

Level : Intermediate

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#30.Genomic Data Science with Galaxy

Learn to use the tools that are available from the Galaxy Project. This is the second course in the Genomic Big Data Science Specialization.

Syllabus:

  1. Introduction
  2. Galaxy 101
  3. Working with sequence data
  4. RNA-seq & Running your own Galaxy

Created By : Johns Hopkins University

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 3.6

#31.Python for Genomic Data Science

This class provides an introduction to the Python programming language and the IPython notebook. This is the third course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Syllabus:

  1. Week One — This week we will have an overview of Python and take the first steps towards programming.
  2. Week Two — In this module, we’ll be taking a look at Data Structures and Ifs and Loops.
  3. Week Three — In this module, we have a long three-part lecture on Functions as well as a 10-minute look at Modules and Packages.
  4. Week Four — In this module, we have another long three-part lecture, this time about Communicating with the Outside, as well as a final lecture about Biopython.

Created By : Johns Hopkins University

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.3

#32.Statistics for Genomic Data Science

An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Syllabus:

  1. Module 1 — This course is structured to hit the key conceptual ideas of normalization, exploratory analysis, linear modeling, testing, and multiple testing that arise over and over in genomic studies.
  2. Module 2 — This week we will cover preprocessing, linear modeling, and batch effects.
  3. Module 3 — This week we will cover modeling non-continuous outcomes (like binary or count data), hypothesis testing, and multiple hypothesis testing.
  4. Module 4 — In this week we will cover a lot of the general pipelines people use to analyze specific data types like RNA-seq, GWAS, ChIP-Seq, and DNA Methylation studies.

Created By : Johns Hopkins University

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.1

#33.Bioconductor for Genomic Data Science

Learn to use tools from the Bioconductor project to perform analysis of genomic data. This is the fifth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Syllabus:

  1. Week One — The class will cover how to install and use Bioconductor software.
  2. Week Two — In this week we will learn how to represent and compute on biological sequences.
  3. Week Three — In this week we will cover Basic Data Types, ExpressionSet, biomaRt, and R S4.
  4. Week Four — In this week, we will cover Getting data in Bioconductor, Rsamtools, oligo, limma, and minfi

Created By : Johns Hopkins University

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.0

#34.Executive Data Science Capstone

The Executive Data Science Capstone, the specialization’s culminating project, is an opportunity for people who have completed all four EDS courses to apply what they’ve learned to a real-world scenario developed in collaboration with Zillow, a data-driven online real estate and rental marketplace, and DataCamp, a web-based platform for data science programming. Your task will be to lead a virtual data science team and make key decisions along the way to demonstrate that you have what it takes to shepherd a complex analysis project from start to finish. For the final project, you will prepare and submit a presentation, which will be evaluated and graded by your fellow capstone participants.

Syllabus:

  1. Executive Data Science Capstone

Created By : Johns Hopkins University

Commitment : 1 week of study, 4–6 hours

Language : English, Subtitles: Indonesian, Spanish

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.6

#35.Data Science Ethics

This course provides a framework for analyzing the ethical and privacy implications of collecting and managing big data. Explore the broader impact of the data science field on modern society and the principles of fairness, accountability and transparency as you gain a deeper understanding of the importance of a shared set of ethical values. You will examine the need for voluntary disclosure when leveraging metadata to inform basic algorithms and/or complex artificial intelligence systems while also learning best practices for responsible data management, understanding the significance of the Fair Information Practices Principles Act and the laws concerning the “right to be forgotten.”

This course will help you answer questions such as who owns data, how do we value privacy, how to receive informed consent and what it means to be fair.

Data scientists and anyone beginning to use or expand their use of data will benefit from this course. No particular previous knowledge needed.

Syllabus:

  1. What are Ethics?
  2. History, Concept of Informed Consent
  3. Data Ownership
  4. Privacy
  5. Anonymity
  6. Data Validity
  7. Algorithmic Fairness
  8. Societal Consequences
  9. Code of Ethics
  10. Attributions

Created By : University of Michigan

Level : Beginner

Commitment : 4 weeks, 3–4 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.6

#36.A Crash Course in Data Science

In this one-week class, we will provide a crash course in what the key data science terms mean and how they play a role in successful organizations. This class is for anyone who wants to learn what all the data science action is about, including those who will eventually need to manage data scientists. The goal is to get you up to speed as quickly as possible on data science without all the fluff.

This is a focused course designed to rapidly get you up to speed on the field of data science. Our goal was to make this as convenient as possible for you without sacrificing any essential content. We’ve left the technical information aside so that you can focus on managing your team and moving it forward.

After completing this course you will know:

1. How to describe the role data science plays in various contexts

2. How statistics, machine learning, and software engineering play a role in data science

3. How to describe the structure of a data science project

4. The key terms and tools used by data scientists

5. How to identify a successful and an unsuccessful data science project

6. The role of a data science manager

Syllabus:

  1. A Crash Course in Data Science

Created By : Johns Hopkins University

Commitment : 1 week of study, 4–6 hours

Language : English, Subtitles: Chinese (Traditional), Russian, Turkish, Hindi

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#37.Materials Data Sciences and Informatics

This course aims to provide a succinct overview of the emerging discipline of Materials Informatics at the intersection of materials science, computational science, and information science. Attention is drawn to specific opportunities afforded by this new field in accelerating materials development and deployment efforts. A particular emphasis is placed on materials exhibiting hierarchical internal structures spanning multiple length/structure scales and the impediments involved in establishing invertible process-structure-property (PSP) linkages for these materials. More specifically, it is argued that modern data sciences (including advanced statistics, dimensionality reduction, and formulation of metamodels) and innovative cyberinfrastructure tools (including integration platforms, databases, and customized tools for enhancement of collaborations among cross-disciplinary team members) are likely to play a critical and pivotal role in addressing the above challenges.

Syllabus:

  1. Welcome, Accelerating Materials Development and Deployment
  2. Materials Knowledge and Materials Data Science
  3. Materials Knowledge Improvement Cycles
  4. Case Study in Homogenization: Plastic Properties of Two-Phase Composites
  5. Materials Innovation Cyberinfrastructure and Integrated Workflows

Created By : Georgia Institute of Technology

Level : Intermediate

Commitment : 5 weeks of study, 2–3 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.3

#38.Building a Data Science Team

In this one-week course, we will cover how you can find the right people to fill out your data science team, how to organize them to give them the best chance to feel empowered and successful, and how to manage your team as it grows.

This is a focused course designed to rapidly get you up to speed on the process of building and managing a data science team. Our goal was to make this as convenient as possible for you without sacrificing any essential content. We’ve left the technical information aside so that you can focus on managing your team and moving it forward.

After completing this course you will know:

1. The different roles in the data science team including data scientist and data engineer

2. How the data science team relates to other teams in an organization

3. The expected qualifications of different data science team members

4. Relevant questions for interviewing data scientists

5. How to manage the onboarding process for the team

6. How to guide data science teams to success

7. How to encourage and empower data science teams

Syllabus:

  1. Building a Data Science Team

Created By : Johns Hopkins University

Commitment : 1 week of study, 4–6 hours

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#39.How to Win a Data Science Competition: Learn from Top Kagglers

If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience and improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. At the same time you get to do it in a competitive context against thousands of participants, each of whom tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Being able to achieve high ranks consistently can help you accelerate your career in data science.

In this course, you will learn to analyse and competitively solve such predictive modelling tasks.

When you finish this class, you will:

  • Understand how to solve predictive modelling competitions efficiently and learn which of the skills obtained can be applicable to real-world tasks.
  • Learn how to preprocess the data and generate new features from various sources such as text and images.
  • Be taught advanced feature engineering techniques like generating mean-encodings, using aggregated statistical measures or finding nearest neighbors as a means to improve your predictions.
  • Be able to form reliable cross validation methodologies that help you benchmark your solutions and avoid overfitting or underfitting when tested with unobserved (test) data.
  • Gain experience of analysing and interpreting the data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakages and you will learn how to overcome them.
  • Acquire knowledge of different algorithms and learn how to efficiently tune their hyperparameters and achieve top performance.
  • Master the art of combining different machine learning models and learn how to ensemble.
  • Get exposed to past (winning) solutions and codes and learn how to read them.
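Two of the items above, mean-encodings and reliable cross validation, are often combined: a category is replaced by the average target value computed only from rows in other folds, which limits target leakage. The following is a minimal sketch of that idea in plain Python, not the course's own code; the function name, the round-robin fold assignment, and the tiny dataset are assumptions for illustration.

```python
from collections import defaultdict


def kfold_mean_encode(categories, targets, n_folds=3):
    """Out-of-fold mean encoding: each row is encoded using only other folds."""
    n = len(categories)
    global_mean = sum(targets) / n
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums = defaultdict(float)
        counts = defaultdict(int)
        # Accumulate target statistics from rows outside the current fold.
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
        # Encode rows inside the current fold; unseen categories fall back
        # to the global target mean.
        for i in range(n):
            if i % n_folds == fold:
                c = categories[i]
                encoded[i] = sums[c] / counts[c] if counts[c] else global_mean
    return encoded


# Hypothetical data: a city column and a binary click target.
cities = ["a", "a", "a", "b", "b", "c"]
clicked = [1, 0, 1, 0, 0, 1]
print(kfold_mean_encode(cities, clicked))  # [0.5, 1.0, 0.5, 0.0, 0.0, 0.5]
```

Note that the single "c" row receives the global mean of 0.5 rather than its own target of 1, which is exactly the leakage the out-of-fold scheme is meant to prevent.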

Syllabus:

  1. Introduction & Recap
  2. Feature Preprocessing and Generation with Respect to Models
  3. Final Project Description

Created By : National Research University Higher School of Economics

Level : Advanced

Commitment : 6–10 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.7

#40.Data Science Math Skills

This course is designed to teach learners the basic math they will need in order to be successful in almost any data science math course. It was created for learners who have basic math skills but may not have taken algebra or pre-calculus. Data Science Math Skills introduces the core math that data science is built upon, with no extra complexity, introducing unfamiliar ideas and math symbols one at a time.

Learners who complete this course will master the vocabulary, notation, concepts, and algebra rules that all data scientists must know before moving on to more advanced material.

Topics include:

  • Set theory, including Venn diagrams
  • Properties of the real number line
  • Interval notation and algebra with inequalities
  • Uses for summation and Sigma notation
  • Math on the Cartesian (x,y) plane, slope and distance formulas
  • Graphing and describing functions and their inverses on the x-y plane
  • The concept of instantaneous rate of change and tangent lines to a curve
  • Exponents, logarithms, and the natural log function
  • Probability theory, including Bayes’ theorem
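As a small taste of the last topic, Bayes' theorem states that P(H | E) = P(E | H) · P(H) / P(E). The sketch below applies it to a screening test; the function name and all of the numbers are hypothetical and chosen only to show the arithmetic.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H), and P(E | not H)."""
    # Total probability of observing the evidence E.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test detects it
# 90% of the time, and it wrongly flags 5% of healthy people.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(posterior, 4))  # 0.1538
```

Even with a fairly accurate test, a positive result here implies only about a 15% chance of having the condition, because the condition itself is rare.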

While this course is intended as a general introduction to the math skills needed for data science, it can be considered a prerequisite for learners interested in the course, “Mastering Data Analysis in Excel,” which is part of the Excel to MySQL Data Science Specialization. Learners who master Data Science Math Skills will be fully prepared for success with the more advanced math concepts introduced in “Mastering Data Analysis in Excel.”

Syllabus:

  1. Welcome to Data Science Math Skills
  2. Building Blocks for Problem Solving
  3. Functions and Graphs
  4. Measuring Rates of Change
  5. Introduction to Probability Theory

Created By : Duke University

Level : Beginner

Commitment : Four weeks, 3–5 hours per week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.5

#41.Process Mining: Data science in Action

Process mining is the missing link between model-based process analysis and data-oriented analysis techniques. Through concrete data sets and easy to use software the course provides data science knowledge that can be applied directly to analyze and improve processes in a variety of domains.

Data science is the profession of the future, because organizations that are unable to use (big) data in a smart way will not survive. It is not sufficient to focus on data storage and data analysis. The data scientist also needs to relate data to process analysis. Process mining bridges the gap between traditional model-based process analysis (e.g., simulation and other business process management techniques) and data-centric analysis techniques such as machine learning and data mining. Process mining seeks the confrontation between event data (i.e., observed behavior) and process models (hand-made or discovered automatically). This technology has become available only recently, but it can be applied to any type of operational process (organizations and systems). Example applications include: analyzing treatment processes in hospitals, improving customer service processes in a multinational, understanding the browsing behavior of customers using a booking site, analyzing failures of a baggage handling system, and improving the user interface of an X-ray machine. All of these applications have in common that dynamic behavior needs to be related to process models. Hence, we refer to this as “data science in action”.

The course explains the key analysis techniques in process mining. Participants will learn various process discovery algorithms. These can be used to automatically learn process models from raw event data. Various other process analysis techniques that use event data will be presented. Moreover, the course will provide easy-to-use software, real-life data sets, and practical skills to directly apply the theory in a variety of application domains.
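To give a feel for what learning a process model from raw event data involves, the sketch below counts directly-follows relations, a basic building block that many discovery algorithms start from. It is not one of the course's algorithms or its software; the event log, the activity names, and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is directly followed by activity b in a case."""
    traces = defaultdict(list)
    # Group events by case; events are assumed to be ordered by time already.
    for case_id, activity in event_log:
        traces[case_id].append(activity)
    counts = Counter()
    for trace in traces.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical event log of (case id, activity) pairs.
event_log = [
    (1, "register"), (1, "check"), (1, "pay"),
    (2, "register"), (2, "pay"),
    (3, "register"), (3, "check"), (3, "pay"),
]
graph = directly_follows(event_log)
for (a, b), n in sorted(graph.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting counts already reveal that "check" is an optional step between "register" and "pay", which is the kind of structure a discovered process model makes explicit.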

This course starts with an overview of approaches and technologies that use event data to support decision making and business process (re)design. Then the course focuses on process mining as a bridge between data mining and business process modeling. The course is at an introductory level with various practical assignments.

This course is aimed at both students and professionals. A basic understanding of logic, sets, and statistics (at the undergraduate level) is assumed. Basic computer skills are required to use the software provided with the course (but no programming experience is needed). Participants are also expected to have an interest in process modeling and data mining but no specific prior knowledge is assumed as these concepts are introduced in the course.

Syllabus:

  1. Introduction and Data Mining
  2. Process Models and Process Discovery
  3. Different Types of Process Models
  4. Process Discovery Techniques and Conformance Checking
  5. Enrichment of Process Models
  6. Operational Support and Conclusion

Created By : Eindhoven University of Technology

Level : Intermediate

Commitment : 6 weeks of study, 3 to 5 hours/week of material + self study

Language : English

Hardware Required : Laptop or computer with 1+ GB memory, able to run Java tools.

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.7

#42.Data Science in Real Life

This is a focused course designed to rapidly get you up to speed on doing data science in real life. Our goal was to make this as convenient as possible for you without sacrificing any essential content. We’ve left the technical information aside so that you can focus on managing your team and moving it forward.

After completing this course you will know how to:

1. Describe the “perfect” data science experience

2. Identify strengths and weaknesses in experimental designs

3. Describe possible pitfalls when pulling / assembling data and learn solutions for managing data pulls.

4. Challenge statistical modeling assumptions and drive feedback to data analysts

5. Describe common pitfalls in communicating data analyses

6. Get a glimpse into a day in the life of a data analysis manager.

The course will be taught at a conceptual level for active managers of data scientists and statisticians.

Some key concepts being discussed include:

1. Experimental design, randomization, A/B testing

2. Causal inference, counterfactuals

3. Strategies for managing data quality

4. Bias and confounding

5. Contrasting machine learning versus classical statistical inference

Syllabus:

  1. Introduction, the perfect data science experience

Created By : Johns Hopkins University

Commitment : 1 week of study, 4–6 hours

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#43.Big Data Science with the BD2K-LINCS Data Coordination and Integration Center

The Library of Integrated Network-based Cellular Signatures (LINCS) is an NIH Common Fund program. The idea is to perturb different types of human cells with many different types of perturbations, such as drugs and other small molecules; genetic manipulations such as knockdown or overexpression of single genes; and manipulation of the extracellular microenvironment conditions, for example, growing cells on different surfaces. These perturbations are applied to various types of human cells, including induced pluripotent stem cells from patients, differentiated into various lineages such as neurons or cardiomyocytes. Then, to better understand the molecular networks that are affected by these perturbations, changes in the levels of many different variables are measured, including mRNAs, proteins, and metabolites, as well as cellular phenotypic changes such as changes in cell morphology.

The BD2K-LINCS Data Coordination and Integration Center (DCIC) is commissioned to organize, analyze, visualize and integrate this data with other publicly available relevant resources. In this course we briefly introduce the DCIC and the various Centers that collect data for LINCS. We then cover metadata and how metadata is linked to ontologies. We then present data processing and normalization methods to clean and harmonize LINCS data. This is followed by a discussion of how data is served through RESTful APIs. Most importantly, the course covers computational methods including data clustering, gene-set enrichment analysis, interactive data visualization, and supervised learning. Finally, we introduce crowdsourcing/citizen-science projects where students can work together in teams to extract expression signatures from public databases and then query such collections of signatures against LINCS data for predicting small molecules as potential therapeutics.

Learn various methods of analysis including: unsupervised clustering, gene-set enrichment analysis, interactive data visualization, and supervised machine learning with application to data from the Library of Integrated Network-based Cellular Signatures (LINCS) program, and other relevant Big Data from high content molecular omics data and phenotype profiling of mammalian cells.

Syllabus:

  1. The Library of Integrated Network-based Cellular Signatures (LINCS) Program Overview
  2. Metadata and Ontologies
  3. Serving Data with APIs
  4. Bioinformatics Pipelines
  5. The Harmonizome
  6. Data Normalization
  7. Data Clustering
  8. Midterm Exam
  9. Enrichment Analysis
  10. Machine Learning
  11. Benchmarking
  12. Interactive Data Visualization
  13. Crowdsourcing Projects
  14. Final Exam

Created By : Icahn School of Medicine at Mount Sinai

Level : Intermediate

Commitment : 4–5 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 5.0

#44.Building Data Visualization Tools

The data science revolution has produced reams of new data from a wide variety of new sources. These new datasets are being used to answer new questions in ways never before conceived. Visualization remains one of the most powerful ways to draw conclusions from data, but the influx of new data types requires the development of new visualization techniques and building blocks. This course provides you with the skills for creating those new visualization building blocks. We focus on the ggplot2 framework and describe how to use and extend the system to suit the specific needs of your organization or team. Upon completing this course, learners will be able to build the tools needed to visualize a wide variety of data types and will have the fundamentals needed to address new data types as they come about.

Syllabus:

  1. Welcome to Building Data Visualization Tools
  2. Plotting with ggplot2
  3. Mapping and interactive plots
  4. The grid Package
  5. Building New Graphical Elements

Created By : Johns Hopkins University

Level : Intermediate

Commitment : 4 weeks, 2 hours per week

Language : English, Subtitles: Chinese (Simplified)

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 3.9

#45.Understanding China, 1700–2000: A Data Analytic Approach, Part 1

The purpose of this course is to summarize new directions in Chinese history and social science produced by the creation and analysis of big historical datasets based on newly opened Chinese archival holdings, and to organize this knowledge in a framework that encourages learning about China in comparative perspective.

Our course demonstrates how a new scholarship of discovery is redefining what is singular about modern China and modern Chinese history. Current understandings of human history and social theory are based largely on Western experience or on non-Western experience seen through a Western lens. This course offers alternative perspectives derived from Chinese experience over the last three centuries. We present specific case studies of this new scholarship of discovery divided into two stand-alone parts, which means that students can take any part without prior or subsequent attendance of the other part.

Part 1 (this course) focuses on comparative inequality and opportunity and addresses two related questions ‘Who rises to the top?’ and ‘Who gets what?’.

Syllabus:

  1. Orientation and Module 1: Social Structure and Education in Late Imperial China
  2. Module 2: Education and Social Mobility in Contemporary China
  3. Module 3: Social Mobility and Wealth Distribution in Late Imperial and Contemporary China
  4. Module 4: Wealth Distribution and Regime Change in Twentieth Century China
  5. Final Exam and Farewell

Created By : The Hong Kong University of Science and Technology

Commitment : 5 weeks of study, 2–3 hours/week

Language : English

How To Pass : Pass all graded assignments to complete the course.

Average User Rating : 4.4

#46.Understanding China, 1700–2000: A Data Analytic Approach, Part 2

The purpose of this course is to summarize new directions in Chinese history and social science produced by the creation and analysis of big historical datasets based on newly opened Chinese archival holdings, and to organize this knowledge in a framework that encourages learning about China in comparative perspective.

Part 2 (this course) turns to an arguably even more important question ‘Who are we?’ as seen through the framework of comparative population behavior — mortality, marriage, and reproduction — and their interaction with economic conditions and human values. We do so because mortality and reproduction are fundamental and universal, because they differ historically just as radically between China and the West as patterns of inequality and opportunity, and because these differences demonstrate the mutability of human behavior and values.

Syllabus:

  1. Orientation and Module 1: Who Are We and Who Survives
  2. Module 2: Who Reproduces and Who Marries
  3. Module 3: Who Cares and Course Conclusion
  4. Final Exam and Farewell

Created By : The Hong Kong University of Science and Technology

Commitment: 4 weeks of study, 2–3 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.3

#47.Data Management for Clinical Research

This course presents critical concepts and practical methods to support planning, collection, storage, and dissemination of data in clinical research.

Understanding and implementing solid data management principles is critical for any scientific domain. Regardless of your current (or anticipated) role in the research enterprise, a strong working knowledge and skill set in data management principles and practice will increase your productivity and improve your science. Our goal is to use these modules to help you learn and practice this skill set.

This course assumes very little current knowledge of technology other than how to operate a web browser. We will focus on practical lessons, short quizzes, and hands-on exercises as we explore together best practices for data management.

Syllabus:

  1. Research Data Collection Strategy
  2. Electronic Data Capture Fundamentals
  3. Planning a Data Strategy for a Prospective Study
  4. Practicing What We’ve Learned: Implementation
  5. Post-Study Activities and Other Considerations

Created By : Vanderbilt University

Level: Beginner

Commitment: 6 weeks of study, 2–4 hours per week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.7

#48.Big Data, Genes, and Medicine

This course distills for you expert knowledge and skills mastered by professionals in Health Big Data Science and Bioinformatics. You will learn exciting facts about the human body biology and chemistry, genetics, and medicine that will be intertwined with the science of Big Data and skills to harness the avalanche of data openly available at your fingertips and which we are just starting to make sense of. We’ll investigate the different steps required to master Big Data analytics on real datasets, including Next Generation Sequencing data, in a healthcare and biological context, from preparing data for analysis to completing the analysis, interpreting the results, visualizing them, and sharing the results.

Needless to say, when you master these high-demand skills, you will be well positioned to apply for or move to positions in biomedical data analytics and bioinformatics. No matter what your skill levels are in biomedical or technical areas, you will gain highly valuable new or sharpened skills that will make you stand out as a professional and want to dive even deeper into biomedical Big Data. It is my hope that this course will spark your interest in the vast possibilities offered by publicly available Big Data to better understand, prevent, and treat diseases.

This course is primarily aimed at health care professionals or assistants, and those with a BS/MA/MS in science or technology or equivalent professional experience. Minimum technical skills are a good understanding of using an Excel spreadsheet. Additional prerequisite knowledge in basic statistics would be preferred, however additional resources will be made available to learners to acquire this knowledge. I think that anyone interested in getting insights into how to harness Big Data to better understand, prevent, and treat diseases can take this course because the material can be applied at different levels of expertise.

Syllabus:

  1. Genes and Data
  2. Preparing Datasets for Analysis
  3. Finding Differentially Expressed Genes
  4. Predicting Diseases from Genes
  5. Determining Gene Alterations
  6. Clustering and Pathway Analysis

Created By : The State University of New York

Level: Advanced

Commitment: 6 weeks of study, 3–5 hours per week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.2

#49.Data-driven Astronomy

Science is undergoing a data explosion, and astronomy is leading the way. Modern telescopes produce terabytes of data per observation, and the simulations required to model our observable Universe push supercomputers to their limits. To analyse this data scientists need to be able to think computationally to solve problems. In this course you will investigate the challenges of working with large datasets: how to implement algorithms that work; how to use databases to manage your data; and how to learn from your data with machine learning tools. The focus is on practical skills — all the activities will be done in Python 3, a modern programming language used throughout astronomy.

Regardless of whether you’re already a scientist, studying to become one, or just interested in how modern astronomy works ‘under the bonnet’, this course will help you explore astronomy: from planets, to pulsars to black holes.

This course is aimed at science students with an interest in computational approaches to problem solving, people with an interest in astronomy who would like to learn current research methods, or people who would like to improve their programming by applying it to astronomy examples.

Syllabus:

  1. Thinking about data
  2. Big data makes things slow
  3. Querying your data
  4. Managing your data
  5. Learning from data: regression
  6. Learning from data: classification

Created By : The University of Sydney

Level: Intermediate

Commitment: 6 weeks of study, 4–6 hours/week

Language: English

Hardware Required: You’ll need to have a computer with internet access.

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.9

#50.Measuring Causal Effects in the Social Sciences

How can we know if the differences in wages between men and women are caused by discrimination or differences in background characteristics? In this PhD-level course we look at causal effects as opposed to spurious relationships. We will discuss how they can be identified in the social sciences using quantitative data, and describe how this can help us understand social mechanisms.

Syllabus:

  1. The Nature of Causal Effects and How to Measure Them
  2. The Multivariate Regression Model and Mediating Factors
  3. Randomized Controlled Trials
  4. Instrumental Variables
  5. Difference in Difference

Created By : University of Copenhagen

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.1

#51.What is Data Science?

The art of uncovering the insights and trends in data has been around since ancient times. The ancient Egyptians used census data to increase efficiency in tax collection and they accurately predicted the flooding of the Nile river every year. Since then, people working in data science have carved out a unique and distinct field for the work they do. This field is data science. In this course, we will meet some data science practitioners and we will get an overview of what data science is today.

This course is primarily for individuals who are passionate about the field of data science and who are aspiring to become data scientists.

Syllabus:

  1. Defining Data Science and What Data Scientists Do
  2. Data Science Topics
  3. Data Science in Business

Created By : IBM

Level: Beginner

Commitment: 3 weeks of study, 2–3 hours/week

Language: English, Subtitles: Arabic

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.6

#52.Data Science Methodology

Despite the increase in computing power and access to data over the last couple of decades, our ability to use data within the decision-making process is either lost or not maximized. All too often, we don’t have a solid understanding of the questions being asked and how to apply the data correctly to the problem at hand.

This course has one purpose, and that is to share a methodology that can be used within data science, to ensure that the data used in problem solving is relevant and properly manipulated to address the question at hand.

Accordingly, in this course, you will learn:

  • The major steps involved in tackling a data science problem.
  • The major steps involved in practicing data science, from forming a concrete business or research problem, to collecting and analyzing data, to building a model, and understanding the feedback after model deployment.
  • How data scientists think!

This course is primarily aimed at data scientists, data engineers, or anyone with an interest in data science.

Syllabus:

  1. From Problem to Approach and From Requirements to Collection
  2. From Understanding to Preparation and From Modeling to Evaluation
  3. From Deployment to Feedback

Created By : IBM

Level: Beginner

Commitment: 3 weeks of study, 2–3 hours/week

Language: English, Subtitles: Arabic

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.5

#53.Open Source tools for Data Science

What are some of the most popular data science tools, how do you use them, and what are their features? In this course, you’ll learn about Jupyter Notebooks, RStudio IDE, Apache Zeppelin and Data Science Experience. You will learn about what each tool is used for, what programming languages they can execute, their features and limitations. With the tools hosted in the cloud on Cognitive Class Labs, you will be able to test each tool and follow instructions to run simple code in Python, R or Scala. To end the course, you will create a final project with a Jupyter Notebook on IBM Data Science Experience and demonstrate your proficiency preparing a notebook, writing Markdown, and sharing your work with your peers.

This course is aimed at anyone in Data Science who has little or no knowledge about the tools used by Data Scientists.

Syllabus:

  1. Introducing Cognitive Class Labs
  2. Jupyter Notebooks
  3. Apache Zeppelin Notebooks
  4. RStudio IDE
  5. IBM Watson Studio
  6. Project: Create and share a Jupyter Notebook

Created By : IBM

Level: Beginner

Commitment: 3 weeks of study, 2–3 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.6

#54.Databases and SQL for Data Science

Much of the world’s data resides in databases. SQL (or Structured Query Language) is a powerful language which is used for communicating with and extracting data from databases. A working knowledge of databases and SQL is a must if you want to become a data scientist.

The purpose of this course is to introduce relational database concepts and help you learn and apply knowledge of the SQL language. It is also intended to get you started with performing SQL access in a data science environment.

The emphasis in this course is on hands-on and practical learning. As such, you will work with real databases, real data science tools, and real-world datasets. You will create a database instance in the cloud. Through a series of hands-on labs you will practice building and running SQL queries. You will also learn how to access databases from Jupyter notebooks using SQL and Python.
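To give a rough feel for what accessing a database from Python looks like, here is a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory database as a stand-in for the cloud database instance used in the course; the table name and columns are hypothetical, and the ratings are the ones listed in this post.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course's cloud database.
conn = sqlite3.connect(":memory:")

# Hypothetical table for illustration only.
conn.execute("CREATE TABLE courses (title TEXT, provider TEXT, rating REAL)")

# Placeholders (?) keep the values separate from the SQL text.
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        ("What is Data Science?", "IBM", 4.6),
        ("Data Science Methodology", "IBM", 4.5),
        ("Basic Statistics", "University of Amsterdam", 4.7),
    ],
)

# Aggregate query: average rating per provider, highest first.
rows = conn.execute(
    "SELECT provider, ROUND(AVG(rating), 2) FROM courses "
    "GROUP BY provider ORDER BY AVG(rating) DESC"
).fetchall()
conn.close()

for provider, avg_rating in rows:
    print(provider, avg_rating)
```

The same pattern (connect, execute, fetch) carries over to other database drivers, although connection details differ from one database to another.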

No prior knowledge of databases, SQL, Python, or programming is required.

Anyone can audit this course at no charge. If you choose to take this course and earn the Coursera course certificate, you can also earn an IBM digital badge upon successful completion of the course.

This course is designed for anyone interested in learning SQL, especially those who want to apply SQL for data science. Therefore it is ideal for those aspiring to become data scientists. It is also suitable for those who want to become data analysts, data engineers, database administrators, or database developers. This course is for beginners in SQL and Data Science. It does not require any prior knowledge of SQL, databases, Python, or programming.

Syllabus:

  1. Week 1: Introduction to Databases and Basic SQL
  2. Week 2: Advanced SQL
  3. Week 3: Accessing Databases using Python
  4. Week 4: Course Assignment

Created By : IBM

Level: Beginner

Commitment: 3–4 weeks of study, 2–4 hours/week

Language: English, Subtitles: Arabic

Hardware Required: Only browser access is required. Access to a cloud-based environment will be provided for hands-on labs.

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.6

#55.Advanced Linear Models for Data Science 1: Least Squares

This class is an introduction to least squares from a linear algebraic and mathematical perspective. Before beginning the class make sure that you have the following:

  • A basic understanding of linear algebra and multivariate calculus.
  • A basic understanding of statistics and regression models.
  • At least a little familiarity with proof based mathematics.
  • Basic knowledge of the R programming language.

After taking this course, students will have a firm foundation in a linear algebraic treatment of regression modeling. This will greatly augment applied data scientists’ general understanding of regression models.
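As a small illustration of what least squares does, here is a sketch of a two-parameter (intercept and slope) fit. The course itself works in R and in matrix notation; plain Python is used here only for illustration, and the data points are hypothetical, chosen to lie exactly on the line y = 1 + 2x.

```python
# Hypothetical data lying exactly on y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form least squares estimates for a line:
# slope = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar)^2)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residuals are the gaps between observed and fitted values.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(intercept, slope)
```

With noisy data the residuals would no longer be zero; least squares picks the line that makes their sum of squares as small as possible.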

This class is for students who already have had a class in regression modeling and are familiar with the area who would like to see a more advanced treatment of the topic.

Syllabus :

  1. Background
  2. One and two parameter regression
  3. Linear regression
  4. General least squares
  5. Least squares examples
  6. Bases and residuals

Created By : Johns Hopkins University

Level: Advanced

Commitment: 6 weeks of study, 1–2 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.4

#56.Advanced Linear Models for Data Science 2: Statistical Linear Models

This class is an introduction to least squares from a linear algebraic and mathematical perspective. Before beginning the class make sure that you have the following:

  • A basic understanding of linear algebra and multivariate calculus.
  • A basic understanding of statistics and regression models.
  • At least a little familiarity with proof based mathematics.
  • Basic knowledge of the R programming language.

After taking this course, students will have a firm foundation in a linear algebraic treatment of regression modeling. This will greatly augment applied data scientists’ general understanding of regression models.

This class is for students who already have had a class in regression modeling and are familiar with the area who would like to see a more advanced treatment of the topic.

Syllabus :

  1. Introduction and expected values
  2. The multivariate normal distribution
  3. Distributional results
  4. Residuals

Created By : Johns Hopkins University

Level: Advanced

Commitment: 6 weeks of study, 1–2 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.7

#57.Data Processing Using Python

The course starts with the basic syntax of Python, then covers how to acquire data locally and from the network, how to present data, and how to conduct basic and advanced statistical analysis and visualization. It ends with how to design a simple GUI to present and process data, advancing level by level. Built on finance data and a series of popular cases, the course lets learners experience the simplicity, elegance and robustness of Python. It also discusses Python's fast, convenient and efficient data processing in humanities and social science fields such as literature, sociology and journalism, and in science and engineering fields such as mathematics and biology, in addition to business. The same techniques can be applied flexibly in other fields.

Syllabus :

  1. Welcome to learn Data Processing Using Python!
  2. Basics of Python
  3. Data Acquisition and Presentation
  4. Powerful Data Structures and Python Extension Libraries
  5. Python Data Statistics and Visualization
  6. Object Orientation and Graphical User Interface

Created By: Nanjing University

Level: Beginner

Commitment: 3–5 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.4

#58.Applied Text Mining in Python

This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the nltk framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling).
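For a taste of the text cleaning step described above, here is a minimal sketch using only Python's standard library. The sample sentence is made up, and the regular expression is a simple stand-in for a real tokenizer such as the ones nltk provides.

```python
import re
from collections import Counter

# Hypothetical sample text.
text = "Text mining turns raw text into data. Raw text is messy; cleaned text is useful!"

# Lowercase the text, then keep runs of letters as tokens.
# This drops punctuation and is only a rough stand-in for a real tokenizer.
tokens = re.findall(r"[a-z]+", text.lower())

# Count how often each token appears, a common first step before
# classification or topic modelling.
counts = Counter(tokens)
top = counts.most_common(1)

print(top)  # [('text', 4)]
```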

This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.

This course is part of “Applied Data Science with Python” and is intended for learners who have a basic Python or programming background and want to apply statistics, machine learning, information visualization, social network analysis, and text analysis techniques to gain new insight into data, or who want a basic nltk tutorial. Only minimal statistics background is expected, and the first course contains a refresher on these basic concepts. There are no geographic restrictions. Learners with formal training in Computer Science but without formal training in data science will still find the skills they acquire in these courses valuable in their studies and careers.

Syllabus:

  1. Module 1: Working with Text in Python
  2. Module 2: Basic Natural Language Processing
  3. Module 3: Classification of Text
  4. Module 4: Topic Modeling

Created By: University of Michigan

Level: Intermediate

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.2

#59.Building R Packages

Writing good code for data science is only part of the job. In order to maximize the usefulness and reusability of data science software, code must be organized and distributed in a manner that adheres to community-based standards and provides a good user experience. This course covers the primary means by which R software is organized and distributed to others. We cover R package development, writing good documentation and vignettes, writing robust software, cross-platform development, continuous integration tools, and distributing packages via CRAN and GitHub. Learners will produce R packages that satisfy the criteria for submission to CRAN.

Syllabus:

  1. Getting Started with R Packages
  2. Documentation and Testing
  3. Licensing, Version Control, and Software Design
  4. Continuous Integration and Cross Platform Development

Created By: Johns Hopkins University

Level: Intermediate

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.2

#60.The R Programming Environment

This course provides a rigorous introduction to the R programming language, with a particular focus on using R for software development in a data science setting. Whether you are part of a data science team or working individually within a community of developers, this course will give you the knowledge of R needed to make useful contributions in those settings. As the first course in the Specialization, the course provides the essential foundation of R needed for the following courses. We cover basic R concepts and language fundamentals, key concepts like tidy data and related “tidyverse” tools, processing and manipulation of complex and large datasets, handling textual data, and basic data science tasks. Upon completing this course, learners will have fluency at the R console and will be able to create tidy datasets from a wide range of possible data sources.

This course is aimed at learners who have some experience programming computers but who are not familiar with the R environment.

Syllabus:

  1. Basic R Language
  2. Data Manipulation
  3. Text Processing, Regular Expression, & Physical Memory
  4. Large Datasets

Created By: Johns Hopkins University

Level: Intermediate

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.4

#61.Advanced R Programming

This course covers advanced topics in R programming that are necessary for developing powerful, robust, and reusable data science tools. Topics covered include functional programming in R, robust error handling, object oriented programming, profiling and benchmarking, debugging, and proper design of functions. Upon completing this course you will be able to identify and abstract common data analysis tasks and to encapsulate them in user-facing functions. Because every data science environment encounters unique data challenges, there is always a need to develop custom software specific to your organization’s mission. You will also be able to define new data types in R and to develop a universe of functionality specific to those data types to enable cleaner execution of data science tasks and stronger reusability within a team.

Syllabus:

  1. Welcome to Advanced R Programming
  2. Functions
  3. Functional Programming
  4. Debugging and Profiling
  5. Object-Oriented Programming

Created By : Johns Hopkins University

Level: Intermediate

Language: English, Subtitles: Chinese (Simplified)

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.3

#62.Basic Statistics

Understanding statistics is essential to understand research in the social and behavioral sciences. In this course you will learn the basics of statistics; not just how to calculate them, but also how to evaluate them. This course will also prepare you for the next course in the specialization — the course Inferential Statistics.

In the first part of the course we will discuss methods of descriptive statistics. You will learn what cases and variables are and how you can compute measures of central tendency (mean, median and mode) and dispersion (standard deviation and variance). Next, we discuss how to assess relationships between variables, and we introduce the concepts correlation and regression.
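To make these measures concrete, here is a small sketch using Python's built-in statistics module. The course uses its own statistical software, so Python is just an illustration here, and the scores are hypothetical.

```python
import statistics

# Hypothetical scores for eight cases.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(scores)      # 5
median = statistics.median(scores)  # 4.5
mode = statistics.mode(scores)      # 4

# Measures of dispersion, treating the scores as a whole population.
variance = statistics.pvariance(scores)  # 4
std_dev = statistics.pstdev(scores)      # 2.0

print(mean, median, mode, variance, std_dev)
```

If the scores were a sample from a larger population, statistics.variance and statistics.stdev (which divide by n - 1) would be the usual choice instead.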

The second part of the course is concerned with the basics of probability: calculating probabilities, probability distributions and sampling distributions. You need to know about these things in order to understand how inferential statistics work.

The third part of the course consists of an introduction to methods of inferential statistics — methods that help us decide whether the patterns we see in our data are strong enough to draw conclusions about the underlying population we are interested in. We will discuss confidence intervals and significance tests.

You will not only learn about all these statistical concepts, you will also be trained to calculate and generate these statistics yourself using freely available statistical software.

Syllabus:

  1. Before we get started…
  2. Exploring Data
  3. Correlation and Regression
  4. Probability
  5. Probability Distributions
  6. Sampling Distributions
  7. Confidence Intervals
  8. Significance Tests
  9. Exam time!

Created By : University of Amsterdam

Level: Beginner

Commitment: 8 weeks of study; week 1: 3–6 hours; weeks 2–8: 1–3 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.7

#63.Practical Predictive Analytics: Models and Methods

Statistical experiment design and analytics are at the heart of data science. In this course you will design statistical experiments and analyze the results using modern methods. You will also explore the common pitfalls in interpreting statistical arguments, especially those associated with big data. Collectively, this course will help you internalize a core set of practical and effective machine learning methods and concepts, and apply them to solve some real world problems.

Learning Goals: After completing this course, you will be able to:

1. Design effective experiments and analyze the results

2. Use resampling methods to make clear and bulletproof statistical arguments without invoking esoteric notation

3. Explain and apply a core set of classification methods of increasing complexity (rules, trees, random forests), and associated optimization methods (gradient descent and variants)

4. Explain and apply a set of unsupervised learning concepts and methods

5. Describe the common idioms of large-scale graph analytics, including structural query, traversals and recursive queries, PageRank, and community detection
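The resampling methods mentioned in goal 2 can be sketched in a few lines. This is a minimal bootstrap example, not the course's own material: the measurements are hypothetical, and the number of resamples and the 95% percentile interval are choices made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical measurements.
sample = [12.1, 9.8, 11.4, 10.3, 13.0, 9.5, 10.9, 11.7]


def bootstrap_means(data, n_resamples=2000):
    """Resample with replacement and return the mean of each resample."""
    return [
        statistics.fmean(random.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]


means = sorted(bootstrap_means(sample))

# Percentile interval: cut 2.5% of the resampled means from each tail.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]

print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The appeal of this approach is that the interval comes straight from the data by repeated resampling, with no formula for the standard error required.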

Syllabus:

  1. Practical Statistical Inference
  2. Supervised Learning
  3. Optimization
  4. Unsupervised Learning

Created By: University of Washington

Commitment: 4 weeks of study, 6–8 hours/week

Language: English, Subtitles: Korean

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.1

#64.Statistics with R Capstone

The capstone project will be an analysis using R that answers a specific scientific/business question provided by the course team. A large and complex dataset will be provided to learners and the analysis will require the application of a variety of methods and techniques introduced in the previous courses, including exploratory data analysis through data visualization and numerical summaries, statistical inference, and modeling as well as interpretations of these results in the context of the data and the research question. The analysis will implement both frequentist and Bayesian techniques and discuss in context of the data how these two approaches are similar and different, and what these differences mean for conclusions that can be drawn from the data.

A sampling of the final projects will be featured on the Duke Statistical Science department website.

Syllabus:

  1. About the Capstone Project
  2. Exploratory Data Analysis (EDA)
  3. EDA and Basic Model Selection — Submission
  4. EDA and Basic Model Selection — Evaluation
  5. Model Selection and Diagnostics
  6. Out of Sample Prediction
  7. Final Data Analysis — Submission
  8. Final Data Analysis — Evaluation

Created By : Duke University

Commitment: 5–10 hours/week

Language: English

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.7

#65.Model Thinking

We live in a complex world with diverse people, firms, and governments whose behaviors aggregate to produce novel, unexpected phenomena. We see political uprisings, market crashes, and a never-ending array of social trends. How do we make sense of it? Models. Evidence shows that people who think with models consistently outperform those who don’t. Moreover, people who think with lots of models outperform people who use only one. Why do models make us better thinkers? Models help us to better organize information, to make sense of that fire hose or hairball of data (choose your metaphor) available on the Internet. Models improve our abilities to make accurate forecasts. They help us make better decisions and adopt more effective strategies. They can even improve our ability to design institutions and procedures. In this class, I present a starter kit of models: I start with models of tipping points. I move on to cover models that explain the wisdom of crowds, models that show why some countries are rich and some are poor, and models that help unpack the strategic decisions of firms and politicians.

The models covered in this class provide a foundation for future social science classes, whether they be in economics, political science, business, or sociology. Mastering this material will give you a huge leg up in advanced courses. They also help you in life. Here’s how the course will work. For each model, I present a short, easily digestible overview lecture. Then, I’ll dig deeper. I’ll go into the technical details of the model. Those technical lectures won’t require calculus but be prepared for some algebra. For all the lectures, I’ll offer some questions and we’ll have quizzes and even a final exam. If you decide to do the deep dive, and take all the quizzes and the exam, you’ll receive a Course Certificate. If you just decide to follow along for the introductory lectures to gain some exposure that’s fine too. It’s all free. And it’s all here to help make you a better thinker!

Syllabus:

  1. Why Model & Segregation/Peer Effects
  2. Aggregation & Decision Models
  3. Thinking Electrons: Modeling People & Categorical and Linear Models
  4. Tipping Points & Economic Growth
  5. Diversity and Innovation & Markov Processes
  6. Midterm Exam
  7. Lyapunov Functions & Coordination and Culture
  8. Path Dependence & Networks
  9. Randomness and Random Walks & Colonel Blotto
  10. Prisoners’ Dilemma and Collective Action & Mechanism Design
  11. Learning Models: Replicator Dynamics & Prediction and the Many Model Thinker
  12. Final Exam

Created By: University of Michigan

Commitment: 4–8 hours/week

Language: English, Subtitles: Arabic, Ukrainian, Chinese (Simplified), Portuguese (Brazilian), Turkish

How To Pass: Pass all graded assignments to complete the course.

Average User Rating: 4.8

Like this post? Don’t forget to share it!
