Videos uploaded by user “Data Science Dojo”
Intro to Web Scraping with Python and Beautiful Soup
 
33:31
Web scraping is a very powerful tool to learn for any data professional. With web scraping, the entire internet becomes your database. In this tutorial we show you how to parse a web page into a data file (CSV) using a Python package called BeautifulSoup. In this example, we scrape graphics card listings from NewEgg.com. Sublime: https://www.sublimetext.com/3 Anaconda: https://www.anaconda.com/distribution/#download-section If you are not seeing the command line, follow this tutorial: https://www.tenforums.com/tutorials/72024-open-command-window-here-add-windows-10-a.html Find more tutorials here: https://tutorials.datasciencedojo.com/ -- At Data Science Dojo, we believe data science is for everyone. Our in-person data science training has been attended by more than 4,000 employees from over 742 companies globally, including many leaders in tech like Microsoft, Apple, and Facebook. -- Learn more about Data Science Dojo here: https://hubs.ly/H0f6wzS0 See what our past attendees are saying here: https://hubs.ly/H0f6wzY0 -- Like Us: https://www.facebook.com/datasciencedojo Follow Us: https://twitter.com/DataScienceDojo Connect with Us: https://www.linkedin.com/company/datasciencedojo Also find us on: Google +: https://plus.google.com/+Datasciencedojo Instagram: https://www.instagram.com/data_science_dojo Vimeo: https://vimeo.com/datasciencedojo
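As a rough illustration of the workflow described above (parse a page, pull out fields, write a CSV), here is a minimal sketch using BeautifulSoup. The inline markup, the class names (`item-container`, `item-title`, `price`), and the product data are invented for illustration and are not NewEgg's actual page structure; a real scraper would first download the page.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded product listing page.
PAGE_HTML = """
<div class="item-container">
  <a class="item-title">Example GPU A</a>
  <span class="price">$199.99</span>
</div>
<div class="item-container">
  <a class="item-title">Example GPU B</a>
  <span class="price">$349.50</span>
</div>
"""


def scrape_products(html):
    """Return a list of (name, price) tuples found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for container in soup.find_all("div", class_="item-container"):
        name = container.find("a", class_="item-title").get_text(strip=True)
        price = container.find("span", class_="price").get_text(strip=True)
        products.append((name, price))
    return products


def to_csv(products):
    """Serialize the scraped rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product_name", "price"])
    writer.writerows(products)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(scrape_products(PAGE_HTML)))
```

Building the CSV in memory keeps the sketch free of side effects; writing to a file would only require passing an open file object to `csv.writer` instead.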
Views: 514433 Data Science Dojo
Introduction to Data Mining: Basic Vocabulary
 
04:17
All great learning opportunities are built on a solid foundation. This data mining fundamentals series is jam-packed with all the background information, technical terminology, and basic knowledge that you will need to hit the ground running. In part 1 of this data mining video series, we cover what data is and the basic vocabulary associated with it. Topics: – Data and Data Types – Data Quality – Data Preprocessing – Similarity and Dissimilarity – Data Exploration and Visualization
Views: 35124 Data Science Dojo
Introduction to Data Mining: Euclidean Distance & Cosine Similarity
 
04:51
In this Data Mining Fundamentals tutorial, we continue our introduction to similarity and dissimilarity by discussing Euclidean distance and cosine similarity. We will show you how to calculate the Euclidean distance and construct a distance matrix.
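The two measures named above can be sketched in a few lines of standard-library Python. The sample points are made up for illustration; the formulas are the usual definitions of Euclidean distance and cosine similarity, not code taken from the video.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def distance_matrix(points):
    """Square matrix whose entry [i][j] is the distance from point i to point j."""
    return [[euclidean_distance(p, q) for q in points] for p in points]


# Illustrative sample data (an assumption, not from the tutorial).
POINTS = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

if __name__ == "__main__":
    for row in distance_matrix(POINTS):
        print(row)
    print(cosine_similarity(POINTS[1], POINTS[2]))
```

Note that the second and third points lie on the same line through the origin, so their cosine similarity is 1 even though their Euclidean distance is not 0: the two measures answer different questions.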
Views: 24339 Data Science Dojo
What is A/B Testing? | Data Science in Minutes
 
03:43
What is A/B testing? In this quick tutorial, we go over the basics of A/B testing and answer questions such as: Why should businesses conduct A/B testing? How do you perform an A/B test? A/B testing aims to determine not only which technique performs better but also whether the difference is statistically significant.
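One common way to check whether a difference between two variants is statistically significant is a two-proportion z-test. The description does not say which test the video uses, so treat the following as an assumed choice; the conversion counts are invented.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical experiment: 100/1000 conversions for A, 150/1000 for B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (conventionally below 0.05) suggests the observed difference is unlikely to be due to chance alone; the threshold itself is a convention, not a rule.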
Views: 3055 Data Science Dojo
Data Science Dojo Alumni - Bayo (Olubayo) Adekanmbi
 
02:23
Bayo Adekanmbi, part of the Washington D.C. May 2017 cohort, was looking for a data science class that had a combination of academic and business-centric teaching. He found that our bootcamp gave him that combination, and also allowed him to apply practical, real-world data science that will take him to the next level.
Views: 3306 Data Science Dojo
My First Kaggle Submission
 
09:00
To prepare you for Data Science Dojo's day two homework we will explain what Kaggle is and show you how to create a Kaggle account and submit your model to the Kaggle competition. Titanic Data Set: https://www.kaggle.com/c/titanic -- At Data Science Dojo, we're extremely passionate about data science. We've helped educate and train 3600+ employees from over 742 companies globally, including many leaders in tech like Microsoft, Apple, and Facebook.
Views: 52139 Data Science Dojo
Intro to Azure ML: What is Azure Machine Learning?
 
12:14
What's better than machine learning? Machine learning where coding is optional! Drag and drop machine learning with a visual interface! We’re going to introduce you to a new tool to add to your data science toolkit, Azure Machine Learning Studio. Azure Machine Learning is a cloud-based data science platform in the Azure cloud ecosystem. Azure Machine Learning Studio also supports coding in Python, SQL, and R. In Part 1 we will cover: - What is Azure Machine Learning Studio - Being in the cloud - Subscriptions you need - Pricing of Azure Introduction to Data Mining: https://youtu.be/f7NfO16l04U
Views: 35496 Data Science Dojo
Introduction to Text Analytics with R: Overview
 
30:38
The overview of this video series provides an introduction to text analytics as a whole and what is to be expected throughout the instruction. It also includes specific coverage of: – Overview of the spam dataset used throughout the series – Loading the data and initial data cleaning – Some initial data analysis, feature engineering, and data visualization About the Series This data science tutorial introduces the viewer to the exciting world of text analytics with R programming. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models Kaggle Dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
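To make the first few techniques in that list concrete, here is a small standard-library Python sketch of tokenization, n-grams, and a bag-of-words matrix. The series itself uses R; this is not its code, and the tokenizer rule and sample messages are simplifying assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return the contiguous n-token sequences, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(documents):
    """Return (vocabulary, matrix) where matrix[d][t] counts term t in document d."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


if __name__ == "__main__":
    # Invented example messages, not rows from the SMS spam dataset.
    messages = ["Free prize! Call now.", "Are we still on for lunch?"]
    vocab, rows = bag_of_words(messages)
    print(vocab)
    print(rows)
    print(ngrams(tokenize(messages[0]), 2))
```

Each row of the matrix is a document vector in the vector space model, which is what makes measures such as cosine similarity between documents possible.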
Views: 71472 Data Science Dojo
Intro to Azure ML: Subscriptions & Workspaces
 
13:46
To begin data mining in Azure Machine Learning Studio, we must first set up an Azure subscription. Once we have a subscription, we create an Azure Machine Learning Studio workspace within Azure. Azure Machine Learning is in the cloud and allows workspaces to be shared with other users for collaborative data science projects. In Part 2 we will cover: - Getting an Azure subscription - Creating an Azure ML workspace - Exploring the Azure ML workspace Free trial Azure subscription: https://azure.microsoft.com/en-us/free/ Azure Portal: https://portal.azure.com
Views: 9571 Data Science Dojo
How to do the Titanic Kaggle competition in R - Part 1
 
35:07
As part of submitting to Data Science Dojo's Kaggle competition, you need to create a model from the Titanic data set. We will show you how to do this using RStudio. Titanic Data Set: https://www.kaggle.com/c/titanic Download RStudio: https://www.rstudio.com/products/rstudio
Views: 54493 Data Science Dojo
What is A/A Testing? | Data Science in Minutes
 
01:56
In this quick tutorial we go over A/A testing, what it is and how to use it to help you properly conduct A/B or multivariate tests. A/A testing is the tactic of using A/B testing to test two identical versions of a page against each other.
Views: 508 Data Science Dojo
Introduction to Data Mining: Similarity & Dissimilarity
 
03:43
In this Data Mining Fundamentals tutorial, we introduce you to similarity and dissimilarity. Similarity is a numerical measure of how alike two data objects are, and dissimilarity is a numerical measure of how different two data objects are. We also discuss similarity and dissimilarity for single attributes.
Views: 19461 Data Science Dojo
Automated Web Scraping in R: Writing your Script
 
16:34
In this video tutorial you will learn how to write standard web scraping commands in R, filter timely data based on time diffs, analyze or summarize key information in the text, and send an email alert with the results of your analysis. Packages used: rvest (downloading website data); lubridate (cleaning and converting date-time data); stringr (cleaning text in R); LSAfun (ranking and summarizing the text). Recommended for intermediate-level R users. See our Introduction to R to get up to speed with basic R commands: https://tutorials.datasciencedojo.com/introduction-to-r/ The full R script for this video tutorial can be accessed here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/web_scraping_R-master Link to Website used: https://www.marketwatch.com/story/bitcoin-jumps-after-credit-scare-2018-10-15 To see an example of web scraping timely political news events and commentary from Reddit, check out Data Science Dojo's blog tutorial on KDnuggets: https://www.kdnuggets.com/2018/12/automated-web-scraping-r.html
Views: 3852 Data Science Dojo
Intro to Azure ML: Joining Datasets
 
10:11
Last time we prepared our dataset for a join. In this video we’ll use the join data module inside of Azure ML to cross reference each airport id with the airport table to find airport city, airport state, and airport name. We will briefly go over the different types of joins, then combine the three tables together. Each time we join we will add 3 columns to our dataset.
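The lookup described above can be sketched outside Azure ML with plain Python dictionaries. The table contents, column names, and the `how` parameter below are hypothetical; the sketch only contrasts an inner join with a left outer join and shows each join adding three columns.

```python
# Hypothetical airport lookup table keyed by airport id.
AIRPORTS = {
    10: {"city": "Seattle", "state": "WA", "name": "Example Field"},
    20: {"city": "Austin", "state": "TX", "name": "Sample Airport"},
}

# Hypothetical flight rows; the last one has no matching airport.
FLIGHTS = [
    {"flight": "F1", "origin_id": 10},
    {"flight": "F2", "origin_id": 20},
    {"flight": "F3", "origin_id": 99},
]


def join_airports(flights, airports, key, how="inner"):
    """Attach city, state, and name to each flight by looking up flight[key].

    how="inner" drops flights with no matching airport;
    how="left" keeps them and fills the three new columns with None.
    """
    joined = []
    for flight in flights:
        airport = airports.get(flight[key])
        if airport is None:
            if how == "inner":
                continue
            airport = {"city": None, "state": None, "name": None}
        joined.append({**flight, **airport})
    return joined


if __name__ == "__main__":
    print(join_airports(FLIGHTS, AIRPORTS, "origin_id", how="inner"))
    print(join_airports(FLIGHTS, AIRPORTS, "origin_id", how="left"))
```

Joining once per airport role (origin, then arrival) would require renaming the added columns first so the two sets of three do not collide.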
Views: 5057 Data Science Dojo
Intro to Data Visualization with R & ggplot2
 
01:11:15
The R programming language is experiencing rapid increases in popularity and wide adoption across industries. This popularity is due, in part, to R’s rich and powerful data visualization capabilities. While tools like Excel, Power BI, and Tableau are often the go-to solutions for data visualizations, none of these tools can compete with R in terms of the sheer breadth of, and control over, crafted data visualizations. As an example, R’s ggplot2 package provides the R programmer with dozens of print-quality visualizations – where any visualization can be heavily customized with a minimal amount of code. In this webinar Dave Langer will provide an introduction to data visualization with the ggplot2 package. The focus of the webinar will be using ggplot2 to analyze your data visually with a specific focus on discovering the underlying signals/patterns of your business. Attendees will learn how to: • Craft ggplot visualizations, including customization of rendered output. • Choose optimal visualizations for the type of data and the nature of the analysis at hand. • Leverage ggplot2’s powerful segmentation capabilities to achieve “visual drill-in of data”. • Export ggplot2 visualizations from RStudio for use in documents and presentations. Repository: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Data%20Visualization%20with%20R%20and%20ggplot2
Views: 109251 Data Science Dojo
Introduction to Data Mining: Data Noise
 
04:10
In this Data Mining Fundamentals tutorial, we discuss data noise that can overlap valid data and outliers. Noise can appear because of human inconsistency and labeling. We will provide you with several examples of data noise, and how data noise can be measured and recorded.
Views: 7587 Data Science Dojo
Introduction to Data Mining: Feature Subset Selection
 
05:30
In this Data Mining Fundamentals tutorial, we discuss another approach to dimensionality reduction: feature subset selection. We discuss several techniques for feature subset selection, including the brute-force approach, the embedded approach, and the filter approach. Feature subset selection removes redundant and irrelevant features from your data.
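The brute-force approach mentioned above simply scores every non-empty subset of features and keeps the best one. The sketch below shows that enumeration; the feature names, relevance values, and penalty are all invented, and `toy_score` is a stand-in for what would normally be "train a model on this subset and measure its performance".

```python
from itertools import combinations


def brute_force_select(features, score):
    """Score every non-empty subset of features and return (best_subset, best_score)."""
    best_subset, best_score = (), float("-inf")
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            current = score(subset)
            if current > best_score:
                best_subset, best_score = subset, current
    return best_subset, best_score


# Hypothetical per-feature relevance; a real scorer would evaluate a trained model.
RELEVANCE = {"age": 0.30, "fare": 0.25, "ticket_id": 0.01, "cabin_noise": 0.0}
PENALTY_PER_FEATURE = 0.05


def toy_score(subset):
    """Toy objective: total relevance minus a fixed cost for each feature kept."""
    return sum(RELEVANCE[f] for f in subset) - PENALTY_PER_FEATURE * len(subset)


if __name__ == "__main__":
    print(brute_force_select(list(RELEVANCE), toy_score))
```

With n features there are 2^n - 1 non-empty subsets to score, which is why brute force is only practical for small feature counts and why embedded and filter approaches exist.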
Views: 7047 Data Science Dojo
Solving the Titanic Kaggle Competition in Azure ML
 
20:15
In this tutorial we will show you how to complete the Titanic Kaggle competition using Microsoft Azure Machine Learning Studio. This video assumes you have an Azure account and you understand how to use Azure. Kaggle Titanic Experiment: https://gallery.azure.ai/Experiment/Titanic-Kaggle-Competition-1
Views: 9531 Data Science Dojo
Introduction to Precision, Recall and F1 | Data Science in Minutes
 
03:06
You may have come across the terms "Precision, Recall and F1" when reading about Classification Models and machine learning. In this Data Science in Minutes tutorial, we will explain what Precision, Recall and F1 are, and when you can use each for measuring the accuracy of your model!
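For reference, the three metrics named above can be computed directly from true and predicted labels. This is a minimal sketch using the standard definitions; the label lists are made up, and returning 0.0 when a denominator is zero is a convention chosen here, not something the video specifies.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented labels for illustration.
    actual = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 1, 0, 0, 0, 0]
    print(precision_recall_f1(actual, predicted))
```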
Views: 456 Data Science Dojo
Intro to Azure ML: Building a Machine Learning Model
 
15:31
Let’s build our first machine learning model in Azure ML. First, we have to go shopping for a machine learning model: we must identify which type of machine learning algorithm we want to use. We ended up using a decision tree algorithm because we have lots of categorical data. We’ll build a very simple model so that we can visualize the decision tree and understand its results.
Views: 7762 Data Science Dojo
Intro to Azure ML: Splitting & Categorical Casting
 
18:47
Before we can feed this dataset into a machine learning model, there are two things we have to take care of. First, we have to make sure all the categorical features are treated as categories. We’ll use the Edit Metadata module once again to cast these features. Then we need to set up a holdout dataset for future evaluation of any model that we build. We will randomly sample our dataset into two partitions, a training set and a test set. We will lock the test set away and pretend that it’s future real-world data. The assumption is that if the model we build predicts well on this test set, which it has never been exposed to before, it will do about as well on future real-world data.
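The random partition into a training set and a held-out test set can be sketched as follows. The 30% test fraction and the fixed seed are assumptions made for illustration; the description does not state which split the video uses.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and return (train, test) partitions."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = train_test_split(range(10))
    print("train:", train)
    print("test:", test)
```

Every row ends up in exactly one partition, so the model never sees the test rows during training.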
Views: 5075 Data Science Dojo
Introduction to R: Data Types & Atomic Classes
 
04:40
In this introduction to R tutorial, we go deeper into how R functions, and introduce data types and the 5 atomic classes. R has three number types, a character type, and a logical type. Download R: https://cran.r-project.org/ Download RStudio: https://www.rstudio.com/products/rstu...
Views: 7644 Data Science Dojo
Introduction to Data Mining: Dimensionality Reduction
 
03:51
In this Data Mining Fundamentals tutorial, we discuss the curse of dimensionality and the purpose of dimensionality reduction for data preprocessing. When dimensionality increases, data becomes increasingly sparse in the space that it occupies. Dimensionality reduction will help you avoid this.
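One simple way to see the sparsity effect described above: divide each dimension into a fixed number of bins and ask what fraction of the resulting grid cells a fixed-size dataset could possibly occupy. The point count and bin count below are arbitrary illustrative choices.

```python
def max_occupied_fraction(n_points, bins_per_dim, n_dims):
    """Upper bound on the fraction of grid cells that n_points can occupy."""
    n_cells = bins_per_dim ** n_dims
    return min(1.0, n_points / n_cells)


if __name__ == "__main__":
    # 1000 points, 10 bins per dimension: coverage collapses as dimensions grow.
    for dims in (1, 2, 3, 4, 6, 10):
        print(dims, max_occupied_fraction(1000, 10, dims))
```

The number of cells grows exponentially with the number of dimensions while the number of points stays fixed, so most of the space ends up empty.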
Views: 6487 Data Science Dojo
What is Multivariate Testing? | Data Science in Minutes
 
02:49
In this tutorial, we will explain how a multivariate test differs from an A/B test, how to create and conduct a multivariate test, and what questions you should be asking of your test. Multivariate testing is a technique for testing a hypothesis in which multiple variables are modified. The goal of multivariate testing is to determine which combination of variations performs the best out of all of the possible combinations.
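"All of the possible combinations" is the Cartesian product of the variations of each page element, which is easy to enumerate. The element names and variations below are hypothetical examples.

```python
from itertools import product

# Hypothetical page elements and the variations to test for each.
VARIATIONS = {
    "headline": ["Headline A", "Headline B"],
    "button_color": ["green", "orange"],
    "image": ["photo", "illustration", "none"],
}


def all_combinations(variations):
    """Return one dict per combination, mapping each element to a chosen variation."""
    names = list(variations)
    return [dict(zip(names, values)) for values in product(*variations.values())]


if __name__ == "__main__":
    combos = all_combinations(VARIATIONS)
    print(len(combos))
    print(combos[0])
```

Because the count is the product of the number of variations per element, each added element multiplies the traffic needed to test every combination.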
Views: 516 Data Science Dojo
Intro to Azure ML: Modules & Experiments
 
17:33
Today we'll explore the interface of our new machine learning tool, Azure ML. How do you move data between the outside world and Azure ML? The import dataset module can read in data from a variety of sources: HTTP, Azure SQL database, Hadoop Hive query, or Azure Storage Blobs. You can also convert the data to a variety of formats, then save it to your local computer. If the data is fairly large, you can export it to other parts of the Azure ecosystem using the export data module. In Part 3 we will cover: - Creating an experiment - Exploring your experiment workspace - Modules - Importing and exporting data
Views: 9942 Data Science Dojo
Introduction to Data Mining: Data Quality
 
02:00
In this Data Mining Fundamentals tutorial, we introduce the most overlooked step in data mining: data quality. Understanding your data quality problems is very important to creating robust models that will actually work in production.
Views: 6908 Data Science Dojo
Intro to Azure ML: Renaming & Replicating Data
 
10:37
Now that we have a better understanding of our dataset, let’s go out and gather more data. There is an additional dataset inside Azure ML that we can use to look up and cross-reference the airport name with the airport ID, and bring in the state and city each airport belongs to. These features will add more information to our dataset. Before we can do this join, let’s mark the columns as belonging to either the arrival airport or the origin airport. We will use the Edit Metadata module to rename these columns and to fork the dataset into two duplicate but differently named projections.
Views: 4845 Data Science Dojo
Intro to Machine Learning with R & caret
 
01:42:33
Lecture starts at 3:00. The R programming language is experiencing rapid increases in popularity and wide adoption across industries. This popularity is due, in part, to R’s huge collection of open source machine learning algorithms. If you are a data scientist working with R, the caret package (short for [C]lassification [A]nd [RE]gression [T]raining) is a must-have tool in your toolbelt. The caret package provides capabilities that are useful in all stages of the data science project lifecycle. Most important of all, caret provides a common interface for training, tuning, and evaluating more than 200 machine learning algorithms. Not surprisingly, caret is a surefire way to accelerate your velocity as a data scientist! In this presentation Dave Langer will provide an introduction to the caret package. The focus of the presentation will be using caret to implement some of the most common tasks of the data science project lifecycle and to illustrate incorporating caret into your daily work. Attendees will learn how to: • Create stratified random samples of data useful for training machine learning models. • Train machine learning models using caret’s common interface. • Leverage caret’s powerful features for cross-validation and hyperparameter tuning. • Scale caret via use of multi-core, parallel training. • Increase their knowledge of caret’s many features. R code and accompanying dataset: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Machine%20Learning%20with%20R%20and%20Caret caret website: http://topepo.github.io/caret/index.html Learn more about David here: https://www.meetup.com/data-science-dojo/events/239730653/
Views: 43634 Data Science Dojo
Intro to Azure ML: Dropping & Selecting Columns
 
07:42
The machine learning model will learn from the data it has access to. Sometimes it becomes necessary to drop columns from our dataset so that our machine learning model does not learn from them. In this video we’ll drop a few columns that do not add value in their current form: Year, Quarter, Month, DayofMonth, OriginAirportID, DestAirportID, CRSDepTime, CRSArrTime, ArrDelay, Cancelled, and Diverted. If you want to know the rationale for dropping these columns, watch this video: https://www.youtube.com/watch?v=u3L_itVgp1A
Views: 4131 Data Science Dojo
Introduction to R Programming for Excel Users
 
01:45:58
R programming is rapidly becoming a valuable skill for data professionals of all stripes and a must-have skill for aspiring data scientists. Adding R programming to your data analyst skillset allows you to leverage powerful data visualizations, statistical analyses, and even machine learning in your daily work. In this presentation, we illustrate how your knowledge of performing data analyses in Microsoft Excel gives you a unique foundation for quickly learning how to apply R in your daily work. No knowledge of R coding is required for this meetup as Dave will illustrate scenarios in Excel and then walk through how each Excel scenario is implemented in R. Attendees will learn how: • Fundamental concepts of Excel (e.g., working with tables, collections of cells, and functions) translate 100% to working with data in R. • Excel pivot tables translate to R code. • Creating charts in Excel is very similar to creating data visualizations in R. • R offers visualizations not available in Excel out of the box. An Excel spreadsheet and R code will be made available prior to the meetup via GitHub for attendees interested in following along during the talk. Repository: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Business%20Data%20Analysis%20with%20Excel
Views: 32691 Data Science Dojo
How to do the Titanic Kaggle competition in R - Part 2
 
17:53
In part two of using RStudio for Data Science Dojo's Kaggle competition, we will show you more advanced cleaning functions for your model. This video assumes you have watched part one; if you have not, view it here: https://www.youtube.com/watch?v=Zx2TguRHrJE&t=0s&list=PL8eNk_zTBST83rnRPkypp_0MrjoXobLDF&index=4 Titanic Data Set: https://www.kaggle.com/c/titanic Download RStudio: https://www.rstudio.com/products/rstudio
Views: 15004 Data Science Dojo
Introduction to Data Mining: Types of Sampling
 
05:15
In this Data Mining Fundamentals tutorial, we discuss the different types of sampling for data preprocessing, such as random sampling, stratified sampling, and sampling with and without replacement. We will also dive into the issue of sample size and how it can affect your sampling.
Views: 7034 Data Science Dojo
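The sampling types named in the description above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the records, the class labels, and the 10% fraction are made-up assumptions, not material from the video.

```python
import random
from collections import defaultdict


def sample_without_replacement(items, k, rng):
    """Each item can be drawn at most once."""
    return rng.sample(items, k)


def sample_with_replacement(items, k, rng):
    """The same item may be drawn more than once."""
    return [rng.choice(items) for _ in range(k)]


def stratified_sample(items, key, fraction, rng):
    """Sample the same fraction from every group so class proportions are kept."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical imbalanced data: 20 "spam" records and 80 "ham" records.
records = [("spam", i) for i in range(20)] + [("ham", i) for i in range(80)]

no_repl = sample_without_replacement(records, 10, rng)
with_repl = sample_with_replacement(records, 10, rng)
strat = stratified_sample(records, key=lambda r: r[0], fraction=0.1, rng=rng)
```

With a small sample, simple random sampling can under-represent or miss the rare class entirely, which is one reason sample size matters; the stratified version always keeps the 20/80 split.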
Introduction to Text Analytics with R: TF-IDF
 
33:26
TF-IDF includes specific coverage of: • Discussion of how the document-term frequency matrix representation can be improved: – How to deal with documents of unequal lengths. – What to do about terms that are very common across documents. • Introduction of the mighty term frequency-inverse document frequency (TF-IDF) to implement these improvements: – TF for dealing with documents of unequal lengths. – IDF for dealing with terms that appear frequently across documents. • Implementation of TF-IDF using R functions and applying TF-IDF to document-term frequency matrices. • Data cleaning of matrices post TF-IDF weighting/transformation. About the Series This data science tutorial introduces the viewer to the exciting world of text analytics with R programming. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 19147 Data Science Dojo
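To make the TF-IDF description above concrete, here is a small sketch of one common variant: TF as a term's share of its document's total count, and IDF as log10 of (number of documents / number of documents containing the term). The series itself works in R; this Python version, the function name, and the toy counts are assumptions for illustration and are not the series' code.

```python
import math


def tf_idf(doc_term_counts):
    """Weight a document-term count matrix (list of rows) by TF-IDF."""
    n_docs = len(doc_term_counts)
    n_terms = len(doc_term_counts[0])
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for row in doc_term_counts if row[j] > 0) for j in range(n_terms)]
    # IDF shrinks the weight of terms that appear in many documents.
    idf = [math.log10(n_docs / d) if d else 0.0 for d in df]
    weighted = []
    for row in doc_term_counts:
        total = sum(row)
        # TF normalises by document length; an empty document gets zeros
        # instead of a division-by-zero error.
        tf = [count / total if total else 0.0 for count in row]
        weighted.append([t * w for t, w in zip(tf, idf)])
    return weighted


# Toy matrix: 3 documents (rows) by 3 terms (columns).
counts = [
    [2, 1, 0],
    [1, 0, 1],
    [4, 0, 0],
]
weights = tf_idf(counts)
```

The first term appears in every document, so its IDF is log10(3/3) = 0 and it receives zero weight everywhere. The guard for empty documents reflects the kind of clean-up that can be needed after weighting.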
Intro to Azure ML: Data Exploration
 
19:00
Now that Azure Machine Learning Studio is set up, let’s begin an end-to-end data science project in Azure Machine Learning. We’ll choose the flight delay data and use it to predict whether or not a flight will be late on arrival based upon the flight’s circumstances. In this video we will begin our preliminary exploration of the dataset using Azure Machine Learning’s dataset module. In Part 4 we will cover: - Introduction to projects - Exploring a data set using Azure ML - Building a data mining strategy
Views: 7097 Data Science Dojo
Introduction to Text Analytics with R: Our First Model
 
28:36
We are now ready to build our first model in RStudio and to do that, we cover: – Correcting column names derived from tokenization to ensure smooth model training. – Using caret to set up stratified cross validation. – Using the doSNOW package to accelerate caret machine learning training by using multiple CPUs in parallel. – Using caret to train single decision trees on text features and tune the trained model for optimal accuracy. – Evaluating the results of the cross validation process. About the Series This data science tutorial introduces the viewer to the exciting world of text analytics with R programming. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 16808 Data Science Dojo
Introduction to Data Mining: Document & Transaction Data
 
02:46
In this Data Mining Fundamentals video tutorial, we discuss another useful subcategory of record data, document data. We also discuss transaction data, which is record data where each record involves a set of items.
Views: 7497 Data Science Dojo
Introduction to Data Mining: Basic Data Types
 
04:29
Continuing our series on Data Mining Fundamentals, we introduce you to the three data set types (Record, Ordered, and Graph) and give you examples of when you would want to use each.
Views: 13161 Data Science Dojo
Introduction to Clustering
 
05:13
We will look at the fundamental concept of clustering, the different types of clustering methods, and their weaknesses. Clustering is an unsupervised learning technique that consists of grouping data points and creating partitions based on similarity. The ultimate goal is to find groups of similar objects. You will learn: - What is clustering? - Types of clustering methods: - Centroid-based clustering - Connectivity-based clustering - Distribution-based clustering - Density-based clustering - Clustering weaknesses -- At Data Science Dojo, we believe data science is for everyone. Our in-person data science training has been attended by more than 4000+ employees from over 780 companies globally, including many leaders in tech like Microsoft, Apple, and Facebook. -- Learn more about Data Science Dojo here: https://hubs.ly/H0g-Gx10 See what our past attendees are saying here: https://hubs.ly/H0g-GxH0 -- Like Us: https://www.facebook.com/datasciencedojo/ Follow Us: https://twitter.com/DataScienceDojo Connect with Us: https://www.linkedin.com/company/data-science-dojo Also find us on: Google +: https://plus.google.com/+Datasciencedojo Instagram: https://www.instagram.com/data_science_dojo/ Vimeo: https://vimeo.com/datasciencedojo
Views: 874 Data Science Dojo
Introduction to Text Analytics with R: Data Pipelines
 
31:49
In our next installment of introduction to text analytics, data pipelines, we cover: – Exploration of textual data for pre-processing “gotchas” – Using the quanteda package for text analytics – Creation of a prototypical text analytics pre-processing pipeline, including (but not limited to): tokenization, lower casing, stop word removal, and stemming. – Creation of a document-frequency matrix used to train machine learning models About the Series This data science tutorial introduces the viewer to the exciting world of text analytics with R programming. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models Kaggle Dataset: https://www.kaggle.com/uciml/sms-spam-collection-dataset The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 18536 Data Science Dojo
Introduction to Data Mining: Data Aggregation
 
04:07
In this Data Mining Fundamentals tutorial, we discuss our first data cleaning strategy, data aggregation. Aggregation is combining two or more attributes (or objects) into a single attribute (or object).
Views: 9994 Data Science Dojo
Intro to Azure ML: Cleaning & Summarizing Data
 
23:09
Let’s understand the aggregate behavior of our features further by looking at summary statistics. Azure Machine Learning gives us easy access to mean, median, mode, min, and max. Let’s look at each measure to see what it means for the interpretation of the data. The summarize data module also gives us a count of missing values for each feature. We can then formulate a strategy for cleaning missing data. The cleaning functions used in this tutorial are not the optimal way to clean data, but we must learn to crawl before we walk. We’ll drop each row that has a missing value in our response class, then use one of the measures of central tendency to fill in the other features: median for numeric features and mode for categorical features.
Views: 5696 Data Science Dojo
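The cleaning strategy described above (drop rows with a missing response, fill numeric features with the median and categorical features with the mode) can be sketched outside Azure ML as follows. The column names and rows are hypothetical, and this is plain Python rather than the Azure ML modules used in the video.

```python
from statistics import median, mode


def clean_missing(rows, response, numeric, categorical):
    """Drop rows lacking a response, then impute the remaining gaps."""
    # Keep only rows whose response value is present (copy so input is untouched).
    kept = [dict(r) for r in rows if r[response] is not None]
    for col in numeric:
        fill = median([r[col] for r in kept if r[col] is not None])
        for r in kept:
            if r[col] is None:
                r[col] = fill
    for col in categorical:
        fill = mode([r[col] for r in kept if r[col] is not None])
        for r in kept:
            if r[col] is None:
                r[col] = fill
    return kept


# Hypothetical flight records; None marks a missing value.
rows = [
    {"late": 1, "delay": 10, "carrier": "AA"},
    {"late": None, "delay": 5, "carrier": "DL"},
    {"late": 0, "delay": None, "carrier": "AA"},
    {"late": 0, "delay": 30, "carrier": None},
]
cleaned = clean_missing(rows, response="late", numeric=["delay"], categorical=["carrier"])
```

Note that the fill values are computed after the rows with a missing response are dropped. A column with no observed values at all would raise an error here, which a fuller version would need to handle.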
Classification Models | Data Science in Minutes
 
02:46
Ever wonder what classification models are? Well, in machine learning there are many different models, all with different types of outcomes. In this quick tutorial, we go over classification models. We talk about what they are, as well as what they are used for.
Views: 635 Data Science Dojo
Introduction to the Confusion Matrix | Data Science in Minutes
 
03:54
Looking at a confusion matrix for the first time can be... confusing. But Data Science Dojo is here to help! In this tutorial, we will give you a brief overview of what a confusion matrix is, how to create your matrix, and when you can use it. Topics include: true positives and negatives, target classes, and predictive models.
Views: 519 Data Science Dojo
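As a small companion to the confusion matrix description above, the sketch below counts the four cells of a two-class confusion matrix from actual and predicted labels. The labels are invented for illustration, and the function name is an assumption rather than anything shown in the video.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Made-up labels: 1 is the target (positive) class.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
matrix = confusion_counts(actual, predicted)
accuracy = (matrix["TP"] + matrix["TN"]) / len(actual)
```

Here the model gets 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so accuracy is 6 / 8 = 0.75.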
Introduction to Text Analytics with R: N-grams
 
29:37
N-grams includes specific coverage of: • Validating the effectiveness of TF-IDF in improving model accuracy. • Introducing the concept of N-grams as an extension to the bag-of-words model to allow for word ordering. • Discussing the trade-offs involved in using N-grams and how Text Analytics suffers from the “Curse of Dimensionality”. • Illustrating how quickly Text Analytics can strain the limits of your computer hardware. About the Series This data science tutorial introduces the viewer to the exciting world of text analytics with R programming. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 14188 Data Science Dojo
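The N-gram idea in the description above, extending bag-of-words so that word order is partly preserved, can be sketched as below. The sentence is an arbitrary example and the helper is hypothetical; the series itself uses R.

```python
def ngrams(tokens, n):
    """Return the contiguous n-word sequences in a list of tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "the quick brown fox".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
print(bigrams)  # ['the quick', 'quick brown', 'brown fox']

# Adding bigrams to the unigram vocabulary increases the number of features.
vocabulary = set(unigrams) | set(bigrams)
```

Even in this four-word example the feature count grows from 4 to 7 once bigrams are added; on a real corpus the growth is far larger, which is the dimensionality trade-off the video refers to.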
Introduction to Data Mining: Sampling for Data Selection
 
04:03
In this Data Mining Fundamentals tutorial, we discuss the data preprocessing technique of sampling for data selection. Sampling is the main technique employed for data selection, and is often used for both the preliminary investigation of data and the final data analysis.
Views: 6015 Data Science Dojo
Michael Hernandez's Story
 
02:29
Michael Hernandez had a vision for his team at Pearson Education: “Ensure a customer's success with our products to earn brand loyalty. It's relational first, then we align their needs to our solutions based on what is learned, not just what we sell.” With limited resources, he needed to prove the value of adding the right tools to focus on the specific customers who need the most assistance. In order to combine the customer narrative with usage and support patterns for signs of risk, he turned to Data Science Dojo. Having led customer-facing teams for many years, he knew there were patterns in the customer narrative, but needed to prove to senior leadership which actions would have the most impact. In September 2017 he attended our Data Science & Data Engineering Bootcamp in Austin with self-proclaimed “outdated programming knowledge”, but an eye for what to do with the data models once he could learn how to create them. After returning to Pearson, the impact was almost immediate. By aligning with his vision of “ensuring success first”, he began to create data models outlining risk and opportunity indication campaigns. Sharing this data with various teams helped increase the speed and accuracy with which they could provide the right guidance at the right time for the customer’s success. They tried various implementations using tools such as R, Python, and even Power BI to unleash a proactive support model instead of a historically reactive one. This new practice brought dramatic growth in customer success for this product line, resulting in a giant leap for customer retention, growth, and revenue. The impact was felt across the division, leading to a promotion and putting his team at the front line for expanding this across other digital products. Read more: https://datasciencedojo.com/michael-h...
Views: 714 Data Science Dojo
Introduction to Data Mining: Data Attributes (Part 1)
 
04:52
In this video tutorial on Data Mining Fundamentals, we dive deeper into the vocabulary used in data mining, focusing on attributes. By the end of this tutorial, you will understand the different kinds of attribute classification, and when you should use each.
Views: 16213 Data Science Dojo
Scale R to Big Data with Hadoop & Spark
 
01:09:59
Outline: · Set up a Spark cluster with R installed (R Server) · Wrangle data that is inside HDFS using R · Build and deploy a machine learning model using R Code and Prep Work (if you want to follow along): https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Scale%20R%20to%20Big%20Data%20Using%20Hadoop%20and%20Spark -- R is currently one of the most popular data science languages in the world. However, it has always had constraints around scaling out to big data. What happens when you expand beyond a couple of gigabytes of data? You packed up your data and used something else: Python, Java, or Mahout, to name a few. Now it’s possible to stick with R throughout your production analysis all the way to deployment, regardless of the data size. Companies like Apache, Revolution Analytics, Microsoft, and H2O showed us this year that distributed computing in R is possible. Today we’ll take a look at what the Microsoft stack is doing in terms of scaling R up to big data. In this talk we will show you Microsoft R Server, which is a Hadoop or Spark cluster where R is installed on every computer and is equipped with distributed processing libraries to utilize each and every computer in parallel. We’ll show you how to run your normal native R code via SSH, and how to get an RStudio server up and running on the cluster. We’ll show you how to wrangle data out of HDFS and build machine learning models from your large dataset. Then we’ll show you how to pack up that model and deploy it to an elastically scaled web service so that anyone may call upon it for predictions and insights.
Views: 14190 Data Science Dojo
Introduction to Text Analytics with R: SVD with R
 
34:17
SVD with R includes specific coverage of: – Use of the irlba package to perform truncated SVD. – How to project a TF-IDF document vector into the SVD semantic space (i.e., LSA). – Comparison of model performance between a single decision tree and the mighty random forest. – Exploration of random forest tuning using the caret package. About the Series: This data science tutorial is an Introduction to Text Analytics with R. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 9126 Data Science Dojo
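For readers who want the projection step in symbols, the following is the standard LSA "fold-in" formula, written under the assumption that the TF-IDF matrix X has one row per term and one column per document. The symbols X, U_k, Σ_k, V_k, d and k are notation chosen here for illustration and may differ from the notation used in the video.

```latex
% Truncated SVD of the m-by-n term-document TF-IDF matrix X, keeping k singular values:
X \approx U_k \, \Sigma_k \, V_k^{\top},
\qquad U_k \in \mathbb{R}^{m \times k},\;
\Sigma_k \in \mathbb{R}^{k \times k},\;
V_k \in \mathbb{R}^{n \times k}

% Projection of a new TF-IDF document vector d (length m) into the k-dimensional semantic space:
\hat{d} = \Sigma_k^{-1} \, U_k^{\top} \, d
```

A useful sanity check is that applying the same formula to a column of X (a document already in the training set) reproduces the corresponding row of V_k.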
Introduction to Text Analytics with R: Cosine Similarity
 
32:03
Cosine Similarity includes specific coverage of: – How cosine similarity is used to measure similarity between documents in vector space. – The mathematics behind cosine similarity. – Using cosine similarity in text analytics feature engineering. – Evaluation of the effectiveness of the cosine similarity feature. About the Series: This data science tutorial is an Introduction to Text Analytics with R. As exemplified by the popularity of blogging and social media, textual data is far from dead – it is increasing exponentially! Not surprisingly, knowledge of text analytics is a critical skill for data scientists if this wealth of information is to be harvested and incorporated into data products. This data science training provides introductory coverage of the following tools and techniques: – Tokenization, stemming, and n-grams – The bag-of-words and vector space models – Feature engineering for textual data (e.g. cosine similarity between documents) – Feature extraction using singular value decomposition (SVD) – Training classification models using textual data – Evaluating accuracy of the trained classification models The data and R code used in this series are available here: https://code.datasciencedojo.com/datasciencedojo/tutorials/tree/master/Introduction%20to%20Text%20Analytics%20with%20R
Views: 10832 Data Science Dojo
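The cosine similarity between two document vectors is their dot product divided by the product of their lengths. The sketch below shows this on raw term counts using only the Python standard library. The series itself works in R with TF-IDF weights, so the helper names `cosine_similarity` and `term_counts` and the example sentences are hypothetical stand-ins rather than the series' code.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors (dicts)."""
    # Dot product over the terms of a; terms missing from b contribute zero.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


def term_counts(text):
    """Bag-of-words vector: lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


doc1 = term_counts("free prize call now")
doc2 = term_counts("call now to claim your free prize")
doc3 = term_counts("see you at lunch tomorrow")

sim_12 = cosine_similarity(doc1, doc2)  # documents share four terms
sim_13 = cosine_similarity(doc1, doc3)  # documents share no terms
```

Because the measure depends only on the angle between the vectors, a long and a short document with the same mix of terms still score as similar, which is why it is a common choice for comparing texts of different lengths.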
