Principal Component Analysis and Factor Analysis
econometricsacademy

-Introduction to factor analysis
-Factor analysis vs Principal Component Analysis (PCA) side by side
Gopal Malakar

Principal Component Analysis and Factor Analysis Example
econometricsacademy

NOTE: On April 2, 2018 I updated this video with a new video that goes, step-by-step, through PCA and how it is performed. Check it out!
https://youtu.be/FgakZw6K1QQ
RNA-seq results often contain a PCA or MDS plot. This StatQuest explains how these graphs are generated, how to interpret them, and how to determine if the plot is informative or not. I've got example code (in R) for how to do PCA and extract the most important information from it on the StatQuest website: https://statquest.org/2015/08/13/pca-clearly-explained/
StatQuest with Josh Starmer

Principal Component Analysis and Factor Analysis in Stata
econometricsacademy

Udacity

A Webcast to accompany my 'Discovering Statistics Using ....' textbooks. This webcast looks at how to do Factor Analysis on SPSS and interpret the output.

Andy Field

Ali Ghodsi's lecture on January 5, 2017 for STAT 442/842: Classification, held at the University of Waterloo.
Introduction to dimensionality reduction via principal component analysis (PCA). Mathematical framework of PCA optimization problem.

Data Science Courses

Principal Component Analysis and Factor Analysis in R
econometricsacademy

Currell: Scientific Data Analysis. Minitab analysis for Figs 9.6 and 9.7 http://ukcatalogue.oup.com/product/9780198712541.do
© Oxford University Press

Oxford Academic (Oxford University Press)

MIT 18.650 Statistics for Applications, Fall 2016
Instructor: Philippe Rigollet
In this lecture, Prof. Rigollet reviewed linear algebra and talked about multivariate statistics.
MIT OpenCourseWare

Principal Component Analysis and Factor Analysis in SAS
econometricsacademy

tutorial on PCA
XLSTAT

In this video you will learn about Principal Component Analysis (PCA) and its main differences from Exploratory Factor Analysis (EFA). It also shows how to conduct a PCA in SPSS and interpret its results.

educresem

Determining the efficiency of a number of variables in their ability to measure a single construct.
TheRMUoHP Biostatistics Resource Channel

This is the first video in a multipart tutorial on the principal components analysis algorithm. In this video we cover the concept of a basis which is fundamental to understanding PCA.

algomanic

I demonstrate how to perform a principal components analysis based on some real data that correspond to the percentage discount/premium associated with nine listed investment companies. Based on the results of the PCA, the listed investment companies could be segmented into two largely orthogonal components.

how2stats

This video demonstrates conducting a factor analysis (principal components analysis) with varimax rotation in SPSS.

Dr. Todd Grande

Principal Component Analysis is one of the most useful data analysis and machine learning methods out there. It can be used to identify patterns in highly complex datasets and it can tell you what variables in your data are the most important. Lastly, it can tell you how accurate your new understanding of the data actually is.
In this video, I go one step at a time through PCA, and the method used to solve it, Singular Value Decomposition. I take it nice and slowly so that the simplicity of the method is revealed and clearly explained.
StatQuest with Josh Starmer
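
The description above mentions solving PCA with Singular Value Decomposition. The following is a minimal sketch of that idea using NumPy, not the code from the video. The data are synthetic and every variable name is an assumption made for illustration.

```python
import numpy as np

# Synthetic data for illustration only: 100 samples, 3 variables,
# two of which are driven by the same hidden signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(100, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(100, 1)),
    rng.normal(size=(100, 1)),
])

# Step 1: center each variable (column) at zero.
Xc = X - X.mean(axis=0)

# Step 2: SVD of the centered data. Rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Step 3: variance along each component and its share of the total.
explained_variance = s**2 / (Xc.shape[0] - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Step 4: coordinates of each sample on the components (what a PCA plot shows).
scores = Xc @ Vt.T

print(explained_ratio)
```

The squared singular values divided by n - 1 match the eigenvalues of the sample covariance matrix, which is why the SVD route and the eigenvector route give the same components.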

The main ideas behind PCA are actually super simple and that means it's easy to interpret a PCA plot: Samples that are correlated will cluster together apart from samples that are not correlated with them. In this video, I walk through the ideas so that you will have an intuitive sense of how PCA plots are drawn. If you'd like more details, check out my full length PCA video here: https://youtu.be/_UVHneBUBW0
StatQuest with Josh Starmer

In this video, we look at how to run an exploratory factor analysis (principal components analysis) in SPSS (Part 1 of 6).
Youtube SPSS factor analysis
Principal Component Analysis
Video Transcript: In this video we'll take a look at how to run a factor analysis or more specifically we'll be running a principal components analysis in SPSS. And as we begin here it's important to note, because it can get confusing in the field, that factor analysis is an umbrella term where the whole subject area is known as factor analysis but within that subject there's two types of main analyses that are run. The first type is called principal components analysis and that's what we'll be running in SPSS today. And the other type is known as common factor analysis and you'll see that come up sometimes. But in my experience principal components analysis is the most commonly used procedure and it's also the default procedure in SPSS. And if you look on the screen here you can see there's five variables: SWLS1, 2, 3, 4 and 5. And what these variables are, they come from the items of the Satisfaction with Life Scale published by Diener et al. And what people do is they take these five items, they respond to the five items where SWLS1 is "In most ways my life is close to my ideal;" and then we have "The conditions of my life are excellent;" "I am satisfied with my life;" "So far I've gotten the important things I want in life;" and then SWLS5 is "If I could live my life over I would change almost nothing." So what happens is the people respond to these five questions or items and for each question they have the following responses, which I've already input here into SPSS value labels: strongly disagree all the way through strongly agree, which gives us a 1 through 7 point scale for each question. So what we want to do here in our principal components analysis is we want to go ahead and analyze these five variables or items and see if we can reduce these five variables or items into one or a few components or factors which explain the relationship among the variables.
So let's go ahead and start by running a correlation matrix and what we'll do is we're going to Analyze, Correlate, Bivariate, and then we'll move these five variables over. Go ahead and click OK and then here notice we get the correlation matrix of SWLS1 through SWLS5. So these are all the intercorrelations that we have here. And if we look at this off-diagonal, where these ones here are the diagonal. And they're just a one because a variable is correlated with itself so that's always 1.0. And then the off-diagonal here represents the correlations of the items with one another. So for example this .531 here; notice it says in SPSS that the correlation is significant at the .01 level, two tailed. So this here is the correlation between SWLS2 and SWLS1. So all of these in this triangle here indicate the correlation between the different variables or items on the Satisfaction with Life Scale. And what we want to see here in factor analysis which we're about to run is that these variables are correlated with one another and at a minimum significantly so. Because what factor analysis or principal components analysis does is that it analyzes the correlations or relationships between our variables and basically we try to determine a smaller number of variables that can explain these correlations. So notice here we're starting with five variables, SWLS1 through five. Well hopefully in this analysis when we run our factor analysis we'll come out with one component that does a good job of explaining all these correlations here. And one of the key points of factor analysis is it's a data reduction technique. What that means is we enter a certain number of variables, like five in this example, or even 20 or 50 or what have you, and we hope to reduce those variables down to just a few; between one and let's say 5 or 6 is most of the solutions that I see.
Now in this case since we have five variables we really want to reduce this down to 1 or 2 at most but 1 would be good in this case. So that's really a key point of factor analysis: we take a number of variables and we try to explain the correlations between those variables through a smaller number of factors or components and by doing that what we do is we get a more parsimonious solution, a more succinct solution that explains these variables or relationships. And there's a lot of applications of factor analysis but one of the primary ones is when you're analyzing scales or items on a scale and you want to see how that scale turns out, so how many dimensions or factors does it have to it.

Quantitative Specialists
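
The transcript above starts from a correlation matrix of five items, with ones on the diagonal and the item intercorrelations off the diagonal. A rough equivalent outside SPSS might look like the sketch below. The responses are synthetic stand-ins on a 1 to 7 scale, not the Satisfaction with Life Scale data, and all names are hypothetical.

```python
import numpy as np

# Synthetic 1-7 responses for five hypothetical items (NOT the SWLS data).
rng = np.random.default_rng(1)
trait = rng.normal(size=200)                      # shared underlying trait
raw = trait[:, None] + rng.normal(size=(200, 5))  # five noisy items
items = np.clip(np.round(4 + 1.5 * raw), 1, 7)    # squeeze onto a 1-7 scale

# Correlation matrix: rows are respondents, columns are items.
R = np.corrcoef(items, rowvar=False)

print(np.round(R, 3))
```

Because every item shares the same simulated trait, the off-diagonal correlations come out positive, which is the pattern the transcript says to look for before running the analysis.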

Principal Component Analysis (PCA) using Python (Scikit-learn)
Michael Galarnyk

Factor analysis and PCA with EViews

xanderputong

Part 1 - This video tutorial guides the user through a manual principal components analysis of some simple data. The goal is to acquaint the viewer with the underlying concepts and terminology associated with the PCA process. This will be helpful when the user employs one of the "canned" R procedures to do PCA (e.g. princomp, prcomp), which requires some knowledge of concepts such as loadings and scores.

Steve Pittard

Factor Analysis and PCA
Factor Analysis
Reduce large number of variables into fewer number of factors
Co-variation is due to latent variables that exert causal influence on observed variables
Communalities – each variable’s variance that can be explained by factors
Principal Component Analysis
Variable reduction process – smaller number of components that account for most variance in set of observed variables
Explain maximum variance with fewest number of principal components
PCA | Factor Analysis
Observed variance is analyzed | Shared variance is analyzed
1.00’s are put in the diagonal: all variance in the variables is included | Communalities are put in the diagonal: only variance shared with other variables is included; error variance and variance unique to each variable are excluded
Analyze variance | Analyze covariance
Examrace
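
The notes above define a communality as the part of a variable's variance that the factors explain, and say PCA starts from 1.00's on the diagonal. The sketch below illustrates only the PCA side of that comparison. It assumes synthetic data and uses component loadings computed from the correlation matrix; it is not a common factor analysis.

```python
import numpy as np

# Synthetic data for illustration only: 300 observations of 4 variables.
rng = np.random.default_rng(2)
factor = rng.normal(size=(300, 1))
X = factor @ np.array([[0.9, 0.8, 0.7, 0.2]]) + 0.5 * rng.normal(size=(300, 4))

R = np.corrcoef(X, rowvar=False)        # 1.00's on the diagonal

# Eigen-decomposition of R, sorted from largest to smallest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigvecs * np.sqrt(eigvals)

# Communality = sum of squared loadings over the retained components.
communality_all = (loadings**2).sum(axis=1)          # keep all 4 components
communality_one = (loadings[:, :1]**2).sum(axis=1)   # keep only the first

print(np.round(communality_all, 3))
print(np.round(communality_one, 3))
```

With every component kept, each communality recovers the 1.00 on the diagonal. Keeping fewer components leaves part of each variable's variance unexplained, so the communalities fall below 1.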

This video demonstrates how to conduct an exploratory factor analysis (EFA) in SPSS. The Principal Axis Factoring (PAF) method is used and compared to Principal Components Analysis (PCA).

Dr. Todd Grande

Video illustrates use of Principal components analysis in SPSS for the purposes of data reduction. Illustrates how to reduce a set of measured variables to a smaller set of components for inclusion as predictors in a regression analysis. Illustrates use of component scores. Parallel analysis demonstration provided using Parallel analysis engine found at http://ires.ku.edu/~smishra/parallelengine.htm

Step by step detail with example of Principal Component Analysis PCA
Gopal Malakar

Jinsuh Lee

This video provides an introduction to factor analysis, and explains why this technique is often used in the social sciences. Check out https://ben-lambert.com/econometrics-course-problem-sets-and-data/ for course materials, and information regarding updates on each of the courses. Quite excitingly (for me at least), I am about to publish a whole series of new videos on Bayesian statistics on youtube. See here for information: https://ben-lambert.com/bayesian/ Accompanying this series, there will be a book: https://www.amazon.co.uk/gp/product/1473916364/ref=pe_3140701_247401851_em_1p_0_ti

Ben Lambert

Video tutorial on running principal components analysis (PCA) in R with RStudio.
Please view in HD (cog in bottom right corner).
Hefin Rhys

Video covers
- Overview of Principal Components Analysis (PCA) and why use PCA as part of your machine learning toolset
- Using princomp function in R to do PCA
- Visually understanding PCA

Melvin L

In this video you will learn Principal Component Analysis using SAS. You will learn how to perform PCA using Proc Factor and Proc Princomp.
Analytics University

StatQuest with Josh Starmer

In this video I have talked about the basics of Principal Component Analysis. I have also talked about the difference between Principal Component Analysis (PCA) and Exploratory Factor Analysis (EFA)
Analytics University

This video explains what Principal Component Analysis (PCA) is and how it works. An example is then shown in the XLSTAT statistical software.

XLSTAT

Data Science for Biologists
Dimensionality Reduction: Principal Components Analysis
Part 1
Data4Bio

Applied Multivariate Statistical Modeling by Dr J Maiti, Department of Management, IIT Kharagpur. For more details on NPTEL visit http://nptel.ac.in

nptelhrd

Learn how to reduce many variables to a few significant variable combinations, or principal components. See how to create the components on covariances, correlations, or unscaled; examine the contribution of each variable to the related principal component; and save the principal component values to the data table for future analysis.

JMPSoftwareFromSAS

We can find the direction of the greatest variance in our data from the covariance matrix. It is the vector that does not rotate when we multiply it by the covariance matrix. Such vectors are called eigenvectors, and have corresponding eigenvalues. Eigenvectors that have the largest eigenvalues will be the principal components (new dimensions of our data).

Victor Lavrenko
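
The description above says the direction of greatest variance is an eigenvector of the covariance matrix, a vector that is only rescaled, not rotated, when multiplied by that matrix. Here is a small numerical check of that claim, assuming synthetic two-dimensional data and hypothetical variable names.

```python
import numpy as np

# Synthetic 2-D data for illustration only, stretched along one direction.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [1.0, 1.0]])

cov = np.cov(X, rowvar=False)

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)
v = eigvecs[:, -1]      # eigenvector with the largest eigenvalue
lam = eigvals[-1]

# Multiplying by the covariance matrix only rescales v; it does not rotate it.
print(np.allclose(cov @ v, lam * v))

# The variance of the data projected onto v equals the largest eigenvalue.
projected_var = np.var(X @ v, ddof=1)
print(np.isclose(projected_var, lam))
```

Projecting the data onto any other unit vector gives a variance no larger than `lam`, which is what makes this eigenvector the first principal component.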

In this Python for data science tutorial, you will learn how to do principal component analysis (PCA) and singular value decomposition (SVD) in Python using seaborn, pandas, numpy and pylab. The environment used is Jupyter Notebook.
This is the 19th video of the Python for Data Science course! In this series I explain Python and data science. Python is a popular programming language for data analysis because of its libraries for manipulating, storing, and gaining understanding from data. Watch this video to learn about what makes Python a data science powerhouse. Jupyter Notebooks have become very popular in the last few years, and for good reason. They allow you to create and share documents that contain live code, equations, visualizations and markdown text. This can all be run directly in the browser. It is an essential tool to learn if you are getting started in data science, but it also has plenty of benefits outside of that field. Harvard Business Review named data scientist "the sexiest job of the 21st century." Python pandas is a commonly used tool in the industry to clean, analyze, and visualize data of varying sizes and types. We'll learn how to use pandas, SciPy, scikit-learn and matplotlib to extract meaningful insights and recommendations from real-world datasets.

TheEngineeringWorld

Learn how to visualize the relationships between variables and the similarities between observations using Analyse-it for Microsoft Excel.
The tutorial covers the following tasks:
- Understanding the relationship between variables
- Reducing the dimensionality of the data
- Understanding the similarities between observations
For more information and to download the tutorial examples, visit http://analyse-it.com/docs/tutorials/correlation/overview

Analyse-it

In this video, we look at how to run an exploratory factor analysis (principal components analysis) in SPSS (Part 3 of 6).
Video Transcript: we'll also pull up our Scree plot here. These two tables here, the Total Variance Explained and Scree Plot, both deal with what's known as our factor extraction methods. If you recall when we went through SPSS, the options, we left the eigenvalue greater than one rule option selected as the default, but we also selected that a Scree plot be output in our analysis. And these are two of the most commonly used procedures for deciding how many components or factors to retain; how many do you want to keep in our solution. Here for our Total Variance Explained table, notice first of all that we have 5 components in our rows here. And you may be wondering, well wait a second, I thought factor analysis, the whole purpose of it, was to reduce our number of variables into a smaller number of components? And if you are thinking that, you're correct, that is our purpose here. But, as just a matter of definition, it's always the case that the number of variables we input in our analysis, will always be equal to the number of components shown here. So we have five variables input in our analysis, therefore we have 5 rows or 5 components shown here. Now here in our Initial Eigenvalues table, notice that we have these various eigenvalues. So the first one is 3.136 and everything after that is less than 1. Now if you recall our first rule was eigenvalue greater than one rule. So that was, keep the number of factors or components that have eigenvalues greater than one. All other components with eigenvalues less than one, such as these here, we do not keep. If you look at the Extraction Sums of Squared Loadings section of this table, notice that there's only one value here now. And what this means is this is how many components SPSS retained or kept, based on the rule. So since only one component had an eigenvalue greater than one, we only have one component in our solution here. So the results of this rule tells us, or indicates, that we want to have one component. 
So in other words, we reduce those 5 variables down to one component. Or that one component, from this perspective, does a pretty good job at explaining the relationships between SWLS1 through SWLS5. One way to assess how good of a job this analysis did at explaining the relationships between those variables, is to look at the percent of variance accounted for by the component. And in this example, our one component solution accounted for 62.72% of the variance, or about 63% of variance, which is pretty good in practice. I typically see solutions between 40% and 60% of the variance, in the 40s through 60s, in that range. I don't typically see many solutions with variance higher than 70, and a solution below 40 is not very strong. But that's typically the range that I'll see them in, so I would say that 63% is pretty good in practice. Now an interesting thing here, recall that we had 5 components. If you add up these eigenvalues they will equal to 5, within rounding error. So the sum of the eigenvalues is always equal to the number of components, or put another way, the number of original variables in your analysis. So if I had 10 variables in my analysis here, then these values would sum up to 10. And in fact would be 10 rows in this table. Now since I have 5 variables, I'm going to have 5 components output in my initial solution, and the eigenvalues will sum to 5. And the reason why that's good to know is that if you divide the eigenvalue for our retained component the 3.136/5 you will get exactly .6272 or 62.72% when converted to a percentage. So the percent of variance accounted for is literally the magnitude of the eigenvalue divided by the sum of the eigenvalues, or 5 in this case. OK, so in summary, our eigenvalue greater than one rule indicated that one component should be retained. Next let's look at the Scree plot. 
So here our Scree plot, notice first of all that on the X-axis, the component number is plotted, so this is the first component, second component, third, and so on. And on the Y-axis we have our eigenvalue plotted. And in fact if you think about it, this graph is really just plotting, notice this first value, 3.136, that is right here. Component 2 is somewhere between .6 and .7, and if you look here, here we go component 2, .625. Notice component 3 drops off just a little, it is .534. Component 4 is .463, and then component 5 is .231. So this Scree plot is literally just these eigenvalues plotted from left to right. Now the rule of thumb for interpreting the Scree plot is as follows:

Quantitative Specialists
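
The arithmetic in the transcript above (the eigenvalue greater than one rule and the percent of variance) can be reproduced in a few lines. This sketch uses the five eigenvalues as they are read out in the transcript. It divides by the number of variables, as the speaker does, rather than by the sum of the rounded values quoted. The variable names are assumptions.

```python
# Eigenvalues as read out in the transcript (five components, five variables).
eigenvalues = [3.136, 0.625, 0.534, 0.463, 0.231]
n_variables = len(eigenvalues)

# Eigenvalue-greater-than-one rule: keep components whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Percent of variance: eigenvalue divided by the number of variables,
# which the transcript states is the total of the eigenvalues.
percent_variance = [100.0 * ev / n_variables for ev in eigenvalues]

print(len(retained))                  # 1
print(round(percent_variance[0], 2))  # 62.72
```

Plotting `eigenvalues` against the component numbers 1 through 5 would give the Scree plot the transcript describes.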

Representing multivariate random signals using principal components. Principal component analysis identifies the basis vectors that describe the largest fraction of the variance in the observed data. It is used to find a low-dimensional representation for high-dimensional signals. PCA can be used to improve the SNR by a factor of N/p where the signal has p components, the noise is white, and the data dimension is N.

Barry Van Veen

This video walks you through some basic methods of Principal Component Analysis like generating screeplots, factor loadings and predicting factor scores

MKT Res

This video shows how to perform a PCA with FactoMineR and how to plot readable graphs.
François Husson

PCR (Principal Components Regression) is a regression method that can be divided into three steps:
The first step is to run a PCA (Principal Components Analysis) on the table of the explanatory variables,
Then run an Ordinary Least Squares regression (OLS regression) on the selected components,
Finally compute the parameters of the model that correspond to the input variables.
Principal Component Regression models:
PCA transforms an X table with n observations described by p variables into an S table with n scores described by q components, where q is lower than or equal to p and such that (S'S) is invertible. An additional selection can be applied on the components so that only the r components that are the most correlated with the Y variable are kept for the OLS regression step. We then obtain the R table.
The OLS regression is performed on the Y and R tables. In order to circumvent the interpretation problem with the parameters obtained from the regression, XLSTAT transforms the results back into the initial space to obtain the parameters and the confidence intervals that correspond to the input variables.
XLSTAT
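
The three PCR steps described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not XLSTAT's implementation: the data are synthetic, the components are kept in order of variance (the optional selection by correlation with Y is not shown), no confidence intervals are computed, and names such as `V_q` and `gamma` are hypothetical.

```python
import numpy as np

# Synthetic data for illustration only: n observations, p explanatory variables.
rng = np.random.default_rng(4)
n, p, q = 200, 4, 2
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=n)

# Step 1: PCA on the centered explanatory variables.
x_mean, y_mean = X.mean(axis=0), y.mean()
Xc = X - x_mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
V_q = Vt[:q].T          # p x q matrix of the first q principal directions
S = Xc @ V_q            # n x q table of component scores

# Step 2: OLS regression of the centered response on the scores.
gamma, *_ = np.linalg.lstsq(S, y - y_mean, rcond=None)

# Step 3: map the component coefficients back to the input variables.
beta = V_q @ gamma
intercept = y_mean - x_mean @ beta

predictions = intercept + X @ beta
print(beta, intercept)
```

Setting `q = p` keeps every component, in which case `beta` coincides with the ordinary least squares coefficients on the original variables. Choosing a smaller `q` trades some fit for fewer, uncorrelated predictors.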

In this video, we are going to see exactly how we can perform dimensionality reduction with a famous Feature Extraction technique - Principal Component Analysis PCA. We’ll get into the math that powers it
CodeEmporium

