Showing posts with label statistics. Show all posts

Friday, January 31, 2020

Data Science & Statistics: Hypothesis testing. Null vs alternative

In this tutorial we'll introduce hypothesis testing. There are four steps in data-driven decision-making. First, you formulate a hypothesis. Second, you find the right test for that hypothesis. Third, you execute the test. Fourth, you make a decision based on the result.

Let’s start from the beginning. What is a hypothesis? Though there are many ways to define it, the most intuitive I’ve seen is: “A hypothesis is an idea that can be tested.” This is not the formal definition, but it explains the point very well. If I tell you that apples in New York are expensive, this is an idea, or a statement, but it is not testable until I have something to compare it with. If I define expensive as any price higher than $1.75 per pound, it immediately becomes a hypothesis.

What is something that cannot be a hypothesis? An example may be: would the USA do better or worse under a Clinton administration, compared to a Trump administration? Statistically speaking, this is an idea, but there is no data to test it, so it cannot be the hypothesis of a statistical test. It is more likely a topic for another discipline. In statistics, by contrast, we may compare US presidencies that have already been completed, such as the Obama administration and the Bush administration, as we have data on both.

A test sets two hypotheses against each other: the null hypothesis and the alternative hypothesis. Generally, the researcher is trying to reject the null hypothesis. Think of the null hypothesis as the status quo and of the alternative as the change or innovation that challenges that status quo. In our example, Paul was representing the status quo, which we were challenging.
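The four steps can be sketched in Python for the apple example. This is only an illustration under stated assumptions: the prices are made-up data, the population standard deviation of $0.10 is assumed to be known, and a one-sided z-test is one possible choice of test, not necessarily the one the tutorial has in mind. The null hypothesis is taken to be "the mean price is at most $1.75 per pound" and the alternative "the mean price is higher than $1.75 per pound".

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mean <= mu0 against H1: mean > mu0, assuming sigma is known."""
    n = len(sample)
    # Step 3: execute the test by computing the z statistic and its p-value.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    # Step 4: decide. Reject H0 when the p-value is below alpha.
    return z, p_value, p_value < alpha


# Step 1: hypothesis. "Expensive" means a mean price above $1.75 per pound.
MU0 = 1.75

# Hypothetical prices per pound (made-up data, for illustration only).
prices = [1.82, 1.95, 1.70, 1.88, 1.79, 1.91, 1.85, 1.77, 1.93, 1.80]

# Step 2: choose a test. Here: a one-sided z-test with an assumed sigma of 0.10.
z, p_value, reject_null = one_sided_z_test(prices, mu0=MU0, sigma=0.10)

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

With real data the standard deviation is rarely known, in which case a t-test would usually be the more appropriate choice.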

Data Science & Statistics: Type I error vs Type II error



In general, we can make two types of errors: Type I and Type II. Sounds a bit boring, but this will be a fun lecture, I promise! First we will define the problems, and then we will see some interesting examples.

A Type I error is when you reject a true null hypothesis; it is considered the more serious error. It is also called a “false positive”. The probability of making this error is alpha, the level of significance. Since you, the researcher, choose alpha, the responsibility for making this error lies solely with you.

A Type II error is when you accept (more precisely, fail to reject) a false null hypothesis; it is also called a “false negative”. The probability of making this error is denoted by beta. Beta depends mainly on sample size and population variance. So, if your topic is difficult to test because sampling is hard or variability is high, you are more likely to make this type of error. As you can imagine, if the data set is hard to test, it is not your fault, so a Type II error is considered a smaller problem.
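A small simulation can make alpha and beta more tangible. The sketch below is an illustration, not part of the lecture: it assumes normally distributed data, a two-sided z-test with known standard deviation, and arbitrarily chosen values for the true mean, sample sizes, and number of trials. When the null hypothesis is true, the share of rejections should be close to alpha (Type I errors). When it is false, the share of non-rejections estimates beta (Type II errors), which should shrink with a larger sample and grow with a larger variance.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test of H0: mean == mu0, assuming sigma is known."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, mu0, sigma, n, alpha, trials, rng):
    """Fraction of simulated samples for which H0 is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if rejects_null(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials


rng = random.Random(42)  # fixed seed so the run is repeatable
ALPHA, TRIALS = 0.05, 2000

# H0 is true (true mean equals mu0): every rejection is a Type I error.
type_1_rate = rejection_rate(0.0, 0.0, 1.0, 30, ALPHA, TRIALS, rng)

# H0 is false (true mean is 0.5): every non-rejection is a Type II error.
beta_small_n = 1 - rejection_rate(0.5, 0.0, 1.0, 10, ALPHA, TRIALS, rng)
beta_large_n = 1 - rejection_rate(0.5, 0.0, 1.0, 50, ALPHA, TRIALS, rng)
beta_high_var = 1 - rejection_rate(0.5, 0.0, 2.0, 50, ALPHA, TRIALS, rng)

print(f"Type I error rate (should be near alpha): {type_1_rate:.3f}")
print(f"beta with n=10, sigma=1: {beta_small_n:.3f}")
print(f"beta with n=50, sigma=1: {beta_large_n:.3f}")
print(f"beta with n=50, sigma=2: {beta_high_var:.3f}")
```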

Data Science & Statistics: Population vs sample


The first step of every statistical analysis you will perform is to determine whether the data you are dealing with is a population or a sample. A population is the collection of all items of interest to our study, and its size is usually denoted with an uppercase N. The numbers we obtain when using a population are called parameters. A sample is a subset of the population, its size is denoted with a lowercase n, and the numbers we obtain when working with a sample are called statistics.

Populations are hard to define and observe. Sampling is not easy either, but samples have two big advantages. First, once you have some experience, it is not that hard to recognize whether a sample is representative. Second, statistical tests are designed to work with incomplete data, so making a small mistake while sampling is not always a problem.
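The difference between a parameter and a statistic can be shown with a short Python sketch. Everything in it is hypothetical: the population is simulated (10,000 made-up heights in centimetres), and the sample size of 100 is chosen arbitrarily. In practice the whole population is rarely available, which is exactly why the sample statistic is used as an estimate.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: N simulated heights in cm (made-up data).
N = 10_000
population = [rng.gauss(170, 10) for _ in range(N)]

# Parameter: computed from every item in the population.
population_mean = mean(population)

# Sample: n items drawn at random, without replacement.
n = 100
sample = rng.sample(population, n)

# Statistic: computed from the sample only; it estimates the parameter.
sample_mean = mean(sample)

print(f"Population mean (parameter, N={N}): {population_mean:.2f}")
print(f"Sample mean (statistic, n={n}): {sample_mean:.2f}")
```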