{
"nbformat": 4,
"nbformat_minor": 4,
"metadata": {
"title": "Shuffle vs. Resample",
"kernelspec": {
"name": "ir",
"display_name": "R",
"language": "R"
},
"language_info": {
"name": "R",
"codemirror_mode": "r",
"pygments_lexer": "r",
"mimetype": "text/x-r-source",
"file_extension": ".r",
"version": "4.0.3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I created this notebook to help us play around with the concepts of shuffle and resample."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 58,
"source": [
"# This code will load the R packages we will use\n",
"\n",
"suppressPackageStartupMessages({\n",
" library(mosaic)\n",
" library(supernova)\n",
" library(ggpubr)\n",
" #library(Lock5withR)\n",
" #library(fivethirtyeight)\n",
"})\n",
"\n",
"font_size = function (size) {\n",
" theme(text = element_text(size = size))}\n",
"\n",
"\n",
"middle <- function(x, prop = .95) {\n",
" sorted <- sort(x)\n",
" tail_size <- (1 - prop) / 2\n",
" upper_cut <- sorted[floor(length(x) * (1 - tail_size))]\n",
" lower_cut <- sorted[ceiling(length(x) * tail_size)]\n",
" ((x <= upper_cut) + (x > lower_cut)) == 2\n",
"}\n",
"\n",
"\n",
"# Adjust the plots to have nice sizes\n",
"options(repr.plot.width = 6, repr.plot.height = 4)\n",
"\n",
""
],
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## An Extreme Data Set\n",
"\n",
"It may be easier to appreciate what `shuffle()` is doing with a more extreme example rather than a more realistic data set!\n",
"\n",
"Imagine two groups, A and B. All cases in group A have an outcome of 0 and all those in B have an outcome of 10. For now let's assume there are 50 cases in each group (but you can play around and set `n` to anything you want)."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 40,
"source": [
"n <- 50\n",
"\n",
"group <- c(rep(\"A\", n), rep(\"B\",n))\n",
"outcome <- c(rep(0, n), rep(10,n))\n",
"extremedata <- data.frame(outcome, group)\n",
"\n",
"head(extremedata, 2*n)"
],
"outputs": [
{
"output_type": "execute_result",
"execution_count": 40,
"metadata": {},
"data": {
"text/html": [
"<table itemscope=\"\" itemtype=\"http://schema.org/Table\" data-itemscope=\"root\">\n",
" <thead>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">outcome</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">group</th>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><chr></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">1</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">2</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">3</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">4</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">5</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">6</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">7</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">8</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">9</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">11</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">12</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">13</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">14</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">15</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">16</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">17</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">18</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">19</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">20</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">21</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">22</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">23</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">24</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">25</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">26</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">27</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">28</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">29</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">30</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">⋮</td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">⋮</td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">⋮</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">71</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">72</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">73</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">74</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">75</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">76</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">77</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">78</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">79</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">80</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">81</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">82</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">83</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">84</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">85</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">86</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">87</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">88</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">89</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">90</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">91</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">92</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">93</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">94</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">95</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">96</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">97</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">98</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">99</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">100</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>"
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linking _randomness_ and `shuffle()`\n",
"\n",
"We're trying to link together \"randomly generated data\" to `shuffle()` to having 0 difference between groups (represented as $\\beta_1 = 0$ and ultimately $Y_i = \\beta_0 + \\epsilon_i$). Let's take a look at why.\n",
"\n",
"Our groups start off very extreme. Group perfectly predicts outcome. If you knew a case was in group A, you would predict they would have an outcome of 0 and there would be no error.\n",
"\n",
"Indeed, if you look at the `favstats` you'll see that the means are 0 and 10 (and there is no spread at all within groups). In this case, the data have a very large $b_1$! That is, $b_1 = 10$!"
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 72,
"source": [
"gf_histogram(~ outcome, data = extremedata) %>%\n",
"gf_facet_grid(group ~ .) %>%\n",
"gf_vline(xintercept = ~mean, data = favstats(outcome ~ group, data = extremedata))\n",
"\n",
"favstats(outcome ~ group, data = extremedata)\n",
"\n",
"print(\"This is the sample's b1:\")\n",
"b1(outcome ~ group, data = extremedata)"
],
"outputs": [
{
"output_type": "execute_result",
"execution_count": 72,
"metadata": {},
"data": {
"text/html": [
"<table itemscope=\"\" itemtype=\"http://schema.org/Table\" data-itemscope=\"root\">\n",
" <thead>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">group</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">min</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">Q1</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">median</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">Q3</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">max</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">mean</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">sd</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">n</th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">missing</th>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><chr></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><dbl></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><int></th>\n",
" <th itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><int></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">A</td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">50</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" </tr>\n",
" <tr itemscope=\"\" itemtype=\"http://schema.stenci.la/TableRow\">\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\">B</td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">10</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">50</span></td>\n",
" <td itemscope=\"\" itemtype=\"http://schema.stenci.la/TableCell\"><span\n",
" data-itemtype=\"http://schema.org/Number\">0</span></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>"
]
}
},
{
"output_type": "stream",
"name": "stdout",
"text": "[1] \"This is the sample's b1:\"\n"
},
{
"output_type": "execute_result",
"execution_count": 72,
"metadata": {},
"data": {
"text/html": [
"<span data-itemtype=\"http://schema.org/Number\" data-itemscope=\"root\">10</span>"
]
}
},
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What would happen if we shuffled these groups? (I'll color the shuffled data in `dodgerblue`.)"
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 73,
"source": [
"gf_histogram(~ shuffle(outcome), data = extremedata, fill = \"dodgerblue\") %>%\n",
"gf_facet_grid(group ~ .) "
],
"outputs": [
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that because shuffling randomly reassigns the outcomes to the groups, A has an equal chance of being paired with a high or low score (and so does B). So now take a look at the shuffled $b_1$! Observe how small these shuffled $b_1$s are compared with the original $b_1$ of 10. (Run the code below a few times.)"
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 71,
"source": [
"extremedata$shuff_outcome <- shuffle(extremedata$outcome)\n",
"\n",
"gf_histogram(~ shuff_outcome, data = extremedata, fill = \"dodgerblue\") %>%\n",
"gf_facet_grid(group ~ .) %>%\n",
"gf_vline(xintercept = ~mean, data = favstats(shuff_outcome ~ group, data = extremedata))\n",
"\n",
"print(\"This is a shuffled b1:\")\n",
"b1(shuff_outcome ~ group, data = extremedata)"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "[1] \"This is a shuffled b1:\"\n"
},
{
"output_type": "execute_result",
"execution_count": 71,
"metadata": {},
"data": {
"text/html": [
"<span data-itemtype=\"http://schema.org/Number\" data-itemscope=\"root\">1.6</span>"
]
}
},
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The shuffled $b_1$s end up being smaller because the randomized groups are now very similar to each other! So we characterize the **random process (of shuffle)** as a DGP where $\\beta_1=0$; that is, the true difference is 0. And now, many of the shuffled data sets look very much like they could have been generated by the empty model (the grand mean of the whole data set, depicted in `blue`)."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 70,
"source": [
"extremedata$shuff_outcome <- shuffle(extremedata$outcome)\n",
"\n",
"gf_histogram(~ shuff_outcome, data = extremedata, fill = \"dodgerblue\") %>%\n",
"gf_facet_grid(group ~ .) %>%\n",
"gf_vline(xintercept = ~mean, data = favstats(shuff_outcome ~ group, data = extremedata)) %>%\n",
"gf_vline(xintercept = ~mean, data = favstats(~shuff_outcome, data = extremedata), color = \"blue\")\n",
""
],
"outputs": [
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
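{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the \"centered on 0\" idea more directly, here is a minimal sketch that shuffles many times and collects the resulting $b_1$s. It is only an illustration under stated assumptions: it uses base R's `sample()` as a stand-in for `shuffle()`, a hypothetical helper `diff_means()` in place of `b1()`, and an arbitrary seed and number of shuffles (1000).\n",
"\n",
"```r\n",
"# Rebuild the extreme data so this sketch is self-contained\n",
"n <- 50\n",
"group <- c(rep('A', n), rep('B', n))\n",
"outcome <- c(rep(0, n), rep(10, n))\n",
"\n",
"# Hypothetical helper: mean of B minus mean of A (same sign as the sample b1 of 10)\n",
"diff_means <- function(y, g) mean(y[g == 'B']) - mean(y[g == 'A'])\n",
"\n",
"set.seed(1)  # arbitrary seed, only so the sketch is reproducible\n",
"\n",
"# sample(outcome) permutes the outcomes without replacement,\n",
"# which breaks any link between outcome and group\n",
"shuffled_b1s <- replicate(1000, diff_means(sample(outcome), group))\n",
"\n",
"mean(shuffled_b1s)   # should land close to 0\n",
"range(shuffled_b1s)  # should stay well short of the sample b1 of 10\n",
"```\n",
"\n",
"Base R is used here only so the sketch runs without extra packages; the cells in this notebook do the same thing with `shuffle()` and `b1()`."
]
},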
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linking _DGP is basically just like our data_ and `resample()` (resampling our cases)\n",
"\n",
"We're also trying to link \"the DGP is like our data\" to `resample()` and to the claim that $\\beta_1 = b_1 = 10$ (and ultimately $Y_i = \\beta_0 + 10X_i + \\epsilon_i$). Let's take a look at why.\n",
"\n",
"When we resample (depicted in `tomato`), we get a very similar plot to our original data."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 69,
"source": [
"gf_histogram(~ outcome, data = resample(extremedata), fill = \"tomato\") %>%\n",
"gf_facet_grid(group ~ .) "
],
"outputs": [
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make it a bit easier to see, here we've put the original extreme data (in gray), the resampled data (in `tomato`), and shuffled data (in `dodgerblue`) next to each other."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 76,
"source": [
"originalplot <- gf_histogram(~ outcome, data = extremedata) %>%\n",
"gf_facet_grid(group ~ .) \n",
"\n",
"resampleplot <- gf_histogram(~ outcome, data = resample(extremedata), fill = \"tomato\") %>%\n",
"gf_facet_grid(group ~ .) \n",
"\n",
"shuffleplot <- gf_histogram(~ shuff_outcome, data = extremedata, fill = \"dodgerblue\") %>%\n",
"gf_facet_grid(group ~ .) \n",
"\n",
"ggarrange(originalplot, resampleplot, shuffleplot, \n",
" labels = c(\"Original\", \"Resample\", \"Shuffle\"),\n",
" ncol = 3)"
],
"outputs": [
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that because the groups are made up of resampled cases, resampled group A is still made up of cases that used to be in the original group A (and the same with group B). So now take a look at the resampled $b_1$! Indeed, if you run the code a few times, the resampled $b_1$s are the same as our original $b_1$."
]
},
{
"cell_type": "code",
"metadata": {
"trusted": false
},
"execution_count": 75,
"source": [
"resampledata <- resample(extremedata)\n",
"\n",
"gf_histogram(~ outcome, data = resampledata, fill = \"tomato\") %>%\n",
"gf_facet_grid(group ~ .) %>%\n",
"gf_vline(xintercept = ~mean, data = favstats(outcome ~ group, data = resampledata))\n",
"\n",
"print(\"This is a resampled b1:\")\n",
"b1(outcome ~ group, data = resampledata)\n",
""
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": "[1] \"This is a resampled b1:\"\n"
},
{
"output_type": "execute_result",
"execution_count": 75,
"metadata": {},
"data": {
"text/html": [
"<span data-itemtype=\"http://schema.org/Number\" data-itemscope=\"root\">10</span>"
]
}
},
{
"output_type": "display_data",
"metadata": {},
"data": {
"image/png": [
""
]
}
}
]
},
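{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same \"run it many times\" idea can be sketched for resampling. This is a minimal illustration under stated assumptions: it draws whole rows with replacement using base R's `sample()` as a stand-in for `resample()`, uses a hypothetical helper `diff_means()` in place of `b1()`, and picks an arbitrary seed and number of resamples (1000).\n",
"\n",
"```r\n",
"# Rebuild the extreme data so this sketch is self-contained\n",
"n <- 50\n",
"extreme <- data.frame(\n",
"  outcome = c(rep(0, n), rep(10, n)),\n",
"  group = c(rep('A', n), rep('B', n))\n",
")\n",
"\n",
"# Hypothetical helper: mean of B minus mean of A within one data frame\n",
"diff_means <- function(d) {\n",
"  mean(d$outcome[d$group == 'B']) - mean(d$outcome[d$group == 'A'])\n",
"}\n",
"\n",
"set.seed(1)  # arbitrary seed, only so the sketch is reproducible\n",
"\n",
"resampled_b1s <- replicate(1000, {\n",
"  # Draw whole cases (rows) with replacement, so each case keeps its own group\n",
"  rows <- sample(nrow(extreme), replace = TRUE)\n",
"  diff_means(extreme[rows, ])\n",
"})\n",
"\n",
"unique(resampled_b1s)  # every resampled b1 matches the sample b1\n",
"```\n",
"\n",
"Because each row carries its outcome and its group together, resampling changes how many A and B cases we get but never which outcome goes with which group. That is why the resampled $b_1$s do not move in this extreme data set."
]
},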
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, to recap `resample()`: we characterize the **DGP like our sample** as a DGP where $\\beta_1=b_1$; that is, the true difference is the same as that of our sample (10). And now, many of the resampled data sets look very much like they could have been generated by the best-fitting complex model (the two group means).\n",
"\n",
"## A note about wording\n",
"\n",
"For our purposes, it's easier to call these `shuffle()` and `resample()` because those are the R functions we use. In data science and statistics, what `shuffle()` does is called **RANDOMIZATION** and what `resample()` does is called **BOOTSTRAPPING**."
]
}
]
}