fashion() output with corrr  

Tired of fighting to get your data to print nicely, or of formatting it by hand in a program like Excel? Try fashion() from the corrr package:

d <- data.frame(
  gender = factor(c("Male", "Female", NA)),
  age    = c(NA, 28.1111111, 74.3),
  height = c(188, NA, 168.78906),
  fte    = c(NA, .78273, .9)
)
d
#>   gender      age   height     fte
#> 1   Male       NA 188.0000      NA
#> 2 Female 28.11111       NA 0.78273
#> 3   <NA> 74.30000 168.7891 0.90000

library(corrr)
fashion(d)
#>   gender   age height  fte
#> 1   Male       188.00     
#> 2 Female 28.11         .78
#> 3        74.30 168.79  .90

But how does it work and what does it do?

The inspiration: correlations and decimals

The inspiration for fashion() came from my unending frustration at getting a correlation matrix to print exactly how I wanted. For example, printed correlations typically look something like:

mtcars %>% correlate()
#> # A tibble: 11 x 12
#>    rowname        mpg        cyl       disp         hp        drat
#>      <chr>      <dbl>      <dbl>      <dbl>      <dbl>       <dbl>
#> 1      mpg         NA -0.8521620 -0.8475514 -0.7761684  0.68117191
#> 2      cyl -0.8521620         NA  0.9020329  0.8324475 -0.69993811
#> 3     disp -0.8475514  0.9020329         NA  0.7909486 -0.71021393
#> 4       hp -0.7761684  0.8324475  0.7909486         NA -0.44875912
#> 5     drat  0.6811719 -0.6999381 -0.7102139 -0.4487591          NA
#> 6       wt -0.8676594  0.7824958  0.8879799  0.6587479 -0.71244065
#> 7     qsec  0.4186840 -0.5912421 -0.4336979 -0.7082234  0.09120476
#> 8       vs  0.6640389 -0.8108118 -0.7104159 -0.7230967  0.44027846
#> 9       am  0.5998324 -0.5226070 -0.5912270 -0.2432043  0.71271113
#> 10    gear  0.4802848 -0.4926866 -0.5555692 -0.1257043  0.69961013
#> 11    carb -0.5509251  0.5269883  0.3949769  0.7498125 -0.09078980
#> # ... with 6 more variables: wt <dbl>, qsec <dbl>, vs <dbl>, am <dbl>,
#> #   gear <dbl>, carb <dbl>

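But this is just plain ugly. Personally, I wanted the values rounded to the same number of decimal places, the leading zeros dropped, and the missing values left blank.

This is exactly what fashion() does: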

mtcars %>% correlate() %>% fashion()
#>    rowname  mpg  cyl disp   hp drat   wt qsec   vs   am gear carb
#> 1      mpg      -.85 -.85 -.78  .68 -.87  .42  .66  .60  .48 -.55
#> 2      cyl -.85       .90  .83 -.70  .78 -.59 -.81 -.52 -.49  .53
#> 3     disp -.85  .90       .79 -.71  .89 -.43 -.71 -.59 -.56  .39
#> 4       hp -.78  .83  .79      -.45  .66 -.71 -.72 -.24 -.13  .75
#> 5     drat  .68 -.70 -.71 -.45      -.71  .09  .44  .71  .70 -.09
#> 6       wt -.87  .78  .89  .66 -.71      -.17 -.55 -.69 -.58  .43
#> 7     qsec  .42 -.59 -.43 -.71  .09 -.17       .74 -.23 -.21 -.66
#> 8       vs  .66 -.81 -.71 -.72  .44 -.55  .74       .17  .21 -.57
#> 9       am  .60 -.52 -.59 -.24  .71 -.69 -.23  .17       .79  .06
#> 10    gear  .48 -.49 -.56 -.13  .70 -.58 -.21  .21  .79       .27
#> 11    carb -.55  .53  .39  .75 -.09  .43 -.66 -.57  .06  .27

To change the number of decimal places (decimals) or use a different placeholder for NA values (na_print):

mtcars %>% correlate() %>% fashion(decimals = 1, na_print = "x")
#>    rowname mpg cyl disp  hp drat  wt qsec  vs  am gear carb
#> 1      mpg   x -.9  -.8 -.8   .7 -.9   .4  .7  .6   .5  -.6
#> 2      cyl -.9   x   .9  .8  -.7  .8  -.6 -.8 -.5  -.5   .5
#> 3     disp -.8  .9    x  .8  -.7  .9  -.4 -.7 -.6  -.6   .4
#> 4       hp -.8  .8   .8   x  -.4  .7  -.7 -.7 -.2  -.1   .7
#> 5     drat  .7 -.7  -.7 -.4    x -.7   .1  .4  .7   .7  -.1
#> 6       wt -.9  .8   .9  .7  -.7   x  -.2 -.6 -.7  -.6   .4
#> 7     qsec  .4 -.6  -.4 -.7   .1 -.2    x  .7 -.2  -.2  -.7
#> 8       vs  .7 -.8  -.7 -.7   .4 -.6   .7   x  .2   .2  -.6
#> 9       am  .6 -.5  -.6 -.2   .7 -.7  -.2  .2   x   .8   .1
#> 10    gear  .5 -.5  -.6 -.1   .7 -.6  -.2  .2  .8    x   .3
#> 11    carb -.6  .5   .4  .7  -.1  .4  -.7 -.6  .1   .3    x
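
To make these formatting rules concrete, here is a minimal base-R sketch of the same idea: fix the number of decimals, drop the leading zero, and swap NA for a placeholder. The helper name fashion_sketch() is made up for illustration, and this is not how corrr implements fashion(). It only handles numeric vectors and does not pad values to a common width.

```r
# Hypothetical helper for illustration only; NOT corrr's implementation.
fashion_sketch <- function(x, decimals = 2, na_print = "") {
  # Fixed number of decimal places, e.g. 0.9 -> "0.90"
  out <- formatC(x, digits = decimals, format = "f")
  # Drop the leading zero but keep any minus sign, e.g. "-0.85" -> "-.85"
  out <- sub("^(-?)0\\.", "\\1.", out)
  # Replace missing values with the placeholder
  out[is.na(x)] <- na_print
  # noquote() only changes printing; the values are now character strings
  noquote(out)
}

fashion_sketch(c(-0.8521620, NA, 0.9020329))
fashion_sketch(c(-0.8521620, NA, 0.9020329), decimals = 1, na_print = "x")
```

The two arguments mirror the decimals and na_print arguments used above, so the calls read the same way as the real function.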

Look but don’t touch

There’s a little bit of magic going on here, but the key point is that fashion() returns a noquote version of the original structure:

mtcars %>% correlate() %>% fashion() %>% class()
#> [1] "data.frame" "noquote"

That means that numbers are no longer numbers.

mtcars %>% correlate() %>% sapply(is.numeric)
#> rowname     mpg     cyl    disp      hp    drat      wt    qsec      vs 
#>   FALSE    TRUE    TRUE    TRUE    TRUE    TRUE    TRUE    TRUE    TRUE 
#>      am    gear    carb 
#>    TRUE    TRUE    TRUE

mtcars %>% correlate() %>% fashion() %>% sapply(is.numeric)
#> rowname     mpg     cyl    disp      hp    drat      wt    qsec      vs 
#>   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE   FALSE 
#>      am    gear    carb 
#>   FALSE   FALSE   FALSE

Similarly, missing values are no longer missing values.

mtcars %>% correlate() %>% sapply(function(i) sum(is.na(i)))
#> rowname     mpg     cyl    disp      hp    drat      wt    qsec      vs 
#>       0       1       1       1       1       1       1       1       1 
#>      am    gear    carb 
#>       1       1       1

mtcars %>% correlate() %>% fashion() %>% sapply(function(i) sum(is.na(i)))
#> rowname     mpg     cyl    disp      hp    drat      wt    qsec      vs 
#>       0       0       0       0       0       0       0       0       0 
#>      am    gear    carb 
#>       0       0       0

So fashion() is for looking at output, not for continuing to work with it.
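
The practical consequence is to keep the numeric object around for any calculations and only format at the very end. The sketch below shows the same effect in base R, without corrr: the vector named shown is a hand-made stand-in for fashioned output (an assumption for illustration, not something fashion() produced).

```r
cors <- c(0.68117191, NA, -0.44875912)   # numeric values, one missing

# Character version for display (a stand-in for fashioned output)
shown <- noquote(ifelse(is.na(cors), "", sprintf("%.2f", cors)))

is.numeric(cors)    # still a numeric vector
is.numeric(shown)   # character strings carrying a "noquote" class
anyNA(cors)         # the missing value is still detectable here
anyNA(shown)        # here the NA has become an empty string

# Do the arithmetic on the numeric object, not on the display version
mean(cors, na.rm = TRUE)
```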

What to use it on

fashion() can be used on most standard R structures such as scalars, vectors, matrices, data frames, etc:

fashion(10.277)
#> [1] 10.28
fashion(c(10.3785, NA, 87))
#> [1] 10.38       87.00
fashion(matrix(1:4, nrow = 2))
#>     V1   V2
#> 1 1.00 3.00
#> 2 2.00 4.00

You can also use it on non-numeric data. In this case, all fashion() will do is convert the data to characters, and then alter missing values:

fashion("Hello")
#> [1] Hello
fashion(c("Hello", NA), na_print = "World")
#> [1] Hello World
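
For non-numeric input, the behaviour described above amounts to two steps, which can be sketched in base R as follows. The helper name fashion_text_sketch() is hypothetical and this is not corrr's code, just an approximation of "convert to character, then alter missing values".

```r
# Hypothetical helper for illustration only; NOT corrr's implementation.
fashion_text_sketch <- function(x, na_print = "") {
  out <- as.character(x)        # factors become their labels
  out[is.na(out)] <- na_print   # missing values get the placeholder
  noquote(out)                  # print without quotation marks
}

fashion_text_sketch(c("Hello", NA), na_print = "World")
fashion_text_sketch(factor(c("Male", "Female", NA)))
```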

Now is a good time to take a look back at the opening example to see that it works on a data frame and with a factor column.

Exporting

Don’t forget that it’s easy to export your fashioned output with something like:

my_data %>% fashion() %>% write.csv("fashioned_file.csv")
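
Two details of base R's CSV functions are worth keeping in mind when exporting. By default write.csv() adds a column of row names, and anything read back from the file is text unless you convert it. The sketch below uses a small hand-made data frame called fashioned as a stand-in for real fashioned output (an assumption for illustration) and a temporary file instead of a fixed path.

```r
# Hand-made stand-in for fashioned output: every column holds text
fashioned <- data.frame(
  rowname = c("mpg", "cyl"),
  mpg     = c("", "-.85"),
  cyl     = c("-.85", ""),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".csv")

# row.names = FALSE avoids writing an extra column of row numbers
write.csv(fashioned, path, row.names = FALSE)

# Read everything back as text so values like "-.85" are kept as written
back <- read.csv(path, colClasses = "character")
str(back)
```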

So what are you waiting for? Go forth and fashion()!

Sign off

Thanks for reading and I hope this was useful for you.

For updates of recent blog posts, follow @drsimonj on Twitter, or email me at drsimonjackson@gmail.com to get in touch.

If you’d like the code that produced this blog, check out the blogR GitHub repository.

 