This spring, I'm teaching a new class on data visualization with R. I'm posting all materials as I go. Feel free to follow along. Each lecture has slides and an interactive worksheet.
#rstats
Very excited to announce my latest project, a book on data visualization. Approximately half of the book is written, and all completed chapters are now available as online preview.
#rstats
#dataviz
I also don’t get this whole R vs python thing. They clearly have very different strengths and weaknesses.
R: analyze tabular data and make visualizations.
python: strangle small mammals to death and crush their bones.
As news outlets are creating increasingly complex data analysis and visualization projects, we should start demanding fully reproducible analysis scripts with every story.
New blog post: PCA tidyverse style.
I've been struggling with doing PCA using an idiomatic tidyverse approach. Now I think I've figured it out.
#rstats
#tidyverse
Now on CRAN: ggridges 0.5, with support for shading by probability, points overlaid on density curves, rainclouds, and rugs.
#rstats
#dataviz
#ggridges
Attention #rstats users: In a few weeks, I'm going to release #cowplot 1.0, and there are going to be some important changes from the current release. I encourage you to check out the development version now and verify things work for you. Thread.
Spatial plotting just improved a lot in the development version of ggplot2. In a nutshell, you can now mix and match regular geoms with `geom_sf()` and `coord_sf()`. If you're doing any geospatial plotting, please test this out. 1/n
#rstats
#ggplot2
Sad to announce I'll have to abandon my project of writing a book on #dataviz entirely in #rstats, with all figures programmatically generated. I just learned it is not possible. Ignore the 21 chapters already online. They are a mirage.
I've now mostly figured out how to implement bivariate color scales in #ggplot2. The guide box still needs some visual tweaking, though.
@lenkiefer
@hadleywickham
You can read the free ebook "Fundamentals of Data Visualization" by Claus O. Wilke on #dataviz with #rstats on the following website of the named author:
I spent way too much time this weekend writing orthographic projection code. Now that it works, let's celebrate with a spinning globe.
#rstats
#dataviz
The latest #ggplot2 has a `clip = "off"` option to allow drawing outside of the plot panel. This allows for all sorts of neat plotting tricks. E.g., direct labeling. ("Toyota Corolla" extends beyond the plot area.)
#rstats
#dataviz
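A minimal sketch of the direct-labeling trick, not the code behind the original figure. It assumes `clip = "off"` is passed to `coord_cartesian()`, and it uses the built-in `Orange` dataset purely for illustration. The label text and margin size are arbitrary choices.

```r
library(ggplot2)

# Last measurement of each tree; these rows position the direct labels.
last_obs <- Orange[Orange$age == max(Orange$age), ]

p <- ggplot(Orange, aes(age, circumference, group = Tree)) +
  geom_line() +
  geom_text(
    data = last_obs,
    aes(label = paste0(" Tree ", Tree)),
    hjust = 0  # left-align so the text starts at the end of each line
  ) +
  # No axis expansion: the lines end at the panel edge, so labels fall outside it.
  scale_x_continuous(expand = c(0, 0)) +
  # Turn off clipping so text outside the panel is still drawn.
  coord_cartesian(clip = "off") +
  # Widen the right margin (in pt) to make room for the labels.
  theme(plot.margin = margin(5.5, 50, 5.5, 5.5))

print(p)
```

Without `clip = "off"`, the labels would be cut off at the panel border.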
Designer: I don't have software to make a pie chart.
Manager: Just manually draw two differently colored wedges and write "78%" into the bigger one. Nobody will be able to tell if it's off by a little.
This is why we’ve gone away from installing software on student computers in my data science class. Installing python is literally harder than anything you could reasonably do with it (in an intro class) once it’s installed properly.
I took my 1st CS course at age 18. I almost dropped it 3 days in b/c I couldn’t install the software. My friend had to help me install it.
We are married now. Sometimes I make him install the updates on my Mac, for old times' sake.
True story.
I'm receiving feedback that (paraphrased) installing software "is easy" or "builds character" or "is a critical skill."
My response: I doubt you've ever taught programming at scale to non-CS majors. Thread.
I just merged support for one of the most frequently requested features into the ggplot2 development branch: Plot titles that span the entire plot.
#rstats
#ggplot2
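A short sketch of what this looks like in use. The tweet doesn't name the setting; this assumes the released form of the feature, the `plot.title.position` theme setting in ggplot2 3.3.0 and later. The data and title are placeholders.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  labs(title = "Fuel economy declines with vehicle weight") +
  # "plot" aligns the title with the whole plot rather than the panel.
  theme(plot.title.position = "plot")

print(p)
```

The default, `"panel"`, keeps the title aligned with the left edge of the plot panel.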
It seems that the graduate student tax is in the Senate tax bill as well. Whatever universities do to address this issue, it will cause a resource drain from the US research enterprise. Bye-bye US leadership in STEM.
Nice! “Before Slack, you would have to go look in five different places even to find a file,” says @SlackHQ's @aunder. “Was it a PDF? Was it a file in the cloud? Slack serves as the common denominator that allows you to search across all of them.”
I continue to be amazed that I can make a figure like this one in ~15 lines of #ggplot2 code. (And half of that is theme code to get the styling right. The logic is only 6 lines.)
#rstats
#dataviz
What is the value of a PhD? The other day I commented on the financial value, but let's dive a little bit deeper into the value in terms of personal growth and development of career skills.
(This is long. Click to expand and read.)
Does it make sense to…
Our students get 6-figure offers straight out of grad school. Attended a defense last week (not my student, but I'm on the committee) where the student has a $210k/yr offer and was contemplating whether she should take it or not.
I recently added this error message to ggplot2. But now I have second thoughts. Maybe a more precise error message would have been "object of type 'closure' is not subsettable".
#rstats
The log-scale plot of case fatality rates of COVID-19 vs age highlights an interesting finding: COVID-19 is consistently worse than the seasonal flu, at all ages. There is no age-dependent effect, just an overall shift in outcomes for the worse.
ICYMI: This is an incredibly important release for anybody who needs to deal with colors. I couldn't have written my book without this package. In particular, lightening and darkening of colors on the fly is so useful.
New release of #rstats pkg #colorspace: refined and named palettes, #ggplot2 color scales, visualization and assessment, interactive color apps (shiny + Tcl/Tk), color vision deficiency emulation, and much more.
#dataviz
#endrainbow
(1/12)
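For the lightening and darkening mentioned above, a minimal sketch using the colorspace functions `lighten()` and `darken()`. The base color and the amounts are arbitrary illustration values, not taken from the book or the release notes.

```r
library(colorspace)

base_col <- "#1F77B4"  # arbitrary example color

# `amount` controls how strongly the color is lightened or darkened.
lighter <- lighten(base_col, amount = 0.3)
darker  <- darken(base_col, amount = 0.3)

print(c(lighter = lighter, base = base_col, darker = darker))
```

Both functions return hex color strings, so the results can go straight into a ggplot2 manual color scale.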
While working on the next ggplot2 release, I've come to believe that for every bug in the ggplot2 code base there's at least one published R package that uses it as a feature.
I don't have a SoundCloud, but I've written a book on data visualization you can read for free here. It's not about programming, and there's neither R nor python in the book.
Why is it that most people in tech are so poorly informed about biology? No, we couldn't make animals smarter. We literally don't have the technology. We don't even know the biological basis of "being smart." Maybe in 50 years. Not today. Not next year. Not this decade.
why aren’t we “uplifting” other species?
this is a classic sci fi plot point that I feel like we just skipped over
if we wanted to make animals much smarter, we could, no?
@balajis
That assumes it's easier to rebuild all physical infrastructure than to get self-driving to work. I wouldn't make that assumption. Self-driving will be solved soon (couple of years at most). 1/n