In the most recent rOpenSci community call, Juliane Manitz presented on standards for statistical software in the pharmaceutical industry. One of her slides in particular was about how the pharmaceutical industry assesses the reliability of a package.
What’s a matrix-column? The tibble package in R allows for the construction of “tibbles”—a sort of “enhanced” data frame. Most of these enhancements are fairly mundane, such as better printing in the console and not modifying column names.
Taking a potentially continuous treatment, binning it into categories, and doing ANOVA results in reduced statistical power and complicated interpretation. Yet, as a graduate student, I was advised multiple times, by different people, to bin continuous treatment variables into categories.
If your data is just 1’s and 0’s, it can be difficult to visualize alongside a best-fit line from a logistic regression.
Even with transparency, the overplotted data points just turn into a smear on the top and bottom of your plot, adding little information.
In March (which feels like years ago, now), when universities started seriously thinking about their response to COVID-19, I was teaching Ecological Models and Data as instructor of record and finishing up my dissertation.
I’m currently teaching Ecological Statistics and Data, a class I inherited from Lee Brown and Elizabeth Crone. In a lecture on population dynamics, they do some really cool things with generalized linear models: things that I don’t think are standard practice and that, as far as I can tell from googling, aren’t well documented.
Tea Science Tuesdays are Instagram live streams where I’ll talk informally about some aspect of tea science while enjoying some tea. Each week, there will be a topic, a suggested tea if you want to drink along, and a suggested “reading” (sometimes a video).
I woke up this morning to an email saying my first R package, holodeck, was on its way to CRAN! It’s a humble package, providing a framework for quickly slapping together test data with different degrees of correlation between variables and differentiation among levels of a categorical variable.
This was my first time attending RStudio::conf, and I went primarily to explore my career options in data science. I mainly stuck to teaching- and modeling-related talks since that’s how I already use R.
I’m currently in Hangzhou, China at the Tea Research Institute (TRI) for my fourth and last time. It’s bittersweet (like my favorite teas ;-) ) since I’m both glad to be nearing the end of my PhD, and sad to say goodbye to all the friends I’ve made and a city I’ve really grown to enjoy living in.