I’m currently teaching Ecological Statistics and Data, a class I inherited from Lee Brown and Elizabeth Crone. In a lecture on population dynamics, they do some really cool things with generalized linear models—things that I don’t think are standard practice and, as far as I can tell from googling, aren’t well documented. And let me tell you, I did a lot of googling to make sure I understood this stuff before teaching it.
I woke up this morning to an email saying my first R package, holodeck, was on its way to CRAN! It’s a humble package, providing a framework for quickly slapping together test data with different degrees of correlation between variables and differentiation among levels of a categorical variable.
Example use of holodeck

```r
library(holodeck)
library(dplyr)
df <-
  # make a categorical variable with 10 observations and 3 groups
  sim_cat(n_obs = 10, n_groups = 3, name = "Treatment") %>%
  # add 3 variables that covary
  sim_covar(n_vars = 3, var = 1, cov = 0.
```
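The core idea holodeck wraps—drawing columns from a multivariate normal with a chosen covariance—can be sketched in base R with `MASS::mvrnorm()`. This is a minimal illustration of that idea, not holodeck’s actual implementation; the covariance value 0.5 below is just a made-up example.

```r
library(MASS)

set.seed(42)
n_obs  <- 10
n_vars <- 3

# covariance matrix: variance 1 on the diagonal, covariance 0.5 elsewhere
sigma <- matrix(0.5, nrow = n_vars, ncol = n_vars)
diag(sigma) <- 1

# draw n_obs rows of n_vars correlated variables
sim <- MASS::mvrnorm(n = n_obs, mu = rep(0, n_vars), Sigma = sigma)
colnames(sim) <- paste0("V", seq_len(n_vars))

dim(sim)            # 10 rows, 3 columns
round(cov(sim), 2)  # sample covariances scatter around 0.5
```

With small samples the sample covariances will only roughly match the target; holodeck’s convenience is chaining these simulated blocks onto a data frame with dplyr-style pipes.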
This was my first time attending RStudio::conf, and I went primarily to explore my career options in data science. I mainly stuck to teaching- and modeling-related talks, since that’s how I already use R. Here are my major takeaways from the conference.
Shiny is the new hotness

Shiny apps are interactive web apps that run on R code, and there was a big focus on Shiny development at the conference this year.
My PhD has involved learning a lot more than I expected about analytical chemistry, and as I’ve been learning, I’ve been trying my best to make my life easier by writing R functions to help me out. Some of those functions have found a loving home in the webchem package, part of rOpenSci.
Papers that use gas chromatography to separate and measure chemicals often include a table of the compounds they found along with experimental retention indices and literature retention indices.
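A retention index places a compound’s retention time on a scale anchored by n-alkane standards, which is what makes experimental and literature values comparable across instruments. As a minimal sketch, here is the linear (van den Dool & Kratz) calculation commonly used for temperature-programmed GC; the retention times below are made-up illustration values, not data from any paper.

```r
# Linear retention index:
#   RI = 100 * (n + (t_x - t_n) / (t_n1 - t_n))
# where t_n and t_n1 are retention times of the n-alkanes with n and n+1
# carbons that bracket the analyte's retention time t_x.
retention_index <- function(t_x, t_n, t_n1, n) {
  100 * (n + (t_x - t_n) / (t_n1 - t_n))
}

# Hypothetical retention times (minutes): decane (C10) at 5.0,
# undecane (C11) at 6.0, and an unknown compound at 5.5.
retention_index(t_x = 5.5, t_n = 5.0, t_n1 = 6.0, n = 10)
# -> 1050
```

An unknown eluting exactly halfway between C10 and C11 lands at RI 1050, which can then be checked against literature values for candidate compounds.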
Last semester I took a class that used Python. It was my first time seriously using any programming language other than R. The students were about half engineers and half biologists. The vast majority of the biologists knew R to varying degrees but had no experience with Python, and the engineers generally seemed to have some experience with Python, or at least with languages more similar to it than R.
One thing I’ve learned from my PhD at Tufts is that I really enjoy working on data wrangling, visualization, and statistics in R. I enjoy it so much that lately I’ve been strongly considering a career in data science after graduation. As a way to showcase my data science skills, I’ve been working on a side project that uses web scraping and multivariate statistics to answer the age-old question: Are cupcakes really that different from muffins?
The LI-6400XT is a portable device used to measure photosynthesis in plant leaves. As you take measurements by pressing a button on the device, they are recorded into memory. To keep track of which measurements go with which plants (or experimental treatments), there is an “add remark” option where you can enter sample information before taking measurements.
When the data are exported, you get a series of .
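The exact layout of the exported files varies, but the general cleaning problem—remark lines interleaved with data rows—can be sketched in base R. Everything below is a hypothetical format assumed for illustration (quoted remark lines beginning with a timestamp), not the actual LI-6400 file structure.

```r
# Hypothetical excerpt of an exported file: quoted remark lines
# interleaved with comma-separated data rows.
lines <- c(
  '"12:01:05 plant A, leaf 1"',
  '1,12:01:10,21.4,400.2',
  '2,12:01:40,21.9,399.8',
  '"12:03:22 plant B, leaf 1"',
  '3,12:03:30,18.7,401.0'
)

# Remarks (as assumed here) are quoted lines starting with a timestamp
is_remark <- grepl('^"\\d{2}:\\d{2}:\\d{2}', lines)

remarks <- gsub('"', '', lines[is_remark])

# Tag each data row with the most recent remark above it
remark_id  <- cumsum(is_remark)
data_rows  <- lines[!is_remark]
row_labels <- remarks[remark_id[!is_remark]]
```

`cumsum()` over the remark flags gives each data row the index of the remark that precedes it, so sample information entered once on the device can be carried down to every measurement that follows it.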
As part of my fieldwork in China, I collected harvested tea leaves that were damaged by the tea green leafhopper. I want to quantify the amount of leafhopper damage for each harvest. I was able to find several solutions for quantifying holes in leaves or even damage to leaf margins, but typical leafhopper damage is just tiny brown spots on the undersides of leaves. I did find some tutorials on using ImageJ to analyze diseased areas on leaves, but found that the leafhopper damage spots were too small and too similar in color to undamaged tissue for these tools to work reliably and be automated.