Getting Translations with rvest and Selenium

In this guide, I’ll go over how you can use web scraping with rvest and Selenium to get translations from Google Translate. Note: I encourage responsible scraping - I always try to leave some space between requests. The free version of Google Translate only handles 5000 characters at a time. I first tried to do this with just rvest and the predictability of Google Translate's links, but I could not get rvest to pull the right data off the page, so here’s a slightly more difficult approach that appears to work. [Read More]
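The approach described above - driving a real browser with Selenium so the page fully renders, then parsing it with rvest - can be sketched roughly like this. This is a hedged sketch, not the post's exact code: it assumes the RSelenium package and a working browser driver, and the CSS selector is a hypothetical placeholder you would need to find by inspecting the live page.

```r
# Rough sketch only. Assumes RSelenium is installed and a compatible
# browser driver (e.g., geckodriver for Firefox) is available.
library(RSelenium)
library(rvest)

driver <- rsDriver(browser = "firefox", verbose = FALSE)
remote <- driver$client

# Google Translate URLs are predictable: source language, target language,
# and the text to translate are all encoded as URL parameters.
url <- "https://translate.google.com/?sl=en&tl=es&text=hello&op=translate"
remote$navigate(url)
Sys.sleep(5)  # be polite: space out requests and let the page render

# Selenium returns the fully rendered page source, which plain rvest
# could not retrieve on its own.
page <- read_html(remote$getPageSource()[[1]])

# ".result-text" is a made-up selector for illustration; inspect the
# rendered page to find the element that actually holds the translation.
translation <- html_text(html_element(page, ".result-text"))

remote$close()
driver$server$stop()
```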

Multilevel Modeling Workshop Materials

Many thanks to the Rutgers University Spanish and Portuguese Department (https://span-port.rutgers.edu/) for asking me to come talk about multilevel models. I enjoyed talking to the group and meeting Twitter friends in real life, and I am especially impressed by what their department is doing in what is often considered a qualitative science. I used RStudio Cloud to share a workspace with all the materials, packages, and other information you might need. I built the slide show using R Markdown so that people could watch the slides and/or take their own notes. [Read More]

Updating Your CV with Packages

Hi guys! I have finally done it! I updated my CV with R Markdown using Steve's Markdown Templates. I was tempted to use the new vitae package, but I had already gone down this path before that came out and am just finally getting back to it. Here's a link to the entire CV folder for you to use, view, and do stuff with: CV. Please ignore the HTML files in that folder - those “knit” automatically as part of the website build using markdown; you should be using the PDF and LaTeX files for the CV part. [Read More]

Gathering Text from the Web

Hi everyone! I don't really feel like working too hard today, so I decided to write a blog post about how my student Will and I used rvest to mine articles from several different news sources for a project. All the scripts and current goings-on of this project can be found on our OSF page - this project is also connected to the GitHub folder with the files. First, we picked four web sources to scrape - The New York Times, NPR, Fox News, and Breitbart - because of their known political associations, and specifically, we focused on their political sections. [Read More]

Mediation Moderation Workshop

Hi everyone! I have been super swamped with a bunch of due dates that all hit in April. For a small brag, and because I like making lists: 9 revise and resubmits (four we've sent back, two have been accepted!), 4 conference posters and one invited talk, 1 submitted grant (fingers crossed!), 2 invited papers, 2 theses that I'm chairing and 2 that I'm on the committee for, and DataCamp! It's been nuts, so I haven't left the house much or done much of anything else. [Read More]

Working With Messy Text

Heyo! I am doing my best to procrastinate here on a blustery Tuesday afternoon. So, I decided to share some code I've put together that solves problems in R that I used to solve in Perl. HTML or C++ was probably my first real language, but I love the heck out of Perl. It's never done me wrong (unlike you, PHP). Anyways! The context of this project is that we are developing a dictionary of words to complement the work done by Jonathan Haidt and Jesse Graham - learn more. [Read More]

New Publication - Detect Low Quality Data

My coauthor John Scofield and I just had a publication accepted at Behavior Research Methods - you can check out the publication preprint at OSF. We threw together a website for the paper that summarizes everything we found and puts all the materials together in one place - check it out. We created a really nice R function to help you detect low quality data, which you can find on GitHub, and I even made a video at YouTube that explains all the parts of the function. [Read More]