Automating the Linguistic Annotated Bibliography (LAB)

Ari L. Cunningham, Erin M. Buchanan, & Nikhil Chate


The Team

What is the LAB?

What is the need?

Automation: Curate the Training Data

Search Term          BRM    LRE    PsycINFO   PLoS One
Lexical Norms        11     6      9          302
Lexical Database     116    97     58         302
Linguistic Norms     4      6      37         596
Linguistic Database  14     22     6          596
Corpus               507    1000   NA         NA
Norms                818    218    NA         NA
Unique               1030   961    103/475    801

Automation: What are the features?

semantic ambiguity homonyms meaning frequency homonym norming methods data movie subtitles free association homonym meaning annotations comparison homonym meaning frequency estimates derived movie television subtitles free association explicit ratings words ambiguous interpretation dependent context advancing theories ambiguity resolution important general theory language processing resolving inconsistencies observed ambiguity effects experimental tasks focusing homonyms words bank unrelated meanings edge river financial institution present work advances theories methods estimating relative frequency meanings factor shapes observed ambiguity effects develop method estimating meaning frequency based meaning homonym evoked lines movie television subtitles human raters replicate extend measure meaning frequency derived classification free associates evaluate internal consistency measures compare published estimates based explicit ratings meaning frequency compare set norms predicting performance lexical semantic decision mega
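The feature text above reads like an article title and abstract after lowercasing, punctuation removal, and stop-word filtering. The following is a minimal sketch of that kind of preprocessing, not the authors' actual pipeline: the function name `to_feature_text`, the tiny `STOP_WORDS` set, and the example sentence are all illustrative assumptions.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration only;
# a real pipeline would likely use a fuller list from an NLP library.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in", "is",
    "of", "on", "or", "that", "the", "their", "to", "we", "when", "with",
}

def to_feature_text(text, stop_words=STOP_WORDS):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(t for t in tokens if t not in stop_words)

# Illustrative sentence (an assumption, not a quotation from the article).
example = "Words are ambiguous when their interpretation is dependent on context."
print(to_feature_text(example))
# prints "words ambiguous interpretation dependent context"
```

The resulting string of content words is what a bag-of-words or embedding-based classifier would take as input.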

Automation: Building an algorithm

Automation: Results

              Predicted No   Predicted Yes
Not Included  138            7
Included      18             47
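The confusion matrix above can be summarized with standard classification metrics. The sketch below computes them from the four reported counts, treating "Included" as the positive class; the function name `classification_metrics` is a hypothetical helper, and which metrics the authors themselves reported is not stated here.

```python
# Counts taken from the confusion matrix above
# (rows: actual status, columns: predicted status).
tn, fp = 138, 7    # Not Included: predicted No, predicted Yes
fn, tp = 18, 47    # Included:     predicted No, predicted Yes

def classification_metrics(tn, fp, fn, tp):
    """Return standard metrics with 'Included' as the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

metrics = classification_metrics(tn, fp, fn, tp)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On these counts, accuracy is 185/210 (about 0.88), precision is 47/54 (about 0.87), and recall is 47/65 (about 0.72), so the classifier misses relevant articles more often than it flags irrelevant ones.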

What appears to be most informative?

What are we missing?

What happens next?

  1. All articles found in this search procedure
  2. Articles that users suggest for LAB2.0 through our new Shiny app
  3. Articles with high classification probabilities from the algorithm
  4. Articles with at least two “yes” votes from LAB2.0

Thanks and Sources

Extra Info on the C-LSTM Model

The model is a hybrid deep learning architecture as detailed below: