Is priming consistent across languages? Preliminary findings from the SPAML: Semantic Priming Across Many Languages
Erin M. Buchanan & The Psychological Science Accelerator
Harrisburg University
The Psychological Science Accelerator
- The PSA is a CERN for psychological science
- Globally distributed network of researchers with more than 1000 members in 82 countries
- Open science principles and practices
- PSA007: Semantic Priming Across Many Languages
Semantic Priming
- Semantic priming has a rich history in cognitive psychology
- Semantic priming occurs when response latencies are facilitated (faster) for related word-pairs than unrelated word-pairs
- Usually measured with the lexical decision or naming task
- The Semantic Priming Project (Hutchison et al., 2013) provided priming values for 1661 English word-pairs
Semantic Priming
- Semantic priming replicates pretty well
- WEIRD words
- Single language focus or multilingual individuals
- A lack of data sets that are matched on language within one study
- How can we leverage the computational skills found in natural language processing with the open data publications to improve this research?
- Goals of the SPAML:
- Assess semantic priming across (at least) 10 languages using matched stimuli
- Provide a large-scale data set for reuse in linguistics
- Registered Report at Nature Human Behaviour
The Stimuli
- Corpus Text Data: Open Subtitles Project
- Subtitles have been shown to be critically useful data sets for word frequency calculation (New et al., 2007; Brysbaert & New, 2009; Keuleers et al., 2010; Cuetos et al., 2012; Van Heuven et al., 2014; Mandera et al., 2015; and more)
- Freely available subtitles in 63 languages for computational analysis
- Approximately 43 languages contain enough data to be usable for these projects
The Stimuli
- For each language:
- Collect the top 10,000 most frequent nouns, verbs, adjectives, and adverbs
- Find the top five most similar words using cosine from subs2vec (van Paridon & Thompson, 2021)
- Cross-reference this list across languages
- Pick the most overlapping stimuli limiting repeats and proper names
- 1000 final pairs
- Important: driven by the language, not English translation
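The "top five most similar words using cosine" step can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the function name `top_k_similar`, the toy vocabulary, and the 3-dimensional vectors are all made up for the example (real subs2vec embeddings have many more dimensions and would be loaded from file).

```python
import numpy as np

def top_k_similar(word, vocab, vectors, k=5):
    """Return the k words most cosine-similar to `word`, excluding the word itself."""
    idx = vocab.index(word)
    # Normalise each row so that a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # never return the cue as its own neighbour
    best = np.argsort(sims)[::-1][:k]
    return [(vocab[i], float(sims[i])) for i in best]

# Toy vectors (hypothetical values, NOT taken from subs2vec).
vocab = ["cat", "dog", "car", "truck"]
vectors = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.1],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
])

neighbors = top_k_similar("cat", vocab, vectors, k=2)
print(neighbors)  # "dog" ranks first for "cat" in this toy space
```

Running this per language and then cross-referencing the resulting cue-target lists would be one way to approximate the selection described above.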
Nonwords and Translators
- Nonwords are generated with a Wuggy-like algorithm (Keuleers & Brysbaert, 2010)
- Translators check all pairs for proper translation, form, and meaning
- They suggest the appropriate words for retaining meaning between cue-target
- They adjust nonwords so that they are pronounceable and plausibly word-like
- Dialects are considered and separated when appropriate
Procedure
- View a simple version: https://psa007.psysciacc.org/
- Overall task:
- A single stream lexical decision task
- Every word is judged, cues and targets alike; cue-target pairs are linked only by presentation order
- Trials are formatted as:
- A fixation cross (+) for 500 ms
- CUE or TARGET in Serif font
- Lexical decision response (word, nonsense word)
- Keyboards are WILD
- 400 pairs = 800 trials
Power and Study Design
- Power focused on using accuracy in parameter estimation to adequately measure each individual item (see anything by Ken Kelley)
- We simulated using the English Lexicon Project and Semantic Priming Project
- Minimum: n = 50 per target word by condition (related, unrelated)
- Stopping: SE = .09
- Maximum: n = 320
- Adaptive sampling checks the data and re-samples pairs once an hour to randomize the study
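The minimum, stopping, and maximum rules above could be combined into a single per-item check along these lines. This is a hedged sketch only: the function names are hypothetical, and it assumes the SE is the standard error of the mean of an item's responses on whatever scale the stopping value of .09 refers to (the slides do not specify the scale).

```python
import math
import statistics

MIN_N = 50        # minimum responses per target word by condition
MAX_N = 320       # maximum responses per target word by condition
SE_TARGET = 0.09  # stopping rule; the scale of the responses is an assumption

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def still_needs_data(values):
    """Decide whether an item in one condition should keep being sampled."""
    n = len(values)
    if n < MIN_N:
        return True   # below the minimum: always keep sampling
    if n >= MAX_N:
        return False  # at the maximum: stop regardless of precision
    return standard_error(values) > SE_TARGET  # in between: stop once precise enough

# Made-up responses for illustration only.
precise_item = [0.0, 1.0] * 25   # n = 50, SE of about 0.07
noisy_item = [-2.0, 2.0] * 25    # n = 50, SE of about 0.29
print(still_needs_data(precise_item), still_needs_data(noisy_item))
```

An hourly job could apply a check like this to every item and restrict the next hour's sampling to items that still need data.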
Data Provided
- The data will be provided in several forms:
- Subject/trial level: for every participant
- Item level: for each individual item, rather than just cue or just concept
- Priming level: for each related pair compared to the unrelated pair
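At the priming level, the comparison for one target amounts to a difference between conditions. A minimal sketch, assuming priming is summarized as the mean unrelated latency minus the mean related latency (so positive values indicate facilitation); the function name and the latencies below are invented for illustration and are not project data.

```python
from statistics import mean

def priming_effect(related_rts, unrelated_rts):
    """Mean unrelated latency minus mean related latency for one target.

    Positive values mean responses were faster after a related cue.
    """
    return mean(unrelated_rts) - mean(related_rts)

# Hypothetical latencies in milliseconds for a single target word.
related = [520, 540, 500]
unrelated = [560, 580, 570]

effect = priming_effect(related, unrelated)
print(effect)  # 50
```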
Current Data Collection
*Big thanks to ZPID and Harrisburg U
Priming Distribution Results
Cross Cultural Comparison
Cross Cultural Comparison
Final Thoughts
- This work to diversify participants, languages, and researchers represented is aided by big team science approaches
- Priming effects are found across different writing systems
- Variability between languages appears to be approximately .02
- More languages currently underway
Recruitment and any Questions?
- Thank you for listening!
- We want you - join our team for data collection by contacting me
- All levels of researchers welcome
- Authorship is provided for those who meet the collaboration agreement
- All PSA collaborators are listed with their author information online