Downloading the data
Our data is available for download as .csv files in the Data tab. Most of the data that we collected from public online sources is not anonymized. All data about students, however, is anonymous.
If you believe there is an error in our data set, please contact us.
Invited conference speakers data
LingAlert was used to make a list of recent conferences and their invited speakers. Specific details about the conferences were manually collected from each conference website. For each conference, the following was recorded: year of the conference, region, main topic, theoretical bias, and language bias. Each invited speaker’s CV or website was used to determine their graduating university, current affiliation, perceived sex, and year of graduation.
- Number of Conferences: World = 48; North America = 22
- 87% of the conference invitee data dates from after 2003.
Data collection took place in November 2016. More recent data will be collected in the coming months.
We collected faculty data from 50 universities in the United States with linguistics departments or programs. We visited each institution’s website and used faculty members’ websites or CVs to annotate perceived sex and subfield.
Subfields were determined using the research interests or personal biography sections of institutions’ “People” pages. These were checked against individuals’ CVs, including lists of publications. Each person is coded for every subfield in which they do research. This ensures that we are making relevant comparisons: a person who specializes in both acquisition and syntax, for example, is presumably attending conferences and advising students in both fields.
What is included in each group
- Applied: second-language or pedagogy related
- Experimental: Child Language Acquisition and Psycholinguistics
- Phonology & Phonetics: Phonology and Phonetics
- Sociolinguistics: Sociolinguistics
- Syntax & Semantics: Syntax, Semantics, Morphology
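The coding scheme above, in which one person can count toward several groups, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure actually used for the data set: the subfield labels, the record layout, and the names `SUBFIELD_TO_GROUP`, `groups_for`, and `count_by_group` are all hypothetical, as are the sample records.

```python
from collections import Counter

# Group membership as listed above. The exact subfield labels used as keys
# (e.g. "Second Language", "Pedagogy") are assumptions for illustration.
SUBFIELD_TO_GROUP = {
    "Second Language": "Applied",
    "Pedagogy": "Applied",
    "Child Language Acquisition": "Experimental",
    "Psycholinguistics": "Experimental",
    "Phonology": "Phonology & Phonetics",
    "Phonetics": "Phonology & Phonetics",
    "Sociolinguistics": "Sociolinguistics",
    "Syntax": "Syntax & Semantics",
    "Semantics": "Syntax & Semantics",
    "Morphology": "Syntax & Semantics",
}


def groups_for(subfields):
    """Return the set of groups covered by a person's subfields."""
    return {SUBFIELD_TO_GROUP[s] for s in subfields if s in SUBFIELD_TO_GROUP}


def count_by_group(people):
    """Count each person once in every group they do research in."""
    counts = Counter()
    for person in people:
        for group in groups_for(person["subfields"]):
            counts[group] += 1
    return counts


# Hypothetical records; these are not taken from the data set.
people = [
    {"name": "Person A", "subfields": ["Child Language Acquisition", "Syntax"]},
    {"name": "Person B", "subfields": ["Syntax", "Semantics"]},
    {"name": "Person C", "subfields": ["Phonetics"]},
]

counts = count_by_group(people)
```

Because `groups_for` returns a set, a person with two subfields in the same group (Person B here) is counted once in that group, while a person whose subfields span two groups (Person A) is counted once in each.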
Student data reflect enrollment in the 2016-2017 academic year.
An email survey was sent to 50 institutions with linguistics programs in the United States. The survey asked department secretaries or advisors for a count of the male and female graduate students currently concentrating in certain areas. We also asked about the number of male and female undergraduates majoring in linguistics. The survey is reproduced below:
- How many female/male undergraduate students are majoring in linguistics at your institution? [example answer: 20 female / 20 male]
- How many female/male graduate students are specializing in Semantics, Syntax, and/or Morphology?
- How many female/male graduate students are specializing in Phonology and/or Phonetics?
- How many female/male graduate students are specializing in Sociolinguistics?
- How many female/male graduate students are specializing in Language Acquisition?
- How many female/male graduate students are specializing in a subfield other than those listed above (psycho/neurolinguistics, applied linguistics, speech pathology, etc.) or in more than one of the above subfields?
For those universities that did not reply, we used their websites to estimate how many male and female graduate students were concentrating in certain areas.
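As a rough illustration of how answers in the example format from the first survey question ("20 female / 20 male") could be turned into counts and a proportion, here is a minimal Python sketch. It assumes every answer follows that format exactly. The function names `parse_answer` and `proportion_female` are hypothetical, and this is not a description of how the responses were actually processed.

```python
import re

# Matches answers shaped like the survey's example: "20 female / 20 male".
# The strict format is an assumption; real replies may vary.
ANSWER_PATTERN = re.compile(r"(\d+)\s*female\s*/\s*(\d+)\s*male", re.IGNORECASE)


def parse_answer(text):
    """Return (female_count, male_count) parsed from a survey answer."""
    match = ANSWER_PATTERN.search(text)
    if match is None:
        raise ValueError(f"unrecognized answer: {text!r}")
    return int(match.group(1)), int(match.group(2))


def proportion_female(female, male):
    """Return the share of female students, or None if there are no students."""
    total = female + male
    return female / total if total else None


female, male = parse_answer("20 female / 20 male")
share = proportion_female(female, male)
```

Returning `None` for an empty program avoids a division by zero and keeps programs with no students in a subfield distinguishable from programs with a 0% share.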
NSF and NCES data
In addition, we gathered data from NSF about the number of Linguistics PhDs earned in the US per year by males and females since 1958. Data from the NCES provides information on the proportion of women earning BAs, MAs, and PhDs in the US since the early ’90s. (Please note the different time periods available.)
LSA Executive Committee
Data on the makeup of the Executive Committee was gathered from the LSA website for each year since 2005.