“Would they hire you to talk to aliens?”
“That’s so funny I have a friend who studies French literature!”
“So what do you think of Chomsky’s political views?”
“Linguistics? At MIT? I didn’t know they had that. I thought they just did science and stuff.”
Thanks to the popularity of the movie Arrival, the world now has a pretty good idea of what us linguists do. You know… figuring out how to communicate with aliens, internalizing all of space and time, it’s all in the job description.
Okay not really. Like not even a little bit. We don’t even have a special affinity for linguini (I did not make this up).
It is not surprising that people don’t know much about the study of linguistics. Few schools offer linguistics classes at an early stage of education, and the field itself is broad enough that even linguists don’t always know what their colleagues in other subfields do.
I’m used to getting pseudo-science questions about language pretty much wherever I go, and while I expect this from the general population, I was surprised to find that people at MIT are often nearly as confused!
And while most of the responses I get from MIT people are ones of polite interest or excitement, people still generally assume that my field is somehow categorically unrelated to science and engineering.
(The less enthusiastic responses seem to convey doubt about whether non-science/engineering fields have a place here, but that is a topic for another day.)
My goal is to debunk some myths about what linguistics is and what it means to be a scientist. I’ll be talking specifically about theoretical linguistics (which we call generative linguistics) with a special focus on syntax (my subfield).
But there is a whole world of sociolinguistics that I encourage you to look into as well if you are interested.
Theoretical linguistics explained
In linguistics, we hypothesize that the human language faculty is a universal property of the human species that is built on common foundational principles.
The key here is that linguistics is not the study of a language or even many languages, but rather the study of language as a concept, which all the languages of the world are instances of.
In the way that we believe the laws of physics to be invariant across different physical environments, we believe the laws of language structure to be invariant across languages (though things may appear different in different physical and linguistic environments).
In line with this hypothesis, we propose abstract representations of language that can be transformed in a number of ways to yield the diverse structures that we see within and across languages. And to get at this underlying representation, we look at data from all languages to find patterns.
I studied physics as an undergraduate, so I’m going to use some analogies from Newtonian mechanics to show you what I mean. In Newtonian mechanics we have a formal framework, namely calculus, which has different types of objects and operations, such as variables, derivatives, and integrals.
On top of the framework, we have model-specific constraints on how to use the framework to describe actual phenomena. For example, we have the notions of force and acceleration, and Newton’s second law, which relates the two (F=ma). To model facts about language, we too have a framework and a model that constrains it.
Or in other words, we too propose the existence of objects, and operations that relate objects to each other and build structure, and we propose principles to constrain those operations.
The types of evidence that we use to create a syntactic model come from looking at necessary conditions for the formation of grammatical sentences.
When looking at the structure of a particular language, we ask native speakers of that language to give us what we call “grammaticality judgments”.
What we are interested in is what kinds of sentences their internalized system can produce/parse, and what kinds of things it can’t. We are not interested in the “rules” of the language that they learned in school, but rather what the fundamental system looks like naturally.
Example 1: John and I...
For example, in English we learn in school that the sentence “John and I went to the store” is correct, and that any other way of saying it is incorrect. But this does not capture the intuition that native English speakers have about their language.
Of the following four sentences, three would naturally be uttered by a native English speaker, not just one (* marks an ungrammatical sentence). Furthermore, example (d) sounds wrong even though, by the school rule, it should be “more correct” than (b) or (c).
a. John and I went to the store.
b. John and me went to the store.
c. Me and John went to the store.
d. *I and John went to the store.
The following English examples are meant to take you on a brief tour of the kinds of reasoning we do as linguists. To understand the following paradigm, we need to be able to describe how different objects can coexist in some environments but not others.
Example 2: Different objects in different environments
To do that we need concepts for objects like ‘that the world is round’ and ‘Wallace’, and concepts for different environments, such as ‘after a verb’, and ‘in a conjunction’.
e. I know that the world is round.
f. I know Wallace.
g. I know Wallace and Gromit/I know that the world is round and that fish like cheese.
h. *I know that the world is round and Wallace.
In (e) and (f), the objects ‘that the world is round’ and ‘Wallace’ are in the same environment because they both come after a verb. And yet, when we put those two objects together in a conjunction after the verb, we see that something goes wrong.
While both of those objects can independently appear after a verb (examples (e) and (f)), and they can both independently appear in a conjunction (example (g)), they cannot appear together in a conjunction after a verb (or anywhere else, example (h)). So we will need a way to describe the difference between those objects and why the two environments accommodate them differently.
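As a rough illustration of what such a description might look like, here is a minimal Python sketch. The category labels NP and CP, the function names, and the idea that both halves of a conjunction must share a category are all assumptions made for this example, not an analysis given in this post.

```python
# Toy model of examples (e)-(h). Each object gets a hypothetical category
# label; "NP" and "CP" are illustrative names, not taken from the text.
CATEGORY = {
    "Wallace": "NP",
    "Gromit": "NP",
    "that the world is round": "CP",
    "that fish like cheese": "CP",
}


def ok_after_know(obj):
    """Environment 'after the verb know': either kind of object is fine."""
    return CATEGORY[obj] in {"NP", "CP"}


def ok_conjunction(left, right):
    """Assumed constraint: both conjuncts must have the same category."""
    return CATEGORY[left] == CATEGORY[right]


def ok_know_conjunction(left, right):
    """A conjunction after 'know' must satisfy both environments at once."""
    return (
        ok_after_know(left)
        and ok_after_know(right)
        and ok_conjunction(left, right)
    )


print(ok_after_know("that the world is round"))                    # (e) True
print(ok_after_know("Wallace"))                                     # (f) True
print(ok_know_conjunction("Wallace", "Gromit"))                     # (g) True
print(ok_know_conjunction("that the world is round", "Wallace"))    # (h) False
```

The point of the sketch is only that the two objects need different labels: if they shared one, nothing would distinguish (g) from (h).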
Furthermore, our descriptions of these environments have to be more precise than, say, ‘after a verb’, because of sentences like the following:
i. Can eagles that fly swim? (compare to *Eagles can fly swim) (from Chomsky 2012)
j. *Can eagles that fly planes? (compare to Eagles can fly planes)
In principle nouns should be able to follow verbs like fly, and verbs shouldn’t. So if we define an environment with the description, ‘after a verb’, we predict the wrong results for (i) and (j). But if we acknowledge the fact that fly is part of a relative clause, we can begin to understand the data.
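The contrast between a flat ‘after a verb’ rule and a structure-aware one can be sketched in Python. This is a toy under stated assumptions: the word lists, the hand-written bracketings in PARSES, and both function names are hypothetical, and the structural check is far cruder than any real syntactic model.

```python
NOUNS = {"eagles", "planes"}
VERBS = {"fly", "swim"}

# Examples (i) and (j) as flat strings of words.
FLAT = {
    "i": ["can", "eagles", "that", "fly", "swim"],
    "j": ["can", "eagles", "that", "fly", "planes"],
}

# The same examples with the relative clause bracketed off by hand.
PARSES = {
    "i": {"relative_clause": ["that", "fly"], "main_predicate": ["swim"]},
    "j": {"relative_clause": ["that", "fly", "planes"], "main_predicate": []},
}


def linear_ok(words):
    """Flat rule: a verb may not come directly after 'fly'."""
    for i, word in enumerate(words[:-1]):
        if word == "fly" and words[i + 1] in VERBS:
            return False
    return True


def structural_ok(parse):
    """Structure-aware rule: 'can' needs a main verb outside the relative clause."""
    main = parse["main_predicate"]
    return bool(main) and main[0] in VERBS


for label in ("i", "j"):
    print(label, linear_ok(FLAT[label]), structural_ok(PARSES[label]))
# prints: i False True
# prints: j True False
```

The flat rule rejects the grammatical (i) and accepts the ungrammatical (j), while the bracketed version matches the judgments for both.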
By looking at the behavior of different objects in different environments cross-linguistically, we gain insight into the kinds of principles that underlie our language faculty.
While the specific conditions for grammaticality in English will differ from other languages, we can still learn a lot about basic structural constraints by looking at any given language.
Following the scientific method
The study of linguistics closely follows the scientific method in the same ways that other sciences do. We form hypotheses by positing a framework and a model, gather evidence that can falsify said hypotheses, and modify the theory based on careful analysis of the data.
We just happen to be at an earlier stage of development than other scientific fields. Our framework might look different from, say, algebra, but it still has formal properties and we work to make it more precise every day.
I haven’t provided any answers here, or shown you what our models really look like, but hopefully this has been helpful in understanding the kinds of things linguists like me think about.
The subtle structural knowledge that we have of these sentences, and how we map this structural knowledge onto meaning and sound, is what we as theoretical linguists want to understand.
If you’re ever bored, try coming up with ungrammatical sentences and see if you have ideas about why they are bad. I would love to hear about it!