Paul Smolensky
Verified · Johns Hopkins University · Neuroscience
Active 1979–2024
Research topics
- Natural Language Processing
- Computer Science
- Artificial Intelligence
- Theoretical computer science
- Data Mining
- Mathematics
- Programming language
- Linguistics
Selected publications
2021 · 5 citations
- Computer Science
- Natural Language Processing
- Artificial Intelligence
Human language is often assumed to make "infinite use of finite means" - that is, to generate an infinite number of possible utterances from a finite number of building blocks. From an acquisition perspective, this assumed property of language is interesting because learners must acquire their languages from a finite number of examples. To acquire an infinite language, learners must therefore generalize beyond the finite bounds of the linguistic data they have observed. In this work, we use an artificial language learning experiment to investigate whether people generalize in this way. We train participants on sequences from a simple grammar featuring center embedding, where the training sequences have at most two levels of embedding, and then evaluate whether participants accept sequences of a greater depth of embedding. We find that, when participants learn the pattern for sequences of the sizes they have observed, they also extrapolate it to sequences with a greater depth of embedding. These results support the hypothesis that the learning biases of humans favor languages with an infinite generative capacity.
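The center-embedding setup described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary `PAIRS`, the helpers `generate` and `accepts`, and the convention that a sequence with n nested pairs has n - 1 levels of embedding are all hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical vocabulary: each opener must be closed by its matching closer.
PAIRS = {"a1": "b1", "a2": "b2"}

def generate(n_pairs):
    """Yield every center-embedded sequence with exactly n_pairs nested pairs."""
    for openers in product(PAIRS, repeat=n_pairs):
        # Closers appear in the reverse order of their openers (nesting).
        closers = [PAIRS[o] for o in reversed(openers)]
        yield list(openers) + closers

def accepts(seq):
    """Return True if seq is a single, fully nested center-embedded sequence."""
    stack = []
    seen_closer = False
    for tok in seq:
        if tok in PAIRS:
            if seen_closer:          # no new opener once closing has begun
                return False
            stack.append(PAIRS[tok])
        else:
            if not stack or stack.pop() != tok:
                return False
            seen_closer = True
    return seen_closer and not stack

# Assumed split: "training" items have at most 3 nested pairs, held-out items have 4.
training = [s for n in (1, 2, 3) for s in generate(n)]
held_out = list(generate(4))
print(len(training), len(held_out), all(accepts(s) for s in held_out))
```

A stack-based recognizer like `accepts` has no depth limit, so it accepts the deeper held-out items by construction. The experiment asks whether human learners extrapolate in the same way.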
2019 · 16 citations
- Computer Science
- Artificial Intelligence
Generating formal-language programs represented by relational tuples, such as Lisp programs or mathematical operations, to solve problems stated in natural language is a challenging task because it requires explicitly capturing discrete symbolic structural information implicit in the input. However, most general neural sequence models do not explicitly capture such structural information, limiting their performance on these tasks. In this paper, we propose a new encoder-decoder model based on a structured neural representation, Tensor Product Representations (TPRs), for mapping Natural-language problems to Formal-language solutions, called TP-N2F. The encoder of TP-N2F employs TPR 'binding' to encode natural-language symbolic structure in vector space and the decoder uses TPR 'unbinding' to generate, in symbolic space, a sequential program represented by relational tuples, each consisting of a relation (or operation) and a number of arguments. TP-N2F considerably outperforms LSTM-based seq2seq models on two benchmarks and creates new state-of-the-art results. Ablation studies show that improvements can be attributed to the use of structured TPRs explicitly in both the encoder and decoder. Analysis of the learned structures shows how TPRs enhance the interpretability of TP-N2F.
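As a rough illustration of the TPR binding and unbinding operations named in this abstract, the following pure-Python sketch binds filler vectors to role vectors by summing outer products and recovers each filler by multiplying the result with its role vector. It is not the TP-N2F model: the function names, the example vectors, and the assumption that the role vectors are orthonormal (which makes unbinding exact) are all chosen here for illustration.

```python
import math

def bind(fillers, roles):
    """Sum of outer products filler_i (x) role_i, as a nested-list matrix."""
    n_rows, n_cols = len(fillers[0]), len(roles[0])
    tpr = [[0.0] * n_cols for _ in range(n_rows)]
    for f, r in zip(fillers, roles):
        for i in range(n_rows):
            for j in range(n_cols):
                tpr[i][j] += f[i] * r[j]
    return tpr

def unbind(tpr, role):
    """Recover the filler bound to `role` (exact when roles are orthonormal)."""
    return [sum(row[j] * role[j] for j in range(len(role))) for row in tpr]

# Hypothetical orthonormal role vectors (a rotated basis of R^3).
roles = [[0.6, 0.8, 0.0], [-0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical filler vectors standing in for symbol embeddings.
fillers = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]

tpr = bind(fillers, roles)
recovered = [unbind(tpr, r) for r in roles]
print(recovered)
```

With role vectors that are not orthonormal, the same `unbind` would return a mixture of fillers, and a dual (unbinding) vector per role would be needed instead.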
Recent grants
IGERT: Unifying the Science of Language
NSF · $3.2M · 2006–2015
INSPIRE Track 1: Gradient Symbolic Computation
NSF · $1.0M · 2013–2020
Frequent coauthors
- Géraldine Legendre · Johns Hopkins University · 34 shared
- Jianfeng Gao · University of Toronto · 25 shared
- Roland Fernandez · 19 shared
- Hamid Palangi · 18 shared
- R. Thomas McCoy · Princeton University · 17 shared
- Matthew Goldrick · Northwestern University · 16 shared
- Jennifer Culbertson · 15 shared
- Tal Linzen · 14 shared