DHisGram – Syntax out of Africa: Deep History through Human Grammars (PI C. Guardiano) – Parametric Comparison Method

The project explores an ambitious challenge in contemporary historical linguistics: reconstructing deep language history through the hidden structures of grammar.

By combining theoretical linguistics, statistical modelling and comparative analysis, the research aims to establish a new method for investigating the origins, evolution and relationships of human languages across time.

At the core of the project is the Parametric Comparison Method (PCM), a procedure for phylogenetic reconstruction that uses syntactic parameters as historical evidence. Unlike traditional comparative methods, which rely mainly on sound correspondences and lexical etymologies, the PCM compares abstract grammatical structures. These structures make it possible to measure syntactic distances between languages, including languages separated by very long periods of time or belonging to different families.

The areas under investigation

The project focuses on two macro-areas marked by exceptional linguistic diversity and complex population histories: Southeast Asia and the Pacific, and the Americas.

In Southeast Asia and the Pacific, the research will examine selected languages from Sino-Tibetan, Austroasiatic, Papuan, Austronesian, and Australian language groups.

In the Americas, the project will investigate languages from several major families and groupings, including Eskimo-Aleut, Na-Dene, Salishan, Muskogean, Quechua, Uto-Aztecan, Mayan, Arawak, Tupi-Guaranian, Waikuruan, Macro-Jê, and Carib.

Research Questions

Southeast Asia and the Pacific. In light of recent debates in population genetics (Reich 2019), the project will test whether the grammatical signal supports stronger connections between Sino-Tibetan and Austroasiatic languages, or between these groups and the languages of New Guinea and Australia.

The Americas. The project will examine whether the languages of the Americas preserve syntactic traces of different migration waves into the continent. It will test whether syntactic comparison can provide new evidence in the long-standing debate on the deep classification of American languages. That debate sets Greenberg’s (1987) Amerind hypothesis, with its distinct position for Na-Dene and Eskimo-Aleut, against alternative classifications proposing a much higher number of independent language families (Campbell & Mithun 1979; Ringe 1992, 1996; Vajda 2010).

Scientific Challenges

Challenging the traditional claim that grammar carries little deep historical signal, the project aims to test whether grammars retain historical signals deep enough to contribute to the reconstruction of human prehistory. If successful, it will consolidate the PCM as a new tool for historical linguistics and provide independent linguistic evidence for debates currently dominated by genetics, archaeology and lexical comparison.

DHisGram is expected to contribute in three main directions.

First, it will provide new evidence on the historical relationships among languages and populations in Southeast Asia, the Pacific and the Americas.

Second, it will strengthen the methodological foundations of the PCM through the refinement of parametric analysis, computational modelling and statistical testing.

Third, it will expand current knowledge of syntactic variation by analysing a broad range of typologically diverse languages.

The scientific outputs include a series of publications addressing both methodological developments and macro-historical results.

Methodology

The research will collect and analyse syntactic data from a selected set of languages, focusing on nominal structures and on at least 100 binary syntactic parameters. Language data will be encoded as parameter grids, analysed through computational phylogenetic methods and statistical testing, and compared with archaeological and genetic evidence.
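As a rough illustration of what a parameter grid and a syntactic distance might look like, the sketch below encodes each language as a string of parameter values and computes the share of differing values among the parameters set in both languages. Everything in it is an assumption made for illustration: the language names, the values, the '0' convention for non-comparable parameters and the distance formula are hypothetical and are not taken from the project's own procedures.

```python
from itertools import combinations

# Toy parameter grid: hypothetical languages and values, for illustration only.
# '+' and '-' are the two settings of a binary parameter; '0' is assumed here
# to mark a parameter that cannot be compared in that language.
GRID = {
    "LangA": "++-+0-",
    "LangB": "++--0+",
    "LangC": "-+-++-",
}


def syntactic_distance(a: str, b: str) -> float:
    """Share of differing values among parameters set ('+'/'-') in both strings."""
    if len(a) != len(b):
        raise ValueError("parameter strings must have equal length")
    # Keep only the positions where both languages have a set value.
    comparable = [(x, y) for x, y in zip(a, b) if x != "0" and y != "0"]
    if not comparable:
        raise ValueError("no comparable parameters")
    differences = sum(1 for x, y in comparable if x != y)
    return differences / len(comparable)


def distance_matrix(grid: dict) -> dict:
    """Pairwise distances for every pair of languages in the grid."""
    return {
        (p, q): syntactic_distance(grid[p], grid[q])
        for p, q in combinations(sorted(grid), 2)
    }


if __name__ == "__main__":
    for pair, d in distance_matrix(GRID).items():
        print(pair, round(d, 2))
```

Normalising by the number of comparable parameters, rather than by the full grid length, keeps pairs with many non-comparable parameters from looking artificially close.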

The research is organized into four Work Packages developed over a three-year timeline.

WP1 — Data Collection. The first phase focuses on the collection of syntactic data from selected languages across Southeast Asia, the Pacific and the Americas. Following established PCM procedures, researchers will conduct targeted elicitation sessions with trained native speakers and expert linguists, both online and in person. In parallel, automated methodologies for extracting and integrating linguistic data from existing digital sources and databases will be developed and tested.

WP2 — Data Analysis. Collected data will be analysed through the PCM framework and converted into parameter grids representing the grammatical profile of each language. Since many of the languages under investigation are typologically underexplored, this phase will also contribute to refining and expanding the parametric system itself by identifying previously unrecorded syntactic patterns.

WP3 — Quantitative Comparison and Historical Testing. Syntactic distances between languages will be measured using computational phylogenetic and statistical methods. The resulting phylogenies will be tested against existing linguistic classifications as well as archaeological, genomic and population-genetic evidence. Attention will be devoted to the statistical robustness of cross-family historical relations and to the development of methods specifically adapted to parametric linguistic data.
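To give a sense of how pairwise distances can be turned into a tree, the sketch below applies UPGMA (average-linkage clustering) to a small distance table. UPGMA is used here only because it is simple to show; the project text does not say which phylogenetic algorithms will be used, and the language labels and distances are hypothetical.

```python
from itertools import combinations


def upgma(labels, dist):
    """Average-linkage clustering.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    Returns a nested tuple recording the order in which clusters merged.
    """
    clusters = {label: 1 for label in labels}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance from the new cluster to each remaining one is the
        # size-weighted average of the two merged clusters' distances.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical distances between four made-up languages.
TOY_DIST = {
    frozenset(("LangA", "LangB")): 0.1,
    frozenset(("LangC", "LangD")): 0.2,
    frozenset(("LangA", "LangC")): 0.5,
    frozenset(("LangA", "LangD")): 0.6,
    frozenset(("LangB", "LangC")): 0.5,
    frozenset(("LangB", "LangD")): 0.6,
}

tree = upgma(["LangA", "LangB", "LangC", "LangD"], TOY_DIST)

if __name__ == "__main__":
    print(tree)
```

A tree obtained this way is only a starting point: testing it against existing classifications and non-linguistic evidence, as described above, would require separate statistical procedures that this sketch does not cover.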

WP4 — Database Development. All linguistic data and parametric analyses will be integrated into the PCM Hub, a searchable digital database designed specifically for PCM research. The database will be progressively populated during the project and publicly released in its final phase as a permanent research infrastructure for future comparative and historical studies.

Open Positions

The project will open positions for PhD students, postdoctoral researchers and early-career researchers, with contracts ranging from one to two years, depending on the role and research area.

Postdoc hiring procedures will begin in June 2026. Applications for the PhD programme are open until June 30, 2026.

Applications are welcome from candidates with backgrounds in theoretical linguistics, syntax, historical linguistics, computational phylogenetics, statistics, computer science, database design and related fields.

Researchers interested in joining the project are encouraged to contact cristina.guardiano@unimore.it for further information about upcoming calls, research profiles and application procedures.