Models of language variation and change

Description of the project (from the Research Proposal written by Rita Manzini) - Data collection

Variation in minimally different language systems (microvariation) has been the preserve of sociolinguistics and dialectology and is traditionally treated as a cultural or social artefact strictly associated with language-specific contingencies, hardly depending on universal factors.

The level of depth and the sophistication of the linguistic materials required for formal syntactic analysis go beyond the observation of lexical items, phonetic typologies, or even structural surface patterns. Thus, data can hardly be collected from already existing databases (if any); novel collection is required.

Data will be elicited from native speakers in fieldwork sessions, with as little help as possible from questionnaires and other mediations.

It should be stressed that this is a cognitivist, not a behaviorist, linguistic project: hence, speakers will not be recorded or observed in their everyday life, but only explicitly and consciously interviewed for the purposes of eliciting syntactic information.

We plan to publish the fieldwork data, duly transcribed, glossed and translated, in electronic format on the project website, to serve as documentation of languages that are otherwise not easily accessible.

Fieldwork will be carried out in the following linguistic domains:

  1. German(ic) language islands in the Alpine regions of North-East Italy (Dolomiti area and Friuli): Mòcheno/Fersentaler, Cimbrian, Sappadino/Plodarisch, Saurano/Sauris-Zahre, Timavese/Tischlbongerisch, South-Carinthian of the Canale Valley. This work builds on several previous projects: CimbroLang - The lexicon-morphosyntax interface in language obsolescence with particular focus on Cimbrian semi-speakers in Trentino and Veneto regions (2008-2011, PI E. Bidese, Provincia di Trento); Il cimbro come laboratorio di analisi per la variazione linguistica in sincronia e diacronia (2009-2012, PI A. Tomaselli, Fondazione CariVerona); Sauris/Zahre – Feldforschung con raccolta di dati linguistici ai fini di una descrizione attuale del Saurano (2017-2018, PIs E. Bidese, A. Tomaselli, H. Weiß, Regione Trentino-Alto Adige); VinKo, Varietà in contatto/Varieties in contact/Varietäten in Kontakt, Università di Trento, https://www.dipsco.unitn.it/vinko. South-Slavic varieties in contact with Romance (Friulian) in the Resia Valley and (of special interest to us) Slovene varieties in contact with Romance (Friulian) and German (South-Carinthian) in the Canale Valley will also be investigated.
  2. Arbëresh varieties of the Italian South, with special attention to outlying communities in Molise (Portocannone), Campania (Greci), Apulia (S. Marzano), Lucania (Barile, Ginestra); additional fieldwork will be carried out in the core Calabrian Arbëresh-speaking area, including the districts of Catanzaro and Cosenza, as well as in Sicily (Piana degli Albanesi). Comparison with mainland Albanian will be carried out with speakers of both the standard Tosk variety (Gjirokastër) and of Geg varieties (Shkodër). We will take advantage of the fieldwork in Albania to collect data from Aromanian varieties in contact with Albanian and with Greek (Fier, Diviakë, Libofshë, Këllez, all in South Albania).
  3. Greek varieties of the Italian South: Bovesia (Reggio Calabria) and Grecìa Salentina (Calimera and the area of Lecce). These will be compared at least with standard Modern Greek, one variety from the Ionian islands (geographically, and presumably also historically, closer to Southern Italy), Tsakonian (the most conservative dialectal enclave in Greece, not affected by koineization), two representatives of the Northern dialects (Thessaloniki, Lesvos), and Cypriot Greek. Of direct interest for comparison is also Asia Minor Greek, including Romeyka Pontic (Çaykara region, Turkey), as well as Cappadocian and Pharasiot, whose speakers are accessible through diaspora communities in mainland Greece.

Statistical corpus search will form the basis of work on Old and Middle English. Testing a given syntactic hypothesis for a mediaeval language requires a formidable amount of data collection, because the revealing evidence may come from complex structures that are relatively infrequent in actual production. For the history of English, powerful tools for syntactic analysis are available, namely the electronic corpora developed at the University of Pennsylvania and the University of York. These corpora contain both word-by-word morphological annotation and constituent structure bracketing, which makes it possible to perform automated searches sensitive not only to co-occurrence and linear precedence, but also to structural dependencies. The size of the corpora makes a statistical analysis of the results possible. The two corpora relevant for the present research are the YCOE (Taylor et al. 2003) and the PPCME2 (Kroch & Taylor 2000), covering the period 800-1500. When necessary for the purposes of investigating syntactico-semantic properties not considered in the original tagging, further annotation will be added to the original corpora.
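
To illustrate what a search sensitive to both structure and linear order might look like, the following is a minimal sketch in Python. It is not the query tool distributed with the corpora and is not taken from the project. The two bracketed trees are invented toy examples with English placeholder words, and the node labels (`IP-MAT`, `IP-SUB`, `NP-ACC`, `VBD`) are assumptions only loosely modelled on Penn-style annotation. The sketch counts, within clauses of a given type, how often an accusative NP daughter precedes or follows a finite verb daughter.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())       # embedded constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()

def subtrees(tree):
    """Yield the tree itself and every constituent it dominates."""
    yield tree
    for child in tree[1]:
        if isinstance(child, tuple):
            yield from subtrees(child)

def object_before_verb(clause):
    """True/False for OV/VO among immediate daughters; None if not applicable."""
    labels = [c[0] for c in clause[1] if isinstance(c, tuple)]
    if "NP-ACC" in labels and "VBD" in labels:
        return labels.index("NP-ACC") < labels.index("VBD")
    return None

def count_orders(trees, clause_label):
    """Count (OV, VO) clauses carrying the given label, at any depth."""
    ov = vo = 0
    for tree in trees:
        for sub in subtrees(tree):
            if sub[0] == clause_label:
                result = object_before_verb(sub)
                if result is True:
                    ov += 1
                elif result is False:
                    vo += 1
    return ov, vo

# Invented toy data: hypothetical labels, placeholder words.
TOY = [
    "(IP-MAT (NP-NOM he) (VBD saw) (NP-ACC her) "
    "(CP-ADV (C when) (IP-SUB (NP-NOM she) (NP-ACC it) (VBD took))))",
    "(IP-MAT (NP-NOM they) (VBD said) "
    "(CP-THT (C that) (IP-SUB (NP-NOM he) (VBD found) (NP-ACC them))))",
]
trees = [parse(t) for t in TOY]
print(count_orders(trees, "IP-SUB"))  # prints (1, 1)
```

The restriction to immediate daughters of the clause node is what makes the query structural rather than purely linear: an object inside an embedded clause is never counted against the verb of the matrix clause. Counts of this kind, taken over a whole corpus and broken down by period, are the sort of input a statistical analysis would start from.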
