PARTHICO - Description of the project
Description of the project (from the project proposal)
Abstract
A central question in linguistics is to explain the tension between the universal nature of the human language faculty and the observed diversity of its empirical instantiations. Various proposals try to solve this tension relying on the notion of parameters, i.e. binary options allowing for cross-linguistic structural variation (Chomsky 1981, 2005; Roberts 2007; Biberauer 2019). The implementation of parameter theories runs into two major problems: these theories usually resort to ungrammaticality judgments, a kind of evidence language acquirers do not have access to, and they leave unexplored the procedure behind parameter setting.
This project embraces these challenges by measuring the power of parametric models through an innovative method, i.e. the application of parameter theories to historical corpora, which qualify as an ideal testing ground, since they contain continuous texts which can be taken as a proxy for the primary linguistic data used by language acquirers. We test the list of nominal parameters and their setting procedure proposed in Crisma/Guardiano/Longobardi (2020) on the Old Italian corpus which will be finalized during the project. The procedure will be replicated on the syntactically annotated corpora of Early New High German, Historical Icelandic and Old and Middle English, which conform to the UPenn-style conventions and allow for automated searches of syntactic constituents, constructions and processes.
This project will offer an innovative contribution to the logical problem of language acquisition, the architecture of parameter systems and nominal syntax. Once validated on positive evidence and historical corpora, our parameter setting procedure will be able to bridge the gap between descriptive and explanatory adequacy. Also, by comparing Old Italian and historical Germanic with modern varieties, we will provide a principled characterization of the attested nominal structures in relation with their possible diachronic developments. In addition, we will highlight how the correlation and competition between structures can be traced back to the intricate network of dependencies among parameters. Furthermore, we will enlarge the list of nominal parameters by including a full characterization of the syntax of relative clauses. Last but not least, we will release a syntactically annotated corpus of Old Italian, free of charge and fully compatible with the UPenn-style treebank corpora of other historical languages.
State of the art
A central challenge in linguistics is to explain the tension between the universal nature of the human language faculty and the observed diversity of its empirical instantiations. Parametric models have been formulated precisely to solve this tension and to explain how language acquirers converge on the target grammar, notwithstanding alternative options (e.g. Lightfoot 1979, 2006; Chomsky 1981, 2005; Clark/Roberts 1993; Manzini/Wexler 1987; Kayne 2000; Baker 2001, 2008; Longobardi 2003, 2005, 2017; Roberts 2007, 2019; Longobardi/Guardiano 2009; Biberauer/Roberts 2015; Biberauer 2018, 2019; Cinque 2020).
In this respect, at least two major problems remain unsolved (Hornstein/Lightfoot 1981). The first concerns the nature of the data required to set parameters. The evidence used in the literature often consists of ungrammaticality judgments (i.e. negative evidence) or of patterns which are uncommon ('exotic' in Lightfoot's 1989 terms) or only possible in pragmatically infelicitous contexts (Crisma/Guardiano/Longobardi 2020, henceforth CGL20). This kind of evidence is obviously not available in the primary data accessed by language learners (Lasnik/Lidz 2017) and therefore cannot be encompassed in any realistic model of parameter setting. The second problem concerns the way in which the information relevant for setting parameters is extracted from the primary data. Various principles, e.g. Input Generalization, have been proposed in the literature (Biberauer 2018, 2019). Yet, it is still unclear which of the co-varying manifestations of a given parameter are actually used to set it, how much input needs to be processed to set each parameter, and how the various manifestations available in the primary data are combined to set different parameters.
A recent attempt to address the first issue is CGL20, where a parameter setting procedure based on positive evidence only is described. This model is based on the following tenets. (i) Parameters are abstract structures which are added to the mental grammar when the acquirer is exposed to the relevant triggering experience. In the absence of surface evidence, no structure is added (default value; see also Biberauer 2019). (ii) Parameters are set only on the basis of positive evidence, as language acquirers do not have access to negative evidence (Lasnik/Lidz 2017 and references cited). (iii) Parameters have two levels of deductive structure: (a) each parameter is associated with a cluster of co-varying manifestations (Taraldsen 1980; Chomsky 1981; Rizzi 1982; Kroch 1989; Kayne 2000, a.o.); (b) parameters form an intricate network of interdependencies, called implications, such that some parameter states are predictable, i.e. neutralized, from the state(s) of other parameters (Baker 2001; Guardiano/Longobardi 2017; Roberts 2019). (iv) Parameters are set on the basis of a core subset of manifestations, which qualify as p-expressions in Clark/Roberts's (1993) terms (the informal notion of 'Restricted List'); uncommon manifestations are not used to set parameters. This parameter setting procedure has begun to be tested on historical corpora (Crisma/Guardiano/Longobardi 2019a,b), to check whether it can be successfully used to extract information from an input consisting of continuous speech. This implementation has raised further theoretical and methodological issues concerning both the structure of human grammars and the procedure behind their acquisition, and addressing them requires an extension of the empirical basis.
To this end, we enlarge the language sample by including Old Italian (OI), Early New High German (ENHG), Historical Icelandic (HIce), and Old and Middle English (OE/ME). We will explore the internal structure of the nominal domain (henceforth, DP) using and expanding the parametric array described in CGL20. As for Italian, whereas the DP syntax of Modern Italian has been densely examined, no studies we are aware of have provided a complete and systematic analysis of the DP at the older stages of the language. Research has focused on isolated aspects of OI DPs, e.g. indefinite articles, possessives, quantifiers, relative clauses, the rise of definite determiners (see a.o. Bianchi 1999; Stark 2005, 2007; Giusti 2002; Giorgi/Giusti 2010; Renzi 1976, 2010; Dardano 2012; Poletto 2014; Poletto/Sanfelici 2019; Sanfelici/Poletto 2021). The state of the art is similar for the historical Germanic languages. An interesting aspect of OI is the fact that DP structural patterns are reported to follow from two distinct grammars: while certain patterns are consistent with the Modern Italian grammar, others are typically found in Germanic languages and not in Modern Italian. To explore this issue, dedicated quantitative analyses are required, which, with few exceptions (e.g. Stark 2005), have not been carried out so far. We will use OI as a case study to investigate the patterns of competition between two or more grammars (Kroch 1989, 1994).
Description of the research
The project explores the reliability of parameter setting models through their implementation on historical corpora, under the assumption that, in a sense, setting the parameters on historical corpora mimics the task faced by the language acquirer when confronted with the input.
We address the following three Research Questions (RQ):
RQ1. What does the implementation of parameter setting procedures on historical corpora reveal about the structure of human grammar?
RQ2. Can we identify formal criteria which predict how much input must be encountered to set each parameter?
RQ3. How does the parameter setting procedure work in case of conflicting evidence?
Answering these questions will contribute to bridging the gap between descriptive and explanatory adequacy of parametric models. Our Working Hypotheses (WH) are cast within the minimalist model of the language faculty proposed in CGL20, which assumes an underspecified universal grammar and a rich network of implications among parameters. The WH are summarized as follows.
WHa. Among parameters, those related to the morphological realization of features have a privileged status, being set with little input (Borer 1984).
WHb. When a parameter is set, the status of the parameters which depend on it becomes neutralized. The rich network of implications across parameters downsizes the burden of converging on the target grammar.
WHc. When a language attests mutually exclusive patterns, call it conflicting evidence, syntactic change is happening. This may also extend to patterns that are only superficially mutually exclusive, call it ambiguous evidence. The implicational structure of parameters constrains the typology of conflicting and ambiguous evidence available in a language.
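The neutralization mechanism assumed in WHb can be illustrated with a minimal sketch. The parameter names (P1 to P3), the implication table and the "0" marker for a neutralized state are illustrative assumptions made here, not the implicational formulas of CGL20.

```python
NEUTRALIZED = "0"  # assumed marker for a state predictable from other parameters

# Hypothetical implications: a parameter is a live choice only if every
# listed (other_parameter, required_value) condition holds.
IMPLICATIONS = {
    "P2": [("P1", "+")],
    "P3": [("P2", "+")],
}


def resolve(param, observed, implications, cache=None):
    """Return '+', '-' or NEUTRALIZED for `param`.

    `observed` maps a parameter to '+' when positive evidence was found;
    parameters without evidence default to '-'. The implication graph is
    assumed to contain no cycles.
    """
    if cache is None:
        cache = {}
    if param not in cache:
        conditions = implications.get(param, [])
        if all(resolve(other, observed, implications, cache) == value
               for other, value in conditions):
            cache[param] = observed.get(param, "-")
        else:
            cache[param] = NEUTRALIZED
    return cache[param]


def states(params, observed, implications):
    """Resolve every parameter in `params` and return a name -> state dict."""
    cache = {}
    return {p: resolve(p, observed, implications, cache) for p in params}


# With P1 unset, P2 and P3 are neutralized and need no evidence at all.
print(states(["P1", "P2", "P3"], {}, IMPLICATIONS))
# With P1 and P2 set, P3 becomes a live choice again.
print(states(["P1", "P2", "P3"], {"P1": "+", "P2": "+"}, IMPLICATIONS))
```

In this toy network, a single missing trigger for P1 removes two further decisions from the acquirer's task, which is the sense in which implications reduce the burden of convergence.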
METHODOLOGY
We adopt the parameter setting procedure proposed in CGL20 and test whether and how it can be successfully applied to the historical corpora available for studying diachronic syntax. This procedure consists of the following tools: (i) a list of about a hundred binary DP parameters; (ii) a list of formulas which define cross-parametric implications in this dataset; (iii) for each parameter, a list of surface manifestations generated by that parameter; (iv) a list of YES/NO questions associated with each manifestation, which are used to collect the relevant evidence required to set parameters: if there is positive evidence for the existence of that structure, the parameter is set (signaled by the symbol [+]), while if no positive evidence is encountered, the parameter is not set (signaled by the symbol [-]); one YES answer is sufficient to set a parameter's value unambiguously to [+], even when the other questions do not receive an answer; (v) the subset of questions that qualify as the Restricted List.
This procedure has been tested on a few dozen modern languages (Guardiano et al. 2020; Ceolin et al. 2020, 2021). This model is suitable to address our RQs because it provides the only available practical guide to systematically explore cross-linguistic variation within a compact module of the grammar working with positive evidence only. This renders the model highly testable and our WHs easily falsifiable.
In this project we test this model on the syntactically annotated corpora of OI, ENHG (Light 2011), HIce (IcePaHC, Wallenberg et al. 2011), and OE/ME (YCOE, Taylor et al. 2003; PPCME2, Kroch/Taylor 2000). With the term OI, we refer to Medieval Florentine as proposed in traditional philological literature (Renzi 2004; Salvi/Renzi 2010). We follow the spirit of the Grammatica dell'italiano antico in contrasting Medieval Florentine and Standard Modern Italian and considering them two stages of Italian (Renzi 2004; Salvi/Renzi 2010). The OI corpus is currently under construction and consists of 24 texts from the XIII to the XV century annotated for part of speech, 8 of which have also been syntactically annotated adopting the UPenn treebank conventions (Bird 2020, Università di Padova). One of the initial goals of the project (to be accomplished by the end of the first year) is completing the syntactic annotation of this corpus.
We will first use the OI corpus for a preliminary test of our three RQs, which will then be extended to the other corpora. We believe that OI constitutes the ideal testing ground for our RQs for two reasons. In OI, structural patterns found in Modern Italian grammar coexist with structural patterns typically found in Germanic languages. Hence, at first sight OI provides cases of (at least superficially) conflicting evidence. In addition, the syntax of the OI DP can be straightforwardly compared not only to Modern Italian but also to other modern Romance languages of Italy (Guardiano/Stavrou 2014; Guardiano et al. 2016; www.parametricomparison.unimore.it).
The procedure applied to the OI corpus will be replicated on the UPenn-style treebank corpora of historical Germanic languages, namely ENHG, HIce and OE/ME. As for the history of German, we chose ENHG precisely because at this stage the syntax of the DP seems to be subject to a major overhaul, resulting in a series of visible changes such as the collapse of inflection classes, the fixation of DP-internal word order, and the development of new determiners from former adjectives/pronouns (Ebert et al. 1993; Demske 2001; Fuß 2021; English seems to have undergone a similar fate). The extent to which these changes are related to one another remains to be established.
In what follows we illustrate our methodology. To extract the relevant data, we adopt two complementary strategies: (a) an automatic search on the corpora, which provides quantitative information, and (b) a qualitative hands-on examination, whenever needed. An automatic search on the corpora is pursued for those questions that can plausibly be converted into queries, that is to say, that ask about properties annotated in the corpus. In this case, the corpora will be interrogated using CorpusSearch (http://corpussearch.sourceforge.net/). In contrast, for those questions that cannot be converted into queries because they inquire about properties which either lack an annotation or cannot be extracted indirectly, we will analyze single texts manually.
For concreteness, we provide an example using the same parameter outlined in CGL20, i.e. GRAMMATICALIZED PERSON, which distinguishes languages that express Person distinctions on categories other than pronouns (e.g. Germanic, Romance) from languages which do not (e.g. Japanese). This parameter has many co-varying manifestations, each corresponding to a question. Here, we provide an example of how the first two can be answered in OI using the automatic search function. The two questions are: (1) Is there agreement in person between an argument and a verb? (2) Are there overt expletive pronouns in subject function?
The conversion of question (1) into the query language of CorpusSearch poses some challenges. It cannot be asked directly because in the OI corpus, as in the other Penn treebank corpora, person agreement on verbs is not annotated as a rule. Nevertheless, we can extract this information indirectly by searching for inflected forms of verbs which are very frequent, like avere 'to have' and dire 'to say', and checking whether they match their subject. As for subjects, since our definition file contains a list of forms for first, second and third singular/plural pronouns, we exploited these definitions. For this purpose, a coding query was designed, which consists of a list of queries associated with values (http://corpussearch.sourceforge.net/CS-manual/Coding.html). We ran it on the 8 currently parsed texts and we obtained 16169 tokens matching the search, which is a large number of tokens relative to the small size of the corpus. Interestingly, a positive answer to question (1) is provided already at line 10 of the coding output file, which suggests that the parameter can be set using a limited amount of text.
By contrast, question (2) can be more directly converted into queries; yet, the search does not produce sufficient results to give a YES answer to it. Expletive subject pronouns in the Penn annotation scheme come in two types: (a) those coindexed with a nominal expression or a clause; (b) those lacking an associate, as with weather verbs. In the former case, a special annotation is provided, co-indexing the expletive subject with its associate. This annotation applies in OI for the case of optional third person expletive subjects in impersonal constructions discussed in the literature (Salvi 2008). We ran the dedicated query and the search returned 0 tokens. With weather verbs, expletive subjects are indistinguishable from a referential subject. We searched for specific weather verbs, like 'to rain/snow/hail'. The output contained 11 tokens, of which 7 exhibited the predicate without a lexicalized subject and 4 a plural subject, as in Le fiammelle che piovono da la sua biltade 'the little flames that rain from her beauty' (Dante, Convivio 20). These data are not sufficient for a YES answer to question (2). Yet, as will be shown below, they provide interesting material concerning our RQs (in any case, the parameter GRAMMATICALIZED PERSON is set to [+] on the basis of the YES answer to question (1)).
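The setting logic applied in this case study can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name and the question labels are introduced here for convenience and simply paraphrase questions (1) and (2); they are not part of CGL20 or of the CorpusSearch output.

```python
def set_parameter(questions, answers):
    """Return '+' if any question has a YES answer, otherwise '-'.

    `answers` maps a question to True when positive evidence was found.
    Unanswered questions are simply absent: lack of evidence is never
    treated as negative evidence.
    """
    return "+" if any(answers.get(q, False) for q in questions) else "-"


# Illustrative labels paraphrasing questions (1) and (2) of the case study.
GRAMMATICALIZED_PERSON = [
    "person agreement between an argument and a verb",
    "overt expletive pronouns in subject function",
]

# Outcome of the case study: a YES for question (1), no YES for question (2).
answers = {"person agreement between an argument and a verb": True}

print(set_parameter(GRAMMATICALIZED_PERSON, answers))  # prints +
print(set_parameter(GRAMMATICALIZED_PERSON, {}))       # prints -
```

The second call shows the default case: with no positive evidence at all, the parameter stays at [-].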
OBJECTIVES
Our goal is to address the three RQs by creating a set of tools which constitute the outputs of the project and will be shared with the scientific community interested in parameter theories and diachronic syntax.
RQ1. What does the implementation of parameter setting procedures on historical corpora reveal about the structure of human grammar?
Historical corpora provide an ideal tool to test and validate the parameter setting procedure illustrated in CGL20, since they contain continuous texts, which can be taken as a proxy for the primary linguistic data. Based on the comparative investigation of the DP structure through quantitative annotations, statistical analyses and formal proposals, we will eventually refine the list of parameters and the network of their manifestations. Ultimate outcomes of working on historical corpora are (i) a formal definition of Restricted List and (ii) a formal definition of default value for a parameter.
RQ2. Can we identify formal criteria which predict how much input must be encountered to set each parameter?
A central aim in research on language acquisition is to explain the order of acquisition, i.e. why certain linguistic forms and structural patterns emerge before others. Researchers have distinguished between early, late and very late acquired phenomena (Tsimpli 2014), suggesting that the order of their acquisition results from the interplay of several factors. A similar observation is reached by preliminary work on the implementation of the parameter setting procedure on historical corpora (Crisma/Guardiano/Longobardi 2019a,b). From this research, it is clear that not all parameters have the same likelihood of being set in one text. This is also what our case study suggests, where question (1) of parameter GRAMMATICALIZED PERSON has a privileged status. More generally, manifestations involving morphological exponents, like agreement phenomena or gender/number inflection, seem to be more pervasive than manifestations pertaining to the interfaces, like referentiality. If so, our methodology has the potential to formally define and measure this pervasiveness.
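One way such pervasiveness could be quantified is sketched below. The two measures (position of the first positive token and share of positive tokens) and the toy codings are assumptions made for illustration; they are not results from the corpora and not a measure defined in CGL20.

```python
def tokens_until_first_yes(coded_tokens):
    """Return the 1-based position of the first positive token, or None."""
    for position, is_positive in enumerate(coded_tokens, start=1):
        if is_positive:
            return position
    return None


def positive_rate(coded_tokens):
    """Return the share of tokens coded as positive evidence."""
    if not coded_tokens:
        return 0.0
    return sum(1 for t in coded_tokens if t) / len(coded_tokens)


# Toy data only: True marks a token coded as positive evidence for a question.
toy_codings = {
    "question A": [False] * 9 + [True, False, True],
    "question B": [False] * 12,
}

for question, tokens in toy_codings.items():
    print(question, tokens_until_first_yes(tokens), round(positive_rate(tokens), 3))
```

Comparing these figures across questions and texts would give a first, rough ranking of manifestations by how quickly they become available in continuous text.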
RQ3. How does the parameter setting procedure work in case of conflicting evidence?
The project starts from the well-known observation in diachronic syntax that syntactic change is gradual and that there may be variation inside a single text. As originally formalized in Kroch (1989), mutually exclusive linguistic forms or structural patterns which are not functionally differentiated can coexist in a single language and a single text. These forms and patterns, labeled 'syntactic doublets' in Kroch's works, are considered the reflections of unstable competition between mutually exclusive grammatical options (Kroch 1994: 181) (see also Kroch 1989; Taylor 1990; Pintzuk 1991; Santorini 1993). To account for the existence of syntactic doublets, Kroch and colleagues proposed a grammar-in-competition model, which assumes that multiple grammars coexist in an individual: one member of the doublet is generated by one grammar and the other by the competing grammar. This proposal has very interesting consequences for language change as it can model the gradualness of the change as well as the co-existence of mutually exclusive structural patterns (conflicting evidence).
Within this approach, one of our goals is to define what counts as syntactic doublets. We adopt a strict operative definition of syntactic doublets as the structural patterns that provide evidence for both a positive and a negative answer to the same question of the same parameter. We make the working hypothesis, WHc, that when a system contains mutually exclusive structural patterns, i.e. syntactic doublets, syntactic change is happening. This may also extend to structural patterns that are only superficially mutually exclusive (ambiguous evidence).
For concreteness, we illustrate a case of syntactic doublets with the setting of the parameter WEAK PERSON in OI. This parameter distinguishes languages in which nominal arguments headed by kind names can occur without a phonetically realized determiner (e.g. English, German, Wolof) from languages that always use an overt article (e.g. Italian, Spanish, French, Basque). Consider the Italian and German translations of 'White elephants are extinct': It. Gli elefanti bianchi sono estinti, Germ. Weiße Elefanten sind ausgestorben, which manifest opposite values of the parameter. There is no article in German, while an article is required in Italian, which means that German assigns a positive value to the parameter WEAK PERSON, whereas Italian does not. In OI both structures are attested (Giorgi/Giusti 2010; Sanfelici, to appear). Descriptively, we may conclude that OI obeys two grammars: a modern one and a Germanic-like one. Interestingly, this observation accounts for a constellation of other phenomena, among which the order between the noun and the adjectives and the form of the relative pronoun in free relative clauses. As for the latter phenomenon, a correlation emerges: if a language allows for article-less kind names, the case realized on the relative pronoun in free relative clauses can be the one assigned within the relative clause, as in OI: Amate [da cui male aveste] 'Love those from whom you received damages' (Dante, Commedia, Purg. 13, 36) (Sanfelici to appear). This correlation seems to hold when looking at other languages, like Modern German and its historical stages (Bertollo 2014; Fuß 2021). Conversely, in those languages where kind names obligatorily co-occur with the definite article, the case on the relative pronoun is the one assigned from the matrix clause predicate, as in Modern Italian.
To what extent the correlation between case realization on the relative pronoun in free relative clauses and the WEAK PERSON parameter holds will be investigated in this project by testing the other clusters of properties connected to this parameter. In so doing, we will expand the list of parameters related to the DP proposed in Longobardi/Guardiano (2009) and subsequent works. In addition, we will compare the patterns exhibited in OI with those found in the historical Germanic languages to establish the realm and degree of the competition present in OI.
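The operative definition of syntactic doublets adopted above lends itself to a simple check over coded data. The sketch below assumes that each token has already been coded as evidence for a YES or a NO answer to a given question; the question labels and the toy observations are hypothetical and only mirror the OI pattern described for WEAK PERSON.

```python
from collections import defaultdict


def find_doublets(observations):
    """Return the (parameter, question) pairs with both YES and NO evidence.

    `observations` is an iterable of (parameter, question, answer) triples,
    where `answer` is the string 'YES' or 'NO'.
    """
    answers_seen = defaultdict(set)
    for parameter, question, answer in observations:
        answers_seen[(parameter, question)].add(answer)
    return sorted(
        key for key, answers in answers_seen.items()
        if {"YES", "NO"} <= answers
    )


# Hypothetical coding of three tokens from a single text.
toy_observations = [
    ("WEAK PERSON", "kind names without overt determiner", "YES"),
    ("WEAK PERSON", "kind names without overt determiner", "NO"),
    ("GRAMMATICALIZED PERSON", "person agreement on the verb", "YES"),
]

print(find_doublets(toy_observations))
```

Only the WEAK PERSON question is returned, since it is the only one for which the toy text provides evidence in both directions.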
Another interesting case is represented by ambiguous evidence, which potentially introduces instability into the system. We illustrate this point with OI possessives (Giorgi/Giusti 2010, a.o.). The same noun can be modified by a possessive which can in turn be preceded (1a) or not (1b) by a definite article. (1a) is grammatical also in Modern Italian, while (1b) is not.
(1) a. Lo re [...] prese la sua partita
       'The king took the his party' (Novellino 18, 169)
    b. [...] io difenderò mia partita
       'I will defend my party' (Novellino 81, 315)
In languages in which a determiner is required for a DP to be assigned a definite reading, prenominal possessives come in two types. In some languages, possessives themselves assign a definite reading to the DP without co-occurring with any determiner: this is one of the manifestations associated with a parameter called D-CHECKING POSSESSIVES in CGL20. In other languages, like Modern Italian, possessives do not have this property and indeed they systematically co-occur with determiners in definite DPs, il mio libro vs. *mio libro. This is one of the manifestations of another, distinct parameter, called ADJECTIVAL POSSESSIVES in CGL20. Universal Grammar does not impose any principled constraint that prevents both parameters from being set positively in one and the same language. Yet, there can be specific structural configurations, e.g. the prenominal position, in which a choice between the determiner and adjectival functions of the prenominal possessive is imposed on the speaker by third factor principles to avoid synonymy (Bolinger 1968; for similar principles see Aronoff 1976; Kiparsky 1982). Thus, the co-existence of both prenominal determiner and adjectival possessives, as in (1), is not expected in one and the same language. The two structural patterns create an unstable system which undergoes a change from Old to Modern Italian. This syntactic change is driven by third factor considerations.
For those phenomena that can be addressed with coding queries, we will investigate and measure whether the structural changes motivated by third factor considerations display the same characteristics as changes affecting syntactic doublets, e.g. the constant rate effect, reported in Kroch (1994 and subsequent work). In addition, the project will address whether and how the possible interactions between different manifestations can be encoded through implicational formulas. We will ask how many grammars in competition can be generated. In principle, we can have as many grammars as the questions we have. Yet, empirically, certain structural patterns are never attested, although predicted by a competing-grammar model (Biberauer/Holmberg/Roberts 2014). Put differently, the model needs to be constrained. To this end, we will formalize a model for the generation of grammars in competition, considering the role of the following factors: (i) principles of Inertia (Longobardi 2001; Keenan 2002); (ii) the different types of cross-parametric dependencies and implicational relations across different manifestations; (iii) frequency considerations (Yang 2002).
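The constant rate effect mentioned above is commonly stated as follows: when a change spreads through several contexts, the logistic curves describing it differ in their intercepts but share the same slope. The sketch below illustrates how such a comparison could be set up. It is a simplification under stated assumptions: it fits an ordinary least-squares line to logit-transformed proportions rather than a full logistic regression, and the proportions are synthetic values generated from two curves with the same slope, not corpus data.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def logit_slope(times, proportions):
    """Least-squares slope of logit(proportion) against time."""
    ys = [logit(p) for p in proportions]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


def logistic(t, intercept, slope):
    """Proportion of the innovative variant at time t."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))


# Synthetic proportions for two contexts: same slope, different intercepts.
periods = [0, 1, 2, 3, 4]
context_a = [logistic(t, -2.0, 0.8) for t in periods]
context_b = [logistic(t, -0.5, 0.8) for t in periods]

slope_a = logit_slope(periods, context_a)
slope_b = logit_slope(periods, context_b)
print(round(slope_a, 3), round(slope_b, 3))
```

With real frequency data, slopes estimated per context would be compared statistically; near-identical slopes would be compatible with a constant rate effect, while diverging slopes would point to independent changes.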
OUTPUTS
The project will offer an empirical and theoretical contribution to the logical problem of language acquisition, the status of parameter systems, and the syntax of the DP. The outputs will be relevant for theoretical linguistics, diachronic syntax and Romance linguistics. They will also have an impact on language acquisition theories. The major outputs can be summarized as follows.
(1) A syntactically annotated corpus of OI fully compatible with the treebank corpora of other historical languages, which will be made available in an open access repository. It will be searchable with very similar, when not identical, queries: this opens the possibility to investigate historical Romance and Germanic languages comparatively with the same replicable methodology.
(2) A refined list of DP-parameters and related questions/manifestations proposed in CGL20. We will add novel parameters concerning the internal structure of relative clauses (Bertollo/Cavallo 2012; Bertollo 2014; Sanfelici to appear and references therein), which will enable us to gain a better understanding of the relation between the left periphery and the lower portion of the DP.
(3) A principled definition of the Restricted List of CGL20, containing the subset of core manifestations/questions assumed to be used as p-expressions (Clark/Roberts 1993) to actually set the parameter.
(4) Novel descriptions of the DP-syntax in OI corroborated with quantitative analyses, thereby complementing and extending previous studies (e.g. Giorgi/Giusti 2010; Dardano 2012; Poletto 2014, among others).
(5) A principled characterization of the attested architectures of DPs in relation to their possible diachronic developments, obtained by comparing OI and historical Germanic with modern varieties. Our formal apparatus in fact puts us in the best possible position to uncover and measure correlations between observable structures.
(6) A parameter setting procedure that bridges the gap between descriptive and explanatory adequacy. By testing the procedure on the corpora of different historical languages, which exhibit different DP syntax, we will validate the methodology and hopefully distinguish between competing hypotheses. The outputs of our project will have an impact on acquisition studies as well.
(7) A parametric theory able to constrain the generative power of the competing-grammar model, obtained by investigating the typology of conflicting evidence, i.e. both real and superficial syntactic doublets.
DISSEMINATION
Dissemination and outreach will cover the whole duration of the project. The project results will be disseminated via presentations at international conferences/workshops (at least five) and papers to be submitted to top-level ISI and Scopus/WoS indexed Open Access journals. We plan a method paper describing the parsed corpus of OI, a paper discussing the methods to implement the parameter setting procedure on historical corpora, a comparative paper on relative clauses, a comparative paper on the DP structure in the languages investigated, and a final paper on the theory of parameters. Their impact will be measured in terms of standard citation indices such as the H index (applied to the PI and to research participants). We plan a final conference with major experts on parameters, where we will present our results. Finally, we plan two public outreach activities for the non-academic community to disclose our results.
SCIENTIFIC IMPACT
The project has a broad theoretical impact as it addresses one fundamental question in linguistics: to explain how speakers, when acquiring their native language, converge on a shared grammar notwithstanding the alternative possible options. The project addresses this issue through a novel method based on a systematic scan of historical corpora which allows for the extraction of syntactic signals relevant for setting parameters using positive evidence only. The investigation of historical corpora is a promising testing ground for parameter setting models because it allows one to mimic the task of language acquirers when confronted with primary data. This is highly innovative, as traditional works on parameter variation are based either on introspective judgments or on selected data collected from grammars or similar sources. With the two methodologies proposed here, i.e. the automatic search through queries on the syntactically annotated corpora and the qualitative hands-on examination of the texts, parameter values emerge only from the surface patterns available in the texts, without resorting to negative evidence. The qualitative hands-on parameter setting procedure tested in this project can be applied almost without modifications to corpora such as the CHILDES database (http://childes.talkbank.org), analyzing both child productions and the portions of child-directed speech, and therefore qualifies as a tangible contribution to language acquisition studies.
These issues are closely related to the wider problem of explaining language change. Combining traditional methods with new technologies, the project will lay the groundwork for a model of diachronic variation where fine-grained qualitative analyses are corroborated by quantitative information. To this end, the research activities are based on the creation, implementation and analysis of syntactically annotated corpora. Corpora have proven to be an extremely powerful tool to address fundamental questions in diachronic syntax because they allow one not only to investigate why and when linguistic change happens, but also to plot its trajectory.
The project will also impact on the linguistic and philological investigation of diachronic and synchronic variation across Italo-Romance because it will provide a first comprehensive and comparative study of the nominal domain in medieval and modern Italian varieties.
CULTURAL AND DIDACTIC IMPACT
Dissemination activities aimed at the non-academic community are envisaged. UNIVR will play the leading role in forging a link between the RUs and the wider public, by organizing events which engage non-specialist audiences in order to share the scientific results and their possible applications. In this respect, two main objectives for two distinct target groups will be pursued: (i) a cultural impact on a wide public; (ii) a didactic impact resulting from the upskilling of language (L1/L2) teachers.
As for (i), a public meeting lasting approximately two hours will be organized to engage with the non-academic audience in an informal setting (e.g. European Researchers' Night). On that occasion, it will be shown what the results drawn from the implementation of the parameter setting procedure on historical languages reveal about the structure of human languages and about language acquisition processes, highlighting the importance of intangible cultural heritage in the study of linguistic and cultural diversity. Moreover, it will be explained how new technologies can contribute to our understanding of language learning. We will design dedicated initiatives to make the scientific outputs accessible to people who are not familiar with linguistic research, such as demonstrations of how the research on historical corpora is carried out and interactive activities based on simplified queries to be performed directly by the audience.
As for (ii), a five-hour professional development course will be organized to raise L1/L2 teachers' awareness of innovative linguistic research methodologies and to disseminate the scientific results obtained. We will take advantage of the established collaborations of UNIVR with regional school authorities (e.g. Ufficio Scolastico Regionale del Veneto) and further extend these collaborations to other regional authorities.
The professional development course will follow the workshop model and will consist of three steps:
a) A one-hour preliminary meeting, where the parameter setting procedure will be presented to the teachers, focusing especially on those aspects which can have a didactic impact and can thus be implemented in school curricula. It will be shown how open-access historical corpora can be used in language teaching to make students aware of the regularities underlying language variation and change. Moreover, it will be exemplified how language comparison helps our understanding of the mechanisms which are at the basis of linguistic diversity.
b) A three-hour meeting where teachers will work in small groups to design innovative learning units based on a simplified query-approach to linguistic data. The teachers will elaborate flexible work-projects to be implemented in Secondary School classes. Learner-centered methods will be adopted: students will explore linguistic questions, deal with different tasks and be confronted with the data to come to a solution. On the basis of the target groups' needs, the teachers will design the internal articulation of the activity and set the expected learning outcomes, which may consist of enhancing metalinguistic abilities, reflecting on diachronic language change, comparing different languages to understand how a parameter can be differently set, developing corpus literacy, etc.
c) A one-hour debriefing meeting where the teachers, after having implemented their learning units in class, will reflect on their grammar teaching modes, share best practices, and discuss whether the expected learning outcomes have been met by their students.
DATA SHARING
Of particular relevance to the scientific community is the release, free of charge, of the syntactically annotated corpus of OI. This tool will be accessible through a dedicated website.
To effectively engage with the general public, the website created for the OI corpus will also host a public outreach section in which information about the project and third mission activities will be made available, including through videos and other promotional materials. In addition, the learning units elaborated by the teachers during the professional development course will be published online (with their permission) to further disseminate innovative teaching practices.