Welcome to MULTPLATCOL!

What is MULTPLATCOL?

It is a corpus-based Multilingual Platform of Collocations Dictionaries, a project financially supported by the São Paulo Research Foundation (FAPESP, Process no. 2020/01783-2).

So far, MULTPLATCOL comprises dictionaries in five languages (Portuguese, English, Spanish, French and Chinese), which form different language pairs. In the near future, more languages will be added and more language pairs will be formed.

MULTPLATCOL USERS

The platform is intended to be customizable for different target audiences according to their needs. It is designed especially for foreign language learners, foreign language teachers, trainee and professional translators, materials developers, lexicographers and researchers, as well as anyone else interested in learning about collocations in greater depth.

How did the MULTPLATCOL Project come about?

The MULTPLATCOL Project is the result of more than 12 years of research carried out by Dr. Adriane Orenha-Ottaiano, from São Paulo State University (UNESP).

What are MULTPLATCOL’s main goals?

It aims to promote the learning and translation of collocations more effectively, so that dictionary users can develop their proficiency in the languages mentioned above and thus use them more naturally and effectively.

As this is the first multilingual platform of collocations dictionaries in these languages, especially one that focuses on the various types of collocations, we hope to meet the challenge of addressing dictionary users’ collocational needs so that they can achieve proficiency and native-like naturalness.

MULTPLATCOL’s Structure and Design

The MULTPLATCOL dictionaries aim to meet users’ needs regarding language encoding and, as such, are considered production dictionaries. Besides helping users produce more authentic texts, MULTPLATCOL also aims to develop users’ collocational competence, which is intrinsically connected with fluency. The wider the repertoire of collocations, the greater the fluency a learner can achieve.

Moreover, the platform is intended to have an easy-to-use layout that can be customized. Since foreign language learners, and dictionary users in general, also encounter challenges in using collocations in their native language, MULTPLATCOL is also designed to display monolingual dictionaries. Thus, it will serve as a monolingual, bilingual or multilingual dictionary (English, Portuguese, French, Spanish and Chinese), also taking into account that collocations are automatically activated for each language covered by the platform.

What are Collocations?

The literature on collocations shows two main, distinct approaches to operationalizing the identification and definition of collocations, and in our research project we define collocations guided by both.

Under a statistically oriented approach, we view collocations as frequent word combinations whose co-occurrence within a certain distance of each other is statistically higher than expected in comparison to any other words randomly combined in a specific language (Barfield & Gyllstad, 2009; Nesselhauf, 2005; Sinclair 1966, 1991; among others). However, as Teubert (2004: 188) noted, being statistically significant is not enough to identify a combination of words as a collocation: ‘They also have to be semantically relevant. They have to have a meaning of their own, a meaning that isn’t obvious from the meaning of the parts they are composed of’.
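To make the statistical view concrete, here is a minimal sketch in Python. It is only an illustration, not the measure used by MULTPLATCOL: it assumes pointwise mutual information (PMI) as the association score, a window of two tokens as the "certain distance", and a tiny invented corpus. The function name pmi_scores and its parameters are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(tokens, window=2, min_count=2):
    """Score ordered word pairs that co-occur within `window` tokens using PMI.

    Assumption: probabilities are simple relative frequencies, with pairs
    normalized by the number of pairs and words by the number of tokens.
    """
    word_counts = Counter(tokens)
    pair_counts = Counter()
    # Pair each token with the next `window` tokens to its right.
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            pair_counts[(left, right)] += 1

    total_words = len(tokens)
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue  # very rare pairs give unreliable scores
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        # PMI > 0: the pair occurs more often than chance would predict.
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


# Invented toy corpus, for illustration only.
corpus = (
    "it was bitterly cold today and it was bitterly cold yesterday "
    "but the soup was cold"
).split()
scores = pmi_scores(corpus)
print(scores[("bitterly", "cold")] > 0)
```

A positive score only flags a pair as a candidate. As the Teubert quotation points out, the combination must also be semantically relevant.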

For this reason, it is important to describe collocations under a phraseological approach, and so we define collocations as pervasive, recurrent, and conventionalized combinations consisting of a base and a collocate (Hausmann 1979, 1989), which are lexically and/or syntactically fixed to a certain degree (Pazos-Bertrán, Orenha-Ottaiano & Xiong, in press). The collocate can be said to take on a special meaning only in combination with the base (Alonso-Ramos, 1994; Corpas 1996; Hausmann, 1989; Heylen & Maxwell, 1994; Orenha-Ottaiano, 2020; Pamies, 2019; Penadés Martínez, 2017; Torner & Bernal, 2017).

Collocations may have a more or less restricted collocational range. This means that the more general a word (the base) is, the more senses it has and the greater the number of collocates it attracts for each sense (→ a broad collocational range). On the other hand, the more specific a word is, the fewer senses it has and the smaller the number of collocates it attracts (→ a narrow collocational range).

Collocations are language- and culture-specific combinations and, as such, the collocability of their elements may vary significantly from one language to another. Each language is thus made up of its own collocational networks.

For example, if a learner or a translator wants to express the idea that it is very, very cold, he or she could use more specific adverbs that convey more accurate information to the recipient, such as: It is bitterly/extremely/freezing cold (examples of adverbial collocations). Likewise, if you are a native speaker of English and hear the word mitigating, two words come to mind immediately: factor or circumstance. MULTPLATCOL will help you find the most suitable and frequent lexical items that co-occur with the word you are searching for!

What is the role of collocations in foreign language learning?

In the process of speaking, native speakers do not simply bring separate words together; they also use “prefabricated blocks”, as if each block were a single word. Hence, what appears to be spontaneous is actually stereotyped, fixed and repetitive speech, and if speakers do not have a vast repertoire of these stereotyped fixed units (collocations, for instance) at their disposal, their speech may not sound natural. Thus, the broader the repertoire of collocations, the better a learner will be able to speak or write in a foreign language. That is why we compiled a platform like MULTPLATCOL!

What is the role of collocations in translation?

With regard to the translation of collocations, translators must have a deep and precise knowledge of the collocational network of each language, so that they can produce more idiomatic and fluent texts in the target language. Again, we draw attention to the relevance of MULTPLATCOL!

MULTPLATCOL’s Methodology

MULTPLATCOL’s methodology relies on the combination of automatic methods to extract candidate collocations (Garcia et al., 2019a) with careful post-editing performed by lexicographers. The automatic approaches take advantage of NLP tools to annotate large corpora with lemmas, PoS-tags and dependency relations in five languages (English, French, Portuguese, Spanish and Chinese). Using these data, we apply statistical measures (Evert et al., 2017; Garcia et al., 2019b) and distributional semantics strategies to select the candidates (Garcia et al., 2019c) and retrieve corpus-based examples (Kilgarriff et al., 2008). We also rely on automatic definition extraction (Bond & Foster, 2013) so that collocations can be more effectively organized according to their specific senses.
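The candidate-extraction step can be pictured with a short Python sketch. It is a simplified illustration under stated assumptions, not the project's actual pipeline: the dependency triples are hand-written stand-ins for parser output, logDice is used here only as one example of a statistical measure, and the names TRIPLES and rank_candidates are hypothetical.

```python
import math
from collections import Counter

# Hypothetical, hand-written triples (head lemma, relation, dependent lemma),
# standing in for the output of a dependency parser over a large corpus.
TRIPLES = [
    ("make", "obj", "decision"),
    ("make", "obj", "decision"),
    ("make", "obj", "mistake"),
    ("take", "obj", "decision"),
    ("take", "obj", "photo"),
    ("take", "obj", "photo"),
    ("eat", "obj", "apple"),
]


def rank_candidates(triples, relation="obj", min_freq=2):
    """Rank (verb, noun) candidates for one dependency relation by logDice.

    Assumption: logDice = 14 + log2(2 * f(x, y) / (f(x) + f(y))), with all
    frequencies counted inside the chosen relation only.
    """
    pairs = [(head, dep) for head, rel, dep in triples if rel == relation]
    pair_freq = Counter(pairs)
    head_freq = Counter(head for head, _ in pairs)
    dep_freq = Counter(dep for _, dep in pairs)

    ranked = []
    for (head, dep), freq in pair_freq.items():
        if freq < min_freq:
            continue  # drop pairs too rare to be reliable candidates
        log_dice = 14 + math.log2(2 * freq / (head_freq[head] + dep_freq[dep]))
        ranked.append((head, dep, round(log_dice, 2)))
    # Highest-scoring candidates first.
    return sorted(ranked, key=lambda item: item[2], reverse=True)


for verb, noun, score in rank_candidates(TRIPLES):
    print(verb, noun, score)
```

The ranked list contains candidates only. In the methodology described above, such output is then post-edited by lexicographers before anything enters a dictionary.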

The taxonomy of MULTPLATCOL’s collocations

In order to help the target audience use collocations more precisely and productively, the dictionaries will deal with all types of collocations. They cover various syntactic structures of collocations that fit into the following taxonomy:

  • Verbal
  • Noun
  • Adjectival
  • Adverbial

For more detail, see the structure of each type of collocation, with examples, below:

Figure 1: Taxonomy of collocations and examples (Orenha-Ottaiano, forthcoming)

ACKNOWLEDGEMENTS

To the São Paulo Research Foundation (FAPESP), grant 2018/22943-8, for the Post-Doctoral Fellowship, which enabled us to conclude the compilation of MULTPLATCOL.

To the entire GBD and MULTPLATCOL teams, who helped make its creation possible!