Although Portuguese is one of the main world languages and researchers have been working on Portuguese electronic text collections for decades (e.g. Kelly, 1970; Biderman, 1978; Bacelar do Nascimento et al., 1984; see Berber Sardinha, 2005), this is the first volume in English that encapsulates the exciting and cutting-edge corpus linguistic work being done with Portuguese language corpora on different continents. The book includes chapters by leading corpus linguists dealing with Portuguese corpora across the world, and their contributions explore various methods and how they are applicable to a wide range of language issues.
The book is divided into six sections, each covering a key issue in Corpus Linguistics: lexis and grammar, lexicography, language teaching and terminology, translation, corpus building and sharing, and parsing and annotation. Together these sections present the reader with a broad picture of the field.
List of contributors
Foreword, Mike Scott
Acknowledgments
Introduction, Tony Berber Sardinha and Telma de Lurdes São Bento Ferreira

Section 1: Lexis and grammar
1. Looking at collocations in Brazilian Portuguese through the Brazilian Corpus, Tony Berber Sardinha
2. Lexical bundles in Brazilian Portuguese, Tony Berber Sardinha, Rosana de Barros Silva e Teixeira and Telma de Lurdes São Bento Ferreira
3. Changing ‘faces’: A case study of complex prepositions in Brazilian Portuguese, Tania Maria Granja Shepherd

Section 2: Lexicography
4. The Corpus do Português and the Frequency Dictionary of Portuguese, Mark Davies
5. PtTenTen: A corpus for Portuguese lexicography, Adam Kilgarriff, Miloš Jakubíček, Jan Pomikálek, Tony Berber Sardinha and Pete Whitelock

Section 3: Language teaching and terminology
6. Idiomaticity in a course book for Brazilian Portuguese as a foreign language, Telma de Lurdes São Bento Ferreira
7. Retrieving (onco)mastology terms in Portuguese corpora, Rosana de Barros Silva e Teixeira

Section 4: Translation
8. Understanding Portuguese translations with the help of corpora, Ana Frankenberg-Garcia
9. The Per-Fide Corpus: A new resource for corpus-based terminology, contrastive linguistics and translation studies, José João Almeida, Sílvia Araújo, Nuno Carvalho, Idalete Dias, Ana Oliveira, André Santos and Alberto Simões
10. The CoMET Project: Corpora for teaching and translation, Stella E. O. Tagnin

Section 5: Corpus building and sharing
11. Corpora at Linguateca: Vision and roads taken, Diana Santos
12. The Reference Corpus of Contemporary Portuguese and related resources, Maria Fernanda Bacelar do Nascimento, Amália Mendes, Sandra Antunes and Luísa Pereira
13. C-ORAL-BRASIL: Description, methodology and theoretical framework, Tommaso Raso and Heliana Mello

Section 6: Parsing and annotation
14. PALAVRAS: A Constraint Grammar-based parsing system for Portuguese, Eckhard Bick
15. New corpora for ‘new’ challenges in Portuguese processing, Sandra Maria Aluísio, Thiago Alexandre Salgueiro Pardo and Magali Sanches Duran

Index
Tony Berber Sardinha is Associate Professor, Department of Linguistics and Graduate Program in Applied Linguistics, Catholic University of São Paulo, Brazil.
Telma de Lurdes São Bento Ferreira is an ESOL teacher and translation coordinator, Lexikos Cursos e Traduções Ltda, Brazil.
Reviews for Working with Portuguese Corpora
This book is chock-full of excellent papers, many of them by world-class corpus linguists. It should be on the reading list of anyone who has the slightest interest in corpus-linguistic perspectives on language. -- Michael Hoey, Baines Professor of English Language and Pro-Vice Chancellor, University of Liverpool, UK

Working with Portuguese Corpora contains chapters that are very accessible to anyone interested in using corpora, as well as texts that are better understood by computational and/or corpus linguists. This blend makes the book accessible to a wide audience and valuable to linguists who utilize Portuguese language corpora for any purpose, from annotating corpora to developing language teaching materials. -- Modern Language Journal

Working with Portuguese Corpora is a rich collection of research looking at Portuguese. That in itself is exciting - to have a major volume on a non-English language. But the editors did not stop there. Tony Berber Sardinha and Telma de Lurdes São Bento Ferreira have assembled an exciting group of scholars who apply various corpus approaches to language analysis from a lexical and grammatical level, to using information to explore pedagogical implications and applications for translation, as well as addressing issues related to annotating and parsing Portuguese corpora. This well-rounded volume is a welcome addition to research on Portuguese. -- Randi Reppen, Professor of Applied Linguistics, Northern Arizona University, USA

This impressive collection brings together the leading scholars in Portuguese corpus linguistics and makes some cutting-edge research that has previously been only discussed in Portuguese language publications accessible to a wider audience. I would recommend the volume to corpus researchers, Romance linguists, NLP researchers, and graduate students of corpus and applied linguistics. Readers will appreciate the detailed accounts of available Portuguese corpus resources and their practical applications in lexicography, phraseology, translation studies, terminology extraction, and language teaching. -- Ute Roemer, Assistant Professor of Applied Linguistics and ESL, Georgia State University, USA