BUCC 2012
The Fifth Workshop on Building and Using Comparable Corpora

Workshop program

09:00 – 09:10 Opening
Oral Presentations 1: Multilinguality (Chair: Pierre Zweigenbaum)
09:10 – 09:30 Philipp Petrenz, Bonnie Webber: Robust Cross-Lingual Genre Classification through Comparable Corpora
09:30 – 09:50 Qian Yu, François Yvon, Aurélien Max: Revisiting sentence alignment algorithms for alignment visualization and evaluation
Invited Projects Session (Chair: Serge Sharoff)
09:50 – 10:10 Inguna Skadiņa: Analysis and Evaluation of Comparable Corpora for Under-Resourced Areas of Machine Translation (ACCURAT, http://www.accurat-project.eu)
10:10 – 10:30 Andrejs Vasiļjevs: LetsMT! – Platform to Drive Development and Application of Statistical Machine Translation (LetsMT!, http://www.letsmt.eu)
10:30 – 11:00 Coffee Break
Invited Projects Session (Contd.)
11:00 – 11:20 Núria Bel, Vassilis Papavasiliou, Prokopis Prokopidis, Antonio Toral, Victoria Arranz: Mining and Exploiting Domain-Specific Corpora in the PANACEA Platform (PANACEA, http://panacea-lr.eu)
11:20 – 11:40 Adam Kilgarriff, George Tambouratzis: The PRESEMT Project (PRESEMT, http://www.presemt.eu)
11:40 – 12:00 Béatrice Daille: Building Bilingual Terminologies from Comparable Corpora: The TTC TermSuite (TTC, http://www.ttc-project.eu)
12:00 – 12:30 Panel Discussion with Invited Speakers
12:30 – 14:00 Lunch Break
Oral Presentations 2: Building Comparable Corpora (Chair: Reinhard Rapp)
14:00 – 14:20 Aimée Lahaussois, Séverine Guillaume: A viewing and processing tool for the analysis of a comparable corpus of Kiranti mythology
14:20 – 14:40 Nancy Ide: MultiMASC: An Open Linguistic Infrastructure for Language Research
Booster Session for Posters (Chair: Marko Tadić)
14:40 – 14:45 Elena Irimia: Experimenting with Extracting Lexical Dictionaries from Comparable Corpora for: English-Romanian language pair
14:45 – 14:50 Iustina Ilisei, Diana Inkpen, Gloria Corpas, Ruslan Mitkov: Romanian Translational Corpora: Building Comparable Corpora for Translation Studies
14:50 – 14:55 Angelina Ivanova: Evaluation of a Bilingual Dictionary Extracted from Wikipedia
14:55 – 15:00 Quoc Hung-Ngo, Werner Winiwarter: A Visualizing Annotation Tool for Semi-Automatically Building a Bilingual Corpus
15:00 – 15:05 Lene Offersgaard, Dorte Haltrup Hansen: SMT systems for less-resourced languages based on domain-specific data
15:05 – 15:10 Magdalena Plamada, Martin Volk: Towards a Wikipedia-extracted Alpine Corpus
15:10 – 15:15 Sanja Štajner, Ruslan Mitkov: Using Comparable Corpora to Track Diachronic and Synchronic Changes in Lexical Density and Lexical Richness
15:15 – 15:20 Dan Ştefănescu: Mining for Term Translations in Comparable Corpora
15:20 – 15:25 George Tambouratzis, Michalis Troullinos, Sokratis Sofianopoulos, Marina Vassiliou: Accurate phrase alignment in a bilingual corpus for EBMT systems
15:25 – 15:30 Kateřina Veselovská, Nguy Giang Linh, Michal Novák: Using Czech-English Parallel Corpora in Automatic Identification of It
15:30 – 15:35 Manuela Yapomo, Gloria Corpas, Ruslan Mitkov: CLIR- and Ontology-Based Approach for Bilingual Extraction of Comparable Documents
15:35 – 16:30 Poster Session and Coffee Break (coffee from 16:00 – 16:30)
Oral Presentations 3: Lexicon Extraction and Corpus Analysis (Chair: Andrejs Vasiļjevs)
16:30 – 16:50 Amir Hazem, Emmanuel Morin: ICA for Bilingual Lexicon Extraction from Comparable Corpora
16:50 – 17:10 Hiroyuki Kaji, Takashi Tsunakawa, Yoshihiro Komatsubara: Improving Compositional Translation with Comparable Corpora
17:10 – 17:30 Nikola Ljubešić, Špela Vintar, Darja Fišer: Multi-word term extraction from comparable corpora by combining contextual and constituent clues
17:30 – 17:50 Robert Remus, Mathias Bank: Textual Characteristics of Different-sized Corpora
17:50 – 18:00 Wrap-up discussion and end of the workshop

 
