Lecture
Text Mining for Cultural Heritage: Case Studies in the Natural History Domain
Speaker: Piroska Lendvai
Date: Tuesday, 2 June 2009
Time: 15:00-17:00
Location: "Mediterranean Studies" Seminar Room, FORTH, Heraklion, Crete
Host: Martin Doerr
Abstract:
Natural history offers an interesting mix of traditional and modern
ways of organizing data, information, and knowledge. Within the MITCH
project we develop knowledge enrichment methods for museum collection
data, enhancing information access for researchers in taxonomy and
biodiversity. Our material (metadata of collection objects, as well as
textual database records) is largely composed of natural language
text, which is generally noisier and more ambiguous than numeric data.
We present three case studies in text mining, drawing on supervised and
unsupervised machine learning methods: named entity recognition in
digitized field trip logbooks, automated discovery of metadata from
textual databases, and content mapping in scientific publications.
Bio:
After graduating from Pecs University (Hungary) in language and
literature studies, Piroska Lendvai obtained a PhD in 2004 from
Tilburg University (Netherlands), working on the topic of machine
learning techniques applied to natural language dialogues for the
extraction of pragmatic-semantic information from spoken user input.
She then joined the Dutch national IMIX project that aimed at
developing a spoken dialogue system for information extraction (IE)
and question answering (QA) in the medical domain.
At present she is a postdoctoral researcher in the MITCH project,
developing text mining methods in the cultural heritage domain. She
was co-chair and organiser of the workshop on Language Technology and
Resources for Cultural Heritage, Social Sciences, Humanities, and
Education in Athens in March 2009.