11th International Conference on Terminology and Knowledge Engineering
Ontology, Terminology & Text Mining
19-21 Jun 2014
DIN Deutsches Institut für Normung e. V.
Berlin, Germany
SPECIALIZED TEXT MINING
THROUGH SPECIFIC KEYWORDS
Plested, María Cecilia;
Pulgarín, Maira Alejandra;
Díaz, Adriana Lucía
Research Group for Terminology and Translation (GITT), University of Antioquia,
Colombian Terminology Network, Calasanz’ School, Medellin
COLOMBIA
BACKGROUND
Colombia
Ecuador
Perú
Chile
Colombia is
- a MONOLINGUAL country in relation to Spanish
- a MULTILINGUAL country in relation to the indigenous languages
OBJECTIVE
To present the structure of a methodological procedure for
specialized text mining through specific keywords, applicable
to different LSP text typologies in specialized areas. In the
cases considered here, experts rarely use a foreign language.
Many researchers are unaware of how to use appropriate search
engines to select the right documents and so identify the
specialized texts they need in the foreign language.
In that case, the benefits of digital interactive text mining
can be greatly reduced.
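As a rough illustration of selecting documents through specific keywords, the sketch below ranks a small set of texts by how many distinct keywords each one contains. It is only a minimal sketch under stated assumptions: the function names `keyword_hits` and `rank_documents`, the sample texts and the keyword list are hypothetical, and the slides do not describe any particular tool or scoring rule.

```python
import re


def keyword_hits(text, keywords):
    """Return the keywords that occur in text as whole words or phrases."""
    lowered = text.lower()
    return {
        kw
        for kw in keywords
        # Lookarounds keep "dna" from matching inside a longer word.
        if re.search(r"(?<!\w)" + re.escape(kw.lower()) + r"(?!\w)", lowered)
    }


def rank_documents(documents, keywords):
    """Rank {title: text} entries by the number of distinct keywords found.

    Documents with no keyword at all are left out of the result.
    """
    scored = []
    for title, text in documents.items():
        hits = keyword_hits(text, keywords)
        if hits:
            scored.append((title, sorted(hits)))
    # Most keywords first; ties are broken alphabetically by title.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored


# Hypothetical sample data, not taken from the projects in the slides.
documents = {
    "a": "Reference DNA is compared with questioned DNA.",
    "b": "The circus arrived in town.",
    "c": "Comparative genetic analysis of reference DNA in criminal proceedings.",
}
keywords = ["reference DNA", "genetic analysis", "criminal proceedings"]

for title, hits in rank_documents(documents, keywords):
    print(title, hits)
```

Counting distinct keywords rather than raw occurrences is a deliberate simplification: it favors texts that cover more of the concept system over texts that merely repeat one term.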
INTRODUCTION
According to research carried out on
cognitive autonomy and reading literacy
(Pulgarín, Plested, 2012), readers in
specialized fields (most of them
postgraduate students) considered their
command of a foreign language to be weak.
That means they do not have the competence
to determine keywords derived from a concept
system in order to build the textual relationships needed to
understand a text in a foreign language or
Reading Strategies in a Foreign Language. Pulgarín, Plested, 2012
PROBLEM
For most researchers in many specialized fields,
getting a specific BIBLIOGRAPHY in a foreign language,
even if they can read that foreign language, is like…
THEORETICAL SUPPORT
The theoretical starting point of our research is the WIKO Model.
It is understood as an organizational focus of expertise, made up
of conceptual units that coherently represent the subject field or
discipline behind every spoken or written process.
Contextualized terminology in a foreign language must be seen as
an additional tool in learning or using that language. This
practice improves the basis for acquisition and gives users the
possibility to elaborate and acquire new knowledge.
In addition, one of the most important aspects is to
THEORETICAL SUPPORT
Knowledge as center, Plested, 1999, 2000
THEORETICAL SUPPORT
Methodology for Terminology. Díaz, 2007
METHODOLOGY
Preparation of the terminological analysis
- the work team: terminologists + subject field experts
- determination of responsibilities
Multilingual work
- corpus
- determination of the keywords
- pilot evaluation
Validation workshop
- inside the work team
- with the community of subject field experts
Results
- training of the experts in managing this specific methodology
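The "corpus" and "determination of the keywords" steps can be illustrated with a simple frequency comparison. The slides do not say how keywords were actually determined, and in the described methodology that decision rests with terminologists and subject field experts. The sketch below is therefore only an assumption: it proposes candidate keywords by comparing how often a word occurs in a specialized corpus against a general reference corpus. The name `candidate_keywords`, its parameters and the sample sentences are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, keeping accented letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def candidate_keywords(specialized_docs, reference_docs, top_n=5, min_count=2):
    """Propose candidate keywords for expert review.

    A word scores high when its relative frequency in the specialized
    corpus is much larger than in the reference corpus.
    """
    spec = Counter(t for doc in specialized_docs for t in tokenize(doc))
    ref = Counter(t for doc in reference_docs for t in tokenize(doc))
    spec_total = sum(spec.values())
    ref_total = sum(ref.values())

    scores = {}
    for term, count in spec.items():
        if count < min_count:
            continue  # ignore words seen too rarely to judge
        spec_rate = count / spec_total
        # Add-one smoothing so words absent from the reference corpus
        # do not cause a division by zero.
        ref_rate = (ref[term] + 1) / (ref_total + 1)
        scores[term] = spec_rate / ref_rate
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical miniature corpora for illustration only.
specialized = [
    "the dna sample from the scene",
    "the reference dna of the suspect",
]
reference = [
    "the weather of the day",
    "the train from the city",
]

print(candidate_keywords(specialized, reference))
```

The output is a list of candidates only. In line with the validation workshop stage, such a list would still need to be reviewed inside the work team and with the community of subject field experts.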
SPECIALIZED TEXT MINING THROUGH SPECIFIC KEYWORDS
MULTILINGUAL GLOSSARIES
TEXT MINING STRATEGIES
PROJECTS
Diachronic analysis of the concepts "definition, concept, terminological analysis and terminological tracking". Towards conceptual precision in Colombia
Terminological Thesaurus in Translation and Interpretation Terminology 'TETIT'
Methodological Conceptualization in Clinical Forensics: analysis of inconsistencies and ambiguities (Forensics project)
Social Circus of the Cirque du Soleil, Montréal, Canada
Methodological Conceptualization in Clinical Forensics: analysis
of inconsistencies and ambiguities (Forensics project)
In criminal proceedings, obtaining DNA usually involves at least
two settings and two distinct moments within the process. On the
one hand, the aim is to obtain the questioned DNA (the evidence)
from the crime scene or from the victim's body and, on the other,
to obtain reference DNA from the persons involved in the
proceedings, against which the comparative genetic analysis is
carried out.
AREAS
- Odontology
- Clinical forensics
- Photography
- RUC (Registro Único de Cadáveres, single registry of cadavers)
- Psychiatry and psychology
- CRRV (Centro de Referencia sobre la Violencia, reference center on violence)
- Ballistics
- Forensic sciences
- Entomology
- Forensic biology
- Topography and drawing
- Chemistry
- Graphology and questioned document examination
- Pathology
- Histopathology
- Applied physics
CLINICAL FORENSICS
- Drug addiction
- Compensated person (Persona compensada)
- Intoxication
- State of health of persons deprived of liberty
- House arrest
- Alcoholism
- Sexual violence
- Non-fatal injuries from external causes
- Firearm
- Blunt-force blow
- Mistreatment
- Abuse
Social Circus of the Cirque du Soleil, Montréal, Canada
Screenshot from TBAC’s glossary (Source: Díaz, 2013)
TERMINOLOGICAL PROCEDURE
Achieved results
Methodology for Terminology. Díaz, 2007
CONCLUSIONS
• Each terminological analysis enabled effective
text mining and identified the specific keywords
needed to build the glossaries.
• The application of the COLTERM methodology for
terminological work, combined with reading
strategies, allowed better use of specialized
keywords by experts, students or researchers
whose knowledge of a foreign language is
weak.
• SPECIALIZED TEXT MINING THROUGH SPECIFIC KEYWORDS
gave a harmonized and effective
CONCLUSIONS
• The production of specialized glossaries and
lexicons favored the creation of
interdisciplinary awareness and of a scientific
community.
• All the processes in the projects were validated
through expert workshops and guided
activities for autonomous learning.
• Specialized text mining through specific
keywords allowed experts in different subject
fields to determine a set of informational
contents that gives them a better
contextualized conceptual understanding of
the LSP documentation of their disciplines.
¡MUCHAS GRACIAS!
MERCI BEAUCOUP!
THANK YOU VERY MUCH!
WIR BEDANKEN UNS FÜR IHRE
AUFMERKSAMKEIT
Maria Cecilia Plested
Maira Alejandra Pulgarín
Adriana Lucia Díaz