What we do
Three complementary elements of textual big data research:
- open source infrastructure
- a scientific programme
- technology transfer
The GATE infrastructure is the most wide-ranging and widely used system of its
type in the world, with millions of users and hundreds of millions of euros of
economic impact. It includes:
- an IDE, GATE Developer: an
integrated development environment for language processing components
bundled with a very widely used Information
Extraction system and a comprehensive set of
other plugins
- a web app, GATE Teamware: a collaborative
annotation environment for factory-style semantic annotation projects, built
around a workflow engine and a heavily-optimised backend service
infrastructure
- a cloud computing solution for hosted large-scale text processing
(GATE Cloud.net)
- GATE Mímir (Multi-paradigm
Information Management Index and Repository): a massively scalable
multi-paradigm index built on Ontotext's
semantic repository family, GATE's
annotation structures database, and full-text indexing from
MG4J
- a framework, GATE Embedded:
an object library optimised for inclusion in diverse applications giving
access to all the services used by GATE Developer and more
- an architecture: a high-level organisational picture of how language
processing software is composed
- a process for the creation of robust and maintainable services
- a wiki/CMS (GATE Wiki.sf.net), mainly to host
our own websites and as a testbed for some of our experiments