

Paper ID: 1582

GEDDM: Comparisons of OGSA-DAI and GridFTP for access to and conversion of remote unstructured data in legal data mining.
Karen Loughran, Mark Prentice, Paul Donachy, Terry Harmer, Ron H. Perrott, Sarah Bearder, Jens Rasch

Appeared in: Proceedings of the UK e-Science All Hands Conference 2005 (website: http://www.allhands.org.uk/2005/)
Page Numbers:
Publisher: Engineering and Physical Sciences Research Council
Year: 2005
ISBN/ISSN: 1-904425-53-4
Contributing Organisation(s):
Field of Science: e-Science

URL: http://www.allhands.org.uk/2005/proceedings/papers/311.pdf

Abstract: Managing unstructured data is a problem that has been around for as long as people have been using computers to process information electronically. As demand for data collection increases, so does the number of formats and structures in which data is stored, presenting inherent problems for data mining applications. Additionally, the sheer volume of data presents challenges for access and conversion in a timely manner. To compound this problem further, the size of datasets will increase exponentially in future. There is therefore a need to access and convert large quantities of data from a variety of formats in a common, parallel and structured manner. GEDDM is a collaborative industrial e-Science project in conjunction with BESC and industrial partner Datactics Ltd. A Common Semantic Model (CSM) is defined to assist with the representation and conversion of data from various sources. This model facilitates the conversion of data residing in a range of formats into a common format for subsequent data mining. The project exposes CSM conversion capabilities via a suite of Grid Services called Data Conversion Services (DCS). This paper presents two implementations of the DCS, one under OGSA-DAI and another under GridFTP. The implementations and their results are discussed and evaluated, and conclusions are presented.

Keywords: e-Science, AHM 2005



Last Updated: 22 Jun 12 11:02