Molecular Science on the Grid

Peter Murray-Rust, Unilever Centre, University of Cambridge

JISC Consultation Workshop: Building Collaborative eResearch Environments, NESC, 2004-02-23

Power corrupts; PowerPoint corrupts absolutely (Tufte). Material is therefore in XML.

This presentation will be largely interactive using XML tools and remote access. Main site: http://wwmm.ch.cam.ac.uk/

The problem

Remote collaboration (let alone eCollaboration) is not common in chemistry. The major problems are cultural. Chemists may occasionally share central equipment, but they compete on most issues.

Chemistry is conservative. The eRevolution has yet to impact software and information providers.

For example, OpenSource, OpenData and OpenAccess are largely unknown to, and unappreciated by, senior chemists.

IPR issues stultify and frustrate development - most chemical data is owned and sold by unimaginative companies and quasi-companies that wish to protect untenable restrictive practices.

There is no ontology. There are no standards. The companies preserve noninteroperability as a way of protecting diminishing market share rather than looking at what the rest of the world is doing...

Rays of hope

The biosciences need chemistry and are increasingly frustrated with this - they are bypassing traditional paths and finding their own solutions:

The Internet revolution has shown the future. Our undergraduates use Google for their literature searches. They normally fail because of IPR. BUT Open Access should revolutionise this.

W3C technology and protocols are unstoppable.

Chemical Markup Language

Therefore we (PMR and Henry Rzepa) developed CML (the very first XML language). CML feeds off all the W3C initiatives (XML, XSLT, DOM, Schema, XPath, RSS, RDF, OWL, etc.). This is an enormously powerful driver.

A major driver for chemical eScience is therefore CML technology (technology push).

http://www.xml-cml.org for Chemical Markup Language
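
Because CML is ordinary XML, generic XML tooling can read it without chemistry-specific software. The following is a minimal sketch in Python, not part of the CML toolkit itself. It assumes a small hand-written water molecule using the CML schema namespace and the common molecule/atomArray/bondArray layout; the names WATER_CML, element_counts and bonded_pairs are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the CML schema namespace.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hand-written illustrative fragment (not taken from any real data set).
WATER_CML = """\
<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>
"""


def element_counts(cml_text):
    """Count atoms per element symbol using an XPath-style query."""
    root = ET.fromstring(cml_text)
    counts = {}
    for atom in root.findall(".//cml:atom", CML_NS):
        symbol = atom.get("elementType")
        counts[symbol] = counts.get(symbol, 0) + 1
    return counts


def bonded_pairs(cml_text):
    """Return the pair of atom ids referenced by each bond."""
    root = ET.fromstring(cml_text)
    return [
        tuple(bond.get("atomRefs2").split())
        for bond in root.findall(".//cml:bond", CML_NS)
    ]


if __name__ == "__main__":
    print(element_counts(WATER_CML))
    print(bonded_pairs(WATER_CML))
```

The same queries could equally be written in XSLT or against a DOM; the point is only that standard W3C-style tools apply directly to chemical data once it is in XML.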

Selling eChemistry/CML

CML does not sell well within mainstream academic chemistry!! Our support comes from elsewhere.

Marketing Strategy

Develop mainstream W3C implementations

Talk passionately

Make everything Open

Use the power of the web for marketing

Collaborate with early adopters

Create web-based demonstrators

Distribute toolkits

Create evolutionary collaborative environments.

Create new "business models" - information barter. Trade services for Open data rather than try to sell data for money.

Have faith and patience...

CML Toolkit

CML has a complete toolkit created mainly by volunteers. It includes:

Everything is OpenSource, OpenData and OpenAccess.

Gridification supported by DTI/Cambridge eScience. Includes: