
Paul Shaw

Information and Computational Sciences
+44 (0)1382 568864

The James Hutton Institute
Dundee DD2 5DA
Scotland UK



Paul focuses on software development for plant genetic resources, genetics and plant breeding. He leads several projects where his research contributes towards making experimental data including plant passport, pedigree, phenotypic and genotypic data available to collaborators, research and breeding communities using a suite of database and visualization tools that his team develops. He is particularly interested in biological visualization and how data can be effectively presented, explored and accessed in logical, digestible chunks in order to gain maximum impact and insight. He is also interested in how biological entities, such as plant accessions in pedigrees, and samples in plant breeding and genetics experiments, can be visualized and modelled using graphs.

The main software his group develops and maintains includes the informatics platforms Germinate, for the storage of experimental data resulting from plant germplasm collections, and Helium, for the visualization of complex plant pedigrees. His group is also active in the development of innovative mobile applications for the efficient collection of experimental data, and of tools that help with the general informatics requirements of genotyping platforms, such as the barley 50K SNP platform, and with maintaining Hutton’s seed store and Underpinning Capacity collections.

He maintains regular interactions with the international plant genetic resources community through research interests, collaborations and involvement in international groups such as the International Network.

Current research interests

  • Plant pedigree visualization.
  • Development of the Germinate platform.
  • Development of bespoke applications to support the storage, manipulation and visualisation of biological data and new high throughput genotyping technologies.

The increasingly diverse range of data types used in modern plant genetics and breeding poses a problem, both in how data is stored and in how scientists can query and visualize it in scientifically meaningful ways. My primary research focus is on how we efficiently store and query large datasets, and on investigating how user interfaces can give researchers access to this data in ways that allow increased scientific value to be gained. I am particularly interested in how we can increase the overall quality of datasets through error identification and through the development of mobile-based electronic data capture and reference applications to aid field scientists. Reducing problems at the data capture stage helps ensure high data quality for downstream processing and analysis.

A main focus of my work is the development of complex online databases, implemented in MySQL, for the dissemination of data to both internal and external (academic and industrial) audiences, together with the supporting programming infrastructure, based on Perl and Java, that is required to build on these customized tools. More recently I have been examining the problems associated with the visualization of large plant pedigrees and addressing some of these through the development of visualization tools such as Helium.

The above video shows an example of visualization of a large barley pedigree in the Helium pedigree visualization application.

I have been actively involved in a number of international collaborations, including the Seeds of Discovery project in collaboration with CIMMYT in Mexico, where the tools that my team and I develop have been used to present online the CIMMYT maize and wheat genebank accessions that are part of this initiative. More recently I have started managing a new 3.5 year project with the Crop Trust in which we are using the database and visualization tools that I develop to support the presentation of data across a wide variety of crop wild relative (CWR) species. Germinate will be the underlying platform on which this data will be stored and presented, which gives Germinate increasing exposure within the plant genetic resources community.

Germinate 3

Germinate is a generic plant genetic resources database and offers facilities to store both standard collection information and passport data along with more advanced data types such as phenotypic, genotypic and field trial data. Germinate was originally implemented in MySQL and Perl but is moving with version 3 to MySQL and Google Web Toolkit (GWT). Germinate is not a standalone desktop application and requires MySQL, Apache Tomcat and GWT to run. The usual use case would be installation on a Linux based server.

Currently there are Germinate installations for barley, wheat, potato, Lolium, Festuca, maize and pea, although the generic nature of the system makes it suitable for storage of data from a wide variety of sources and species. We are currently developing Germinate to handle large-scale trials data and data from high-throughput genotyping experiments, as well as supporting the Crop Trust CWR Pre-Breeding project, where Germinate will be used to hold data across a wide diversity of species.

Recent advances have allowed us to further develop Germinate including the ability to restrict access to specific users or to only allow access to data based on acceptance of licence agreements. We are actively working to increase the functionality of Germinate and welcome comments and suggestions on potential collaboration, or how we can improve Germinate going forwards.

Other Databases

Germinate Pea
Pea database comprising accessions from the John Innes collection, one of the widest and most comprehensive sets of Pisum germplasm worldwide. Also includes Retrotransposon-Based Insertion Polymorphism (RBIP) marker data.

Germinate Grasses
Database holding information on Lolium and Festuca accessions as well as genotypic data based on the DArT platform.

Commonwealth Potato Collection Database
This potato database contains passport and disease resistance data across the accessions contained within the Commonwealth potato collection at The James Hutton Institute. This resource is based on the Germinate 2 platform.

The Arabidopsis Nucleolar Protein database (AtNoPDB) provides information on 217 Arabidopsis proteins in comparison to human and yeast proteins.

Wheat In-Situ Database
This resource was developed and implemented at SCRI to hold the data obtained from a joint BBSRC and Syngenta project for high-throughput wheat in situ hybridisation.

Plant snoRNA Database
This resource brings together information from three independent computer-assisted searches of the Arabidopsis genome for box C/D snoRNA genes and from studies of ncRNAs.

Areas of Expertise

  • Database design and implementation.
  • Web application and interface development.
  • Biological data visualization.
  • Large scale biological data handling and error checking.

Current research projects 

  • The Global Crop Diversity Trust/Norwegian Government ‘BOLD’ Biodiversity for Opportunities, Livelihoods and Development 2022-2032 (PI)
  • BBSRC International Partnering Award ‘BarleyEUNetwork’ BB/V018906/1 2021-2024 (Co-I)
  • EU Horizon 2020 ‘BreedingValue’ ID:101000747 2021-2025 (Co-I)
  • Templeton World Charity Foundation/Global Crop Diversity Trust ‘Safeguarding crop diversity for food security: Pre-breeding complemented with Innovative Finance’ ID TWCF0400 2019-2022 (PI)
  • INNOVATE UK ‘CherryBerry’ ID:104624 2019-2021 (Co-I)
  • Global Crop Diversity Trust/Norwegian Government ‘Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives’ GS17010 2017-2021 (PI)


  • Forster, B.P.; Franckowiak, J.D.; Lundqvist, U.; Thomas, W.T.B.; Leader, D.; Shaw, P.; Lyon, J.; Waugh, R. (2012) Mutant phenotyping and pre-breeding in barley. In: Shu, Q.Y., Forster, B.P. & Nakagawa, H. (eds.). Plant Mutation Breeding and Biotechnology. CAB International, Wallingford. Chapter 25, pp327-346.

  • Thomas, W.T.B.; Comadran, J.; Ramsay, L.; Shaw, P.; Marshall, D.F.; Newton, A.C.; O'Sullivan, D.; Cockram, J.; Mackay, I.; Bayles, R.; White, J.; Kearsey, M.J.; Luo, Z.; Wang, M.; Tapsell, C.; Harrap, D.; Werner, P.; Klose, S.; Bury, P.; Wroth, J.; Argillier, O.; Habgood, R.; Glew, M.; Bochard, A-M.; Gymer, P.; Vequaud, D.; Christerson, T.; Allvin, B.; Davies, N.; Broadbent, R.; Brosnan, J.; Bringhurst, T.; Booer, C.; Waugh, R. (2014) Association genetics of UK elite barley (AGOUEB). Final Project Report to HGCA, No 528, 125pp.
  • Ramsay, L.; Thomas, W.T.B.; Comadran, J.; Marshall, D.F.; Shaw, P.; Waugh, R. (2011) Genetic diversity in modern barley varieties. Annual Report of the Scottish Crop Research Institute for 2010, pp23-24.
  • Milne, I.; Shaw, P.; Milne, L.; Bayer, M.; Stephen, G.; Marshall, D.F. (2010) Visualising genetic diversity. Annual Report of the Scottish Crop Research Institute for 2009, pp21-22.


The James Hutton Institute is the result of the merger in April 2011 of MLURI and SCRI. This merger formed a new powerhouse for research into food, land use, and climate change.