This is a dataset of documented potato varieties and breeding lines.
The fields available for each row vary greatly, depending on the sources the row was drawn from. I've included a "number_of_sources" field that might help to assess the popularity or relevance of a certain strain.
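As a rough sketch of how "number_of_sources" could be used, the snippet below keeps only the rows backed by a minimum number of sources. The function name `well_documented`, the threshold of 2 and the sample rows are assumptions made for illustration. Values read from the CSV arrive as strings, so the count is converted defensively.

```python
def well_documented(rows, min_sources=2):
    """Keep rows whose number_of_sources is at least min_sources.

    `rows` is any iterable of dicts, e.g. from csv.DictReader or json.load.
    Missing or non-numeric counts are treated as 0 (an assumption).
    """
    kept = []
    for row in rows:
        raw = row.get("number_of_sources") or 0
        try:
            count = int(raw)
        except (TypeError, ValueError):
            count = 0
        if count >= min_sources:
            kept.append(row)
    return kept


# Made-up rows, only to show the call; they are not taken from the dataset.
sample = [
    {"name": "Example A", "number_of_sources": "5"},
    {"name": "Example B", "number_of_sources": "1"},
    {"name": "Example C", "number_of_sources": ""},
]
frequent = well_documented(sample, min_sources=2)
```

The threshold is a judgment call; a higher value trades coverage for rows that more sources agree exist.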
Source
This dataset was scraped from the European Cultivated Potato Database, published by Science and Advice for Scottish Agriculture.
Notes
The JSON file includes a list of image URLs in the 'images' field. Other than that, the CSV and JSON versions are identical with regard to the fields included.
Some fields might have more than one value, because different sources specify different values. In these cases, the different values are separated by "/".
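A minimal sketch of how such a multi-valued field could be split back into its individual values. The helper name `split_values` is hypothetical, and the sketch assumes that "/" never occurs inside a single legitimate value, which may not hold for every field.

```python
def split_values(value, sep="/"):
    """Split a multi-source field into a list of trimmed, non-empty values."""
    if not value:
        return []
    return [part.strip() for part in value.split(sep) if part.strip()]


# Illustrative input, not taken from the dataset.
colours = split_values("White / Cream")
```

Empty or missing fields come back as an empty list, so callers can iterate over the result without a separate check.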
There were some fields related to compliance with the Plant Health Directive (EC77/93). I removed them, but pester me if they're of interest to you and I can add them back.
The parse_html_pages.py script is where the conversion happens. There is no need to run it; it will just re-generate the CSV and JSON files. Running it also requires the HTML files from the Europotato website, which I downloaded for each variety using the DownThemAll extension for Firefox. I didn't include these files, since each strain also has a "url" field that points to the EuroPotato website.
Fields
These are all the fields that can be found in the database:
- title
- name
- number_of_sources
- url
- higher_taxon
- genus
- pedigree
- breeder
- breeders_rights
- synonyms
- national_list
- data_source
- resistance_to_late_blight_on_foliage(artificial_inoculum_in_the_field)
- country_of_origin
- flower_colour
- flower_frequency
- foliage_cover
- maturity
- primary_tuber_flesh_colour
- tuber_eye_depth
- tuber_shape
- tuber_skin_colour
- tuber_skin_texture
- tuber_shape_uniformity
- yield_potential
- dry_matter_content
- starch_content
- frost_resistance
- growth_habit
- light_sprout_colour
- tuber_eye_colour
- dormancy_period
- growth_cracking
- hollow_heart_tendency
- internal_rust_spot
- resistance_to_external_damage
- resistance_to_internal_bruising
- secondary_growth
- tuber_size
- tubers_per_plant
- after_cooking_blackening
- cooking_type_/_411_cooked_texture
- crisp_suitability
- enzymic_browning
- french_fry_suitability
- taste
- resistance_to_dry_rot_(fusarium_spp.)
- resistance_to_late_blight_on_foliage
- resistance_to_late_blight_on_foliage(laboratory_test)
- resistance_to_late_blight_on_tubers
- resistance_to_late_blight_on_tubers(laboratory_test)
- resistance_to_stem_canker_(rhizoctonia_solani)
- wart_(synchytrium_endobioticum)
- resistance_to_bacterial_soft_rot_(erwinia_spp.)
- resistance_to_blackleg_(erwinia_spp.)
- resistance_to_common_scab_(streptomyces_scabies)
- resistance_to_potato_leaf_roll_virus
- resistance_to_potato_virus_a
- resistance_to_potato_virus_x
- resistance_to_potato_virus_y_(strain_not_specified)
- resistance_to_globodera_rostochiensis_race_1
- pollen_fertility
- early_harvest_yield_potential
- field_immunity_to_wart_races
- tuber_glycoalkaloid
- resistance_to_powdery_scab_(spongospora_subterranea)
- resistance_to_globodera_pallida_race_2
- resistance_to_globodera_rostochiensis_race_2
- resistance_to_globodera_rostochiensis_race_3
- berries
- storage_ability
- frying_colour
- resistance_to_globodera_rostochiensis_race_4
- resistance_to_gangrene_(phoma_foveata)
- resistance_to_globodera_pallida_race_3
- resistance_to_late_blight_on_tubers(artificial_inoculum_in_the_field)
- resistance_to_potato_virus_m
- resistance_to_potato_virus_s
- secondary_tuber_flesh_colour
- resistance_to_globodera_pallida_race_1
- resistance_to_globodera_rostochiensis_race_5
- susceptibility_to_wart_races
- resistance_to_dry_rot_(fusarium_coeruleum)
- resistance_to_potato_virus_b
- resistance_to_potato_virus_c
- protein_content
- resistance_to_tobacco_rattle_virus
- resistance_to_early_blight_(alternaria_solani)
- resistance_to_potato_virus_yn
- drought_resistance
- stolon_length
- resistance_to_bacterial_wilt_(ralstonia_solanacearum)
- resistance_to_ring_rot_(clavibacter_michiganensis_ssp._sepedonicus)
- resistance_to_potato_mop_top_virus
- core_collection
- stolon_attachment
- tuber_greening_before_harvest
- presence_of_late_blight_r_gene
- resistance_to_dry_rot_(fusarium_sulphureum)
- adaptability
- resistance_to_late_blight_on_foliage(natural_inoculum_in_the_field)
- rate_of_bulking
- resistance_to_fusarium_wilt_(fusarium_oxysporum)
- resistance_to_late_blight_on_tubers(natural_inoculum_in_the_field)
- resistance_to_aphids
- resistance_to_tuber_moth
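
Because most rows fill in only a subset of these fields, it can help to count how many rows actually carry a value for each one. The sketch below assumes the rows have already been loaded as dicts (for example with csv.DictReader or json.load); the function name `field_coverage` and the sample rows are illustrative only.

```python
from collections import Counter


def field_coverage(rows):
    """Count, per field, how many rows have a non-empty value."""
    coverage = Counter()
    for row in rows:
        for field, value in row.items():
            # Treat None, empty strings and empty lists as "not filled in".
            if value not in (None, "", []):
                coverage[field] += 1
    return coverage


# Made-up rows, only to show the call; they are not taken from the dataset.
sample = [
    {"name": "Example A", "taste": "Good", "maturity": ""},
    {"name": "Example B", "taste": "", "maturity": None},
]
coverage = field_coverage(sample)
# coverage.most_common() lists the best-populated fields first.
```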