Scatterplot

Participants: Robert Oxhorn, Ricardo Lafuente, Michael Murtaugh

Scatterplot took all the data that was collected on potatoes, in the form of CSV files, in order to experiment with a form of 'total' visualisation that gathers all possible correlations. On the one hand, the diagram shows the quantity of correlated data; on the other, the colour code shows to what extent the data correlate with each other.

Of course, there were numerous obstacles, the nomenclature for example. The European Cultivated Potato Database is almost entirely non-numerical. By developing Python scripts that translate the words into numbers, the participants could represent them in a file that made it possible to establish correlations. But the desire to be precise raised questions: how should terms like 'medium' and 'little' be interpreted? How many centimetres lie between 'medium' and 'small'?
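The translation of descriptive words into numbers can be sketched as follows. This is only a minimal illustration, not the participants' actual script: the scale `ORDINAL_SCALE`, the column name `tuber_size` and the sample rows are all invented here. The ranks express an order only, so the question of how many centimetres separate two terms stays open.

```python
import csv
import io

# Hypothetical ordinal scale. The ranks are an assumption made for this sketch,
# they say that 'small' comes before 'medium', not how far apart the two are.
ORDINAL_SCALE = {
    "very small": 1,
    "small": 2,
    "medium": 3,
    "large": 4,
    "very large": 5,
}


def encode(term):
    """Return the rank of a descriptive term, or None if it is missing or unknown."""
    key = term.strip().lower()
    return ORDINAL_SCALE.get(key)


def encode_column(csv_text, column):
    """Read CSV text and return one column translated into ranks."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [encode(row[column]) for row in reader]


# Invented sample rows, for illustration only.
SAMPLE_CSV = """variety,tuber_size
A,Medium
B,small
C,
D,Very large
"""

encoded = encode_column(SAMPLE_CSV, "tuber_size")
print(encoded)  # [3, 2, None, 5]
```

Missing or unrecognised terms become `None` rather than 0, so that an absent description is not mistaken for a measured value.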

Using numpy, they created a matrix in which each column is correlated with every other, producing values from -1 to 1: -1 is a perfect inverse correlation, 0 indicates no correlation and 1 is a perfect positive correlation. The first important negative correlation turned out to be between 'the capacity of adaptability' and 'the taste': -0.917662935482. That said, in the entire database only three potatoes had values for both of these categories. From a statistical point of view, such a small sample makes the correlation unreliable, but we prefer to keep the truth in numbers in a state of suspense.
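A correlation matrix of this kind could be computed roughly as below. This is a sketch under assumptions, not the project's code: the function name `pairwise_correlations` and the data are invented, missing values are assumed to be stored as `np.nan`, and each pair of columns is correlated using only the rows where both have a value. The function also returns how many rows each value rests on, which is what makes a case like the three potatoes visible.

```python
import numpy as np


def pairwise_correlations(data):
    """Correlate every column with every other column.

    Only rows where both columns have a value are used for a given pair.
    Returns (corr, counts): corr[i, j] is Pearson's r for columns i and j,
    counts[i, j] is the number of rows that r is based on.
    """
    n_cols = data.shape[1]
    corr = np.full((n_cols, n_cols), np.nan)
    counts = np.zeros((n_cols, n_cols), dtype=int)
    for i in range(n_cols):
        for j in range(n_cols):
            both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            counts[i, j] = both.sum()
            x, y = data[both, i], data[both, j]
            # r is undefined with fewer than two rows or with a constant column
            if counts[i, j] >= 2 and x.std() > 0 and y.std() > 0:
                corr[i, j] = np.corrcoef(x, y)[0, 1]
    return corr, counts


# Invented values, one row per potato, np.nan marks a missing entry.
data = np.array([
    [1.0, 3.0, 2.0],
    [2.0, 2.0, np.nan],
    [3.0, 1.0, 6.0],
    [np.nan, 5.0, 8.0],
])

corr, counts = pairwise_correlations(data)
print(np.round(corr, 3))
print(counts)
```

In this toy data the first two columns correlate at -1 on the basis of three rows, and the first and third at 1 on the basis of only two. A high absolute value next to a low count is a warning sign rather than a finding.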

The HTML page uses two colour codes: blue indicates negative correlations, while red shows positive correlations. All possible correlations are shown. This gives an immediate overview, although it takes time to read into the interdependent relationships.
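One way to turn a correlation value into such a colour is sketched below. The function `correlation_colour` is hypothetical, and the fading towards white for weak correlations is an assumption of this sketch. The text only states that blue stands for negative and red for positive values.

```python
def correlation_colour(r):
    """Map a correlation r in [-1, 1] to a CSS colour string.

    Negative values give blue, positive values give red. The closer r is
    to 0, the paler the colour (assumed here, not taken from the project).
    """
    r = max(-1.0, min(1.0, r))  # clamp stray values into the valid range
    fade = round(255 * (1 - abs(r)))  # 0 for |r| = 1, 255 for r = 0
    if r < 0:
        return f"rgb({fade}, {fade}, 255)"
    return f"rgb(255, {fade}, {fade})"


print(correlation_colour(-1.0))  # rgb(0, 0, 255)
print(correlation_colour(0.0))   # rgb(255, 255, 255)
print(correlation_colour(1.0))   # rgb(255, 0, 0)
```

The returned string can be used directly as a `background-color` value for a table cell.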

Try Scatterplot.