
Basic Statistics

We define the product space as the set of all proximity measures. We now concentrate on the product space built using trade flow data from 1998-2000, which consists of a 1006x1006 matrix whose entries are the proximity between products. Each row and column of this matrix represents a particular product and each off-diagonal element represents the proximity between a pair of products. Figure 1sA shows the proximity matrix where columns are sorted using its sitc4 code name. Figure 1sB shows the same matrix sorted using an average linkage clustering algorithm, revealing its modular structure and the many empty rows and columns which belong to untraded products. In fact only 775 products conforms the actual product space.
Figure 1. The product space matrix representation. A. The product space matrix sorted in incresing order of the Sitc4 numerical code. B. The product space hierarchically clustured shows a modular structure and also revelas that only 775 products actually belong to the space.
Proximity values follow a broad, approximately log-normal distribution. Figure 2s A shows the number of links below a certain threshold. Figure 2s B. shows the frequency distribution of proximities. This broad, heterogeneous distribution evidences that the product space has a few strong links and many marginal links, which are not significant and represent the background of the proximity measure.

Figure 2. Distribution of Proximity Values. A. Cumulative Distributon of Proximity Values. B. Density Distribution of Proximity Values.
At a value of 0.5 the proximity matrix has a giant connected component. Figure 3s shows a rough network representation of the proximity matrix in which only links above 0.5 have been considered. This visualization is not the most suitable and shows that a simple threshold criterion does not reveal much of the structure. We invite you to look at our more sophisticated visualization technique in the next section.
Figure 3. Network representation of the proximity matrix that considers only the proximity values above 0.5.
|