5.1. HyperAdmin input data file format

This section describes the stocks (statistical data) file that HyperAdmin expects as input.

[Important]

Please note the following requirements for the input data file:

  • the input data file must be a spreadsheet xls file (editable by Microsoft Excel and Open Office) named "*data*.xls": the filename must include the "data" characters sequence and the .xls extension is required.

  • the values of stocks must be provided for the lowest level of territorial units. This list is available in the example data template that depends on the selected structure/geometry model at previous step.

  • all values for all units must be filled;

Following sections describe the expected format (sheets, columns and possible values) for the version 2 of this data.xls "stocks" file.

5.1.1. About

Table 5.1 provides an example for this mandatory sheet in the data v2 input xls file.
Table 5.1. V2 sample About sheet
VERSION TIME_ENABLED
2 TRUE

This sheet aims at identifying the version of the format of this data file. Currently (2010-2011), only the value 2 is possible for the VERSION column.

The expected value for the TIME_ENABLED column is a boolean: only TRUE or FALSE values are possible:

  • The TRUE value shows that values are available for the sames labels of indicators at several dates: for example, the population in 2000, the population in 2002.

  • The FALSE value shows that each indicator is given for a single date.

5.1.2. Data

Table 5.2 provides an example for this mandatory sheet in the data v2 input xls file.
Table 5.2. V2 sample Data sheet
UT_IDpop2000pop2002area2000gdp2000gdp2002
AT1111152710
AT1123164811
AT1135176912

This sheet must provide at least three columns: UT_ID then at least two indicators identifiers (in HyperAtlas, there must be at least one numerator stock and one denominator stock). The Table 5.2 shows five indicators identifiers: pop2000, pop2002, area2000, gdp2000 and gdp2002. These identifiers must be described in the StockInfo sheet (see Section 5.1.8).

The UT_ID column must provide the list of territorial units at the lowest rank (example, at NUTS 3 level) of the dataset. The units are referenced by their identifiers that must match the given values in the associated structure.xls input file.

Then, each other cell provides a value for the given indicator column at the given unit row. For example in Table 5.2, 17 is the value for pop2002 indicator in AT113 territorial unit.

[Important]

Each cell must be valuated. Missing values are not accepted here.

5.1.3. Default

Table 5.3 provides an example for this optional sheet in the data v2 input xls file.
Table 5.3. V2 sample Default sheet
DEFAULT_NUM DEFAULT_DEN
pop area

This sheet aims at providing a default indicator to be selected in HyperAtlas at startup for the denominator and for the numerator combo boxes. Expected values for both columns are valid indicators identifiers that must match two of those defined in the StockInfo sheet (see Section 5.1.8).

5.1.4. Label

Table 5.4 provides an example for this mandatory sheet in the data v2 input xls file.
Table 5.4. V2 sample Label sheet
LABEL_ID LANG_CODE NAME DESC
1ENTotal populationTotal population in thousands
1FRPopulation totalePopulation totale en milliers
2ENAreaTotal area
2FRSuperficieSuperficie totale
3ENGDPGross domestic product
3FRPIBProduit intérieur brut
4ENGDP/InhabitantGross domestic product per inhabitant
4FRPIB/HabPIB par habitant
5ENDensityDensity of population
5FRDensitéDensité de population

This sheet aims at providing the internationalized names and descriptions for the indicators and predefined ratios. The LABEL_ID and LANG_CODE provides indexes for this table: for a given label identifier there may be several available translations. Thus, the LABEL_ID = 1 is available in english (LANG_CODE = EN) and french (LANG_CODE = FR) languages. In the StockInfo sheet, each indicator reference a label identifier. As several indicators may be similarly named and described (when an indicator is valuated for several dates), these labels have been exported here.

[Note]

The language identifier code must be a valid ISO Language Code. These codes are the lower-case, two-letter codes as defined by ISO-639. Nevertheless, the parser supports upper-cases. You can find a full list of these codes at a number of sites, such as: http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt (2011-03-16).

Note that values in the LABEL_ID column may be referenced from the StockInfo sheet (see Section 5.1.8) and from the RatioStock sheet (see Section 5.1.7).

5.1.5. Metadata

Table 5.5 provides an example for this optional sheet in the data v2 input xls file.
Table 5.5. V2 sample Metadata sheet
UT_ID STOCK_ID PROVIDER_ID
AT111pop20001
AT112pop20002
 area2
 pop20021

This draft sheet aims at providing some basic metadata information for an indicator relatively or not to a territorial unit. Currently, only the source of data may be given as metadata.

For example in Table 5.5, the values of the pop2000 indicator identifier were retrieved from different sources for regions AT111 and AT112. On the contrary, all values for the area indicator, whatever the unit is, were provided by the same source. Idem for the pop2002 indicator.

The values in the PROVIDER_ID column must match the identifiers that are given in the Provider sheet (see Table 5.6). Likewise, the values in the STOCK_ID column must match the identifiers that are defined in the StockInfo sheet (see Table 5.8).

5.1.6. Provider

Table 5.6 provides an example for this optional sheet in the data v2 input xls file.
Table 5.6. V2 sample Provider sheet
PROVIDER_ID NAME CONTACT URL
1Eurostattoto@eurostat.euhttp://www.eurostat.eu
2INSEEtata@insee.frhttp://www.insee.fr

This sheet aims at providing the list of data providers. Their different ids are referenced from the Metadata sheet.

5.1.7. RatioStock

Table 5.7 provides an example for this optional sheet in the data v2 input xls file.
Table 5.7. V2 sample RatioStock sheet
RATIO_IDLABEL_IDNUM_IDDEN_IDVALIDITY_STARTVALIDITY_END
14gdp2000pop200020002000
24gdp2002pop200220022002
35pop2000area200020002000
45pop2002area200020022002

This sheet aims at defining relevant ratios for the HyperAtlas "ratio" combo box parameter. Table 5.7 shows the example of two such predefined ratios, each of them for two different dates:

  • the GDP/Inhabitant:

    • in 2000 (second line)

    • in 2002 (third line)

  • The density of population:

    • in 2000 (fourth line)

    • in 2002 (fifth line)

Each value in the RATIO_ID column must be unique. Doublons will overwrite the previous found value.

Note that the LABEL_ID references the sames labels for the given pairs of numerator/denominator at different dates (4 for lines 2 and 3, 5 for lines 4 and 5). These labels identifiers must be set in theLabel sheet (see Section 5.1.4).

The values in the NUM_ID column and the values in the DEN_ID column must match the identifiers of indicators that are defined in the StockInfo sheet (see Section 5.1.8).

The values in the VALIDITY_START column will only be considered if the value of the TIME_ENABLED column in the About sheet is TRUE (see Section 5.1.1). Then, one relevant ratio can be chosen in HyperAtlas for different dates. Identically for the values in the VALIDITY_END column. Though VALIDITY_START and VALIDITY_END columns are designed to handle time intervals, setting the same value in both columns makes the ratio associated to a timestamp.

[Important]

The expected format for both valididy start/end date fields is currently a year in the yyyy pattern. An input like 2010-12-31 is possible but this version of the application will only take into account the year, that is to say 2010 for this example.

5.1.8. StockInfo

Table 5.8 provides an example for this mandatory sheet in the data v2 input xls file.
Table 5.8. V2 sample StockInfo sheet
STOCK_ID LABEL_ID MEASURE_UNIT VALIDITY_START VALIDITY_END VISIBLE_FLAG
pop20001*100020002000TRUE
pop20021*100020022002TRUE
area20002km2 20002000TRUE
gdp20003euros20002000TRUE
gdp20023euros20022002TRUE

This sheet mainly aims at providing the identifiers of the indicators of the dataset. Here are a short description for each column of this sheet:

  • STOCK_ID: each value in this column must be unique. Any doublon will overwrite the previous found identical value. This column lists the identifiers of the indicators that are referenced in the other sheets. Note that several indicators may be associated to the same label (lines 2 and 3 for example), though they exist to distinguish the values of the population in 2000 and 2002.

  • LABEL_ID: each value in this column must reference an identifier defined in the Label sheet (see Section 5.1.4).

  • MEASURE_UNIT: simply provides the unit of measure for this indicator.

  • VALIDITY_START: shows the start date of validity for this indicator. This field will only be considerated if the value of the TIME_ENABLED column in the About sheet is TRUE (see Section 5.1.1 and Important note about expected date format).

  • VALIDITY_END: shows the end date of validity for this indicator. VALIDITY_START and VALIDITY_END fields are able to manage time intervals, but they can be used to associate a timestamp to the current stock: just write the same value in both cells (please see Important note about expected date format).

  • VISIBLE: this field acts like a flag, a boolean is expected for the values of this column. A TRUE value shows that this indicator will be available in the numerator and in the denominator combo boxes of HyperAtlas parameters panel. A FALSE value may be usefull to define relevant ratios whose indicators have no reason to be available in the numerator and denominator combo boxes. For example, the life expectancy pre-defined ratio considers indicators that have no sense out of this compute.