Wikidata/Archive/Describing wikidata structures
Depending on how the Wikidata proposal is implemented, there may be a need for a format to describe the structure of Wikidata tables. This page proposes a format based on pipe-syntax wikicode. The Wikidata implementation would ignore attributes (like border="1"), table captions (lines beginning with |+), and row/column headings (lines beginning with !).
Examples
Movie
| Field | Keyword | Third column |
|---|---|---|
| Year | year | |
| Tagline | line | |
| Plot summary | multiline | |
| Cast | substructure-array | nested structure: Actor (line), Character (line) |
| Runtime | number | minute |
| Country | line | autolink |
| Color | enumeration | |
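As a rough illustration, the Movie description above might be written in pipe-syntax wikicode along the following lines. This is only a sketch: the exact layout, the use of `||` to put a whole row on one line, and the variable name `$movie_wikicode` are assumptions, and the Plot summary and Color rows are left out to keep it short. The nested table in the third column of the Cast row holds the substructure.

```php
<?php
// Abridged Movie structure description in pipe-syntax wikicode.
// The layout is an assumption made for illustration only.
$movie_wikicode = <<<'WIKI'
{| border="1"
|-
| Year || year ||
|-
| Tagline || line ||
|-
| Cast || substructure-array ||
{|
|-
| Actor || line ||
|-
| Character || line ||
|}
|-
| Runtime || number || minute
|-
| Country || line || autolink
|}
WIKI;

// Print the description so it can be pasted into a wiki page.
echo $movie_wikicode, "\n";
```

Each row of the outer table describes one field: its name, its type keyword, and an optional third column.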
Element
| Field | Keyword | Third column |
|---|---|---|
| Symbol | line | |
| Atomic number | number | |
| Chemical series | enumeration | |
| Atomic mass | number | g/mol |
| Electron configuration | line | |
| Electrons per shell | number | |
| Phase | enumeration | |
| Density at 0 °C | number | g/L |
| Melting point (at 2.5 MPa) | number | K |
| Boiling point (at 2.5 MPa) | number | K |
| Heat of fusion | number | kJ/mol |
| Heat of vaporization | number | kJ/mol |
| Heat capacity (at 25 °C) | number | J/(mol*K) |
| Crystal structure | enumeration | |
| Atomic radius | number | pm |
| Covalent radius | number | pm |
| Van der Waals radius | number | pm |
| Unstable isotopes | substructure-array | |
Software
| Field | Keyword | Third column |
|---|---|---|
| Developer | line | |
| Latest version | line | |
| Release date of latest version | date | |
| OS | enumeration-other | |
| Genre | enumeration-other | |
| License | enumeration-other | |
| Website | url | |
Star
| Field | Keyword | Third column |
|---|---|---|
| Mass | number | kg |
| Radius | number | km |
| Luminosity | number | L |
| Surface temperature | number | K |
| Age | number | y |
| Notable features | multiline | |
| Spectral type | enumeration | |
Type keywords
The middle column in the examples above contains a keyword that describes the data type of the field.
| Keyword | Meaning |
|---|---|
| number | The third column of the structure description contains the units of this number, if applicable. |
| year | Like a number, except it's autolinked and has no unit information |
| line | The field should be edited with an `<input>` text box. If the third column contains autolink, the field is automatically linked. Otherwise the third column should be empty. |
| url | Like the line type, but it is automatically linked as an external link. |
| multiline | The field should be edited with a `<textarea>`. |
| enumeration | The third column of the structure description is a sub-table where each row is a possible value for the enumeration. When editing an instance of this structure, the user is presented with a combo-box of the possible values. |
| enumeration-other | Like an enumeration, except there's an Other option in the combo box, and a text field next to the combo box for entering something that isn't in the enumeration. |
| substructure-array | The third column is a nested structure description. The only restrictions on the structure description are that it can't have fields of type multiline or substructure-array, and it can't have more than 10 fields. Consider the example of the Movie data structure. It has a field called Cast that is of type substructure-array. The substructure has two fields, Actor and Character, both of type line. This can store a mapping of actors to characters. |
| boolean | A check box is used to edit this field. |
| date | A day/month/year, but not a time |
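The keyword meanings above could be enforced on the server side roughly as follows. This is a minimal sketch, not part of the proposal: the function names `wikidata_validate_value` and `wikidata_check_substructure` are hypothetical, dates are assumed to be written as YYYY-MM-DD, a year is assumed to be a plain integer, and the enumeration values in the usage lines are invented for illustration.

```php
<?php
// Hypothetical helper: checks one field value against a type keyword.
// $allowed lists the enumeration values taken from the third column.
function wikidata_validate_value($keyword, $value, array $allowed = array())
{
    // True for a string that contains no line breaks.
    $isLine = is_string($value) && preg_match('/[\r\n]/', $value) === 0;
    switch ($keyword) {
        case 'number':
            return is_numeric($value);
        case 'year':
            // Assumption: a year is an optionally negative integer.
            return is_string($value) && preg_match('/^-?\d+\z/', $value) === 1;
        case 'line':
            return $isLine;
        case 'url':
            return $isLine && filter_var($value, FILTER_VALIDATE_URL) !== false;
        case 'multiline':
            return is_string($value);
        case 'boolean':
            return is_bool($value);
        case 'date':
            // Assumption: dates are written as YYYY-MM-DD, with no time part.
            if (!is_string($value)
                    || preg_match('/^(\d{4})-(\d{2})-(\d{2})\z/', $value, $m) !== 1) {
                return false;
            }
            return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
        case 'enumeration':
            return in_array($value, $allowed, true);
        case 'enumeration-other':
            // Either a listed value or free single-line text from the Other field.
            return in_array($value, $allowed, true)
                || ($isLine && trim($value) !== '');
        default:
            // Unknown keyword; substructure-array is checked separately below.
            return false;
    }
}

// Hypothetical helper: checks the restrictions on a nested structure
// description. $fieldTypes maps field names to type keywords.
function wikidata_check_substructure(array $fieldTypes)
{
    if (count($fieldTypes) > 10) {
        return false;
    }
    foreach ($fieldTypes as $type) {
        if ($type === 'multiline' || $type === 'substructure-array') {
            return false;
        }
    }
    return true;
}

// Usage with made-up enumeration values for the Phase field.
$phases = array('Solid', 'Liquid', 'Gas');
var_dump(wikidata_validate_value('enumeration', 'Gas', $phases));
var_dump(wikidata_check_substructure(array('Actor' => 'line', 'Character' => 'line')));
```

Keeping the substructure check separate from value validation reflects that it applies to the structure description itself, not to an instance of it.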
PHP Functions
Table Parser
This function takes a string containing wikicode as an argument and returns a 2-dimensional array of strings representing the first table found in the wikicode, or NULL if no complete table is found.
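    function wikidata_parse_table($code) {
        $lines = preg_split('/\r?\n/', $code);
        $level = 0;   // table nesting depth; 1 means the outermost table
        $array = array();
        $row = 0;
        $col = -1;    // index of the last cell stored in the current row
        foreach ($lines as $line) {
            if (preg_match('/^\{\|/', $line)) {
                // Table start. A nested table is kept as raw text in the current cell.
                $level++;
                if ($level > 1 && $col > -1) {
                    $array[$row][$col] .= "\n" . $line;
                }
            } elseif (preg_match('/^\|-/', $line)) {
                // Row separator: start a new row of the outermost table.
                if ($level == 1) {
                    $row++;
                    $col = -1;
                } elseif ($level > 1 && $col > -1) {
                    $array[$row][$col] .= "\n" . $line;
                }
            } elseif (preg_match('/^\|\}/', $line)) {
                // Table end. Return once the outermost table is closed.
                if ($level > 1 && $col > -1) {
                    $array[$row][$col] .= "\n" . $line;
                }
                $level--;
                if ($level == 0) {
                    return $array;
                }
            } elseif ($level == 1 && preg_match('/^\|\+/', $line)) {
                // Caption line: ignored, as the format description requires.
                continue;
            } elseif ($level == 1
                    && preg_match('/^\|(.*)$/', $line, $matches)) {
                // Cell line: "||" separates several cells on one line.
                $columns = explode("||", $matches[1]);
                foreach ($columns as $column) {
                    $col++;
                    $array[$row][$col] = $column;
                }
            } elseif ($level > 1 && $col > -1) {
                // Any other line inside a nested table is appended to the current cell.
                $array[$row][$col] .= "\n" . $line;
            }
        }
        // No complete table was found.
        return null;
    }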
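The parser returns raw cell strings, so a caller would probably want to turn each row into a named field description. The following is a minimal sketch of that step; the function name `wikidata_rows_to_fields` and the `name`/`type`/`extra` keys are hypothetical. The sample array mimics what the parser above would be expected to return for a two-row table with a `|-` line before each row: row keys start at 1 and cells keep their surrounding spaces.

```php
<?php
// Hypothetical helper: turns the parser's 2-dimensional array into a list
// of field descriptions with trimmed values.
function wikidata_rows_to_fields(array $table)
{
    $fields = array();
    foreach ($table as $cells) {
        $cells = array_values($cells);
        if (count($cells) < 2) {
            continue; // a field needs at least a name and a type keyword
        }
        $fields[] = array(
            'name'  => trim($cells[0]),
            'type'  => trim($cells[1]),
            'extra' => isset($cells[2]) ? trim($cells[2]) : '',
        );
    }
    return $fields;
}

// Sample input shaped like the parser's return value (an assumption).
$table = array(
    1 => array(' Year ', ' year ', ''),
    2 => array(' Runtime ', ' number ', ' minute'),
);

$fields = wikidata_rows_to_fields($table);
foreach ($fields as $field) {
    $suffix = $field['extra'] !== '' ? ' (' . $field['extra'] . ')' : '';
    echo $field['name'], ': ', $field['type'], $suffix, "\n";
}
// Prints:
// Year: year
// Runtime: number (minute)
```

For enumeration and substructure-array fields, `extra` would hold the raw wikicode of the nested table, which could be passed back through the parser.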