JRA1 D1.4 Database Schema

From GEANT2-JRA1 Wiki

Contents

Visualization

The three following UML diagrams reflect the schema on this page.

Update 16th august 2005: I have moved the visualization diagrams to be inline in this document. (Andreas@UNINETT)

Common Database Schema

A common database schema for the representation of network topology and metrics needs to be defined (layer 3). The initial draft has been issued by DFN Munich and Uninett using the experience gained in the CNM and Nemo project. The schema is going to be used as a common basis for the application of CNM and Nemo to JRA1. In addition, an XML export will be offered to allow for special data representations (e.g. for larger projects). A visualisation of the database schema can be found at Uninett (click on diagrams).

Even though the schema was only intended as a basis for the visualisation, the topology tables and other parts are also useful to denote the information that is needed for the framework. In addition, the schema could also be extended to for information about layer 1 and layer 2 needed by JRA3.

history - Generic Versioned Object Table

column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
versionlongunique versionprimary key, not null
nameStringhuman readable string for identification
valid_fromdate_timestart of validity
valid_todate_timeend of validity


Comments from old table:

  • can be used to store a history of the elements
  • we distinguish between automated and manual validity. For interfaces the automated validity can be derived e.g. from MIB ifAdminStatus, manual validity is used to filter elements by hand
  • Remark: We assuming that the database assigns an OID for each entry in the database (true for PostgreSQL) which is different from the ID fields in previous tables
  • "Add update_time and creation_time entries in most of the tables" (Thanassis)

Topology tables (OSI layer 3)


Visualization is updated 15th august 2005. If doing changes to table, contact Andreas@UNINETT to get the visualization updated too.

map_topology.jpg


monitoring_objects - Monitoring object table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
identifierstringUnique string identifying the object 
namestringName of the object 
descrstringDescription of the object 
  • Superclass for nodes and subnets (especially links)
  • needed for distinction of measurements in table 3.3.
  • Added identifier, name and description field. This is general for all monitoring objects. Identifier can be set using the observation point naming scheme proposed by UNINETT. (Andreas@UNINETT)

router - Node table (inherits from monitoring_object)

column namedatatypedescriptionfurther information
ip_mgmt_addressstringaddress to reach the nodeComment from Sven/Thanassis
typeintegermodel typelinked to node model table
locationUTMgeographical position
domainintegerlink to domain table
  • stores information about network nodes (routers, switches, monitoring stations (supposed to be relevant when using the visualisation for NOC/PERT purposes))
  • type can be used to distinguish between different hardware components
  • location can be used to derive locations in some of the maps
  • domain is used to reflect NRENs/Geant transparency policy

router_type - Router type table

column namedatatypedescriptionfurther information
IDintfor the enum model type
namestringName of the router type
descrstringDescription field of the router type
  • Table to store information about different equipment models (Proposed by Thanassis)
  • I renamed this to router_type, since we are only talking about routers anyway. (Andreas@UNINETT)

monitoring_station - Monitoring station table (inherits from monitoring_objects)

column namedatatypedescriptionfurther information
ip_mgmt_addressip
ip_mgmt_addresslocation
  • table to store information about monitoring stations (hardware component to perform measurements in the network, e.g. IPPM box)
  • is tied to one interface (sufficient?)
  • representation of measurements along a path?
  • correspondence to GFD measurement points (multiple monitoring station could be seen as a single measurement point or vice versa)?
  • description field removed, since inherited from monitoring object. (Andreas@UNINETT)

Monitoring station Monitoring object link table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
monitoring_station_idlongrefers to monitoring_station:ID
monitoring_object_idlongrefers to monitoring_object:ID


  • DEPRECATED. When having one relation to monitoring objects, it should be enough. If you want to montor more than one monitoring objects, then create a group :) and a group is a monitoring object. If you agree Andreas@DFN, then remove this section. (Andreas@UNINETT)

metricdef_link - Monitoring station Metric definition link table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
monitoring_station_idlongrefers to monitoring_station:ID
metric_definition_idlongrefers to metric_definition:ID

interface - Interface table (inherits from monitoring_objects)

column namedatatypedescriptionfurther information
node_IDlongunique identifier from node:IDsecondary key (unique with name), not null
speedunsigned longonly to distinguish between 32/64 bit counters
typeunsigned longmedia type MIB variableselection of interesting interfaces
MTU_sizeint
is_subinterface_oflongproposal to deal with sub-interfaces
  • stores information about single interfaces of a node
  • information should be retrieved directly from routers (Comment from Sven)
  • It should consist at least of a triple <interface IP address, system name (e.g., FastEthernet0/1), descriptive name (e.g., "Dante to DFN")>. Interface identifaction should not be maintained manually in the database, this would surely lead to inconsistencies. Interface identifications tend to change - cards are moved in routers, IP addresses are modified, etc. (Comment from Sven/Tom Kosnar)
  • type can be used for filtering (e.g. filter null interfaces, loopback interfaces)
  • sub-interfaces are not considered so far, because VLANs have not been identified to be interesting for the German Research Network. This could be different in GEANT.
  • "I expect that the entire community will see a lot more sub-interface usage as people create sub-rate light paths across 10GE networks using vlans or other technology to carve circuits with defined QOS out of their infrastructure. Subinterfaces are also used extensively at exchange points. A very common model is to bring in a 10GE connection and run point-to-point vlans over the exchange to each of your low-volume peers. This is already happening at Starlight and ManLan. It will become even more popular in the US as PacificWave and AtlanticWave come online in the next year." (Comment from Joe Metzger)
  • Interface names are derived from the if_decr MIB variable. These could be hard to understand for the users. Are there other solutions/proposals? (Comment by Joe Metzger)
  • IP address could be used as interface names to be in correspondance with traceroute results (Comment from Joe Metzger)
  • Removed name and description, since inherited from monitoring_objects. We cannot use if_descr and if_alias blindly in an interdomain context, since NRENs have different understanding and make different use of these properties. There must be a NREN specific translation of if_alias and if_descr into the identifier, name and description field. (Andreas@UNINETT)

interface_ip_address - Connection table (inherits from versioned object table)

column namedatatypedescriptionfurther information
ip_addressIPIP Address v4/6
interfacelongrefers to interface:ID
networklongrefers to subnet:ID
  • Introduced as a result of the discussion about multipoint interfaces, because the database schema should be able to model them. Proposal for this and the following table by Olav.
  • interface_ip_address is a relation between interface and network. (Andreas@UNINETT)

network - Subnet table (inherits from monitoring object table)

column namedatatypedescriptionfurther information
typeenumdistinguish between different types (link, VLAN, MPLS)
prefixprefixIP Prefix
  • stores information about subnets (either links or multipoint connections)

Prior discussion about links/multipoint:

  • Until now no multipoint links have been considered, because this makes the schema more complicated and these links are very seldom in German Research Network and Uninett. Are they more relevant to other networks? (Comment by Joe Metzger)
  • "I do see the need for multi-point interfaces when zooming in. In our maps (Uninett) we have modeled multi-access network using the {ATM,MAC} switch/hub as a node to which each member node connects. VLANs are used in POPs and to customers. We may consider VLAN/ATM/(MPLS?) clouds as level 3 and the physical interfaces level 2." (Olav)
  • "Currently the only multipoint transit interfaces within ESnet are at exchange points. I don't expect this to grow much. I believe it is worth it to think about how this will fit into the schema, but may not be important enough to actually change the design." (Joe Metzger)

prefix - IP prefix table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
networklongidentification from network:ID
prefixString
nameString
descrString
organizationString
useString
domainDomain
  • allows for multiple IP prefixes for subnetworks
  • to collected and updated automatically to avoid too much maintenance (e.g., each night)
  • This table only have loose relations to network. (Andreas@UNINETT)


group - Monitoring object grouping table (inherits from monitoring object table)

column namedatatypedescriptionfurther information
grouptypeint

group_consist_of - Monitoring object grouping element table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
monitoring_objectlongrefers to monitoring_object_grouping:ID
grouplongrefers to monitoring_object:ID
orderintoptional ordering...

Map tables

Visualization is updated 15th august 2005. If doing changes to table, contact Andreas@UNINETT to get the visualization updated too.

map_map.jpg

map - Map table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
nameStringEnglish map name (human readable)
URLStringURL of background image
  • stores information about visualisation maps
  • can contain a background image e.g. if map uses geographical information

geographical_map - Geographical map table

  • (inherits from map table, Uninett please add information from Nemo).

edge_coordinates - Map edge coordinate table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
maplongrefers to map:id
position...refers to position:id

map_element - Map element table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
nameStringEnglish name (human readable)
min_resolutionomtReferring to map resolution.
max_resolutionomtReferring to map resolution.
monitoring_objectslongunique identifier from monitoring_object:ID
  • do we need multiple URLs for a single element?
  • Generic mapping to topology tables, allows for more flexibility in the visualisation (represent interfaces of a router as nodes); is this flexibility needed?

map_node - Map node table (inherits from map element)

column namedatatypedescriptionfurther information
typestringType of node
positionvalue pairposition in map
  • stores information about network node representation in a map
  • position may be derived automatically if it is a geographical map

has_position - Map node Has position table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
map_IDlongunique identifier (map:ID)
map_node_IDlongunique identifier (map_node:ID)
position...describe position in map (x,y) or more complex?

action - Map element action table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
map_elementlongunique identifier (map_element:ID)
maplongunique identifier (map:ID)
orderlongfor ordering purposes in menus
titlestringgenerically specifies action to do
descrstringgenerically describes action to do
hrefstringURI


map_resolution - Map resolution

column namedatatypedescriptionfurther information
resolutionlong


map_link - Map link table (inherits from map element)

column namedatatypedescriptionfurther information
map_node_fromlongunique identifier (map_node:ID)
map_node_tolongunique identifier (map_node:ID)
  • stores information about network link representations in maps
  • link_layout describes how the link is displayed (e.g. arcs could be used for double connections), allow for clarity of

complicated maps

  • (History table could also be used for maps. Will this result in size/redundancy problems?)

Connected map node-to-link table (inherits from generic versioned object table)

column namedatatypedescriptionfurther information
map_node_IDlongunique identifier (map_node:ID)
map_link_IDlongunique identifier (map_link:ID)
  • allows for multipoint links
  • This table is DEPRECATED. One link can only be attached to two nodes. To express multipoint links, we support inserting a network-cloud node, and attaching this to the respectively connected nodes. (Andreas@UNINETT)

Metric tables

This diagram was updated 16th August 2005. Contact Andreas@UNINETT to request any updates on the diagram.

jra1_metric.jpg


The diagram below show an example of the templates expanded to three different data tables. One for ping and two for netflow. This diagram was updated 16th August 2005. Contact Andreas@UNINETT to request any updates on the diagram.

jra1_metric_expanded.jpg

Metric definition table

column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
nameStringEnglish metric name (human readable)
data_sourceStringway to access the data (with parameters)direct SNMP, from general framework
intervalsecondsfrequency of update
data_typeenumdata type for metric (e.g. percentage, unsigned long, status)
unitstringunit used for measurement values
  • Generic table for the description of displayed metrics
  • Sven/Thanassis propose to skip interval field, because this can change over time. Problem: Where to store information about current interval instead?
  • "I think that regular period is sufficient for measurement scheduling. There can also be next_run field with the timestamp of the next test execution. This field can be used a) for one time scheduling (not regular) and b) to help resolve cases when the current test execution was longer than the period. We can compute next_run as: next_run = next_run + ((1+int(current_time - next_run)/period))*period. Test instances should also belong to groups indicating which of them can overlap. Some test instances should not overlap, such as iperfs over non-disjunct network paths." (Sven)
  • Do we need an accuracy field? Is it constant (-> should be part of metric definition table) or not (-> needs to stored for each measurement) (Comment from Thanassis)
  • One-way latency should have accuracy, and it should be stored for each measurement. Clocks drift over time so the accuracy of One-way latency measurements can change over time. (Jeff)

(obsolete: Metric scheduling table (replaced by Metric methodology table) )

column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
interval_typeenum(periodic,poisson)Two different one-arg distributions
intervalintseconds between two measurements (for poisson this is average interval)
durationintduration of a measurement
  • "If there is going to be an extra table for the interval anyway, I would suggest adding fields to it to specify the "type" of interval. So far only periodic (regular) intervals have been discussed. We actually use a Poisson distribution for most of our "active" scheduling intervals as recommended by RFC 2330. It would be nice if that were supported as well. Poisson is simple because it is a single argument distribution just like a "periodic" distribution. (In the periodic case, the argument is just the interval, in the "Poisson" case the argument is mu, or the "average" interval.) It would not be difficult to make this extra table support "single" argument distributions, or even distributions up to some maximum number of arguments." (Jeff)

Tables for dealing with Metric methodology

Metric methodology template table
column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
metric_methodology_template_time_resolution_IDlongrefers to metric_methodology_template_time_resolution:ID
Metric methodology parameter table
column namedatatypedescriptionfurther information
metric_methodology_template_IDlongrefers to metric_methodlogy_template:ID
parameterkeystringmethodlogy parameter keys describing the methodology (except scheduling information)
parametervaluestringmethodlogy parameter values describing the methodology (except scheduling information)
Metric methodology template time resolution table
column namedatatypedescriptionfurther information
IDlongunique identifer in tableprimary key, not null
interval_typeenum(periodic,poisson)Two different one-arg distributions
intervalintseconds between two measurements (for poisson this is average interval)
plusTimeToleranceintfrom NM-WG V1.0
minusTimeToleranceintfrom NM-WG V1.0
durationintduration of a measurement (NM-WG V1.0 testing time)

Tables dealing with measurement data

Measurement ID table
column namedatatypedescriptionfurther information
measurement_IDlongunique identifiernot null, primary key
metric_IDlongrefers to metric_definition:IDnot null
metric_methodology_template_IDlongrefers to metric_methodology_template:IDnot null
scheduledTimedatetime1NM-WG V1.0 time parameter (for on-demand measurements); =0 if this is a regularly scheduled measurement with timeOffset taken from methodlogy_template

used as measurement request descriptor

Measurement data directory table
column namedatatypedescriptionfurther information
metric_IDlongrefers to metric_ID:IDnot null
table_namestringactual name of data table
other fields ...

meta data about measurement data tables

Measurement subject table
column namedatatypedescriptionfurther information
IDlongunique IDnot null
Measurement single subject table (inherits from measurement subject table)
column namedatatypedescriptionfurther information
monitoring_object_IDlongrefers to monitoring_object:IDnot null

... add other subclasses for more complex subjects like point..point or even paths (e.g. for traceroute)

Measurement data table (generic (meta data about contained in measurement data directory table), could contain different metrics if polled at same time interval)
column namedatatypedescriptionfurther information
measurement_IDlongrefers to measurement_ID:IDnot null
measurement_subject_IDlongrefers to measurement_subject:ID
time_stampdate_timeactual time when measurement occurred
durationintseconds how long the measurement has actually been performed
valueanytype dependend from metric
  • 3.3: conceptual tables; could be seperate tables for each metric if set of metrics for JRA1 has been confirmed; no differentiation between node and link measurement tables

Prior discussion:

  • "I am not sure if we need separate tables for node and link measurements. Sometimes it is not clear if something is node or link measurement (such as an interface byte counter retrieved from a node). And we will also have values measured over the whole network path (between two monitoring stations). Perhaps each record can include one or two node identificators (for measurements between two node) with optional interface identificators (and these should be references to triples described in node table above)." (Comment from Sven)
  • "I guess conventional snmp can be considered as interfaces and not the links linking them - i.e. a one node thing. The link connection between two or more interfaces does not need to have statistics then. We will need to model a passive monitoring device, that would tap into a fibre containing multiple lambdas." (Comment from Olav)
  • "Ipfix speaks about Observation Points - I suggest that we also use it in a general way: for both passive Observation Points and interface Observation Points. "An Observation Point is a location in the network where IP packets can be observed. Examples include: a line to which a probe is attached, a shared medium, such as an Ethernet-based LAN, a single port of a router, or a set of interfaces (physical or logical) of a router." from Ipfix" (Comment from Olav)

Tables dealing with aggregated measurement data

Measurement aggregated data tables (generic (meta data about contained in measurement data directory table), could contain different metrics if polled at same time interval)

For Complex, pre-defined aggregationInterval Types:

Complex Aggregation Types Table

column namedatatypedescriptionfurther information
IDlongunique identifierprimary key, not null
descriptionstringe.g. weekly, daily

Complex Aggregation Interval Table

column namedatatypedescriptionfurther information
IDlongunique identifierprimary key, not null
aggregationIDlongrefers to aggregation_type:IDnot null
startTimedatetimenot null
intervalLengthlong
aggregationIndexint0..max, e.g. 0-7 for daily aggregation depending on the day of week
containingIntervallongrefers to aggregation_interval:ID

Complex Aggregation Data Table

column namedatatypedescriptionfurther information
aggregationIntervalIDlongrefers to aggreagtion_interval:IDnot null
measurement_IDlongrefers to measurement_ID:IDnot null
monitoring_object_IDlongrefers to monitoring_object:ID
time_stampdate_timeactual time when measurement occurred
durationintseconds how long the measurement has actually been performed
valueanytype dependend from metric

Metric table example

Metric definition table examples

column namedatatypevaluefurther information
IDlong1primary key, not null
nameStringlink utilizationEnglish metric name (human readable)
data_sourceStringdirect SNMP
intervalseconds300frequency of update
data_typeenumpercentage[2]data type for metric (e.g. percentage, unsigned long, status)
unitstring1unit used for measurement values
IDlong2primary key, not null
nameStringround trip timeEnglish metric name (human readable)
data_sourceStringgeneral framework
intervalseconds300frequency of update
data_typeenumdoubledata type for metric (e.g. percentage, unsigned long, status)
unitstringmillisecunit used for measurement values
IDlong3primary key, not null
nameStringavailable bandwidthEnglish metric name (human readable)
data_sourceStringgeneral framework
intervalseconds300frequency of update
data_typeenumdouble[2]data type for metric (e.g. percentage, unsigned long, status)
unitstringMbit/sunit used for measurement values

Metric methodology template table examples

column namedatatypevaluedescriptionfurther information
IDlong21unique identifer in tableprimary key, not null
metric_methodology_parameterlist_IDlong31refers to metric_methodology_parameterlist:parameterlistID
metric_methodology_template_time_resolution_IDlong41refers to metric_methodology_template_time_resolution:ID

Metric methodology parameterlist table examples

column namedatatypevaluedescriptionfurther information
parameterlistIDlong31identifer in table
parameterkeystringip version
parametervaluestringipv4
parameterlistIDlong31identifer in table
parameterkeystringpacketsize
parametervaluestring1k kb
parameterlistIDlong31identifer in table
parameterkeystringclass of service
parametervaluestringpremium ip
parameterlistIDlong32identifer in table
parameterkeystringmethodlogy parameter keys describing the methodology (except scheduling information)
parametervaluestringmethodlogy parameter values describing the methodology (except scheduling information)
parameterlistIDlong32identifer in table
parameterkeystringmethodlogy parameter keys describing the methodology (except scheduling information)
parametervaluestringmethodlogy parameter values describing the methodology (except scheduling information)

Metric methodology template time resolution table examples

column namedatatypevaluedescriptionfurther information
IDlong41unique identifer in tableprimary key, not null
interval_typeenum(periodic,poisson)periodicTwo different one-arg distributions
intervalint400seconds between two measurements (for poisson this is average interval)
plusTimeToleranceint0from NM-WG V1.0
minusTimeToleranceint0from NM-WG V1.0
durationint10duration of a measurement (NM-WG V1.0 testing time)
IDlong42unique identifer in tableprimary key, not null
interval_typeenum(periodic,poisson)poissonTwo different one-arg distributions
intervalint100seconds between two measurements (for poisson this is average interval)
plusTimeToleranceint0from NM-WG V1.0
minusTimeToleranceint0from NM-WG V1.0
durationint10duration of a measurement (NM-WG V1.0 testing time)

Metric methodology time period table examples

column namedatatypevaluedescriptionfurther information
metric_methodology_time_resolution_IDlong41refers to metric_methodology_time_resolution:ID
starttimelong81111112344

Measurement ID table examples

column namedatatypevaluedescriptionfurther information
measurement_IDlong51unique identifiernot null, primary key
metric_IDlong1refers to metric_definition:IDnot null
measurement_methodology_template_IDlong21refers to measurement_methodology_template:IDnot null
scheduledTimedatetime0NM-WG V1.0 time parameter (for on-demand measurements); =0 if this is a regularly scheduled measurement with timeOffset taken from methodlogy_template

Measurement data directory table examples

column namedatatypevaluedescriptionfurther information
measurement_IDlong51refers to measurement_ID:IDnot null
table_namestringmsrm_data_51actual name of table to save measurement data
other fields ...

Measurement data table examples

Table msrm_data_51:

column namedatatypevaluedescriptionfurther information
measurement_IDlong51refers to measurement_ID:IDnot null
monitoring_object_IDlongvaluerefers to monitoring_object:ID
time_stampdate_time811166336actual time when measurement occurred
durationint30seconds how long the measurement has actually been performed
valuedouble[2]32, 43type dependent on metric

Administrative tables

Domain table

column namedatatypedescriptionfurther information
IDlongunique identifier in table
nameString
typeenumdifferentation of domain types
  • stores information about different domains.
  • Does it make sense to have a differentiation between types of domains (e.g. NRENs, projects)

User table

column namedatatypedescriptionfurther information
IDlongunique identifier in table
namestring
authenticationStringinformation needed for authentication
  • stores information about single users
  • Should this table contain more information like e-mail address, street address, organization, etc?


User group table

column namedatatypedescriptionfurther information
IDlongunique identifier in table
nameString
domainenum
  • stores information about user groups

User to user group mapping table

column namedatatypedescriptionfurther information
user_IDlongfrom user table:ID
user_group_IDlongfrom user group table:ID
  • table for mapping users to user groups. Necessary, because a user may belong to different user groups.

Policy table

column namedatatypedescriptionfurther information
IDlongunique identifier in table
domainenum
user groupenum
metricenum
access rights?


  • Reflect different transparency policies (dependent on NREN, user/user group may also necessary); dependent on GFD
  • Different users may be assigned different policies and these can differ for different sources of information (nodes, interfaces). (Comment from Sven)
  • How should access rights look like?

Language table

Language table

column namedatatypedescriptionfurther information
referenced_OIDlongunique identifier in whole databasenot null
languageintnumber of languagenot null
translated_nameStringtranslated name in specified language
  • (Could be used for names in map/metric tables to allow for different languages)

Open Issues

  • What are resonable durations for storing the data (depending on metric, user group interests)?
  • Should we store only raw data and aggregate it on the fly or should we store aggregated data to increase the performance?
  • Should the granularity of older data be degraded (e.g. aggregate 5 minute intervals into 1 hour intervals)?
  • Which are the interests of different user groups (e.g. a network planner would like to see long term trends in the network usage, PERT people would like to have highly accurate current information to identify problems)


Comments

  • Using RRD style aggregate sounds great initially but it frequently comes back and bites you. (Joe Metzger)
  • Pre-computing aggregates is a useful performance optimization but it is no substitute for maintaining the detailed data. (Joe Metzger)
  • "I think that we should store aggregated values as well. Each metricmay require different aggregation parameters. This requires more investigation." (Sven)
  • "I think that we may want to store in the same database also information about scheduling of measurements." (Sven)
  • Database should be populated automatically, e.g. via scripts, in order to avoid inconsistencies. Minimum information should be set by hand. (Thanassis)
  • When refer to IP protocol, we should consider IPv4 and IPv6, in order to built a future-proof database. (Thanassis)
Personal tools