81st IETF Side Meeting
Side meeting at the 81st IETF in Quebec; the goal is to create an IRTF research group on network complexity.
Friday July 29th, 0900-1130. Hosted by Geoff Huston and Michael Behringer.
|Topic||Presenter||Duration||Material|
|Overview, agenda bashing|| ||10 min||slides|
|Classifying Network Complexity||Michael Behringer||20 min||slides|
|End to End Complexity||Geoff Huston||20 min||slides|
|Comparative Complexity Analysis||Michael Behringer||10 min||slides|
|Catastrophic Failure Use Case||Michael Behringer||10 min||slides|
|RG Charter discussion|| ||80 min|| |
Network Complexity Research Group (COMPLEXITY)
The Network Complexity Research Group aims at defining and analyzing the complexity of IP based networks.
There is a general perception that unnecessary complexity should be avoided, and when deciding between two approaches in networking, complexity is usually an important factor. However, the term “complexity” is rarely well defined, and decisions on complexity are mostly made on subjective terms.
The Network Complexity Research Group provides objective definitions, metrics and background research to help make decisions where complexity is a factor. The ultimate goal is to provide factual and objective information and metrics to be used in network design and protocol design. It is highly desirable to have practical and objective information on network complexity as an input into the IETF process.
Areas of interest include: 1) Research with the goal of defining “network complexity”, and defining relevant metrics. 2) Comparative research between various network architectures, protocols or approaches. 3) Methods and ideas to contain, control, or reduce complexity in IP based networks. 4) Collection of use cases regarding specific network designs or failure cases where complexity played a role.
The group will report progress through a publicly accessible web site and presentations at IETF meetings. Relevant information and research developed by the group will be submitted for publication as Experimental or Informational RFCs.
Membership: Membership is open.
Meetings: Complexity meetings are usually held in conjunction with IETF meetings.
Minutes from the Meeting
Complexity Discussion Notes
This is not an RG BoF and not a workshop. It's a research meeting of sorts!
Opening assertion: "the complexity is getting out of hand". The concern is that our networks are becoming riskier in terms of operational integrity and of the scalability of the network architecture.
Stuart Cheshire says that there is no single event or transition event from "complex" to "too complex" - it's a gradual process.
- this topic has been considered in two workshops
- complexity vs complicated is a theme here
- "complicated" is related to a deterministic system, whereas "complex" could be seen as being non-linear and resistant to modelling
- what about the concept of "Do What I Mean"?
- Is this about "extreme complexity" vs simplicity?
- Is the question: "can I be sure that if I make a change in the active system, the system will survive the change?" The evidence is that we really do not have this confidence any more. We use test labs and simulations for almost all changes on today's networks because we no longer have confidence in the system.
- The observation is that complexity is increasing: numbers of routers, lines of code, features in routers, etc. Things get bigger, and over time complexity follows
- The related observation is that complexity can be beneficial - useful systems are often complex
- But is there a point that goes beyond the point of diminishing returns to a point of negative return - i.e. adding robustness gets into a complexity / robustness spiral where the additional robustness adds complexity which threatens robustness...
- tools and approaches for "dealing" with complexity
- is complexity composed of "states" plus "transitions"? The "state" universe includes the network, "the operator" and network management dimensions.
- Complexity impacts predictability and impacts security
What's missing here? Churn?
The more configuration options you provide, the greater the number of states you create, and if the number of viable / working states is small then the probability of a miss (ending up in a non-working state) is high:
- Type A settings - no matter what the setting it just works
- Type B settings - only one setting works, all others fail
- Type C settings
e.g. Mac OS X Lion has IPv6 on with no "off" setting - sometimes the range of choices needs to be refined and pruned
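The point about options multiplying states can be put into numbers. What follows is only a minimal sketch, not something presented at the meeting: it assumes the options are independent and that a configuration is picked uniformly at random, and the function name `miss_probability` and the example option counts are hypothetical.

```python
from math import prod

def miss_probability(choices_per_option, working_states):
    """Return (total_states, probability of landing in a non-working state).

    choices_per_option: number of possible values for each option
                        (assumed independent of each other)
    working_states:     how many of the combined configurations work
    """
    total_states = prod(choices_per_option)
    if not 0 <= working_states <= total_states:
        raise ValueError("working_states must lie between 0 and total_states")
    return total_states, 1 - working_states / total_states

# Ten on/off options where only one value of each works ("Type B"-like):
# exactly one combined configuration is viable.
total_b, p_miss_b = miss_probability([2] * 10, working_states=1)

# The same ten options where every value works ("Type A"-like):
# every combined configuration is viable.
total_a, p_miss_a = miss_probability([2] * 10, working_states=2 ** 10)

print(total_b, p_miss_b)
print(total_a, p_miss_a)
```

Under these assumptions the state count grows exponentially with the number of options, so pruning even a few "Type B" settings reduces the chance of a miss far more than it reduces the number of options.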
There are many "realms" of configuration, from host to local network to ISP to... and in some sense settings in one realm have impacts in other realms
There are issues of "open loop" complexity and the question of "intended" behaviour. Do we even know what the intended behaviour is? And how do we get there? Are there coupling issues that are not overt? Or is it just the sheer number of couplings, overt and implicit - are these couplings evidence of complexity?
Complexity as an artefact of the stack - layers make assumptions about upper and lower layers, but the assumptions may not carry across all possible variants of each element in the stack
From the perspective of the vendor, do "settings" imply a deferral of decision and the potential to escalate complexity - i.e. "let others solve the problem"
Or is this an issue of poor elemental design where the element is deliberately incapable of self-configuration and instead is setting driven
"to what goal?"
How and what is this used for? Why are we thinking about complexity?
"should I do network design / product / approach A or should I stay with B?"
"how can I compare approaches / designs / features / settings?"
Are there complexity "comparison" tools / questions?
[slide shows a potential taxonomy of "complexity" metrics]
Is there a set of metrics that can provide a quantitative "complexity" comparison?
One exercise was configuration analysis, looking at: size of the config, number of interfaces, template count, segmentation, number of internal dependencies, and number of external dependencies.
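Some of the configuration-analysis measures listed above can be extracted mechanically. The sketch below is illustrative only: the sample configuration is an invented, IOS-like fragment, and the definitions of "internal dependency" (an interface referencing an access list defined in the same config) and "external dependency" (a BGP neighbor statement) are assumptions, since the notes do not define them. Template count and segmentation are left out for the same reason.

```python
import re

# Invented sample in an IOS-like syntax; addresses and AS numbers are
# taken from documentation ranges.
SAMPLE_CONFIG = """\
interface Gi0/1
 ip access-group EDGE-IN in
interface Gi0/2
 ip access-group EDGE-IN in
ip access-list extended EDGE-IN
 permit ip any any
router bgp 64500
 neighbor 192.0.2.1 remote-as 64501
"""

def config_metrics(text):
    """Compute a few rough size and dependency counts for one config."""
    # Size: non-empty lines that are not "!" comment/separator lines.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("!")]
    interfaces = [ln for ln in lines if ln.startswith("interface ")]

    # Assumed definition: internal dependency = reference to an access
    # list that is defined elsewhere in the same configuration.
    defined_acls = set(re.findall(r"^ip access-list \w+ (\S+)",
                                  text, flags=re.MULTILINE))
    acl_refs = re.findall(r"^[ \t]+ip access-group (\S+)",
                          text, flags=re.MULTILINE)
    internal = sum(1 for name in acl_refs if name in defined_acls)

    # Assumed definition: external dependency = a BGP neighbor, i.e.
    # behaviour that depends on state held by another device.
    external = len(re.findall(r"^[ \t]+neighbor \S+ remote-as \d+",
                              text, flags=re.MULTILINE))

    return {
        "size_lines": len(lines),
        "interfaces": len(interfaces),
        "internal_dependencies": internal,
        "external_dependencies": external,
    }

metrics = config_metrics(SAMPLE_CONFIG)
print(metrics)
```

Counts like these only become a comparison tool when two designs are measured with the same definitions, which is why the definitions are spelled out as assumptions in the code.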
detailed examination of an event, where element failure leads to router/switch inconsistency, leads to switch element saturation, leads to service failure. The analysis exposed an escalating chain of failures and risks.
- Should we have a Research Group on this topic? (5-8 hands raised)
- "What would it study?"
- "That question is the first thing we'd study"
- Desire for "practical" research that has explicit outcomes that could assist in the network design and protocol design functions.
- Q: where is the responsibility for "most" of the complexity these days? There is a lot of vendor input in terms of settings and parameters on equipment and services. There is some considerable interest in 'fixing' this. The feature set in any mature product is getting overwhelming for many vendors.
- Discussion: Is 'time to market' also a factor? Getting product to market quickly often passes over the "unknowns" as settings and variables to the user. This then increases the number of 'forced' states in the system and this may impact complexity of the resultant environment.
- Discussion: Is there a case to limit the choice set?
- Has this been in the operator forums? Yes, over more than a decade! Traction has been hard.
- Are there approaches to 'reduce' complexity? Possible merged functionality, or potential setting pruning
- Is changing the fundamentals (e.g. Multipath TCP) making life better, or is this a feature explosion that adds complexity? Can you ever remove the legacy items?
- There is a "don't throw it out" mentality that prevents us from discarding 'old' or superseded tools and approaches.
- There is research about measuring 'mistakes' in interfaces to systems that helps in understanding the complexity or at least understandability of the operation of the system.
- The salient question remains "to what end?" This is a multi-dimensioned space where "less complexity is better" is not really true. There are a number of desirable attributes, and it is often the case that attempting to 'improve' in one dimension has the potential to cost in other dimensions.
- "Bring in the actuaries!" There is not necessarily a set of absolute priorities, but a set of relative risks and a set of tolerances to risks, whether it's cost, failures, hazards, time to repair, etc.
- Is this a case of "Complexity Considerations"? Not necessarily. Are there small incremental goals?
Lars: How would a research group operate? Well, in some sense you have already been running that work. So what's the IRTF interest? A potential benefit of the IRTF is to draw knowledge in and broadcast thoughts out, but you need to have a healthy life plan outside of the IRTF. This is a difficult topic that takes time and concentration in longer meetings. Discussion is hard at an IETF-style venue.
Bob Briscoe commented that complexity is not really an established research field in networking. So this is a lot like attempting to establish a new research field where there is no recognised 'day job' (c.f. security or routing; complexity is not a professional discipline). This is ambitious. It's not the wrong thing to do. It's a new area of study and analysis for this industry.
IRTF would be a mechanism of awareness gathering in this work.
Discussion: Part of the issue here is a mentality of openness about failures and openness of investigation. The industry is quite secretive about failures and cause analysis.
The field of complex systems is mature, and there are conferences devoted to the science of complex systems. What is the relationship of this work to the science of complex systems?
John Doyle has been working in this area for a decade. There is much of this that would be relevant and useful.
Who would be interested in contributing? 8 - 10 or so raised hands in the room.
Comment: I wouldn't think about this (complexity) much at all unless there was this kind of group to trigger it.
How subjective is this "complexity" analysis? There is a strong desire to have an objective quantification of complexity, but there is a feeling that some of this is not susceptible to quantification and that the analysis of complexity becomes subjective. This will be multi-dimensional.
(Thanks to Geoff Huston for taking the notes)