diff options
Diffstat (limited to 'static/freebsd/man9/mod_cc.9')
| -rw-r--r-- | static/freebsd/man9/mod_cc.9 | 421 |
1 files changed, 421 insertions, 0 deletions
diff --git a/static/freebsd/man9/mod_cc.9 b/static/freebsd/man9/mod_cc.9 new file mode 100644 index 00000000..09580aa9 --- /dev/null +++ b/static/freebsd/man9/mod_cc.9 @@ -0,0 +1,421 @@ +.\" +.\" Copyright (c) 2008-2009 Lawrence Stewart <lstewart@FreeBSD.org> +.\" Copyright (c) 2010-2011 The FreeBSD Foundation +.\" All rights reserved. +.\" +.\" Portions of this documentation were written at the Centre for Advanced +.\" Internet Architectures, Swinburne University of Technology, Melbourne, +.\" Australia by David Hayes and Lawrence Stewart under sponsorship from the +.\" FreeBSD Foundation. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR +.\" ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.Dd May 13, 2021 +.Dt MOD_CC 9 +.Os +.Sh NAME +.Nm mod_cc , +.Nm DECLARE_CC_MODULE , +.Nm CCV +.Nd Modular Congestion Control +.Sh SYNOPSIS +.In netinet/tcp.h +.In netinet/cc/cc.h +.In netinet/cc/cc_module.h +.Fn DECLARE_CC_MODULE "ccname" "ccalgo" +.Fn CCV "ccv" "what" +.Sh DESCRIPTION +The +.Nm +framework allows congestion control algorithms to be implemented as dynamically +loadable kernel modules via the +.Xr kld 4 +facility. +Transport protocols can select from the list of available algorithms on a +connection-by-connection basis, or use the system default (see +.Xr mod_cc 4 +for more details). +.Pp +.Nm +modules are identified by an +.Xr ascii 7 +name and set of hook functions encapsulated in a +.Vt "struct cc_algo" , +which has the following members: +.Bd -literal -offset indent +struct cc_algo { + char name[TCP_CA_NAME_MAX]; + int (*mod_init) (void); + int (*mod_destroy) (void); + size_t (*cc_data_sz)(void); + int (*cb_init) (struct cc_var *ccv, void *ptr); + void (*cb_destroy) (struct cc_var *ccv); + void (*conn_init) (struct cc_var *ccv); + void (*ack_received) (struct cc_var *ccv, uint16_t type); + void (*cong_signal) (struct cc_var *ccv, uint32_t type); + void (*post_recovery) (struct cc_var *ccv); + void (*after_idle) (struct cc_var *ccv); + int (*ctl_output)(struct cc_var *, struct sockopt *, void *); + void (*rttsample)(struct cc_var *, uint32_t, uint32_t, uint32_t); + void (*newround)(struct cc_var *, uint32_t); +}; +.Ed +.Pp +The +.Va name +field identifies the unique name of the algorithm, and should be no longer than +TCP_CA_NAME_MAX-1 characters in length (the TCP_CA_NAME_MAX define lives in +.In netinet/tcp.h +for compatibility reasons). +.Pp +The +.Va mod_init +function is called when a new module is loaded into the system but before the +registration process is complete. +It should be implemented if a module needs to set up some global state prior to +being available for use by new connections. +Returning a non-zero value from +.Va mod_init +will cause the loading of the module to fail. +.Pp +The +.Va mod_destroy +function is called prior to unloading an existing module from the kernel. +It should be implemented if a module needs to clean up any global state before +being removed from the kernel. +The return value is currently ignored. +.Pp +The +.Va cc_data_sz +function is called by the socket option code to get the size of +data that the +.Va cb_init +function needs. +The socket option code then preallocates the modules memory so that the +.Va cb_init +function will not fail (the socket option code uses M_WAITOK with +no locks held to do this). +.Pp +The +.Va cb_init +function is called when a TCP control block +.Vt struct tcpcb +is created. +It should be implemented if a module needs to allocate memory for storing +private per-connection state. +Returning a non-zero value from +.Va cb_init +will cause the connection set up to be aborted, terminating the connection as a +result. +Note that the ptr argument passed to the function should be checked to +see if it is non-NULL, if so it is preallocated memory that the cb_init function +must use instead of calling malloc itself. +.Pp +The +.Va cb_destroy +function is called when a TCP control block +.Vt struct tcpcb +is destroyed. +It should be implemented if a module needs to free memory allocated in +.Va cb_init . +.Pp +The +.Va conn_init +function is called when a new connection has been established and variables are +being initialised. +It should be implemented to initialise congestion control algorithm variables +for the newly established connection. +.Pp +The +.Va ack_received +function is called when a TCP acknowledgement (ACK) packet is received. +Modules use the +.Fa type +argument as an input to their congestion management algorithms. +The ACK types currently reported by the stack are CC_ACK and CC_DUPACK. +CC_ACK indicates the received ACK acknowledges previously unacknowledged data. +CC_DUPACK indicates the received ACK acknowledges data we have already received +an ACK for. +.Pp +The +.Va cong_signal +function is called when a congestion event is detected by the TCP stack. +Modules use the +.Fa type +argument as an input to their congestion management algorithms. +The congestion event types currently reported by the stack are CC_ECN, CC_RTO, +CC_RTO_ERR and CC_NDUPACK. +CC_ECN is reported when the TCP stack receives an explicit congestion notification +(RFC3168). +CC_RTO is reported when the retransmission time out timer fires. +CC_RTO_ERR is reported if the retransmission time out timer fired in error. +CC_NDUPACK is reported if N duplicate ACKs have been received back-to-back, +where N is the fast retransmit duplicate ack threshold (N=3 currently as per +RFC5681). +.Pp +The +.Va post_recovery +function is called after the TCP connection has recovered from a congestion event. +It should be implemented to adjust state as required. +.Pp +The +.Va after_idle +function is called when data transfer resumes after an idle period. +It should be implemented to adjust state as required. +.Pp +The +.Va ctl_output +function is called when +.Xr getsockopt 2 +or +.Xr setsockopt 2 +is called on a +.Xr tcp 4 +socket with the +.Va struct sockopt +pointer forwarded unmodified from the TCP control, and a +.Va void * +pointer to algorithm specific argument. +.Pp +The +.Va rttsample +function is called to pass round trip time information to the +congestion controller. +The additional arguments to the function include the microsecond RTT +that is being noted, the number of times that the data being +acknowledged was retransmitted as well as the flightsize at send. +For transports that do not track flightsize at send, this variable +will be the current cwnd at the time of the call. +.Pp +The +.Va newround +function is called each time a new round trip time begins. +The montonically increasing round number is also passed to the +congestion controller as well. +This can be used for various purposes by the congestion controller (e.g Hystart++). +.Pp +Note that currently not all TCP stacks call the +.Va rttsample +and +.Va newround +function so dependency on these functions is also +dependent upon which TCP stack is in use. +.Pp +The +.Fn DECLARE_CC_MODULE +macro provides a convenient wrapper around the +.Xr DECLARE_MODULE 9 +macro, and is used to register a +.Nm +module with the +.Nm +framework. +The +.Fa ccname +argument specifies the module's name. +The +.Fa ccalgo +argument points to the module's +.Vt struct cc_algo . +.Pp +.Nm +modules must instantiate a +.Vt struct cc_algo , +but are only required to set the name field, and optionally any of the function +pointers. +Note that if a module defines the +.Va cb_init +function it also must define a +.Va cc_data_sz +function. +This is because when switching from one congestion control +module to another the socket option code will preallocate memory for the +.Va cb_init +function. +If no memory is allocated by the modules +.Va cb_init +then the +.Va cc_data_sz +function should return 0. +.Pp +The stack will skip calling any function pointer which is NULL, so there is no +requirement to implement any of the function pointers (with the exception of +the cb_init <-> cc_data_sz dependency noted above). +Using the C99 designated initialiser feature to set fields is encouraged. +.Pp +Each function pointer which deals with congestion control state is passed a +pointer to a +.Vt struct cc_var , +which has the following members: +.Bd -literal -offset indent +struct cc_var { + void *cc_data; + int bytes_this_ack; + tcp_seq curack; + uint32_t flags; + int type; + union ccv_container { + struct tcpcb *tcp; + struct sctp_nets *sctp; + } ccvc; + uint16_t nsegs; + uint8_t labc; +}; +.Ed +.Pp +.Vt struct cc_var +groups congestion control related variables into a single, embeddable structure +and adds a layer of indirection to accessing transport protocol control blocks. +The eventual goal is to allow a single set of +.Nm +modules to be shared between all congestion aware transport protocols, though +currently only +.Xr tcp 4 +is supported. +.Pp +To aid the eventual transition towards this goal, direct use of variables from +the transport protocol's data structures is strongly discouraged. +However, it is inevitable at the current time to require access to some of these +variables, and so the +.Fn CCV +macro exists as a convenience accessor. +The +.Fa ccv +argument points to the +.Vt struct cc_var +passed into the function by the +.Nm +framework. +The +.Fa what +argument specifies the name of the variable to access. +.Pp +Apart from the +.Va type +and +.Va ccv_container +fields, the remaining fields in +.Vt struct cc_var +are for use by +.Nm +modules. +.Pp +The +.Va cc_data +field is available for algorithms requiring additional per-connection state to +attach a dynamic memory pointer to. +The memory should be allocated and attached in the module's +.Va cb_init +hook function. +.Pp +The +.Va bytes_this_ack +field specifies the number of new bytes acknowledged by the most recently +received ACK packet. +It is only valid in the +.Va ack_received +hook function. +.Pp +The +.Va curack +field specifies the sequence number of the most recently received ACK packet. +It is only valid in the +.Va ack_received , +.Va cong_signal +and +.Va post_recovery +hook functions. +.Pp +The +.Va flags +field is used to pass useful information from the stack to a +.Nm +module. +The CCF_ABC_SENTAWND flag is relevant in +.Va ack_received +and is set when appropriate byte counting (RFC3465) has counted a window's worth +of bytes has been sent. +It is the module's responsibility to clear the flag after it has processed the +signal. +The CCF_CWND_LIMITED flag is relevant in +.Va ack_received +and is set when the connection's ability to send data is currently constrained +by the value of the congestion window. +Algorithms should use the absence of this flag being set to avoid accumulating +a large difference between the congestion window and send window. +.Pp +The +.Va nsegs +variable is used to pass in how much compression was done by the local +LRO system. +So for example if LRO pushed three in-order acknowledgements into +one acknowledgement the variable would be set to three. +.Pp +The +.Va labc +variable is used in conjunction with the CCF_USE_LOCAL_ABC flag +to override what labc variable the congestion controller will use +for this particular acknowledgement. +.Sh SEE ALSO +.Xr cc_cdg 4 , +.Xr cc_chd 4 , +.Xr cc_cubic 4 , +.Xr cc_dctcp 4 , +.Xr cc_hd 4 , +.Xr cc_htcp 4 , +.Xr cc_newreno 4 , +.Xr cc_vegas 4 , +.Xr mod_cc 4 , +.Xr tcp 4 +.Sh ACKNOWLEDGEMENTS +Development and testing of this software were made possible in part by grants +from the FreeBSD Foundation and Cisco University Research Program Fund at +Community Foundation Silicon Valley. +.Sh FUTURE WORK +Integrate with +.Xr sctp 4 . +.Sh HISTORY +The modular Congestion Control (CC) framework first appeared in +.Fx 9.0 . +.Pp +The framework was first released in 2007 by James Healy and Lawrence Stewart +whilst working on the NewTCP research project at Swinburne University of +Technology's Centre for Advanced Internet Architectures, Melbourne, Australia, +which was made possible in part by a grant from the Cisco University Research +Program Fund at Community Foundation Silicon Valley. +More details are available at: +.Pp +http://caia.swin.edu.au/urp/newtcp/ +.Sh AUTHORS +.An -nosplit +The +.Nm +framework was written by +.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org , +.An James Healy Aq Mt jimmy@deefa.com +and +.An David Hayes Aq Mt david.hayes@ieee.org . +.Pp +This manual page was written by +.An David Hayes Aq Mt david.hayes@ieee.org +and +.An Lawrence Stewart Aq Mt lstewart@FreeBSD.org . |
