<table class="head">
<tr>
<td class="head-ltitle">IRDMA(4)</td>
<td class="head-vol">Device Drivers Manual</td>
<td class="head-rtitle">IRDMA(4)</td>
</tr>
</table>
<div class="manual-text">
<section class="Sh">
<h1 class="Sh" id="NAME"><a class="permalink" href="#NAME">NAME</a></h1>
<p class="Pp"><code class="Nm">irdma</code> — <span class="Nd">RDMA
FreeBSD driver for Intel(R) Ethernet Controller E810</span></p>
</section>
<section class="Sh">
<h1 class="Sh" id="SYNOPSIS"><a class="permalink" href="#SYNOPSIS">SYNOPSIS</a></h1>
<p class="Pp">This module relies on <a class="Xr">ice(4)</a>.</p>
<dl class="Bl-tag">
<dt>The following kernel options should be included in the configuration:</dt>
<dd><code class="Cd">options OFED</code>
<br/>
<code class="Cd">options OFED_DEBUG_INIT</code>
<br/>
<code class="Cd">options COMPAT_LINUXKPI</code>
<br/>
<code class="Cd">options SDP</code>
<br/>
<code class="Cd">options IPOIB_CM</code></dd>
</dl>
</section>
<section class="Sh">
<h1 class="Sh" id="DESCRIPTION"><a class="permalink" href="#DESCRIPTION">DESCRIPTION</a></h1>
<section class="Ss">
<h2 class="Ss" id="Features"><a class="permalink" href="#Features">Features</a></h2>
<p class="Pp">The <code class="Nm">irdma</code> driver provides RDMA protocol
support on RDMA-capable Intel Ethernet 800 Series NICs that are supported
by <a class="Xr">ice(4)</a>.</p>
<p class="Pp">The driver supports both iWARP and RoCEv2 protocols.</p>
</section>
</section>
<section class="Sh">
<h1 class="Sh" id="CONFIGURATION"><a class="permalink" href="#CONFIGURATION">CONFIGURATION</a></h1>
<section class="Ss">
<h2 class="Ss" id="TUNABLES"><a class="permalink" href="#TUNABLES">TUNABLES</a></h2>
<p class="Pp">Tunables can be set at the <a class="Xr">loader(8)</a> prompt
before booting the kernel or stored in <a class="Xr">loader.conf(5)</a>.</p>
<dl class="Bl-tag">
<dt id="dev.irdma_interface_number_.roce_enable"><var class="Va">dev.irdma<interface_number>.roce_enable</var></dt>
<dd>enables RoCEv2 protocol usage on the <interface_number> interface.
<p class="Pp">By default, the RoCEv2 protocol is used.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_cc_cfg_valid"><var class="Va">dev.irdma<interface_number>.dcqcn_cc_cfg_valid</var></dt>
<dd>indicates that all DCQCN parameters are valid and should be updated in
registers or QP context.
<p class="Pp" id="dcqcn_min_dec_factor">Setting this parameter to 1 means
that settings in
<a class="permalink" href="#dcqcn_min_dec_factor"><i class="Em">dcqcn_min_dec_factor</i></a>,
<a class="permalink" href="#dcqcn_min_rate_MBps"><i class="Em" id="dcqcn_min_rate_MBps">dcqcn_min_rate_MBps</i></a>,
<a class="permalink" href="#dcqcn_F"><i class="Em" id="dcqcn_F">dcqcn_F</i></a>,
<a class="permalink" href="#dcqcn_T"><i class="Em" id="dcqcn_T">dcqcn_T</i></a>,
<a class="permalink" href="#dcqcn_B,"><i class="Em" id="dcqcn_B,">dcqcn_B,
dcqcn_rai_factor, dcqcn_hai_factor, dcqcn_rreduce_mperiod</i></a> are
taken into account. Otherwise default values are used.</p>
<p class="Pp">Note: "roce_enable" must also be set for this
tunable to take effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_min_dec_factor"><var class="Va">dev.irdma<interface_number>.dcqcn_min_dec_factor</var></dt>
<dd>The minimum factor by which the current transmit rate can be changed when
processing a CNP. Value is given as a percentage (1-100).
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_min_rate_MBps"><var class="Va">dev.irdma<interface_number>.dcqcn_min_rate_MBps</var></dt>
<dd>The minimum value, in Mbits per second, to which the rate can be limited.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_F"><var class="Va">dev.irdma<interface_number>.dcqcn_F</var></dt>
<dd>The number of times to stay in each stage of bandwidth recovery.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_T"><var class="Va">dev.irdma<interface_number>.dcqcn_T</var></dt>
<dd>The number of microseconds that should elapse before increasing the CWND
in DCQCN mode.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_B"><var class="Va">dev.irdma<interface_number>.dcqcn_B</var></dt>
<dd>The number of bytes to transmit before updating CWND in DCQCN mode.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_rai_factor"><var class="Va">dev.irdma<interface_number>.dcqcn_rai_factor</var></dt>
<dd>The number of MSS to add to the congestion window in additive increase
mode.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_hai_factor"><var class="Va">dev.irdma<interface_number>.dcqcn_hai_factor</var></dt>
<dd>The number of MSS to add to the congestion window in hyperactive increase
mode.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_rreduce_mperiod"><var class="Va">dev.irdma<interface_number>.dcqcn_rreduce_mperiod</var></dt>
<dd>The minimum time between 2 consecutive rate reductions for a single flow.
Rate reduction will occur only if a CNP is received during the relevant
time interval.
<p class="Pp">Note: "roce_enable" and
"dcqcn_cc_cfg_valid" must also be set for this tunable to take
effect.</p>
</dd>
</dl>
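<p class="Pp">As a minimal sketch of how these tunables combine, the following
shell function prints <a class="Xr">loader.conf(5)</a> lines that enable
RoCEv2, mark the DCQCN settings as valid, and set
<var class="Va">dcqcn_min_dec_factor</var>. The function name
irdma_dcqcn_tunables and the value 50 used in the usage comment are
illustrative assumptions, not recommended settings.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of the driver): print loader.conf lines for
# one irdma interface. Only the 1-100 range of dcqcn_min_dec_factor comes
# from this manual page; the function name and arguments are assumptions.
irdma_dcqcn_tunables() {
    ifnum=$1      # interface number, e.g. 0 for dev.irdma0
    min_dec=$2    # dcqcn_min_dec_factor, a percentage (1-100)

    # Both arguments must be plain non-negative integers.
    case $ifnum in
        ''|*[!0-9]*) echo "interface number must be an integer" >&2; return 1 ;;
    esac
    case $min_dec in
        ''|*[!0-9]*) echo "dcqcn_min_dec_factor must be an integer" >&2; return 1 ;;
    esac
    if [ "$min_dec" -lt 1 ] || [ "$min_dec" -gt 100 ]; then
        echo "dcqcn_min_dec_factor must be in the range 1-100" >&2
        return 1
    fi

    # roce_enable and dcqcn_cc_cfg_valid must both be set for the DCQCN
    # parameter tunables to take effect.
    printf 'dev.irdma%s.roce_enable=1\n' "$ifnum"
    printf 'dev.irdma%s.dcqcn_cc_cfg_valid=1\n' "$ifnum"
    printf 'dev.irdma%s.dcqcn_min_dec_factor=%s\n' "$ifnum" "$min_dec"
}

# Example: irdma_dcqcn_tunables 0 50
```

<p class="Pp">The printed lines can be reviewed and then added to
<span class="Pa">/boot/loader.conf</span>; as with any loader tunable, they
are read at the next boot.</p>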
</section>
<section class="Ss">
<h2 class="Ss" id="SYSCTL_PROCEDURES"><a class="permalink" href="#SYSCTL_PROCEDURES">SYSCTL
PROCEDURES</a></h2>
<p class="Pp">Sysctl controls are available for runtime adjustments.</p>
<dl class="Bl-tag">
<dt id="dev.irdma_interface_number_.debug"><var class="Va">dev.irdma<interface_number>.debug</var></dt>
<dd>defines the level of debug messages.
<p class="Pp">Typical value: 1 for errors only, 0x7fffffff for full
debug.</p>
</dd>
<dt id="dev.irdma_interface_number_.dcqcn_enable"><var class="Va">dev.irdma<interface_number>.dcqcn_enable</var></dt>
<dd>enables the DCQCN algorithm for RoCEv2.
<p class="Pp">Note: "roce_enable" must also be set for this sysctl
to take effect.</p>
<p class="Pp">Note: The value may be changed at any time, but the change is
applied only to newly created QPs.</p>
</dd>
</dl>
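<p class="Pp">The debug levels described above can be turned into a
<a class="Xr">sysctl(8)</a> command line as sketched below. The function name
irdma_debug_cmd and the keywords "errors" and "full" are assumptions made for
illustration; only the values 1 and 0x7fffffff come from this page. The
function prints the command instead of running it, so it can be inspected
first.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of the driver): print the sysctl command
# that selects a debug level for one irdma interface.
irdma_debug_cmd() {
    ifnum=$1    # interface number, e.g. 0 for dev.irdma0
    case $2 in
        errors) level=1 ;;            # errors only
        full)   level=0x7fffffff ;;   # full debug
        *)
            echo "usage: irdma_debug_cmd <interface_number> errors|full" >&2
            return 1
            ;;
    esac
    printf 'sysctl dev.irdma%s.debug=%s\n' "$ifnum" "$level"
}

# Example: irdma_debug_cmd 0 full
```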
</section>
<section class="Ss">
<h2 class="Ss" id="TESTING"><a class="permalink" href="#TESTING">TESTING</a></h2>
<ol class="Bl-enum">
<li>To load the irdma driver, run:
<div class="Bd Pp Bd-indent Li">
<pre>kldload irdma</pre>
</div>
If if_ice is not already loaded, the system will load it automatically. If
the irdma driver does not load, check whether the value of sysctl
<var class="Va">hw.ice.irdma</var> is 1. To change the value, put:
<div class="Bd Pp Bd-indent Li">
<pre>hw.ice.irdma=1</pre>
</div>
in <span class="Pa">/boot/loader.conf</span> and reboot.</li>
<li>To check that the driver was loaded, run:
<div class="Bd Pp Bd-indent Li">
<pre>sysctl -a | grep infiniband</pre>
</div>
Typically, if everything goes well, around 190 entries per PF will
appear.</li>
<li>Each interface of the card may work in either iWARP or RoCEv2 mode. To
enable RoCEv2 compatibility, add:
<div class="Bd Pp Bd-indent Li">
<pre>dev.irdma<interface_number>.roce_enable=1</pre>
</div>
where <interface_number> is the number of the ice interface on which the
RoCEv2 protocol should be enabled, to
<span class="Pa">/boot/loader.conf</span>. For instance, the lines:
<dl class="Bl-tag">
<dt>dev.irdma0.roce_enable=0</dt>
<dd style="width: auto;"> </dd>
<dt>dev.irdma1.roce_enable=1</dt>
<dd style="width: auto;"> </dd>
</dl>
will set iWARP mode on interface ice0 and RoCEv2 mode on interface ice1.
RoCEv2 mode is the default.
<p class="Pp">To check irdma roce_enable status, run:</p>
<div class="Bd Pp Bd-indent Li">
<pre>sysctl dev.irdma<interface_number>.roce_enable</pre>
</div>
for instance:
<div class="Bd Pp Bd-indent Li">
<pre>sysctl dev.irdma2.roce_enable</pre>
</div>
A returned value of '0' indicates iWARP mode, and a value of '1' indicates
RoCEv2 mode.
<p class="Pp">Note: An interface configured in one mode will not be able to
connect to a node configured in another mode.</p>
<p class="Pp">Note: RoCEv2 currently has limited support, for functional
testing only. DCB and Priority Flow Control (PFC) are not currently
supported, which may lead to significant performance loss or connectivity
issues.</p>
</li>
<li>Enable flow control in the ice driver:
<div class="Bd Pp Bd-indent Li">
<pre>sysctl dev.ice.<interface_number>.fc=3</pre>
</div>
Enable flow control on the switch your system is connected to. See your
switch documentation for details.</li>
<li>The source code for the krping software is provided with the kernel in
/usr/src/sys/contrib/rdma/krping/. To compile the software, change
directory to /usr/src/sys/modules/rdma/krping/ and run the following:
<dl class="Bl-tag">
<dt>make clean</dt>
<dd style="width: auto;"> </dd>
<dt>make</dt>
<dd style="width: auto;"> </dd>
<dt>make install</dt>
<dd style="width: auto;"> </dd>
<dt>kldload krping</dt>
<dd style="width: auto;"> </dd>
</dl>
</li>
<li>Start a krping server on one machine:
<div class="Bd Pp Bd-indent Li">
<pre>echo size=64,count=1,port=6601,addr=100.0.0.189,server > /dev/krping</pre>
</div>
</li>
<li>Connect a client from another machine:
<div class="Bd Pp Bd-indent Li">
<pre>echo size=64,count=1,port=6601,addr=100.0.0.189,client > /dev/krping</pre>
</div>
</li>
</ol>
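<p class="Pp">The krping server and client commands in the last two steps
differ only in their final keyword. A minimal sketch of a helper that builds
the argument string is shown below; the function name krping_args and its
argument order are assumptions, and the defaults simply repeat the values
used in the steps above.</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of krping): build the comma-separated
# argument string that is written to /dev/krping.
krping_args() {
    role=$1             # "server" or "client"
    addr=$2             # address of the krping server
    port=${3:-6601}     # defaults repeat the example values above
    size=${4:-64}
    count=${5:-1}

    case $role in
        server|client) ;;
        *) echo "role must be server or client" >&2; return 1 ;;
    esac
    if [ -z "$addr" ]; then
        echo "address is required" >&2
        return 1
    fi
    printf 'size=%s,count=%s,port=%s,addr=%s,%s\n' \
        "$size" "$count" "$port" "$addr" "$role"
}

# With the krping module loaded, the string would be written like this:
#   krping_args server 100.0.0.189 > /dev/krping    (on the server)
#   krping_args client 100.0.0.189 > /dev/krping    (on the client)
```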
</section>
</section>
<section class="Sh">
<h1 class="Sh" id="SUPPORT"><a class="permalink" href="#SUPPORT">SUPPORT</a></h1>
<p class="Pp">For general information and support, go to the Intel support
website at:
<a class="Lk" href="http://support.intel.com/">http://support.intel.com/</a>.</p>
<p class="Pp">If an issue is identified with this driver with a supported
adapter, email all the specific information related to the issue to
<a class="Mt" href="mailto:freebsd@intel.com">freebsd@intel.com</a>.</p>
</section>
<section class="Sh">
<h1 class="Sh" id="SEE_ALSO"><a class="permalink" href="#SEE_ALSO">SEE
ALSO</a></h1>
<p class="Pp"><a class="Xr">ice(4)</a></p>
</section>
<section class="Sh">
<h1 class="Sh" id="AUTHORS"><a class="permalink" href="#AUTHORS">AUTHORS</a></h1>
<p class="Pp">The <code class="Nm">irdma</code> driver was prepared by
<span class="An">Bartosz Sobczak</span>
<<a class="Mt" href="mailto:bartosz.sobczak@intel.com">bartosz.sobczak@intel.com</a>>.</p>
</section>
</div>
<table class="foot">
<tr>
<td class="foot-date">March 30, 2022</td>
<td class="foot-os">FreeBSD 15.0</td>
</tr>
</table>