
Commit ebaf39e

jiriwiesner authored and davem330 committed
ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
The *_frag_reasm() functions are susceptible to miscalculating the byte count of packet fragments in case the truesize of a head buffer changes. The truesize member may be changed by the call to skb_unclone(), leaving the fragment memory limit counter unbalanced even if all fragments are processed. This miscalculation goes unnoticed as long as the network namespace which holds the counter is not destroyed.

Should an attempt be made to destroy a network namespace that holds an unbalanced fragment memory limit counter the cleanup of the namespace never finishes. The thread handling the cleanup gets stuck in inet_frags_exit_net() waiting for the percpu counter to reach zero. The thread is usually in running state with a stacktrace similar to:

 PID: 1073   TASK: ffff880626711440   CPU: 1   COMMAND: "kworker/u48:4"
  #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480
  #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b
  #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c
  #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856
  #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0
 #10 [ffff880621563e38] process_one_work at ffffffff81096f14

It is not possible to create new network namespaces, and processes that call unshare() end up being stuck in uninterruptible sleep state waiting to acquire the net_mutex.

The bug was observed in the IPv6 netfilter code by Per Sundstrom. I thank him for his analysis of the problem. The parts of this patch that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.

Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Reported-by: Per Sundstrom <per.sundstrom@redqube.se>
Acked-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
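The accounting problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `fake_skb`, `fake_frags`, `fake_unclone()` and `simulate_reassembly()` are hypothetical stand-ins, not kernel code, and the 128-byte change in truesize is an arbitrary value chosen for illustration. The model charges each fragment at the truesize it had when queued and uncharges it at its truesize after the unclone step, which is the mismatch the patch compensates for.

```c
/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct fake_skb {
	unsigned int truesize;	/* bytes accounted for this buffer */
	int cloned;		/* non-zero if the buffer shares its data */
};

struct fake_frags {
	long mem;		/* models the fragment memory limit counter */
};

static void add_mem(struct fake_frags *f, long val) { f->mem += val; }
static void sub_mem(struct fake_frags *f, long val) { f->mem -= val; }

/* Models the effect attributed to skb_unclone() in the commit message:
 * the head's truesize may differ afterwards. The 128-byte growth is an
 * arbitrary assumption made for this sketch. */
static int fake_unclone(struct fake_skb *skb)
{
	if (skb->cloned) {
		skb->truesize += 128;
		skb->cloned = 0;
	}
	return 0;
}

/* Returns the counter value left over once all fragments are processed.
 * A non-zero result corresponds to the unbalanced counter. */
long simulate_reassembly(int apply_fix)
{
	struct fake_frags frags = { 0 };
	struct fake_skb head = { 768, 1 };	/* cloned head fragment */
	struct fake_skb tail = { 512, 0 };
	long delta;

	/* Queueing: each fragment is charged at its truesize at that time. */
	add_mem(&frags, (long)head.truesize);
	add_mem(&frags, (long)tail.truesize);

	/* Same shape as the patch: measure truesize around the unclone. */
	delta = -(long)head.truesize;
	if (fake_unclone(&head))
		return frags.mem;
	delta += (long)head.truesize;
	if (apply_fix && delta)
		add_mem(&frags, delta);

	/* Reassembly done: uncharge using the current truesize values. */
	sub_mem(&frags, (long)head.truesize);
	sub_mem(&frags, (long)tail.truesize);

	return frags.mem;
}
```

In this model `simulate_reassembly(0)` returns -128, a counter that never returns to zero, while `simulate_reassembly(1)` returns 0 because the delta is added to the counter before the fragments are uncharged.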
1 parent afd0a80 commit ebaf39e


3 files changed: +21 -2 lines changed


net/ipv4/ip_fragment.c

Lines changed: 7 additions & 0 deletions
@@ -515,6 +515,7 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 	struct rb_node *rbn;
 	int len;
 	int ihlen;
+	int delta;
 	int err;
 	u8 ecn;
 
@@ -556,10 +557,16 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 	if (len > 65535)
 		goto out_oversize;
 
+	delta = - head->truesize;
+
 	/* Head of list must not be cloned. */
 	if (skb_unclone(head, GFP_ATOMIC))
 		goto out_nomem;
 
+	delta += head->truesize;
+	if (delta)
+		add_frag_mem_limit(qp->q.net, delta);
+
 	/* If the first fragment is fragmented itself, we split
 	 * it to two chunks: the first with data and paged part
 	 * and the second, holding only fragments. */

net/ipv6/netfilter/nf_conntrack_reasm.c

Lines changed: 7 additions & 1 deletion
@@ -341,7 +341,7 @@ static bool
 nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_device *dev)
 {
 	struct sk_buff *fp, *head = fq->q.fragments;
-	int payload_len;
+	int payload_len, delta;
 	u8 ecn;
 
 	inet_frag_kill(&fq->q);
@@ -363,10 +363,16 @@ nf_ct_frag6_reasm(struct frag_queue *fq, struct sk_buff *prev, struct net_devic
 		return false;
 	}
 
+	delta = - head->truesize;
+
 	/* Head of list must not be cloned. */
 	if (skb_unclone(head, GFP_ATOMIC))
 		return false;
 
+	delta += head->truesize;
+	if (delta)
+		add_frag_mem_limit(fq->q.net, delta);
+
 	/* If the first fragment is fragmented itself, we split
 	 * it to two chunks: the first with data and paged part
 	 * and the second, holding only fragments. */

net/ipv6/reassembly.c

Lines changed: 7 additions & 1 deletion
@@ -281,7 +281,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 {
 	struct net *net = container_of(fq->q.net, struct net, ipv6.frags);
 	struct sk_buff *fp, *head = fq->q.fragments;
-	int payload_len;
+	int payload_len, delta;
 	unsigned int nhoff;
 	int sum_truesize;
 	u8 ecn;
@@ -322,10 +322,16 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
 	if (payload_len > IPV6_MAXPLEN)
 		goto out_oversize;
 
+	delta = - head->truesize;
+
 	/* Head of list must not be cloned. */
 	if (skb_unclone(head, GFP_ATOMIC))
 		goto out_oom;
 
+	delta += head->truesize;
+	if (delta)
+		add_frag_mem_limit(fq->q.net, delta);
+
 	/* If the first fragment is fragmented itself, we split
 	 * it to two chunks: the first with data and paged part
 	 * and the second, holding only fragments. */
