c - When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

Question

Welcome To Ask or Share your Answers For Others

c - When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

c - When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

Whenever I look at real code or example socket code in books, man pages and websites, I almost always see something like:

struct sockaddr_in foo;
memset(&foo, 0, sizeof foo); 
/* or bzero(), which POSIX marks as LEGACY, and is not in standard C */
foo.sin_port = htons(42);

instead of:

struct sockaddr_in foo = { 0 }; 
/* if at least one member is initialized, all others are set to
   zero (as though they had static storage duration) as per 
   ISO/IEC 9899:1999 6.7.8 Initialization */ 
foo.sin_port = htons(42);

or:

struct sockaddr_in foo = { .sin_port = htons(42) }; /* New in C99 */

or:

static struct sockaddr_in foo; 
/* static storage duration will also behave as if 
   all members are explicitly assigned 0 */
foo.sin_port = htons(42);

The same can also be found for setting struct addrinfo hints to zero before passing it to getaddrinfo, for example.

Why is this? As far as I understand, the examples that do not use memset are likely to be the equivalent to the one that does, if not better. I realize that there are differences:

memset will set all bits to zero, which is not necessarily the correct bit representation for setting each member to 0.
memset will also set padding bits to zero.

Are either of these differences relevant or required behavior when setting these structs to zero and therefore using an initializer instead is wrong? If so, why, and which standard or other source verifies this?

If both are correct, why does memset/bzero tend to appear instead of an initializer? Is it just a matter of style? If so, that's fine, I don't think we need a subjective answer on which is better style.

The usual practice is to use an initializer in preference to memset precisely because all bits zero is not usually desired and instead we want the correct representation of zero for the type(s). Is the opposite true for these socket related structs?

In my research I found that POSIX only seems to require sockaddr_in6 (and not sockaddr_in) to be zeroed at http://www.opengroup.org/onlinepubs/000095399/basedefs/netinet/in.h.html but makes no mention of how it should be zeroed (memset or initializer?). I realise BSD sockets predate POSIX and it is not the only standard, so are their compatibility considerations for legacy systems or modern non-POSIX systems?

Personally, I prefer from a style (and perhaps good practice) point of view to use an initializer and avoid memset entirely, but I am reluctant because:

Other source code and semi-canonical texts like UNIX Network Programming use bzero (eg. page 101 on 2nd ed. and page 124 in 3rd ed. (I own both)).
I am well aware that they are not identical, for reasons stated above.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:58:47+0000

One problem with the partial initializers approach (that is '{ 0 }') is that GCC will warn you that the initializer is incomplete (if the warning level is high enough; I usually use '-Wall' and often '-Wextra'). With the designated initializer approach, that warning should not be given, but C99 is still not widely used - though these parts are fairly widely available, except, perhaps, in the world of Microsoft.

I tend used to favour an approach:

static const struct sockaddr_in zero_sockaddr_in;

Followed by:

struct sockaddr_in foo = zero_sockaddr_in;

The omission of the initializer in the static constant means everything is zero - but the compiler won't witter (shouldn't witter). The assignment uses the compiler's innate memory copy which won't be slower than a function call unless the compiler is seriously deficient.

GCC has changed over time

GCC versions 4.4.2 to 4.6.0 generate different warnings from GCC 4.7.1. Specifically, GCC 4.7.1 recognizes the = { 0 } initializer as a 'special case' and doesn't complain, whereas GCC 4.6.0 etc did complain.

Consider file init.c:

struct xyz
{
    int x;
    int y;
    int z;
};

struct xyz xyz0;                // No explicit initializer; no warning
struct xyz xyz1 = { 0 };        // Shorthand, recognized by 4.7.1 but not 4.6.0
struct xyz xyz2 = { 0, 0 };     // Missing an initializer; always a warning
struct xyz xyz3 = { 0, 0, 0 };  // Fully initialized; no warning

When compiled with GCC 4.4.2 (on Mac OS X), the warnings are:

$ /usr/gcc/v4.4.2/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9: warning: missing initializer
init.c:9: warning: (near initialization for ‘xyz1.y’)
init.c:10: warning: missing initializer
init.c:10: warning: (near initialization for ‘xyz2.z’)
$

When compiled with GCC 4.5.1, the warnings are:

$ /usr/gcc/v4.5.1/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:8: warning: missing initializer
init.c:9:8: warning: (near initialization for ‘xyz1.y’)
init.c:10:8: warning: missing initializer
init.c:10:8: warning: (near initialization for ‘xyz2.z’)
$

When compiled with GCC 4.6.0, the warnings are:

$ /usr/gcc/v4.6.0/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:9:8: warning: (near initialization for ‘xyz1.y’) [-Wmissing-field-initializers]
init.c:10:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:10:8: warning: (near initialization for ‘xyz2.z’) [-Wmissing-field-initializers]
$

When compiled with GCC 4.7.1, the warnings are:

$ /usr/gcc/v4.7.1/bin/gcc -O3 -g -std=c99 -Wall -Wextra  -c init.c
init.c:10:8: warning: missing initializer [-Wmissing-field-initializers]
init.c:10:8: warning: (near initialization for ‘xyz2.z’) [-Wmissing-field-initializers]
$

The compilers above were compiled by me. The Apple-provided compilers are nominally GCC 4.2.1 and Clang:

$ /usr/bin/clang -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9:23: warning: missing field 'y' initializer [-Wmissing-field-initializers]
struct xyz xyz1 = { 0 };
                      ^
init.c:10:26: warning: missing field 'z' initializer [-Wmissing-field-initializers]
struct xyz xyz2 = { 0, 0 };
                         ^
2 warnings generated.
$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin11.4.2
Thread model: posix
$ /usr/bin/gcc -O3 -g -std=c99 -Wall -Wextra -c init.c
init.c:9: warning: missing initializer
init.c:9: warning: (near initialization for ‘xyz1.y’)
init.c:10: warning: missing initializer
init.c:10: warning: (near initialization for ‘xyz2.z’)
$ /usr/bin/gcc --version
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$

As noted by SecurityMatt in a comment below, the advantage of memset() over copying a structure from memory is that the copy from memory is more expensive, requiring access to two memory locations (source and destination) instead of just one. By comparison, setting the values to zeroes doesn't have to access the memory for source, and on modern systems, the memory is a bottleneck. So, memset() coding should be faster than copy for simple initializers (where the same value, normally all zero bytes, is being placed in the target memory). If the initializers are a complex mix of values (not all zero bytes), then the balance may be changed in favour of an initializer, for notational compactness and reliability if nothing else.

There isn't a single cut and dried answer...there probably never was, and there isn't now. I still tend to use initializers, but memset() is often a valid alternative.

Categories

c - When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

c - When zeroing a struct such as sockaddr_in, sockaddr_in6 and addrinfo before use, which is correct: memset, an initializer or either?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

GCC has changed over time

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags