linux syscalls learning: setsockopt gotcha!!

Today's learning.

I'm too lazy to put up a proper blog post, so I'll put my initial thoughts down here. Today's post is about how I decided to be an ape and write some socket programs by hand, and ended up relentlessly browsing man pages. I do agree that after the LLM boom, looking things up in manuals and writing code by hand is a pain. I don't wanna go back to it for any practical purposes, but when I'm learning it makes so much sense to design and type the code out by hand. It helps the lesson stick better. I've also been learning and progressing on my understanding of OS internals, and this has been a good exercise for me: I understand something when I read the manual, then I type the code, and then I look at the manual again when I want to do something else. Anyway, let me get to the complaint that Linux has been strange.

WTF LINUX!!!

This might not be Linux's fault at all, idk. However, since I'm on Linux, I'm blaming Linux. I'll be exploring BSD in the next few months, since I'm interested in the kqueue implementation they have going, but that's a talk for some other day. Let me get to the topic at hand.

Anyway, today's complaint is about setsockopt(2), the system call that sets options on your socket. Some classic uses include reusing the port (SO_REUSEPORT, only applicable for the same uid) or timing out when no data is received from the client. E.g., the usage for a timeout looks like this:

  struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };  // disconnect if no data received within 5s
  if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
    perror("setsockopt(SO_RCVTIMEO)");

  char buf[1024];
  ssize_t n = recv(sockfd, buf, sizeof(buf), 0);
  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
    close(sockfd);  // timed out, so disconnect
  }

This is all cool, and I looked at setsockopt(2) manual and saw this function signature.

  int setsockopt(int sockfd, int level, int optname,
                        const void optval[.optlen],
                        socklen_t optlen);

I see that optval is the (boolean) value and optlen is the length of that optval. Cool. I was fighting a port binding issue: after I closed the program and restarted it, it threw 'bind: port already in use' even though the original program had quit. The kernel takes some time to free the socket up (typically old connections lingering in TIME_WAIT, a common source of frustration in socket programming), so I decided to use the SO_REUSEADDR option. This is what my setsockopt(2) snippet looked like:

  unsigned char yes = 0xff; // use a single byte, not efficient as a bool, but backwards compatibility ugh!
  setsockopt(socketfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

Since I didn't want to waste 4 bytes on an int, I decided to use a char (a.k.a. a byte) and set it to 0xff (255). All bits are 1, so it shouldn't matter whether setsockopt parses it as little endian or big endian; it will work. Bloody hell. I happily ran my program expecting it to start, and instead it gave me this: "setsockopt(SO_REUSEADDR) failed: Invalid argument" 🤮 This error didn't make sense to me. If you've been programming C for a while, you would have automatically assumed that setsockopt looks at optlen and decides how to parse optval, but nope, that's not the case. So I went and checked setsockopt(2) again and found nothing there. Then I checked socket(7) and saw that SO_REUSEADDR says this:

SO_REUSEADDR Indicates that the rules used in validating addresses supplied in a bind(2) call should allow reuse of local addresses. For AF_INET sockets this means that a socket may bind, except when there is an active listening socket bound to the address. When the listening socket is bound to INADDR_ANY with a specific port then it is not possible to bind to this port for any local address. Argument is an integer boolean flag.

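It says that 'argument is an integer boolean flag', so I got curious and decided to browse the source to see where this decision comes from. The boolean being parsed is `valbool` (src1: parsing, src2: reuseaddr), and it is read in through copy_from_sockptr, which is a now deprecated method. As far as I can tell, the value is copied in as a full int, and an optlen smaller than sizeof(int) gets rejected with EINVAL before the flag is even looked at, which would explain the 'Invalid argument'.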
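To see the failure in isolation, here's a rough sketch that tries SO_REUSEADDR twice on a throwaway socket: once with the 1-byte flag from my snippet and once with an int. The helper names (try_reuseaddr, demo_reuseaddr) are made up for this post, and I'm assuming a Linux box; other kernels may be more forgiving about the option length.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a fresh TCP socket and try SO_REUSEADDR with an
 * option buffer of the given length. Returns 0 on success, otherwise the
 * errno reported by socket(2) or setsockopt(2). */
int try_reuseaddr(const void *optval, socklen_t optlen)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    int err = 0;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, optval, optlen) < 0)
        err = errno;  /* save it before close(2) can touch errno */

    close(fd);
    return err;
}

/* Prints the outcome of both attempts; call this from main(). */
void demo_reuseaddr(void)
{
    unsigned char yes_byte = 0xff;  /* the 1-byte flag from the snippet above */
    int yes_int = 1;                /* what socket(7) asks for */

    int err = try_reuseaddr(&yes_byte, sizeof(yes_byte));
    printf("1-byte flag: %s\n", err ? strerror(err) : "ok");

    err = try_reuseaddr(&yes_int, sizeof(yes_int));
    printf("int flag:    %s\n", err ? strerror(err) : "ok");
}
```

On my Linux setup I'd expect the first attempt to come back with the same 'Invalid argument' and the second one to go through.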

I changed my 'char' to 'int' and the program works for now!! Anyway, this was an interesting detour in the pages of man.
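For reference, here's roughly what the int version looks like when wired into a listener. This is a sketch, not my actual program: make_listener is a made-up helper name, the backlog of 16 is arbitrary, and binding to INADDR_ANY is just an assumption for the example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket bound to `port` on all
 * interfaces, or -1 on error with errno set by the call that failed. */
int make_listener(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;  /* socket(7): "Argument is an integer boolean flag" */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0)
        goto fail;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* SO_REUSEADDR is set before bind(2), since bind is where the address
     * reuse rules get checked. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    if (listen(fd, 16) < 0)
        goto fail;
    return fd;

fail: {
        int saved = errno;  /* keep the original error across close(2) */
        close(fd);
        errno = saved;
    }
    return -1;
}
```

Passing port 0 lets the kernel pick a free port, which is handy for poking at this without fighting over a fixed port number.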

See you soon Hackers, Happy hacking!!