Signal handling in applications, containers & pods
06 May, 2020
When shutting down an application, a container or a pod, we want it to happen smoothly: all in-flight requests should be completed and resources should be properly released. For this to happen, the signal needs to be propagated from the source (pod, container, OS) to the application.
In this article, I will discuss how signals are handled by applications, containers and pods and how this is affected by some implementation details.
Signals: Inter-process communication
A signal is a form of inter-process communication used in POSIX-compliant operating systems. It is an asynchronous message delivered by the kernel to a process in order to interrupt its normal execution. Upon delivery of the signal, the registered signal handler is executed.
The kill -l command displays all the signals supported on a system.
Each one has a numeric value and a default action.
Below are the best-known ones, or at least the ones we are interested in for this post:
|Signal|Value|Action|Comment|
|SIGINT|2|Term|Interrupt from keyboard|
|SIGQUIT|3|Core|Quit from keyboard|
|SIGKILL|9|Term|Kill signal|
|SIGTERM|15|Term|Termination signal|
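As a quick check of the numbers discussed in this post, the POSIX form kill -l <number> translates a signal number into its name. This is only a small sketch; the loop and variable names are mine.

```shell
# Translate signal numbers to names with `kill -l <number>`.
# The name is printed without the SIG prefix, so we add it back.
for number in 2 3 9 15; do
  name=$(kill -l "$number")
  echo "$number -> SIG$name"
done
# 2 -> SIGINT
# 3 -> SIGQUIT
# 9 -> SIGKILL
# 15 -> SIGTERM
```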
Signals from command line to process
There are keyboard combinations that generate the first two signals presented in the table above.
With the stty command we can find out which combinations correspond to which signals.
Ctrl+C sends a SIGINT signal to the process running in the terminal, and Ctrl+\ sends a SIGQUIT.
We can send signals to any running process by issuing the kill command. The application can determine how it will act once it receives a SIGTERM: it can clean up resources or just ignore it. On the contrary, when a SIGKILL is issued, the process cannot ignore it.
To be more precise, the process is never even notified of a SIGKILL. The signal cannot be caught, blocked or ignored; the kernel handles it itself and force-stops the process with the corresponding PID. If the process is in an uninterruptible wait, typically on network or disk I/O, the kernel will not be able to kill it until that wait ends. Additionally, the kill command has no effect on zombie processes, since they have already terminated.
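The difference between the two signals can be seen with a small sketch. Here a child shell installs a handler with trap and then signals itself; the handler text and variable names are illustrative only.

```shell
# SIGTERM can be caught: the handler runs and the process exits on its own terms.
term_output=$(sh -c '
  trap "echo \"SIGTERM caught, releasing resources\"; exit 0" TERM
  kill -TERM $$   # the child shell signals itself
  echo "never printed"
')
echo "$term_output"

# SIGKILL cannot be caught: the handler never runs and nothing is printed.
kill_status=0
kill_output=$(sh -c '
  trap "echo \"handler ran\"; exit 0" TERM INT
  kill -KILL $$
  echo "never printed"
') || kill_status=$?
echo "output: '$kill_output', exit status: $kill_status"
```

The exit status of the second child is 137, that is 128 plus the signal number 9, which is how a shell reports a process that was killed by SIGKILL.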
Signals from K8s to Docker containers
When Kubernetes wants to delete a pod, it first sends SIGTERM to all the running containers in the pod. It then waits for a number of seconds, known as the termination grace period, before forcefully stopping them with SIGKILL. It is important to mention that when a pod has more than one container running, all containers are signaled at the same time.
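The "SIGTERM, wait, then SIGKILL" sequence can be approximated in plain shell. This is only a sketch of the idea, not how Kubernetes or Docker implement it; stop_with_grace is a hypothetical helper, and the grace periods here are a few seconds so the example finishes quickly.

```shell
# stop_with_grace PID SECONDS (hypothetical helper, not a Kubernetes or Docker API)
# Sends SIGTERM, polls for up to SECONDS, then falls back to SIGKILL.
# Sets the variable `outcome` to "terminated" or "killed".
stop_with_grace() {
  target=$1
  grace=$2
  waited=0
  kill -TERM "$target" 2>/dev/null
  # kill -0 sends no signal; it only checks that the process still exists.
  while kill -0 "$target" 2>/dev/null && [ "$waited" -lt "$grace" ]; do
    sleep 1
    waited=$((waited + 1))
  done
  if kill -0 "$target" 2>/dev/null; then
    kill -KILL "$target"
    outcome=killed
  else
    outcome=terminated
  fi
}

# A well-behaved process: sleep exits on SIGTERM (the default action).
sleep 30 &
stop_with_grace "$!" 3
echo "well-behaved process: $outcome"

# A stubborn process: SIGTERM is ignored, and a child inherits the ignored signal.
trap '' TERM
sleep 30 &
stubborn=$!
trap - TERM
stop_with_grace "$stubborn" 2
echo "stubborn process: $outcome"
```

The first process ends during the grace period, while the second one survives SIGTERM and is only stopped by the SIGKILL fallback.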
Signals from Docker to processes
There are two main docker commands that can be used for stopping a running container.
When the docker stop command is executed, Docker will politely ask the process to terminate by issuing a SIGTERM signal. If the application does not terminate within the specified period, it will be SIGKILL-ed. The SIGTERM signal is sent to the root process, PID 1, of the container.
docker kill does not give the container's application any grace period to shut down. It sends a SIGKILL signal, which cannot be ignored, as we saw earlier.
So, when the cluster management tool, in our case Kubernetes, runs docker stop, Docker will send a SIGTERM to the root process of the container. Although this seems like a no-brainer, there are a couple of snags in how an application has to be started in a container in order to be the root process.
CMD & ENTRYPOINT
There are two Dockerfile instructions with which we can start an application in a container:
CMD defines the default command and parameters for a container. It can be overridden from the command line when the container is started.
ENTRYPOINT configures a container to run as an executable.
Both instructions can be specified in the shell form:
CMD java -jar /path/to/jar.jar
ENTRYPOINT java -jar /path/to/jar.jar
or exec form:
CMD ["java", "-jar", "/path/to/jar.jar" ]
ENTRYPOINT ["java", "-jar", "/path/to/jar.jar"]
What matters here is what each form implies and how the running container differs as a result.
Shell vs Exec in docker
In the shell form, all environment variables will be evaluated. The provided command is run within a shell by prepending /bin/sh -c to it. On the other hand, in the exec form there is no shell processing involved and the defined executable is called directly.
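The variable-evaluation difference can be reproduced outside Docker. In this sketch, /bin/sh -c stands in for the shell form and env, which runs a program directly without a shell, stands in for the exec form. GREETING is a made-up variable for illustration.

```shell
GREETING=hello
export GREETING

# Shell form equivalent: /bin/sh -c expands the variable before running echo.
shell_form=$(/bin/sh -c 'echo $GREETING')

# Exec form equivalent: echo is started directly and receives the literal text.
exec_form=$(env echo '$GREETING')

echo "shell form prints: $shell_form"
echo "exec form prints:  $exec_form"
```

The first command prints hello, while the second prints the unexpanded text $GREETING, because no shell ever looked at it.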
To see the differences in practice, I created two Docker images of a Spring Boot application.
In the first one, I start the application with the shell form of CMD:
root@88e5d287cc24:/# ps -aux
USER  PID %CPU %MEM     VSZ    RSS TTY    STAT START  TIME COMMAND
root    1  0.5  0.0    2384    728 ?      Ss   19:14  0:00 /bin/sh -c java -jar whatAmI.jar com.protopapa.experimentspring.ExperimentSpringApplication
root    6 25.8  2.5 5727008 204328 ?      Sl   19:14  0:10 java -jar whatAmI.jar com.protopapa.experimentspring.ExperimentSpringApplication
root   44  5.7  0.0    5748   3460 pts/0  Ss   19:14  0:00 bash
root   49  0.0  0.0    9388   2964 pts/0  R+   19:14  0:00 ps -aux
In the second one, I start the application with the exec form of CMD:
root@d1cfa8856d39:/# ps -aux
USER  PID %CPU %MEM     VSZ    RSS TTY    STAT START  TIME COMMAND
root    1 63.0  2.6 5727008 213132 ?      Ssl  19:16  0:10 java -jar whatAmI.jar com.protopapa.experimentspring.ExperimentSpringApplication
root   43  8.3  0.0    5748   3504 pts/0  Ss   19:16  0:00 bash
root   48  0.0  0.0    9388   2940 pts/0  R+   19:16  0:00 ps -aux
When we start an application using the exec form, we notice that the application is the root process, while in the container with the shell form, the process with PID 1 is the shell.
As we saw before, when a container or a pod terminates, the SIGTERM signal is sent only to the PID 1 process. This means that when we start our application with the shell form, the signal goes to the shell, which does not pass it on, so the application will not receive it and will not shut down gracefully.
We get the same result as with the shell form when a bash script is used to start our application. To overcome that, we just need to tell the shell to replace itself with the application using the exec shell command, which replaces the current program in the current process without forking a new one.
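The effect of exec can be observed without a container by printing process IDs. In this sketch an inner sh -c plays the role of the application; in a real wrapper script the last line would be something like exec java -jar /path/to/jar.jar instead.

```shell
# Without exec: the wrapper shell forks, so the "application" gets a new PID.
# The trailing ":" keeps the shell from optimizing the fork away.
forked=$(sh -c 'echo "$$"; sh -c "echo \$\$"; :')

# With exec: the wrapper shell is replaced, so the "application" keeps its PID.
replaced=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

echo "without exec (wrapper PID, application PID):"
echo "$forked"
echo "with exec (wrapper PID, application PID):"
echo "$replaced"
```

With exec both lines show the same PID, which inside a container means the application itself is PID 1 and is the process that receives the SIGTERM.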
Things to note
Spring Boot applications register a shutdown hook with the JVM to ensure that the ApplicationContext closes gracefully on exit. But despite the prevalence of SIGTERM among process managers, many applications expect to be stopped gracefully by other signals. For example, an nginx server shuts down gracefully on the SIGQUIT signal.