sysdig is a container troubleshooting tool that is available in both open source and commercial versions. For routine troubleshooting, the open source version should be enough.
On top of sysdig, you can also use csysdig as a command-line interface and sysdig-inspect as a GUI.
Setup
# on Ubuntu
curl -s https://s3.amazonaws.com/download.draios.com/DRAIOS-GPG-KEY.public | apt-key add -
curl -s -o /etc/apt/sources.list.d/draios.list http://download.draios.com/stable/deb/draios.list
apt-get update
apt-get -y install linux-headers-$(uname -r)
apt-get -y install sysdig
# on RHEL
rpm --import https://s3.amazonaws.com/download.draios.com/DRAIOS-GPG-KEY.public
curl -s -o /etc/yum.repos.d/draios.repo http://download.draios.com/stable/rpm/draios.repo
rpm -i http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-8.noarch.rpm
yum -y install kernel-devel-$(uname -r)
yum -y install sysdig
# on MacOS
brew install sysdig
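The three install recipes above differ only by platform, so a wrapper script can decide which one applies before running anything. The following is a minimal sketch, assuming a Linux host exposes `/etc/os-release`; the function name `sysdig_install_family` and its optional arguments are hypothetical, and the helper only prints the package family rather than installing sysdig.

```shell
#!/bin/sh
# Hypothetical helper: report which install recipe (deb, rpm, macos) applies.
# $1: path to an os-release file (defaults to /etc/os-release)
# $2: kernel name (defaults to the output of `uname -s`)
sysdig_install_family() {
    os_release="${1:-/etc/os-release}"
    kernel="${2:-$(uname -s)}"

    # macOS has no os-release file; identify it by kernel name.
    if [ "$kernel" = "Darwin" ]; then
        echo "macos"
        return 0
    fi

    if [ ! -r "$os_release" ]; then
        echo "unknown"
        return 1
    fi

    # Source the file in a subshell so the caller's variables stay untouched.
    ids=$(. "$os_release" 2>/dev/null; echo "${ID:-} ${ID_LIKE:-}")

    case " $ids " in
        *" ubuntu "*|*" debian "*)            echo "deb" ;;
        *" rhel "*|*" centos "*|*" fedora "*) echo "rpm" ;;
        *)                                    echo "unknown"; return 1 ;;
    esac
}
```

A caller could branch on the result, for example running the Ubuntu commands when `sysdig_install_family` prints `deb`.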
Weave Scope is another container monitoring and troubleshooting tool. Compared to sysdig, it provides a simple GUI which describes the entire topology of the cluster.
Weave Scope consists of two components: the app and the probe. The components are deployed as a single Docker container using the scope script. The probe is responsible for gathering information about the host on which it is running. This information is sent to the app in the form of a report. The app processes reports from the probe into usable topologies, serves the UI, and pushes these topologies to the UI.
Setup
Access GUI
Known Issues
When running Scope on Ubuntu with kernel 4.4.0 and the --probe.ebpf.connections option (enabled by default), the node may panic because of a kernel issue.
To fix this issue, you could either
Disable eBPF connections, e.g. --probe.ebpf.connections=false
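The workaround above can be applied conditionally, so that eBPF connection tracking stays enabled on kernels the known issue does not mention. The following is a minimal sketch, assuming the affected kernels are exactly those whose release string is `4.4.0` or starts with `4.4.0-` (the only version named here); the function name `scope_probe_flags` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the extra probe flag for the kernel named in the known issue.
# $1: kernel release string (defaults to the output of `uname -r`)
scope_probe_flags() {
    release="${1:-$(uname -r)}"
    case "$release" in
        # Assumption: only 4.4.0 kernels need the workaround.
        4.4.0|4.4.0-*) echo "--probe.ebpf.connections=false" ;;
        *)             echo "" ;;
    esac
}

# Example use (not executed here):
#   scope launch $(scope_probe_flags)
```

On any other kernel the function prints nothing, so the launch command runs with its defaults.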
# sysdig usage examples; refer to https://www.sysdig.org/wiki/sysdig-examples/ for more.
# View the top network connections
sudo sysdig -pc -c topconns
# View the top network connections inside the wordpress1 container
sudo sysdig -pc -c topconns container.name=wordpress1
# Show the network data exchanged with the host 192.168.0.1
sudo sysdig fd.ip=192.168.0.1
sudo sysdig -s2000 -A -c echo_fds fd.cip=192.168.0.1
# List all the incoming connections that are not served by apache.
sudo sysdig -p"%proc.name %fd.name" "evt.type=accept and proc.name!=httpd"
# View the CPU/Network/IO usage of the processes running inside the container.
sudo sysdig -pc -c topprocs_cpu container.id=2e854c4525b8
sudo sysdig -pc -c topprocs_net container.id=2e854c4525b8
sudo sysdig -pc -c topfiles_bytes container.id=2e854c4525b8
# See the files where apache spends the most time doing I/O
sudo sysdig -c topfiles_time proc.name=httpd
# Show all the interactive commands executed inside a given container.
sudo sysdig -pc -c spy_users container.name=wordpress1
# Show every time a file is opened under /etc.
sudo sysdig "evt.type=open and fd.name contains /etc"
# View the list of processes with container context
sudo csysdig -pc
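The three per-container commands in the list above (CPU, network, and file I/O) differ only in the chisel name, so they can be generated for any container ID. The following is a minimal sketch; the function name `container_top_cmds` is hypothetical, and it prints the commands instead of running them, since each one runs until interrupted.

```shell
#!/bin/sh
# Hypothetical helper: print the per-container resource commands for one container.
# $1: container ID, e.g. as shown by `docker ps`
container_top_cmds() {
    if [ -z "$1" ]; then
        echo "usage: container_top_cmds <container-id>" >&2
        return 1
    fi
    # Same chisels as in the list above: CPU, network, and file bytes.
    for chisel in topprocs_cpu topprocs_net topfiles_bytes; do
        echo "sudo sysdig -pc -c $chisel container.id=$1"
    done
}
```

For example, `container_top_cmds 2e854c4525b8` prints the same three commands shown in the list, ready to be copied into separate terminals.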