Troubleshooting Common Docker Errors During Project Setup
Every now and then, running docker-compose up -d or .\up.ps1 fails with the following message.
ERROR: traefik Container "06bb2d97202e" is unhealthy.
ERROR: Encountered errors while bringing up the project.
This error can be extremely annoying and, if you don’t approach it efficiently, it can cost you a lot of time. It may appear for no obvious reason, and sometimes the simplest solution is all that is needed.
More often than not, it has to do with whatever was last changed in your project or with the most recent pull.
Solution 1: Perform a Down and Reboot
It’s not a guarantee, but sometimes simply rebooting your system resolves the error. Before you reboot, I highly recommend running .\down.ps1 or docker-compose down so that no containers are still active.
Solution 2: Verify Enough Disk Space Exists
If you have been running Docker for a long time, perhaps on an older version, containers that were not shut down properly when your system shut down or rebooted can eat up disk space fairly quickly. Given the requirements of running an XM Cloud local environment, too little free disk space can produce an error that traefik cannot be started.
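A quick way to check this from PowerShell is sketched below. It assumes Docker keeps its data on the C: drive, which may not match your setup, and the helper name Get-FreeSpaceGB is only illustrative. docker system df and docker system prune are standard Docker CLI commands.

```powershell
# Hypothetical helper: free space on a drive, in GB.
function Get-FreeSpaceGB {
    param([string]$DriveName = 'C')  # assumption: Docker data lives on C:
    $drive = Get-PSDrive -Name $DriveName -PSProvider FileSystem
    [math]::Round($drive.Free / 1GB, 1)
}

Get-FreeSpaceGB -DriveName 'C'

# Show how much space images, containers, volumes, and build cache are using.
docker system df

# Uncomment to remove stopped containers, unused networks, dangling images,
# and build cache. Docker asks for confirmation before deleting anything.
# docker system prune
```

The prune command is left commented out on purpose, so nothing is deleted until you have looked at the numbers.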
Solution 3: Verify a Valid License File
Another possible cause, obvious as it may seem, is that your license file has expired or is no longer in the location provided when the environment was initialized. This will show up in the CM logs. All it takes is for someone to modify the docker-compose files to point to a different location. You might not expect this to be an issue, but it can happen when you work in a larger group of developers.
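As a rough sketch, you could check for the file and the configured location like this. The license path and the cm service name are assumptions, so substitute the values from your own environment. Note that this only confirms the file is present; it does not check whether the license has expired.

```powershell
# Hypothetical path: use the license location given when the environment was initialized.
$licensePath = 'C:\License\license.xml'

if (Test-Path -Path $licensePath -PathType Leaf) {
    Write-Host "License file found at $licensePath"
} else {
    Write-Warning "No license file at $licensePath"
}

# Print the resolved compose configuration and show every line that mentions a license.
docker-compose config | Select-String -Pattern 'license'

# Tail the CM logs. 'cm' is an assumed service name; check your compose file.
docker-compose logs --tail=100 cm
```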
Solution 4: Ensure Conflicting Processes Are Shut Down
Conflicting processes include things like SQL Server, IIS, or a Solr instance. Any of these can conflict with your containers and needs to be shut down before you run docker-compose up or .\up.ps1.
You can run the following commands to make sure no other processes are running that may interfere. The following checks for web services using port 443.
Get-Process -Id (Get-NetTCPConnection -LocalPort 443).OwningProcess
This checks for services such as Solr, on port 8984.
Get-Process -Id (Get-NetTCPConnection -LocalPort 8984).OwningProcess
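The one-liners above raise an error when nothing is using the port. A slightly more forgiving version, offered as a sketch, wraps the same cmdlets in a helper. The function name Get-PortOwner is made up for this example.

```powershell
# Hypothetical helper: report which process, if any, is listening on a port.
function Get-PortOwner {
    param([Parameter(Mandatory)][int]$Port)

    # SilentlyContinue: finding no listener is the good outcome, not an error.
    $connections = Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue
    if (-not $connections) {
        Write-Host "Port $Port is free."
        return
    }

    $connections |
        Select-Object -ExpandProperty OwningProcess -Unique |
        ForEach-Object { Get-Process -Id $_ } |
        Select-Object Id, ProcessName, @{ Name = 'Port'; Expression = { $Port } }
}

# Check the two ports mentioned above.
foreach ($port in 443, 8984) { Get-PortOwner -Port $port }
```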
Solution X: What if It’s None of Those?
If you’ve worked through the previous four solutions, you’re probably thinking: ok, done that, what’s next? How do I figure out what the problem really is?
Notice that the error message references the ID of the container that is actually the problem. You can run the following command to list your containers and find out which one that ID belongs to.
docker ps
The output lists each running container along with its ID, image, status, and name.
By matching the container ID from the original error message against this list, you can determine exactly which container is causing traefik to fail. Follow the next steps to find the error that is occurring.
- Open up Docker Desktop
- Open the offending container to view the log files.
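If you prefer the command line to Docker Desktop, the same information is available from the Docker CLI. The sketch below reuses the container ID from the error message at the top of this article; yours will differ.

```powershell
# Container ID copied from the error message; replace it with your own.
$containerId = '06bb2d97202e'

# List every unhealthy container. -a includes containers that have already stopped,
# which plain `docker ps` does not show.
docker ps -a --filter "health=unhealthy" --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"

# Show the last 100 log lines for the container.
docker logs --tail 100 $containerId

# Show the recorded health check results, including the output of recent probes.
docker inspect --format "{{json .State.Health}}" $containerId
```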
At this point, diagnosing the issue is very similar to diagnosing issues within a local instance of Sitecore XP or XM. One challenge I ran into recently was that my CM instance wasn’t displaying the error. The cause was a misspelling in my config patch file, but my CM logs never showed the error.
Knowing that, and recognizing that traefik works top down, via the rendering container, I next checked the logs for that container. Lo and behold, the error was shown as part of the build process for the front end.
The main thing to recognize is that the error isn’t always where you’d expect it to be. We had a CM issue, but for whatever reason the error was displayed in the rendering container rather than the CM container.
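Since the error can surface in a container you weren’t looking at, one option is to scan the recent logs of every container in one pass. This is only a sketch: the helper name Select-ErrorLine and the search pattern are assumptions, and your logs may word their errors differently.

```powershell
# Hypothetical helper: keep only lines that look like errors (case-insensitive).
function Select-ErrorLine {
    param([Parameter(ValueFromPipeline)][string]$Line)
    process {
        if ($Line -match 'error|exception|failed') { $Line }
    }
}

# Walk every container, including stopped ones, and print its last few error lines.
docker ps -a --format "{{.Names}}" | ForEach-Object {
    $name = $_
    Write-Host "=== $name ==="
    # 2>&1 merges stderr into the pipeline, since many containers log there.
    docker logs --tail 200 $name 2>&1 |
        ForEach-Object { "$_" } |
        Select-ErrorLine |
        Select-Object -Last 5
}
```

In my case, a scan like this would have pointed at the rendering container right away, even though the root cause was in CM.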