Troubleshooting after upgrading Digital Access to 6.0.5 and above
This article is valid for Digital Access 6.0.5 and above.
Publishing does not connect services
Problem
After upgrading to 6.0.5 or above, publishing once does not connect the services.
Solution
Retry publishing or restart services and then publish again.
If the services are still not connected after publishing several times:
Verify that, in the LocalConfiguration file of each service (/opt/nexus/config/<service>/config/LocalConfiguration.xml), the mHost attribute of the "Administration Service" object has the value "admin" instead of the host IP address or 127.0.0.1. If not, change mHost to "admin" and restart the Digital Access services using the restart services commands. See the examples below.
Change mHost to "admin"
<node>
<object key="c000ejp1m5" name="Administration Service" trans="ivjq0838gkxs" ver="50600">
<attribute name="mAllInterfaces" type="boolean" value="false"/>
<attribute name="mPort" type="integer" value="8300"/>
<attribute name="mHost" type="string" value="admin"/>
<attribute name="mType" type="integer" value="5"/>
<attribute name="mId" type="integer" value="1"/>
</object>
Restart services
sudo docker stack rm da
sudo bash /opt/nexus/scripts/start-all.sh
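Before editing the files, it can help to see which services still point at an IP address. The following is a minimal sketch, not part of the product: the helper name admin_mhost is hypothetical, and it assumes each XML element sits on its own line, as in the example above. Reading the files may require root privileges.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the mHost value of the "Administration Service"
# object in one LocalConfiguration.xml file.
# Assumes one XML element per line, as in the example above.
admin_mhost() {
  awk '
    /<object /   { in_admin = ($0 ~ /name="Administration Service"/) }
    /<\/object>/ { in_admin = 0 }
    in_admin && /name="mHost"/ {
      if (match($0, /value="[^"]*"/)) {
        # Skip the 7 characters of value=" and drop the closing quote.
        print substr($0, RSTART + 7, RLENGTH - 8)
      }
      exit
    }
  ' "$1"
}

# Report the value for every service configuration found on this host.
for f in /opt/nexus/config/*/config/LocalConfiguration.xml; do
  [ -f "$f" ] || continue
  printf '%s: mHost=%s\n' "$f" "$(admin_mhost "$f")"
done
```

Any line that reports something other than mHost=admin marks a file that still needs the change described above.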
Error message: da overlay network is missing
Problem
This message appears when you restart the services: 'da overlay network is missing' [Error response from daemon: network da_da-overlay not found].
Solution
Restart all the services again.
Services do not start: swarm label configuration
Problem
If the nodes are not labelled correctly, the services may not start. The node labels determine which services run on each node. For example, an Access Point that does not start in the internal network most likely has a label problem.
Enter the command docker stack ps da to find out whether any service fails to start on the indicated node.
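To narrow the listing down to tasks that are not running, the output can be filtered. This is a sketch only: the function name not_running is hypothetical, and the "|" separator is an arbitrary choice passed to the --format option of docker stack ps.

```shell
# Hypothetical filter: read "name|node|state|error" lines on stdin and
# print only the tasks whose current state does not start with "Running".
not_running() {
  awk -F '|' '$3 !~ /^Running/'
}

# Intended use on a swarm manager node (not executed here):
#   docker stack ps da --no-trunc \
#     --format '{{.Name}}|{{.Node}}|{{.CurrentState}}|{{.Error}}' | not_running
```

The last field of each printed line carries the error text reported for that task, if there is one.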
Solution
List the currently configured labels:
List labels
docker node ls -q | xargs docker node inspect -f '{{ .ID }} [{{ .Description.Hostname }}]: {{ range $k, $v := .Spec.Labels }}{{ $k }}={{ $v }} {{end}}'
Example output:
zhozhf8idvre7app5yyuh1bpm [ubuntu2004-1]: da-access-point=true da-administration-service=true da-authentication-service=true da-distribution-service=true da-policy-service=true postgres=true
ab37r22u342pdp478994cf23 [ubuntu2004-2]: da-access-point1=true da-access-point=true
In the example output above, da-access-point will try to start on both nodes, which is unlikely to be the intention. To remove the da-access-point label from the second node in the example, enter this command:
Example: Remove a label
docker node update --label-rm da-access-point ab37r22u342pdp478994cf23
The general syntax for removing a label is:
Syntax: Remove a label
docker node update --label-rm <label> <node id>
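With many nodes, duplicated labels can be hard to spot by eye. As a rough sketch, the output of the list labels command can be piped through a small filter that prints every label set on more than one node. The function name labels_on_multiple_nodes is hypothetical, and a label on several nodes is not always a mistake, so treat the result as a list of candidates to review.

```shell
# Hypothetical filter: read lines of the form
#   <node id> [<hostname>]: label=value label=value ...
# on stdin and print each label name that occurs on more than one node.
labels_on_multiple_nodes() {
  awk '
    {
      # Fields 1 and 2 are the node ID and "[hostname]:"; labels start at 3.
      for (i = 3; i <= NF; i++) {
        split($i, kv, "=")
        count[kv[1]]++
      }
    }
    END { for (label in count) if (count[label] > 1) print label }
  ' | sort
}

# Intended use (not executed here): pipe the output of the list labels
# command shown earlier into the filter:
#   <list labels command> | labels_on_multiple_nodes
```

For the example output in this section, the filter would print da-access-point only, because da-access-point1 is a different label name.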
Digital Access services not running
Problem
When you enter the command docker ps or docker stack ps da, some or all Digital Access services are missing, or appear for only a few seconds.
This happens when the services fail during startup. By default, the docker ps command lists only running containers, so the failing containers are either completely missing from the docker ps output or appear for only a few seconds before they fail. Since the desired state of the services is to be running, Docker will try to start the containers over and over again, but if the problem is permanent, they will keep failing.
Solution
To get more information, enter this command:
Show all containers, including stopped ones
docker ps -a
This produces a long list in which the container that stopped last is at the top.
Find the ID of that container (alphanumeric string) and enter this command:
Show logs for container
docker logs <container-id>
In most cases this will give a good hint of the problem.
Docker is quick at cleaning up stopped containers. A pause of more than a minute between the two commands will most likely result in a "container not found" error message.
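Because of that short time window, it can be convenient to run both steps in one go. The sketch below is an assumption about how the two commands could be combined, not a documented procedure: the function name last_exited_logs is hypothetical, and it simply takes the first exited container that docker ps -a lists.

```shell
# Hypothetical helper: print the logs of the first exited container that
# "docker ps -a" lists, so that no time passes between the two commands.
last_exited_logs() {
  exited_id=$(docker ps -a --filter status=exited --format '{{.ID}}' | head -n 1)
  if [ -z "$exited_id" ]; then
    echo "No exited container found" >&2
    return 1
  fi
  # docker logs writes container stderr to stderr; merge it into stdout.
  docker logs "$exited_id" 2>&1
}

# Intended use (not executed here):
#   last_exited_logs | tail -n 50
```

If the failing container never reaches the exited state, the status filter will miss it; in that case the manual steps above remain the fallback.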
If you cannot find any information on why the container is not starting in the container logs or the Docker stack logs, it could be in the Docker engine logs. Enter this command to check them:
sudo journalctl -u docker.service
This command uses journalctl to retrieve the logs of the Docker service only. To view only the most recent logs, or to monitor them continuously, add options to the command:
To view the latest 100 log lines:
sudo journalctl -u docker.service -n 100
To continuously monitor the logs in real-time:
sudo journalctl -u docker.service -f
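The engine log can be long. As a hedged sketch, the output can be narrowed to a recent time window and to entries logged at error or warning level. The function name engine_problems is hypothetical, and the filter assumes the Docker daemon writes its usual level=... text log format; adjust the pattern if your log lines look different.

```shell
# Hypothetical filter: keep only log lines that carry level=error or
# level=warning (assumes the daemon logs in its usual key=value text format).
engine_problems() {
  grep -E 'level=(error|warning)'
}

# Intended use (not executed here):
#   sudo journalctl -u docker.service --since "15 min ago" --no-pager | engine_problems
```

An empty result only means that no line matched the pattern in that time window, not that the engine is healthy.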
Access denied due to file permissions
If you encounter an "access denied" error related to file permissions, such as:
chmod: changing permissions of '/etc/nexus/policy-service/config/RemoteConfiguration.xml.old': Operation not permitted.
You can resolve this issue by changing the file permissions or removing the file. Do the following:
Change the file permissions
To change the file permissions, use the chmod command. For example:
sudo chmod 644 /etc/nexus/policy-service/config/RemoteConfiguration.xml.old
This command gives the file owner read and write access and gives all other users read access. You might need sudo to gain the required permissions.
Remove file
If changing the permissions is not an option or does not resolve the issue, you can remove the file:
sudo rm /etc/nexus/policy-service/config/RemoteConfiguration.xml.old
This command deletes the file, which may be necessary if it is causing permission issues. You need sudo privileges to execute this command.
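Whichever option you choose, it may help to first check who owns the file and which permissions it currently has, since chmod typically reports "Operation not permitted" when the calling user does not own the file. A minimal sketch, assuming GNU stat (standard on most Linux distributions); the function name show_perms is hypothetical.

```shell
# Hypothetical helper: print octal mode, owner:group and name of a file.
# Uses the -c option of GNU stat; BSD and macOS stat use a different syntax.
show_perms() {
  stat -c '%a %U:%G %n' "$1"
}

# Intended use (not executed here):
#   show_perms /etc/nexus/policy-service/config/RemoteConfiguration.xml.old
```

If the reported owner is not the user that runs the failing chmod, that would be consistent with the error shown above.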