How can I troubleshoot DNS and network connectivity issues?
Every now and then, we all run into misconfigurations, bugs, or unexpected behavior while administering a complex piece of software. Given the granularity of each configuration scope in the Momentum/Ecelerity architecture, and how extensible it is in order to accommodate many different use cases, there is always a possibility of running into suboptimal performance or even instability. When troubleshooting such issues, there are some useful places to start looking:
Logs - The most important piece of information you have is your logs. If you suspect a problem with Ecelerity, check the panic log on the affected MTA, located at /var/log/ecelerity/paniclog.ec. Other places to look are /var/log/messages and the output of "dmesg".
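As a starting point, the log checks above can be run from a shell. This is a minimal sketch, not an official tool: the helper name recent_matches and the search term "ecelerity" are illustrative assumptions, and the paths are the ones given above.

```shell
#!/bin/sh
# recent_matches FILE PATTERN [N]
# Print the last N (default 20) lines of FILE that contain PATTERN,
# matched case-insensitively. Hypothetical helper for illustration only.
recent_matches() {
    grep -i -- "$2" "$1" | tail -n "${3:-20}"
}

# Usage (run on the affected MTA; reading these files usually requires root):
#   tail -n 50 /var/log/ecelerity/paniclog.ec
#   recent_matches /var/log/messages ecelerity
#   dmesg | tail -n 50
```

Limiting the output to the most recent lines keeps the result small enough to attach to a support ticket.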
Network - Some issues are caused by network misconfiguration, starved connections, bad hardware, DNS configuration, etc. Here are some preliminary checks that can spot issues quickly, starting inward and working your way outward:
1) Run 'ifconfig -a'. Can you see all of the interfaces and IP addresses you would normally expect to see? If not, run 'ethtool eth0', replacing eth0 with the name of the interface that looks broken in the ifconfig output. If the network card is running at half duplex, or if 'Link detected' is 'no', try renegotiating the connection by running 'ethtool -r eth0'. Afterwards, you should see related messages appear in /var/log/messages and dmesg.
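The interface check in step 1 can be scripted. The following is a minimal sketch, assuming a Linux host with ethtool installed. The function name check_link and the interface name eth0 are illustrative, and the parsing relies on the "Duplex:" and "Link detected:" lines that ethtool normally prints.

```shell
#!/bin/sh
# check_link: read `ethtool <iface>` output on stdin and flag the two
# conditions described above (half duplex, or no link detected).
# Prints "OK" and returns 0 when neither is found, otherwise prints a
# warning per problem and returns 1. Hypothetical helper, not part of Momentum.
check_link() {
    awk '
        $1 == "Duplex:" && $2 == "Half" {
            print "WARN: half duplex"; bad = 1
        }
        $1 == "Link" && $2 == "detected:" && $3 == "no" {
            print "WARN: no link detected"; bad = 1
        }
        END {
            if (!bad) print "OK"
            exit bad + 0
        }
    '
}

# Usage (eth0 is an example interface name):
#   ethtool eth0 | check_link
#   ethtool -r eth0     # renegotiate only if a warning was reported
```

The renegotiation step is left as a manual command on purpose, since restarting autonegotiation briefly interrupts traffic on that interface.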
Firewall/NAT - Do you have iptables running on each MTA? Run 'iptables -L' and check that no DROP or REJECT rule is blocking traffic to a port you need. If there's a particular port that you can't reach, verify that something is listening on it by running 'netstat -lntpu'. Be sure to send this output to support when raising a ticket for a network issue. Tools such as ping, traceroute, and mtr (My Traceroute) can be helpful as well.
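To make blocking rules easier to spot in a long ruleset, the iptables listing can be filtered. This is a rough sketch under a few assumptions: the helper name blocking_rules is made up for illustration, it expects the standard 'iptables -L' listing format, and port 25 in the usage lines is only an example. The -n flag is added so that iptables prints numeric addresses instead of doing reverse DNS lookups, which matters when DNS itself is the suspect.

```shell
#!/bin/sh
# blocking_rules: read `iptables -L -n` output on stdin and print only
# the rules whose target is DROP or REJECT, prefixed with the chain name.
# Chains whose default policy is DROP are reported as well.
# Hypothetical helper for illustration only.
blocking_rules() {
    awk '
        $1 == "Chain" {
            chain = $2
            if ($4 == "DROP)") print chain ": default policy DROP"
            next
        }
        $1 == "DROP" || $1 == "REJECT" { print chain ": " $0 }
    '
}

# Usage (requires root; port 25 is only an example):
#   iptables -L -n | blocking_rules
#   netstat -lntpu | grep ':25 '
```

An empty result means that no DROP or REJECT rule appears in the default filter table. Other tables, such as nat, would need to be listed separately with 'iptables -t nat -L -n'.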
DNS - Generally, you'll recognize DNS issues when you receive internal bounces containing "454 4.4.4 [internal] no MX or A for domain". In short, the DNS subsystem is not able to retrieve responses for its DNS queries. Momentum is capable of executing hundreds to thousands of DNS queries per second, and responses should be received within milliseconds, so it is very important that your DNS subsystem is working optimally. As with the network checks, it's best to troubleshoot starting inward and then moving outward.
Check /etc/resolv.conf (or the resolv_conf parameter, if configured in ecelerity.conf). This will show the nameservers that your MTA is currently using. Can you ping the nameservers? Can you dig for a domain specifically from each nameserver (e.g., dig @ns.mynameserver.com mx google.com)? If you have trouble getting a response from any of the machines you target with 'dig', you will need to troubleshoot that nameserver/resolver.
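The per-nameserver check described above can be looped over every resolver in the file. This is a minimal sketch, assuming dig is installed. The function names, the 2-second timeout, and the default test domain are illustrative choices and not Momentum settings.

```shell
#!/bin/sh
# list_nameservers [FILE]: print the address from every "nameserver"
# line in a resolv.conf-style file (default: /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# check_nameservers [FILE] [DOMAIN]: ask each resolver directly for the
# MX records of DOMAIN and report whether any answer came back.
# Lines starting with ";" are dig diagnostics (for example timeouts),
# so they are not counted as an answer.
check_nameservers() {
    for ns in $(list_nameservers "$1"); do
        if dig @"$ns" mx "${2:-google.com}" +short +time=2 +tries=1 \
            | grep -v '^;' | grep -q .; then
            echo "$ns: answered"
        else
            echo "$ns: no answer"
        fi
    done
}

# Usage:
#   check_nameservers /etc/resolv.conf google.com
```

Any resolver reported as "no answer" is the one to troubleshoot first. Running the same dig command without +short shows the full response and the query time.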
WARNING: Do not run your nameservers locally on each MTA. This may cause resource contention, as the server then has to both forward queries and answer them. Additionally, do not use a public resolver such as Google's 8.8.8.8 and 8.8.4.4. A public DNS resolver will likely rate-limit your queries, if not block them altogether.
If the nameservers do not respond, you will need to find out who is responsible for providing DNS and have it fixed. Find your current nameservers by checking /etc/resolv.conf on the problematic MTA. (There is also an option in ecelerity.conf, called "resolv_conf", where you can specify your own nameservers: https://support.messagesystems.com/docs/web-ref/conf.ref.resolv_conf.php).