I administer a number of web servers, mostly CentOS and RedHat Enterprise Linux with cPanel and Web Host Manager. A while back our primary name server (a CentOS server with cPanel & WHM) started giving problems whenever the name service (Bind) was restarted. The problem looked like one of insufficient capacity. I was a bit surprised, because we were serving only about 5,000 DNS zones, but went ahead with an “upgrade” anyway…
In my infinite wisdom, I used the /scripts/ulimitnamed script to “upgrade” the Bind name server to a version that supposedly supports a very large number of DNS zones. [Imagine the ego trip of thinking that 5,000 zones put us in the “very large” category.] The resulting Bind instance showed up in the process list as named-wrapper (it originally showed up as named). The new process used quite a bit more resources than the original, but because it solved the apparent capacity problem and the box in question was primarily a name server (plus some server monitoring scripts etc.), I thought the “upgrade” was a job well done.
In recent weeks I noticed a steady increase in CPU usage. This came to a boiling point when the named-wrapper process suddenly started to consume more than 95% CPU for extended periods of time. Domain lookups were slowing down, and newly added DNS zones (on other web servers in the cluster) were getting lost. All this time (and even earlier, when the initial problem occurred) our secondary name server was working without any problems.
Some googling (I guess I could have binged too) revealed some general unhappiness with the performance of named-wrapper. More than one user on the cPanel forum had stated that they wanted to revert from the high-capacity version of Bind to the “standard” version. There was some speculation about how to do this, but no definitive method.
I tried to piece the puzzle together, and got lucky. Here is how I “downgraded” (quite gracefully, may I add) from named-wrapper to named:
- Stopped the name service:
  service named stop
- Replaced the named startup script (/etc/init.d/named) with the original. (I did this by copying it from the secondary name server to the primary server.)
- Forced a reinstall of bind-libs:
  yum reinstall bind-libs
- Restarted the name service:
  service named start
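For anyone who would rather script the steps above, here is a rough sketch. It is not what I actually ran, and several things in it are assumptions: the names DRY_RUN, ORIG_INIT, run and downgrade_named are made up for this example, the path /root/named.init.orig stands in for wherever you put your copy of the original startup script, and the backup of the current startup script is an extra precaution that was not part of my procedure. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the downgrade steps. DRY_RUN and ORIG_INIT are hypothetical
# names introduced for this example; they are not cPanel settings.
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print the commands
ORIG_INIT="${ORIG_INIT:-/root/named.init.orig}"  # assumed copy of the original init script

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

downgrade_named() {
    run service named stop
    # Extra precaution (not in the original procedure): keep the current script.
    run cp -p /etc/init.d/named /etc/init.d/named.wrapper.bak
    run cp -p "$ORIG_INIT" /etc/init.d/named
    run yum reinstall bind-libs
    run service named start
}

downgrade_named
```

Read the printed commands first; only when they look right for your server, run it again as root with DRY_RUN=0.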
The above procedure solved my problem completely. CPU usage is down to almost nothing, and there are no problems starting the name service. (Why the initial apparent capacity problem occurred, I still do not know.) I hope posting it here may help someone else too.
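To check the result, one option is to look at which of the two process names is running and how much CPU it uses. The snippet below is only a sketch: named_cpu is a name I made up for this example, and it assumes input in the two-column form produced by ps -eo pcpu=,comm= on Linux (CPU percentage, then process name).

```shell
#!/bin/sh
# Sum the CPU percentage per process name for named and named-wrapper.
# Expects lines of the form "<pcpu> <comm>" on standard input.
# named_cpu is a hypothetical helper name used for illustration only.
named_cpu() {
    awk '$2 == "named" || $2 == "named-wrapper" { cpu[$2] += $1 }
         END { for (n in cpu) printf "%s %.1f\n", n, cpu[n] }'
}

# Usage on a live server:
#   ps -eo pcpu=,comm= | named_cpu
```

After a successful downgrade, the output should list named rather than named-wrapper.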
The morals of the story:
- Do not blindly use the utilities in the /scripts folder. Useful as many of them may be, they are generally not well documented and can be dangerous. Be sure you know what a particular script does before running it. (The /scripts/ulimitnamed script will reportedly be deprecated in an upcoming version of cPanel; it seems someone else learned the same lesson.)
- Google is your friend. Search and search some more. The chances are someone else has experienced the same problem before and posted a solution somewhere.
Keywords: named, named-wrapper, Bind, cPanel, cpu, ulimitnamed