RELIABLE BATTERY SYSTEMS WON’T HAPPEN
WITHOUT PROACTIVE TESTING AND MAINTENANCE
What is required to guarantee a reliable system?
Presented at Electric West
2003
Our world today has become
reliant on systems in support of a global economy and communication network
that are powered by a combination of both DC and AC electrical power sources.
These systems by their very nature must supply uninterrupted continuous
power. The design of an uninterrupted power source must consist of a primary
energy source, such as the national power grid, and a backup energy source that
can power the load during any interruptions in the primary source. The backup
or standby energy source for most critical applications consists of the
following:
A. A
temporary energy source that can instantaneously assume and support the load
during momentary and short duration interruptions (fractions of a second to a
few hours). Examples of such energy sources employed in today’s designs are
batteries and large flywheels.
B. A
longer-term source that can support the critical load for hours or days until
the primary source is restored. Examples include the diesel engine generator
and large stationary battery banks.
There are a number of other technologies under consideration today, and some are
actually being employed in a few mission-critical systems; however, the
refinement of such technologies and their widespread use have yet to be
realized. The lead-acid battery is still the main energy source in use today and
will be for the foreseeable future. Issues discussed in this paper will be
limited to the lead-acid technology.
The main
reason other technologies are receiving so much attention today is that many
battery backed-up systems have not met the high reliability demanded by most
customers. The keyword here is “many,” because there are numerous
battery systems working reliably.
Why are
not all battery applications considered reliable? It is very simple; most
battery customers do not understand batteries and, therefore, do not perform
the necessary maintenance and test routines to assure themselves a reliable
system.
The following sections discuss why batteries fail and how these failures can not
only be detected early but, in most cases, also be prevented.
To understand why all
batteries eventually fail, it is necessary to know the basic construction of
the battery and understand the factors that influence battery life and
reliability. The following is a very basic explanation, and the author hopes
that the electrochemists of the world will forgive some of the
oversimplifications made.
The
original, basic lead acid battery (called Planté after its inventor) is made up
of pure lead plates that are placed in an electrolyte solution of diluted
sulfuric acid. If an external voltage source is connected with the polarity
shown, a current will flow through the battery, and the outer surfaces of the
plates begin to transform as shown. The positive plate develops a very fine
outer layer of lead dioxide (PbO2), and the negative plate develops
a layer of something referred to as sponge lead (Pb). See Figure 1.

Figure 1. Basic Lead Acid
Cell
The
conversion of the plates combined with the electrolyte (H2SO4)
represents stored chemical energy that can be converted and delivered as an
electric current through an external load when connected across the positive
and negative terminals. The amount of material in the plates that converts is
referred to as the active material, and the energy that can be stored is
directly proportional to the amount of active material available.
The
chemical formula simply shows the chemical changes that occur during charging
and discharging. Fully charged, the battery is represented as PbO2 +
Pb + H2SO4, which means a positive PbO2 plate
plus a negative Pb plate and dilute sulfuric acid H2SO4. As
the battery discharges, the active material on both plates converts to PbSO4
(lead sulfate), and the electrolyte converts to almost pure water.
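Written as a balanced equation, the conversion described above is the standard overall lead-acid cell reaction. Discharge reads left to right and charge right to left. The stoichiometric coefficients are textbook chemistry rather than something stated in this paper:

```latex
\[
\underbrace{\mathrm{PbO_2}}_{\text{positive plate}}
+ \underbrace{\mathrm{Pb}}_{\text{negative plate}}
+ 2\,\mathrm{H_2SO_4}
\;\underset{\text{charge}}{\overset{\text{discharge}}{\rightleftharpoons}}\;
2\,\mathrm{PbSO_4} + 2\,\mathrm{H_2O}
\]
```

Sulfuric acid is consumed and water is produced on discharge, which is why the electrolyte of a discharged cell approaches pure water.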
The
original Planté design, still in use today, has the drawback that only a small
amount of active material is available; therefore, this battery
requires large plates and many of them in parallel to store sufficient energy
for its applications.
The
shortcoming in energy density of the Planté design led to the development of
the pasted plate technology, which is the most popular plate design in use
today. The pasted plate, shown in Figure 2, utilizes a lattice grid structure
that is filled with active material.

Figure 2. Pasted Plate and
Grid Structure
The
active material, primarily made up of pure lead, is also referred to as
“paste,” since it is applied to the grid in the form of a paste. The paste,
when applied, resembles wet concrete. The volume of paste applied in a fairly
small space gives the pasted plate a great energy density advantage over the
Planté plate.
The grid
is a framework with a dual purpose. The first is to hold or support the active
material, and the second is to carry current to and from the active material.
Most grids are cast using lead alloys, due to their superior mechanical
strength, although pure lead grids are available. Grids are most often
rectangular in shape, with horizontal, vertical and possibly radial members,
and a plate lug on the top edge. The plate lug will ultimately be joined, along
with the lugs of other plates of the same polarity, to the cell post.
Dimensions of the grid, as well as the number of members and their
configuration, vary depending upon the application. For example, long, narrow,
thick grids may be suitable for a cell designed for telecommunications
applications; however, thin grids with an almost square shape may be more
suitable for UPS applications.
As mentioned earlier, the capacity, or amount of energy a battery can store, is
a function of the active material and acid available. The simplest way to
increase a battery’s capacity is to connect additional plates in parallel.
Thus, the batteries produced today consist of an assembly of positive and
negative plates insulated from each other by separators. The size of the plates
obviously also impacts the capacity, and the battery manufacturers typically
produce at least five different plate sizes.
The following
two types of lead acid batteries are currently in use:
1. VLA
(Vented Lead Acid) – This is the oldest and most reliable technology,
characterized by an abundance of free electrolyte. All surface areas of the
plates are submersed in the electrolyte. Water, in the form of hydrogen and
oxygen gases generated during charging, is lost and must periodically be
replaced.
2. VRLA
(Valve Regulated Lead Acid) – The VRLA, sometimes referred to as sealed, has
been evolving since the late 1970s and has become very popular, primarily
because of the installed price. The VRLA is designed to theoretically not lose
any water by recombining the hydrogen and oxygen gases given off during
charging. VRLA cells have a small pressure controlled valve that keeps gases
from leaving the cell unless the internal pressures exceed a set value. Another
major difference from the flooded design is that the VRLA does not have any
free-floating electrolyte; the electrolyte is suspended in either an absorbing
glass mat (AGM) or a gel substance (GEL). See Figure 3.
A key
point to remember is that the plates discussed earlier are essentially the same
for both the flooded and VRLA design. The main design differences are how much
electrolyte is stored and how it is stored.
Since the loss of water is an important maintenance consideration, let's look at
how it happens:
· When a cell has reached full charge and cannot absorb any more energy, the
charge current still flowing begins breaking down the water in the electrolyte
into its two component gases (hydrogen and oxygen).
· In a VLA (flooded) cell, the gases are free to leave the cell via a
flame-arresting vent cap, and it is therefore required that water be added
periodically.
· In a VRLA cell, which was designed to eliminate periodic water additions, the
gases recombine to form water again. The theoretical recombination efficiency is
99.9%. (In practice, VRLA cells do suffer from water losses, and this will be
discussed in the next section.)

Figure 3. VRLA Battery
Note
the white absorbing material between plates. It holds the electrolyte and
provides electrical isolation between the adjacent positive and negative
plates.
General
It should be noted that all
lead acid batteries have a limited useful life. The normal failure mode that
dictates the end of life of a well-maintained flooded battery is positive grid
corrosion. As the grid corrodes, the effective cross section of the conduction
path narrows, and the internal cell resistance starts to increase. At the same
time, the grid structure starts to swell and deform to the point where the
paste or active material loses contact with the grid structure. This problem
also leads to increased internal cell resistance as the contact resistance
between paste and grid increases. If the resistance increases are ignored,
meaning that the battery is not taken out of service at the appropriate time,
the positive grids will eventually lose their mechanical strength and start to
break apart.
It is the author’s belief
that due to the predictable decay of flooded cells, internal cell resistance
measurements can be used to predict end of life. The normal life of a good
quality flooded battery is twenty years.
VRLA products today have a life span of only about seven to ten years, and these
cells do not live long enough to
die of normal positive grid corrosion. The most common cause for their early
demise has been a drying out or loss of water in the electrolyte. There are
also investigations being conducted that indicate that secondary reactions from
internal recombination of hydrogen and oxygen gases may be adversely affecting
the polarization voltage of the negative plates and/or accelerating positive
grid corrosion. Both problems lead to a loss of capacity.
WHY BATTERIES FAIL PREMATURELY
The reasons that most
batteries fail prematurely are related to one or more of the following:
1. Excessive cycling
2. Improper charging
3. Lack of temperature control
4. Installation
5. Manufacturing problems
6. Operational issues
Note that the user has
control over most of the conditions that lead to premature failure.
Excessive Cycling
Every time a battery cycles
(a discharge followed by a recharge), the electrochemical generator has to go
to work, which involves converting acid and paste. As the paste on the positive
grid changes from PbO2 to PbSO4, there is a large
increase in volume, which puts pressure on the paste. The more the paste is
expanded and then later contracted, the more the wear and tear on it. This
means that deeper discharges are more harmful to the battery. Also, cycling a
battery causes accelerated corrosion of the grid structure, which leads to
shorter life. This is especially true for lead calcium batteries, which happen
to be the most popular technology in use today.
The lead calcium battery’s cycling
capability depends on the depth of discharge. For example, it is only capable
of 50 deep cycles (the removal of more than 80% of energy), but can deliver 300
cycles for a 25% depth of discharge cycle. A UPS battery which normally only
delivers about 25% of its stored energy during its 15 minute rated reserve time
can deliver 300 such cycles. If the load is on the battery for less than 30
seconds (a momentary power glitch), the battery can handle thousands of these
short cycles.
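The cycle-life figures above can be turned into a rough running estimate of how much cycle life a battery has used. The following is a minimal sketch, not a manufacturer's method. It assumes that wear adds up linearly (each discharge uses 1/N of the life, where N is the cycle limit for its depth), and because the text gives no figure between 25% and 80% depth, it conservatively treats anything deeper than 25% as a deep cycle. The function name `cycle_life_used` is hypothetical.

```python
# Cycle-life figures quoted in the text for lead calcium cells.
DEEP_CYCLE_LIMIT = 50       # discharges removing more than 80% of stored energy
SHALLOW_CYCLE_LIMIT = 300   # discharges of about 25% depth

# ASSUMPTION: no figure is given between 25% and 80% depth, so this sketch
# conservatively counts anything deeper than 25% as a deep cycle.
SHALLOW_DEPTH_MAX = 0.25


def cycle_life_used(depths_of_discharge):
    """Return the estimated fraction of cycle life consumed (0.0 = new).

    depths_of_discharge: iterable of fractions in [0, 1], one per discharge.
    ASSUMPTION: wear accumulates linearly, 1/limit per discharge.
    """
    used = 0.0
    for depth in depths_of_discharge:
        if not 0.0 <= depth <= 1.0:
            raise ValueError(f"depth of discharge must be in [0, 1], got {depth}")
        if depth == 0.0:
            continue  # no energy removed, nothing to count
        if depth <= SHALLOW_DEPTH_MAX:
            limit = SHALLOW_CYCLE_LIMIT
        else:
            limit = DEEP_CYCLE_LIMIT
        used += 1.0 / limit
    return used
```

Because very short discharges are counted like 25% cycles, the estimate is pessimistic for momentary glitches, which the text says a battery can tolerate by the thousands.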
Improper Charging
Battery
manufacturers specify a voltage range for their various cells that must be
adhered to. If the voltage on a given cell is allowed to go either higher or
lower than the recommended value, it will have a detrimental effect on the life
of the battery. It should also be noted that the specified voltage range is
very temperature dependent. The right voltage for a battery at 77°F
would be too high if the battery were operated in an ambient temperature of 90°F.
It is important for a user to understand the interaction between voltage and
temperature.
Low float voltage (Undercharging) – Undercharging causes sulfate crystals to form
on the plate surfaces, since there is not enough current flowing to keep the
battery fully charged. Sulfate crystals that harden over a long period of time
will not go back in solution when proper voltage is applied and, therefore,
result in a permanent loss of capacity. Extended undercharging will also cause
a loss of active material from the negative plates.
High float voltage (Overcharging) – Overcharging causes excessive gassing of
hydrogen and oxygen. This leads to loss of water in flooded cells and dryout in
VRLA cells. High float voltage also causes higher float current, which in turn
causes accelerated corrosion and shedding of active material from positive
plates. The recombination of gases to form water in VRLA cells generates heat,
and heat causes higher float currents. Therefore, excessive gassing in VRLA
cells can lead to thermal runaway.
Lack of Temperature Control
Batteries
are very temperature sensitive, and efforts should be made to maintain the
operating temperature near 77°F. The proper temperature will optimize battery life
and is especially critical for VRLA cells. The recombination of gases within a
VRLA cell can only take place at a certain rate. If this rate is exceeded, gas
pressure will build up beyond the safety valve level, and gases/water will be
vented out and permanently lost. At 77°F, the highest float voltage at which a cell can
still recombine all the gases driven off the plates is approximately 2.32
volts. If the cell temperature increased to 90°F while holding the voltage constant, the cell would
dry out and possibly go into thermal runaway. Thermal runaway melts down the jar
and, in the worst case, leads to an explosion and fire.
It
should be obvious from the above discussion that all VRLA applications should
have tight temperature controls and/or temperature compensated chargers.
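What a temperature-compensated charger does can be sketched as a simple adjustment around the 77°F reference. This is an illustration under stated assumptions: the compensation is taken to be linear, and the coefficient `comp_v_per_deg_f` is a hypothetical parameter that must come from the battery manufacturer's data sheet, since this paper gives no value for it.

```python
REFERENCE_TEMP_F = 77.0  # reference temperature used throughout the text


def compensated_float_voltage(float_v_at_ref, cell_temp_f, comp_v_per_deg_f):
    """Return the per-cell float voltage adjusted for cell temperature.

    float_v_at_ref:   manufacturer's float voltage per cell at 77 °F
    cell_temp_f:      measured cell temperature in °F
    comp_v_per_deg_f: HYPOTHETICAL parameter: volts subtracted per °F above
                      77 °F (and added per °F below). Take it from the
                      manufacturer's data sheet; the text gives no value.
    ASSUMPTION: compensation is linear in temperature.
    """
    if comp_v_per_deg_f < 0:
        raise ValueError("comp_v_per_deg_f must be non-negative")
    return float_v_at_ref - comp_v_per_deg_f * (cell_temp_f - REFERENCE_TEMP_F)
```

With purely illustrative inputs, `compensated_float_voltage(2.25, 90.0, 0.002)` lowers the setpoint by 13 × 0.002 V, to about 2.224 V per cell.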
Low temperature – Battery capacity is diminished at low temperatures. For example, at 62°F, capacity is approximately
90% versus 100% at 77°F. At low temperatures, a
higher float voltage is required to maintain full charge and, if the charger is
not adjusted properly, cells may be undercharged, leading to the problems
described under low voltage.
High temperature – High temperature causes loss of life. For every 15°F rise in operating
temperature, the life is cut in half. High temperature causes increased float
current, which means increased corrosion and, therefore, the loss of life. High
temperature also causes gassing, which means loss of water in flooded cells and
dryout and thermal runaway in VRLA cells.
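The halving rule above is easy to apply as a quick estimate. The sketch below assumes the rule holds continuously between 15°F steps and credits no extra life below 77°F. Neither assumption is stated in the text, and the function name is hypothetical.

```python
REFERENCE_TEMP_F = 77.0   # temperature at which rated life applies
HALVING_STEP_F = 15.0     # every 15 °F rise cuts life in half (from the text)


def expected_life_years(rated_life_years, operating_temp_f):
    """Estimate service life at a sustained operating temperature.

    ASSUMPTION: the halving rule is applied continuously between 15 °F
    steps, and no life extension is credited below 77 °F.
    """
    rise_f = max(0.0, operating_temp_f - REFERENCE_TEMP_F)
    return rated_life_years * 0.5 ** (rise_f / HALVING_STEP_F)
```

For example, a battery rated for 20 years at 77°F would be estimated at 10 years if operated continuously at 92°F, and at 5 years at 107°F.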
Installation
A lot of battery problems
stem from improper installations. A detailed discussion of these is beyond the
scope of this paper, but some of the more common ones are the following:
Loose intercell connections – These can lead to abrupt
failures, including fires.
Damaged post seals – Improper cell handling or
not supporting cables can damage post seals. This allows acid to migrate up the
post and corrode the post to intercell connection.
Not replacing shipping caps
with vent caps – In flooded batteries, this creates internal gas pressures that will
force gases to escape past the post seals, causing post corrosion.
Manufacturing Problems
Manufacturing problems
actually represent a small number of the total. Some of the more common
problems, which may not show up for years, are the following:
Faulty post seal design – A leaky post seal allows acid
to migrate up to the post/intercell connection area, causing a connection
problem. Sometimes a new design appears to work well, but then suddenly starts
failing after six to eight years in the field.
Internal connection problems
– Quality
problems in the connection between grid tabs and the interconnecting bus have
been reported from time to time. In multicell jars like six or twelve volt
modules, the intercell connection between adjacent cells may fail as a result
of a poor lead burn.
Paste – Problems in the paste
formula or improper curing of the paste can have a major impact on the capacity
the battery can store. Some new batteries have been delivered at less than 50%
capacity.
Operational Problems
· Discharge without recharge – A fully discharged or nearly fully discharged
cell will be damaged and possibly ruined if not recharged within 24 to 48 hours.
As a battery discharges, the electrolyte changes from an acid solution toward
almost pure water at full discharge. Lead dissolves in water, and some of the
plate material mixes with water to form lead hydrate. Lead hydrate causes the
plate surfaces to turn white and, because it is conductive, it forms a short
circuit between the plates, rendering the battery irreversibly damaged.
· Over discharge – Over discharge causes abnormal expansion of the active
material in the plates, which leads to permanent damage and also to recharge
problems. This can happen in lightly loaded UPS systems that experience an
extended power outage.
· Excessive discharging (same as excessive cycling) – Some users have local
requirements that call for testing their critical backup systems either weekly
or monthly. If this testing includes cycling the battery, it will severely limit
the life of the batteries.
Failure Analysis Summary
Battery system failure modes
can be broken down into the following two major categories:
1. Abrupt failure – This is a sudden loss of the battery system without any
warning while the system is trying to perform its intended mission. This is the
worst-case scenario, as it will lead to very expensive failures. In a data
center application, even a momentary loss may result in millions of dollars'
worth of damage. An abrupt failure is caused by an interruption in the
conduction path. Typical failures are:
· Faulty intercell connection – This could be an installation problem or a
severely corroded connection.
· Internal conductance path problems – Remember, the current has to flow through
the post, to an internal bus, to the grids, through the paste and electrolyte,
to the opposite polarity plate, and then back out through the other terminal
post.
Abrupt failures can result from a totally corroded grid that is breaking apart
and only able to pass a low current flow. It can also result from a terminal
post that has lost its copper insert. High current batteries have a copper
insert in their posts and, if this copper is exposed to acid through a small
void in the lead coating, the copper will dissolve and leave only a small lead
coating to carry the current. VRLA cells that have totally dried out can also
be viewed as conductance path failures, since they have no effective path
between adjacent plates.
2. Low capacity failure – This is a failure to support the load for the
required period of time. Low capacity results from both mechanical conductance
problems and electrochemical problems. As a battery ages and its
conductance path (grid corrosion, paste to grid connection) starts to
deteriorate, the internal resistance increases. When the battery is placed
under load, the voltage drop across the internal resistance will cause the
overall battery voltage to reach the end voltage before its rated time.
The VRLA battery has the additional problem that, as it loses water due to
dryout, it loses capacity. In essence, it loses the energy storage required for
a full capacity battery.
The capacity problem is a slowly developing problem that is easily detected
and, most of the time, does not cause expensive outages, since it is rare that
full capacity is required during an outage. Typically, an outage is a short
momentary event, and an emergency generator is usually part of the backup
scheme.
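The effect of rising internal resistance on voltage under load can be illustrated with the simplest possible cell model: terminal voltage equals open-circuit voltage minus the I×R drop. This is a sketch only. It holds the open-circuit voltage constant, which ignores the change in voltage as the cell discharges, and every number in it is illustrative rather than taken from this paper.

```python
def terminal_voltage(open_circuit_v, load_current_a, internal_resistance_ohm):
    """Cell voltage under load: open-circuit voltage minus the I*R drop.

    ASSUMPTION: a fixed open-circuit voltage and a purely resistive
    internal path; real cells also sag as they discharge.
    """
    return open_circuit_v - load_current_a * internal_resistance_ohm


# Illustrative values only; none of these numbers come from the text.
OPEN_CIRCUIT_V = 2.05      # volts per cell
LOAD_CURRENT_A = 200.0     # amperes
BASELINE_OHM = 0.0005      # healthy cell
AGED_OHM = 0.00075         # same cell with resistance up 50%

baseline_v = terminal_voltage(OPEN_CIRCUIT_V, LOAD_CURRENT_A, BASELINE_OHM)
aged_v = terminal_voltage(OPEN_CIRCUIT_V, LOAD_CURRENT_A, AGED_OHM)
extra_drop_v = baseline_v - aged_v  # additional voltage lost to aging
```

With these inputs the healthy cell sits at about 1.95 V under load and the aged cell at about 1.90 V, so the aged cell starts the discharge roughly 0.05 V closer to its end voltage and reaches it sooner.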
It should be obvious from
the preceding discussion that almost all battery problems can be detected by an
increase in a cell’s internal resistance, and close monitoring of this
parameter can avoid disastrous failures.
Resistance measurements are
mandatory for applications that cannot tolerate a loss of power. The only
battery test that can provide better information on a system's state of health
is a full capacity test, which is also recommended on a scheduled basis.
Guaranteeing is a strong
word, but the author firmly believes that a competent maintenance organization,
using the right tools, can assure a safe and reliable system.
Again, it
must be reiterated that batteries will and do fail. The key to operating a
reliable system is to detect problems at an early stage before these problems
can cause a system failure.
Early detection is accomplished by monitoring all the important parameters on a
continuous basis and performing proactive tests.
Parameters
such as voltage and temperature are primarily measured to provide an indication
of the operating environment, so that optimized conditions can help extend
battery life. Voltage measurements, except for detecting a cell with internal
shorts, do not provide any clues to the battery's ability to support a load.
The only
viable tests for detecting performance problems at an early stage are internal
resistance tests and capacity tests. Both of these tests are recommended by
IEEE standards and have been field proven for many years.
Resistance
testing has become very popular and has been accepted by the industry as a
strong supplement to capacity testing, and many leading battery manufacturers
provide warranty settlements based on these readings. IEEE standards call for
periodic capacity testing, but the typical test intervals are five years for
flooded cells and yearly for VRLA. Most people do not feel comfortable not
knowing their battery’s health status in between capacity tests and supplement
them with quarterly resistance tests. Other battery users who cannot take their
system offline or find the funding for capacity testing rely solely on
resistance measurements.
IEEE Standard 450 (Flooded)
and IEEE Standard 1188 (VRLA) provide recommended maintenance and test
practices for large lead-acid storage batteries. These practices, which call
for monthly, quarterly, and annual inspections, are designed to maintain a
reliable battery system. They also call out the required actions to optimize
battery life. Unfortunately, most battery users, except for the nuclear power
industry, do not perform more than a fraction of the inspections and tests
recommended.
The author, who strongly
believes in the IEEE’s recommendations, but also recognizes that economic and
operational constraints exist in real life, hereby offers the following
specific recommendations, which actually go well beyond the IEEE in scope:
1. Install a
permanently mounted battery monitor system, capable of continuously making all
of the voltage, current, temperature, and resistance measurements called for by
the IEEE standards.
2. Respond
to any out-of-tolerance conditions and take the corrective action recommended
by the IEEE or battery manufacturers.
3. Trend
monthly data from internal cell resistance measurements and take action as
follows:
· If any resistance reading exceeds the baseline value for that model cell by
50% or more, then replace the cell without any further testing.
· If the resistance value is between 20% and 50% greater than baseline, then
perform a capacity test to verify the cell's state of health. If capacity is 80%
or less, then replace the cell as soon as possible. If greater than 80%, then
continue to watch for further increases in resistance. If capacity testing is
not an economically viable option, then replace the cell. Keep in mind that
capacity testing can be performed on the entire string offline or can be
performed on a single suspect cell online.
4. Perform
capacity testing in accordance with the IEEE recommendations. This should only
require the rental of a load bank, since the monitor will log data during the
discharge test. Of all the capacity tests recommended, the acceptance test
performed right after installation is the most important one and must be
performed. Why would anybody install a backup system without knowing whether it
was going to perform or not?
5. Analyze
data from the monitor at least once monthly, and perform an annual sanity check
on the monitor itself to verify that it is properly calibrated and working
correctly.
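The resistance-trending rule in recommendation 3 can be expressed as a small decision function. This is a minimal sketch of the thresholds given above, not part of any monitoring product. The names `Action` and `recommend_action` are hypothetical, and treating an increase of exactly 20% as inside the 20% to 50% band is an assumption, since the text does not say which side that boundary falls on.

```python
from enum import Enum


class Action(Enum):
    CONTINUE_TRENDING = "continue trending"
    CAPACITY_TEST = "perform capacity test"
    WATCH = "keep in service, watch for further resistance increase"
    REPLACE = "replace cell"


def recommend_action(resistance, baseline, capacity_pct=None,
                     capacity_test_viable=True):
    """Apply the resistance-trending rule to one cell.

    resistance, baseline: same units (e.g. micro-ohms); baseline is the
        reference value for that cell model.
    capacity_pct: capacity test result in percent, if one has been run.
    capacity_test_viable: False when a capacity test is not an economic option.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    increase = (resistance - baseline) / baseline

    if increase >= 0.50:
        return Action.REPLACE          # 50% or more: replace, no further test
    if increase >= 0.20:               # ASSUMPTION: exactly 20% is in the band
        if not capacity_test_viable:
            return Action.REPLACE
        if capacity_pct is None:
            return Action.CAPACITY_TEST
        return Action.REPLACE if capacity_pct <= 80.0 else Action.WATCH
    return Action.CONTINUE_TRENDING    # below 20%: the text states no action
```

For example, `recommend_action(130, 100)` asks for a capacity test, and `recommend_action(130, 100, capacity_pct=75)` then recommends replacement.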
· Increased reliability. The monitor is online continuously, providing 7x24
coverage versus maintenance visits once or twice a year, and will alarm
immediately when something is wrong.
Too many
people who install monitors believe that, somehow, all will be well with their
battery. Unfortunately, the monitor only collects data and reports on
out-of-tolerance conditions. Someone is still required to analyze the data and
decide on corrective actions.
In order
for monitoring to be completely effective, a central computer monitor site is
required. The central monitor has two-way communication with all battery sites
and receives all alarms. The alarms are logged and forwarded to the appropriate
maintenance personnel via pager and/or fax. The central computer polls all
remote sites at least once a week, primarily to verify the communication link
and make sure the local monitors are functioning.
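The weekly poll described above amounts to a stale-contact check at the central site. The sketch below shows one way that check might look. The site names, the `last_contact` mapping, and the function name are all hypothetical, and it says nothing about how the polling itself is carried out.

```python
from datetime import datetime, timedelta

POLL_INTERVAL = timedelta(days=7)  # "at least once a week" per the text


def overdue_sites(last_contact, now):
    """Return sorted names of sites not successfully polled within POLL_INTERVAL.

    last_contact: dict mapping a (hypothetical) site name to the datetime of
                  its last successful poll, or None if it has never answered.
    now:          the current datetime, passed in so the check is testable.
    """
    return sorted(
        site for site, seen in last_contact.items()
        if seen is None or now - seen > POLL_INTERVAL
    )
```

A site on this list indicates a problem with the communication link or the local monitor rather than with the battery, so it would need its own follow-up.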
The
central computer site must either be staffed by or supported by a battery
expert who can identify problems and instruct maintenance personnel in the
field. Analyzing data, especially the trending of internal cell resistance
measurements (state of health checks), requires someone who is well trained in
battery testing.
Who
should do the monitoring? If the decision is made to use in-house resources,
then it is imperative that the battery data be readily available to the
resident battery expert. Some users that do not have the proper personnel or
don’t know how to integrate the data with their present programs should
consider an outside source. As long as the proper monitor is selected,
outsourcing the monitoring of the battery monitors makes a lot of sense.
Contracting out the battery monitoring responsibility can range from simply
reporting the problems and recommending corrective actions to the outside
source's taking on the entire responsibility.