October 1, 2005
This is an
analysis of what could be causing the frequent and dreaded "email
failed!" support calls. The light-hearted opening (borrowed from
Eyeful Tower) was written because
numerous ISPs and IT personnel are wrongly blamed for email
failures that are really due to
user ignorance. Next time your email fails and you pick up the phone,
remember this to save yourself from a loss of dignity.
If you're a junior consultant, read this introductory checklist and
start your basic learning, before jumping to conclusions. It helps to
be barking up the right tree.
If A can't talk to B and A is perfectly fine, it must be B's fault.
-Clueless in Seattle.
Dear Clueless: Have you considered it might be X,
Y, Z, etc. along the way between A and B? Besides, the fact that A can
talk to C is not proof that A is perfectly fine. -Samcelot
I've tried everything and it still doesn't work. Must be your fault! -Confused
You're barely aware of 2% of the potential
issues. How could you
have tried "everything"? So, shut up and sit down! Let's get to work.
For your own
sake, stay blissful.
Believe me, life is not easy being in the know.
Just don't assume,
for it makes an ass- out of -u- and -me.
The Email Fairy
Email failures could be caused by numerous
things. Many common items are often completely overlooked. The email
process is complex, and involves more than just the sending and
receiving computers plus the server. Each layer of the process is a
potential point of failure.
When email fails, the troubleshooting process
is further complicated by the tendency of certain entities to
over-assert their denial of fault and recklessly point to any
convenient target as a "must be" cause with unwarranted
confidence. Due to the nature of their compensation schemes, these calls are
routinely treated as hot potatoes by most entities involved. They
employ elaborate official responses and pseudo-scientific
"proofs" to do the trick.
An experienced consultant will systematically
narrow down the issues and focus on the few primary suspects for the
situation at hand. He/she will instinctively know to reject certain
false or illogical responses, and persist as warranted. This all has to be
done within the inherent limitations of the tiered support systems
at multiple organizations. The goal is to expedite escalation of
the case to the proper level of specialists so that effective
remedies can commence. Tactfully gauge the knowledge, logic and comprehension
level of the tech. Politely protest any irrelevant or inappropriate procedures and tests,
backed by facts and rationale, without alienating anyone.
This also illustrates the importance of
reducing the number of entities involved in the overall provision of
services. Email failure troubleshooting is far simpler if you
have direct control over the hosting aspect (actual server-level
control, meaning in-house or
co-location at a data center, not just a reseller plan), provided that you also possess the
necessary advanced expertise.
A resilient and redundant system design at
the client site helps to ease the pressure when email fails. This in
turn opens up problem-solving options that are less costly and more
effective. Like a good Boy Scout: Be Prepared! A properly and
holistically designed IT infrastructure for a small business should
allow for graceful degradation of services, so that a single point of
failure does not cause cascading, catastrophic operational failure
for the business.
Key questions to define the exact extent and nature
(Must answer before troubleshooting can begin)
- What are the exact error
messages you received?
- When was the last time it was
known to be working?
- Anything interesting
- Do you have a live
internet connection at this moment?
- Does it affect all stations
at your site?
- Does it affect all users
at your site?
- Does it affect all email
- Does it fail in sending,
receiving, or both?
- Can you receive from anyone at all?
- Can they send you a new email,
instead of a reply?
- Can you send to anyone at all
(by-passing your address book)?
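Since these questions must be answered before troubleshooting can begin, it may help to treat them as an intake form and refuse to proceed while any answer is missing. The following is only a minimal sketch of that idea in Python. The class name, field names and helper functions are hypothetical and not part of the original checklist, and the two questions whose wording is cut off in the list above are left out.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class IntakeRecord:
    """Answers to the key questions. None means "not answered yet".

    Field names are illustrative assumptions, one per question above.
    """
    exact_error_messages: Optional[str] = None
    last_known_working: Optional[str] = None
    internet_connection_live: Optional[bool] = None
    affects_all_stations: Optional[bool] = None
    affects_all_users: Optional[bool] = None
    fails_sending: Optional[bool] = None
    fails_receiving: Optional[bool] = None
    can_receive_from_anyone: Optional[bool] = None
    can_receive_new_email_not_reply: Optional[bool] = None
    can_send_bypassing_address_book: Optional[bool] = None


def unanswered(record: IntakeRecord) -> List[str]:
    """Return the names of the questions that still have no answer."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def ready_to_troubleshoot(record: IntakeRecord) -> bool:
    """True only when every key question has an answer (False counts as one)."""
    return not unanswered(record)
```

A caller would fill in the record while on the phone, call unanswered() to see what is still missing, and start narrowing down suspects only once ready_to_troubleshoot() returns True.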
Here is a partial list of common potential issues:
invalid Recipient domain
domain name registration expired
unknown Recipient user
typo in email address (valid, existent, but incorrect domain)
Recipient quota exceeded
"address book" issues
invalid reply domain
incorrect reply address in email client
incorrect default account in email client
incorrect email address (or default entry) in Exchange Active Directory
(Organized by categories, not by probability of occurrence)
destination server unreachable/non-responsive
local DNS hijack
local DNS caching
Recipient domain registrar issues
Recipient domain authoritative name server issues
general DNS issues, including DNS poisoning
SMTP server unreachable/non-responsive
SMTP reject, unauthorized origins (addresses)
SMTP reject, banned destinations
SMTP reject, failed authentication: incorrect login/password
SMTP reject, failed authorization: at wrong physical site
SMTP reject, wrong authentication mode
send size exceeded
send volume exceeded
send frequency exceeded, ISP throttling
Sender blacklisted by Recipient receiving server
Sender listed by international blacklists
Many blacklists are recklessly aggressive; a few are fraudulent
Recipient blacklisted by Sender SMTP
Recipient address/alias black hole(s) (by design or inadvertent)
Recipient domain unattended catch-all alias
lacking valid SPF
SMTP confiscates viral content, could be false positive
form-based/scripted HTML content triggered filtration
SPAM filtering at sending ISP
SPAM filtering at Recipient ISP
SPAM filtering at Recipient Exchange Server/sendmail
SPAM filtering at Recipient PC
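The exact error message asked for in the key questions often contains an SMTP reply code, and sometimes an enhanced status code in the class.subject.detail form defined by RFC 3463. As a rough illustration of how such a message can point at one of the categories listed here, the Python sketch below extracts the code and suggests a first suspect. The function name, the hint table and its wording are assumptions made for this example. The table covers only a handful of codes, and real bounce messages vary widely, so treat the result as a starting hint rather than a diagnosis.

```python
import re
from typing import Optional, Tuple

# First digit of a reply code or enhanced status code.
SEVERITY = {
    "2": "success",
    "3": "intermediate",
    "4": "transient failure",   # worth retrying later
    "5": "permanent failure",   # retrying unchanged will not help
}

# subject.detail part of an RFC 3463 enhanced code -> suspect from the list.
# Illustrative and deliberately incomplete.
DETAIL_HINTS = {
    "1.1": "unknown Recipient user",
    "1.2": "invalid Recipient domain",
    "2.2": "Recipient quota exceeded",
    "2.3": "send size exceeded",
    "3.4": "send size exceeded",
    "4.1": "destination server unreachable/non-responsive",
    "7.1": "SMTP reject (delivery not authorized, message refused)",
}

# Enhanced code such as 5.1.1, but not a fragment of an IP address.
_ENHANCED = re.compile(r"(?<![\d.])([245])\.(\d{1,3})\.(\d{1,3})(?!\.?\d)")
# Plain three-digit reply code such as 550.
_BASIC = re.compile(r"(?<![\d.])([2-5])\d{2}(?![\d.])")


def classify(error_text: str) -> Tuple[str, Optional[str]]:
    """Return (severity, suspect hint or None) for a raw error message."""
    match = _ENHANCED.search(error_text)
    if match:
        status_class, subject, detail = match.groups()
        return SEVERITY[status_class], DETAIL_HINTS.get(f"{subject}.{detail}")
    match = _BASIC.search(error_text)
    if match:
        return SEVERITY[match.group(1)], None
    return "unknown", None
```

For example, passing "550 5.1.1 User unknown" yields a permanent failure with the "unknown Recipient user" hint, while a message with no recognizable code comes back as "unknown", which is itself a cue to go back and ask for the exact wording.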