The case for CAS
MANOJ CHUGH on the many advantages of Content Addressed Storage
UNTIL NOW, the industry-wide embrace of storage networking
and the rapid coupling of countless petabytes of data have shaped conventional
storage thinking in two fundamental ways. One, storage improvements must always
be faster and cheaper. Two, complexity must be tolerated to achieve efficiency.
Such thinking is rooted in the build-out process that evolved storage networks
to where they are today: at the epicentre of a company's business.
But new challenges posed by Web-based service models, increasing regulatory
requirements, heightened security, data preservation and storage optimisation
are ushering out old thinking, and bringing to light the value of an architecture
called Content Addressed Storage, or CAS.
CAS is key to business because it overcomes the threats posed by these new storage
challenges: the unmanaged, exponential growth in capacity and the number
of fixed content objects.
By the end of next year, most of the data stored by every corporation and the
entire US government combined will be fixed content. Any file requiring storage
that isn't changed, updated, or modified when recalled is essentially fixed
content. Examples include legal and securities documents, photographic archives,
medical imaging files, product and promotional shots, consumer check images,
media presentations, transit maps, and even instant messages. The daily
flow of business, along with increasing regulations requiring businesses to
store documents for a certain number of years, means fixed content can quickly
become an 800-pound gorilla on the back of a company's storage budget.
New category
When organisations have looked for fixed content storage solutions, they've
faced a dilemma: what they need versus what the storage industry provides. The
dilemma is complicated by having to judge the value of information by weighing
the cost of storing it against how often it is used. For example, the frequency of
using information such as check images, contracts or e-mail usually diminishes
as it ages. However, when the information is needed, the speed of access can
be the defining factor in an organisation's ability to take full advantage
of a business opportunity. Until CAS, organisations had to make a hard choice
between the speed of information access (provided by magnetic disk storage solutions)
and assured content authenticity (provided by optical technology). Tape technology,
the choice of some, is considered the least functional, least-cost option.
CAS is a new category of storage. It provides the online performance of magnetic
disk because it is a magnetic disk, and assures content authenticity equal to
or better than optical technology at a total cost of ownership equal to or better
than that of a tape library. It offers more functionality at a lower cost.
Everything needed and wanted for fixed content is in one solution, which compares
favourably with today's scenario of cobbling together dissimilar technologies.
With CAS, an organisation's fixed content is cost-effectively available
online, 24/7, with assured authenticity.
One copy is enough
An e-mail shows how unmanaged fixed content can savage an intelligent storage
network's resources. If a CEO sends a company-wide e-mail to
60,000 employees about new travel guidelines, it is likely to be indefinitely
stored on and routinely viewed from a server or a PC drive. As a matter of convenience,
employees may save the e-mail for reference. For storage administrators, that
e-mail message is now taking up as much as 60,000 times the disk space necessary.
And, because everyone is in possession of a copy of the original message, the
risk increases that its contents could be tampered with and forwarded to outsiders.
With CAS, instead of sending out thousands of e-mails, an original document
can be created and stored in the CAS repository along with a digital "claim
check" so that "pointers" or "links" can be sent to direct authorised employees
to the original. The increased efficiency of storage in a CAS-enabled environment
is thus evident; the optimisation of storage resources by CAS becomes even greater
when that relatively small e-mail is replaced by a large multimedia corporate
sales presentation stored and recalled nearly every day.
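
The one-copy idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular CAS product is built: the `SingleInstanceStore` class, its method names, the in-memory dictionary, and the choice of SHA-256 as the fingerprint are all hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    def put(self, content: bytes) -> str:
        # The address is derived from the content itself.
        # SHA-256 is an assumption made for this sketch.
        address = hashlib.sha256(content).hexdigest()
        # Storing the same bytes again does not create a second copy.
        self._objects.setdefault(address, content)
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def stored_bytes(self) -> int:
        return sum(len(obj) for obj in self._objects.values())


store = SingleInstanceStore()
memo = b"New travel guidelines take effect next month."

# 60,000 employees each "save" the memo; every save yields the same claim check.
claim_checks = {store.put(memo) for _ in range(60_000)}
```

After the loop, `claim_checks` holds a single address and `store.stored_bytes()` equals the size of one memo, however many employees saved it.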
Regulatory requirements
Take this same model of CAS efficiency and apply it to securities documents
and medical records. Here, the business role of CAS in ensuring the retention,
preservation and authentication of data in a federally-regulated environment
becomes even more pertinent. The number and type of federal regulations governing
business records and other data is on the rise. Whether it is
Rule 17a-4 under the amended Securities Exchange Act of 1934 (which
requires investment records to be retained for anywhere from five to seven years),
the Health Insurance Portability and Accountability Act (which addresses the
storage and security of health information), or a matter of national security,
companies that cannot produce an unmodified original document can face expensive
legal action.
On the technology front, advances in disk drives have fostered the role of CAS
within businesses. Companies that once stored digitally-archived records on
tape libraries or older WORM (write-once-read-many) optical drives can now take
advantage of cutting-edge ATA in CAS arrays. CAS uses ATA disk technology, but
that is where comparisons to conventional ATA disk arrays end. It is the features
of CAS's extensive software layer that fortify its business advantage by
adding peace of mind when it comes to preserving fixed content that absolutely
must not be corrupted. Just imagine pulling out a digitally-archived MRI for
use as defence evidence in a medical malpractice suit, then suddenly discovering
that the data has degraded to the point where a dark spot has appeared on the
body image, a spot that wasn't there before.
Breaking away
When the reputation and future of a business are on the line, focusing only on
the speed and initial acquisition price of a storage technology is not the answer.
Major corporate players who understand this are investing in CAS now, and realise
that it pays dividends in three significant ways.
First, by optimising storage capacity and streamlining data retrieval through
object-based storage, not complex logical-volume file systems. Second, by ensuring
retention, preservation and authentication of data. Third, by integrating seamlessly
and easily with a company's existing storage area network (SAN) and network
attached storage (NAS) architectures.
CAS breaks away from conventional thinking about storage by offering all three
of these benefits, and more, in what is perhaps one of the simplest storage architectures
ever designed. Its arrays attach directly to a company's IP-based Ethernet,
so no changes are required to the company's existing SAN and NAS design
to accommodate a CAS array. By operating independently from its place on the
Ethernet, a CAS array can serve files to authorised clients without complicating
the software-based management of other storage arrays.
CAS arrays also manage the allocation and location of stored object files internally
as a self-managed process. This further simplifies the storage administrator's
task by eliminating the need for complex storage application integration, while
simultaneously providing a secure, autonomous and scalable repository for fixed
content within the overall storage network, a money-saving proposition.
CAS claim checks are digital fingerprints derived uniquely from
the object itself to create a distinct content address. This one-of-a-kind
content address, along with the fact that fixed content is stored as an object
file, means users are not retrieving logical file volumes of potential duplicate
files based on file name or other search criteria. Instead, users retrieve the
actual original document as it was first placed on the CAS.
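
Because the address is computed from the content, it can also be used to check that what comes back is what went in. The sketch below assumes SHA-256 as the fingerprint function; `content_address` and `is_authentic` are hypothetical helper names, and the sample bytes stand in for a real archived file.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive a fingerprint from the object's bytes (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(data: bytes, claim_check: str) -> bool:
    """Recompute the fingerprint and compare it with the claim check."""
    return content_address(data) == claim_check


original = b"archived image bytes"
claim_check = content_address(original)

# Simulate silent degradation: flip a single bit in the stored copy.
degraded = bytearray(original)
degraded[0] ^= 0x01
```

Here `is_authentic(original, claim_check)` holds, while the degraded copy fails the same check, since even a one-bit change produces a different fingerprint.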
Almost unhackable
Keeping fixed content in step with related applications is done using technology
called a content descriptor file, which contains time-stamp information, application-specific
metadata, and the content address of the fixed content. These features simplify
CAS management so that storage administrators do not have to worry about file
system hierarchies which can drain performance when extended beyond set boundaries.
They also enable the application of specific policies regarding how long certain
fixed content should be saved and when it should be deleted. This reduces the
management burden for storage administrators by way of intelligent automation.
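
As a rough picture of what such a descriptor might carry, the sketch below bundles a time stamp, application metadata, a content address and a retention period. Every name in it (`ContentDescriptor`, `may_delete`, the metadata keys, the seven-year period) is an assumption chosen for illustration and does not describe a real product's file format.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ContentDescriptor:
    """Hypothetical descriptor: links an application record to fixed content."""
    content_address: str   # fingerprint of the stored object
    stored_at: datetime    # time stamp recorded when the object was written
    retention: timedelta   # how long the object must be kept
    metadata: dict = field(default_factory=dict)  # application-specific data

    def may_delete(self, now: datetime) -> bool:
        # Deletion is allowed only once the retention period has elapsed.
        return now >= self.stored_at + self.retention


record = b"trade confirmation, account 0001"
descriptor = ContentDescriptor(
    content_address=hashlib.sha256(record).hexdigest(),  # SHA-256 assumed
    stored_at=datetime(2004, 1, 1, tzinfo=timezone.utc),
    retention=timedelta(days=7 * 365),  # illustrative seven-year policy
    metadata={"application": "trading-archive", "type": "confirmation"},
)
```

With a policy expressed this way, a check such as `descriptor.may_delete(now)` is something the array could run on its own, which is the kind of automation described above.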
Many types of data that fall under the definition of fixed content are highly
regulated or sensitive in nature, such as medical images, patient records and
financial information. This often moves the requirements for confidentiality
and security to the forefront of the discussion. The good news is that CAS arrays
are virtually unhackable, period. With CAS, authorisation is maintained
at the claim-check level, and the system's way of relating fixed content
to its user application adds an extra layer of security above and beyond access
codes and passwords. Advanced CAS arrays can deny user access by the very application
being used to call up the fixed content. With this ability, certain content
stored on a CAS array can be used exclusively by departments running their own
applications, while the CAS array continues to serve content to other departments
and the World Wide Web as a whole.
Manoj Chugh is the president of EMC (India and SAARC)