<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.hds.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.hds.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><!-- generator="wordpress/2.0.4" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">

<channel>
	<title>Michael Hay</title>
	<link>http://blogs.hds.com/michael</link>
	<description />
	<pubDate>Wed, 02 Jul 2008 14:29:10 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.4</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.hds.com/hds/michaelhay" type="application/rss+xml" /><item>
		<title>Rethinking Unstructured Information</title>
		<link>http://blogs.hds.com/michael/?p=15</link>
		<comments>http://blogs.hds.com/michael/?p=15#comments</comments>
		<pubDate>Thu, 26 Jun 2008 16:37:08 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Best Practices</category>
	<category>Forward Thinking</category>
	<category>File Storage</category>
	<category>Search</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=15</guid>
		<description><![CDATA[Over the past several years governmental organizations, commercial ventures and educational institutions have all been &#8220;waking up&#8221; to the fact that electronically stored information is something that society in general has to govern and protect.  You can see the trends of it all around if you look.  For example, the recent tussle over [...]]]></description>
			<content:encoded><![CDATA[<p>Over the past several years governmental organizations, commercial ventures and educational<img width="231" height="106" align="right" title="XML in use" alt="XML in use" src="http://blog.steeleprice.net/Images/LINQandXMLfortheVBDeveloper_B9B4/XML_16.jpg" /> institutions have all been &#8220;waking up&#8221; to the fact that electronically stored information is something that society in general has to govern and protect.  You can see the trend all around if you look: for example, in the recent tussle over the standardization of OpenXML versus the Open Document Format.  Both of these standards are essentially after the same thing: a self-describing data format that puts the power of data ownership, structure and format in the hands of the data&#8217;s owners, both today and far into the future.  It is almost as if governments, end users, and corporations are all realizing at the same time that the millennia spent figuring out how to preserve, protect and ensure the authenticity of paper records must now urgently be applied to the digital world.  In a sense it is as if the human consciousness is waking up to the fact that we need to apply a disciplined approach to all that Electronically Stored Information (ESI) that&#8217;s lying around.</p>
<p>Another point of evidence comes from the body of court cases and new regulations, all of which put pressure on companies to look again at their unstructured information with fresh eyes.  I&#8217;m sure that everyone reading this knows about Enron or perhaps may even remember <a target="_blank" title="SOX on Wikipedia" href="http://en.wikipedia.org/wiki/Sarbanes-Oxley_Act">Sarbanes-Oxley</a>, which is effectively the United States government&#8217;s attempt to increase both accountability and transparency in the financial markets.  Of course there is also the urban legend that Kenneth Lay was so guilty that he died of a self-induced heart attack before he could be sentenced (and if it wasn&#8217;t an urban legend before, it sure is now).  However, while dramatic and definitely in the news, when compared to the <a target="_blank" title="FRCP" href="http://www.law.cornell.edu/rules/frcp/">Federal Rules of Civil Procedure</a> (FRCP), SOX is &#8220;chump change.&#8221;  My reason for stating this comes from both the ambiguity and the broad applicability of the FRCP.  The rules apply to an organization&#8217;s main parts and also to any agents, subsidiaries, or affiliates, even those that are overseas.  Further, they apply to any organization (public companies, private companies, and educational institutions) that can end up in Federal court for some reason.  What the FRCP really says is that organizations have to be prepared for litigation, which in practice means the following:</p>
<ul>
<li>A records retention policy is really required and more importantly adherence to it</li>
<li>The ability to quickly put items on &#8220;litigation hold&#8221; is required</li>
<li>With the records retention policy in effect it is also a good idea to provide evidence that your organization is following the policy through potentially documented audits</li>
</ul>
<p>In short, if you can say to yourself that you&#8217;ve documented what you are supposed to do, you are doing it, and you can prove it, you are in good shape.</p>
<p>Now there has been a lot written about the FRCP in 2006, 2007 and even now, and I&#8217;m sure by this time you&#8217;re saying, what the, you&#8217;re not saying anything new here.  If you have come to that conclusion you&#8217;d be right; however, I was using this point to illustrate that we as a culture are now considering how to protect and preserve ESI from a real archival perspective.  Essentially, we need to look at the paper world as an example of what not to do, learn the lessons and apply them to ESI.  I do want to point out something fairly unique here: when companies are forced into looking at things through regulation, it can often lead to efficiencies that they had not thought of in the past.  For getting the unstructured world in order, I firmly believe that we will need to develop unstructured reporting tools which don&#8217;t really exist today.  Instead, a mishmash of content management systems has moved onto center stage as the thing to fix this critical issue.  Well, I think that logical approach is doomed from the beginning.  One of the first things to consider: if content management could solve this problem, it already would have.  The issue that I see with content management systems is that organizations have to think, a priori, of all possible outcomes for work flow, structure, permissions and hierarchy.  Pointedly, that logical approach is flat wrong in two ways:</p>
<ul>
<li>The world of unstructured information is a very messy one and vast and people largely don&#8217;t know or understand the contents or even what they have, so how can an upfront strategy work if scale and scope aren&#8217;t really known?</li>
<li>Building a structure with messy information is really an emergent property of people interacting with one another and their information, not a premeditated action.</li>
</ul>
<p>I do want to provide an example of where point-in-time observation can lead to<img width="155" height="217" align="right" src="http://www.trc.govt.nz/environment/animals/images/argyant3.jpg" /> incorrect logical assumptions about premeditation: specifically, ants gathering food.  If one were to look at a fully formed ant trail where the ants were clearly taking food from a source and bringing it back to the nest, one could wrongly assume that there was an intelligent controller &#8220;directing&#8221; the ants towards getting the food and bringing it back to the nest.  This assumption would be completely wrong, because it supposes that the controller had a premeditated thought directing the ants to complete the task.  In actuality the ants have some relatively simple programming and social interaction on their side.  Essentially we can think of every ant as knowing about food, knowing how to get back home, knowing how to leave scent trails, and knowing how to follow scent trails.  When they run into food they combine these simple programs to gather the food and return home while leaving scent marks.  If one adds a randomly distributed set of ants across the field then it is probable that many ants will run across the scent trail, follow it back to the food, and return to the nest while strengthening the trail.  Repeat this process over many ants and an organized ant trail emerges.  So in the natural world this is a case of organization being an emergent property of time and social interaction, and not something thought of a priori.<br />
<img width="366" height="98" align="left" src="http://www.millennialsconference.com/la/Millennials_Logo.jpg" />Another reason why premeditated application of structure fails the giggle test: there is an entire generation of workers (almost 80 million of them) entering the workforce, and they are trained by Google and others in the use of modern unstructured information reporting tools such as blogs, RSS, wikis, digg, etc.  In essence it is through the social interaction between these people that their structure emerges.  So my point is that we as IT professionals today need to start bringing these tools into our companies and teams so that we can start speaking early to these newcomers to the workforce.  I also firmly believe that as we add these tools to our bag of tricks, we will be solving the problem of working with unstructured information and remedying the failed experiment of content management.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=15</wfw:commentRSS>
		</item>
		<item>
		<title>Hitachi, a Partner to the World</title>
		<link>http://blogs.hds.com/michael/?p=18</link>
		<comments>http://blogs.hds.com/michael/?p=18#comments</comments>
		<pubDate>Sat, 17 May 2008 19:01:57 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Charity</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=18</guid>
		<description><![CDATA[So what does that mean anyway, right?  Well, I&#8217;m headed back to when I started blogging in the first place, with a statement about being proud to work for a company that understands what it means to really care about the environment and its people.  When I started talking about being green I was pointing [...]]]></description>
			<content:encoded><![CDATA[<p>So what does that mean anyway, right?  Well, I&#8217;m headed back to when I started blogging in the first place, with a statement about being proud to work for a company that understands what it means to really care about the environment and its people.  When I started talking about being green I was pointing at the argument that Hitachi does more than just follow the fad, and has been living this virtue as well.  In that same spirit, I want to point out a couple of things, and I&#8217;m not trying to say hey look what Hitachi has done here.</p>
<p>I&#8217;m sure that everyone is aware of the recent events in Myanmar and China over the past several weeks.  The situation is horrible, truly horrible.  When they are combined, in my opinion, they have the potential to approach the scale of the tsunami of several years ago, in terms of the impact on human life.  As with Katrina in the US, Hitachi is there providing what it can in the form of relief: for China in the form of heavy moving equipment and cash, and for Myanmar in the form of cash.</p>
<p><img width="538" height="223" alt="Hitachi Construction" title="Hitachi Construction" src="http://www.hitachiconstruction.com/en_US/cfd/construction/hitachi_const/media/images/zaxis/zaxis_home_image.jpg" /></p>
<p>For me this is a little more personal, as much of my family was impacted by Hurricane Katrina.  Fortunately we were blessed by no loss of life and the gift of insurance.  During that time Hitachi also gave generously to the cause and specifically supplied heavy moving equipment to assist in the reconstruction.  While I&#8217;m certainly an employee of the company that has done this, I can remember getting a call from my Mom saying that she saw some of Hitachi&#8217;s equipment all over the city of New Orleans providing assistance to remove debris, so it is also personal.</p>
<p>I&#8217;ve been back to New Orleans just this year and while my parents have moved back into their house there is still a long way to go for New Orleans to become something new, and Hitachi has indeed played a part in doing that.  On a local corporate level several HDS personnel were personally involved in Katrina as I imagine Hitachi employees in Sichuan are doing what they can.  So in that sense, I am reminded about being proud to work for Hitachi, and I want to wish well those doing this hard good work.<br />
If you are interested in the links for the donations Hitachi has made to the disasters in China and Myanmar please see the following links.</p>
<ul>
<li>http://www.hitachi.com/New/cnews/080514a.html</li>
<li>http://www.hitachi.com/New/cnews/index.html</li>
</ul>
<p>Regards,</p>
<p>Michael C. Hay
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=18</wfw:commentRSS>
		</item>
		<item>
		<title>The Grass Really is Greener…</title>
		<link>http://blogs.hds.com/michael/?p=17</link>
		<comments>http://blogs.hds.com/michael/?p=17#comments</comments>
		<pubDate>Mon, 21 Apr 2008 22:53:28 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Green</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=17</guid>
		<description><![CDATA[Green was the topic of the day on my first blog post, today I&#8217;m writing about it again.  Specifically with respect to the CoolCenter50, which is about Hitachi practicing the application of Green technologies as a customer.  Of course included in the mix are Hitachi Storage technologies which are green by design.  [...]]]></description>
			<content:encoded><![CDATA[<p>Green was the topic of the day on my first blog post, today I&#8217;m writing about it again.  Specifically with respect to the CoolCenter50, which is about Hitachi practicing the application of Green technologies as a customer.  Of course included in the mix are Hitachi Storage technologies which are green by design.  To quote our Wikibon friends:</p>
<blockquote><p><em><span style="font-size: 11pt">&#8220;Within the IT industry, the Wikibon community believes that Hitachi, Ltd. has the most comprehensive and fully implemented corporate green plan in place,&#8221; says David Vellante, president and CEO of IT Centrix and co-founder of the Wikibon Project. &#8220;Hitachi&#8217;s progress on its emission neutral strategy is impressive and genuine. Initiatives such as the collaboration of various Hitachi groups for a new datacenter design in Yokohama underscore the firm’s commitment and are great drivers for change. Within storage the USP V controller re-designs and the implementation of virtualization, thin provisioning and support for external devices that spin down, have helped improve utilization and reduce power consumption by 63% over previous generations, a substantial milestone that sets an example of leadership for the industry.&#8221;</span> </em></p></blockquote>
<p>Why is it that Hitachi&#8217;s offerings are so mature in comparison to the industry?  Well, we could go all wavy hands and stuff, or we could draw a natural, straightforward conclusion.  The Kyoto Protocol was signed in Japan; Hitachi, like Toyota, is a Japanese company; Japan &#8220;gets it&#8221;; therefore, Hitachi &#8220;gets it&#8221;.  That being said, now I have to prove it.  Well, for literally over a thousand years there have been well defined Japanese arts which celebrate the Earth.  Bonsai, Ikebana, and even some of the early Japanese religions celebrate the Earth spirit.  Aspects of the Earth are used in ritual purification for Aikido and Zen.  In short, &#8220;getting it&#8221; has been in the Japanese culture for a really long time.  Hitachi is a company that is applying being green in delivery of the products we produce, in our development and manufacturing facilities, and frankly in our office building where less cooling is used for humans.  In summary, I think that Hitachi was built green from the beginning and has a natural Japanese legacy to take advantage of.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=17</wfw:commentRSS>
		</item>
		<item>
		<title>Where’s Centera Headed?</title>
		<link>http://blogs.hds.com/michael/?p=16</link>
		<comments>http://blogs.hds.com/michael/?p=16#comments</comments>
		<pubDate>Fri, 04 Apr 2008 20:34:37 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>File Storage</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=16</guid>
		<description><![CDATA[When you make a bet on a technology platform you expect the provider to take all the necessary steps to keep the technology going, by retaining key personnel and developing proper contingency plans for when you cannot.  We found this quote today:
“With EMC scaling down the Centera unit and the future of Centera unclear, [...]]]></description>
			<content:encoded><![CDATA[<p>When you make a bet on a technology platform you expect the provider to take all the necessary steps to keep the technology going, by retaining key personnel and developing proper contingency plans for when you cannot.  We found this quote today:</p>
<p class="MsoNormal">“<em>With EMC scaling down the Centera unit and the future of Centera unclear, the chance to join Caringo, which understands the potential of CAS, and partner once again with Paul Carpentier was too good of an opportunity to pass up</em>,” said Van Riel. “<em>The need for CAS solutions is ever increasing and I am excited to participate in the next chapter of the technology at Caringo</em>.” (source: <a target="_blank" href="http://www.storagenewsletter.com/news/people/caringo-jan-van-riel">Storage News Letter</a>)</p>
<p class="MsoNormal">It should be noted that Van Riel is one of the key Centera architects representing the &#8220;soul&#8221; of the technology. This is a big loss to EMC and an even bigger loss to Centera customers.  I have to say if I were an EMC customer, I&#8217;d be really worried!</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=16</wfw:commentRSS>
		</item>
		<item>
		<title>When You Care Enough to Copy the Very Best…</title>
		<link>http://blogs.hds.com/michael/?p=14</link>
		<comments>http://blogs.hds.com/michael/?p=14#comments</comments>
		<pubDate>Fri, 14 Mar 2008 16:56:03 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>File Storage</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=14</guid>
		<description><![CDATA[Last May we debuted HCAP V2.0; earlier this month we rolled out HCAP V2.4.  Our theme, for V2.4, was &#8220;being secure out of the box&#8221; from an administration perspective.  The focus was on Role Based Access Controls, password complexity, linkages to common authentication directories (LDAP and AD), pervasive audit logging of all management [...]]]></description>
			<content:encoded><![CDATA[<p>Last May we debuted HCAP V2.0; earlier this month we rolled out HCAP V2.4.  Our theme, for V2.4, was &#8220;being secure out of the box&#8221; from an administration perspective.  The focus was on Role Based Access Controls, password complexity, linkages to common authentication directories (LDAP and AD), pervasive audit logging of all management transactions, and other goodies like only allowing management transactions over HTTP + TLS/SSL.  These security oriented features joined other capabilities already in the product as of V2.0:</p>
<ul>
<li>Encryption of ingested objects over HTTPS</li>
<li>Encryption of replicated objects when using object level TCP/IP based replication between two archives</li>
<li>A packaging format called AOP (Archive Object Package) which can be utilized by users wanting to backup the archive via NDMP</li>
<li>Finally, AES encryption of data in-flight and at-rest over the SAN fabric, with the intention  of protecting sensitive data on drives finding their way out of the data center</li>
</ul>
<p>The bottom line is that we are leaps and bounds more secure than the competition.  Which brings me to the title of the post.</p>
<p>EMC recently announced &#8220;CentraStar (R) 4.0&#8221; including the following somewhat vague descriptions of their feature set.</p>
<div style="margin-left: 40px">&#8220;In addition to supporting more objects and speedier self-healing, the new version of the CentraStar operating software improves system flexibility and security.  This includes a new capability that enables administrators to segregate, configure and separately manage application, management and replication traffic for the optimal mix of performance and protection.  It also gives administrators additional system logging and auditing capabilities; including the ability to custom configure password complexity rules for enhance security&#8221;</div>
<p>Well, I don&#8217;t know where to start.  If you look at the announcement in detail you can see that, wow, they can now support 750GB drives and up to 25 million objects per drive.  Okay, HCAP can support up to 2TB LUNs (2.7x more capacity than EMC) and is future proofed for up to 16TB LUNs (nearly 22x more capacity than EMC).  So in essence, when I look at what EMC shows, I say so what.  As to the number of objects per drive, I honestly don&#8217;t get it.  How many per node, I ask?  I can make some assumptions and let you all know what I think the number is (100 million), and when I compare this to HCAP (400 million) we are 4x their size.  Again, that&#8217;s my estimation; if EMC had provided more detail in their announcement I could be more accurate, but as it stands, Centera doesn&#8217;t scale, sorry.</p>
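<p>For anyone who wants to check the arithmetic, here is a small sketch of how the capacity ratios above can be reproduced.  It assumes binary units (1 TB = 1024 GB), which the post does not state, and the names <em>EMC_DRIVE_GB</em> and <em>capacity_ratio</em> are made up for illustration.</p>

```python
# Back-of-the-envelope check of the capacity ratios quoted above.
# Assumption: binary units (1 TB = 1024 GB); the post does not say which.
EMC_DRIVE_GB = 750  # largest drive size mentioned for CentraStar 4.0


def capacity_ratio(lun_tb, baseline_gb=EMC_DRIVE_GB):
    """Return how many times larger a LUN of lun_tb terabytes is than the baseline."""
    return lun_tb * 1024 / baseline_gb


print(round(capacity_ratio(2), 1))   # 2 TB LUN vs. 750 GB drive
print(round(capacity_ratio(16), 1))  # 16 TB LUN vs. 750 GB drive
print(400 / 100)                     # estimated objects per node, HCAP vs. Centera
```

<p>Under that assumption the two ratios come out at roughly 2.7 and 21.8, in line with the figures quoted; the per-node object counts remain my estimates.</p>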
<p>The other point of EMC&#8217;s announcement is secure administration.  Well wow, if copying is the sincerest form of flattery, then I feel very flattered.  Further, it&#8217;s nice to know that a company who purchased RSA chooses to copy Hitachi rather than innovate on their own.</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=14</wfw:commentRSS>
		</item>
		<item>
		<title>Recipe - Scheduled Deletion of Objects Not Under Retention for HCAP</title>
		<link>http://blogs.hds.com/michael/?p=12</link>
		<comments>http://blogs.hds.com/michael/?p=12#comments</comments>
		<pubDate>Tue, 11 Mar 2008 20:05:57 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>File Storage</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=12</guid>
		<description><![CDATA[Due to the object/file centric nature of HCAP it is possible that users can do really cool and simple things with the system.  For example, let&#8217;s say that one wanted to automate the process of deleting objects within a given directory space.  Here&#8217;s how they would go about doing it:

Pick your favorite operating system and [...]]]></description>
			<content:encoded><![CDATA[<p>Due to the object/file centric nature of HCAP it is possible that users can do really cool and simple things with the system.  For example, let&#8217;s say that one wanted to automate the process of deleting objects within a given directory space.  Here&#8217;s how they would go about doing it:</p>
<ul>
<li>Pick your favorite operating system and associated scheduler</li>
<li>Pick your favorite scripting language on that platform; there are many to choose from, and I happen to like <a href="http://www.python.org">Python</a></li>
<li>Determine the directory that you automatically want to watch for expired files, in our example we&#8217;ll call it <span style="font-style: italic">/fcfs_data/foo</span></li>
<li>Well, it just so happens that HCAP keeps a metadata directory called <span style="font-style: italic">/fcfs_metadata/foo/.directory-metadata/info/expired</span> containing zero byte representations of files that either aren&#8217;t under retention or have their retention period completed.  If these representations are deleted then the corresponding files/objects are removed from the system</li>
<li>Write your script that will be run from the scheduler you selected which will look at <span style="font-style: italic">/fcfs_metadata/foo/.directory-metadata/info/expired</span>, seek out newly expired files and delete them</li>
</ul>
<p>With all of that done you&#8217;ve got a simple approach to automating deletion of expired objects/files.  I&#8217;d also like to suggest that you log the heck out of this script and broadcast it to a centralized logging infrastructure like syslog.</p>
<p>If you&#8217;d like to see what this would look like in Python, see below.</p>
<p>Script Disclaimer:</p>
<p class="MsoNormal">
<span style="font-size: 7pt; color: #000000">ALL OF THE DESCRIBED SCRIPT IS PROVIDED WITHOUT ANY EXPRESS OR IMPLIED WARRANTY, INCLUDING WITHOUT LIMITATION ANY WARRANTIES THAT IT IS FREE OF DEFECTS, MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE OR NONINFRINGING AND RECIPIENT WILL NOT SEEK ANY INDEMNIFICATION OR OTHER REMEDY AGAINST HDS OR ARCHIVAS IN CONNECTION WITH THE SCRIPT.  FURTHER NEITHER HDS NOR ARCHIVAS WILL HAVE ANY OBLIGATION TO SUPPORT OR MAINTAIN THE SCRIPT OR THE STEPS DESCRIBED IN THE SCRIPT.</span></p>
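<div style="margin-left: 40px">import os<br />
<br />
# Metadata directory holding the zero byte representations of expired files<br />
target = "/fcfs_metadata/foo/.directory-metadata/info/expired"<br />
files = os.listdir(target)<br />
# Deleting a representation removes the corresponding object from the archive<br />
for fl in files:
<div style="margin-left: 40px">os.remove(os.path.join(target, fl))</div>
</div>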
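<p>To follow the logging suggestion above, here is a minimal sketch of the same loop with each deletion reported to syslog.  The function names, the logger name and the <em>/dev/log</em> socket address are assumptions for illustration (that address is typical on Linux hosts; other platforms differ), not anything HCAP provides.</p>

```python
import logging
import logging.handlers
import os


def make_syslog_logger(address="/dev/log"):
    """Return a logger that forwards records to the local syslog daemon.

    The "/dev/log" default is an assumption that holds on most Linux hosts;
    other platforms use a different socket or a (host, port) tuple.
    """
    logger = logging.getLogger("hcap_expired_cleanup")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.SysLogHandler(address=address))
    return logger


def delete_expired(target, logger):
    """Delete every entry in the expired directory, logging each outcome.

    Returns the list of names that were removed successfully.
    """
    deleted = []
    for name in sorted(os.listdir(target)):
        path = os.path.join(target, name)
        try:
            os.remove(path)
        except OSError as err:
            # Keep going so one failure does not block the rest of the run.
            logger.error("could not delete %s: %s", path, err)
            continue
        logger.info("deleted expired object %s", path)
        deleted.append(name)
    return deleted
```

<p>A scheduled job would then call <em>delete_expired("/fcfs_metadata/foo/.directory-metadata/info/expired", make_syslog_logger())</em>.  Catching the error per file means a single failed delete is logged rather than ending the run.</p>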
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=12</wfw:commentRSS>
		</item>
		<item>
		<title>Hitachi File Storage Platforms</title>
		<link>http://blogs.hds.com/michael/?p=10</link>
		<comments>http://blogs.hds.com/michael/?p=10#comments</comments>
		<pubDate>Tue, 11 Mar 2008 16:41:13 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Forward Thinking</category>
	<category>Green</category>
	<category>File Storage</category>
	<category>Search</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=10</guid>
		<description><![CDATA[Introduction
Today Hitachi is announcing quite a bit in the area of file storage platforms.  To start with are core updates in our underlying platforms.  From the debut of the Essential NAS Platform (ENP), a hardware update on the High Performance NAS Platform, and a software revision in the Content Archive Platform; to the [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Introduction</strong></p>
<p>Today Hitachi is announcing quite a bit in the area of file storage platforms.  To start with are core updates in our underlying platforms.  From the debut of the Essential NAS Platform (ENP), a hardware update on the High Performance NAS Platform, and a software revision in the Content Archive Platform; to the entrance of a new product suite for us: the Hitachi Data Discovery Suite that includes two offerings.  I want to provide some amount of detail, but not what we are stressing in the announcements.  This will be more about behind the scenes kinds of things.</p>
<p><strong>High Performance NAS Platform (HNAS)<br />
</strong><br />
Before I get too far into this I do want to bring up a point close to my heart which I hear on a regular basis: well, isn&#8217;t HDS just going to drop BlueArc like you did with NetApp?  To start with, the relationship with BlueArc is 1000% different.  My colleague, Shmuel Shottan, CTO of BlueArc, is my personal mentor and good friend; however, that really doesn&#8217;t summarize the total relationship with BlueArc.  The two companies enjoy a symbiotic relationship which spans mutual respect, trust and a shared vision.  You will see that in the intentional alignment between our two companies as well as the fact that HDS has engineers with source code access at the BlueArc site in the UK.  This was done because when both companies got down to it we realized that in order to implement our mutual strategies in a timely manner the best thing was to increase the engineering resources in key areas like offloading some portion of the full content indexing process and performing HSM migrations to/from externally attached NAS and active archive devices.  Beyond the shared strategy and co-development activities, HDS owns an equity stake in BlueArc, something that was not the case with NetApp.  So hopefully this tells our user base that our relationship is entirely different and growing in a positive direction.  Okay, just because I don&#8217;t want to hear myself talk and all, don&#8217;t take my word for it; here&#8217;s what Shmuel had to say about our partnership.</p>
<p><img align="left" title="Shmuel" alt="Shmuel" src="http://bluearc.com/graphics/company/photo_shottan.jpg" />I consider myself extremely fortunate for having the opportunity to work closely with HDS and with Michael in particular over the last 18 months. Being called a mentor by Michael is a great honor. Relationships last only when they are symbiotic and reciprocal. I have indeed received much more than I transmitted. While it seems as if Michael and I are trying to compete for the &#8220;most humble&#8221; award, my observation is sincere. The relationship between the companies is a true partnership. BlueArc is more than a technology provider or just a NAS component provider to Hitachi. Together we have improved our product offerings and embarked on a roadmap to deliver value through tighter integration. BlueArc has gained much more than just increasing its routes to market. Michael has provided me with enormous and invaluable insight to what Hitachi&#8217;s customers actually need. This closed loop review has allowed me to channel the innovations we &#8220;dream&#8221; at BlueArc towards building products that solve the customer&#8217;s needs. - <span style="font-style: italic">Shmuel Shottan Chief Technology Officer, BlueArc</span></p>
<p><strong>Essential NAS Platform (ENP)<br />
</strong></p>
<p>Well there is a lot to say here, and like HNAS, I&#8217;ll be focusing on stuff which is &#8220;behind the scenes&#8221;.  One thing that I did want to relate is the attention paid to making this modular NAS device highly reliable.  Unlike some of our competitors in the modular NAS space, who merely use simple heart-beating mechanisms to make their NAS devices achieve HA, the ENP actually has hardware offloaded HA and support mechanisms, existing below the firmware and independent of the internal kernel.  In the unlikely event that one of the nodes goes out to lunch, the other one can actually force the takeover by issuing a command over the HA channel that kills the unresponsive node.  This kind of attention to detail and processor separation is not something that you will find on another competitive modular NAS device; by the way, if you aren&#8217;t getting it, I mean NetApp.  I cannot stress enough the level of attention to detail across the board, especially when it comes to reliability.  It is kind of hard to relate this level of attention to detail, because one doesn&#8217;t find it valuable until there is an outage and you realize that that Hitachi stuff didn&#8217;t go down.  So it is kind of one of those things that everyone needs but is hopefully rarely used; I guess in that sense it is kind of like life insurance.  There are after all two main types of life insurance: whole life and term life.  Generally whole life is an absolute guarantee, potentially with a lot of bells and whistles, whereas term is generally less expensive with fewer bells and whistles.  The ENP is rather like term life: it gets the job done, it is reliable, and it is less expensive than the competition in the same market segment.</p>
<p><strong>Content Archive Platform (HCAP)<br />
</strong></p>
<p>As previously mentioned, HCAP is a really strong product capable of solving many use cases, like archiving and Web 2.0 storage, in a single bound (Superman reference intended).  Not only is the product rock solid, but the engineering team that makes HCAP is second to none.  So my behind the scenes point here is really a shout out to the engineering team.  In fact one of them,<img align="right" alt="Jack's photo" title="Jack's photo" src="http://www.pgcon.org/2008/sched/images/person-10-128x128.png" /> Jack Orenstein, is already engaged in a speaking spot at <a target="_blank" title="Jack's talk at PGCon" href="http://www.pgcon.org/2008/sched/events/57.en.html">PGCon</a>, since HCAP makes use of patented technology on top of Postgres.  One other topic I want to point to is the reliability of the system, pointing back again to Hitachi&#8217;s attention to extreme reliability.  Soon we will be publishing a white paper on the patent pending technology within HCAP that makes it 500x more reliable than competitive products.  This work involved developing a mathematical algorithm to right-place the data for extreme reliability.  When the HCA software is coupled with SAIN, we get the best of both worlds: serious protection from Hitachi-backed, ASIC-driven RAID plus protections from the software stack.  (Note that when the paper is available I&#8217;ll do a deeper dive on it.  Further, this technology has been deployed for nearly a year now.)</p>
<p><strong>Data Discovery Suite (HDDS)<br />
</strong></p>
<p>This is a net new offering by Hitachi, breaking new ground for us.  Backing the development effort is Hitachi Software Engineering corporation working in collaboration with Hitachi Data Systems (code speak for HDS being embedded as an integral member of the engineering activities).  There are a lot of things contained within this product and I&#8217;m personally quite proud of it.  The most striking and important thing is the focus on usability for non-IT users.  We really did spend a lot of time with HR personnel, Intellectual Property/Patent specialists, corporate lawyers, employment attorneys and IT personnel to understand the challenges in this space.  The usability is literally pushed all the way down to the user&#8217;s desktop via a Microsoft Vista Desktop gadget.  In essence, a non-IT worker can span a search across multiple NAS and archive devices without being aware of that fact.  Further, we looked for a common metaphor which these users could grok to assist them with file recovery: shopping carts.  Essentially, in the Data Discovery Suite there is a concept called a collection.  Users may add many files to this collection and, when they are ready, download the entire collection as a ZIP package that includes all of the files added to the collection, regardless of their location.  I could go on and on here, but I&#8217;ll reserve things for a future post since this is getting a little bit long.
</p>
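<p>To make the shopping cart metaphor concrete, here is a minimal Python sketch of a collection that gathers files from several devices and packages them as one ZIP.  It is only an illustration under stated assumptions: the <code>Collection</code> class, its <code>add</code> and <code>to_zip</code> methods, and the idea of holding file contents in memory are all hypothetical and are not the Data Discovery Suite API.</p>

```python
import io
import zipfile
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical shopping-cart style collection of files from many devices."""
    name: str
    # Maps (source device, path on that device) to the file's bytes.
    items: dict = field(default_factory=dict)

    def add(self, source: str, path: str, data: bytes) -> None:
        """Add one file; adding the same source and path again replaces it."""
        self.items[(source, path)] = data

    def to_zip(self) -> bytes:
        """Package every file into one ZIP, using the source as a top-level folder."""
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            for (source, path), data in sorted(self.items.items()):
                archive.writestr(f"{source}/{path.lstrip('/')}", data)
        return buffer.getvalue()
```

<p>A caller would add files found on different NAS or archive devices and then call <code>to_zip()</code> once; keeping the source device as the top-level folder is one simple way to avoid name clashes between devices.</p>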
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=10</wfw:commentRSS>
		</item>
		<item>
		<title>Using the Right Tool for the Right Job</title>
		<link>http://blogs.hds.com/michael/?p=9</link>
		<comments>http://blogs.hds.com/michael/?p=9#comments</comments>
		<pubDate>Sun, 17 Feb 2008 23:26:48 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>File Storage</category>
	<category>Block Storage</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=9</guid>
		<description><![CDATA[I&#8217;m sure that everyone knows the adage: use the right tool for the right job.  This means you really should not try to drive a screw in with a hammer, but instead use a screw driver.  Being a guy and all, when I was a kid I have to admit, and I&#8217;m sure that my [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m sure that everyone knows the adage: use the right tool for the right job.  This means you really should <img width="160" height="90" align="right" alt="mangled screw" title="mangled screw" src="http://www.core77.com/blog/images/screw.jpg" />not try to drive a screw in with a hammer, but instead use a screw driver.  Being a guy and all, when I was a kid I have to admit, and I&#8217;m sure that my dad would frown and all that stuff, I had to nail in a screw to see what would happen.  Well, I can say this, it was all fine and dandy when I drove the screw in, but when you try to get the thing out again, not only have you lost a screw, but a screw driver too.</p>
<p>Hopefully by this point I&#8217;ve piqued your interest and you are thinking: okay Michael, what&#8217;s your point?  The point that I&#8217;m driving at is to debunk the criticism about why Hitachi has discrete products in a variety of areas.  Okay, let me go back to a part of the first sentence: use the right tool for the right job.  I think that some of our very worthy adversaries, like EMC and IBM, get this point.  They have discrete tools that solve specific problems, and if you look at IBM they even have many different file systems: some in their OEM products via NetApp, one called GPFS, another called JFS, etc.  Each different tool solves a different problem for IBM.  EMC again has different products for different areas: for high scale enterprise class workloads they have the DMX/Symmetrix, and of course for other markets they have the Clariion.  Hitachi has a similar product segmentation with our enterprise class rockstar, the USP-V(M); our highly reliable modular, the AMS; the USP-V-like High Performance NAS platform; our ultra-scale active archive, the Hitachi Content Archive Platform (HCAP); etc.  You see, early on we recognized that to optimally solve problems different tools are required.  Hitachi&#8217;s sales force has long dealt with that approach; we just don&#8217;t talk about it.  After all, like all of the other storage vendors, we&#8217;ve offered two major block storage architectures and have ingrained in our corporate culture how to attack the market with these two products.  We are doing a similar thing with our file storage platforms now with HNAS, HCAP, and our NAS-Blades: different tools for different jobs.  Early on we recognized that if you really needed to store data for long periods of time, a simple file system was insufficient and you must have a different product, so we went and purchased Archivas last year.  
For highly performant NAS there is a whole set of reasons why we work with BlueArc, the company that we have an equity stake in, to resolve really high scale file I/O types of problems.  You see, we recognized that if one wanted to consolidate many NetApp filers onto one platform, the traditional bus architecture would simply not do; a different tool was required.<br />
Okay, hopefully you are following my logic: different tools and different products are required to solve different problems.  Also, I hope that you are getting that I&#8217;ve mentioned IBM, EMC, and Hitachi as recognizing this to be the case.  While I&#8217;ve mentioned NetApp, I&#8217;ve not stated that they understand this yet.  They are acting like their hammer, OnTap, can nail anything in: screws, nails, thumb tacks, bolts, hooks, etc.  NetApp&#8217;s name for this approach is &#8220;Unified Storage&#8221;.  And they literally are putting everything into their hammer: storage optimization, snapshots, block storage, clustered global namespace (Spinnaker ain&#8217;t no global file system, it is a clustered global namespace), replication, FC-block storage, iSCSI-block storage.  Wait a minute: at most other companies that offer storage products, those last two items represent at least a different product altogether.  A widely known secret is that NetApp added iSCSI to their filers because Microsoft removed support for running Exchange and SQL Server on CIFS shares, and almost as an afterthought, Fibre Channel was added, allowing NetApp to dip their toes into the block storage market.  To me this confirms that NetApp consistently targets their hammer, OnTap, at any market even remotely within their aim.</p>
<p>So, I&#8217;ll end with a question: do you want to buy a storage device from a vendor who wants to nail in your screws, or from a vendor that uses the right tool in the right market?
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=9</wfw:commentRSS>
		</item>
		<item>
		<title>Where is that stuff anyway?</title>
		<link>http://blogs.hds.com/michael/?p=8</link>
		<comments>http://blogs.hds.com/michael/?p=8#comments</comments>
		<pubDate>Mon, 11 Feb 2008 16:23:51 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Forward Thinking</category>
	<category>Search</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=8</guid>
		<description><![CDATA[Let&#8217;s face it search is hot, and quite frankly the market is consolidating down to just a few players.  Google is effectively the king of Internet search, Microsoft recently made a bid for FAST Search and Transfer and Yahoo, Autonomy has been upping the ante by purchasing various companies allowing them to provide more [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s face it search is hot, and quite frankly the market is consolidating down to just a few players.  Google is effectively the king of Internet search, Microsoft recently made a bid for FAST Search and Transfer and Yahoo, Autonomy has been upping the ante by purchasing various companies allowing them to provide more value.  Further Apple, Microsoft, and Novell/SuSE Linux have embedded search into the desktop operating system.  It is this last part that I want to focus on a bit, not for the fact that it is on the desktop, but more for one consistent point of implementation.</p>
<p>On Linux the desktop search engine is <a target="_blank" title="GNOME desktop search engine" href="http://beagle-project.org/Main_Page">Beagle</a>, which is built on a C#/Mono port of <a target="_blank" title="Apache Lucene" href="http://lucene.apache.org/java/docs/">Lucene</a> and is really closely tied to the<img width="274" height="223" align="right" alt="Beagle" title="Beagle" src="http://beagle-project.org/images/thumb/b/b2/BeagleScreenie_crop.png/400px-BeagleScreenie_crop.png" /> GNOME desktop environment.  While it is possible to set Beagle up to repeatedly scan the file system, a more efficient configuration approach is to work with a Linux kernel service called <a title="INOTIFY at Wikipedia" href="http://en.wikipedia.org/wiki/Inotify">INOTIFY</a>.  This service generates an event stream telling the search engine about added, deleted or modified files.  This allows the search engine to target the indexing process only at those file system objects that need to be added to the index, removed from the index or updated in the index.  This means that the system is effectively event driven and does not have to regularly poll the file system.  Generally speaking, for a lot of different problem domains an event driven approach is superior to a pure polling approach.</p>
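<p>The difference between polling and an event driven index can be sketched in a few lines of Python.  This is a toy model under stated assumptions, not Beagle&#8217;s implementation: the <code>SearchIndex</code> class and its <code>rescan</code> and <code>apply_event</code> methods are hypothetical names, no real INOTIFY call is made, and the file system is simulated by a plain dictionary of path to text.</p>

```python
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """Toy full-content index: maps each path to the set of words in the file."""
    documents: dict = field(default_factory=dict)
    files_read: int = 0  # Counts how many files were read in order to index them.

    def _index_file(self, path: str, files: dict) -> None:
        self.documents[path] = set(files[path].lower().split())
        self.files_read += 1

    def rescan(self, files: dict) -> None:
        """Polling approach: rebuild the index by reading every file again."""
        self.documents.clear()
        for path in files:
            self._index_file(path, files)

    def apply_event(self, kind: str, path: str, files: dict) -> None:
        """Event driven approach: touch only the file named in the event."""
        if kind in ("added", "modified"):
            self._index_file(path, files)
        elif kind == "deleted":
            self.documents.pop(path, None)
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def search(self, word: str) -> list:
        """Return the sorted paths whose content contains the word."""
        return sorted(p for p, words in self.documents.items() if word.lower() in words)
```

<p>After one initial <code>rescan</code>, each later change costs a single <code>apply_event</code> call that reads at most one file, whereas another <code>rescan</code> would read every file again, which is the efficiency argument made above.</p>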
<p><img align="left" title="Apple Spotlight" alt="Apple Spotlight" src="http://images.apple.com/macosx/features/images/300_spotlight_calculations_20071016.png" />The next target on the list is Apple&#8217;s <a target="_blank" title="OS X Spotlight" href="http://www.apple.com/macosx/features/300.html#spotlight">Spotlight</a>, which like Beagle integrates with something analogous to INOTIFY: FSEvents.  This is an internal only interface maintained by the kernel that allows various user space applications to subscribe to file change events generated by the kernel.  Like INOTIFY, it removes the need to repeatedly poll the file system for modified files.  Further, history shows that some of this technology originated in BeOS; an <a target="_blank" title="FSEvents and Associated Mechanisms" href="http://arstechnica.com/reviews/os/mac-os-x-10-5.ars/7">article</a> at ArsTechnica provides a great history of core file system features in OS X.</p>
<p>The final search service to review is the <a title="Windows Desktop Search" target="_blank" href="http://www.microsoft.com/windows/products/windowsvista/features/details/instantsearch.mspx">Windows Desktop Search</a> (WDS) which really came into the<img width="177" height="266" align="right" title="Screenshot of Windows Desktop Search" alt="Screenshot of Windows Desktop Search" src="http://www.microsoft.com/library/media/1033/windows/images/products/windowsvista/features/details/screenshot_startMenu_Search.jpg" /> &#8220;spotlight,&#8221; pun intended as always, for Vista.  It was kind of always there, but not as important until the birth of Vista.  Like the other search engines, it follows a similar strategy of reading a journal or event stream to improve the efficiency of the full content index process.  In the case of WDS, it relies on the <a title="Wikipedia article on WDS" target="_blank" href="http://en.wikipedia.org/wiki/Windows_Search">NTFS USN Journal</a>, which keeps track of the added, deleted and changed files so that the search/indexing infrastructure can be more efficient.</p>
<p>Okay, with all of that said, and with the recognition that enterprise class search engines like FAST, Autonomy, and Google are all out there, one has to ask why there aren&#8217;t a lot of NAS or file storage devices with some journal or eventing mechanism that is capable of improving the efficiency of the full content indexing process.  Well, I think that one of the reasons is that Hitachi has patent pending technology in this area and sports an implementation in HCAP.  From the previous articles on this topic I&#8217;m sure that you are aware that the system is a Web 2.0 storage grid architecture, and part of its capabilities is an eventing mechanism that runs in parallel on each of the storage nodes, feeding the full content index a stream of events related to added, deleted and changed files.  This allows the system to implement an approach analogous to the ones described above, leading to a more efficient indexing process.  Further, we view the full content indexing facilities within HCAP as a component of the storage infrastructure.  Since that is the case, we&#8217;ve acquired rights to the query API for this component so that interested companies, parties and application vendors can take advantage of the full content index to solve new customer problems and implement novel use cases.  Finally, since Hitachi has patent pending technology in this area for file storage devices, I would certainly not be surprised to hear that we were applying it to more of our platforms.
</p>
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=8</wfw:commentRSS>
		</item>
		<item>
		<title>Of Web 2.0 Storage - Part 2 of 2</title>
		<link>http://blogs.hds.com/michael/?p=7</link>
		<comments>http://blogs.hds.com/michael/?p=7#comments</comments>
		<pubDate>Sun, 20 Jan 2008 03:27:17 +0000</pubDate>
		<dc:creator>mhay</dc:creator>
		
	<category>Forward Thinking</category>
	<category>File Storage</category>
		<guid isPermaLink="false">http://blogs.hds.com/michael/?p=7</guid>
		<description><![CDATA[In part one of this series I largely talked about what is needed for Web 2.0 storage systems, or at least what customers have asked or talked to me personally about.  Whilst reviewing the core requirements I’ve been witness to, I know that Hitachi can solve many of them today not through hulking infrastructures that [...]]]></description>
			<content:encoded><![CDATA[<p>In part one of this series I largely talked about what is needed for Web 2.0 storage systems, or at least what customers have asked or talked to me personally about.  Whilst reviewing the core requirements I’ve been witness to, I know that Hitachi can solve many of them today not through hulking infrastructures that haven’t yet seen the light of day – rhyme and pun intended.  I’ll list several below, and talk about how Hitachi can respond today.</p>
<ul>
<li>Petabyte scaling under a single system image – yes</li>
<li>Fewer points of management – yes</li>
<li>Ingestion of tens of thousands of objects/second – yes</li>
<li>REST-style protocol for access – yes</li>
<li>Implementation of capacity optimization features – yes</li>
<li>Implementation of value added services (capacity balancing, automated node management/control, garbage collection, authenticity checking etc.) – yes</li>
<li>Usage of commodity components and/or RAID backed storage and ability to sell software independently of hardware – yes</li>
<li>High performance media streaming – yes</li>
<li>Local and wide area content distribution/replication – yes</li>
<li>Low latency rich media streaming – <a title="Tactix" target="_blank" href="http://www.sdl.hitachi.co.jp/english/people/hitactix/">partially</a></li>
</ul>
<p>These are merely a few requirements I wanted to relate, but the point is that we can and do solve these problems for our customers now.  We have live systems that scale into the hundreds of terabytes, meaning that, like Amazon, we have a mature platform.  Quite pointedly, EMC is just now approaching its science experiment stage with both Maui and Hulk.</p>
<p>As to what this orderable product is that can do all of this, drum roll please: it is the software running at the core of the <a title="HCAP" target="_blank" href="http://www.hds.com/products/storage-systems/content-archive-platform/index.html">Hitachi Content Archive Platform</a>.  Internally we code name the platform “Prime” (as in Optimus Prime, or Transformers which are “More than meets the eye”).  The talented engineers behind the HCAP and Prime have been hard at work carefully taking the pulse of the industry, and have made something that very much mirrors what could back an online storage service like S3.  This very novel system has seen significant improvements, such as core capabilities that make it ultra reliable, encryption to protect data privacy, and various capacity optimization features (e.g. single instance storage).  Finally, because the system is web based object storage at its core, it can be highly customized to meet user requirements or deployed out of the box.</p>
<p><img width="243" height="247" align="right" src="http://farm3.static.flickr.com/2130/2163772411_f3f4bba3ee.jpg?v=0" /><br />
I do want to provide some level of detail on peta-scaling of the platform.  As of HCA V2, Hitachi deploys what we call SAIN (SAN + Array of Independent Nodes), disaggregating the storage from the nodes, meaning that storage and front end processing nodes scale independently.  Specifically this means that each node can sport up to 64 LUs at 2TB each and also includes all of the goodness you’d expect from a SAN attached system, such as multi-pathing, encryption of data in-flight/at-rest, swapping of the LUs between nodes, and proven RAID backed full featured Hitachi storage, all leading to maximum reliability, performance and efficient scaling.  When storage capacity scaling is coupled to software/node scale out – we’ve tested node scaling to 80 nodes and have no architected limit – a true peta-scale system emerges of over 10PB realizable today – actually the total addressable capacity of the system is 80 nodes x 16TB/LU x 64LU/node = 81920TB, but hey, who’s counting.  However, if users want to perform their own hardware procurement, we can and do sell the software apart from the Hitachi storage and nodes.  But, due to the fact that the maximum number of hard drives an x86 class system can hold is 48, creating a similar 10PB scale system would require 1.3 times the number of nodes, or 107 nodes – note that as of today I’m only aware of SUN’s Thumper that can carry a maximum of 48 hard drives, and I’m assuming each drive can be 1TB of capacity; if there is another DAS system that can hold more, then let me know.  So while it is possible, and we do sell software independently of the nodes, at really large capacities scaling out based on a DAS architecture starts to make a lot less sense than what Hitachi has with SAIN – and yes the pun is intended for SAIN, because those that do not sport a SAIN architecture are inSAIN.
</p>
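<p>The capacity arithmetic above can be checked with a short Python sketch.  The function and constant names are illustrative only, and the sketch assumes raw capacity with no RAID or file system overhead and 1PB = 1024TB.  One hedged observation: with these inputs the 107 node figure corresponds to 96TB per DAS node (for example 48 drives of 2TB each), while 48 drives of 1TB each would need 214 nodes for the same 10PB.</p>

```python
import math

NODES = 80                # Tested node count quoted in the post
LUS_PER_NODE = 64         # Maximum LUs per node
TB_PER_LU_TODAY = 2       # LU size realizable today
TB_PER_LU_MAX = 16        # LU size used for the addressable-capacity figure
DRIVES_PER_DAS_NODE = 48  # Drive count of the x86 system cited in the post


def san_capacity_tb(nodes: int, lus_per_node: int, tb_per_lu: int) -> int:
    """Total capacity in TB when every node has its full set of LUs."""
    return nodes * lus_per_node * tb_per_lu


def das_nodes_needed(target_tb: int, drives_per_node: int, tb_per_drive: int) -> int:
    """Whole DAS nodes needed to reach target_tb of raw capacity."""
    return math.ceil(target_tb / (drives_per_node * tb_per_drive))


realizable_tb = san_capacity_tb(NODES, LUS_PER_NODE, TB_PER_LU_TODAY)
addressable_tb = san_capacity_tb(NODES, LUS_PER_NODE, TB_PER_LU_MAX)

print(realizable_tb, realizable_tb / 1024)  # 10240 TB, i.e. 10.0 PB
print(addressable_tb)                        # 81920 TB
print(das_nodes_needed(realizable_tb, DRIVES_PER_DAS_NODE, 1))  # 214 nodes with 1TB drives
print(das_nodes_needed(realizable_tb, DRIVES_PER_DAS_NODE, 2))  # 107 nodes with 2TB drives
```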
]]></content:encoded>
			<wfw:commentRSS>http://blogs.hds.com/michael/?feed=rss2&amp;p=7</wfw:commentRSS>
		</item>
	</channel>
</rss>
