
Washington, September 15-18, 1999 – London, November 21-24, 1999

Site Server Content Management

Dina Berry

Introduction to Site Server

Site Server is a collection of tools, applications, code, and components that make web site development and maintenance easier. The easiest way to use Site Server is to learn what the marketing terms actually translate into in the product. The two pieces we will be using are 'knowledge' and 'publishing'. The third piece is 'analysis'. Knowledge consists of Personalization and Membership (the LDAP database), the Knowledge Manager application, Search, several components, and lots of sample code. Publishing consists of Content Management and Content Publishing.

If you want to e-commerce enable your web site, you should take a look at Site Server Enterprise. It makes building an e-store very simple. Of course, once you want to change the store, things get more complicated.

Introduction to Content Management

Content Management is the organization and management of files in their native format such as Word, Excel, or PDF. While everyone agrees the browser is becoming the single application interface, this concept is often misused. If the information is stored in a database, it is easy to see why you want to use an ASP/ADO (HTML) page to view the data. However, many people think that information should always be viewed in HTML and take time to convert documents to HTML. When the document is updated in its native format, the document is re-converted to HTML. Content Management aims to catalog files in their native format but allow access via the web. This method removes a lot of unnecessary work and adds a level of detail (document properties) for finding the document you are looking for.

Dina Berry
As the XBuilder Product Manager, Dina Berry focuses on product development, quality assurance, and customer service. She received her B.S. in Computer Science from Principia College in Elsah, IL, and has worked in hardware support for Northern Telecom and systems design at Oppenheimer Mutual Funds. During a three-year tenure at Microsoft, Berry worked on the OLE DB and Site Server teams. She is a conference speaker and the published author of Teach Yourself ActiveX Programming in 21 Days and Teach Yourself Active Web Database Programming in 21 Days. Her love of database and Internet technologies has led to several e-business web development sites in her own time.

What Problem does CM solve?

Content Management brings the organized file server to the web. By organized, I mean a level of searchability beyond the directory or file name used. When working with a file server, user standards are generally very loose and governed more by file system permissions than document content. It's important that the time spent creating a document is not wasted because the document is rarely referenced as intended, or is lost altogether after its initial use.

Content Management allows you to create 'stores' of content. Each store has a unifying purpose, whether it is company, department, subject, or location. This is up to you to define. Inside each store are content types such as White Paper or Press Release (or whatever you want to call them). Each content type is defined by the properties that make up that content type. This allows you to create a single-level hierarchical storage of files by file type and makes these files searchable by all information submitted about each file.

Note: CM is not designed for multi-tiered storage. Yahoo is a good example of multi-tiered storage: you can continue to drill into a content section level by level until you reach the content area you were interested in. A CM store is the final location and does not contain inherent programmability for this multi-tiered storage.

Some content type properties will be universal, such as content author and creation date, while others will be specific to the content type. For example, time-sensitive data will probably have timestamps to indicate the date range of the document's relevance. This information is not applicable to files that are always relevant regardless of the date range covered by the document.

It is important to understand that the content type and content type properties are the criteria for searching the content store.

Each file type (defined by its extension, such as *.doc) has three kinds of properties.

The first kind is properties that can be derived from the file. The HTML file type can be scanned for Meta tags or other well-defined HTML tags. Microsoft Office files have file properties (found in the File menu of an Office application) that can be scanned.

The second kind is properties that you (as the application developer) define. These can hold any type of information and are entered when the file is submitted.

The third kind is the minimum set of properties that any file has: date created, file name, size, etc. If the file type is not known (i.e. it doesn't have a corresponding filter to read the file's properties), then only the second and third kinds of properties can be submitted.

While the first property type (derived from the file) saves time and is less prone to error, its usability relies on the original content author filling in these properties. If you have a file type that does not have a filter (and many file types don't have filters), you can still enjoy the rich content management features and searchability.
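The way the three kinds of properties combine for one submitted file can be modeled in a few lines. This is only an illustrative sketch in Python, not Site Server code: the function name `collect_properties`, the sample values, and the rule that submitted values override scanned ones are all assumptions made for the example.

```python
def collect_properties(minimum, submitted, derived=None):
    """Merge the three kinds of properties for one submitted file.

    minimum   -- properties every file has (file name, size, date created)
    submitted -- developer-defined properties entered on the submission form
    derived   -- properties a filter read from the file itself, or None
                 when the file type has no filter
    """
    properties = dict(minimum)
    if derived is not None:
        properties.update(derived)
    # Assumption for this sketch: values typed in at submission time
    # take precedence over values scanned from the file.
    properties.update(submitted)
    return properties


# A file type without a filter still ends up with a searchable property set.
no_filter = collect_properties(
    minimum={"FileName": "logo.xyz", "Size": 2048},
    submitted={"Topic": "Logos", "ContentAuthor": "dina"},
)
print(no_filter)
```

A file type that does have a filter simply passes the scanned values as `derived`; the rest of the flow is unchanged.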

Why should you use CM?

The idea of managing files in a well-structured and searchable manner is not new. There are several very good and expensive file management systems on the market. Used properly, any CM software will enable you to use documents to their fullest value because you can search on specific criteria. Most file systems give you a bare minimum of information about a file. You would actually have to open each file to find what you are looking for.

The idea behind Site Server's CM is that much of the work has been done in terms of programming and usability. With very little effort, you can enjoy feature-rich management of your file system. Adding more content stores, content types, properties, and views into the files is very easy. Surprisingly easy! You should be able to find all files you are looking for. Once you have narrowed the list, you should be able to look at it and find the file you want without opening each one.

While everyone agrees Content Management is important, it is rarely used to its full and time/money saving potential. It is difficult to have an out-of-the-box solution for web servers because someone always feels the need to change the look and feel to suit their own purposes. And with a file server, it is too easy to skip the web-based file submission and just place it on the file server in the appropriate directory. Content Management requires a degree of standard use to make it worth your time, regardless of whose CM software you use.

What can CM be integrated with?

Site Server's CM is a collection of tools, code, components, and LDAP storage. Each piece fits together fast and well. You can change each component (a different upload control, a different database storage) but expect to put time into the integration process. You should think carefully about when to change and why. 99% of the code is written for you or can be copied from other files. Any changes will affect how short the implementation time is.

Access by user id is one of the common integrations. If you use P&M, you can add access via a membership user. You can also control access via NTLM. You should have a great understanding of all the code, before messing with permissions. Most of the files have code regarding NTLM and anonymous access.

A natural integration to CM is version control. Version control is the ability to store different versions of the same file so that you can see the lifetime of a file. This is a common tool used in software development. While the need is real and the integration with Site Server's CM is possible, the real-world experience leads me to say that version control is a lot of work. The problem is that the upload process would have to update the version control software (via some API set) as well as including all relevant information the version control submission would need. Some version control software needs a specific user to be authenticated, which requires authentication in the web-based submission. My last integration with Microsoft's Source Safe ended poorly because the COM object for Source Safe didn't handle authentication being passed via the IIS web server. These are important issues to resolve before development begins.

Examples of Site Server's Content Management

Site Server CM comes with two working examples: FPSample and CMSample. FPSample is meant to be a FrontPage application and CMSample is meant to be FrontPage-less in that it is straight ASP. In order to show you how to use CM, I'll also build a third sample as part of this discussion. It won't be very complex but it will show you each step and each gotcha.

All the samples have the same basic visual elements:

- Content viewed by content type (such as white paper, press release, etc.)

- Submission page (works with IE or Netscape)

- Approval page

- Content viewed that I (this user context) have submitted

There are a lot of ways to view the content but the code that is readily available is content-type based or user-context based. If you want other ways to view the content, you should schedule this into the development cycle. Remember that the only searchability is based on content type and content type properties.

Pieces of Content Management

Content Management is made up of the ASP pages, the LDAP database, the directory structure, the Index Server, and the Upload Control. The ASP pages are used to place information into and get information out of the 'Content Store'. The ASP pages also use support files including images, stylesheets, included asp function files, and Site Server Rule files (*.prf). The LDAP database is the storage of the data definitions for the content store. The name and location of the content store, the content types and the content type properties are kept in the LDAP database. The actual information that is searched is not kept in the LDAP database, but instead kept in the Index Server catalog. The directory structure is the 'content store'. Each subdirectory in the main directory is either a content type or a subsidiary function such as upload or approve. You can see where a file is in the process of getting into the content store by where it is currently located. A file is uploaded to the upload directory. Once the content type is set, the file moves to the appropriate content type directory or the approval directory. The upload control is the component that moves the file from the client-machine to the server.

Architecture of Content Management

The architecture of Content Management may appear complex because most of the work is not apparent to the user or administrator. However, the system is really just moving files from the client machine to the appropriate content type directory. In order to submit any file (regardless of content type or content author), you use the submit.asp page. This page has the upload control. The file is uploaded and placed in the /upload directory. The web site returns the content type choice page. These choices are determined by querying the LDAP database for that particular content store. When a content author chooses a content type, the ASP page queries the LDAP database for content type properties. Once the properties are entered and the content author submits the form, the content properties are indexed and the file is moved to the next appropriate directory. This directory can be the approve directory or the final content directory. If the file needs to be approved, the file is moved from the approve directory to the content directory.
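The file movement described above can be sketched as a small function that answers "where is this file right now?". This is a Python model for illustration only, not part of Site Server: the function name `file_location` and its parameters are hypothetical, and the directory names `upload` and `approve` are taken from the description above.

```python
from pathlib import PurePosixPath

UPLOAD_DIR = "upload"    # where the upload control on submit.asp places the file
APPROVE_DIR = "approve"  # holding area for content types that require approval


def file_location(store_root, filename, type_directory=None,
                  needs_approval=False, approved=False):
    """Return where a file sits in the content store at a given stage."""
    root = PurePosixPath(store_root)
    if type_directory is None:
        # Content type not chosen yet: the file is still in the upload directory.
        return root / UPLOAD_DIR / filename
    if needs_approval and not approved:
        # Properties were submitted, but an approver has not released the file.
        return root / APPROVE_DIR / filename
    # No approval needed, or approval granted: final content type directory.
    return root / type_directory / filename


print(file_location("/cm", "pr.doc"))
print(file_location("/cm", "pr.doc", "PressReleases", needs_approval=True))
print(file_location("/cm", "pr.doc", "PressReleases", needs_approval=True, approved=True))
```

Reading the three calls top to bottom follows one file from upload, through the approve directory, to its content type directory.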

Installation of Content Management

Installation of Site Server is divided into three sections: Publishing, Knowledge, and Analysis. Installation of CM requires that both Publishing and Knowledge are installed. Please refer to Microsoft’s Support site for the most up-to-date information on installation steps and service packs.

The default installation will install the LDAP server with a Microsoft Access database as its store. For the purposes of CM, this is fine. CM doesn’t actually rely too heavily on the LDAP server. If you are also using P&M for a heavy-load web server, you should install the LDAP server on SQL Server instead of MS Access.

What is a Content Store?

A content store is the top-level directory structure. If you need to move, reinstall, or back up the content store, you should back up the directory structure and the LDAP database. Whether to back up the Index Server catalog as well is a choice you’ll have to make: the store depends on it for searching, but the catalog may index more files than just your content store.

Create a New Content Store

Since the Content Store is just the LDAP database and the file directory, these are the objects you must create and manage. The first step to creating the store is creating the appropriate objects in the LDAP directory. This is partially automated for you via a file called makecm.vbs. This is a VBScript file that creates the LDAP objects. While most of the file is not meant to be edited, the top portion (labeled “Start of configurable part”) can be edited. The top portion creates the content types, and the content type properties. The content store application that holds the content types and content type properties is set as part of the command line switches that are used when the makecm.vbs is executed.

The following code is necessary to create each content type and its associated properties:

Set DocList.ImageLibrary.Fields = CreateObject(DictionaryProgId)

DocList.ImageLibrary.Fields.ContentAuthor     = "Yes"

DocList.ImageLibrary.Fields.Size              = "Integer"

DocList.ImageLibrary.Fields.Type              = "Yes"

DocList.ImageLibrary.Fields.Title             = "Yes"

Set DocList.ImageLibrary.Fields.CreateDate    = CreateObject(DictionaryProgId)

DocList.ImageLibrary.Fields.CreateDate.Type   = "StringTime"

Set DocList.ImageLibrary.Fields.Topic         = CreateObject(DictionaryProgId)

DocList.ImageLibrary.Fields.Topic.Type        = "Vocabulary"

DocList.ImageLibrary.Directory                = "ImageLibrary"

DocList.ImageLibrary.HTMLFile                 = "ImageLibrary.asp"

DocList.ImageLibrary.Description              = "ImageLibrary"

DocList.ImageLibrary.Approve                  = 0

The DocList object is a "Commerce.Dictionary". This might imply that you need Site Server Enterprise for the Commerce Dictionary but you don’t. While this is not the standard dictionary object, it functions just the same as the standard dictionary object.

The first set of settings describes the content type as it will be added to the LDAP database. The second set of settings will also be added to the LDAP server but deals with the ASP pages and the approval.

For every content type, you will need this section repeated with the correct modifications. Before running the script, make sure the directory for the content store has been created. When you execute the script, you designate the directory location, the server, the LDAP server, the IIS virtual root, the LDAP application name, and the user and password for the LDAP security context. After the script has run, everything is done except for the ASP scripts. There is now a config.inc at the root of your content store. If you open it up, you will see something similar to this:

<%

const  DPSVRoot = "/cm"

const  LDAPServer = "dina001:1002"

const  ApplicationName = "dinacm"

const  FromClause = " FROM ContentManagement..SCOPE(' ""Z:\dinacm"" ') "

%>

The first constant is the IIS virtual root. The second constant is the LDAP server path. The third constant is the application name used in the LDAP server. The fourth constant is the “from” clause used in the queries against the Index Server.
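The doubled quotes in the FromClause value are VBScript's way of writing a literal double quote inside a string, so the text that actually reaches the Index Server query has single double quotes around the path. The following Python sketch reads the four constants out of the sample config.inc shown above and undoes that escaping. It is only a checking aid under the assumption that each constant sits on its own line; `read_constants` is a hypothetical helper, not something Site Server provides.

```python
import re

# The sample config.inc from the text, reproduced as a raw string.
CONFIG_INC = r'''<%
const  DPSVRoot = "/cm"
const  LDAPServer = "dina001:1002"
const  ApplicationName = "dinacm"
const  FromClause = " FROM ContentManagement..SCOPE(' ""Z:\dinacm"" ') "
%>'''

# Matches lines of the form:  const Name = "value"
CONST_LINE = re.compile(r'^\s*const\s+(\w+)\s*=\s*"(.*)"\s*$', re.IGNORECASE)


def read_constants(text):
    """Return a dict of the string constants declared in config.inc text."""
    constants = {}
    for line in text.splitlines():
        match = CONST_LINE.match(line)
        if match:
            name, raw = match.groups()
            # VBScript writes a literal double quote inside a string as "".
            constants[name] = raw.replace('""', '"')
    return constants


constants = read_constants(CONFIG_INC)
for name, value in constants.items():
    print(name, "=", repr(value))
```

Printing the values this way makes it easy to confirm that the virtual root, LDAP server, and catalog scope match what was passed to makecm.vbs.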

The last step is to generate the ASP pages that will connect to the content store. In order to correctly generate the ASP pages, copy the following files from the CMSample directory into the root of your content store:

File                       Purpose
Saveprops.asp              Save properties of content type
Getprops.asp               Get properties of content type
Common.asp                 Common functions and values
Submit.asp                 Page with upload control
Type.asp                   List of content types
Cpview.asp                 Content author’s list of submissions
Approve.asp                Approval page
AdminList.asp              [CHECK THIS] list of files awaiting approval
Exists.asp                 Checks ability to overwrite file
Repost.asp                 Repost
Files.asp                  List of files
Filename.asp               List of properties on files
Menu.txt                   Navigation in this ASP application
Deploy.asp                 Used to propagate from staging to live
Mydocs.prf                 Rule set to find content author’s submissions
Approve.prf                Rule set to find files that are not yet approved
Defaultviewtemplate.ast    Template used to create pages for each content type; has hardcoded URL values of http://localhost/cmsample/

In order to create each Content Type’s ASP pages, we need to open the HTML version of the Site Server Administration Tool. While you can view the Content Type information from the MMC version, this version won’t create the ASP pages. Go to the following URL and log in to the Site Server Admin site for Publishing:

1) Select the content store and click on the ‘Properties’ button

2) Select ‘Content Types’

3) For each content type:

   a. Select the content type and choose the ‘Properties’ button

   b. Confirm that all information is correct, check the following boxes, and choose ‘Submit’:

      i. ‘Generate Properties Form’

      ii. ‘Generate View Page’

When you are done, you should have 3 new files for each content type:

ContentType.asp        Properties Page
ContentTypeView.asp    View Page
ContentType.prf        Rule for finding files from Index Server

A defaultviews.txt file is also created and contains:

ImageLibraryView.asp|ImageLibraryView.asp

PressReleasesView.asp|PressReleasesView.asp

HeadlinesView.asp|HeadlinesView.asp

You will also need to copy or create your own \images\ directory as well as modify the menu.txt for your application’s navigation purposes.

Tips/Tricks

1.      Install SS and CM on a FAT partition to begin with. This makes everything much easier until you know the code.

2.      CM uses NTLM everywhere. You should have a solid understanding of NTLM and P&M if you plan to use authentication. If you don't plan to use authentication, rip it out of the code. Recognize that if you do use NTLM, you can only use Internet Explorer as your browser.

3.      The original version of Content Management has problems with some of the data types in the generated view and property pages. Make sure you can either fix the code or use different data types before getting too far along. The CM Sample uses text and LDAP lists only. None of the other data types are used.

4.      The Defaultviewtemplate.ast file will have to be modified. It has a hard-coded virtual directory of /cmsample/. This isn't very obvious, but if you don't change it you'll have errors when you try to run your content store.

5.      The 'Type' column from the rule file is throwing an error on the view page. Remove references to the Type column in the SQL query and result set.

6.      There is no default.htm or default.asp. You will need to create one or turn directory browsing on.

7.      If you want to use the FP Sample or copy its code, make sure to change the ADO prog id by removing "1.5" at the end of the prog id. The symptom of this error is the "Server.CreateObject Failed" error.

8.      Take a look at the other files in the CM and FP Samples. There are files that are not necessary to copy in order to get your store up and running but the extra files may be just what you are looking for. For example, the CMSample has an alldocs.asp file that shows all docs in the store.

Creating your own view page

The view pages that the Site Server HTML Admin creates are based on Content Type. You will probably want to add more view pages with different criteria. For example, find all files where property X = value Y. In order to make a new view page, you need to copy one of the generated view pages that the HTML Admin created. When you open up the view page in notepad (not IE), you will see the code for the Ruleset object that contains the ruleset filename. You will also notice that the same rule file is included in the file. You only need one of the references. I recommend deleting the object and just using the include statement.

<OBJECT ID="FormatRuleset1" WIDTH=487 HEIGHT=237 CLASSID="CLSID:F78EAED2-F867-11D0-9F89-0000F8040D4E">

<PARAM NAME="_ExtentX" VALUE="12885">

<PARAM NAME="_ExtentY" VALUE="6271">

<PARAM NAME="RecordSetURL" VALUE="http://localhost/cmsample/PressReleases.prf">

<PARAM NAME="FormattingText" VALUE="<% Response.Write "<TR>" %>

<% Response.Write "<td VALIGN=baseline WIDTH=42><img SRC=images/globul2a.gif WIDTH=20 HEIGHT=20 SPACE=11></td><td><font face=Verdana size=2>" %>

<% Response.Write "  <A HREF=""" & GetDocUrl(MemRecordSet("path"),MemRecordSet("ContentType")) & """>" & getfilename(MemRecordSet("FileName")) & " " & MemRecordSet("Size") & " Bytes </A></font></td>" %>

<% Response.Write "</TR>" %> ">

<PARAM NAME="DisplayType" VALUE="3">

</OBJECT>

<!-- #INCLUDE VIRTUAL="/cmsample/PressReleases.prf" -->

If you open this file with Visual InterDev (VID), the object will appear as a design-time control (DTC) and you will have to edit the object via the DTC interface. This may be more cumbersome than you want to deal with if you are not familiar with DTCs. Editing the file in Notepad won't cause any problems as long as you only change values and do not delete parameters or object attributes.

Once you have modified the prf file name, you may need to alter the code that prints out the record set. Since the result set generated from Rule Manager always has the same columns and only the WHERE clause changes, you can create a single included file to run through the result set, and that code can be very generic. There are also a lot of columns in the result set. You should reduce the columns to just what you need.

If you would prefer to modify the code for each view page, you need the basics for a result set. This includes a loop that runs until the end of the result set (EOF) is reached as well as code to print out each column’s value for each row. The result set is always called ‘MemRecordset’. The following is some simple example code for the View file that is placed after the rule filename include:

' Walk the result set produced by the included rule file

Do While Not MemRecordset.EOF

    ' Print two properties, then a link to the file with its size

    Response.Write MemRecordset("Title") & "<br>"

    Response.Write MemRecordset("ContentAuthor") & "<br>"

    Response.Write "  (<A HREF=""" & GetDocUrl(MemRecordset("path"), MemRecordset("ContentType")) & """>" & getfilename(MemRecordset("FileName")) & " " & MemRecordset("Size") & " Bytes </A>)"

    MemRecordset.MoveNext

Loop

MemRecordset.Close

MemRecordset is opened as a persistent, read-only cursor. You will also want to change the look/feel of the page to meet your needs. If you want to return a different mime-type such as XML, you only need to change the view page, not the rule file.

Creating a Rule File (*.prf)

Most of the Content Management code is very obvious. The only tricky part is the rule file. If you are familiar with ADO/SQL query code, the rule file code should be familiar but not exactly as you have seen it before.  You should create your rule file with the Rule Manager application in the Site Server/Tools program group. Once you have created the rule file, you can edit it with notepad. Once you have edited the file outside of Rule Manager don't expect to be able to re-edit with Rule Manager.

When you open the Rule Manager, you will need to specify the ‘application server’, which is the LDAP server. You can create as many rules as you want, and have them applied in a specific order. Each rule in the rule file will have a name for easy referencing. For most of the Content Management work, you will most likely only have one rule in a rule file.

The rule file ‘awards.prf’ provided as part of the CMSample is a good example we can look at. You can open the file in the Rule Manager. There is 1 rule named Awards. Since there is only 1 rule, it has priority of 1. The rule is also visible and is:

    “Select content

        where approved contains yes

        and where ContentType contains awards”

 

The underlining indicates that the word or phrase is selectable. If you have used Outlook’s Rule Manager, you will be familiar with how CM’s rule manager works. There is a four-tab property sheet to help you create your rule set. The rule set is generally divided between AUO/user properties and content type properties. This is to help you match user to content or select just based on user or content.

Once the rule file is generated, open it up to take a look at it. For the above rule, the specific code of the rule file is:

Do

   bstrSSPMQuery = ""

   bstrMemLog = "Awards+"

   fSSPMRunVBScript = FALSE

      bstrMemLog = bstrMemLog & "Awards+"

      If Len(bstrSSPMQuery) > 0 Then

         bstrSSPMQuery = bstrSSPMQuery & " OR "

      End If

      bstrSSPMQuery = bstrSSPMQuery & "(CONTAINS(approved,'yes*') >0)"

      If Len(bstrSSPMQuery) > 3 Then

         bstrSSPMQuery = bstrSSPMQuery & " AND "

      End If

      bstrSSPMQuery = bstrSSPMQuery & "(CONTAINS(contentType, 'awards*') >0)"

   If ((Len(bstrSSPMQuery) >0) OR (fSSPMRunVBScript = TRUE))Then

      Exit Do

   End If

Loop While (False)

The rest of the file is the same regardless of what the rule is.  If you want to change the sort order (ORDER BY clause), you need to append to the value of “bstrSSPMQuery”.
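To see what the generated code leaves in bstrSSPMQuery, it can help to model the string assembly outside of ASP. The Python sketch below only mimics the pattern visible in the rule file (one CONTAINS predicate per condition, joined with AND, with an optional ORDER BY appended at the end). The helper names `contains_clause` and `build_where` are hypothetical, and sorting by FileName is just an example column.

```python
def contains_clause(column, prefix):
    """Build one predicate in the form the generated rule file uses."""
    return "(CONTAINS(%s,'%s*') >0)" % (column, prefix)


def build_where(conditions, order_by=None):
    """Join predicates with AND and optionally append a sort order.

    conditions -- list of (column, prefix) pairs
    order_by   -- column name to sort on, or None to leave the order alone
    """
    query = ""
    for column, prefix in conditions:
        if query:
            query += " AND "
        query += contains_clause(column, prefix)
    if order_by:
        # The sort order is appended to the end of the assembled query text.
        query += " ORDER BY " + order_by
    return query


awards = build_where([("approved", "yes"), ("contentType", "awards")])
print(awards)
print(build_where([("approved", "yes"), ("contentType", "awards")], order_by="FileName"))
```

The first call reproduces the two conditions of the Awards rule; the second shows where an ORDER BY would land if you edit the rule file by hand.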

You should think about how you would automatically generate rule files. For example, if you want to generate rule files and view files in real time, you will need to understand the rule file syntax and know where to change the code. Repeated creation and use of rule files will solve this problem. This should only be done as an automation process enhancement and is not recommended as a general rule of use.

Summary

Content Management is a web application that eases the file management burden by allowing files to be stored in their native format (doc, xls, pdf, etc.) while making them available from a web page. The content store is the application unit and is made up of ASP files and include files, the Index Server, the LDAP database, and the upload control. CM is a tool in that you can modify any or all of the pieces to fit your needs while still being an accessible web application.

 
