Monday, June 20, 2011

Creating a Document Library / Repository in .net using SQL

Pretty much any serious application online today needs some sort of document management system.  At some point your user will need to upload documents of some type that will need to be able to be downloaded at some point later.  You could use the file structure but this does not guarantee security or readability.  If for some reason your directory is unreadable or not available,  it could lead to some serious errors and problems.  This article will walk you through the process of setting up the needed structure in SQL and the ability to upload, store, and then render through the buffer the document back to the user.  I will attempt to use as little technical jargon as possible while still getting the job done.

The Setup:

At this point, we need a way to store the contents and uploads in MSSQL.  The following is SQL code to give us the tables and procedures that we will need.

dbo.DocumentRepository Table


CREATE TABLE [dbo].[DocumentRepository](
[RepositoryID] [int] IDENTITY(1,1) NOT NULL,
[fileName] [varchar](50) NULL,
[fileLength] [bigint] NULL,
[fileExtension] [varchar](50) NULL,
[fileDescription] [varchar](max) NULL,
[fileUploader] [varchar](50) NULL,
[fileDate] [datetime] NULL,
[fileContent] [varbinary](max) NULL
) ON [PRIMARY]



This SQL Query will create the table that we need to store the document.  In it we use the RepositoryID as the identity field.  Obviously, the fileName, fileLength, fileExtension, fileUploader, and fileDate fields are used to help identify the file in the gridview.  The fileExtension is used so that we can display a pretty icon in the grid that the user can click on to download the file (hence if we have a doc or docx it will have a Word logo).  The fileContent is where the actual file is stored.  Besides the ID, this is the only other required field, but the other fields are more highly suggestive.


dbo.getDocumentRepository

Create PROCEDURE [dbo].[getDocumentLibrary] 

AS
BEGIN

SET NOCOUNT ON;

SELECT DR.[RepositoryID], DR.[fileName], DR.[fileLength], DR.[fileDescription],
DR.[fileUploader], DR.[fileDate], DR.[fileContent]
From dbo.DocumentRepository DR
Order By fileDate Desc
END

For simplicity, I have removed the Left Outer Join on the image to represent the document.  We are using the alias DR for our table and getting all our support information to support the gridview that is used to display the list of documents in the repository. 


The GridView

<telerik:RadGrid ID="rdLibrary" runat="server" AutoGenerateColumns="false" Skin="Hay" AllowPaging="true" PageSize="15">
<ClientSettings>
<Scrolling AllowScroll="True" UseStaticHeaders="true" />
</ClientSettings>
<MasterTableView NoMasterRecordsText="No Documents Are Available" >
<Columns>
<telerik:GridBoundColumn DataField="RepositoryID" HeaderText="Document ID" visible="false"/>
<telerik:GridBoundColumn DataField="filename" HeaderText="File Name" ItemStyle-Width="100px" />
<telerik:GridBoundColumn DataField="filelength" HeaderText="Size" />
<telerik:GridBoundColumn DataField="fileUploader" HeaderText="Added By" />
<telerik:GridBoundColumn DataField="filedate" HeaderText="Date Added" />
<telerik:GridBoundColumn DataField="fileDescription" HeaderText="Description" ItemStyle-Width="180px"/>
</Columns>
</MasterTableView>
</telerik:RadGrid>




While you can use any generic gridview to display the contents of the table as long as you have the required columns, I am using a RadGrid from Telerik.  You can visit the Telerik website here.  There really is not anything mystical or hidden in this view, we are just basically using it to display our values.  Notice that Paging is turned on and it is set to a size of 15 per page (we’ll cover that in the code-behind).


It is IMPORTANT TO NOTE in order to have a method to select the document you need a column that has a commandname.  In this instance we use “DL” for the command. 


The Code-Behind:


This is where it really gets interesting.  I actually use the 3 tier architecture (presentation, data, and business layers) however, referencing class libraries are beyond the scope of this article.


Declarations:
Imports System.IO
Imports Telerik.Web.UI
Imports System.Data
Imports System.Data.SqlClient
Imports System.Web.Configuration.WebConfigurationManager

We use System.IO for our buffering capabilities.  System.Data and System.Data.Sqlclient for our ADO.Net connection to SQL backend and WebConfigurationManager to assist us in getting our connectionString.  We get our connectionString from web.config.  Obviously, you will not need the reference to Telerik.Web.UI if you are not using RadGrid.


Page_Load:
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
Dim ds As DataSet = getAllDocuments
rdLibrary.DataSource = ds
rdLibrary.DataBind()
End Sub

Here on our Load trigger, we setup a dataset and use a function called getAllDocuments to fill our dataset.  Once returned we give it to the grid as a datasource.  Then we bind it to the grid.


getAllDocuments:
Public Function getAllDocuments() As DataSet
Dim conn As New SqlConnection(ConnectionStrings("someConnString").ToString())
Dim cmd As New SqlCommand("dbo.getDocumentLibrary", conn)
cmd.CommandType = CommandType.StoredProcedure
Try
conn.Open()
Dim da As New SqlDataAdapter(cmd)
Dim ds As New DataSet()
da.Fill(ds)
Catch ex As Exception
Return Nothing
Finally
conn.Close()
End Try
Return ds
End Function



Here we setup our connection string from the web.config.  Declare our SQL command as a procedure in reference to SQLconnection.  From there we enter our TRY/Catch by trying to open our database.  We fill our DataSet.  At any point, if we run into a problem, we capture the exception.  If we wanted to get fancy for logging purposes, we could throw a new exception here or a SqlException.  For simplicity, it is as is.  We hit our Finally block and close the exception as it will always run and we want to always close the connection for security purposes.  At last we return the dataset.
This is actually all that is needed to display the contents of the documentrepository table.  Unfortunately, without a method that will allow the user to download the contents, there really is no point.  In Telerik, you use the ItemCommand, for a regular view you would use the RowCommand event.

 Protected Sub rdLibrary_ItemCommand(ByVal source As Object, ByVal e As Telerik.Web.UI.GridCommandEventArgs) Handles rdLibrary.ItemCommand
If e.CommandName = "DL" Then
Dim dataItem As GridDataItem = CType(e.Item, GridDataItem)
Dim itemValue As String = dataItem("RepositoryID").Text
Dim strQuery As String = "select fileName, fileextension, filecontent from dbo.documentrepository where repositoryid=@id"
Dim cmd As SqlCommand = New SqlCommand(strQuery)
cmd.Parameters.Add("@id", SqlDbType.Int).Value = CInt(itemValue)
Dim dt As DataTable = GetData(cmd)
If dt IsNot Nothing Then
download(dt)
End If
End If
End Sub

Public Function GetData(ByVal cmd As SqlCommand) As DataTable
Dim dt As New DataTable
Dim strConnString As String = System.Configuration.ConfigurationManager.ConnectionStrings("someConnString").ConnectionString
Dim con As New SqlConnection(strConnString)
Dim sda As New SqlDataAdapter
cmd.CommandType = CommandType.Text
cmd.Connection = con
Try
con.Open()
sda.SelectCommand = cmd
sda.Fill(dt)
Return dt
Catch ex As Exception
Response.Write(ex.Message)
Return Nothing
Finally
con.Close()
sda.Dispose()
con.Dispose()
End Try
End Function

Here we enter the event and check to make sure we are working with the right command as you can have many commands per row in a grid.  We create a dataItem that represents the row and then an itemValue that gives use the identifier (or key) to the document we selected.  We create the query and SqlCommands and add our identifier as a parameter.  We fill out dataTable by way of the GetData function.  All the getData function does is talks to SQL and gets the specific file information (to include content – or the actual file).  Of course if we have an error, we need to notify the user.  Otherwise we need to return the dataTable.  From there we move to the download function


Download Function:
Protected Sub download(ByVal dt As DataTable)
Dim bytes() As Byte = CType(dt.Rows(0)("filecontent"), Byte())
Response.Buffer = True
Response.Charset = ""
Response.Cache.SetCacheability(HttpCacheability.NoCache)
Response.ContentType = dt.Rows(0)("fileextension").ToString()
Response.AddHeader("content-disposition", "attachment;filename=" & dt.Rows(0)("fileName").ToString())
Response.BinaryWrite(bytes)
Response.Flush()
Response.End()
End Sub

In the Download Sub we move the content (or actual file) into a byte array.  We do this so that we can transfer the file to the user through the Response.Buffer.  Additionally, you MUST supply the extension and filename for the header.  Finally, we call the BinaryWrite method which does the actual file transfer.  The user is presented with a modal dialog asking them if they want to Open, Save, or Cancel.  They will only see the Open button if they have an association for the extension.


Finally, we need a method to fill our table and put documents in the repository.  You do this by a simple upload file control.  In the click event for the upload file control is where we will do our processing for the file. 

Protected Sub btnUpload_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles btnUpload.Click
Dim dr As SqlDataReader = docLib.getUserInfo(Session("UserID"))
Dim LongName As String = "Anonymous"
While dr.Read
LongName = dr.Item("FirstNM").ToString & " " & dr.Item("LastNM").ToString
End While
Dim filePath As String = fupDocLibrary.PostedFile.FileName
Dim filename As String = Path.GetFileName(filePath)
Dim ext As String = Path.GetExtension(filename)
Dim contenttype As String = String.Empty

Select Case ext
Case ".doc"
contenttype = "application/vnd.ms-word"
Exit Select
Case ".docx"
contenttype = "application/vnd.ms-word"
Exit Select
Case ".xls"
contenttype = "application/vnd.ms-excel"
Exit Select
Case ".xlsx"
contenttype = "application/vnd.ms-excel"
Exit Select
Case ".vsd"
contenttype = "application/vnd.ms-visio"
Exit Select
Case ".pub"
contenttype = "application/vnd.ms-publisher"
Exit Select
Case ".mpx"
contenttype = "application/vnd.ms-project"
Exit Select
Case ".mpp"
contenttype = "application/vnd.ms-project"
Exit Select
Case ".ppt"
contenttype = "application/vnd.ms-powerpoint"
Exit Select
Case ".pps"
contenttype = "application/vnd.ms-powerpoint"
Exit Select
Case ".pst"
contenttype = "application/vnd.ms-outlook"
Exit Select
Case ".jpg"
contenttype = "image/jpg"
Exit Select
Case ".txt"
contenttype = "application/text"
Exit Select
Case ".png"
contenttype = "image/png"
Exit Select
Case ".gif"
contenttype = "image/gif"
Exit Select
Case ".pdf"
contenttype = "application/pdf"
Exit Select
End Select
Dim fs As Stream = fupDocLibrary.PostedFile.InputStream
Dim br As New BinaryReader(fs)
Dim bytes As Byte() = br.ReadBytes(fs.Length)

Dim conn As New SqlConnection(ConnectionStrings("someConnString").ToString())
Dim cmd As New SqlCommand("dbo.insertFile", conn)
cmd.CommandType = CommandType.StoredProcedure
cmd.Parameters.AddWithValue("@filename", SqlDbType.VarChar).Value = filename
cmd.Parameters.AddWithValue("@filelength", SqlDbType.BigInt).Value = bytes.Length
cmd.Parameters.AddWithValue("@fileextension", SqlDbType.VarChar).Value = contenttype
cmd.Parameters.AddWithValue("@filedescription", txtFileDesc.Text)
cmd.Parameters.AddWithValue("@fileuploader", LongName)
cmd.Parameters.AddWithValue("@filecontent", SqlDbType.Binary).Value = bytes
Try
cmd.Connection.Open()
cmd.ExecuteNonQuery()




We start off by getting the username so that we know who the author is.  Then we get a handle on the postedfile from the fupDocLibrary file upload control.  Obviously, the control name will match whatever name you give your control.  Then we get the file extension and set the association.  Once again this is for the header when the user downloads the file.  We stream the file into a byte array so that we can store it in the content column of SQL.  We perform our database connection and execute the non query.  By using ExecuteNonQuery, we could expect to receive an output parameter, however for simplicity we do not expect it in this context. 


The final piece is the insert procedure:

Create PROCEDURE [dbo].[insertFile] 
-- Add the parameters for the stored procedure here
@filename varchar(50),
@filelength bigint,
@fileextension varchar(50),
@filedescription varchar(max),
@fileUploader varchar(50),
@filecontent varbinary(max)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

-- Insert statements for procedure here
Insert into dbo.DocumentRepository ([filename], [filelength], [fileextension],
[filedescription], [fileuploader], [filedate], [fileContent]) Values
(@filename, @filelength, @fileextension, @filedescription, @fileuploader,
GetDate(), @filecontent)
END

FINALLY:


There is no security in place in this example.  Additionally, there is no processing for integrity.  You will want to add code that will enable you to ensure that the file the user is uploading does not cause a data integrity problem nor security problem (virus and such).


Once again, this is a very simple example of a document library / repository.  Use as you wish.


No comments:

Post a Comment