Showing posts with label column. Show all posts
Showing posts with label column. Show all posts

Monday, March 26, 2012

Migrating a column from ntext to XML

After SQL Server 2005 migration, I'm trying to change my ntext column to XML type.

The "alter table TableName alter column XMLCol xml" command throw this error :

XML parsing: line 1, character 39, unable to switch the encoding

1. Do I have to reformat my nText column before the migration to XML type ?

2. Is there a way to create an XML data type with UFT-8 in SQL Server 2005 ?

Thank you for your help

Thank you for your help

the ntext column must contain well-formed xml to be converted. My guess is the the data in the ntext column looks something like:

<?xml version="1.0" encoding="utf-8"?><root/>

which specifies the wrong encoding for the the xml in an ntext column. If this is the case convert the column to varchar(max) first to fix the encoding, then to xml. For example:

alter table TableName alter column XMLCol varchar(max)
alter table TableName alter column XMLCol xml

If your ntest columns contain a mixture of

<?xml version="1.0" encoding="utf-8"?><root/>

and

<?xml version="1.0" encoding="utf-16"?><root/>

after alter the columns to varchar(max), update them to replace all the "utf-16" with "utf-8" before doing the alter to xml.

Dan AT pluralsight DOT com


|||

hi,

if you would be converting the column from ntext

and you have several hundereds of record in it already

it would be difficult to pinpoint the problem.

here's my suggestion

1. add a new column with XML data type. leave the old ntext column untouched

2. insert into the new column bby batch of 10, 20 or by hundred depending on the size of your table

3. what would be left are problematic not well formed xml

4. deal with the remaining data and insert to the new column

5. when everything is done. drop the old column

6. rename the new XML column with the previous column

regards,

joey

|||

Converting it to varchar(max) is still having the potential to lead to encoding issues or data corruption. Instead the column should be converted to varbinary(max) first and then to XML.

Best regards

Michael

Migrating a column from ntext to XML

After SQL Server 2005 migration, I'm trying to change my ntext column to XML type.

The "alter table TableName alter column XMLCol xml" command throw this error :

XML parsing: line 1, character 39, unable to switch the encoding

1. Do I have to reformat my nText column before the migration to XML type ?

2. Is there a way to create an XML data type with UFT-8 in SQL Server 2005 ?

Thank you for your help

Thank you for your help

the ntext column must contain well-formed xml to be converted. My guess is the the data in the ntext column looks something like:

<?xml version="1.0" encoding="utf-8"?><root/>

which specifies the wrong encoding for the the xml in an ntext column. If this is the case convert the column to varchar(max) first to fix the encoding, then to xml. For example:

alter table TableName alter column XMLCol varchar(max)
alter table TableName alter column XMLCol xml

If your ntest columns contain a mixture of

<?xml version="1.0" encoding="utf-8"?><root/>

and

<?xml version="1.0" encoding="utf-16"?><root/>

after alter the columns to varchar(max), update them to replace all the "utf-16" with "utf-8" before doing the alter to xml.

Dan AT pluralsight DOT com


|||

hi,

if you would be converting the column from ntext

and you have several hundereds of record in it already

it would be difficult to pinpoint the problem.

here's my suggestion

1. add a new column with XML data type. leave the old ntext column untouched

2. insert into the new column bby batch of 10, 20 or by hundred depending on the size of your table

3. what would be left are problematic not well formed xml

4. deal with the remaining data and insert to the new column

5. when everything is done. drop the old column

6. rename the new XML column with the previous column

regards,

joey

|||

Converting it to varchar(max) is still having the potential to lead to encoding issues or data corruption. Instead the column should be converted to varbinary(max) first and then to XML.

Best regards

Michael

Friday, March 9, 2012

Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer column ordinal from name?

Hi,

I need to access columns from a data flow by ordinal position in a script transformation (I'm parsing an excel file which has several rowsets across the page). The first problem I encountered is the generated BufferWrapper does not expose the columns collection (i.e. Input0Buffer(0) does not work) but I got around that by implementing my own ProcessInputs(InputId, Buffer) method instead of using the wrapper.

My problem now is that the column ordinals are in some random order (i.e. Column "F1" is ordinal 1 but Column "F2" is 243). Where in the object model can I map between the name and the ordinal - it's not jumping out at me?

Dave

PS Why is the script editor modal, it's frustrating having to switch between the Visual Studio environment and the VSA one.

To read and write buffer columns by ordinal position in a script transform, load a name => index dictionary in the PreExecute function. The dictionary key would be the column name, and the dictionary value a structure containing column metadata, including the buffer column index you're referring to, and whatever else you'd like.

In PreExecute() function, load up the dictionary. It will then be available for use when you require direct column access by ordinal position. Here's an example which uses the GetColumnIndexes in lieu of BufferManager.FindColumnByLineageID().

Imports System

Imports System.Data

Imports System.Math

Imports System.Collections.Generic

Imports Microsoft.SqlServer.Dts.Pipeline

Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain

Inherits UserComponent

Private inputBuffer As PipelineBuffer

Private cols As Dictionary(Of String, ColumnInfo) = New Dictionary(Of String, ColumnInfo)

Private currentColumnInfo As ColumnInfo = New ColumnInfo

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

If cols.TryGetValue("GeneratedStr_1", currentColumnInfo) Then

' retrieve column metatdata by column name

inputBuffer.SetString(currentColumnInfo.colIndex, Guid.NewGuid().ToString())

End If

End Sub

Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

' Get the Pipeline Buffer for subsequent ordinal column access

inputBuffer = Buffer

MyBase.ProcessInput(InputID, Buffer)

End Sub

Public Overrides Sub PreExecute()

BuildColumnDictionary()

End Sub

Private Sub BuildColumnDictionary()

Dim indexes() As Integer

Dim input As IDTSInput90

Dim col As IDTSInputColumn90

Dim offset As Integer = 0

input = Me.ComponentMetaData.InputCollection(0)

'presumes GetColumnIndexes order matches iterator order

'as BufferManager is not available to my knowledge in ScriptComponent

indexes = Me.GetColumnIndexes(input.ID)

For Each col In input.InputColumnCollection

Dim columnStructure As New ColumnInfo

With columnStructure

.colName = col.Name

.colLength = col.Length

.colIndex = indexes(offset)

'Normally, BufferManager would be used, but its not exposed in Script Component

.colPrecision = col.Precision

.colScale = col.Scale

.colType = col.DataType

End With

Me.Log(String.Format("Name {0} Buffer Index {1} offset {2} ", col.Name, indexes(offset), offset), 0, Nothing)

cols.Add(col.Name, columnStructure)

offset += 1

Next

End Sub

Public Structure ColumnInfo

Dim colName As String

Dim colType As DataType

Dim colIndex As Int32

Dim colLength As Int32

Dim colPrecision As Int32

Dim colScale As Int32

End Structure

End Class|||

Hi Jaegd

Thanks for the code, capturing the PipelineBuffer in ProcessInputs for later use in _ProcessInputRow is a good idea. I'd also not noticed that PipelineBuffer has Get{Type} \ Set{Type} methods.

Your assumption that the 'GetColumnIndexes order matches iterator order' does not appear to hold though (at least for the Excel Data Source) but I found an alternative that appears to be ok

With columnStructure
.colName = col.Name
.colLength = col.Length
.colIndex = input.InputColumnCollection.GetObjectIndexByID(col.ID)


.colPrecision = col.Precision
.colScale = col.Scale
.colType = col.DataType
End With

Dave

Microsoft.SqlServer.Dts.Pipeline.BlobColumn

I am using Component Script to do - Transforming Comma-delimited list row data to column

and I want to use MessageBox to see the value

Dim DataPnts As String


DataPnts = Row.DataPnts.ToString() -- this is my input column (data type = text in Source table and I put as Unicode string [DT_WSTR] in Output column)

MessageBox.Show(DataPnts, "DataPoints1", MessageBoxButtons.OK)

and why can't I see it. It gives me some message with Microsoft.SqlServer.Dts.Pipeline.BlobColumn. Why?

Values = DataPnts.Split(CChar(","))

Please point me to more info on how to do transform Comma-delimited list row data to column.

Thanks.

Vita wrote:

and why can't I see it. It gives me some message with Microsoft.SqlServer.Dts.Pipeline.BlobColumn. Why?

It would be very valuable if you posted here the full message you're receiving.

|||

The only thing I see in the message box is - "Microsoft.SqlServer.Dts.Pipeline.BlobColumn" - exact words.

And that is it. No value at all.

And if I do

Values = Row.DataPntsConv.Split(CChar(",")) - it shows me the error that Split is not a member of Microsoft.SqlServer.Dts.Pipeline.BlobColumn.

This is why I created a Variable as String and tried to do a Split on it.

I think the problem with text data type I have in the Source table for this field.

|||

Look at the type of the object, it is not a string type or something that supports casting top a string. It is a complex type, a Microsoft.SqlServer.Dts.Pipeline.BlobColumn class actually.

Take a closer look at the methods on Row.DataPnts, I think there should be so read and write methods. Use them to get the data you want. Just bear in mind that this is a big texttype, not a string, and you don't want it leaving up to 2GB of data in a string for you!

|||

Ok, I must be extra-dense today. But the light bulb just came on.

The problem is that the Microsoft.SqlServer.Dts.Pipeline.BlobColumn class inherets its implementation of the ToString method from the Object base class. This means that it just returns the name of the class: "Microsoft.SqlServer.Dts.Pipeline.BlobColumn." As you can probably imagine, this would handily explain the MessageBox message you've reported.

So... You actually need to use the BlobColumn.GetBlobData method instead: http://msdn2.microsoft.com/en-us/library/microsoft.sqlserver.dts.pipeline.blobcolumn.getblobdata.aspx

Enjoy!

|||

Thanks!

Can you give the example of it?

How to use it in

DataPnts = Row.PointsValue.ToString()

Values = DataPnts.Split(CChar(",")

May be I am missing some references to use it, but if I put GetBlobData - the code shows error.

I use already

Namespace: Microsoft.SqlServer.Dts.Pipeline
Assembly: Microsoft.SqlServer.PipelineHost (in microsoft.sqlserver.pipelinehost.dll)

Or may be I should put something before the word GetBlobData. Like buffer or something like it.

Thanks.

|||

I do not have an example to share. As Darren pointed out, this BLOB colum can be very large, so it's non-trivial to handle all cases, as depending on the size of the data in each row you may be able to get it all in one pass, or you may need to batch it into many smaller passes.

Basically you need to declare a variable of type Byte array, and then assign that value with the value returned from the call to Row.PointsValue.GetBlobData(). I can't really get more specific than this without writing the code myself...

|||How do I do a split on BlobData?|||You can't do it directly - you have to get the blob into a string first.