Parse simple XML document, even if malformed?

tryingtolearn

I have two questions and would love any help offered. I’ve spent the entire weekend searching for a solution to this problem. We have an XML file that needs to be read. It’s small and only contains one node. However, it is malformed (and we know it). There is nothing we can do about this since it’s created by another process. I have to work with what we have.

XML:
<ServerRequest userKey="01223345671234123"  flag="ACTIVE" >
<raw>
TEST MESSAGE
</raw>
</ServerRequest>

That's it. As you can see, it doesn't contain a root. The tutorials I've seen online use a process called deserialization that requires a main root. I've thought about prepending "<root>" to the XML file, but I don't know how to do this on the fly. I can't touch the original file. I just need the userKey, the flag, and the TEST MESSAGE inside of it. How can I extract these?

I have one other question that has to do with theory/best practices. As you can see, the file is small (one node). But it's in a folder that contains about 100 files just like the above. My goal is for my program to loop through the files, read the one node in each, and put it in a treeview. Do you think I should create an array of objects? Use a basic array? Heck, a SQL DB? Just looking for a push in the right direction.

Any help is greatly appreciated.

Rob
 
If anyone is interested, I got this to work. It's a bit ugly, but it does the trick for reading the malformed XML:

C#:
string text = System.IO.File.ReadAllText(filename);

var startTag = "userKey=\"";
int startIndex = text.IndexOf(startTag) + startTag.Length;
int endIndex = text.IndexOf("\"", startIndex);
return text.Substring(startIndex, endIndex - startIndex);
 
"As you can see, it doesn't contain a root."

ServerRequest is the root node, so XDocument (in System.Xml.Linq) can load the file directly:

C#:
var doc = XDocument.Load(filename);
var value = doc.Root.Attribute("userKey").Value;
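
The flag attribute can be read the same way. A minimal sketch, assuming the file looks like the sample in the first post; RequestReader and ReadAttributes are names made up for illustration, and the ?. is there so that a missing attribute gives null instead of throwing:

```csharp
using System.Xml.Linq;

public static class RequestReader
{
    // Hypothetical helper: returns the userKey and flag attributes of the root element.
    public static (string UserKey, string Flag) ReadAttributes(XDocument doc)
    {
        XElement root = doc.Root;

        // Attribute() returns null when the attribute is absent, so ?. avoids
        // a NullReferenceException on files that lack one of them.
        string userKey = root.Attribute("userKey")?.Value;
        string flag = root.Attribute("flag")?.Value;

        return (userKey, flag);
    }
}
```

Something like RequestReader.ReadAttributes(XDocument.Load(filename)) would be the call site. For the sample file it should give "01223345671234123" and "ACTIVE".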
 

You. Are. Awesome. Thank you! ::pours a Corona for you::
 
Before that Corona, could I trouble you to assist me in grabbing the RAW data? I am so close:

C#:
var node = doc.Descendants().Where(n => n.Name == "raw").FirstOrDefault();

It works, but there is one little bug. *CERTAIN* files will contain an extra attribute, after the flag, called "xmlns":

XML:
<ServerRequest userKey="01223345671234123"  flag="ACTIVE" xmlns="org">

The xmlns attribute breaks the program. For files that don't include that attribute, I am able to see the inner text of RAW. For files that do have that attribute, nada!
 
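xmlns declares the default XML namespace. In those files the element's full name is {org}raw rather than raw, so comparing n.Name with "raw" no longer matches. Asking the root for its default namespace should work for both cases:

C#:
var ns = doc.Root.GetDefaultNamespace();
var raw = doc.Descendants(ns + "raw").First().Value;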
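
On the second question from the first post: for about 100 small files, a plain list of small objects is probably enough, and the list can then be used to fill the treeview. A rough sketch, assuming every file looks like one of the two samples in this thread. ServerRequestInfo, RequestLoader and the "*.xml" search pattern are made up for illustration, and it uses Element() instead of Descendants() because raw is a direct child of the root in the samples:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

// Hypothetical container for the three values the first post asks for.
public class ServerRequestInfo
{
    public string UserKey { get; set; }
    public string Flag { get; set; }
    public string Raw { get; set; }
}

public static class RequestLoader
{
    // Reads one document; meant to work with or without xmlns="..." on the root.
    public static ServerRequestInfo FromDocument(XDocument doc)
    {
        XElement root = doc.Root;
        XNamespace ns = root.GetDefaultNamespace();

        return new ServerRequestInfo
        {
            // Unprefixed attributes are not in the default namespace,
            // so plain names work in both kinds of file.
            UserKey = root.Attribute("userKey")?.Value,
            Flag = root.Attribute("flag")?.Value,
            // Element() returns null if <raw> is missing; Trim() drops the
            // line breaks around the message text.
            Raw = root.Element(ns + "raw")?.Value.Trim()
        };
    }

    // Loads every matching file in the folder into a list.
    public static List<ServerRequestInfo> LoadFolder(string folder)
    {
        var items = new List<ServerRequestInfo>();
        foreach (string path in Directory.EnumerateFiles(folder, "*.xml"))
        {
            items.Add(FromDocument(XDocument.Load(path)));
        }
        return items;
    }
}
```

Each item in the list returned by RequestLoader.LoadFolder(folderPath) could then become one node in the treeview. A database seems like overkill unless the data has to outlive the program or grows far beyond a few hundred files.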
 