I want to write a C# application that compares the tree structure of two XML files. That is, I do not care about the content of the nodes, only whether the same tags are used in both files (the order doesn't matter).

I would then like to output the nodes that are found in only one of the files.

Is there a library available for this? I would expect so, since comparing XML files has been implemented many times already. So far I have only found the seemingly outdated Microsoft Diff and Patch library which, as far as I can tell, doesn't offer an option to ignore the content of the nodes. It also feels wrong to rely on a library that seems to be that outdated.

One idea is to create an XML Schema for your document and validate both documents against it. If your use case is more general (pairs of varying document types), there might also be a way to generate a schema from one of the two documents and then validate the other against it... – BitTickler
@BitTickler this sounds interesting, I will try this approach! – Jacob S
When you say "seemingly outdated", I hope you will seriously test it out. Something old can still work flawlessly. – Lex Li
Maybe the XNode.DeepEquals method helps. Load each XML file into an XElement, remove all content, then compare the two with it. – Alexander Petrov

1 Answer

When you don't want to compare content and care just about structure, let's get rid of the content!

You can use the built-in XmlDocument class to load both XML files into memory, then walk the tree from the root downwards and remove all content (as you don't care about it). After that, you can take the stripped documents and run them through an ordinary file diff library (for example Google-diff-match-patch).
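
A minimal sketch of the stripping step, assuming that attributes, comments and text all count as "content" to be dropped. The class and method names (`XmlContentStripper`, `StripNode`, `StripXml`) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XmlContentStripper
{
    // Recursively removes everything except element nodes (and the XML declaration).
    public static void StripNode(XmlNode node)
    {
        // Copy the children first; removing nodes while enumerating ChildNodes is unsafe.
        var children = new List<XmlNode>();
        foreach (XmlNode child in node.ChildNodes)
        {
            children.Add(child);
        }

        foreach (XmlNode child in children)
        {
            if (child.NodeType == XmlNodeType.Element)
            {
                StripNode(child);
            }
            else if (child.NodeType != XmlNodeType.XmlDeclaration)
            {
                // Text, CDATA, comments, processing instructions, etc.
                node.RemoveChild(child);
            }
        }

        if (node is XmlElement element)
        {
            // Assumption: attributes are treated as content and dropped as well.
            element.RemoveAllAttributes();

            // Make <b></b> and <b/> serialize the same way so a text diff sees no difference.
            if (!element.HasChildNodes)
            {
                element.IsEmpty = true;
            }
        }
    }

    // Convenience wrapper: parse, strip, and return the remaining skeleton as a string.
    public static string StripXml(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        StripNode(doc);
        return doc.OuterXml;
    }
}
```

With this, two documents that differ only in text, attributes or comments should produce identical strings, while a different set of tags produces a different string. Child order is still preserved, so it only helps directly when both files list their elements in the same order.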

I forgot to mention that the order in which the child nodes appear doesn't matter (edited now). Therefore I would most likely also have to create the diff manually. I guess I'll use this approach if there is no library that implements this already. – Jacob S
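
For reference, a manual, order-insensitive diff could look roughly like the sketch below. It is not taken from any library: it collects the set of element paths in each document and reports the paths that occur in only one of them. The names `XmlStructureDiff`, `ElementPaths` and `OnlyInFirst` are hypothetical, and namespaces are ignored by using only local names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlStructureDiff
{
    // Returns the distinct element paths of a document, e.g. "/root/item/name".
    public static SortedSet<string> ElementPaths(XDocument doc)
    {
        var paths = new SortedSet<string>(StringComparer.Ordinal);
        if (doc.Root != null)
        {
            Collect(doc.Root, "", paths);
        }
        return paths;
    }

    private static void Collect(XElement element, string parentPath, ISet<string> paths)
    {
        // Assumption: only the local name matters; namespaces are ignored here.
        string path = parentPath + "/" + element.Name.LocalName;
        paths.Add(path);
        foreach (XElement child in element.Elements())
        {
            Collect(child, path, paths);
        }
    }

    // Paths that occur in the first document but not in the second.
    public static List<string> OnlyInFirst(XDocument first, XDocument second)
    {
        return ElementPaths(first).Except(ElementPaths(second)).ToList();
    }
}
```

Calling `OnlyInFirst(a, b)` and then `OnlyInFirst(b, a)` gives the nodes found in only one of the two files. Because paths are stored in a set, repeated elements are counted once, so this does not detect a difference in how often a tag occurs.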
Well, you can always sort the nodes on the same level by their name, which should again give you the same diff-friendly structure on both sides. – Martin Zikmund
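
A rough sketch of that sorting idea, using LINQ to XML instead of XmlDocument for brevity. It rebuilds each element with only its child elements, sorted. As a deviation from "sort by name", siblings are ordered by their full serialized form, so that two siblings with the same name but different subtrees also end up in a stable order. `XmlCanonicalizer` and `SortedSkeleton` are illustrative names only.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlCanonicalizer
{
    // Returns a copy of the element that contains only child elements
    // (no text, no attributes), with siblings in a deterministic order.
    public static XElement SortedSkeleton(XElement element)
    {
        var sortedChildren = element.Elements()
            .Select(child => SortedSkeleton(child))
            // Serialized form starts with the tag name, so this is roughly
            // "by name", with the subtree as a tie-breaker.
            .OrderBy(child => child.ToString(SaveOptions.DisableFormatting),
                     StringComparer.Ordinal);

        return new XElement(element.Name, sortedChildren);
    }

    // Convenience wrapper returning the canonical skeleton as a single-line string.
    public static string Canonicalize(string xml)
    {
        XElement root = XElement.Parse(xml);
        return SortedSkeleton(root).ToString(SaveOptions.DisableFormatting);
    }
}
```

Two files with the same tags in a different order should then canonicalize to the same string, and saving the skeleton with indentation (plain `ToString()`) gives one tag per line for a line-based diff tool.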
